WEBVTT
Kind: captions
Language: en

00:00:00.570 --> 00:00:05.380
Principal component analysis is a statistical technique that is related to factor analysis

00:00:05.380 --> 00:00:08.760
and commonly confused with factor analysis.

00:00:08.760 --> 00:00:15.089
What principal component analysis does is try to summarize the variables into a smaller

00:00:15.089 --> 00:00:20.310
set of sums - weighted sums of the variables - called components.

00:00:20.310 --> 00:00:27.009
It's more of a data reduction technique, concerned with how we can reduce the number of variables

00:00:27.009 --> 00:00:29.500
without deleting information from the data.

00:00:29.500 --> 00:00:36.300
It doesn't answer the question of what the indicators have in common - at least not directly.

00:00:36.300 --> 00:00:41.910
It's not a very useful technique for assessing measurement models because principal component

00:00:41.910 --> 00:00:46.409
analysis considers all the variance in the data.

00:00:46.409 --> 00:00:51.640
In factor analysis only the common variance is considered.

00:00:51.640 --> 00:00:58.050
What that means is that principal component analysis also tries to explain the unreliability

00:00:58.050 --> 00:01:03.670
of the indicators, whereas in factor analysis we try to take the unreliability and other

00:01:03.670 --> 00:01:10.060
unique aspects of the indicators and eliminate those so that we can extract what is common

00:01:10.060 --> 00:01:11.780
between the indicators.

00:01:11.780 --> 00:01:19.189
In practice, if you use a factor loading as an estimate of indicator reliability - that

00:01:19.189 --> 00:01:21.750
is OK with some assumptions.

00:01:21.750 --> 00:01:27.909
If you use the component loading as an estimate of individual indicator reliability

00:01:27.909 --> 00:01:33.049
then reliability is severely overestimated.

00:01:33.049 --> 00:01:39.130
The same thing happens if you apply the so-called Harman's single factor test to assess whether one factor

00:01:39.130 --> 00:01:45.500
can explain the intercorrelations in the data, which would be evidence of a common method

00:01:45.500 --> 00:01:48.040
variance problem. Applying a component analysis

00:01:48.040 --> 00:01:54.840
in place of the factor analysis will practically never indicate that you have a common method variance

00:01:54.840 --> 00:01:57.170
problem even if you actually do.

00:01:57.170 --> 00:02:00.640
So this is not a substitute for a factor analysis.

00:02:00.640 --> 00:02:05.860
It's not a factor analysis technique; it's a data summary technique instead.

00:02:05.860 --> 00:02:06.860
So why do...

00:02:06.860 --> 00:02:09.580
It's not a very useful one for measurement.

00:02:09.580 --> 00:02:12.990
So why do people use principal component analysis?

00:02:12.990 --> 00:02:21.540
The reason is that when you use SPSS and you do a factor analysis from the menu - you get

00:02:21.540 --> 00:02:23.890
a dialog that looks like that.

00:02:23.890 --> 00:02:29.890
Then when you click on the factor extraction button here - it gives you different factor

00:02:29.890 --> 00:02:31.080
analysis techniques.

00:02:31.080 --> 00:02:34.410
So it can estimate the factor model in different ways.

00:02:34.410 --> 00:02:37.580
The default is to do principal component analysis.

00:02:37.580 --> 00:02:40.630
And that's not a factor analysis technique.

00:02:40.630 --> 00:02:45.700
As for the others, whether you use principal axis factoring, maximum likelihood, or minimum

00:02:45.700 --> 00:02:51.370
residual - it doesn't matter, because they all estimate the factor analysis model.

00:02:51.370 --> 00:02:55.700
Principal component analysis is not a factor analysis model because it doesn't discover

00:02:55.700 --> 00:02:59.650
underlying dimensions; instead it summarizes the data.

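The contrast between the two approaches can be written down compactly. The notation below is an assumption added for illustration (it does not appear in the lecture): one standardized factor, unique parts that are uncorrelated with the factor and with each other, and standardized indicators.

```latex
% Common factor model for standardized indicator x_i (one factor \xi, unique part \varepsilon_i)
x_i = \lambda_i \xi + \varepsilon_i, \qquad
\operatorname{Var}(x_i) = \underbrace{\lambda_i^2}_{\text{common}} + \underbrace{\theta_i}_{\text{unique}} = 1

% Implied correlation matrix: the factor explains only the shared part
\mathbf{R} = \boldsymbol{\lambda}\boldsymbol{\lambda}^{\top} + \boldsymbol{\Theta}

% First principal component: a weighted sum chosen to capture as much total variance as possible
c = \sum_i w_i x_i, \qquad
\mathbf{w} = \arg\max_{\lVert \mathbf{w} \rVert = 1} \mathbf{w}^{\top} \mathbf{R}\, \mathbf{w}
```

In the factor model the unique variances are set aside in a separate term, whereas the component works on the full correlation matrix, unique variance included.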
00:02:59.650 --> 00:03:05.180
There are really no good reasons to use principal component analysis in social science research

00:03:05.180 --> 00:03:09.350
because a factor analysis can also be used to summarize data.

00:03:09.350 --> 00:03:14.510
So if you just want to summarize your indicators with a smaller number of summed variables -

00:03:14.510 --> 00:03:21.010
weighted sums - then factor analysis and principal component analysis will give you pretty

00:03:21.010 --> 00:03:22.480
similar solutions.

00:03:22.480 --> 00:03:27.850
If you want to assess whether an underlying dimension explains the data - then factor analysis will

00:03:27.850 --> 00:03:33.090
give you the correct solution under certain assumptions - principal component analysis

00:03:33.090 --> 00:03:34.090
will not.

00:03:34.090 --> 00:03:39.950
So it's a good rule never to use principal component analysis in your own research, and

00:03:39.950 --> 00:03:45.310
if you see someone using principal component analysis, or not reporting which factor analysis

00:03:45.310 --> 00:03:51.680
technique they applied and using SPSS, then it's a good idea to question the authors' choices.
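The point about loadings and reliability can be checked with a small numerical sketch. Everything here is an assumption made for illustration, not part of the lecture: the one-factor population model, the made-up loadings 0.8, 0.7, 0.6 and 0.5, and the hypothetical helper names. The code works on the population correlation matrix rather than on sampled data, and it uses plain power iteration in place of a statistics library.

```python
import math


def first_eigenpair(matrix, iterations=200):
    """Power iteration: dominant eigenvalue and unit eigenvector of a
    symmetric matrix (assumes the dominant eigenvalue is positive)."""
    n = len(matrix)
    vec = [1.0 / math.sqrt(n)] * n
    value = 0.0
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        value = math.sqrt(sum(x * x for x in product))
        vec = [x / value for x in product]
    return value, vec


def correlation_from_loadings(loadings):
    """Population correlation matrix implied by a one-factor model."""
    n = len(loadings)
    return [[1.0 if i == j else loadings[i] * loadings[j] for j in range(n)]
            for i in range(n)]


def component_loadings(corr):
    """Loadings on the first principal component: eigenvector * sqrt(eigenvalue)."""
    value, vec = first_eigenpair(corr)
    return [v * math.sqrt(value) for v in vec]


def principal_axis_loadings(corr, rounds=100):
    """One-factor principal axis factoring with iterated communalities."""
    n = len(corr)
    # Starting communalities: largest absolute off-diagonal correlation per row.
    communalities = [max(abs(corr[i][j]) for j in range(n) if j != i)
                     for i in range(n)]
    loadings = [0.0] * n
    for _ in range(rounds):
        # Replace the diagonal of ones with the current communality estimates,
        # so only common variance is analyzed.
        reduced = [[communalities[i] if i == j else corr[i][j] for j in range(n)]
                   for i in range(n)]
        value, vec = first_eigenpair(reduced)
        loadings = [v * math.sqrt(value) for v in vec]
        communalities = [l * l for l in loadings]
    return loadings


true_loadings = [0.8, 0.7, 0.6, 0.5]  # made-up values for illustration
corr = correlation_from_loadings(true_loadings)
factor = principal_axis_loadings(corr)
component = component_loadings(corr)
for t, f, c in zip(true_loadings, factor, component):
    print(f"true={t:.2f}  factor={f:.3f}  component={c:.3f}  "
          f"squared: factor={f * f:.3f}  component={c * c:.3f}")
```

Under these assumptions the principal axis loadings recover the values the matrix was built from, while the component loadings come out larger for every indicator, so squaring a component loading overstates that indicator's reliability.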