WEBVTT
Kind: captions
Language: en
00:00:00.570 --> 00:00:05.380
Principal component analysis is a statistical
technique that is related to factor analysis
00:00:05.380 --> 00:00:08.760
and commonly confused with factor analysis.
00:00:08.760 --> 00:00:15.089
What principal component analysis does is
try to summarize the variables into a smaller
00:00:15.089 --> 00:00:20.310
set of sums - weighted sums of the variables,
called components.
00:00:20.310 --> 00:00:27.009
And it's more of a data summary technique, concerned
with how we can reduce the number of variables
00:00:27.009 --> 00:00:29.500
without losing information from the data.
00:00:29.500 --> 00:00:36.300
It doesn't answer the question of what the
indicators have in common - at least not directly.
00:00:36.300 --> 00:00:41.910
It's not a very useful technique for assessing
measurement models because principal component
00:00:41.910 --> 00:00:46.409
analysis considers all the variance in the
data.
00:00:46.409 --> 00:00:51.640
In factor analysis only the common variance
is considered.
00:00:51.640 --> 00:00:58.050
What that means is that principal component
analysis also tries to explain the unreliability
00:00:58.050 --> 00:01:03.670
of the indicators whereas in factor analysis
we try to take the unreliability and other
00:01:03.670 --> 00:01:10.060
unique aspects of the indicators and eliminate
those so that we can extract what is common
00:01:10.060 --> 00:01:11.780
between the indicators.
00:01:11.780 --> 00:01:19.189
In practice if you use a factor loading as
an estimate of indicator reliability - that
00:01:19.189 --> 00:01:21.750
is ok with some assumptions.
00:01:21.750 --> 00:01:27.909
If you use a component loading as an
estimate of individual indicator reliability,
00:01:27.909 --> 00:01:33.049
then reliability is severely overestimated.
00:01:33.049 --> 00:01:39.130
The same thing happens if you apply the so-called Harman's
single factor test, which assesses whether one factor
00:01:39.130 --> 00:01:45.500
can explain the intercorrelations in the data,
which would be evidence of a common method
00:01:45.500 --> 00:01:48.040
variance problem, using a component analysis.
00:01:48.040 --> 00:01:54.840
The analysis will practically never
indicate that you have a common method variance
00:01:54.840 --> 00:01:57.170
problem even if you actually do.
00:01:57.170 --> 00:02:00.640
So this is not a substitute for factor analysis.
00:02:00.640 --> 00:02:05.860
It's not a factor analysis technique; it's
a data summary technique instead.
00:02:05.860 --> 00:02:06.860
So why do...
00:02:06.860 --> 00:02:09.580
It's not a very useful one for measurement.
00:02:09.580 --> 00:02:12.990
So why do people use principal component analysis?
00:02:12.990 --> 00:02:21.540
The reason is that when you use SPSS and you
do a factor analysis from the menu - you get
00:02:21.540 --> 00:02:23.890
a dialog that looks like this.
00:02:23.890 --> 00:02:29.890
Then when you click on the factor extraction
button here - it gives you different factor
00:02:29.890 --> 00:02:31.080
analysis techniques.
00:02:31.080 --> 00:02:34.410
So it can estimate the factor model in different
ways.
00:02:34.410 --> 00:02:37.580
The default is to do principal component analysis.
00:02:37.580 --> 00:02:40.630
And that's not a factor analysis technique.
00:02:40.630 --> 00:02:45.700
As for the others, whether you use principal
axis factoring, maximum likelihood, or minimum
00:02:45.700 --> 00:02:51.370
residual - it doesn't matter, because they
all estimate the factor analysis model.
00:02:51.370 --> 00:02:55.700
Principal component analysis is not a factor
analysis model because it doesn't discover
00:02:55.700 --> 00:02:59.650
underlying dimensions; instead it summarizes
the data.
00:02:59.650 --> 00:03:05.180
There are really no good reasons to use principal
component analysis in social science research
00:03:05.180 --> 00:03:09.350
because factor analysis can also be used to summarize
data.
00:03:09.350 --> 00:03:14.510
So if you just want to summarize your indicators
with a smaller number of summed variables -
00:03:14.510 --> 00:03:21.010
weighted sums - then factor analysis and principal
component analysis will give you pretty
00:03:21.010 --> 00:03:22.480
similar solutions.
00:03:22.480 --> 00:03:27.850
If you want to assess whether an underlying dimension
explains the data - then factor analysis will
00:03:27.850 --> 00:03:33.090
give you the correct solution under certain
assumptions - principal component analysis
00:03:33.090 --> 00:03:34.090
will not.
00:03:34.090 --> 00:03:39.950
So it's a good rule never to use principal
component analysis in your own research and
00:03:39.950 --> 00:03:45.310
if you see someone using principal component
analysis, or not reporting which factor analysis
00:03:45.310 --> 00:03:51.680
technique they applied while using SPSS, then
it's a good idea to question the authors' choices.
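
The claim that component loadings overestimate indicator reliability can be checked with a small numerical sketch. This is not from the lecture: it assumes a one-factor population in which every indicator has a standardized loading of 0.7, and the helper names (one_factor_correlation, leading_loadings, principal_axis_loadings) are hypothetical, written with NumPy only. The factor model is estimated here with a bare-bones principal axis factoring loop, which is only one of the estimators mentioned above.

```python
import numpy as np


def one_factor_correlation(loading, n_indicators):
    """Population correlation matrix implied by a one-factor model
    in which every indicator has the same standardized loading (an assumption)."""
    lam = np.full(n_indicators, float(loading))
    R = np.outer(lam, lam)        # common variance: lambda_i * lambda_j
    np.fill_diagonal(R, 1.0)      # standardized indicators have variance 1
    return R


def leading_loadings(M):
    """Loadings on the leading dimension: eigenvector scaled by sqrt(eigenvalue)."""
    vals, vecs = np.linalg.eigh(M)            # eigenvalues in ascending order
    v = vecs[:, -1]
    v = v * np.sign(v.sum())                  # fix the arbitrary sign
    return v * np.sqrt(max(vals[-1], 0.0))


def principal_axis_loadings(R, n_iter=200):
    """One-factor principal axis factoring: put communality estimates on the
    diagonal of R and re-estimate them for a fixed number of iterations."""
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))   # start from squared multiple correlations
    for _ in range(n_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)            # only common variance is analyzed
        loadings = leading_loadings(reduced)
        h2 = loadings ** 2
    return loadings


R = one_factor_correlation(loading=0.7, n_indicators=4)
pca_loadings = leading_loadings(R)               # PCA keeps the 1s on the diagonal
paf_loadings = principal_axis_loadings(R)

print("true reliability:        ", 0.7 ** 2)
print("squared PCA loadings:    ", np.round(pca_loadings ** 2, 4))
print("squared factor loadings: ", np.round(paf_loadings ** 2, 4))
```

Under these assumptions the squared component loadings come out at 0.6175 for each indicator, while the squared factor loadings recover the true value of 0.49. The gap arises because the component also absorbs part of each indicator's unique variance, and it widens as the number of indicators shrinks.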