WEBVTT
Kind: captions
Language: en

00:00:00.060 --> 00:00:03.810
Factor analysis answers the question
of what indicators have in common.

00:00:03.810 --> 00:00:10.410
In an exploratory factor analysis you start by
first extracting all common variance from the

00:00:10.410 --> 00:00:17.340
data, and then you take that out from the data.
Then you extract another factor, and so on.

00:00:17.340 --> 00:00:23.400
This process leads to a factor analysis
solution that is often uninterpretable

00:00:23.400 --> 00:00:28.650
because most of the indicators will load
highly on the first factor and then have

00:00:28.650 --> 00:00:32.070
a mixture of positive and negative
loadings on the remaining factors.

00:00:32.070 --> 00:00:38.010
To make the factor analysis results more
interpretable, we do a factor rotation.

00:00:38.010 --> 00:00:45.270
To see what the factor rotation achieves, let's
take the indicators first. So let's assume we

00:00:45.270 --> 00:00:56.790
have here six indicators, and items 1, 2 and 3 vary
together. Items 4, 5 and 6 vary together. So this

00:00:56.790 --> 00:01:03.750
is the score of person number 1 and this is the
score of person number 2 here. And when we do a

00:01:03.750 --> 00:01:11.280
factor analysis, we first extract the first factor.
The factor analysis starts from the origin

00:01:11.280 --> 00:01:19.350
here and asks the question: in which direction
are the data? And it will indicate that all

00:01:19.350 --> 00:01:25.680
the data are in that direction here. So all
the data are to the right and down a bit.

00:01:25.680 --> 00:01:33.630
Then the next thing that we do is that we
eliminate the influence of this factor. So

00:01:33.630 --> 00:01:39.990
we basically shift all these observations
sideways so that they have a zero value

00:01:39.990 --> 00:01:47.520
on this variation, this factor, and then we
extract another factor, asking in which direction

00:01:47.520 --> 00:01:55.590
the observations, or rather the variables, are. Then
the answer is that they are either up or down.

00:01:55.590 --> 00:02:03.900
So they are orthogonal to this factor here,
and with two persons we can use these two

00:02:03.900 --> 00:02:10.980
factors to pinpoint the location of each
indicator. So this indicator, item 1,

00:02:10.980 --> 00:02:17.070
is this much along the first factor and
then that much along the second factor.
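
NOTE
A minimal sketch of the extract-and-remove steps described above, in plain Python. The correlation matrix is hypothetical (0.6 within each item cluster, 0.3 between clusters), and power iteration on the full correlation matrix is an assumption standing in for whatever extraction method your software uses. Under these assumptions every item loads about 0.72 on the first factor, while items 1-3 and items 4-6 get opposite signs on the second.
```python
import math
# Hypothetical correlation matrix: items 1-3 and items 4-6 form two clusters.
def correlation(i, j):
    if i == j:
        return 1.0
    return 0.6 if (i < 3) == (j < 3) else 0.3
R = [[correlation(i, j) for j in range(6)] for i in range(6)]
def extract_factor(matrix, iterations=200):
    """Power iteration: find the dominant direction and return its loadings."""
    n = len(matrix)
    v = [0.0] * (n - 1) + [1.0]  # arbitrary starting direction
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * matrix[i][j] * v[j] for i in range(n) for j in range(n))
    return [x * math.sqrt(eigenvalue) for x in v]
def remove_factor(matrix, loadings):
    """Subtract what a factor explains, leaving the residual matrix."""
    n = len(matrix)
    return [[matrix[i][j] - loadings[i] * loadings[j] for j in range(n)] for i in range(n)]
factor1 = extract_factor(R)
factor2 = extract_factor(remove_factor(R, factor1))
for item, (a, b) in enumerate(zip(factor1, factor2), start=1):
    print(f"item {item}: factor 1 = {a:+.2f}, factor 2 = {b:+.2f}")
```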

00:02:17.070 --> 00:02:22.290
This also indicates that the first factor shows
the overall direction, and the second factor is

00:02:22.290 --> 00:02:29.760
usually positive or negative depending on
which direction we go along that factor: do we

00:02:29.760 --> 00:02:35.550
go up or down. And the problem of course is
that if we have to summarize, if we want to

00:02:35.550 --> 00:02:41.130
summarize this data, then we would say that
this group of indicators is in this direction

00:02:41.130 --> 00:02:47.280
and the other group is in that direction. So
the factor analysis really doesn't reflect that

00:02:47.280 --> 00:02:53.160
dimensionality, even if it allows us to summarize
these indicators and give them coordinates.

00:02:53.160 --> 00:02:58.980
So the problem is that the first
factor explains a little bit of

00:02:58.980 --> 00:03:03.630
every indicator and then the second factor
has positive or negative loadings, and they

00:03:03.630 --> 00:03:07.980
don't really explain where the data are
in a way that is easy to interpret.

00:03:07.980 --> 00:03:13.890
The purpose of a factor rotation is that we
try to reorient the factor analysis solution

00:03:13.890 --> 00:03:22.440
so that indicators load highly on one factor
and one factor only. So we try to maximize

00:03:22.440 --> 00:03:28.290
each indicator's largest factor loading
and minimize all other factor loadings.
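
NOTE
One way to make "maximize the largest loading, minimize the rest" concrete is the raw varimax criterion, used here only as an assumed example of such a criterion. This sketch takes hypothetical unrotated loadings for six items and runs a coarse whole-degree grid search, not the iterative algorithm a statistics package would use.
```python
import math
# Hypothetical unrotated loadings (factor 1, factor 2) for six items.
UNROTATED = [(0.70, -0.45), (0.68, -0.40), (0.72, -0.42),
             (0.66, 0.44), (0.71, 0.41), (0.69, 0.43)]
def rotate(loadings, degrees):
    """Turn both factor axes clockwise by the same angle (they stay at 90 degrees)."""
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [(a * c - b * s, a * s + b * c) for a, b in loadings]
def varimax_criterion(loadings):
    """Raw varimax: for each factor, the variance of its squared loadings, summed."""
    total = 0.0
    for j in range(2):
        squares = [row[j] ** 2 for row in loadings]
        mean = sum(squares) / len(squares)
        total += sum((x - mean) ** 2 for x in squares) / len(squares)
    return total
def best_angle(loadings):
    """Coarse grid search over 0-89 degrees for the highest criterion value."""
    return max(range(90), key=lambda d: varimax_criterion(rotate(loadings, d)))
angle = best_angle(UNROTATED)
print(f"criterion before rotation: {varimax_criterion(UNROTATED):.3f}")
print(f"best angle: {angle} degrees, criterion: {varimax_criterion(rotate(UNROTATED, angle)):.3f}")
```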

00:03:28.290 --> 00:03:34.110
It also makes the variances more
equal. So here the first factor

00:03:34.110 --> 00:03:39.990
actually explains more variation than

00:03:39.990 --> 00:03:42.570
the second factor, because all the
indicators are in this direction.

00:03:42.570 --> 00:03:49.350
So there are different techniques, and the
techniques come in two variants. We have

00:03:49.350 --> 00:03:54.420
oblique and orthogonal rotation. Orthogonal rotation
keeps the factors uncorrelated.

00:03:54.420 --> 00:04:02.220
So we kind of take the factor solution here
and then we rotate it around the origin

00:04:02.220 --> 00:04:07.500
like that. So we rotate those two arrows so
that they point more toward the clusters of

00:04:07.500 --> 00:04:14.250
the observations, like so. So we rotate it a
bit, about 45 degrees or a bit less, and

00:04:14.250 --> 00:04:20.730
now the first factor points in the direction of
items 1, 2 and 3 and the second factor points to

00:04:20.730 --> 00:04:29.850
items 4, 5 and 6. But these factors still don't
point exactly to where the items are because

00:04:29.850 --> 00:04:34.380
we are constraining the factors to be
uncorrelated. So this is a 90 degree angle.
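
NOTE
A small sketch of the orthogonal rotation just described, using hypothetical unrotated loadings and a fixed 45 degree turn (an illustrative angle, not one chosen by any rotation algorithm). Each item ends up loading mainly on one factor and its communality is unchanged, but because the two hypothetical clusters are less than 90 degrees apart, small cross-loadings remain.
```python
import math
# Hypothetical unrotated loadings (factor 1, factor 2) for six items.
UNROTATED = [(0.70, -0.45), (0.68, -0.40), (0.72, -0.42),
             (0.66, 0.44), (0.71, 0.41), (0.69, 0.43)]
def rotate(loadings, degrees):
    """Turn both factor axes clockwise by the same angle (they stay at 90 degrees)."""
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [(a * c - b * s, a * s + b * c) for a, b in loadings]
def cluster_angle(points):
    """Direction, in degrees, of a cluster's centroid as seen from the origin."""
    x = sum(p[0] for p in points) / len(points)
    y = sum(p[1] for p in points) / len(points)
    return math.degrees(math.atan2(y, x))
rotated = rotate(UNROTATED, 45)
for item, ((a, b), (ra, rb)) in enumerate(zip(UNROTATED, rotated), start=1):
    print(f"item {item}: unrotated ({a:+.2f}, {b:+.2f}) rotated ({ra:+.2f}, {rb:+.2f})")
# The clusters are closer than 90 degrees, so right-angled axes cannot hit both.
gap = cluster_angle(UNROTATED[3:]) - cluster_angle(UNROTATED[:3])
print(f"angle between the two item clusters: {gap:.0f} degrees")
```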

00:04:34.380 --> 00:04:40.830
When we relax that assumption, we can
actually draw the lines so that the

00:04:40.830 --> 00:04:46.230
factors are correlated: when this factor is
higher, then this factor can be higher as

00:04:46.230 --> 00:04:53.400
well. And now the arrows show that the first
three items are in this direction and the second

00:04:53.400 --> 00:04:57.540
three items are in that direction,
and that's the idea of factor rotation.

00:04:57.540 --> 00:05:04.860
So you reorient the factor analysis solution
to make it simpler to interpret.
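
NOTE
A sketch of the oblique idea under stated assumptions: the loadings are hypothetical, and the two axes are simply drawn by hand through the centroid of each item cluster. This is not the oblimin algorithm, only an illustration of what dropping the 90 degree constraint buys. The factor correlation is the cosine of the angle between the two axes.
```python
import math
# Hypothetical unrotated loadings (factor 1, factor 2) for six items.
UNROTATED = [(0.70, -0.45), (0.68, -0.40), (0.72, -0.42),
             (0.66, 0.44), (0.71, 0.41), (0.69, 0.43)]
def unit_axis(points):
    """Unit vector from the origin toward the centroid of a cluster of items."""
    x = sum(p[0] for p in points) / len(points)
    y = sum(p[1] for p in points) / len(points)
    length = math.hypot(x, y)
    return (x / length, y / length)
def oblique_pattern(loadings, axis1, axis2):
    """Express each item as p1 * axis1 + p2 * axis2 (a 2x2 linear solve)."""
    det = axis1[0] * axis2[1] - axis1[1] * axis2[0]
    return [((a * axis2[1] - b * axis2[0]) / det,
             (axis1[0] * b - axis1[1] * a) / det) for a, b in loadings]
axis1 = unit_axis(UNROTATED[:3])
axis2 = unit_axis(UNROTATED[3:])
# Correlation between the two oblique factors: cosine of the angle between the axes.
factor_correlation = axis1[0] * axis2[0] + axis1[1] * axis2[1]
pattern = oblique_pattern(UNROTATED, axis1, axis2)
print(f"factor correlation = {factor_correlation:.2f}")
for item, (p1, p2) in enumerate(pattern, start=1):
    print(f"item {item}: factor 1 = {p1:+.2f}, factor 2 = {p2:+.2f}")
```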

00:05:04.860 --> 00:05:11.640
So do you have to understand what exactly
this or that factor rotation does? The answer

00:05:11.640 --> 00:05:16.080
is no, because there is a simple
rule of thumb that you can apply.

00:05:16.080 --> 00:05:23.610
The rule of thumb is to always
use oblimin rotation, because it's

00:05:23.610 --> 00:05:28.290
theoretically the most appealing for
many scenarios, and in particular it is

00:05:28.290 --> 00:05:33.600
an oblique rotation. If your factors
are supposed to represent constructs

00:05:33.600 --> 00:05:37.950
that are correlated, which is the
case if we make a theory about those

00:05:37.950 --> 00:05:44.280
constructs, then constraining the factors to
be uncorrelated doesn't make any sense.

00:05:44.280 --> 00:05:51.900
Varimax rotation is often the default and
it's an orthogonal rotation, so you should never

00:05:51.900 --> 00:05:58.260
use that one. The reason why varimax
is the default is because of history.

00:05:58.260 --> 00:06:05.100
Factor analysis has decades of history, and
when factor analysis was introduced we

00:06:05.100 --> 00:06:10.290
really didn't have computers, so people
were doing hand calculations, and the

00:06:10.290 --> 00:06:16.440
varimax rotation is much simpler to calculate
than the oblimin rotation. But nowadays the

00:06:16.440 --> 00:06:24.600
computer will do both of these for you
instantaneously, so the amount of computation

00:06:24.600 --> 00:06:28.890
is a non-issue. You should really go with
direct oblimin instead of anything else.

00:06:28.890 --> 00:06:39.120
And when you look at articles, they actually
report that oblimin is used. This is a pretty

00:06:39.120 --> 00:06:45.840
nice way of reporting a factor analysis, from
this information systems research paper. So

00:06:45.840 --> 00:06:51.510
the authors report that they conducted an
exploratory factor analysis. They did an

00:06:51.510 --> 00:06:56.970
oblimin rotation. They also explain why they
did an oblimin rotation: because they want

00:06:56.970 --> 00:07:01.770
the factors to be correlated. And you only
need one sentence and two lines for that.

00:07:01.770 --> 00:07:06.540
So that's a really nice way of reporting that
you actually did the factor analysis correctly.