WEBVTT
Kind: captions
Language: en

00:00:00.060 --> 00:00:02.850
We will now take a look at the estimation of

00:00:02.850 --> 00:00:06.240
factor models and particularly the
confirmatory factor analysis model.

00:00:06.240 --> 00:00:12.450
This is important to understand because
sometimes your factor analysis results

00:00:12.450 --> 00:00:17.910
show that the model doesn't fit the data,
as indicated by the chi-square statistic.

00:00:17.910 --> 00:00:24.690
Then you have to understand what to do. And to
understand what to do, you have to understand

00:00:24.690 --> 00:00:30.840
what factor analysis actually does and what
kind of relationships it models in the data.

00:00:30.840 --> 00:00:35.370
So let's take a look at how confirmatory
factor analysis models are estimated.

00:00:35.370 --> 00:00:41.430
The idea in confirmatory factor analysis model
estimation is that you apply tracing rules. So

00:00:41.430 --> 00:00:46.470
this is the same thing that you apply in
mediation models or in regression models

00:00:46.470 --> 00:00:52.050
if you estimate them from a correlation
matrix. We have a factor model here

00:00:52.050 --> 00:01:00.510
and we can specify that the correlations
between a1 and a2, a1 and b1, and a1 with

00:01:00.510 --> 00:01:05.160
itself - which is the variance - are
functions of these model parameters.

00:01:05.160 --> 00:01:10.350
We use the Greek letter Phi for the
factor correlation. That's

00:01:10.350 --> 00:01:15.750
a convention. And then we use lambdas for
factor loadings. That's also a convention.

00:01:15.750 --> 00:01:18.630
All of these lambdas are different
lambdas, so they have different values.

00:01:18.630 --> 00:01:28.080
So the correlation between a1 and a2 is given by
the different paths we can take from a1 to a2. So we

00:01:28.080 --> 00:01:34.410
can go up here and then we go down, and that's
one path, and there are no other paths from a1

00:01:34.410 --> 00:01:39.840
to a2. So we multiply everything along
the way. So we have one factor loading

00:01:39.840 --> 00:01:47.520
and then we have another factor loading,
and that's lambda a1 times lambda a2, and

00:01:47.520 --> 00:01:51.570
that's the correlation of a1 and a2, assuming
that these are standardized estimates.

00:01:51.570 --> 00:02:02.640
Then a1 b1 is calculated similarly. The path
is: we go from a1 to A, then we take the

00:02:02.640 --> 00:02:12.360
factor correlation, and then we go from B to b1.
So that's the correlation between a1 and b1.

00:02:13.330 --> 00:02:20.470
For the variance of a1, we have two different ways
to go somewhere and come back. So we can go to A

00:02:20.470 --> 00:02:28.900
and come back, and we can go to the error
term E and come back. So that's the variance of a1.
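
NOTE
The tracing rules described so far can be sketched in a few lines of Python.
This is a minimal illustration, not the lecturer's own code. The loading
values, the factor correlation, the indicator names a3, b2 and b3, and the
choice of error variances (1 minus the squared loading) are all assumptions
made for the example; the lecture only names a1, a2 and b1.
```python
# Sketch of the tracing rules for a two-factor model (all values hypothetical).
# Assumed layout: indicators a1..a3 load on factor A, b1..b3 on factor B.
loadings = {"a1": 0.8, "a2": 0.7, "a3": 0.6, "b1": 0.9, "b2": 0.8, "b3": 0.7}
# Assumed error variances: 1 - loading^2, so each implied variance equals 1.
error_variances = {name: 1.0 - lam ** 2 for name, lam in loadings.items()}
phi = 0.5  # assumed correlation between factors A and B
def factor_of(indicator):
    """Return the factor an indicator loads on, based on its name prefix."""
    return indicator[0].upper()
def implied(x, y):
    """Model-implied variance or correlation of indicators x and y."""
    if x == y:
        # Two round trips: to the factor and back, to the error term and back.
        return loadings[x] ** 2 + error_variances[x]
    if factor_of(x) == factor_of(y):
        # One path: up to the shared factor, then down to the other indicator.
        return loadings[x] * loadings[y]
    # One path: up to one factor, across the factor correlation, then down.
    return loadings[x] * phi * loadings[y]
corr_a1_a2 = implied("a1", "a2")  # lambda_a1 * lambda_a2
corr_a1_b1 = implied("a1", "b1")  # lambda_a1 * phi * lambda_b1
var_a1 = implied("a1", "a1")      # lambda_a1^2 + error variance of a1
print(corr_a1_a2, corr_a1_b1, var_a1)
```
Each branch multiplies everything along one path, which is what the
tracing rules prescribe for standardized estimates.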

00:02:28.900 --> 00:02:37.330
And how we estimate this model, again, is that
we calculate the model-implied correlations between all

00:02:37.330 --> 00:02:43.120
indicators and we try to adjust the model so
that these correlations match the observed data.

00:02:43.120 --> 00:02:52.720
Here we have positive degrees of freedom.
So we are estimating altogether 13 different

00:02:52.720 --> 00:02:58.060
things from the data. So we have six factor
loadings. We have six error variances and then

00:02:58.060 --> 00:03:06.670
we have one correlation. So six plus six plus
one is 13, and we have 21 units of information

00:03:06.670 --> 00:03:14.920
because we have 21 unique elements in the correlation
matrix of 6 indicators. So we have 6 variances

00:03:14.920 --> 00:03:23.050
and then we have 15 unique correlations. So
these don't count because they're not unique.
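
NOTE
The counting above can be checked with a short Python sketch. The function
name degrees_of_freedom is hypothetical and introduced only for this
example; the numbers are the ones given in the lecture.
```python
def degrees_of_freedom(n_indicators, n_free_parameters):
    """Unique elements of the correlation matrix minus free parameters."""
    # p variances plus p * (p - 1) / 2 unique correlations.
    n_unique = n_indicators * (n_indicators + 1) // 2
    return n_unique - n_free_parameters
n_free = 6 + 6 + 1  # six loadings, six error variances, one factor correlation
df = degrees_of_freedom(6, n_free)
print(df)  # 8
```
With 6 indicators there are 21 unique elements, so 21 minus 13 free
parameters leaves 8.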

00:03:23.050 --> 00:03:29.530
The degrees of freedom are 8, which means that
we have positive degrees of freedom and

00:03:29.530 --> 00:03:35.080
the model is then overidentified.
That means that we cannot

00:03:35.080 --> 00:03:42.640
typically solve it exactly. So we cannot find
a set of model parameters such that every

00:03:44.380 --> 00:03:48.790
model-implied correlation
would match the observed correlation.

00:03:48.790 --> 00:03:57.370
So we cannot solve it. We have to just
find a way to quantify the difference

00:03:57.370 --> 00:04:02.350
between the implied correlations and the
observed correlations. We could take a sum

00:04:02.350 --> 00:04:08.050
of squared differences, which would be the unweighted
least squares estimator. Typically we take a weighted

00:04:08.050 --> 00:04:14.140
sum of these implied correlations minus the
observed correlations, and a particular set of

00:04:14.140 --> 00:04:20.680
weights produces the maximum likelihood
estimator for this particular model.
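
NOTE
A minimal Python sketch of the discrepancy idea, assuming small hypothetical
3x3 matrices rather than the 6-indicator model from the lecture. The function
name discrepancy and its weights argument are assumptions for illustration.
With no weights it is the unweighted least squares criterion; the specific
weights that yield maximum likelihood are not derived here.
```python
def discrepancy(implied, observed, weights=None):
    """Weighted sum of squared differences over the unique matrix elements."""
    total = 0.0
    for i in range(len(observed)):
        for j in range(i + 1):  # lower triangle including the diagonal
            weight = 1.0 if weights is None else weights[i][j]
            total += weight * (implied[i][j] - observed[i][j]) ** 2
    return total
# Hypothetical correlation matrices, for illustration only.
observed = [[1.0, 0.5, 0.4], [0.5, 1.0, 0.3], [0.4, 0.3, 1.0]]
implied = [[1.0, 0.6, 0.4], [0.6, 1.0, 0.2], [0.4, 0.2, 1.0]]
uls = discrepancy(implied, observed)  # two elements differ by 0.1 each
print(uls)
```
Estimation then amounts to searching for the parameter values whose
implied matrix makes this number as small as possible.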

00:04:20.680 --> 00:04:30.220
So the idea is that we find the model parameters
so that the implied correlations are as close

00:04:30.220 --> 00:04:36.040
to the observed correlations as possible.
To do that, there are some other things

00:04:36.040 --> 00:04:42.220
that we need to consider before we can
actually estimate the model. Those relate

00:04:42.220 --> 00:04:46.690
to identification and scale setting,
which I'll describe in the next video.