WEBVTT

00:00:00.180 --> 00:00:06.270
This video will introduce you to regression analysis assumptions,

00:00:06.270 --> 00:00:12.030
or specifically, the assumptions that the least squares estimation principle makes.

00:00:12.030 --> 00:00:15.150
So the idea of least squares estimation

00:00:15.150 --> 00:00:19.860
or the regression model is that we have one dependent variable y.

00:00:19.860 --> 00:00:23.250
And in this example, we have one independent variable x.

00:00:23.250 --> 00:00:28.140
And we draw a line through the middle of the scatter plot of the data.

00:00:28.140 --> 00:00:30.930
And regression analysis assumes that

00:00:30.930 --> 00:00:34.770
these observations are equally spread out around this line.

00:00:34.770 --> 00:00:37.500
So the dispersion of observations is the

00:00:37.500 --> 00:00:42.120
same here as it is there, all along the line.

00:00:42.120 --> 00:00:46.590
So each individual observation in our data falls somewhere around this line;

00:00:46.590 --> 00:00:50.730
some fall exactly on the line, some a bit further from it.

00:00:50.730 --> 00:00:54.630
We also assume that when we know that x is one,

00:00:54.630 --> 00:01:00.420
then the values of y are normally distributed around the regression line.

00:01:00.420 --> 00:01:04.050
So that's basically a summary of the assumptions.

00:01:04.050 --> 00:01:09.330
And now we will take a look at specific parts of those assumptions.

00:01:09.330 --> 00:01:13.650
Before we do so, we have to talk a bit about what the assumptions mean,

00:01:13.650 --> 00:01:14.970
because there are some misconceptions.

00:01:14.970 --> 00:01:19.920
For example, sometimes students in my classes say

00:01:19.920 --> 00:01:24.030
that an estimation technique requires that the data are normally distributed,

00:01:24.030 --> 00:01:28.980
and they think it implies that the estimation technique cannot be applied

00:01:28.980 --> 00:01:33.390
when the data are not normal. That has two problems.

00:01:33.390 --> 00:01:37.350
First of all, we rarely make assumptions about the distribution of observed data.

00:01:37.350 --> 00:01:44.850
And second, the fact that an assumption doesn't

00:01:44.850 --> 00:01:49.230
hold exactly doesn't mean that the estimator is immediately useless.

00:01:49.230 --> 00:01:52.800
Let's start with examples of models and estimators,

00:01:52.800 --> 00:01:54.720
so that we understand what assumptions mean.

00:01:54.720 --> 00:01:56.520
So here's the regression model.

00:01:56.520 --> 00:02:01.200
It says that y is a weighted sum of the x's, the observed independent variables,

00:02:01.200 --> 00:02:04.470
plus some error term u that the model doesn't explain.

00:02:04.470 --> 00:02:07.890
Then we have estimators, the estimation principles:

00:02:07.890 --> 00:02:11.040
how do we choose the betas, and which set of betas is the best?

00:02:11.040 --> 00:02:17.490
And one good rule is the OLS rule: minimize the sum of squared residuals.

00:02:17.490 --> 00:02:20.010
So we choose the betas

00:02:20.010 --> 00:02:22.140
so that the sum of squared residuals,

00:02:22.140 --> 00:02:28.200
that is, the differences between the observed values y and the fitted values from the betas,

00:02:28.200 --> 00:02:33.810
is as small as possible. That's what this part expresses.

00:02:33.810 --> 00:02:36.570
But that's not the only way of estimating a regression model.

00:02:36.570 --> 00:02:40.590
For example, we could use weighted least squares.

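NOTE
Editor's sketch (not part of the narration): the OLS rule just described,
choosing betas to minimize the sum of squared residuals, on a simulated
one-regressor dataset. All names and numbers here are illustrative assumptions.
import numpy as np
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
u = rng.normal(size=n)                 # error term the model doesn't explain
y = 1.0 + 2.0 * x + u                  # true model: y = beta0 + beta1*x + u
X = np.column_stack([np.ones(n), x])   # design matrix with an intercept column
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)  # solves min_b sum((y - X@b)**2)
residuals = y - X @ beta_hat
print("betas:", beta_hat, "SSR:", np.sum(residuals**2))
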
00:02:40.590 --> 00:02:44.580
So weighted least squares is the same as OLS,

00:02:44.580 --> 00:02:48.360
except that instead of minimizing the sum of squared residuals,

00:02:48.360 --> 00:02:54.000
we minimize the weighted sum of squared residuals, or sum of weighted squared residuals.

00:02:54.000 --> 00:02:59.430
The idea of weighted least squares is that some observations provide

00:02:59.430 --> 00:03:03.210
us more information about where the regression line goes than others.

00:03:03.210 --> 00:03:08.580
And in some scenarios, weighted least squares is better than OLS.

00:03:08.580 --> 00:03:10.830
To understand what those scenarios are,

00:03:10.830 --> 00:03:14.730
we have to understand the assumptions. But that's not all; we also have other estimators.

00:03:14.730 --> 00:03:19.020
So there's feasible generalized least squares,

00:03:19.020 --> 00:03:24.840
which is the same as weighted least squares, except that it estimates the weights from the data.

00:03:24.840 --> 00:03:29.010
So it makes somewhat fewer assumptions than weighted least squares, and there are trade-offs in that.

00:03:29.010 --> 00:03:34.140
We also have iteratively reweighted least squares, or IRLS.

00:03:34.140 --> 00:03:39.300
The idea of IRLS is that it weights the residuals iteratively,

00:03:39.300 --> 00:03:45.000
and the weights for the next iteration are based on the previous iteration.

00:03:45.000 --> 00:03:50.220
And this is a good technique when you have outlier observations, which I talk about in another video.

00:03:50.220 --> 00:03:56.250
So all of these techniques can be used in different scenarios. They all work reasonably

00:03:56.250 --> 00:04:01.800
well in some conditions, and in some conditions one of these rules is clearly better than the others.

00:04:01.800 --> 00:04:04.440
To understand that, we have to understand the assumptions.

00:04:04.440 --> 00:04:09.360
The same applies to models: we can use different models.

00:04:09.360 --> 00:04:13.860
So the regression model is not necessarily the best model.

00:04:13.860 --> 00:04:20.460
For example, instead of the regression model, we could apply a generalized linear model, which

00:04:20.460 --> 00:04:25.050
takes the fitted values from regression analysis and applies a function to them.

00:04:25.050 --> 00:04:29.220
And then it doesn't make the assumption that observations are normally distributed.

00:04:29.220 --> 00:04:32.310
So that's one alternative model.

00:04:32.310 --> 00:04:37.410
So you can choose either an alternative model or an alternative estimator when your data

00:04:37.410 --> 00:04:42.750
don't really fit the model and estimator combination that you were planning to use.

00:04:42.750 --> 00:04:45.630
Here's another one: this is a multilevel model.

00:04:45.630 --> 00:04:50.568
And this would be applicable when you have, for example, longitudinal data.

00:04:50.568 --> 00:04:53.010
So you have multiple observations for each company,

00:04:53.010 --> 00:04:56.430
and many companies in the data, and you assume that there are

00:04:56.430 --> 00:05:00.480
some constant differences between companies that persist over time.

00:05:00.480 --> 00:05:03.780
And then you would use that kind of model, because you are

00:05:03.780 --> 00:05:08.340
in violation of the random sampling assumption in regression analysis.

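NOTE
Editor's sketch (illustrative, not from the video): the weighted least squares
idea described above. Here the error spread is simulated to grow with x, and
the true weights (inverse error variances) are assumed known, which real WLS
applications usually cannot assume.
import numpy as np
rng = np.random.default_rng(1)
n = 500
x = rng.uniform(1.0, 5.0, size=n)
sigma = 0.5 * x                            # assumed: noise grows with x
y = 1.0 + 2.0 * x + rng.normal(scale=sigma)
X = np.column_stack([np.ones(n), x])
def wls(X, y, w):
    # Rescaling rows by sqrt(w) turns min_b sum(w_i * r_i**2) into plain OLS.
    sw = np.sqrt(w)
    return np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
print("OLS:", wls(X, y, np.ones(n)))       # equal weights: ordinary least squares
print("WLS:", wls(X, y, 1.0 / sigma**2))   # low-noise observations weighted more
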
00:05:08.340 --> 00:05:12.270
So there are different things that you can use.

00:05:12.270 --> 00:05:18.330
I recommend always going with regression analysis and OLS estimation as the default option;

00:05:18.330 --> 00:05:22.080
if you have a good reason to use something else, then do that.

00:05:22.080 --> 00:05:27.990
But start with OLS and the regression model, because it will tell you something about the

00:05:27.990 --> 00:05:31.920
data that you didn't know before estimation. And it's quick to calculate.

00:05:31.920 --> 00:05:34.230
Then you go to more complicated things,

00:05:34.230 --> 00:05:41.220
if specific assumptions of OLS don't really fit into your research scenario.

00:05:41.220 --> 00:05:48.720
Okay. So what are the assumptions? Assumptions are required for certain proofs.

00:05:48.720 --> 00:05:55.710
So when we say that OLS requires that the error term is normally distributed,

00:05:55.710 --> 00:06:01.350
it means that it has been proven that OLS is consistent, unbiased,

00:06:01.350 --> 00:06:03.390
efficient, and the estimates are normal,

00:06:03.390 --> 00:06:06.390
when, among other assumptions, the error term is normally distributed.

00:06:06.390 --> 00:06:10.140
So certain proofs require these assumptions.

00:06:10.140 --> 00:06:15.540
If we can't assume certain things, then the proof can't be done.

00:06:15.540 --> 00:06:18.150
So, if the error term is not normally distributed,

00:06:18.150 --> 00:06:26.670
then we cannot prove that the OLS estimates are normally distributed in small samples.

00:06:26.670 --> 00:06:28.800
They could be, but we can't prove it.

00:06:28.800 --> 00:06:36.660
So these assumptions imply one important thing, and they don't imply another thing.

00:06:36.660 --> 00:06:44.640
What they do imply is that the estimator is useful when we are close to these ideal conditions.

00:06:44.640 --> 00:06:51.360
So regression analysis assumes that the relationships in the data are linear;

00:06:51.360 --> 00:06:55.290
if they are close to linear, but not exactly linear,

00:06:55.290 --> 00:06:57.660
regression analysis will be a useful tool.

00:06:57.660 --> 00:07:00.030
So these assumptions don't have to hold exactly.

00:07:00.030 --> 00:07:05.820
If they are close enough, then we will still get good results.

00:07:05.820 --> 00:07:13.650
Also, they don't imply that if an estimator has been proven to be consistent under some scenario,

00:07:13.650 --> 00:07:17.880
then it's immediately useless in other scenarios.

00:07:17.880 --> 00:07:20.820
So the fact that something has been proven in

00:07:20.820 --> 00:07:25.350
one condition doesn't mean that it cannot work in another condition.

00:07:25.350 --> 00:07:29.700
But it's important to understand the limitations of these different techniques.

00:07:29.700 --> 00:07:36.720
And for that, we typically test the assumptions after we do our analysis.

00:07:36.720 --> 00:07:42.450
Now we have understood that the assumptions are something that should ideally hold,

00:07:42.450 --> 00:07:46.500
but in practice, they hold only approximately.

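NOTE
Editor's sketch (simulation designed by the editor, not from the video) of the
point that assumptions only need to hold approximately: the true relationship
below is mildly nonlinear, yet the OLS slope is still a useful summary.
import numpy as np
rng = np.random.default_rng(2)
n = 10_000
x = rng.uniform(0.0, 1.0, size=n)
y = 2.0 * x + 0.2 * x**2 + rng.normal(scale=0.5, size=n)  # slight curvature
X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
print("slope:", beta[1])   # close to 2.2, the average derivative of the truth
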
00:07:46.500 --> 00:07:51.660
And we have also understood that if we are in violation of,

00:07:51.660 --> 00:07:55.440
for example, the normality assumption in regression analysis,

00:07:55.440 --> 00:07:58.140
it doesn't necessarily have any severe consequences.

00:07:58.140 --> 00:08:00.360
It just means that certain things cannot be proven;

00:08:00.360 --> 00:08:03.630
the thing that we can't prove could still be true.

00:08:03.630 --> 00:08:06.300
Let's take a look at the actual assumptions.

00:08:06.300 --> 00:08:14.220
Regression analysis, or rather the OLS estimator,

00:08:14.220 --> 00:08:18.480
requires four assumptions to provide you consistent and unbiased estimates.

00:08:18.480 --> 00:08:22.050
And the unbiasedness property here refers to any sample size.

00:08:22.050 --> 00:08:25.860
So regression analysis is unbiased regardless of the sample size.

00:08:25.860 --> 00:08:29.070
You can get unbiased estimates with a sample of 10 observations.

00:08:29.070 --> 00:08:32.490
The estimates will be very imprecise, but they're still unbiased.

00:08:32.490 --> 00:08:35.790
The first assumption is that we have a linear model.

00:08:35.790 --> 00:08:38.820
So that assumption basically just defines the model.

00:08:38.820 --> 00:08:41.700
And that's all there is to it.

00:08:41.700 --> 00:08:44.550
Then the second assumption is random sampling.

00:08:44.550 --> 00:08:47.790
So random sampling means that all observations are independent,

00:08:47.790 --> 00:08:55.230
and each observation in the population has an equal probability of getting selected into the sample.

00:08:55.230 --> 00:08:57.720
This is a feature of your research design.

00:08:57.720 --> 00:09:00.930
And it can't really be tested empirically, directly;

00:09:01.500 --> 00:09:06.090
you can test some aspects of this random sampling.

00:09:06.090 --> 00:09:07.950
And I will talk about that later.

00:09:07.950 --> 00:09:11.970
Then we have two other assumptions.

00:09:11.970 --> 00:09:15.315
Assumption three is that there is no perfect collinearity.

00:09:15.315 --> 00:09:18.360
Perfect collinearity is different from multicollinearity.

00:09:18.360 --> 00:09:25.140
Perfect collinearity means that one or more of the

00:09:25.140 --> 00:09:30.600
independent variables in the model are completely determined by the other independent variables.

00:09:30.600 --> 00:09:38.070
So for example, suppose we have three dummy variables that define a categorical variable.

00:09:38.070 --> 00:09:42.210
If we know the values of two of the dummies, then we can infer the third.

00:09:42.210 --> 00:09:48.210
That assumption requires that every new variable that we

00:09:48.210 --> 00:09:51.750
enter into the model brings new information about the phenomenon.

00:09:51.750 --> 00:09:56.040
Let's use gender as an example.

00:09:56.040 --> 00:09:59.580
We only need to know whether a person is or is not a male.

00:09:59.580 --> 00:10:06.240
If the person is not a male, then we know that she is a female. So having

00:10:06.240 --> 00:10:11.010
both a variable for male and a variable for female would be perfectly collinear,

00:10:11.010 --> 00:10:15.480
because knowing whether a person is a man automatically tells you

00:10:15.480 --> 00:10:17.760
whether the same person is a woman or not.

00:10:17.760 --> 00:10:20.040
So that's perfect collinearity.

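NOTE
Editor's sketch following the video's gender example (data simulated by the
editor): entering both a "male" and a "female" dummy alongside an intercept
is perfect collinearity, so the design matrix loses full column rank.
import numpy as np
rng = np.random.default_rng(3)
n = 100
male = rng.integers(0, 2, size=n).astype(float)
female = 1.0 - male                       # completely determined by `male`
X = np.column_stack([np.ones(n), male, female])
print(np.linalg.matrix_rank(X))           # prints 2, not 3: a column is redundant
# np.linalg.inv(X.T @ X) would raise LinAlgError here: the normal equations
# have no unique solution, so the regression cannot be estimated as-is.
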
00:10:20.040 --> 00:10:26.310
Then there is the zero conditional mean assumption. That is a technical way of expressing it,

00:10:26.310 --> 00:10:29.580
but it basically tells you that we assume that

00:10:29.580 --> 00:10:34.830
the error term is uncorrelated with all explanatory variables.

00:10:34.830 --> 00:10:40.950
And this is a bit more complicated assumption that I'll explain in another video,

00:10:40.950 --> 00:10:45.210
but this is also referred to as the no endogeneity assumption.

00:10:45.210 --> 00:10:49.860
And if we look at this diagram of regression analysis,

00:10:49.860 --> 00:10:59.640
then this assumption number four can be understood as saying that where this distribution is located

00:10:59.640 --> 00:11:03.330
doesn't drift away from the regression line.

00:11:03.330 --> 00:11:09.120
So the distribution is always centered exactly at the regression line, instead of, for example,

00:11:09.120 --> 00:11:14.070
the line going here and the observations being normally distributed somewhere over here.

00:11:14.070 --> 00:11:19.470
So that is called the no endogeneity assumption, and endogeneity is a big issue

00:11:19.470 --> 00:11:23.730
if we want to make causal claims using observational data.

00:11:23.730 --> 00:11:26.100
I'll return to that in another video.

00:11:26.100 --> 00:11:31.980
So under these four assumptions, OLS is unbiased and consistent.

00:11:31.980 --> 00:11:40.680
We still have two more assumptions that OLS makes, which are required for the consistency

00:11:40.680 --> 00:11:44.250
and unbiasedness of the standard errors, and for the normality of the estimates.

00:11:44.250 --> 00:11:54.675
Standard errors are unbiased and consistent if the data, or the error term, is homoskedastic,

00:11:54.675 --> 00:11:55.770
so there is no heteroskedasticity.

00:11:55.770 --> 00:12:03.600
What this assumption means is that the observations are equally spread out around the regression line.

00:12:03.600 --> 00:12:08.730
We would have a heteroskedasticity problem if the observations are close

00:12:08.730 --> 00:12:12.570
to the regression line here, but far from the regression line here.

00:12:12.570 --> 00:12:17.250
So instead of observing a band of observations around the regression line,

00:12:17.250 --> 00:12:23.640
we would observe a funnel or megaphone shape that opens up.

00:12:23.640 --> 00:12:27.780
So that's the homoskedasticity assumption.

00:12:27.780 --> 00:12:32.850
These five assumptions together are known as the

00:12:32.850 --> 00:12:37.770
Gauss-Markov assumptions, and OLS is efficient under these assumptions.

00:12:37.770 --> 00:12:41.370
But more importantly, the homoskedasticity

00:12:41.370 --> 00:12:45.180
assumption is required for the standard errors to be unbiased and consistent.

00:12:45.180 --> 00:12:52.890
That is important because the t statistic for our statistical inference, for the p value, requires

00:12:52.890 --> 00:12:59.010
that both the estimate and the standard error are consistent and unbiased. Under those conditions,

00:12:59.010 --> 00:13:02.580
the t value will follow the t distribution when

00:13:02.580 --> 00:13:07.290
the null hypothesis of no effect holds, and we get proper p values.

00:13:07.290 --> 00:13:09.300
So that's the fifth assumption.

00:13:10.050 --> 00:13:15.900
Then the final one, the one that most people are probably

00:13:15.900 --> 00:13:19.980
most aware of, is the normality assumption.

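NOTE
Editor's sketch (simulated data, editor's numbers): the "funnel" shape of
heteroskedasticity described above. The residual spread near the start of
the line is much smaller than at the end, violating assumption five.
import numpy as np
rng = np.random.default_rng(4)
n = 1_000
x = np.sort(rng.uniform(1.0, 5.0, size=n))
y = 1.0 + 2.0 * x + rng.normal(scale=0.5 * x)  # noise grows with x: a funnel
X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
r = y - X @ beta
print("spread at small x:", r[: n // 2].std())   # tight around the line
print("spread at large x:", r[n // 2 :].std())   # much wider: heteroskedastic
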
00:13:19.980 --> 00:13:22.890
So, this is also often misunderstood:

00:13:22.890 --> 00:13:27.300
regression analysis does not assume that any observed variable is normally distributed.

00:13:27.840 --> 00:13:34.920
Instead, it assumes that the error term, the unobservable, or how much the observations

00:13:34.920 --> 00:13:38.190
vary around the regression line, is normally distributed.

00:13:38.190 --> 00:13:49.290
This assumption actually implies assumptions four and five.

00:13:49.290 --> 00:13:55.500
And these assumptions, one through six, are called the classical linear model assumptions.

00:13:55.500 --> 00:14:01.800
In practice, the normality of the error term assumption can be ignored,

00:14:01.800 --> 00:14:09.990
because the OLS estimator is, as we say, asymptotically normal.

00:14:09.990 --> 00:14:15.690
So it means that when the sample size increases towards infinity,

00:14:15.690 --> 00:14:19.380
then the regression estimates will be normally distributed,

00:14:19.380 --> 00:14:23.880
regardless of how the error term is distributed in the population.

00:14:23.880 --> 00:14:30.780
In practice, the sample sizes that we use are 100, or a few hundred.

00:14:30.780 --> 00:14:35.370
That is enough for this asymptotic normality to start to kick in.

00:14:35.370 --> 00:14:36.300
In practice,

00:14:36.900 --> 00:14:42.870
I have tried to demonstrate scenarios where the lack of normality of the error term would

00:14:42.870 --> 00:14:46.620
be problematic with 50 or more observations, and I have failed.

00:14:46.620 --> 00:14:51.630
So I cannot think of a scenario where this normality assumption

00:14:51.630 --> 00:14:54.900
is a practical concern for an applied researcher.

00:14:54.900 --> 00:14:57.510
Let's summarize the assumptions.

00:14:57.510 --> 00:14:59.490
So we have six assumptions.

00:14:59.490 --> 00:15:02.130
First, all relationships are linear.

00:15:02.130 --> 00:15:07.290
That can be checked after the model has been estimated; how we check that, I'll cover later.

00:15:07.290 --> 00:15:11.040
Then, independence of observations: they must be a random sample.

00:15:11.040 --> 00:15:13.470
This is a feature of your research design.

00:15:13.470 --> 00:15:21.120
And you can check the independence of observations after estimation under certain scenarios.

00:15:21.120 --> 00:15:25.530
Then, no perfect collinearity and nonzero variance of the independent variables.

00:15:25.530 --> 00:15:30.540
If that fails, then the regression model cannot be estimated.

00:15:30.540 --> 00:15:37.860
For example, if you're studying the effect of gender on performance in a statistics course,

00:15:38.700 --> 00:15:40.350
and you only observe women,

00:15:40.350 --> 00:15:45.090
so you have no variation in the gender variable, then you cannot estimate the gender effect.

00:15:45.090 --> 00:15:52.560
Also, if you have two variables that quantify the exact same thing, then you can't enter

00:15:52.560 --> 00:15:56.130
both into the regression model. This does not need to be checked in advance,

00:15:56.130 --> 00:16:00.780
because if you run a regression analysis,

00:16:00.780 --> 00:16:03.660
you will know if this fails, because the regression doesn't complete.

00:16:03.660 --> 00:16:08.610
Then, the error term has an expected value of zero given any values of the independent variables.

00:16:08.610 --> 00:16:12.360
In practice, this means that all other causes

00:16:12.360 --> 00:16:15.630
of the dependent variable that are not included in the model

00:16:15.630 --> 00:16:19.590
must be uncorrelated with all causes that are included in the model.

00:16:19.590 --> 00:16:21.360
That's a strong assumption;

00:16:21.360 --> 00:16:25.440
it cannot be tested directly after least squares estimation,

00:16:25.440 --> 00:16:30.510
but we can test this assumption with instrumental variables, which are covered in a later video.

00:16:30.510 --> 00:16:35.460
Then we have: the error term has equal variance given any values of the independent variables.

00:16:35.460 --> 00:16:41.130
This is the no heteroskedasticity assumption. This should be checked after estimation,

00:16:41.130 --> 00:16:44.490
because it influences the standard errors of regression analysis.

00:16:44.490 --> 00:16:48.210
And if you have a heteroskedasticity problem, it is easy to fix.

00:16:48.210 --> 00:16:53.670
Then, the error term is normally distributed. I typically check this because it's useful

00:16:53.670 --> 00:16:58.050
to know if some of the values are far from the regression line, to identify outliers,

00:16:58.050 --> 00:17:01.350
but other than that, this is not an important assumption.

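NOTE
Editor's sketch (simulation designed by the editor) of the asymptotic
normality claim made earlier: with clearly non-normal, skewed errors and
n = 100, the sampling distribution of the OLS slope is already close to
normal, which is why the normality assumption rarely matters in practice.
import numpy as np
rng = np.random.default_rng(5)
n, reps = 100, 5_000
slopes = np.empty(reps)
for i in range(reps):
    x = rng.normal(size=n)
    u = rng.exponential(1.0, size=n) - 1.0   # skewed errors with mean zero
    y = 1.0 + 2.0 * x + u
    X = np.column_stack([np.ones(n), x])
    slopes[i] = np.linalg.lstsq(X, y, rcond=None)[0][1]
print("mean slope:", slopes.mean())          # about 2: unbiased
skew = ((slopes - slopes.mean())**3).mean() / slopes.std()**3
print("skewness:", skew)                     # near 0: approximately normal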