WEBVTT
00:00:00.060 --> 00:00:02.970
We also have a second strategy
for statistical inference.
00:00:02.970 --> 00:00:05.310
The problem with p-values is that
00:00:05.310 --> 00:00:08.310
a p-value only answers the question:
00:00:08.310 --> 00:00:13.530
is it likely or is it plausible that
there is no effect in the population?
00:00:13.530 --> 00:00:16.800
But the p-value doesn't really provide
us with any direct evidence of
00:00:16.800 --> 00:00:19.981
the uncertainty of the estimate of the effect.
00:00:20.251 --> 00:00:23.310
And therefore we have another
strategy called confidence intervals.
00:00:23.310 --> 00:00:30.960
So a confidence interval is an interval with
two endpoints constructed around the estimate.
00:00:30.960 --> 00:00:33.900
So here's an example of a confidence interval,
00:00:33.900 --> 00:00:39.332
the estimate is 2.02 and we have one endpoint
00:00:39.332 --> 00:00:43.202
that is below the estimate, which is 2.01
00:00:43.202 --> 00:00:47.310
and then another endpoint that is
above the estimate that is 2.03.
00:00:47.310 --> 00:00:52.770
So that's an interval that says something
about the precision of the estimates.
00:00:52.770 --> 00:00:56.940
Formally the confidence interval is defined as
00:00:56.940 --> 00:01:00.450
an interval that is calculated in a way that,
00:01:00.450 --> 00:01:03.120
if we repeat the sampling over and over many times,
00:01:03.120 --> 00:01:09.090
then the true population value
will fall within that interval,
00:01:09.090 --> 00:01:11.280
in 95% of the replications.
00:01:11.280 --> 00:01:15.300
So five times out of 100 replications,
00:01:15.300 --> 00:01:19.080
the population value would
be outside the interval.
00:01:19.080 --> 00:01:23.910
So from this definition, we can kind of infer that
00:01:23.910 --> 00:01:27.810
the population value is maybe
somewhere within the interval.
00:01:27.810 --> 00:01:32.730
We can't say that it's precisely
there in any formal way
00:01:32.730 --> 00:01:35.100
but we can kind of infer that maybe it's there.
00:01:36.348 --> 00:01:41.370
So these intervals can also be used the same
way as a null hypothesis significance test,
00:01:41.370 --> 00:01:44.400
so you can compare whether zero
is included in the interval.
00:01:44.400 --> 00:01:48.390
So the interval here is from 2.01 to 2.03,
00:01:48.390 --> 00:01:50.820
zero is not within the interval and
00:01:50.820 --> 00:01:56.640
therefore we say that it's unlikely
that the population value would be zero.
00:01:57.407 --> 00:01:59.640
To understand the confidence interval better,
00:01:59.640 --> 00:02:04.500
it's useful to understand it in the
framework of the previous examples.
00:02:04.500 --> 00:02:10.050
So we had the difference between men-led
companies and women-led companies
00:02:10.050 --> 00:02:11.970
that occurs by chance only,
00:02:11.970 --> 00:02:14.010
sometimes we get a large difference,
00:02:14.010 --> 00:02:16.050
sometimes we get a small difference.
00:02:16.050 --> 00:02:20.640
And these confidence intervals are intervals
that are constructed around an estimate,
00:02:20.640 --> 00:02:21.990
so let's say an estimate is here,
00:02:21.990 --> 00:02:24.120
then the interval could be here.
00:02:24.120 --> 00:02:28.380
So this interval would not
contain the population value,
00:02:28.380 --> 00:02:31.440
the population value of
zero is above the interval.
00:02:31.440 --> 00:02:35.610
These two intervals would
contain the population value,
00:02:35.610 --> 00:02:39.810
and in this last interval,
constructed around this estimate,
00:02:40.081 --> 00:02:44.460
the population value falls below the interval.
00:02:46.504 --> 00:02:52.246
If the confidence interval is
valid then these intervals,
00:02:52.591 --> 00:02:55.200
that include the population value,
00:02:55.200 --> 00:02:58.081
which in this case is 0, but it
could be something else as well,
00:02:58.727 --> 00:03:01.110
will be 95% of the cases.
00:03:01.110 --> 00:03:05.400
So 5 times out of 100, we get
either this one or that one,
00:03:05.400 --> 00:03:07.860
so the population value is outside the interval.
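The repeated-sampling idea above can be checked with a small simulation. This is only a sketch under assumptions that are not from the lecture: the population is normal with a true mean of 0 and a standard deviation of 1, each sample has 50 observations, and the function name `coverage` is made up for illustration. It draws many samples, builds a 95% normal-approximation interval around each sample mean, and counts how often the interval contains the population value.

```python
import random
from statistics import NormalDist, mean, stdev


def coverage(n_replications=2000, n=50, true_mean=0.0, sd=1.0, seed=1):
    """Share of replications whose 95% interval contains the population value.

    All default values are assumptions chosen for illustration only.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.975)  # about 1.96
    hits = 0
    for _ in range(n_replications):
        # Draw one sample from the assumed population.
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        estimate = mean(sample)
        std_error = stdev(sample) / n ** 0.5
        # Does this interval contain the population value?
        if estimate - z * std_error <= true_mean <= estimate + z * std_error:
            hits += 1
    return hits / n_replications


rate = coverage()
print(f"Share of intervals containing the population value: {rate:.3f}")
```

The printed share should be close to 0.95, with the remaining intervals lying entirely above or entirely below the population value.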
00:03:08.612 --> 00:03:11.521
We can kind of informally infer that,
00:03:11.521 --> 00:03:14.670
maybe the population value is
somewhere within the interval,
00:03:14.670 --> 00:03:16.530
but we can't say it precisely.
00:03:17.282 --> 00:03:19.650
How are these confidence intervals calculated?
00:03:19.650 --> 00:03:25.051
One way, a particularly common
way, is to use a normal approximation.
00:03:25.682 --> 00:03:29.250
So the idea of a normal approximation
confidence interval is that,
00:03:29.250 --> 00:03:35.940
we construct the interval so that
it's the estimate minus 1.96,
00:03:35.940 --> 00:03:45.616
which is roughly two standard deviations and covers
95% of the normal distribution, times the standard error.
00:03:47.901 --> 00:03:53.820
So then the upper endpoint is the estimate
plus 1.96 times the standard error.
00:03:53.820 --> 00:03:56.910
The reason we multiply by 1.96 is that,
00:03:56.910 --> 00:04:02.790
this way, we get 95% of the normal
distribution within the interval.
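As a minimal sketch of the calculation just described, the following computes the two endpoints as the estimate minus and plus 1.96 times the standard error. The estimate 2.02 is the one from the earlier example; the standard error of 0.0051 is an assumption picked so that the endpoints round to 2.01 and 2.03, and the function name `normal_ci` is hypothetical.

```python
from statistics import NormalDist


def normal_ci(estimate, std_error, level=0.95):
    """Normal-approximation confidence interval around an estimate."""
    # Critical value that leaves (1 - level) / 2 in each tail; about 1.96 for 95%.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - z * std_error, estimate + z * std_error


Z_95 = NormalDist().inv_cdf(0.975)

# The standard error 0.0051 is assumed; the lecture only gives the endpoints.
lower, upper = normal_ci(2.02, 0.0051)
print(f"critical value: {Z_95:.2f}")
print(f"interval: {lower:.2f} to {upper:.2f}")
```

Changing `level` changes the multiplier, so the 1.96 is specific to the 95% confidence level.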
00:04:03.677 --> 00:04:06.000
So the confidence interval estimate,
00:04:06.782 --> 00:04:08.881
we can just do a little bit of math,
00:04:08.881 --> 00:04:14.250
and we can say that this is
equivalent to comparing the
00:04:14.250 --> 00:04:17.100
estimate divided by the standard error to 1.96,
00:04:17.100 --> 00:04:20.730
which is the t test basically.
00:04:20.730 --> 00:04:28.080
So there is an equivalence with the
t test or, to be more precise, the z test,
00:04:28.080 --> 00:04:32.791
which is a comparison against the normal
distribution instead of Student's t distribution.
00:04:33.077 --> 00:04:35.176
So if we just check
00:04:35.176 --> 00:04:38.430
whether the confidence interval
includes zero or not,
00:04:38.430 --> 00:04:45.000
that is exactly the same thing as calculating
a p-value and comparing it against 0.05.
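The equivalence can be illustrated with a short sketch. Under the normal approximation, checking whether the 95% interval excludes zero gives the same answer as checking whether the two-sided z test p-value is below 0.05. The function names and the estimate and standard error pairs below are made up for illustration and are not from the lecture.

```python
from statistics import NormalDist


def ci_excludes_zero(estimate, std_error):
    """True if the 95% normal-approximation interval does not contain zero."""
    z = NormalDist().inv_cdf(0.975)  # about 1.96
    lower = estimate - z * std_error
    upper = estimate + z * std_error
    return not (lower <= 0 <= upper)


def z_test_p_value(estimate, std_error):
    """Two-sided p-value for the null hypothesis that the population value is zero."""
    z_stat = estimate / std_error
    return 2 * (1 - NormalDist().cdf(abs(z_stat)))


# Hypothetical (estimate, standard error) pairs.
cases = [(2.02, 0.0051), (0.5, 0.4), (-1.0, 0.3), (0.1, 0.2)]
agree = all(
    ci_excludes_zero(est, se) == (z_test_p_value(est, se) < 0.05)
    for est, se in cases
)
print("Interval check and z test agree for all cases:", agree)
```

Both checks compare the absolute value of the estimate divided by the standard error against the same critical value, which is why they cannot disagree.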
00:04:45.000 --> 00:04:48.030
So doing a confidence interval
doesn't make us any smarter,
00:04:48.030 --> 00:04:51.841
it's just the same thing in
a slightly more complex way.
00:04:53.419 --> 00:04:55.890
If we can't assume that the estimates are normally distributed
00:04:55.890 --> 00:04:58.590
then these two approaches are not the same,
00:04:58.590 --> 00:05:02.730
and there are techniques for calculating
confidence intervals for that scenario as well.
00:05:03.406 --> 00:05:07.710
But the important thing to know
about confidence intervals is that
00:05:07.710 --> 00:05:12.240
they are pretty useless if you just
check whether zero is in the interval.
00:05:12.240 --> 00:05:16.980
There is a nice quote in an article by Cortina,
00:05:17.970 --> 00:05:23.070
who attributes the quote to Thompson that,
00:05:23.070 --> 00:05:30.330
if we were to be as rigid with confidence
intervals as we are with the p-values,
00:05:30.330 --> 00:05:33.990
taking 0.05 as the gold standard,
00:05:33.990 --> 00:05:37.380
then we would just be stupid on another metric.
00:05:37.786 --> 00:05:41.190
So doing confidence intervals
without interpreting
00:05:41.190 --> 00:05:43.726
what the endpoints mean, and just checking
00:05:43.726 --> 00:05:47.610
whether zero is within the interval
doesn't really make any sense whatsoever.
00:05:48.963 --> 00:05:53.040
The problem with confidence
intervals and p-values is that
00:05:53.040 --> 00:05:56.370
both are commonly misinterpreted.
00:05:56.370 --> 00:06:02.100
So the p-value is the probability
of obtaining the observed result, or a more extreme one,
00:06:02.100 --> 00:06:04.531
if the null hypothesis is correct.
00:06:04.906 --> 00:06:08.520
It is not the probability that
the null hypothesis is correct.
00:06:09.572 --> 00:06:12.060
I will explain that more in another video.
00:06:13.233 --> 00:06:17.640
So if you correctly guess the outcome of a die throw,
00:06:17.640 --> 00:06:19.050
it doesn't mean that you're clairvoyant.
00:06:19.050 --> 00:06:21.600
You could have guessed the die correctly by chance.
00:06:23.134 --> 00:06:27.120
The confidence interval is an interval
that will contain the population value
00:06:27.120 --> 00:06:29.580
with a frequency given by the confidence level.
00:06:30.241 --> 00:06:33.780
It is not the probability
that the population value
00:06:33.780 --> 00:06:35.850
is contained within a particular interval,
00:06:35.850 --> 00:06:37.141
that's a different thing.
00:06:37.141 --> 00:06:39.780
Understanding why these two are not the same
00:06:39.780 --> 00:06:41.070
is a bit more complicated,
00:06:41.070 --> 00:06:44.160
so I will not cover that here, but we'll
take a look at this in more detail later.