WEBVTT

00:00:00.060 --> 00:00:02.970
We also have a second strategy 
for statistical inference.

00:00:02.970 --> 00:00:05.310
The problem with p-values is that

00:00:05.310 --> 00:00:08.310
a p-value only answers the question:

00:00:08.310 --> 00:00:13.530
is it likely or is it plausible that 
there is no effect in the population?

00:00:13.530 --> 00:00:16.800
But the p-value doesn't really provide
us with any direct evidence of

00:00:16.800 --> 00:00:19.981
the uncertainty of the estimate of the effect.

00:00:20.251 --> 00:00:23.310
And therefore we have another 
strategy called confidence intervals.

00:00:23.310 --> 00:00:30.960
So a confidence interval is an interval with
two endpoints constructed around the estimate.

00:00:30.960 --> 00:00:33.900
So here's an example of a confidence interval,

00:00:33.900 --> 00:00:39.332
the estimate is 2.02 and we have one endpoint

00:00:39.332 --> 00:00:43.202
that is below the estimate, which is 2.01

00:00:43.202 --> 00:00:47.310
and then another endpoint that is 
above the estimate that is 2.03.

00:00:47.310 --> 00:00:52.770
So that's an interval that says something 
about the precision of the estimates.

00:00:52.770 --> 00:00:56.940
Formally the confidence interval is defined as

00:00:56.940 --> 00:01:00.450
an interval that is calculated in such a way that,

00:01:00.450 --> 00:01:03.120
if we repeat the sampling over and over, many times,

00:01:03.120 --> 00:01:09.090
then the true population value 
will fall within that interval,

00:01:09.090 --> 00:01:11.280
in 95% of the replications.

00:01:11.280 --> 00:01:15.300
So five times out of 100 replications,

00:01:15.300 --> 00:01:19.080
the population value would 
be outside the interval.

00:01:19.080 --> 00:01:23.910
So from this definition, we can kind of infer that

00:01:23.910 --> 00:01:27.810
the population value is maybe 
somewhere within the interval.

00:01:27.810 --> 00:01:32.730
We can't say that it's precisely 
there in any formal way

00:01:32.730 --> 00:01:35.100
but we can kind of infer that maybe it's there.

00:01:36.348 --> 00:01:41.370
So these intervals can also be used the same 
way as a null hypothesis significance test,

00:01:41.370 --> 00:01:44.400
so you can compare whether zero 
is included in the interval.

00:01:44.400 --> 00:01:48.390
So the interval here is from 2.01 to 2.03,

00:01:48.390 --> 00:01:50.820
zero is not within the interval and

00:01:50.820 --> 00:01:56.640
therefore we say that it's unlikely 
that the population value would be zero.

00:01:57.407 --> 00:01:59.640
To understand the confidence interval better,

00:01:59.640 --> 00:02:04.500
it's useful to understand it in the 
framework of the previous examples.

00:02:04.500 --> 00:02:10.050
So we had the difference between men-led
companies and women-led companies

00:02:10.050 --> 00:02:11.970
that occurs by chance only,

00:02:11.970 --> 00:02:14.010
sometimes we get a large difference,

00:02:14.010 --> 00:02:16.050
sometimes we get a small difference.

00:02:16.050 --> 00:02:20.640
And these confidence intervals are intervals 
that are constructed around an estimate,

00:02:20.640 --> 00:02:21.990
so let's say an estimate is here,

00:02:21.990 --> 00:02:24.120
then the interval could be here.

00:02:24.120 --> 00:02:28.380
So this interval would not 
contain the population value,

00:02:28.380 --> 00:02:31.440
the population value of 
zero is above the interval.

00:02:31.440 --> 00:02:35.610
These two intervals would 
contain the population value,

00:02:35.610 --> 00:02:39.810
and in this last interval, 
constructed around this estimate,

00:02:40.081 --> 00:02:44.460
the population value falls below the interval.

00:02:46.504 --> 00:02:52.246
If the confidence interval is 
valid then these intervals,

00:02:52.591 --> 00:02:55.200
that include the population value,

00:02:55.200 --> 00:02:58.081
which in this case is 0, but it 
could be something else as well,

00:02:58.727 --> 00:03:01.110
will be 95% of the cases.

00:03:01.110 --> 00:03:05.400
So 5 times out of 100, we get
either this one or that one,

00:03:05.400 --> 00:03:07.860
so the population value is outside the interval.

00:03:08.612 --> 00:03:11.521
We can kind of informally infer that,

00:03:11.521 --> 00:03:14.670
maybe the population value is 
somewhere within the interval,

00:03:14.670 --> 00:03:16.530
but we can't say it precisely.

00:03:17.282 --> 00:03:19.650
How are these confidence intervals calculated?

00:03:19.650 --> 00:03:25.051
One way, a particularly common
way, is to use a normal approximation.

00:03:25.682 --> 00:03:29.250
So the idea of a normal approximation 
confidence interval is that,

00:03:29.250 --> 00:03:35.940
we construct the interval so that 
it's the estimate minus 1.96,

00:03:35.940 --> 00:03:45.616
which is about two standard deviations, covering 95%
of the normal distribution, times the standard error.

00:03:47.901 --> 00:03:53.820
So then the upper endpoint is the estimate
plus 1.96 times the standard error.

00:03:53.820 --> 00:03:56.910
The reason we multiply by 1.96 is that

00:03:56.910 --> 00:04:02.790
that way we get 95% of the normal
distribution within the interval.

00:04:03.677 --> 00:04:06.000
So the confidence interval estimate,

00:04:06.782 --> 00:04:08.881
we can just do a little bit of math,

00:04:08.881 --> 00:04:14.250
and we can say that this is
equivalent to comparing the

00:04:14.250 --> 00:04:17.100
estimate divided by the standard error to 1.96,

00:04:17.100 --> 00:04:20.730
which is the t test basically.

00:04:20.730 --> 00:04:28.080
So there is an equivalence between the confidence interval
and the t test, or the z test to be more precise,

00:04:28.080 --> 00:04:32.791
which is a comparison against the normal
distribution instead of Student's t distribution.

00:04:33.077 --> 00:04:35.176
So if we just compare,

00:04:35.176 --> 00:04:38.430
whether the confidence interval 
includes zero or not,

00:04:38.430 --> 00:04:45.000
that is exactly the same thing as calculating 
a p-value and comparing it against 0.05.

00:04:45.000 --> 00:04:48.030
So doing a confidence interval 
doesn't make us any smarter,

00:04:48.030 --> 00:04:51.841
it's just the same thing in 
a slightly more complex way.

00:04:53.419 --> 00:04:55.890
If we can't assume that the estimates are normal

00:04:55.890 --> 00:04:58.590
then these two approaches are not the same,

00:04:58.590 --> 00:05:02.730
and there are techniques for calculating 
confidence intervals for that scenario as well.

00:05:03.406 --> 00:05:07.710
But the important thing to know 
about confidence intervals is that

00:05:07.710 --> 00:05:12.240
they are pretty useless if you just 
check whether zero is in the interval.

00:05:12.240 --> 00:05:16.980
There is a nice quote in an article by Cortina,

00:05:17.970 --> 00:05:23.070
who attributes the quote to Thompson that,

00:05:23.070 --> 00:05:30.330
if we were to be as rigid with confidence 
intervals as we are with the p-values,

00:05:30.330 --> 00:05:33.990
taking 0.05 as the gold standard,

00:05:33.990 --> 00:05:37.380
then we would just be stupid on another metric.

00:05:37.786 --> 00:05:41.190
So doing confidence intervals 
without interpreting

00:05:41.190 --> 00:05:43.726
what the endpoints mean, and just checking

00:05:43.726 --> 00:05:47.610
whether zero is within the interval 
doesn't really make any sense whatsoever.

00:05:48.963 --> 00:05:53.040
The problem with confidence 
intervals and p-values is that

00:05:53.040 --> 00:05:56.370
both are commonly misinterpreted.

00:05:56.370 --> 00:06:02.100
So the p-value is the probability of obtaining
the observed result, or a more extreme one,

00:06:02.100 --> 00:06:04.531
if the null hypothesis is correct.

00:06:04.906 --> 00:06:08.520
It is not the probability that 
the null hypothesis is correct.

00:06:09.572 --> 00:06:12.060
I will explain that more in another video.

00:06:13.233 --> 00:06:17.640
So if you correctly guess the result of a die throw,

00:06:17.640 --> 00:06:19.050
it doesn't mean that you're clairvoyant.

00:06:19.050 --> 00:06:21.600
You could have guessed the die by chance.

00:06:23.134 --> 00:06:27.120
The confidence interval is an interval 
that will contain the population value

00:06:27.120 --> 00:06:29.580
with the frequency given by the confidence level.

00:06:30.241 --> 00:06:33.780
It is not the probability 
that the population value

00:06:33.780 --> 00:06:35.850
is contained within a particular interval,

00:06:35.850 --> 00:06:37.141
that's a different thing.

00:06:37.141 --> 00:06:39.780
Understanding why these two are not the same

00:06:39.780 --> 00:06:41.070
is a bit more complicated,

00:06:41.070 --> 00:06:44.160
so I will not cover that here, but we'll
take a look at this in more detail later.