WEBVTT 00:00:00.060 --> 00:00:02.970 We also have a second strategy for statistical inference. 00:00:02.970 --> 00:00:05.310 The problem with p-values is that 00:00:05.310 --> 00:00:08.310 a p-value only answers the question: 00:00:08.310 --> 00:00:13.530 is it likely or is it plausible that there is no effect in the population? 00:00:13.530 --> 00:00:16.800 But the p-value doesn't really provide us with any direct evidence of 00:00:16.800 --> 00:00:19.981 the uncertainty of the estimate of the effect. 00:00:20.251 --> 00:00:23.310 And therefore we have another strategy called confidence intervals. 00:00:23.310 --> 00:00:30.960 So a confidence interval is an interval with two endpoints constructed around the estimate. 00:00:30.960 --> 00:00:33.900 So here's an example of a confidence interval: 00:00:33.900 --> 00:00:39.332 the estimate is 2.02 and we have one endpoint 00:00:39.332 --> 00:00:43.202 that is below the estimate, which is 2.01, 00:00:43.202 --> 00:00:47.310 and then another endpoint that is above the estimate, which is 2.03. 00:00:47.310 --> 00:00:52.770 So that's an interval that says something about the precision of the estimate. 00:00:52.770 --> 00:00:56.940 Formally the confidence interval is defined as 00:00:56.940 --> 00:01:00.450 an interval that is calculated in such a way that, 00:01:00.450 --> 00:01:03.120 if we repeat the sampling over and over many times, 00:01:03.120 --> 00:01:09.090 then the true population value will fall within that interval 00:01:09.090 --> 00:01:11.280 in 95% of the replications. 00:01:11.280 --> 00:01:15.300 So five times out of 100 replications, 00:01:15.300 --> 00:01:19.080 the population value would be outside the interval. 00:01:19.080 --> 00:01:23.910 So from this, we can kind of infer that 00:01:23.910 --> 00:01:27.810 the population value is maybe somewhere within the interval. 
00:01:27.810 --> 00:01:32.730 We can't say that it's precisely there in any formal way, 00:01:32.730 --> 00:01:35.100 but we can kind of infer that maybe it's there. 00:01:36.348 --> 00:01:41.370 So these intervals can also be used the same way as a null hypothesis significance test, 00:01:41.370 --> 00:01:44.400 so you can check whether zero is included in the interval. 00:01:44.400 --> 00:01:48.390 So the interval here is from 2.01 to 2.03, 00:01:48.390 --> 00:01:50.820 zero is not within the interval, and 00:01:50.820 --> 00:01:56.640 therefore we say that it's unlikely that the population value would be zero. 00:01:57.407 --> 00:01:59.640 To understand the confidence interval better, 00:01:59.640 --> 00:02:04.500 it's useful to understand it in the framework of the previous examples. 00:02:04.500 --> 00:02:10.050 So we had the difference between men-led companies and women-led companies 00:02:10.050 --> 00:02:11.970 that occurs by chance only: 00:02:11.970 --> 00:02:14.010 sometimes we get a large difference, 00:02:14.010 --> 00:02:16.050 sometimes we get a small difference. 00:02:16.050 --> 00:02:20.640 And these confidence intervals are intervals that are constructed around an estimate, 00:02:20.640 --> 00:02:21.990 so let's say an estimate is here, 00:02:21.990 --> 00:02:24.120 then the interval could be here. 00:02:24.120 --> 00:02:28.380 So this interval would not contain the population value; 00:02:28.380 --> 00:02:31.440 the population value of zero is above the interval. 00:02:31.440 --> 00:02:35.610 These two intervals would contain the population value, 00:02:35.610 --> 00:02:39.810 and in this last interval, constructed around this estimate, 00:02:40.081 --> 00:02:44.460 the population value falls below the interval. 
00:02:46.504 --> 00:02:52.246 If the confidence interval is valid, then the intervals 00:02:52.591 --> 00:02:55.200 that include the population value, 00:02:55.200 --> 00:02:58.081 which in this case is 0, but it could be something else as well, 00:02:58.727 --> 00:03:01.110 will be 95% of the cases. 00:03:01.110 --> 00:03:05.400 So 1 time out of 20, we get either this one or that one, 00:03:05.400 --> 00:03:07.860 so the population value is outside the interval. 00:03:08.612 --> 00:03:11.521 We can kind of informally infer that 00:03:11.521 --> 00:03:14.670 maybe the population value is somewhere within the interval, 00:03:14.670 --> 00:03:16.530 but we can't say it precisely. 00:03:17.282 --> 00:03:19.650 How are these confidence intervals calculated? 00:03:19.650 --> 00:03:25.051 One particularly common way is to use a normal approximation. 00:03:25.682 --> 00:03:29.250 So the idea of a normal approximation confidence interval is that 00:03:29.250 --> 00:03:35.940 we construct the interval so that the lower endpoint is the estimate minus 1.96, 00:03:35.940 --> 00:03:45.616 which is about two standard deviations and covers 95% of the normal distribution, times the standard error. 00:03:47.901 --> 00:03:53.820 So then the upper endpoint is the estimate plus 1.96 times the standard error. 00:03:53.820 --> 00:03:56.910 The reason we multiply by 1.96 is that 00:03:56.910 --> 00:04:02.790 that way we get 95% of the normal distribution within the interval. 00:04:03.677 --> 00:04:06.000 So with the confidence interval, 00:04:06.782 --> 00:04:08.881 we can just do a little bit of math, 00:04:08.881 --> 00:04:14.250 and we can say that checking whether zero is inside it is equivalent to comparing the 00:04:14.250 --> 00:04:17.100 estimate divided by the standard error to 1.96, 00:04:17.100 --> 00:04:20.730 which is basically the t test. 
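As a minimal sketch of the normal-approximation interval just described, the Python below computes the estimate plus and minus 1.96 times the standard error, and the corresponding two-sided z-test p-value. The function names `normal_ci` and `z_test_p_value` are hypothetical, and the standard error of 0.005 is an assumed value that does not come from the lecture; it is chosen only so that the interval around 2.02 rounds to the 2.01 and 2.03 used in the earlier example.

```python
from statistics import NormalDist


def normal_ci(estimate, std_error, level=0.95):
    """Normal-approximation confidence interval: estimate +/- z * SE."""
    # Critical value of the standard normal; about 1.96 when level is 0.95.
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - z_crit * std_error, estimate + z_crit * std_error


def z_test_p_value(estimate, std_error):
    """Two-sided p-value for the null hypothesis of a zero population value."""
    z = estimate / std_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Assumed illustrative numbers: the estimate is from the lecture example,
# the standard error is made up for this sketch.
estimate, std_error = 2.02, 0.005

lower, upper = normal_ci(estimate, std_error)
p = z_test_p_value(estimate, std_error)
excludes_zero = not (lower <= 0 <= upper)

print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
print(f"zero outside interval: {excludes_zero}, p < 0.05: {p < 0.05}")
```

Under the normal approximation, the two printed booleans always agree, because zero lies outside the interval exactly when the absolute value of the estimate divided by the standard error exceeds the critical value.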
00:04:20.730 --> 00:04:28.080 So there is an equivalence with the t test, or the z test to be more precise, 00:04:28.080 --> 00:04:32.791 which is a comparison against the normal distribution instead of Student's t distribution. 00:04:33.077 --> 00:04:35.176 So if we just check 00:04:35.176 --> 00:04:38.430 whether the confidence interval includes zero or not, 00:04:38.430 --> 00:04:45.000 that is exactly the same thing as calculating a p-value and comparing it against 0.05. 00:04:45.000 --> 00:04:48.030 So doing a confidence interval doesn't make us any smarter, 00:04:48.030 --> 00:04:51.841 it's just the same thing in a slightly more complex way. 00:04:53.419 --> 00:04:55.890 If we can't assume that the estimates are normal, 00:04:55.890 --> 00:04:58.590 then these two approaches are not the same, 00:04:58.590 --> 00:05:02.730 and there are techniques for calculating confidence intervals for that scenario as well. 00:05:03.406 --> 00:05:07.710 But the important thing to know about confidence intervals is that 00:05:07.710 --> 00:05:12.240 they are pretty useless if you just check whether zero is in the interval. 00:05:12.240 --> 00:05:16.980 There is a nice quote in an article by Cortina, 00:05:17.970 --> 00:05:23.070 who attributes the quote to Thompson, that 00:05:23.070 --> 00:05:30.330 if we were to be as rigid with confidence intervals as we are with p-values, 00:05:30.330 --> 00:05:33.990 taking 0.05 as the gold standard, 00:05:33.990 --> 00:05:37.380 then we would just be stupid on another metric. 00:05:37.786 --> 00:05:41.190 So doing confidence intervals without interpreting 00:05:41.190 --> 00:05:43.726 what the endpoints mean, and just checking 00:05:43.726 --> 00:05:47.610 whether zero is within the interval, doesn't really make any sense whatsoever. 00:05:48.963 --> 00:05:53.040 The problem with confidence intervals and p-values is that 00:05:53.040 --> 00:05:56.370 both are commonly misinterpreted. 
00:05:56.370 --> 00:06:02.100 So the p-value is the probability of obtaining the observed result, or a more extreme one, 00:06:02.100 --> 00:06:04.531 if the null hypothesis is correct. 00:06:04.906 --> 00:06:08.520 It is not the probability that the null hypothesis is correct. 00:06:09.572 --> 00:06:12.060 I will explain that more in another video. 00:06:13.233 --> 00:06:17.640 So if you correctly guess the result of a die throw, 00:06:17.640 --> 00:06:19.050 it doesn't mean that you're clairvoyant. 00:06:19.050 --> 00:06:21.600 You could guess a die throw correctly just by chance. 00:06:23.134 --> 00:06:27.120 The confidence interval is an interval that will contain the population value 00:06:27.120 --> 00:06:29.580 with the frequency given by the confidence level. 00:06:30.241 --> 00:06:33.780 It is not the probability that the population value 00:06:33.780 --> 00:06:35.850 is contained within a particular interval, 00:06:35.850 --> 00:06:37.141 that's a different thing. 00:06:37.141 --> 00:06:39.780 Understanding why these two are not the same 00:06:39.780 --> 00:06:41.070 is a bit more complicated, 00:06:41.070 --> 00:06:44.160 so I will not cover that here, but we'll take a look at it in more detail later.
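The repeated-sampling definition of a confidence interval can be illustrated with a small simulation. This is only a sketch under assumptions that are not from the lecture: the data are drawn from a normal distribution with a known population mean of zero, each sample has 50 observations, and the hypothetical function `coverage` counts how often the normal-approximation 95% interval contains the population value.

```python
import random
from statistics import NormalDist, mean, stdev


def coverage(true_mean=0.0, sd=1.0, n=50, replications=2000,
             level=0.95, seed=1):
    """Share of replications whose interval contains the population mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96
    hits = 0
    for _ in range(replications):
        # Draw a fresh sample from the (assumed) population.
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        estimate = mean(sample)
        std_error = stdev(sample) / n ** 0.5
        lower = estimate - z_crit * std_error
        upper = estimate + z_crit * std_error
        if lower <= true_mean <= upper:
            hits += 1
    return hits / replications


share = coverage()
print(f"share of intervals containing the population value: {share:.3f}")
```

The printed share should be close to 0.95. It describes the procedure over many replications; any single interval either contains the population value or it does not, which is why the frequency statement is not a probability statement about one particular interval.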