WEBVTT

00:00:00.060 --> 00:00:06.870
The first strategy for making causal claims using 
quantitative data is a randomized experiment. 

00:00:06.870 --> 00:00:11.520
The idea of a randomized experiment 
is that we have some population  

00:00:11.520 --> 00:00:18.840
of interest from which we take a sample.
Then we randomly divide the sample into two groups. 

00:00:18.840 --> 00:00:22.590
One is called the treatment group and 
the other is the control group. 

00:00:22.590 --> 00:00:29.280
Because we select those two groups 
randomly, there are no differences  

00:00:29.280 --> 00:00:32.850
statistically between these two groups.
Or if there are some differences  

00:00:32.850 --> 00:00:40.200
then they are due to chance only.
This relates back to our example of  

00:00:40.200 --> 00:00:46.350
dividing the men- and women-led companies into two 
groups randomly to see if there's a difference. 

00:00:47.640 --> 00:00:51.270
We divide these into two groups, 
treatment and control. 

00:00:51.270 --> 00:00:55.740
Then we apply some kind of 
treatment to the treatment group. 

00:00:57.300 --> 00:01:01.980
This example typically comes from 
medical research, because this design is  

00:01:01.980 --> 00:01:08.640
applied in medicine and it's easy to understand.
One group receives a pill, the other one doesn't. 

00:01:08.640 --> 00:01:17.880
Then we wait, let's say, two days; we assume that 
the effect takes two days to be realized. 

00:01:17.880 --> 00:01:24.420
We measure the health of these two groups and 
compare whether the group that received the  

00:01:24.420 --> 00:01:30.810
medicine is healthier than the second group.
If it is, we conclude that there's a causal effect. 

00:01:30.810 --> 00:01:37.920
The reason why this is a valid causal claim 
is that these groups are perfectly  

00:01:37.920 --> 00:01:42.240
comparable to start with because they 
are randomly chosen from the same sample. 

00:01:42.240 --> 00:01:49.290
Therefore the only plausible explanation beyond 
chance for any difference between the groups is  

00:01:49.290 --> 00:01:58.680
that there is an actual effect of the treatment.
This works well under certain conditions.  

00:01:58.680 --> 00:02:04.590
So we need to have random 
assignment; that's very important. 

00:02:04.590 --> 00:02:10.470
If we have people who get to choose whether they 
receive the medicine or not, then those people  

00:02:10.470 --> 00:02:16.200
who are sicker will more likely choose to be in 
the treatment group than the control group. 

00:02:16.200 --> 00:02:25.140
And then the comparison here would confound the 
selection effect of how people chose to be in  

00:02:25.140 --> 00:02:30.570
these groups and the treatment effect.
We also need a large enough sample. 

00:02:30.570 --> 00:02:35.160
And then some other assumptions 
that are not as relevant. 

00:02:35.970 --> 00:02:38.790
If we have a large enough sample that we 
don't have to worry about chance,  

00:02:38.790 --> 00:02:44.700
and we have random assignment, then after 
that we can interpret the difference after  

00:02:44.700 --> 00:02:49.620
receiving the medicine or the 
treatment as a causal effect. 

00:02:49.620 --> 00:02:55.410
The randomization is important because 
we want to show that this difference is  

00:02:55.410 --> 00:03:03.780
because of the treatment and not because we 
chose to assign the groups in a certain way. 

00:03:03.780 --> 00:03:08.670
We want to show that there is a treatment 
effect instead of a selection effect. 

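NOTE
Editorial sketch, not part of the spoken lecture. To make the difference between a treatment effect and a selection effect concrete, here is a minimal Python simulation. All names and numbers (simulate, true_effect, the 0.8 and 0.2 choice probabilities, the health scale) are hypothetical assumptions for illustration and do not come from the lecture.
```python
import random
import statistics
def simulate(n=10000, true_effect=2.0, seed=1):
    """Return (randomized estimate, self-selected estimate) of the effect."""
    rng = random.Random(seed)
    # Hypothetical baseline health before treatment (higher is healthier).
    baseline = [rng.gauss(50, 10) for _ in range(n)]
    def estimate(treated_flags):
        # Difference in mean health: treatment group minus control group.
        treated = [b + true_effect for b, t in zip(baseline, treated_flags) if t]
        control = [b for b, t in zip(baseline, treated_flags) if not t]
        return statistics.mean(treated) - statistics.mean(control)
    # Random assignment: a coin flip decides who gets the pill.
    randomized = [rng.random() < 0.5 for _ in range(n)]
    # Self-selection (assumed): sicker people choose the pill more often.
    self_selected = [rng.random() < (0.8 if b < 50 else 0.2) for b in baseline]
    return estimate(randomized), estimate(self_selected)
random_est, selected_est = simulate()
print(f"randomized: {random_est:.2f}, self-selected: {selected_est:.2f}")
```
Under these assumed numbers the randomized estimate should land near the true effect, while the self-selected estimate should turn negative, because the sicker people end up in the treatment group.
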
00:03:08.670 --> 00:03:16.200
Then we repeat this a couple of times and when 
the study results have been verified independently  

00:03:16.200 --> 00:03:22.500
one or two times, then we can sell our medicine. 
And that's how randomized experiments work. 

00:03:22.500 --> 00:03:27.240
Of course there are variations to 
this design; for example, you can compare  

00:03:27.240 --> 00:03:35.700
how the health of an individual increases, so 
that would be a within-individual study. 

00:03:35.700 --> 00:03:38.970
This is a between-individual 
study, but this is the base case. 

00:03:38.970 --> 00:03:46.740
This is the simplest possible experimental design.
Experimental designs are not always feasible.  

00:03:46.740 --> 00:03:51.990
They can be done in business studies, but if we 
study organizations, then applying treatments to  

00:03:51.990 --> 00:03:58.560
organizations could be difficult to organize.
We also have a second-best option called a  

00:03:58.560 --> 00:04:03.030
quasi-experiment. The idea of a 
quasi-experiment is that we have some  

00:04:03.030 --> 00:04:09.270
elements of the experimental approach, but we 
don't have the full experimental control. 

00:04:09.270 --> 00:04:18.900
For example, we could have a separate-sample pretest 
and posttest design. Say, for example,  

00:04:18.900 --> 00:04:23.850
we know that we have a school and 
the kids will receive a medicine. 

00:04:23.850 --> 00:04:29.770
Everyone gets the medicine on the same 
day, and we can't influence that. 

00:04:29.770 --> 00:04:35.890
What we can do is randomize the 
kids: we measure health for half  

00:04:35.890 --> 00:04:42.040
of the students before the treatment 
and for the other half after the treatment. 

00:04:42.040 --> 00:04:48.400
And then we assume that the after-treatment 
group is otherwise comparable to the  

00:04:48.400 --> 00:04:52.960
before-treatment group, except for the treatment.
So we assume that there are no time effects.  

00:04:52.960 --> 00:04:58.960
And that would allow us to make a causal 
claim based on a quasi-experimental design. 

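NOTE
Editorial sketch, not part of the spoken lecture. The separate-sample pretest and posttest idea can be illustrated with a small Python simulation. The student count, the health scale, and true_effect are hypothetical assumptions, and the sketch builds in the lecture's assumption that there are no time effects.
```python
import random
import statistics
rng = random.Random(42)
true_effect = 3.0  # hypothetical improvement caused by the medicine
# Hypothetical pre-treatment health scores for 2000 students.
health_before = [rng.gauss(70, 8) for _ in range(2000)]
# Randomly split the students into a pretest half and a posttest half.
students = list(range(len(health_before)))
rng.shuffle(students)
pretest_ids, posttest_ids = students[:1000], students[1000:]
# Each student is measured once: either before or after the treatment day.
# No time effects are simulated, so only the medicine changes health.
pretest = [health_before[i] for i in pretest_ids]
posttest = [health_before[i] + true_effect for i in posttest_ids]
estimate = statistics.mean(posttest) - statistics.mean(pretest)
print(f"estimated effect: {estimate:.2f}")
```
Because the halves are chosen randomly, the difference in means should be close to the assumed effect. If health also drifted over time for other reasons, that drift would be mixed into the estimate.
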
00:04:58.960 --> 00:05:06.820
We can also have experiments where the choice 
between treatment and control is not random. 

00:05:06.820 --> 00:05:14.590
Either it looks random but we don't 
have control over the randomization, in which  

00:05:14.590 --> 00:05:19.420
case we would assume that these samples 
behave as if they were random samples. 

00:05:19.420 --> 00:05:24.310
Or we can do some statistical adjustments 
for this non-random selection. 

00:05:24.310 --> 00:05:29.680
So that's the nonequivalent control group design.
Another one is the interrupted time series design. 

00:05:29.680 --> 00:05:35.740
So we follow some units, such as 
companies or people, over time. 

00:05:35.740 --> 00:05:41.290
Then there is an exogenous shock that 
happens, so some kind of exogenous event. 

00:05:41.290 --> 00:05:48.340
For example, a new regulation is implemented in 
markets independently of these organizations. 

00:05:48.340 --> 00:05:53.080
Then we can analyze the effect of 
that new regulation on company performance. 

00:05:53.080 --> 00:05:59.950
Assuming that the implementation doesn't in any 
way depend on how these companies are doing. 

00:06:00.550 --> 00:06:05.080
That's another quasi-experimental design. 
So the idea of quasi-experimental design  

00:06:05.080 --> 00:06:10.030
is that we have a treatment but we 
don't have the full randomization. 

00:06:10.030 --> 00:06:13.900
So something happens, something 
is manipulated but we don't really  

00:06:13.900 --> 00:06:21.370
have quite a full experimental design.
Quasi-experiments are something that people  

00:06:21.370 --> 00:06:25.240
overlook when they think about their designs.
There's a great article in Organizational  

00:06:25.240 --> 00:06:28.720
Research Methods about different 
quasi-experimental designs. 

00:06:28.720 --> 00:06:33.130
I would recommend that you consider these 
when designing your studies because you can  

00:06:33.130 --> 00:06:40.090
make really strong claims that are perhaps 
more generalizable than lab experiments. 

00:06:40.090 --> 00:06:44.390
Because quasi-experiments typically 
take place in real-life settings.