WEBVTT
Kind: captions
Language: en

00:00:00.060 --> 00:00:01.595
There are a couple of different things  

00:00:01.595 --> 00:00:04.710
that influence the quality of 
your quantitative research,

00:00:04.710 --> 00:00:07.807
so what makes your research reliable and valid?

00:00:08.293 --> 00:00:10.230
To understand those things and

00:00:10.230 --> 00:00:13.350
what the quality of your study depends on,

00:00:13.350 --> 00:00:15.550
we first have to go through

00:00:15.550 --> 00:00:21.270
what the concepts of reliability and validity mean in the context of quantitative studies.

00:00:21.675 --> 00:00:24.525
This is one way to understand 
reliability and validity.

00:00:25.170 --> 00:00:28.593
The concept of reliability 
is fairly easy to understand.

00:00:28.593 --> 00:00:32.340
It basically means that if we 
repeat the same study again,

00:00:32.340 --> 00:00:33.510
using the same sample,

00:00:33.510 --> 00:00:35.100
would we get the same result?

00:00:35.764 --> 00:00:41.121
In quantitative studies, your 
analysis is done by a computer,

00:00:41.121 --> 00:00:44.400
and the computer will always 
produce the same result,

00:00:44.400 --> 00:00:45.806
if you give it the same data.

00:00:46.210 --> 00:00:50.430
So reliability in quantitative 
analysis or quantitative studies

00:00:50.430 --> 00:00:54.090
is mostly about measurement reliability.

00:00:54.090 --> 00:00:56.151
So if you measure the same things again,

00:00:56.265 --> 00:00:57.975
would you get the same result?

00:00:58.347 --> 00:01:00.312
The validity, on the other hand,

00:01:00.571 --> 00:01:02.581
answers the question of,

00:01:03.480 --> 00:01:08.190
does the study answer the question that 
it is supposed to answer correctly?

00:01:08.190 --> 00:01:10.410
So does it provide a correct answer?

00:01:10.410 --> 00:01:11.730
Reliability is about,

00:01:11.730 --> 00:01:15.390
do we get the same answer if 
we repeat the same study?

00:01:15.390 --> 00:01:18.898
Reliability doesn't tell us anything 
about whether the result is correct.

00:01:19.448 --> 00:01:21.750
Validity tells us whether the result is correct.

00:01:22.430 --> 00:01:27.289
Then, validity can be broken down 
into four different categories.

00:01:27.646 --> 00:01:29.040
Measurement validity,

00:01:29.040 --> 00:01:31.249
which we will discuss later,

00:01:31.249 --> 00:01:34.170
refers to whether the variables in our data

00:01:34.170 --> 00:01:36.990
measure the concepts that we claim they measure.

00:01:38.010 --> 00:01:40.727
Statistical conclusion validity refers to,

00:01:40.727 --> 00:01:43.290
whether our statistical results are correct.

00:01:43.290 --> 00:01:45.903
So if we have identified a trend

00:01:45.903 --> 00:01:48.573
or a difference in the sample,

00:01:49.107 --> 00:01:52.140
have we identified that correctly?

00:01:52.140 --> 00:01:56.310
So, is there really a difference 
or a trend in the population?

00:01:56.310 --> 00:02:01.290
So it relates to whether our statistical 
associations measured from the sample,

00:02:01.290 --> 00:02:03.030
generalize to the population.

00:02:03.969 --> 00:02:06.073
Then we have internal validity,

00:02:06.073 --> 00:02:13.590
which refers to whether the relationships 
actually correspond to the causal relation

00:02:13.590 --> 00:02:14.710
that we claimed.

00:02:14.904 --> 00:02:18.990
And internal validity is about causal inference.

00:02:18.990 --> 00:02:22.500
So have we identified the right controls and

00:02:22.500 --> 00:02:25.745
have we controlled for them appropriately,

00:02:25.745 --> 00:02:33.090
or is our experimental or quasi-experimental 
design free of any possible selection effects,

00:02:33.090 --> 00:02:36.138
that would confound the treatment effect.

00:02:36.591 --> 00:02:38.010
So that's causal inference.

00:02:38.010 --> 00:02:41.669
Then external validity simply refers to,

00:02:41.669 --> 00:02:46.680
do our results from one population 
generalize to other populations?

00:02:47.166 --> 00:02:53.461
So what determines the quality of a 
research study is an interesting question,

00:02:53.461 --> 00:02:57.267
and it can be examined 
through the research process

00:02:57.267 --> 00:02:59.366
according to Singleton and Straits' book.

00:02:59.868 --> 00:03:03.900
So Singleton and Straits say 
that research always starts from

00:03:03.900 --> 00:03:07.134
a research topic and the formulation 
of a research question.

00:03:07.134 --> 00:03:09.933
Of course, your study is not very valuable,

00:03:09.933 --> 00:03:12.900
if the research question is not interesting,

00:03:12.900 --> 00:03:15.240
but we will be focusing on the empirical part.

00:03:15.758 --> 00:03:18.420
So after you have your research question set,

00:03:18.938 --> 00:03:22.000
then you start to prepare your research design.

00:03:22.437 --> 00:03:25.549
And the research design has two main components.

00:03:26.035 --> 00:03:27.280
One is sampling,

00:03:27.280 --> 00:03:29.130
so what are the units, people,

00:03:29.130 --> 00:03:30.903
organizations, projects,

00:03:30.903 --> 00:03:32.880
whatever the units are that you're studying.

00:03:32.880 --> 00:03:35.730
So which units and how many are you studying.

00:03:36.167 --> 00:03:38.024
And then we have the measurement,

00:03:38.024 --> 00:03:39.900
which variables we collect.

00:03:40.110 --> 00:03:43.191
So if we think of our data as an Excel sheet,

00:03:43.434 --> 00:03:47.940
sampling concerns what 
the rows in that Excel sheet are,

00:03:47.940 --> 00:03:53.310
and measurement concerns what 
the columns in that Excel sheet are.

00:03:53.731 --> 00:03:57.900
Then we do data collection and 
after the data have been collected,

00:03:57.900 --> 00:04:00.694
we typically process the data somehow,

00:04:00.856 --> 00:04:03.155
we screen it for errors,

00:04:03.155 --> 00:04:05.218
we transform it into a different form,

00:04:05.218 --> 00:04:07.020
and then we do data analysis,

00:04:07.020 --> 00:04:08.393
and we interpret the results,

00:04:08.474 --> 00:04:10.710
finally, we write an article about it.

00:04:11.196 --> 00:04:16.020
So, which part defines the quality of a study?

00:04:17.105 --> 00:04:19.440
It is this part here,

00:04:19.780 --> 00:04:21.945
so when you have collected your data,

00:04:22.252 --> 00:04:25.643
then you have basically already 
set an upper limit of the quality,

00:04:25.967 --> 00:04:28.457
if your data are not good,

00:04:28.667 --> 00:04:32.193
then you can't make a good study.

00:04:32.620 --> 00:04:33.939
On the other hand,

00:04:33.939 --> 00:04:35.265
if you have great data,

00:04:35.621 --> 00:04:40.470
even if you mess up your data processing 
or data analysis or interpretation,

00:04:40.470 --> 00:04:42.398
that is something that you can fix,

00:04:42.398 --> 00:04:45.210
you have the data, you can just 
analyze it a bit differently.

00:04:46.376 --> 00:04:53.072
It's important to understand that the validity 
of our causal claims depends crucially on,

00:04:53.072 --> 00:04:55.076
whether the sample is appropriate,

00:04:55.529 --> 00:04:58.920
and whether we have collected 
all relevant controls,

00:04:58.920 --> 00:05:01.680
or whether we have a valid experimental design.

00:05:02.101 --> 00:05:07.200
After that, data processing and 
data analysis are just mechanics

00:05:07.427 --> 00:05:12.550
that will allow you to document 
this great study conducted here.

00:05:12.809 --> 00:05:14.629
So this is the important part.

00:05:14.629 --> 00:05:19.410
And you should not rush into 
data collection, obviously.

00:05:19.410 --> 00:05:22.769
Because if you just go and 
you collect data right away,

00:05:22.866 --> 00:05:25.146
the odds of you doing it correctly,

00:05:25.146 --> 00:05:28.500
with a good design that 
includes all relevant controls,

00:05:28.500 --> 00:05:29.398
are pretty low.

00:05:31.195 --> 00:05:34.452
This is highlighted in some of the readings.

00:05:34.938 --> 00:05:38.400
So the problems in rejected manuscripts

00:05:38.400 --> 00:05:40.980
in good journals are rarely about data analysis.

00:05:40.980 --> 00:05:44.753
So when I myself review a paper,

00:05:45.000 --> 00:05:48.060
I typically have lots of things 
to say about the methods,

00:05:48.060 --> 00:05:49.574
because that's my speciality.

00:05:49.736 --> 00:05:52.440
But if the data are good, the design is good,

00:05:52.440 --> 00:05:55.800
then I will say that okay, 
do the analysis differently,

00:05:55.800 --> 00:06:00.060
resubmit and then I will 
re-evaluate your manuscript

00:06:00.060 --> 00:06:01.793
to see whether it makes sense.

00:06:02.052 --> 00:06:07.860
But if there is a control variable that is 
very important based on existing theory,

00:06:07.860 --> 00:06:12.424
that provides an alternative 
explanation for the phenomenon,

00:06:12.424 --> 00:06:16.453
that the researchers are studying, 
which has not been measured,

00:06:16.453 --> 00:06:19.680
then there is nothing that they 
can do about it in most cases,

00:06:19.680 --> 00:06:21.981
because it's very difficult to go,

00:06:21.981 --> 00:06:24.240
particularly if you collect 
the data with a survey,

00:06:24.240 --> 00:06:26.910
it's very difficult to go and 
collect additional variables.

00:06:27.590 --> 00:06:31.020
Also, Aguinis and Vandenberg say here,

00:06:31.910 --> 00:06:34.950
that data analysis problems are rarely something

00:06:34.950 --> 00:06:37.440
that cause an article to be rejected.

00:06:37.440 --> 00:06:39.480
Because a data analysis problem is something

00:06:39.480 --> 00:06:41.310
where you just re-analyze the data,

00:06:41.310 --> 00:06:43.328
fix the problem and you're going to be fine.

00:06:43.976 --> 00:06:46.320
So the problem is that

00:06:46.320 --> 00:06:49.388
if your design doesn't allow 
you to make causal claims,

00:06:49.388 --> 00:06:53.850
then you can't make a claim and there's 
nothing that you can do about it.

00:06:53.850 --> 00:06:57.308
They also say that there is 
this kind of persistent belief,

00:06:57.680 --> 00:07:00.000
that if you have a bad design,

00:07:00.000 --> 00:07:03.720
you can compensate for that using a fancy method.

00:07:03.720 --> 00:07:06.345
So some people seem to think that because,

00:07:06.507 --> 00:07:11.693
let's say multilevel generalized structural 
equation modeling is a new thing,

00:07:12.000 --> 00:07:14.730
therefore using that makes your study better.

00:07:15.183 --> 00:07:16.923
That is not true.

00:07:17.220 --> 00:07:20.820
The quality of the study is determined 
largely before data collection,

00:07:20.820 --> 00:07:25.080
and then after that, you have to 
just choose the appropriate analysis,

00:07:25.080 --> 00:07:28.890
instead of going with the 
one that is the most complex.

00:07:29.457 --> 00:07:32.223
If you have a bad design, you have bad data,

00:07:32.223 --> 00:07:35.211
then using a fancy method for that data

00:07:35.211 --> 00:07:38.070
just means that you spend 
a lot of time using a fancy

00:07:38.070 --> 00:07:40.650
and complicated method on bad data

00:07:40.650 --> 00:07:43.440
and the outcome is, it's not a good paper anyway.

00:07:44.039 --> 00:07:47.640
So you just end up with a poor 
study with a fancy method.