WEBVTT

00:00:00.000 --> 00:00:04.160
When we consider the quality of research, we need to consider reliability and validity.

00:00:05.120 --> 00:00:10.480
Singleton and Straits define reliability as the degree of stability and consistency.

00:00:10.480 --> 00:00:12.720
That is correct in the sense that

00:00:12.720 --> 00:00:15.200
empirically when we evaluate reliability,

00:00:15.200 --> 00:00:17.920
we can take a look at stability and consistency,

00:00:17.920 --> 00:00:20.960
but that's a bit of a confusing definition.

00:00:20.960 --> 00:00:24.160
It's a lot simpler to understand reliability simply as

00:00:24.800 --> 00:00:29.440
whether we get the same result if we do the study again.

00:00:29.440 --> 00:00:34.720
So reliability is basically about the degree of random error in the data.

00:00:34.720 --> 00:00:38.000
So if our measurement or our study is unreliable,

00:00:38.000 --> 00:00:43.840
then repeating the measurement or study again would produce a different result.

00:00:43.840 --> 00:00:48.960
Reliable measurements and reliable studies always produce the same result.

00:00:49.760 --> 00:00:54.000
Then, validity is concerned with whether the result is correct.

00:00:54.000 --> 00:00:58.400
So we can have a reliable measure that is not valid.

00:00:58.400 --> 00:01:02.160
For example, if we measure the intelligence of a person

00:01:02.160 --> 00:01:05.920
by measuring the circumference of their head using a measuring tape,

00:01:05.920 --> 00:01:12.240
that would be highly reliable, but it is not a valid measure of intelligence.

00:01:12.240 --> 00:01:17.440
Reliability and validity can also be understood as precision and accuracy.

00:01:17.440 --> 00:01:21.840
So these are target practice diagrams, and

00:01:21.840 --> 00:01:23.920
there's someone shooting at the target,

00:01:23.920 --> 00:01:27.680
this shooter is highly reliable, always hitting the same spot,

00:01:27.680 --> 00:01:30.560
but he's not hitting the bull's eye.
00:01:30.560 --> 00:01:35.520
So this is a good shooter but the sights are off, and it's reliable but not valid.

00:01:36.240 --> 00:01:38.000
This is a bad shooter,

00:01:38.000 --> 00:01:40.640
so he is not hitting the same spot,

00:01:40.640 --> 00:01:46.080
but the rifle has the sights correct, so this is valid but not reliable.

00:01:46.080 --> 00:01:48.560
So which one is a more serious problem?

00:01:48.560 --> 00:01:51.600
Some sources, such as Singleton and Straits, say

00:01:51.600 --> 00:01:55.040
that you can't have validity without reliability.

00:01:55.040 --> 00:01:58.640
And that is true if we just look at an individual shot here.

00:01:58.640 --> 00:02:04.480
So any individual hole on this target is very unlikely to be on the bull's eye.

00:02:04.480 --> 00:02:11.120
So taking an individual spot from here would not tell us much.

00:02:11.840 --> 00:02:17.360
However, reliability can be improved by taking multiple measures or multiple studies.

00:02:17.360 --> 00:02:20.800
So if you have multiple unreliable things that are all valid,

00:02:20.800 --> 00:02:24.640
then taking those together will produce valid conclusions.

00:02:24.640 --> 00:02:27.520
If we consider these two target practices,

00:02:27.520 --> 00:02:31.520
and we would have to stand in front of one of the targets,

00:02:32.400 --> 00:02:36.080
standing in front of this bull's eye would be perfectly safe,

00:02:36.080 --> 00:02:38.080
because this guy will never hit the bull's eye.

00:02:38.640 --> 00:02:41.040
Here if we stand in front of this target

00:02:41.040 --> 00:02:44.960
we will eventually die if the person gets to shoot enough.
00:02:44.960 --> 00:02:49.680
So lack of reliability is less of a problem than lack of validity,

00:02:49.680 --> 00:02:54.240
because you can just repeat the study or repeat the measurement again and again,

00:02:54.240 --> 00:02:57.440
and eventually on average you will be correct,

00:02:57.440 --> 00:03:00.720
and in this case a person standing in front of this target would die.

00:03:01.680 --> 00:03:06.960
Reliability and validity in quantitative research are typically considered through

00:03:06.960 --> 00:03:11.440
these five different kinds of reliabilities or validities.

00:03:12.160 --> 00:03:15.920
Because quantitative research is done using numbers,

00:03:15.920 --> 00:03:19.280
and we process those numbers with a computer,

00:03:19.280 --> 00:03:23.760
the idea is that there is no unreliability in the actual analysis.

00:03:23.760 --> 00:03:26.240
So when we focus on reliability,

00:03:26.240 --> 00:03:29.200
we typically focus on the reliability of measurement.

00:03:29.200 --> 00:03:31.520
So if we measure the same thing again,

00:03:31.520 --> 00:03:32.960
would we get the same result?

00:03:33.760 --> 00:03:38.800
Then we have four different kinds, or aspects, of validity.

00:03:38.800 --> 00:03:41.040
The first one is measurement validity,

00:03:41.040 --> 00:03:46.000
do the data that we have actually measure what they are supposed to measure?

00:03:46.000 --> 00:03:51.840
For example, we have this example from Talouselämää 500,

00:03:51.840 --> 00:03:53.200
where we were interested in

00:03:53.200 --> 00:03:58.000
whether naming a woman as CEO causes profitability to increase.

00:03:58.000 --> 00:04:01.280
Then we need to consider whether our return on assets,

00:04:01.280 --> 00:04:02.960
which is the data here,

00:04:02.960 --> 00:04:05.200
is a valid measure of profitability.
00:04:06.000 --> 00:04:09.200
It probably is, because that's what managers like to use

00:04:09.200 --> 00:04:12.720
when they consider profitability, or investors like to use.

00:04:13.280 --> 00:04:18.880
Statistical conclusion validity refers to the first of the three conditions

00:04:18.880 --> 00:04:21.520
that are required for demonstrating causality.

00:04:21.520 --> 00:04:24.880
Statistical conclusion validity basically refers to

00:04:24.880 --> 00:04:28.080
whether we have identified the association correctly.

00:04:28.080 --> 00:04:34.000
So is it possible that this 4.7 percentage point difference is only because of chance,

00:04:34.000 --> 00:04:38.240
or is it evidence of a systematic difference between men-led and women-led companies?

00:04:38.240 --> 00:04:41.440
So that's the idea of statistical conclusion validity.

00:04:41.440 --> 00:04:43.360
Then internal validity refers to

00:04:43.360 --> 00:04:49.040
whether the second and third conditions for causality are true.

00:04:49.040 --> 00:04:54.160
So is the association actually valid evidence for a causal relationship?

00:04:54.160 --> 00:04:58.800
Have we properly established that it's not y that causes x,

00:04:58.800 --> 00:05:01.040
but it's actually x that causes y?

00:05:01.040 --> 00:05:04.000
And have we ruled out any alternative explanations?

00:05:04.000 --> 00:05:07.520
Then finally external validity is about generalizability.

00:05:07.520 --> 00:05:10.240
So let's assume that we have established

00:05:10.240 --> 00:05:13.200
that there is a clear causal relationship

00:05:13.200 --> 00:05:18.640
that naming a woman as CEO of one of the 500 largest Finnish companies

00:05:18.640 --> 00:05:20.560
causes profitability to increase.

00:05:21.360 --> 00:05:25.760
Can we say that that is true generally of all Finnish companies, or

00:05:25.760 --> 00:05:30.080
can we say that that applies to all large companies in any country?
00:05:30.080 --> 00:05:33.520
So generalizability and external validity are about

00:05:35.040 --> 00:05:37.920
how broadly we can make our claim.

00:05:37.920 --> 00:05:42.800
So these are the things that we need to evaluate

00:05:42.800 --> 00:05:44.880
when we evaluate quantitative studies.

00:05:44.880 --> 00:05:48.560
In qualitative studies sometimes we use the same terms,

00:05:48.560 --> 00:05:51.600
of course we don't do statistical conclusion validity, but

00:05:51.600 --> 00:05:56.560
how we actually go about evaluating them is slightly different.
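
The target-practice argument from earlier in the lecture, that averaging many unreliable but valid measurements ends up close to the correct value while a reliable but invalid measure stays wrong no matter how often it is repeated, can be sketched as a small simulation. This is only an illustration and not part of the lecture: the true value of 100, the bias of 10, the noise levels, and the names `TRUE_VALUE` and `simulate` are all assumptions chosen for the example.

```python
import random
import statistics

TRUE_VALUE = 100.0  # hypothetical "bull's eye"; not a number from the lecture


def simulate(bias, noise_sd, n, rng):
    """Draw n measurements with a systematic bias and random (Gaussian) error."""
    return [TRUE_VALUE + bias + rng.gauss(0.0, noise_sd) for _ in range(n)]


rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Valid but unreliable: sights are correct, shots scatter widely.
noisy = simulate(bias=0.0, noise_sd=15.0, n=10_000, rng=rng)

# Reliable but invalid: shots cluster tightly, but the sights are off by 10.
biased = simulate(bias=10.0, noise_sd=0.5, n=10_000, rng=rng)

noisy_mean = statistics.mean(noisy)
biased_mean = statistics.mean(biased)

print(f"valid, unreliable: mean error = {noisy_mean - TRUE_VALUE:+.2f}")
print(f"reliable, invalid: mean error = {biased_mean - TRUE_VALUE:+.2f}")
```

Under these assumptions, the average of the noisy measurements lands near the true value, while the average of the tightly clustered measurements stays about 10 units off. Repetition reduces random error but does nothing about systematic error.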