Understanding Significance and P-values

In this video I explain the concept of significance and how it relates to effect size, sample size, and the law of large numbers. The role of a p-value in describing the likelihood of data is explained, with an emphasis on avoiding misinterpretations such as confusing a low p-value with support for a particular hypothesis.

Don’t forget to subscribe to the channel to see future videos! Have questions or topics you’d like to see covered in a future video? Let me know by commenting or sending me an email!

Need more explanation? Check out my full psychology guide: Master Introductory Psychology: http://amzn.to/2eTqm5s

Video transcript:
Hi, I’m Michael Corayer and this is Psych Exam Review and in this video I’m going to talk about significance and significance is an inferential statistic, so unlike the descriptive statistics which simply describe our data, inferential statistics allow us to make judgments and inferences about what our data might mean.

Now when we use the term significance we’re not using it the way you use it in everyday life to say that you have a significant workload or a significant other. In research when we use the term significance we’re saying that the data meets certain mathematical criteria.

Now there are a number of ways to calculate significance and we’re not going to get into the calculation in this video, but I am going to talk about the factors that influence whether data might be significant or not. So, what are these factors?

One of them is the effect size. And all the effect size refers to is how large the difference is between my experimental group and my control group. So it’s simply the difference between the experimental group and the control group.

So let’s imagine that I told you I invented a smart drug. I said OK I’m going to test my drug with a group of students, so I gave half of the students the drug and the other half a placebo and then I measured their results on some exam and I tell you that the drug group had an average score of 88 and the placebo group had an average score of 87.

I say “look my smart drug worked!” You’d say well, maybe, because this effect size is really small and we can’t be so sure that it’s coming from the drug. So if I were to tell you that the drug group had an average score of 95 and the placebo group had an average score of 30 then you might be more intrigued to find out more. This drug might actually be doing something because we have a much larger effect size.
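To make this concrete, here’s a minimal sketch in Python. The scores below are made-up numbers for illustration, not data from any real study; the “effect size” here is just the raw difference between the two group means.

```python
# Hypothetical exam scores for the two groups (made-up numbers).
drug_group = [95, 93, 97, 94, 96]
placebo_group = [30, 28, 32, 31, 29]

def mean(scores):
    """Arithmetic mean of a list of scores."""
    return sum(scores) / len(scores)

# The raw effect is the difference between the group means.
effect = mean(drug_group) - mean(placebo_group)
print(effect)  # 65.0
```

A 65-point gap between group means is the kind of large effect that would make the drug worth a closer look; the 1-point gap from the first example would not.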

But it’s not effect size alone that matters. We need to know about the sample size, which refers to our number of participants. So if I told you one student took this drug and scored a 95, and another student didn’t take the drug and scored a 30, you’d say OK, but that’s only two participants, which is a really small sample. It could be the case that the student who was going to get a 95 happened to be the one who took the drug and the student who was going to score poorly happened to be in the control group, so this could have just happened on its own; it’s not necessarily the drug.

But if I told you I had 1000 students who took the drug and they averaged 95, and 1000 students who didn’t and they averaged 30, now this data is more convincing. So with sample size, we tend to follow the idea that bigger is better. This is sometimes referred to as the law of large numbers: the more participants we have, the more confident we can be in our data.
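You can watch the law of large numbers at work by simulating coin flips. This is a sketch using Python’s standard `random` module, with a fixed seed so the run is repeatable: with more flips, the sample average settles ever closer to the true mean of 0.5.

```python
import random

random.seed(42)  # fixed seed so the simulation is repeatable

def mean_of_flips(n):
    """Average of n simulated fair coin flips (1 = heads, 0 = tails)."""
    return sum(random.randint(0, 1) for _ in range(n)) / n

# Larger samples drift less from the true mean of 0.5.
for n in (10, 100, 10_000):
    print(n, mean_of_flips(n))
```

A sample of 10 flips can easily average 0.3 or 0.7, while a sample of 10,000 rarely strays far from 0.5; this is the same reason a 2-person drug trial is unconvincing but a 2,000-person trial is not.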

OK, so let’s see what significance actually looks like. When we calculate significance, which we’re not going to do now, we come up with a p-value, and what a p-value tells us is how likely it is that our data would occur by chance. Is the data we collected likely to happen on its own or not? How much could chance be playing a role in these results?

And what we want is very unlikely data. We want data that doesn’t just happen on its own. We want a situation where it’s unlikely for this to happen and it happened anyway; that’s more interesting to us. So in order to say something is significant, we usually require that the p-value is below 0.05.

And in some cases we might need to be more strict with our p-value; you may see cases where people want to see the p-value below .01 or even .001, but generally significance means a p-value lower than .05. So what exactly does this mean?
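The decision rule itself is nothing more than a comparison against whichever threshold (often called alpha) you’ve chosen, as this small sketch shows:

```python
def is_significant(p_value, alpha=0.05):
    """Conventional check: 'significant' means p fell below the chosen threshold."""
    return p_value < alpha

print(is_significant(0.03))              # True at the usual .05 level
print(is_significant(0.03, alpha=0.01))  # False under the stricter .01 threshold
```

Note that the same result (p = .03) counts as significant or not depending entirely on the threshold chosen before the study, which is why stricter fields insist on .01 or .001.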

Let’s imagine a scenario that I gave you a fair coin and I told you that I had the ability to mentally control this coin and when you flipped the coin, I would make sure that it landed on heads.

Now imagine that you’ve flipped this coin once and it landed on heads. What conclusions would you draw? Well, you probably wouldn’t be willing to say that I had mental powers yet. You’d say, OK, that’s interesting, but that’s likely to occur on its own. I mean, if I flip a coin and you’re not even here, the odds of it landing on heads are 50%, so I’m not convinced that you have mental powers; I’m going to want to see you do it again and again and again.

Each coin flip that you add makes the data less and less likely to occur on its own. So if you flip the coin 100 times in a row and it’s heads every time, and I’m telling you this is because of my mental power, now you might be more intrigued. And if you did it 1000 times in a row, it would be even more interesting. Now, of course, you could say this is still a possibility.

It is true that if you flip a coin 1000 times it’s possible you’ll get heads every time just by chance, but it’s very unlikely. And if you did it a million times, again, it’s unlikely but it could happen. Each flip makes it less and less likely to happen, but you’ll never get to 0. So we’ll never be able to say that the p-value is 0. It could be 0.000000… you know, hundreds of digits long, but it’s never going to get to zero. So there’s always a chance that the data could happen on its own.
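The arithmetic behind this is simple: each flip of a fair coin halves the probability of an all-heads run, so the probability shrinks fast but never reaches zero. A quick sketch:

```python
def p_all_heads(n):
    """Probability of n heads in a row from a fair coin: 0.5 ** n."""
    return 0.5 ** n

print(p_all_heads(1))    # 0.5
print(p_all_heads(100))  # about 7.9e-31: astronomically small, but not zero
```

(One computational caveat: for very large n the floating-point result eventually underflows to 0.0, even though the true mathematical probability never does.)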

So the important thing to remember about a p-value is that even when p is less than .05, as in our coin flip example, it doesn’t tell us why the result is happening. Let’s say I did it 100 times: OK, it could be chance, but it’s very unlikely. This is a really important point for thinking about significance: the p-value tells us that the data is unlikely, but it doesn’t tell us how or why the event is occurring. It just tells us it’s a very unlikely event to observe.

So this is really important, because when we see a low p-value it doesn’t mean that the hypothesis is correct. The hypothesis is just a possible explanation for this data, but the p-value doesn’t know what the hypothesis is. The p-value just says it’s unlikely for this to happen. So no matter how many times you flip the coin, that doesn’t prove that I have mental powers. That’s just the explanation I’m giving you, but the p-value doesn’t tell us whether that explanation is correct. The p-value just says “well, this would be really, really unlikely to happen on its own.”

So keep that in mind whenever you think about a p-value. It just tells you this is unlikely; this is probably not chance, though it still could be. But it doesn’t tell us anything about the hypothesis. It doesn’t tell us how or why the event is occurring. It could be mental powers controlling the coin flips, or it could very well be something else, and the p-value isn’t going to be able to answer that question.

I hope you found this helpful. If so, please like the video and subscribe to the channel for more. Thanks for watching!
