Correlation, causation, and association – What does it all mean???

A comment posted by a reader on a post reprimanded me for suggesting that marijuana caused relationships to go bad.

In this instance the reader was mistaken, as I had specifically used the word “associated”, but the comment made me think that maybe I should explain the differences between correlation, causation, and association. I’m a scientist studying addiction, and in the field, it’s very important to be clear about what each of the words you use means.

Being clear about inferences in research

Correlation – When researchers find a correlation, which can also be called an association, what they are saying is that they found a relationship between two or more variables. For instance, in the case of the marijuana post, the researchers found an association between using marijuana as a teen and having more troublesome relationships in one’s mid-to-late twenties.

Correlations can be positive – so that as one variable (marijuana smoking) goes up, so does the other (relationship trouble); or they can be negative, which would mean that as one variable goes up (methamphetamine smoking) another goes down (grade point average).
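To make the positive/negative distinction concrete, here is a minimal sketch that computes a Pearson correlation coefficient by hand. The numbers are made up purely for illustration and do not come from the marijuana study or any other; the variable names are hypothetical too.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up illustrative numbers, NOT data from any study.
use_per_week = [0, 1, 2, 3, 4, 5]
relationship_trouble = [1, 2, 2, 4, 5, 5]              # tends to rise with use
grade_point_average = [3.8, 3.6, 3.1, 2.9, 2.5, 2.0]   # tends to fall with use

r_positive = pearson_r(use_per_week, relationship_trouble)
r_negative = pearson_r(use_per_week, grade_point_average)

print(f"positive correlation: r = {r_positive:.2f}")
print(f"negative correlation: r = {r_negative:.2f}")
```

The coefficient always falls between -1 and 1. Its sign gives the direction of the relationship and its size gives the strength, but nothing in the calculation says which variable, if either, is doing the causing.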

The trouble is that, unless they are properly controlled for, there could be other variables affecting this relationship that the researchers don’t know about. For instance, education, gender, and mental health issues could be behind the marijuana-relationship association (these variables were all controlled for by the researchers in that study). Researchers have at their disposal a number of sophisticated statistical tools to control for these, ranging from the relatively simple (like multiple regression) to the highly complex and involved (multi-level modeling and structural equation modeling). These methods allow researchers to separate the effect of one variable from others, thereby leaving them more confident in making assertions about the true nature of the relationships they found. Still, even under the best analysis circumstances, correlation is not the same as causation.
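Here is a rough sketch of what “controlling for” a variable can look like, using simulated data. Everything in it is an assumption made for illustration: a single hidden variable (called `confounder` here) drives both of the other variables, which have no effect on each other at all. Removing the part of each variable that the confounder predicts, then correlating what is left over, is a hand-rolled stand-in for what multiple regression does with one control variable. It is not the analysis used in the study discussed above.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def residuals(ys, zs):
    """What is left of ys after a least-squares straight-line fit on zs."""
    n = len(ys)
    mean_y = sum(ys) / n
    mean_z = sum(zs) / n
    slope = (sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys))
             / sum((z - mean_z) ** 2 for z in zs))
    intercept = mean_y - slope * mean_z
    return [y - (intercept + slope * z) for y, z in zip(ys, zs)]

rng = random.Random(0)  # fixed seed so the simulation is repeatable
n = 5000

# Simulated (hypothetical) data: the confounder feeds into both variables,
# but neither variable affects the other.
confounder = [rng.gauss(0, 1) for _ in range(n)]
drug_use = [c + rng.gauss(0, 1) for c in confounder]
relationship_trouble = [c + rng.gauss(0, 1) for c in confounder]

raw_r = pearson_r(drug_use, relationship_trouble)
controlled_r = pearson_r(residuals(drug_use, confounder),
                         residuals(relationship_trouble, confounder))

print(f"raw correlation:                {raw_r:.2f}")
print(f"after controlling for confounder: {controlled_r:.2f}")
```

In this setup the raw correlation is sizable even though there is no direct link, and it shrinks to roughly zero once the confounder is accounted for. The catch is that this only works for variables the researchers thought to measure.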

Causation – When an article says that causation was found, this means that the researchers found that changes in one variable they measured directly caused changes in the other. An example would be research showing that jumping off a cliff directly causes great physical damage. In order to do this, researchers would need to assign people to jump off a cliff (versus, let’s say, jumping off a 12-inch ledge) and measure the amount of physical damage caused. When they find that jumping off the cliff causes more damage, they can assert causality. Good luck recruiting for that study!

Most of the research you read about indicates a correlation between variables, not causation. You can find the key words by carefully reading. If the article says something like “men were found to have,” or “women were more likely to,” they’re talking about associations, not causation.

Why the correlation-causation difference?

The reason is that in order to actually be able to claim causation, the researchers have to split the participants into different groups, and randomly assign some to the behavior or condition they want to study (like taking a new drug), while the rest receive something else. This is in fact what happens in clinical trials of medication because the FDA requires proof that the medication actually makes people better (more so than a placebo). It’s this random assignment to conditions (or randomization) that makes experiments suitable for the discovery of causality. Unlike in association studies, random assignment ensures (if everything is designed correctly) that it’s the behavior being studied, and not some other factor, that is causing the outcome.
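The value of random assignment can be sketched with a small simulation. All of the numbers here are assumptions chosen for illustration: a treatment with a known true effect of 1.0, and a hidden trait that strongly affects the outcome. In the “observational” version, people with more of the trait are more likely to end up treated; in the “randomized” version, a coin flip decides.

```python
import random

TRUE_EFFECT = 1.0        # assumed effect of the treatment on the outcome
rng = random.Random(1)   # fixed seed so the simulation is repeatable

def outcome(treated, trait):
    """Outcome depends on the treatment, the hidden trait, and some noise."""
    return TRUE_EFFECT * treated + 3.0 * trait + rng.gauss(0, 1)

def mean(values):
    return sum(values) / len(values)

def difference_in_means(rows):
    """Average outcome of the treated minus average outcome of the untreated."""
    treated = [y for t, y in rows if t]
    control = [y for t, y in rows if not t]
    return mean(treated) - mean(control)

n = 20000
traits = [rng.gauss(0, 1) for _ in range(n)]  # hidden trait, never measured

# Observational study: people high on the trait tend to select the treatment.
observational = []
for trait in traits:
    treated = trait + rng.gauss(0, 1) > 0
    observational.append((treated, outcome(treated, trait)))

# Experiment: a coin flip decides who is treated, regardless of the trait.
randomized = []
for trait in traits:
    treated = rng.random() < 0.5
    randomized.append((treated, outcome(treated, trait)))

observational_estimate = difference_in_means(observational)
randomized_estimate = difference_in_means(randomized)

print(f"true effect:            {TRUE_EFFECT:.2f}")
print(f"observational estimate: {observational_estimate:.2f}")
print(f"randomized estimate:    {randomized_estimate:.2f}")
```

With self-selection, the simple comparison mixes the treatment’s effect with the effect of the hidden trait and badly overstates it. With the coin flip, the trait is spread evenly across both groups on average, so the same comparison lands close to the true effect.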

Obviously, it is much more difficult to prove causation than it is to prove an association.

Should we just ignore associations?

No! Not at all!!! Not even close!!! Correlations are crucial for research and still need to be looked at and studied, especially in some areas of research like addiction.

The reason is simple – We can’t randomly give people drugs like methamphetamine as children and study their brain development to see how the stuff affects them; that would be unethical. So what we’re left with is the study of what meth use (and use of other drugs) is associated with. It’s for this reason that researchers use special statistical methods to assess associations, making certain that they are also considering other things that may be interfering with their results.

In the case of the marijuana article, the researchers ruled out a number of other interfering variables known to affect relationships, like aggression, gender, education, closeness with other family members, etc. By doing so, they did their best to assure that the association found between marijuana and relationship status was real. Obviously other possibilities exist, but as more researchers assess this relationship in different ways, we’ll learn more about its true nature.

This is how research works.

It’s also how we found out that smoking causes cancer: through endlessly repeated findings showing an association. That turned out pretty well, I think…

6 responses to “Correlation, causation, and association – What does it all mean???”

  1. Good day:

    I happened upon this website while researching correlation and causation.

    Let me state first that I appreciate the efforts of scientific research because, in many cases, cures and a better understanding are discovered.

    However, I do have an issue with correlation resulting in causation when too many other “risk factors” are evident but ignored (?).

    Correlation does not necessarily equal causation, yet there are numerous “results” published erroneously (in my opinion). For instance: How can it be determined that SHS causes SIDS when there are so many babies dying of SIDS who have NEVER been exposed? It is unfair and unethical to publish findings that create a “hyperbolic mass hysteria” in society. I could go on further with the “assumptions, allegations, and opinions” stating that SHS CAUSES cancer in people who do not smoke.

    Genetic mapping is the prominent factor here and whether a person smokes or not, if you are genetically predisposed, there is a good chance you will contract cancer.

    I thank you for “listening” and hope that, someday, the published reports are corrected and order restored to our society.



    • Thanks for writing Josie,
      In general, as I point out, scientists will NEVER claim that correlation proves causation. However, studies using correlations can use a whole slew of statistical techniques to rule out other KNOWN risk factors, making assertions about a relationship (still not causal) much stronger. What journalists do with scientific findings is often unfortunate, but they’re the ones to be yelled at here; most scientists adhere to the rule.
      As you point out, there are many children who die from SIDS who are never exposed to second hand smoke (took me a second to get the acronym). There are also many people who die in car accidents who never drink and drive, but we know that driving while intoxicated increases the likelihood that you will get involved in a fatal car crash. Here even causality doesn’t have to mean a 1 to 1 ratio where engaging in activity A necessarily leads to outcome B.
      SHS is considered to cause cancer in people who don’t smoke because over and over, in repeated studies using different populations and controlling for different factors (like age, SES, education, other health habits, etc.) SHS has been shown to be associated with an increased incidence, and prevalence, of cancer.
      I agree with the genetic disposition argument, but that’s probably only half the equation; the rest is behavioral (smoking, eating specific foods and not others) and environmental (SHS, stress, etc.).
      If you want to go straight to the source, read the academic articles (search on Google Scholar, for instance) rather than relying on popular media, which often distort findings. Still, keep an open mind and be willing to accept, at least tentatively, findings that disagree with your viewpoint if the evidence seems to be there.

  2. Outstandingly educational thank you, I do think your current readers would probably want more stories like that keep up the good content.

  3. Priceless data contained by articles can make up for a fantastic e-book. This way your energy doesn’t remain buried beneath piles and piles of similar content in article directories, instead it continues to make prospects and affiliate income through intelligent link positioning within the content. Merely an idea broaden article marketing potential…

  4. I think this article is very needed in that it shows how research is done and how certain conclusions are made; however, I don’t appreciate the marijuana reference. I guess it was an easy analogy, but what about the association of marijuana and cancer or glaucoma? Still a very good read.
