Critical thinking about causality
Causal relationship: one thing happening makes the other thing MORE PROBABLE to
happen (a statistical relationship)
Correlation does not imply causation
We don’t see a causal relationship directly → we infer it, e.g. because one event follows the other
Causality (John Stuart Mill): X causes Y only if
- Priority: the change in X precedes the change in Y
  - Longitudinal study needed
- Consistency: the change in X varies systematically with the change in Y
  - Covariance is needed
- Exclusivity: there is no alternative explanation for the relationship
  - Manipulation (groups) is needed
*Conclusion is not possible since exclusivity cannot be met (a
third variable can explain the relationship between the variables)
*Priority principle is also not met: self-esteem is considered an
effect not a cause
Reasoning errors: drawing a causal conclusion from only one of Mill’s criteria
1. Post hoc ergo propter hoc (Y happens after X, so X is taken to be the cause)
a. X precedes Y (priority) --> focuses on one of Mill’s criteria and ignores the
other two (consistency and exclusivity still need to be checked)
b. X covaries with Y (consistency/correlation) --> ignores priority and exclusivity
c. X is the only possible cause of Y (exclusivity) --> ignores priority and consistency
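INUS condition: a cause is an Insufficient but Non-redundant part of an Unnecessary
but Sufficient set of factors
*Insufficient: needs other elements
*Non-redundant: crucial, its presence makes a difference
*Unnecessary: there are other ways to start a fire (replaceable)
*Sufficient: the factors (set of things) together are sufficient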
How to check for non-redundancy: have two identical versions of the world whose only
difference is one factor → now you have an ideal counterfactual (a perfect counterfactual
does not exist 🙂)
- Create an experimental and a control group (through random assignment, the groups are
identical except for chance differences) ⇒ that is experimental design
  - Useful because of manipulation of variables, random assignment,
  counterfactual, control group
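The idea that random assignment approximates the counterfactual can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the hidden trait, the noise levels, and the value of TRUE_EFFECT are hypothetical and not taken from the lecture.

```python
import random
from statistics import mean

random.seed(1)

TRUE_EFFECT = 2.0  # hypothetical treatment effect, chosen for illustration

# Each person has a hidden trait (e.g. motivation) that also raises the outcome.
people = [random.gauss(0, 1) for _ in range(10_000)]

# Random assignment: shuffle, then split into two equally sized groups.
random.shuffle(people)
treated, control = people[:5_000], people[5_000:]

def outcome(trait, is_treated):
    # Outcome = hidden trait + treatment effect (only if treated) + noise.
    return trait + (TRUE_EFFECT if is_treated else 0.0) + random.gauss(0, 1)

y_treated = [outcome(t, True) for t in treated]
y_control = [outcome(t, False) for t in control]

# The groups are comparable on the hidden trait (gap close to 0) ...
trait_gap = mean(treated) - mean(control)
# ... so the difference in mean outcomes estimates the treatment effect.
estimate = mean(y_treated) - mean(y_control)

print(f"hidden-trait gap between groups: {trait_gap:.3f}")
print(f"estimated treatment effect:      {estimate:.3f}")
```

Because chance alone decides who is treated, the control group can stand in for what would have happened to the treated group without treatment.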
Threats to causality:
1. History: influences outside of the intervention that affect the outcome
2. Maturation: natural changes over time that may be confused with a treatment effect
3. Selection: selection criteria for the treatment are related to the outcome of the treatment;
systematic differences between conditions that could also cause the observed effect
4. Attrition: loss of participants (dropout) that is systematically correlated with the
conditions, so the comparison between conditions is affected
5. Instrumentation: a change in the measuring instrument that produces a difference between
pre- and post-measurement
6. Testing: effect of measurement on measurement (fatigue, habituation, etc.); exposure
to a test can affect scores on subsequent exposures
7. Regression to the mean: extreme scores tend to be followed by less extreme scores
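Regression to the mean (threat 7) can be made concrete with a short simulation. The sketch below assumes a hypothetical test score made up of a stable true score plus independent measurement noise; all numbers are illustrative.

```python
import random
from statistics import mean

random.seed(2)

# Each person has a stable true score; every measurement adds independent noise.
true_scores = [random.gauss(100, 10) for _ in range(10_000)]
test1 = [t + random.gauss(0, 10) for t in true_scores]
test2 = [t + random.gauss(0, 10) for t in true_scores]

# Select the top 10% on the first test. No treatment happens in between.
cutoff = sorted(test1)[int(0.9 * len(test1))]
top = [i for i, score in enumerate(test1) if score >= cutoff]

top_mean_1 = mean(test1[i] for i in top)
top_mean_2 = mean(test2[i] for i in top)

print(f"top group, first test:  {top_mean_1:.1f}")
print(f"top group, second test: {top_mean_2:.1f}")  # lower, yet still above 100
```

The selected group scores lower on the second test even though nothing was done to it, which is why a drop after selecting extreme scorers should not be read as a treatment effect.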
DAG (directed acyclic graph) ⇒ makes it easier to:
- be more specific about what we are assuming about the causal relationships
- identify potential confounds when estimating the true causal effect of one variable
on another
- understand some applied issues, e.g. when it is justified to conclude that a
correlation is causal
Mediation: the effect of X on Y is indirect, mediated by Z (X → Z → Y)
Confounder: common cause (X ← Z → Y) → X and Y correlate because they
share a common cause; the association is distorted when Z is not controlled for
Collider: common effect (X → Z ← Y); the association is distorted when Z is controlled for
*Whether you should adjust for a third variable (Z) depends on the situation you are in →
make your assumptions explicit → use causal graphs to help yourself and the reader
- Control for a confounder; don’t control for a collider, and don’t control for a mediator
when you want the total effect (controlling: holding the variable constant, e.g. by
comparing cases within the same level of Z)
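A small simulation can show why a confounder should be controlled for. In this sketch Z is a hypothetical binary common cause of X and Y, and X has no effect on Y at all; "controlling" is done by looking within each level of Z. The pearson helper is written out by hand only to keep the example free of dependencies.

```python
import random
from statistics import mean

random.seed(3)

def pearson(xs, ys):
    """Pearson correlation, written out to keep the sketch dependency-free."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

n = 20_000
z = [random.random() < 0.5 for _ in range(n)]    # common cause Z
x = [2.0 * zi + random.gauss(0, 1) for zi in z]  # Z -> X
y = [2.0 * zi + random.gauss(0, 1) for zi in z]  # Z -> Y, no X -> Y path

# Without control: X and Y correlate although neither causes the other.
raw = pearson(x, y)

# With control: correlation within each level of Z.
within = {}
for level in (False, True):
    xs = [xi for xi, zi in zip(x, z) if zi == level]
    ys = [yi for yi, zi in zip(y, z) if zi == level]
    within[level] = pearson(xs, ys)

print(f"raw correlation:        {raw:.2f}")
print(f"within Z=0 / within Z=1: {within[False]:.2f} / {within[True]:.2f}")
```

The raw correlation is clearly positive, while the correlation within each level of Z is close to zero: the association was produced entirely by the shared cause.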
Foster (2010): a swamp of ambiguity has arisen around statements about causality
1. Ignoring causality: some authors report only correlations, without making any
statements about causality.
2. Causal statements with unclear assumptions: statements about causal relationships
are made on the basis of correlational data, but often without specifying the
assumptions.
3. Pseudo-correlational statements: no direct statements about causality, but causality
is clearly implied in the conclusion.
● If all confounders are controlled for, a correlation between treatment and outcome
can be seen as causal
○ This does not mean that the more variables are controlled for, the more accurate
the estimate of the causal effect (that mistaken belief is the purification principle)
■ Problem of overcorrection: controlling for mediators on the causal
path can lead to an over- or underestimation of the causal effect
■ Collider bias: controlling for common effects will bias the estimate
of the causal relationship between two variables
Mediator: Z is caused by the treatment variable X and is itself a cause of
the outcome variable Y
❖ Indirect effect → X affects Y through Z rather than directly
❖ For the total effect of X on Y, don’t control for the mediator
❖ For the direct effect of X on Y, control for the mediator
❖ Check the effect of X on Z first, then the effect of Z on Y
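The difference between the total and the direct effect can be sketched with simulated data. The path strengths A, B and C below are hypothetical. Controlling for the mediator Z is done here by residualizing both X and Y on Z and regressing the residuals on each other (the Frisch-Waugh approach), which gives the same X coefficient as a regression of Y on X and Z together.

```python
import random
from statistics import mean

random.seed(4)

def slope(xs, ys):
    """Slope of a simple (one-predictor) regression of ys on xs."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def residuals(xs, ys):
    """What is left of ys after removing its linear relation with xs."""
    b, mx, my = slope(xs, ys), mean(xs), mean(ys)
    return [(y - my) - b * (x - mx) for x, y in zip(xs, ys)]

# Hypothetical path strengths: X -> Z, Z -> Y, and the direct path X -> Y.
A, B, C = 0.8, 0.5, 0.3

n = 20_000
x = [random.gauss(0, 1) for _ in range(n)]
z = [A * xi + random.gauss(0, 1) for xi in x]
y = [C * xi + B * zi + random.gauss(0, 1) for xi, zi in zip(x, z)]

# Total effect: do NOT control for the mediator. Expected near C + A*B = 0.7.
total = slope(x, y)

# Direct effect: control for the mediator Z. Expected near C = 0.3.
direct = slope(residuals(z, x), residuals(z, y))

print(f"total effect:    {total:.2f}")
print(f"direct effect:   {direct:.2f}")
print(f"indirect effect: {total - direct:.2f}")  # near A*B = 0.4
```

Controlling for Z removes the part of the effect that runs through Z, so reading the controlled estimate as the total effect of X would underestimate it in this setup.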
Collider (common effect):
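❖ X and Y both cause Z ⇒ Z is a common effect
❖ Do not control for the third variable
➢ Otherwise collider bias
■ A (negative) correlation appears that does not actually exist
■ Example: sprinkler and rain both cause a wet lawn. "No sprinkler and no rain" cannot
produce a wet lawn, so among wet lawns, no sprinkler implies rain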
Tinder example: thinking that a beautiful personality and a beautiful face are mutually exclusive
➔ Negative correlation between beauty and personality ⇒ arises from conditioning on a
collider ⇒ COLLIDER BIAS
➔ People are selected for a Tinder date on attractiveness and/or personality (being
selected is the collider)
◆ To the degree that one is absent, the other is likely to be more present
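The Tinder example can be reproduced in a simulation. In this sketch attractiveness and personality are generated independently, and the selection rule (going on a date when the two together exceed a threshold) is a hypothetical stand-in for however people actually choose. The pearson helper is written out by hand to keep the example free of dependencies.

```python
import random
from statistics import mean

random.seed(5)

def pearson(xs, ys):
    """Pearson correlation, written out to keep the sketch dependency-free."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

n = 20_000
attractiveness = [random.gauss(0, 1) for _ in range(n)]
personality = [random.gauss(0, 1) for _ in range(n)]  # independent by design

# Collider: a date happens only if the two qualities together are high enough.
dated = [a + p > 1.0 for a, p in zip(attractiveness, personality)]

r_everyone = pearson(attractiveness, personality)
r_dated = pearson(
    [a for a, d in zip(attractiveness, dated) if d],
    [p for p, d in zip(personality, dated) if d],
)

print(f"correlation in the whole population: {r_everyone:.2f}")  # close to 0
print(f"correlation among people dated:      {r_dated:.2f}")     # negative
```

Looking only at the people who were dated means conditioning on the collider, and that alone creates the negative correlation between two traits that are unrelated in the population.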
Correlation & Simple regression
Simple regression only has one predictor