Summary

Summary Social Behaviour Dynamics

Rating
-
Sold
1
Pages
58
Uploaded on
09-02-2022
Written in
2021/2022

Applied Data Science Utrecht University (UU): dynamics of social and behavioural processes, and longitudinal research.



Hernán et al. (2019) – A Second Chance to Get Causal
Inference Right: A Classification of Data Science Tasks
Traditional statement (challenged by the paper): statistics may be applied to make causal
inferences when using data from randomized experiments, but not when using nonexperimental
(observational) data

Simpson’s paradox: failure to recognize that the choice of data analysis depends on the
causal structure of the problem
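
A small numeric illustration of Simpson's paradox, as a sketch only. The counts are hypothetical (not taken from the paper), and the names `counts`, `stratum_rates` and `pooled_rates` are invented for this example. Treatment A has the higher recovery rate within each severity stratum, yet the lower rate once the strata are pooled. Which comparison answers the question depends on the assumed causal structure, here that severity influences both treatment choice and recovery.

```python
# Hypothetical counts (recovered, total) per severity stratum and treatment.
counts = {
    "mild":   {"A": (18, 20), "B": (64, 80)},
    "severe": {"A": (40, 80), "B": (8, 20)},
}


def stratum_rates(counts):
    """Recovery rate per treatment within each stratum."""
    return {
        stratum: {t: recovered / total for t, (recovered, total) in by_treatment.items()}
        for stratum, by_treatment in counts.items()
    }


def pooled_rates(counts):
    """Recovery rate per treatment after ignoring the strata."""
    pooled = {}
    for by_treatment in counts.values():
        for t, (recovered, total) in by_treatment.items():
            prev_recovered, prev_total = pooled.get(t, (0, 0))
            pooled[t] = (prev_recovered + recovered, prev_total + total)
    return {t: recovered / total for t, (recovered, total) in pooled.items()}


within = stratum_rates(counts)
pooled = pooled_rates(counts)
print("within strata:", within)
print("pooled:", pooled)
```

If severity is a common cause of treatment and recovery, the within-stratum comparison is the relevant one. Under a different causal structure the pooled comparison could be the right choice, which is why the data alone cannot settle it.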

1. A Classification of Data Science Tasks
The scientific contributions of data science can be organized into three classes of tasks:

- Description: using data to provide a quantitative summary of features of the world
o Elementary calculations (mean / proportion)
o Unsupervised learning algorithms (cluster analysis)
o “Clever” data visualisations (storytelling)
- Prediction: using data to map some features of the world (the inputs) to other
features of the world (the outputs); from two to hundreds of variables
o Elementary calculations: e.g., correlation coefficient, risk difference
o Supervised learning algorithms: random forests, neural networks
- Counterfactual prediction: using data to predict certain features of the world as if
the world had been different, as is required in causal inference applications
o Elementary calculations in randomised experiments with perfect adherence
o Complex implementations like g-methods
Statistical inference (explanation; confirmatory) is often required for all three tasks

Sciences are primarily defined by their questions rather than by their tools




2. Prediction vs. Causal Inference
Predictive (non-causal) applications of data science: map inputs to outputs

- but do not consider what the world would look like under different courses of action

Mapping observed inputs to outputs is well suited to automated data analysis because it only requires:

- Large data set with inputs and outputs
- Algorithm that establishes a mapping between inputs and outputs
- Metric to assess the performance of the mapping, often based on a gold standard
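
The three ingredients above can be sketched in a few lines. This is a minimal illustration with made-up numbers, not a method from the paper: the data set, the choice of a least-squares line as the "algorithm", and the helper names `fit_line` and `mean_squared_error` are all assumptions. The held-out observed outputs play the role of the gold standard.

```python
# Ingredient 1: a data set with inputs x and outputs y (made-up numbers).
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1), (6.0, 11.9)]
train, test = data[:4], data[4:]


def fit_line(pairs):
    """Ingredient 2: an algorithm mapping inputs to outputs (least-squares line)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_squared_error(pairs, slope, intercept):
    """Ingredient 3: a metric comparing predictions with observed outputs."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in pairs) / len(pairs)


slope, intercept = fit_line(train)
test_mse = mean_squared_error(test, slope, intercept)
print(f"slope={slope:.2f} intercept={intercept:.2f} test MSE={test_mse:.4f}")
```

Nothing in this loop needs subject-matter knowledge once x and y are fixed, and nothing in it says what would happen to y if x were changed by intervention.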

Prediction tasks require expert knowledge to specify the scientific question:

- What inputs and what outputs
- Identify / generate relevant data sources

However, no expert knowledge is required for prediction after the inputs and outputs are
specified and measured in a particular dataset (machine learning can take over here)

Causal inference needs expert knowledge to give meaning to predictions (to specify the causal structure)

Confounding factor: a variable that influences both the input of interest and the outcome, so that
their association does not reflect a causal effect

Model paradox: the estimated effect of one variable can be taken over by another variable,
depending on which variables are left out of the model

Counterfactual: a potential outcome that is not observed

Naïve conclusion: inferring causal relations from predictions alone
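
A sketch of how a confounding factor produces a naïve conclusion. The population below is entirely hypothetical and constructed so that the treatment `a` has no effect at all: the outcome `y` depends only on the confounder `z`, which also drives who gets treated. The function names are invented for this illustration.

```python
def build_population():
    """Hypothetical units: z drives both treatment uptake and outcome; a has no effect."""
    units = []
    for z, n_treated, n_untreated in [(0, 20, 60), (1, 60, 20)]:
        for a, n in [(1, n_treated), (0, n_untreated)]:
            y = 10 - 4 * z  # outcome depends on z only, never on a
            units.extend({"z": z, "a": a, "y": y} for _ in range(n))
    return units


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def naive_difference(units):
    """Mean outcome of treated minus untreated, ignoring z."""
    treated = mean(u["y"] for u in units if u["a"] == 1)
    untreated = mean(u["y"] for u in units if u["a"] == 0)
    return treated - untreated


def adjusted_difference(units):
    """Average of within-stratum differences, weighted by stratum size."""
    total = len(units)
    diff = 0.0
    for z in sorted({u["z"] for u in units}):
        stratum = [u for u in units if u["z"] == z]
        diff += naive_difference(stratum) * len(stratum) / total
    return diff


population = build_population()
naive = naive_difference(population)
adjusted = adjusted_difference(population)
print("naive difference:", naive)
print("difference adjusted for z:", adjusted)
```

The naive comparison suggests the treatment lowers the outcome, while the stratified comparison recovers the true null effect. Knowing that `z` is the variable to adjust for is exactly the expert knowledge about causal structure that the data cannot supply.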

3. Implications for Decision-Making
Predictive algorithms inform us that decisions have to be made, but they cannot help us make the
decisions; predictive algorithms also do not depict actual causality

- Causal analysis needed to answer “what if” questions, and avoid agnostic features

The distinction between prediction and causal inference (counterfactual prediction) is negligible for
decision-making when the relevant expert knowledge is codifiable into algorithms

- This is not the case for complex systems (too chaotic for long-term prediction), which have
o Unknown and nondeterministic governing laws (“rules of the game”)
o Uncertainty about whether the necessary data are available
o No possibility of learning by trial and error (experimenting)

Understanding causality in a complex system requires qualitative (model) knowledge

- Extremely complex systems require narrow research questions and modest analysis
o Not explaining the causal structure of the entire system or finding globally optimal decisions




4. Process and Implications for Teaching
Accuracy of causal answers cannot be quantified using observational data

Data scientists without subject-matter knowledge cannot conduct causal analyses in isolation:

- They don’t know how to articulate the questions (what the target experiment is)
- They don’t know how to answer them (how to emulate the target experiment)

5. Conclusion
Data science that embraces causal inference must

- Develop methods for integrating sophisticated analytics with expert causal knowledge
- Acknowledge that, unlike for prediction, the assessment of the validity of causal inferences
cannot be exclusively data-driven, because the validity of causal inferences also depends on
the adequacy of expert causal knowledge

Causal directed acyclic graphs: represent different sets of causal structures compatible
with existing causal knowledge, and so help explore the impact of causal uncertainty on effect estimates

Intelligence is the ability to predict counterfactually how the world would change under
different actions by integrating expert knowledge and mapping algorithms

- No AI will be worthy of the name without causal inference




Holland (1986) – Statistics and Causal Inference
1. Introduction
Randomised experiments: statistical procedure with ability to identify causation

Purpose of the paper is to show, firstly, how statistics is useful for causal inference, and
secondly, the difference between causal and associational inference

2. Model for Associational Inference
The joint distribution of 𝑌 and 𝐴 over 𝑈 is specified by 𝑃𝑟( 𝑌 = 𝑦, 𝐴 = 𝑎) = proportion of 𝑢
in 𝑈 for which 𝑌(𝑢) = 𝑦 and 𝐴(𝑢) = 𝑎

- Associational parameters are determined by this joint distribution

The conditional distribution of 𝑌 given 𝐴 is specified by 𝑃𝑟(𝑌 = 𝑦|𝐴 = 𝑎) = 𝑃𝑟(𝑌 = 𝑦, 𝐴 = 𝑎) / 𝑃𝑟(𝐴 = 𝑎)


- Conditional distribution describes how the distribution of 𝑌 values changes over 𝑈 as 𝐴 varies
- Associational parameter: regression of 𝑌 on 𝐴, conditional expectation 𝐸(𝑌|𝐴 = 𝑎)
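
The associational quantities above can be computed directly on a finite population. The sketch below assumes a small hypothetical population of eight units and uses exact fractions; the attribute labels `"a1"` and `"a2"` and all function names are invented for the illustration.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical population U: each unit u has an attribute A(u) and a response Y(u).
population = [
    {"A": "a1", "Y": 1}, {"A": "a1", "Y": 1}, {"A": "a1", "Y": 0}, {"A": "a1", "Y": 0},
    {"A": "a2", "Y": 1}, {"A": "a2", "Y": 1}, {"A": "a2", "Y": 1}, {"A": "a2", "Y": 0},
]


def joint(pop):
    """Pr(Y = y, A = a): proportion of units with Y(u) = y and A(u) = a."""
    n = len(pop)
    counts = Counter((u["Y"], u["A"]) for u in pop)
    return {key: Fraction(count, n) for key, count in counts.items()}


def marginal_a(pop):
    """Pr(A = a): proportion of units with A(u) = a."""
    n = len(pop)
    counts = Counter(u["A"] for u in pop)
    return {a: Fraction(count, n) for a, count in counts.items()}


def conditional(pop, y, a):
    """Pr(Y = y | A = a) = Pr(Y = y, A = a) / Pr(A = a)."""
    return joint(pop).get((y, a), Fraction(0)) / marginal_a(pop)[a]


def conditional_expectation(pop, a):
    """E(Y | A = a): the regression of Y on A evaluated at a."""
    y_values = sorted({u["Y"] for u in pop})
    return sum(y * conditional(pop, y, a) for y in y_values)


for a in ("a1", "a2"):
    print(a, conditional(population, 1, a), conditional_expectation(population, a))
```

These numbers describe how 𝑌 is distributed among units that happen to have each value of 𝐴. They say nothing yet about what would happen to a unit if its value of 𝐴 were changed.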

3. Rubin’s Model for Causal Inference
- "A causes B" almost always means that A causes B relative to some other cause
that includes the condition "not A"
- Treatment 𝑆 = 𝑡 → 𝑌𝑡 (𝑢) (one cause) versus control 𝑆 = 𝑐 → 𝑌𝑐 (𝑢) (another cause)
o Controlled study: 𝑆 is constructed by the experimenter
o Uncontrolled study: 𝑆 is determined by factors beyond the experimenter’s control
- In either case, the critical feature of the notion of cause in this model is that the value of
𝑆(𝑢) for each unit could have been different

The role of time now becomes important because a unit is exposed to a cause at some specific time
or within a specific time period; variables now divide into two classes:

- Pre-exposure: those whose values are determined prior to exposure to the cause
- Post-exposure: those whose values are determined after exposure to the cause

Treatment 𝑡 causes the effect 𝑌𝑡 (𝑢) − 𝑌𝑐 (𝑢) on unit 𝑢, relative to treatment 𝑐

Fundamental problem of causal inference: it is impossible to observe the values of 𝑌𝑡 (𝑢)
and 𝑌𝑐 (𝑢) on the same unit and, therefore, it is impossible to observe the effect of 𝑡 on 𝑢

- Scientific solution: exploit various homogeneity or invariance assumptions
- Statistical solution: average causal effect 𝑇 of 𝑡 (relative to 𝑐) over 𝑈 is the expected
value of the difference 𝑌𝑡 (𝑢) − 𝑌𝑐 (𝑢) over the 𝑢’s in 𝑈: 𝑇 = 𝐸(𝑌𝑡 − 𝑌𝑐 ) = 𝐸(𝑌𝑡 ) − 𝐸(𝑌𝑐 )
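
A sketch of the statistical solution on a toy population. Both potential outcomes are listed for every unit, which is only possible in a simulation; the values are hypothetical. The code checks the identity 𝑇 = 𝐸(𝑌𝑡 − 𝑌𝑐) = 𝐸(𝑌𝑡) − 𝐸(𝑌𝑐) and then, assuming a design in which exactly two of the four units are assigned to treatment at random, averages the difference-in-means estimate over all possible assignments. That randomisation design is an assumption of this example, not something specified in the notes.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical potential outcomes for four units (never jointly observable in real data).
Y_t = [5, 7, 6, 10]  # response of unit u if exposed to treatment t
Y_c = [3, 6, 6, 5]   # response of unit u if exposed to control c
units = range(len(Y_t))


def mean(values):
    values = list(values)
    return Fraction(sum(values), len(values))


# Unit-level effects Y_t(u) - Y_c(u) and their average T over the population.
unit_effects = [Y_t[u] - Y_c[u] for u in units]
T = mean(unit_effects)


def difference_in_means(treated):
    """Estimate from one experiment: only Y_t of treated and Y_c of controls are seen."""
    control = [u for u in units if u not in treated]
    return mean(Y_t[u] for u in treated) - mean(Y_c[u] for u in control)


# Every way of assigning exactly two units to treatment, each equally likely.
estimates = [difference_in_means(set(treated)) for treated in combinations(units, 2)]
average_estimate = mean(estimates)

print("T =", T)
print("E(Y_t) - E(Y_c) =", mean(Y_t) - mean(Y_c))
print("estimates per assignment:", [str(e) for e in estimates])
print("average over assignments:", average_estimate)
```

Any single assignment can give an estimate far from 𝑇, and no assignment reveals an individual unit's effect. Only the average over the randomisation matches 𝑇, which is the sense in which the statistical solution sidesteps the fundamental problem rather than solving it for each unit.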



