chapter 10 introduction to multivariate relationships
causal relationships are asymmetrical → 𝑥 causes 𝑦; establishing causality requires 3 criteria:
- association between variables
o as 𝑥 changes, the distribution of 𝑦 should change in some way
o association does NOT imply causation
- appropriate time order
- elimination of alternative explanations
o observational studies can never prove that 1 variable is a cause of another
- anecdotal evidence is not enough to disprove causality unless it refutes 1 of the 3 criteria
- randomized experiments are the standard for establishing causality, although this isn’t
always possible in social research
in multivariate analysis, a variable is said to be controlled when its influence is removed
- randomized experiments inherently control other variables in a probabilistic sense
statistical control: approximating an experimental type of control by grouping observations
with equal/similar values on the control variables in observational research
control variable: any variable that is held constant
lurking variable: a variable that is not measured in a study but does influence the association
multivariate associations
- spurious: both 𝑥1 and 𝑦 are dependent on 𝑥2 , but their association disappears when 𝑥2
is controlled
- chain relationship: the relationship between 𝑥1 and 𝑦 exists but is indirect. 𝑥2 is an
intervening variable or mediator
- multiple causes: can either be independent or dependent (= there exists a relationship
between the causes themselves)
- suppressor: when controlling for a suppressor variable, the association between 2
variables increases
- interaction: an association has diff strengths and/or directions at diff values of the
control variable
Simpson’s paradox: the possibility that, after controlling for a variable, the partial associations have the opposite direction from the bivariate association
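a minimal numeric sketch of this (hypothetical numbers, Python/numpy assumed): within each level of the control variable 𝑥2 the 𝑥1–𝑦 association is positive, but the pooled bivariate association is negative

```python
# minimal Simpson's paradox sketch with made-up numbers
import numpy as np

# two groups defined by the control variable x2
x1_a = np.array([1, 2, 3, 4]); y_a = np.array([6, 7, 8, 9])   # group A
x1_b = np.array([6, 7, 8, 9]); y_b = np.array([1, 2, 3, 4])   # group B

r_a = np.corrcoef(x1_a, y_a)[0, 1]   # +1.0 within group A
r_b = np.corrcoef(x1_b, y_b)[0, 1]   # +1.0 within group B
r_pooled = np.corrcoef(np.concatenate([x1_a, x1_b]),
                       np.concatenate([y_a, y_b]))[0, 1]      # negative when x2 is ignored
print(r_a, r_b, r_pooled)
```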
confounding: when 2 explanatory variables both have effects on a response variable but are
also associated with each other
- omitted variable bias: the bias that arises when a study neglects to measure a confounding variable that explains a major part of the effect
chapter 9 linear regression and correlation
non-directional: 𝑥 predicts 𝑦
directional:
- pos association: higher 𝑥 predicts higher 𝑦
- neg association: higher 𝑥 predicts lower 𝑦
linear regression model: 𝑦̂ = 𝑎 + 𝑏𝑥
- predicted criterion value → 𝑦̂
- 𝑦-intercept → 𝑎
- slope → 𝑏
o pos when high 𝑥-values coincide with high 𝑦-values, and vice versa
o neg when low 𝑥-values coincide with high 𝑦-values, and vice versa
o we can’t use 𝑏 to interpret the strength of the association between 𝑥 and 𝑦
▪ 𝑏 depends on the scale
we consider 3 types of 𝑦:
- 𝑦: observed outcome value of an individual
- 𝑦̅: avg outcome value (mean of 𝑦)
- 𝑦̂: individual’s predicted outcome value based on model
least squares estimation: finds the straight line falling closest to all data points in the scatterplot, i.e. the line that minimizes the sum of squared residuals
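a minimal sketch (made-up 𝑥, 𝑦 values, Python/numpy assumed) of how the least squares estimates of 𝑎 and 𝑏 can be computed:

```python
# minimal least squares sketch with made-up data (values are hypothetical)
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])

# least squares estimates: b = sum((x - xbar)(y - ybar)) / sum((x - xbar)^2), a = ybar - b*xbar
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()
y_hat = a + b * x                       # predicted values from the fitted line

print(a, b)
print(np.polyfit(x, y, 1))              # cross-check: returns [b, a] for a degree-1 fit
```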
Pearson’s correlation: 𝑏* = 𝑟 = (𝑠𝑥 / 𝑠𝑦) 𝑏
- interpretation: 0 < negligible < .10 ≤ small < .30 ≤ moderate < .50 ≤ large
- both 𝑟 and 𝑏* are measures of effect size
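a minimal sketch (same made-up data as above, restated so the block is self-contained) checking that rescaling 𝑏 by 𝑠𝑥/𝑠𝑦 gives 𝑟:

```python
# minimal sketch: r equals the slope rescaled by s_x / s_y (made-up data)
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])

b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
r_from_b = (np.std(x, ddof=1) / np.std(y, ddof=1)) * b   # b* = r = (s_x / s_y) * b
r_direct = np.corrcoef(x, y)[0, 1]                       # Pearson's r computed directly
print(r_from_b, r_direct)                                # both give the same value
```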
residual (𝒆): vertical distance between observed 𝑦 and predicted 𝑦̂
- 𝑒 = 𝑦 − 𝑦̂
- we can use this residual to determine how well the model performs in predicting 𝑦
total sum of squares: 𝑇𝑆𝑆 = ∑(𝑦 − 𝑦̅)² → how much variation is there in the dependent variable to be explained (marginal variation)
sum of squared errors: 𝑆𝑆𝐸 = ∑(𝑦 − 𝑦̂)² → how much variation is still unexplained after adding the independent variable (conditional variation)
regression sum of squares: 𝑅𝑆𝑆 = ∑(𝑦̂ − 𝑦̅)² → how much variation is explained by adding the independent variable
the smaller the 𝑆𝑆𝐸, the better the prediction → 𝑆𝑆𝐸 = 𝑇𝑆𝑆 − 𝑅𝑆𝑆
we use the different sums of squares to inspect the explanatory power of the model and to test its significance
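a minimal sketch (made-up data) computing the three sums of squares and checking 𝑆𝑆𝐸 = 𝑇𝑆𝑆 − 𝑅𝑆𝑆:

```python
# minimal sketch of TSS, SSE and RSS (made-up data)
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])

b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()
y_hat = a + b * x

tss = np.sum((y - y.mean()) ** 2)       # marginal variation
sse = np.sum((y - y_hat) ** 2)          # unexplained (conditional) variation
rss = np.sum((y_hat - y.mean()) ** 2)   # explained variation
print(tss, sse, rss, tss - rss)         # sse equals tss - rss (up to rounding)
```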
coefficient of determination (𝑅²): proportion of variation in 𝑦 that is explained by the model
- 𝑅² = (𝑇𝑆𝑆 − 𝑆𝑆𝐸) / 𝑇𝑆𝑆 = (∑(𝑦 − 𝑦̅)² − ∑(𝑦 − 𝑦̂)²) / ∑(𝑦 − 𝑦̅)²
- 0 ≤ 𝑅² ≤ 1
- the closer to 1, the stronger the linear relationship
- interpretation: 0 < negligible < .02 ≤ small < .13 ≤ moderate < .26 ≤ large
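- worked example (hypothetical numbers): if 𝑇𝑆𝑆 = 50 and 𝑆𝑆𝐸 = 20, then 𝑅² = (50 − 20)/50 = .60 → 60% of the variation in 𝑦 is explained by the model (a large effect by the guideline above)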
inferential statistics: using sample data to make inferences about the population parameters
- we can’t confirm hypotheses, but we can falsify
o by inspecting the probability of finding 𝑏 (or 𝑟) when the null hypothesis was true
o null hypothesis: no association between variables (independent)
▪ 𝐻0: 𝛽 = 0
o alternative hypothesis: association between variables (dependent)
▪ 𝐻𝑎: 𝛽 ≠ 0
▪ if directional: 𝛽 < 0 or 𝛽 > 0
- check significance of 𝑏 using the 𝑡-statistic
o under 𝐻0: 𝛽 = 0, 𝑡 = 𝑏 / 𝑠𝑒 with 𝑑𝑓 = 𝑛 − 2
- check significance of 𝑅² using the 𝐹-statistic
o 𝐹 = (𝑅²/1) / ((1 − 𝑅²)/(𝑛 − 2)) = ((𝑇𝑆𝑆 − 𝑆𝑆𝐸)/1) / (𝑆𝑆𝐸/(𝑛 − 2)) = (𝑅𝑆𝑆/1) / (𝑆𝑆𝐸/(𝑛 − 2)) = 𝑀𝑆𝑅/𝑀𝑆𝐸
▪ 𝑑𝑓1 = 𝑘 = 1 (𝑘 = number of regression parameters 𝑏)
▪ 𝑑𝑓2 = 𝑛 − 𝑘 − 1 = 𝑛 − 2
- based on the 𝑡- or 𝐹-statistic, determine the 𝑝-value:
o what is the probability of finding a result this extreme, when the 𝐻0 is true?
- 𝐹 = 𝑡² → both options yield the same conclusion
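a minimal sketch (made-up data, scipy assumed) of the 𝑡-test for 𝑏 and the equivalent 𝐹-test:

```python
# minimal sketch: t-test for the slope and the equivalent F-test (made-up data)
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])
n = len(x)

b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()
sse = np.sum((y - (a + b * x)) ** 2)
tss = np.sum((y - y.mean()) ** 2)

se_b = np.sqrt(sse / (n - 2)) / np.sqrt(np.sum((x - x.mean()) ** 2))  # standard error of b
t = b / se_b
p_t = 2 * stats.t.sf(abs(t), df=n - 2)          # two-sided p-value, df = n - 2

r2 = (tss - sse) / tss
F = (r2 / 1) / ((1 - r2) / (n - 2))
p_F = stats.f.sf(F, 1, n - 2)                   # same p-value as the t-test

print(t ** 2, F)                                # F = t^2
print(p_t, p_F)
```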
4 scenarios are possible, depending on the decision and on whether 𝐻0 is actually true
- 2x erroneous decision (which we want to avoid)
o type 1 error: probability of rejecting 𝐻0 when it is true
▪ determined by the selected 𝛼-level (.05)
▪ if observed 𝑝-value < 𝛼 : reject 𝐻0
o type 2 error (𝛽): probability of not rejecting 𝐻0 when it is false
▪ determined by:
• strength of association/diff in population
• sample size of study
• selected 𝛼-level
o trade-off: the smaller the type 1 error, the larger the type 2 error
- 2x correct decision
o 1 − 𝛽 = power → probability of correctly rejecting 𝐻0
▪ typically aim for 80%
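a minimal simulation sketch (all settings hypothetical: 𝛼 = .05, 𝑛 = 50, true slope .4) of the type 1 error rate and power:

```python
# minimal simulation sketch of type 1 error and power (all settings are hypothetical)
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, n, n_sims = 0.05, 50, 2000

def reject_h0(true_slope):
    """Draw one sample with the given population slope and test H0: beta = 0."""
    x = rng.normal(size=n)
    y = true_slope * x + rng.normal(size=n)
    return stats.linregress(x, y).pvalue < alpha

type1 = np.mean([reject_h0(0.0) for _ in range(n_sims)])  # H0 true: rejection rate ~ alpha
power = np.mean([reject_h0(0.4) for _ in range(n_sims)])  # H0 false: rejection rate = power
print(type1, power)
```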
assumptions of linear regression:
- representativeness: analyses are based on a random sample
- functional form: relation between 𝑥 and 𝑦 is linear
- homoscedasticity: the conditional variance of 𝑦 around the regression line is equal for all 𝑥
- normal distribution: the conditional distribution of 𝑦 for each 𝑥 is normal
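a minimal sketch (made-up data) of rough residual checks for the last two assumptions; a residual plot would normally be inspected as well:

```python
# minimal residual-check sketch (made-up data)
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.8, 8.2, 8.9])

res = stats.linregress(x, y)
residuals = y - (res.intercept + res.slope * x)

# homoscedasticity (rough check): |residuals| should not grow or shrink with x
print(np.corrcoef(x, np.abs(residuals))[0, 1])
# normality of the conditional distribution: e.g. a Shapiro-Wilk test on the residuals
print(stats.shapiro(residuals))
```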