1. Comment
3 February 2020 at 13:28:35
Using b0 and b1 instead of a and b allows us to work with multiple variables (b3, b4, etc.)
2. Comment
3 February 2020 at 14:11:15
Simple linear regression model
3. Comment
3 February 2020 at 13:34:43
Expected value (ignoring error) of y given x
4. Comment
3 February 2020 at 13:36:18
Difference between points and line (error εi)
5. Comment
3 February 2020 at 13:28:35
Using b0 and b1 instead of a and b allows us to work with multiple variables (b3, b4, etc.)
Lecture 1: Introduction to regression analysis
Regression is related to correlation, but:
• Can estimate impact of multiple independent variables
• Not just strength of association, but size of effect
• Can assess null hypothesis
• Assumes linear correlation
6. Comment
3 February 2020 at 14:12:40
Elaborate on web lecture
Regression line
• Formula:
• y = a + bx
1 • ŷi = b0 + b1xi
• “Line of best fit”: minimizes distances between points and line
• ^: estimate
• i: observation number (obs. 1, obs. 2, etc.)
2 yi = b0 + b1xi + εi
• εi: error
• Mean = 0, variance = σ² (only if y-variable is normally distributed)
3 Alternative formula: E[yi|xi] = b0 + b1xi
Ordinary Least Squares (OLS): method for finding regression line
4 • Minimizes sum of squared residuals
• Squared residuals: $SSR = \sum (y_i - \hat{y}_i)^2 = \sum (y_i - b_0 - b_1 x_i)^2$
5 • Plug values into formula (ŷi = b0 + b1xi) to find regression line
• Find b̂1 using SPSS
• b̂0 = ȳ − b̂1x̄
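The OLS steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are made up, the function names `ols_fit` and `ssr` are hypothetical, and the slope is computed with the standard closed-form expression (cross-product deviations divided by the sum of squares of x), which the notes leave to SPSS.

```python
# Minimal OLS sketch for one explanatory variable (made-up data, hypothetical names).

def ols_fit(x, y):
    """Return (b0, b1) for the line that minimizes the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: sum of cross-product deviations divided by sum of squares of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / ssx
    # Intercept: b0 = y_bar - b1 * x_bar, as in the notes.
    b0 = y_bar - b1 * x_bar
    return b0, b1


def ssr(x, y, b0, b1):
    """Sum of squared residuals for the line y_hat = b0 + b1 * x."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
b0, b1 = ols_fit(x, y)
print(b0, b1, ssr(x, y, b0, b1))
```

Any other line through these points, for example one with a slightly different slope, should give a larger SSR than the fitted one.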
Regression assumptions:
6 • Relationship between E[yi|x] and x is linear and additive
• E[εi|x] = 0
7. Comment
3 February 2020 at 14:14:42
Non-negative numbers (e.g. number of wars)
8. Comment
3 February 2020 at 15:02:02
Categorical/ordinal (named)
• Variables suited for regression:
• Dependent variable must be interval/ratio, otherwise:
• If nominal/ordinal: logistic regression
7 • If count scale: Poisson and negative binomial regression (not in course)
• Explanatory variables can be any type
• Variance ≠ 0
Lecture 1: SPSS
Find b̂1:
[Analyze] → [Correlate] → [Bivariate…] → [Options…] → select “Cross-product deviations and
covariances” → [Continue] → [Paste] → click play → b̂1 = covariance(x, y) / variance(x)
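As a cross-check on the SPSS output, the slope can be computed from the same two quantities by hand. The sketch below assumes that b̂1 is the covariance of x and y divided by the variance of x (a standard OLS identity, stated here as an assumption about what the notes intend). The data and the function names are hypothetical.

```python
# Slope from a covariance and a variance (made-up data, hypothetical names).

def covariance(x, y):
    """Sample covariance with the n - 1 denominator, as in a covariance table."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)


def slope_from_covariances(x, y):
    """b1 = cov(x, y) / var(x); the variance is the covariance of x with itself."""
    return covariance(x, y) / covariance(x, x)


x = [2.0, 4.0, 6.0, 8.0]
y = [1.0, 3.0, 2.0, 6.0]
slope = slope_from_covariances(x, y)
print(slope)
```

The n − 1 denominators cancel, so dividing the cross-product deviations by the sum of squares of x gives the same slope.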
Recode variable → different variables:
[Transform] → [Recode into Different Variables] → drag variable into box → [Old and New
8 Values…] → input relevant instructions → (select “Output variables are strings” if necessary) →
[Continue] → select variable → input new label → [Change] → [Paste] → click play
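For comparison, the same idea (build a new variable from old values and leave the original untouched) might look like this in Python. It is not SPSS syntax; the cut-offs, labels and names are hypothetical.

```python
# Recode a numeric variable into a new string variable (hypothetical cut-offs).

def recode(value):
    """Map an old value to a new category; missing values stay missing."""
    if value is None:
        return None
    if value < 3:
        return "low"
    if value < 7:
        return "medium"
    return "high"


scores = [1, 5, 9, None, 3]
# The result goes into a different variable, so `scores` itself is unchanged.
score_cat = [recode(v) for v in scores]
print(score_cat)
```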
Add regression line to scatterplot:
Double-click graph in output viewer → [Elements] → [Fit Line at Total]
Select cases (multiple conditions):
[Data] → [Select Cases…] → select “If condition is satisfied” → [If…] → input conditions (“|”
between each full equation) → [Continue] → [Paste] → click play
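The role of “|” (logical OR) between full conditions can be illustrated outside SPSS as well. In this Python sketch the cases and variable names are made up, and Python's `or` stands in for “|”.

```python
# Select cases that satisfy at least one of several full conditions (made-up data).

cases = [
    {"country": "A", "wars": 0, "region": "Europe"},
    {"country": "B", "wars": 3, "region": "Asia"},
    {"country": "C", "wars": 1, "region": "Africa"},
    {"country": "D", "wars": 0, "region": "Asia"},
]

# Each side of `or` is a complete condition with its own variable and comparison,
# mirroring the rule that "|" goes between full equations.
selected = [c for c in cases if c["wars"] >= 2 or c["region"] == "Europe"]
selected_names = [c["country"] for c in selected]
print(selected_names)
```

Writing something like `wars >= 2 or "Europe"` would not do the same thing, because the second part is not a full condition.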
9. Comment
10 February 2020 at 16:29:47
Produces random errors
10. Comment
10 February 2020 at 16:24:22
I.e. consider sampling error to express uncertainty
11. Comment
10 February 2020 at 16:44:30
Since b̂1 is normally distributed
12. Comment
10 February 2020 at 16:28:32
SEb depends on SSR
13. Comment
10 February 2020 at 16:40:19
Number of explanatory variables (b1, b2, b3, etc.)
14. Comment
10 February 2020 at 16:33:20
b̂1 is more precise
Lecture 2: Simple Linear Regression Analysis
9 Regression line of sample ≠ regression line of population
Significance testing of regression line
10 (Use inference to get to population parameter)
Use SPSS to generate values needed for following instructions.
11 T-test:
12 • $\hat{t} = \frac{\hat{b}}{\widehat{se}(\hat{b})} \rightarrow t = \frac{b_1}{SE_{b_1}}$
• H0: b1 = 0
• H1: b1 ≠ 0
13 • df = n - p - 1
14 • Variance of b̂1 is lower if:
• X has high variance
• N is large
• ε has low variance (low SSR)
• $\widehat{SE}_{b_1} = \sqrt{\frac{MSR}{SSX}}$
• MSR = mean square of the residuals
• SSX = sum of squares of the X variable
• Alternative:
• B: unstandardized regression coefficient
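The t-test quantities above can be reproduced by hand on a small example. This is a rough sketch, not the SPSS procedure: the data are made up, the function name `slope_t_test` is hypothetical, and it assumes MSR = SSR / df with df = n − p − 1 and SE = √(MSR / SSX).

```python
import math

# Standard error and t-value of the slope (made-up data, hypothetical names).

def slope_t_test(x, y, p=1):
    """Return (b1, se_b1, t, df) for a simple regression with p explanatory variables."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ssx = sum((xi - x_bar) ** 2 for xi in x)      # sum of squares of X
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / ssx
    b0 = y_bar - b1 * x_bar
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    df = n - p - 1                                # degrees of freedom
    msr = ssr / df                                # assumed: mean square of the residuals
    se_b1 = math.sqrt(msr / ssx)
    t = b1 / se_b1                                # test statistic under H0: b1 = 0
    return b1, se_b1, t, df


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
b1, se_b1, t, df = slope_t_test(x, y)
print(b1, se_b1, t, df)
```

The resulting t is then compared with the critical value of the t distribution for that df. A larger SSX or a smaller SSR shrinks the standard error, which matches the list of conditions under which the variance of b̂1 is lower.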