Regression
Part 1
Correlation vs regression
Correlation looks at the strength and direction of the relationship between variables
Regression asks: how well does variable x predict variable y?
Correlation into simple regression
When you have data for both age and height, correlation tells you the strength of
the relationship – useful
In some situations only part of the data is available. E.g. I might want to guess what
height my 9 yr old will be next year.
Regression is used to make a simple prediction in cases such as these
Simple regression (one predictor)
o Predictor > outcome/criterion variable
o Age > height
Multiple regression
But other variables also contribute to height, such as nutrition and parent height. A
child with poor nutrition and/or shorter parents may not reach the same height as one
with good nutrition and/or tall parents at the same age
If we want to make a good prediction we must quantify how much these variables
influence height and to what degree relative to each other. Multiple regression can
be used to answer these questions.
Nutrition – age – parent height > height
Predictor – predictor – predictor > outcome/criterion variable
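A rough sketch of how multiple regression weighs several predictors at once is given below. Everything in it is an assumption for illustration: the data are synthetic (heights are generated from known weights so the fit can be checked), the nutrition score and the helper names `solve` and `fit` are hypothetical, and the least-squares fit is written out by hand using only the standard library.

```python
# Synthetic data (assumption): heights are built from known weights,
# so a correct fit should recover those weights.
ages = [5, 6, 7, 8, 9, 10]
nutrition = [3, 1, 4, 2, 5, 3]                  # hypothetical 1-5 nutrition score
parent_height = [170, 165, 180, 160, 175, 168]  # cm
true_weights = [20.0, 5.0, 2.0, 0.3]  # intercept, age, nutrition, parent height

# Each row is one child: a leading 1 for the intercept, then the predictors
X = [[1.0, a, n, p] for a, n, p in zip(ages, nutrition, parent_height)]
heights = [sum(w * x for w, x in zip(true_weights, row)) for row in X]


def solve(A, y):
    """Solve the square system A b = y by Gaussian elimination with pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, y)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    b = [0.0] * n
    for i in reversed(range(n)):
        b[i] = (M[i][n] - sum(M[i][j] * b[j] for j in range(i + 1, n))) / M[i][i]
    return b


def fit(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)


coefs = fit(X, heights)  # [intercept, b_age, b_nutrition, b_parent_height]

# Predict the height of a new child: age 10, nutrition score 4, parent 172 cm
new_child = [1.0, 10, 4, 172]
predicted = sum(b * x for b, x in zip(coefs, new_child))
print([round(b, 3) for b in coefs], round(predicted, 1))
```

Each coefficient is the predicted change in height for a one-unit change in that predictor while the others are held constant, which is how the model separates the contribution of age from that of nutrition and parent height.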
Regression key terms
‘Independent variables’ are now predictor variables
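‘Dependent variables’ are now outcome/criterion variables
The overall ‘model fit’ is given by R and R² (R² is the proportion of variance in the
outcome explained by the predictors)
The strength of each predictor is shown by its beta value
o Positive beta = positive predictor (as the predictor increases, the outcome increases)
o Negative beta = negative predictor (as the predictor increases, the outcome decreases)
o As with covariance/correlation, the sign shows the direction of the relationship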
Formula fun
Intro to data modelling – the general linear model (GLM)
Regression tests a linear model that predicts values of an outcome variable from one or
more predictor variables
One predictor = simple regression
More than one predictor = multiple regression
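Written out, the two cases look like this. The notation is an assumption, since the notes do not fix any symbols: Y is the outcome, X a predictor, b_0 the intercept, b_1 to b_k the weights of the predictors, and ε the error (the part of the outcome the model does not predict).

```latex
% Simple regression: one predictor
Y_i = b_0 + b_1 X_i + \varepsilon_i

% Multiple regression: k predictors
Y_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \dots + b_k X_{ki} + \varepsilon_i
```

For the height example, Y is height and the predictors X_1, X_2, X_3 would be nutrition, age and parent height.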