Learning objectives:
1. You can describe the difference between linear and (binary) logistic regression.
2. You understand why a linear regression cannot be used on a binary outcome.
3. You understand and can explain the difference between probability, odds and odds ratios.
4. You are able to interpret odds ratios.
Linear vs logistic regression
Recap linear regression
In a linear regression we try to predict the value of a continuous outcome, for example height or
symptoms of depression. We have one outcome variable that is measured on a continuous scale, and
we try to predict the score on this variable using one or more predictor variables. These can be
continuous (e.g. age) or categorical (gender, education level). Technically, when you run a linear
regression analysis (LRA) with categorical variables, you need to include them as dummy variables.
We can use a linear regression (LR) to answer the general research question: can a set of
predictors, or one specific predictor, be used to predict the score on a continuous outcome
variable? See the example for a more concrete case.
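To make the dummy-variable step concrete, here is a minimal sketch in Python. The data and the helper name dummy_code are made up for illustration, and choosing "low" as the reference category is an assumption; statistical software usually does this step for you.

```python
# Hypothetical education levels for five participants (made-up data).
education = ["low", "middle", "high", "middle", "low"]


def dummy_code(values, reference):
    """Return one 0/1 column per category, leaving out the reference category."""
    categories = sorted(set(values) - {reference})
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}


# "low" is the reference category, so it gets no column of its own:
# a participant who scores 0 on every dummy belongs to "low".
dummies = dummy_code(education, reference="low")
for name, column in dummies.items():
    print(name, column)
```

A categorical predictor with three levels therefore enters the model as two dummy variables, each with its own coefficient.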
In an LR we try to model a straight line, the linear regression line that you can see in the image
below. The dots are the data points that we have collected, and we try to fit the line through
these data points using the linear regression model (the formula).
Y = B0 + B1*X1 + ei
What do the different parts of the linear regression equation stand for?
Y = the outcome, i.e. the level of the outcome variable that we are trying to predict with the model.
B0 = the intercept: the level of the outcome that we expect for people who score 0 on all the
predictors that we include.
B1 = the coefficient: represents the relation between a particular predictor and the outcome
variable. It tells us how much the outcome changes when the associated predictor changes by one unit.
X1 = the predictor, or more precisely the value of the predictor.
ei = the error term that we include for the part of the outcome that the predictors do not explain.
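As an illustration of how these parts fit together, the sketch below estimates B0 and B1 for a single predictor with the standard least-squares formulas. The age and height values are made up, and the function name fit_simple_regression is only a label chosen for this example.

```python
# Made-up data: age in years (X1) and height in cm (Y) for five children.
age = [4, 6, 8, 10, 12]
height = [102, 115, 128, 139, 151]


def fit_simple_regression(x, y):
    """Least-squares estimates of the intercept (B0) and the coefficient (B1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # B1: how much Y changes, on average, when X1 goes up by one unit.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    # B0: the expected value of Y for someone who scores 0 on X1.
    b0 = mean_y - b1 * mean_x
    return b0, b1


b0, b1 = fit_simple_regression(age, height)

# Predicted outcome for each child, and the error term ei (observed minus predicted).
predicted = [b0 + b1 * xi for xi in age]
residuals = [yi - pi for yi, pi in zip(height, predicted)]

print(f"B0 = {b0:.1f}, B1 = {b1:.1f}")
print("ei =", [round(e, 1) for e in residuals])
```

In this made-up data set B1 comes out at about 6.1, meaning that the predicted height goes up by roughly 6.1 cm for each additional year of age.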
So we know that we can use this type of model to predict a value on a continuous variable. But there
are also situations in which we want to predict a dichotomous (binary) variable. In that case we
cannot use this model. That is where logistic regression comes in.
Dichotomous dependent variables
There might be situations where we do not want to predict a value on a continuous variable, but are
interested in a dichotomous (binary) variable instead. This means that we are trying to predict
whether or not a specific event happens. In clinical psychology, for example, we might be
interested in whether or not someone develops a psychological disorder. A broad spectrum of
questions can be answered with a dichotomous (binary) outcome variable.
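The problem is that when the variable we are trying to predict is dichotomous, we cannot use the
linear regression model that we saw before: a straight line is not restricted to the range 0 to 1,
so it can predict values that make no sense for an outcome that is either 0 or 1. We will use a
logistic regression instead.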
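The problem can be made visible with a small sketch that fits an ordinary regression line to a binary outcome anyway. The symptom scores and disorder labels are made up for illustration, and the fitting function is only a plain least-squares helper written for this example.

```python
# Made-up data: a symptom score (predictor) and whether the person
# developed a disorder (1) or not (0).
symptom_score = [1, 2, 3, 4, 5, 6, 7, 8]
disorder = [0, 0, 0, 0, 1, 1, 1, 1]


def fit_simple_regression(x, y):
    """Least-squares estimates of the intercept (B0) and the coefficient (B1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


b0, b1 = fit_simple_regression(symptom_score, disorder)
predicted = [b0 + b1 * xi for xi in symptom_score]

# The straight line is not bounded: some predictions fall below 0 or above 1,
# which cannot be read as the probability of developing the disorder.
for score, p in zip(symptom_score, predicted):
    flag = "  <- outside 0 to 1" if p < 0 or p > 1 else ""
    print(f"score {score}: predicted {p:.2f}{flag}")
```

Even within the observed range of scores, the lowest and highest scores get predictions outside the interval from 0 to 1 in this example.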