Nezlek (2008)
An introduction to multilevel modeling for social and personality psychology
Multilevel data sets are sometimes referred to as ‘nested’ or ‘hierarchically nested’
because observations at one level of analysis are nested within observations at
another level. Accordingly, multilevel data should be analyzed using
techniques that take this nesting into account.
The importance of understanding levels of analysis
Multilevel analyses are appropriate when data have been collected at multiple levels
simultaneously.
- Levels: how the data are organized; they indicate which observations are
dependent (not independent).
These levels are ordered hierarchically, from the most specific unit upwards: level 1
data could be diary entries and level 2 data the personality traits of the people who
wrote them. Alternatively, level 1 data could be personality traits and level 2 data
group descriptions.
Multilevel analysis is called for when level 1 observations are not independent
because they share common characteristics. Because of this dependence, ordinary
least-squares (OLS) techniques cannot be used: they would violate a fundamental
assumption, the independence of observations. Single-level analyses that ignore
the hierarchical structure of the data can thus provide misleading results
(for example, correlations that are higher than the true ones).
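A small sketch of how ignoring nesting can mislead. The data, the group labels and the helper `pearson` are invented for illustration and do not come from Nezlek (2008). Within every group the relationship between x and y is perfectly negative, yet the pooled single-level correlation is strongly positive, because it mixes within-group and between-group variation.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: (x, y) pairs for level 1 observations nested in three groups.
groups = {
    "A": [(1, 3), (2, 2), (3, 1)],
    "B": [(4, 6), (5, 5), (6, 4)],
    "C": [(7, 9), (8, 8), (9, 7)],
}

# Single-level analysis: pool all observations and ignore group membership.
pooled_x = [x for pairs in groups.values() for x, _ in pairs]
pooled_y = [y for pairs in groups.values() for _, y in pairs]
pooled_r = pearson(pooled_x, pooled_y)

# Within-group analysis: one correlation per level 2 unit.
within_r = {
    g: pearson([x for x, _ in pairs], [y for _, y in pairs])
    for g, pairs in groups.items()
}

print(f"pooled r = {pooled_r:.2f}")   # positive
for g, r in within_r.items():
    print(f"group {g}: r = {r:.2f}")  # negative in every group
```

The example is deliberately extreme; with real data the distortion is usually smaller, but the direction and size of a pooled estimate still cannot be trusted to describe the within-group relationship.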
Some have argued that such problems can be solved by including a variable indicating group
membership (least-squares dummy-coded analysis) and creating interaction terms
between the dummy codes and level 1 variables, to examine the possibility that level 1
relationships vary between level 2 units of analysis. However, even when dummy codes
are included, the analyses violate important assumptions about sampling error.
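To make the dummy-code approach concrete, here is a minimal sketch of how one row of the design matrix could be built. The function name `lsdv_row`, the group labels and the predictor values are hypothetical. It only constructs the predictors (intercept, level 1 variable, dummy codes, dummy-by-variable interactions); it does not fit the model, and it does not address the sampling-error problem described above.

```python
def lsdv_row(x, group, group_labels):
    """Build one design-matrix row for a dummy-coded analysis.

    Columns: intercept, x, one dummy per non-reference group,
    then one dummy-by-x interaction per non-reference group.
    The first label in group_labels is the reference group (all dummies 0).
    """
    dummies = [1.0 if group == g else 0.0 for g in group_labels[1:]]
    interactions = [d * x for d in dummies]
    return [1.0, float(x)] + dummies + interactions

# Hypothetical level 1 observations: (x value, group membership).
labels = ["A", "B", "C"]
observations = [(2.0, "A"), (5.0, "B"), (8.0, "C")]

design = [lsdv_row(x, g, labels) for x, g in observations]
for row in design:
    print(row)
```

In a fitted model, the coefficients on the interaction columns would express how much each group's slope differs from the reference group's slope.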
Illustrative applications
Types of data collection that typically produce nested data:
o Interval-contingent studies: data collected at certain intervals (either fixed or
random).
o Event-contingent studies: data collected whenever a certain type of event occurs.
o Research focusing on within-person variability in psychological traits.
o Data collected from groups (participants = level 1; groups = level 2).
o Cross-cultural research: participants are nested within populations/countries.
Sampling error within multilevel data structures
Multiple sampling: in a multilevel data structure, units of observation are randomly
sampled from populations at different levels simultaneously. The error associated with
sampling at each level of analysis needs to be estimated.
- Sampling error: error in a statistical analysis arising from the
unrepresentativeness of the sample taken.
Within-group relationships are simply not interchangeable with between-group relationships.
The errors at the two levels of analysis are separate.
Random coefficient techniques: a way of estimating error at separate levels that
relies on maximum likelihood algorithms, which allow for the simultaneous
estimation of multiple unknowns.
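The idea that the two levels carry separate error can be illustrated by splitting the variance of a measure into a within-group and a between-group part. The sketch below is not the maximum likelihood approach described above: it uses the simpler one-way ANOVA (method-of-moments) estimator and assumes equal group sizes. The function name `variance_components` and the scores are hypothetical.

```python
def variance_components(groups):
    """Estimate within- and between-group variance for balanced groups.

    Uses one-way ANOVA mean squares (method of moments), not maximum
    likelihood. Returns (var_within, var_between).
    """
    sizes = {len(values) for values in groups.values()}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes equal group sizes")
    n = sizes.pop()   # observations per group
    k = len(groups)   # number of groups

    means = {g: sum(values) / n for g, values in groups.items()}
    grand_mean = sum(means.values()) / k

    # Mean square within: spread of observations around their own group mean.
    ms_within = sum(
        (y - means[g]) ** 2 for g, values in groups.items() for y in values
    ) / (k * (n - 1))
    # Mean square between: spread of group means around the grand mean.
    ms_between = n * sum((m - grand_mean) ** 2 for m in means.values()) / (k - 1)

    var_within = ms_within
    # Truncate at zero, since the moment estimator can go negative.
    var_between = max(0.0, (ms_between - ms_within) / n)
    return var_within, var_between

# Hypothetical scores for three groups of three participants each.
scores = {"g1": [4, 5, 6], "g2": [7, 8, 9], "g3": [1, 2, 3]}
var_within, var_between = variance_components(scores)
print(var_within, var_between)
```

The balanced-groups restriction is a limitation of this simple estimator; unequal group sizes are one of the situations in which the notes recommend random coefficient techniques instead.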
Random coefficient techniques are better than OLS techniques when:
1. Hypotheses of interest concern within-unit relationships (dependent data).
2. The data structure is irregular (e.g. groups differ in size, or the amount of data
provided per subject differs).
What is multilevel modeling?
A.k.a. Multilevel Random Coefficient Modeling (MRCM).
- Random: the technique estimates random coefficients.
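In commonly used notation (the symbols here are an assumption, not taken from these notes), a two-level model with one level 1 predictor can be written as follows, where i indexes level 1 observations and j indexes level 2 units:

```latex
% Level 1: one regression equation per level 2 unit j
y_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + r_{ij}

% Level 2: the level 1 coefficients vary randomly across units
\beta_{0j} = \gamma_{00} + u_{0j}
\beta_{1j} = \gamma_{10} + u_{1j}
```

Here r_{ij} is the level 1 error, and u_{0j} and u_{1j} are the level 2 errors that make the intercept and slope random coefficients.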
For each level 2 unit, a level 1 model is estimated. These models are functionally
equivalent to a standard OLS regression. Example: for each group (in a group study) a