The Robust Beauty of Improper Linear Models in Decision Making – Dawes
• even improper linear models may be superior to clinical predictions
• proper linear model: weights given to the predictor variables are chosen in such a way as to
optimize the relationship between the prediction and the criterion (e.g. regression analysis)
◦ Example: ratings of graduate students by faculty (outstanding, above average, average,
below average, dropped out of program in academic difficulty); faculty ratings were
predicted from a proper linear model based on the student's Graduate Record
Examination (GRE) score, the student's undergraduate grade point average (GPA), and a
measure of the selectivity of the student's undergraduate institution → the correlation with the
model’s prediction is higher than the correlation with the clinical prediction, but both
correlations are low (some readers interpret the findings as meaning that while the low correlation of the
model indicates that linear modeling is deficient as a method, the even lower correlation
of the judges indicates only that the wrong judges were used)
• Improper linear model: weights chosen by some nonoptimal method → e.g. chosen to be
equal, chosen on the basis of the intuition of the person making the prediction, or chosen randomly
◦ may have great utility → a very crude improper linear model predicts a very important
variable: judgments about marital happiness
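To make the proper/improper distinction concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration (the data-generating weights, the noise level, the sample size); none of it comes from Dawes' data. It compares how well the criterion is predicted by the generating weights (standing in for an ideal proper model), by equal weights, and by random weights that only get the sign right.

```python
import random

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def weighted_sum(rows, weights):
    """Linear composite of the predictors for every case."""
    return [sum(w * x for w, x in zip(weights, row)) for row in rows]

rng = random.Random(0)
true_weights = [0.5, 0.3, 0.2]  # hypothetical; all positive (monotone coding)

# Standardized, independent predictors plus a noisy criterion (assumed setup).
rows = [[rng.gauss(0, 1) for _ in true_weights] for _ in range(2000)]
criterion = [s + rng.gauss(0, 1) for s in weighted_sum(rows, true_weights)]

# Ideal proper model: uses the weights that generated the data.
r_optimal = pearson(weighted_sum(rows, true_weights), criterion)

# Improper model 1: equal (unit) weights.
r_unit = pearson(weighted_sum(rows, [1.0, 1.0, 1.0]), criterion)

# Improper model 2: random positive weights, averaged over 100 draws.
random_rs = []
for _ in range(100):
    w = [rng.uniform(0.0, 1.0) for _ in true_weights]
    random_rs.append(pearson(weighted_sum(rows, w), criterion))
r_random = sum(random_rs) / len(random_rs)

print(f"optimal weights: r = {r_optimal:.3f}")
print(f"unit weights:    r = {r_unit:.3f}")
print(f"random weights:  r = {r_random:.3f} (mean of 100 draws)")
```

In this toy setup the improper composites lose only part of the predictive correlation, as long as each predictor is weighted in the right direction.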
• statistical model may integrate the information in an optimal manner, but it is always the
individual (judge, clinician, subjects) who chooses variables → linear model cannot replace
the expert in deciding such things
◦ people, especially the experts in a field, are much better at selecting and coding
information than they are at integrating it (e.g. it is the ability to code the board in
an appropriate way, and so to see the proper moves, that distinguishes a grand master from an
expert from a novice in chess)
→ linear models work because people are good at picking out the right predictor variables and at
coding them in such a way that they have a conditionally monotone relationship with the criterion;
however, people are bad at integrating information from diverse and incomparable sources, whereas
proper linear models are good at such integration when the predictor variables have a conditionally
monotone relationship to the criterion
• it is not possible to construct a proper linear model in some situations (e.g. inadequate sample size)
◦ standard regression analysis cannot be used in situations where there is not a decent ratio
of observations to predictors
◦ cannot be used in situations in which there is no measurable criterion variable (e.g. a
criterion such as “professional self-actualization” that has not been properly conceptualized as a variable)
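The point about the ratio of observations to predictors can be illustrated with a small simulation sketch. The setup is assumed, not taken from the paper: 5 predictors, only 12 training cases, hypothetical generating weights, and a hand-rolled least-squares fit via the normal equations. Regression weights fitted on so few cases are compared with plain unit weights on fresh cases.

```python
import random

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def solve(matrix, rhs):
    """Solve a small linear system (Gaussian elimination, partial pivoting)."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution

def fit_ols(rows, y):
    """Least-squares weights (intercept first) from the normal equations."""
    design = [[1.0] + row for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)

def draw(n, weights, rng):
    """Hypothetical cases: independent standard-normal predictors, noisy criterion."""
    rows = [[rng.gauss(0, 1) for _ in weights] for _ in range(n)]
    y = [sum(w * x for w, x in zip(weights, row)) + rng.gauss(0, 1) for row in rows]
    return rows, y

rng = random.Random(1)
true_weights = [0.5, 0.4, 0.3, 0.2, 0.1]  # assumed for illustration only
ols_scores, unit_scores = [], []
for _ in range(100):
    train_rows, train_y = draw(12, true_weights, rng)   # very few observations
    test_rows, test_y = draw(500, true_weights, rng)    # fresh cases
    beta = fit_ols(train_rows, train_y)
    ols_pred = [beta[0] + sum(b * x for b, x in zip(beta[1:], row)) for row in test_rows]
    unit_pred = [sum(row) for row in test_rows]
    ols_scores.append(pearson(ols_pred, test_y))
    unit_scores.append(pearson(unit_pred, test_y))

mean_ols = sum(ols_scores) / len(ols_scores)
mean_unit = sum(unit_scores) / len(unit_scores)
print(f"mean out-of-sample r, regression weights: {mean_ols:.3f}")
print(f"mean out-of-sample r, unit weights:       {mean_unit:.3f}")
```

Under these assumptions the fitted weights are too unstable to generalize, which is one way to read the claim that standard regression "cannot be used" without a decent ratio of observations to predictors.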
• bootstrapping (building an improper linear model): the process is to build a proper linear model of
an expert's judgments about an outcome criterion and then to use that linear model in place
of the judge → such models are paramorphic representations (the judges' psychological processes need not
involve computing an implicit or explicit weighted average of the input variables, but they
can be simulated by such a weighting), and they consistently do better than the judges from whom
they were derived
• The bootstrapping effect has turned out to be pervasive (e.g. Goldberg MMPI study, Dawes study;
only Libby found that loan officers were better than their paramorphic representations)
• Why does bootstrapping work? → its success arises from the fact that a linear model distills
underlying policy (in the implicit weights) from otherwise variable behavior (e.g.,
judgments affected by context effects or extraneous variables)
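The bootstrapping argument can be sketched with a simulated judge. All numbers below are hypothetical: the judge is assumed to follow a reasonable linear policy but to apply it inconsistently (modelled as random noise). Because the simulated cues are independent, per-cue regression slopes are used as a shortcut for a full multiple regression of the judgments on the cues.

```python
import random

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def slope(x, y):
    """Simple regression slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def weighted_sum(rows, weights):
    """Linear composite of the cues for every case."""
    return [sum(w * x for w, x in zip(weights, row)) for row in rows]

rng = random.Random(2)
true_weights = [0.5, 0.3, 0.2]  # hypothetical cue-criterion relation
judge_policy = [0.6, 0.3, 0.1]  # hypothetical implicit weights of the judge

cues = [[rng.gauss(0, 1) for _ in true_weights] for _ in range(2000)]
criterion = [s + rng.gauss(0, 1) for s in weighted_sum(cues, true_weights)]
# The judge follows the policy, but unreliably: noise is added to each judgment.
judgments = [s + rng.gauss(0, 1) for s in weighted_sum(cues, judge_policy)]

train, test = slice(0, 1000), slice(1000, 2000)

# Model of the judge: regress the judgments (not the criterion) on each cue.
model_weights = [
    slope([row[j] for row in cues[train]], judgments[train])
    for j in range(len(true_weights))
]
model_pred = weighted_sum(cues[test], model_weights)

r_judge = pearson(judgments[test], criterion[test])
r_model = pearson(model_pred, criterion[test])
print(f"judge vs criterion:          r = {r_judge:.3f}")
print(f"model of judge vs criterion: r = {r_model:.3f}")
```

The model never sees the criterion; it only recovers the judge's policy and applies it without the case-to-case variability, which is the mechanism described above.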
• Study to test random linear models: on average, random linear models perform about as well
as the paramorphic models of the judges → linear models are robust over deviations from
optimal weighting (the bootstrapping finding may simply have been a reaffirmation of the earlier
finding that proper linear models are superior to human judgment)
◦ Weights that are near the optimal level produce almost the same output as do optimal beta
weights → Because the expert judge knows at least something about the direction of the