THE COURTROOM CASE STUDY SOLUTION
SYNOPSIS
In November 2014, Students for Fair Admissions Inc. (SFFA) filed a complaint alleging that Harvard
College (Harvard) violated Title VI of the Civil Rights Act, which prohibited racial discrimination in
institutions that received federal funding. In the complaint, SFFA alleged that Harvard engaged in
intentional discrimination against Asian American applicants in its admissions process. Harvard
acknowledged its use of race in the admissions process but maintained that it was only one of many factors
the school considered. It also claimed that using race as a “plus factor” was supported by the law.
Both SFFA and Harvard used a large set of Harvard admissions data to attempt to plead their cases and
both hired econometrics experts to argue their positions. SFFA hired Dr. Peter Arcidiacono and Harvard
hired Dr. David Card. Arcidiacono provided an analysis that found a statistically significant negative
relationship between being Asian American and the odds of being admitted to Harvard. Card, in turn,
criticized this model and proposed his own series of regression models, which concluded that being Asian
American did not have a statistically significant impact on the probability of admission.
The case proceeded to trial and Judge Allison D. Burroughs presided over the proceedings. Burroughs
needed to assess the two experts’ findings and reach a conclusion about the story the data was telling.
OBJECTIVES
The Case Solution Starts From page 6
• Explain the use of multiple regression in high-profile legal cases.
• Understand why regression analysis necessitates an underlying theory and contextual understanding of
the “data generating process,” illustrating the importance of theory-driven modelling choices in
statistical analysis.
• Identify what an interaction variable is, and what trade-offs are associated with the use of an
interaction variable.
• Define the idea of “statistical power” and give examples of how different modelling assumptions may
lead to different levels of statistical power.
• Explain how control variables can be used in a regression analysis, and why having more control
variables is not necessarily better.
• Identify methods to determine which modelling choices are critical to reach a statistical conclusion.
ASSIGNMENT QUESTIONS
1. What was the general method recommended by both econometrics experts to determine whether Asian
Americans were discriminated against?
2. One difference between the SFFA model and the Harvard model was the data set they used. SFFA
recommended removing the ALDC group of candidates (i.e., athletes, legacies, applicants on the dean’s
interest list, and children of faculty and staff), whereas Harvard favoured keeping them in the model.
Which side do you find
more convincing, and why?
3. Another difference was the decision to consider each year’s admissions separately or to “pool” them
together. What are the pros and cons of the pooled model? Why might that model be more or less
appropriate in this situation?
4. Another difference was the choice of control variables used. For example, Card included a control variable
for the occupation of the applicant’s parents. Why might these variables be included or excluded?
5. SFFA’s model included an interaction variable between race and gender. What is an interaction
variable? Why might it have been included?
6. SFFA and Harvard disagreed about the use of the personal rating factor in the admissions process.
SFFA recommended not including this independent variable, whereas Harvard favoured keeping it.
What was each side concerned about? Which side do you find more convincing, and why?
7. Which of the modelling disagreements do you feel is most critical, and why?
3. Another difference was the decision to consider each year’s admissions separately or to “pool”
them together. What are the pros and cons of the pooled model? Why might that model be more
or less appropriate in this situation?
The two sides took different approaches to whether to include all years in one model (i.e., pooling) or to
use a separate estimate for each year (i.e., stratifying). Instructors can use this question to generate a
discussion on statistical power and on how it relates to sample size. The key learning outcome of this
discussion should be that statistical power is positively related to sample size.
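To make the link between sample size and statistical power concrete, instructors may find a small simulation useful. The sketch below is illustrative only: the admission rates, group sizes, and the function names `two_proportion_z` and `estimate_power` are assumptions made for this example and are not taken from the case data or from either expert's model. It uses a simple two-proportion z-test rather than the logistic regressions discussed in the case.

```python
import math
import random


def two_proportion_z(admits_a, n_a, admits_b, n_b):
    """Return the z statistic for the gap between two admission rates."""
    pooled_rate = (admits_a + admits_b) / (n_a + n_b)
    std_error = math.sqrt(pooled_rate * (1 - pooled_rate) * (1 / n_a + 1 / n_b))
    if std_error == 0:
        return 0.0  # no admits (or all admits) in both groups: no evidence of a gap
    return (admits_a / n_a - admits_b / n_b) / std_error


def estimate_power(n_per_group, rate_a, rate_b, trials=500, seed=0):
    """Share of simulated samples in which a true gap is detected at about the 5% level."""
    rng = random.Random(seed)
    detections = 0
    for _ in range(trials):
        admits_a = sum(rng.random() < rate_a for _ in range(n_per_group))
        admits_b = sum(rng.random() < rate_b for _ in range(n_per_group))
        z = two_proportion_z(admits_a, n_per_group, admits_b, n_per_group)
        if abs(z) > 1.96:
            detections += 1
    return detections / trials


# Hypothetical inputs: a true gap of three percentage points between two groups.
single_year_power = estimate_power(n_per_group=500, rate_a=0.08, rate_b=0.05)
# Pooling six hypothetical years multiplies the sample size by six.
pooled_power = estimate_power(n_per_group=3000, rate_a=0.08, rate_b=0.05)

print(f"Estimated power, one year of data: {single_year_power:.2f}")
print(f"Estimated power, six years pooled: {pooled_power:.2f}")
```

The true gap is identical in both runs; only the sample size differs. The pooled run should detect the gap far more often, which is the sense in which pooling increases statistical power.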
Generally, the advantage of a pooled model is that it increases the model’s statistical power, which refers
to the probability that a study will detect an effect (assuming that there actually is one). Harvard argued that
this approach is inconsistent with the reality of the admissions process because it incorrectly assumes that
applicants from different years are competing against each other. To address this issue, Harvard advocated
7. Which of the modelling disagreements do you feel is most critical, and why?
One way to determine how critical or “important” a modelling disagreement is would be to assess the degree
to which it impacts the conclusion. If the choice impacts a conclusion, then it may be considered critical.
Conversely, if a modelling choice does not impact the conclusion—meaning that the conclusion is robust
to the choice—the modelling decision may be considered not consequential.
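One possible way to operationalize this test is to change one modelling choice at a time and check whether the significance conclusion flips. The sketch below assumes each specification has already been estimated and summarized as a coefficient and a standard error. The function names, the labels "choice A" to "choice C", and all numbers are hypothetical placeholders for illustration, not estimates from either expert's report.

```python
def is_significant(estimate, std_error, critical_z=1.96):
    """Two-sided test at roughly the 5% level using a normal approximation."""
    return abs(estimate / std_error) > critical_z


def critical_choices(baseline, alternatives):
    """Return the modelling choices whose change flips the baseline conclusion.

    baseline: (estimate, std_error) for the reference specification.
    alternatives: dict mapping a choice label to the (estimate, std_error)
        obtained when only that one choice is changed.
    """
    baseline_conclusion = is_significant(*baseline)
    return [
        label
        for label, (estimate, std_error) in alternatives.items()
        if is_significant(estimate, std_error) != baseline_conclusion
    ]


# Hypothetical placeholder values, not results from the case.
baseline = (-0.30, 0.10)
alternatives = {
    "choice A": (-0.27, 0.10),  # conclusion unchanged
    "choice B": (-0.28, 0.11),  # conclusion unchanged
    "choice C": (-0.08, 0.10),  # conclusion flips to not significant
}

flagged = critical_choices(baseline, alternatives)
print("Choices that change the conclusion:", flagged)
```

This check looks only at statistical significance. A fuller sensitivity analysis might also compare the sign and size of the estimate across specifications.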