Summary AMDA Spring Chapter 4: Missing Data

Pages: 9
Uploaded: 21-03-2023
Written in: 2020/2021

4. Missing Data
Note: if you care about the exact equations, look them up on the slides, because they are
hard to reproduce accurately in Word. They should not be very important anyway, since the
exams are open book and about understanding rather than quoting exact formulas for
calculating anything (that’s what we have computers for, right?).

1. Everyone will have missing data problems
2. Missing data problems are the heart of statistics

Causes of missing data
There can be all kinds of reasons why you have missing data, e.g.:
● Respondent skipped the item
● Data transmission/coding error
● Dropout in longitudinal research
● Refusal to cooperate
● ... and so on

Consequences of missing data
● If you have less data than planned, statistical power problems might arise
● The data analysis might be biased, for example:
○ Biased effect estimates
○ A sample that is no longer representative
○ Confidence intervals and p-values that may no longer be appropriate

Response indicator
● Random variable Y contains missing data (e.g. body weight)
● Random variable X contains complete covariates (e.g. age)
● The response indicator R records whether Y is observed:
○ R = 1 if Y is observed
○ R = 0 if Y is missing
● R is always complete!
● Using the response indicator, we might be able to identify the missing data mechanism (see next section)
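
To make the response indicator concrete, here is a minimal sketch in Python. The body-weight values are made up for illustration, and coding a missing value as NaN is an assumption (the slides do not prescribe a coding).

```python
import math

# Hypothetical body weights in kg (Y); float("nan") marks a missing value.
weight = [72.5, float("nan"), 81.0, 64.3, float("nan")]

# Response indicator: R = 1 if Y is observed, R = 0 if Y is missing.
R = [0 if math.isnan(y) else 1 for y in weight]

# R has no missing entries itself, so it can always be summarised.
prop_missing = R.count(0) / len(R)

print(R)             # [1, 0, 1, 1, 0]
print(prop_missing)  # 0.4
```

Because R is complete, it can be related to other variables such as X even though Y itself has gaps.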

Missing data mechanisms
Missing data can be classified into three mechanisms: MCAR, MAR, and NMAR. Each has its
own consequences and is explained below.

MCAR
● Missing Completely at Random
● The probability of being missing is not related to any factor, i.e. it is completely random
● P(R=0|Y,X) = P(R=0) → the chance of being missing does not depend on Y or X
● Example: a respondent accidentally skipped a question.

MAR
● Missing at Random
● The probability of being missing depends only on known (observed) factors
● P(R=0|Y,X) = P(R=0|X) → the chance of being missing depends on a variable that we
also measure in our data (therefore, we can account for it)
● Example: gender is always observed, and men have more missing data than women

NMAR
● Not Missing at Random (also written MNAR, Missing Not at Random)
● The probability of being missing depends on unknown factors: the unobserved value itself
or a factor that is not in our data. We therefore cannot account for it, and we do not
know how the data came to be missing
● P(R|Y,X) does not simplify
● Example: People with high incomes have more missing data on a variable measuring
income than people with lower incomes.
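
The three mechanisms can be sketched with a small simulation. Everything here is an illustrative assumption: the age and income distributions, the missingness probabilities, and the helper name `observed_mean` are made up and do not come from the slides.

```python
import random
from statistics import mean

random.seed(1)
n = 10_000

# X: complete covariate (age). Y: income, which depends on age plus noise.
age = [random.uniform(20, 60) for _ in range(n)]
income = [1000 + 50 * a + random.gauss(0, 500) for a in age]


def observed_mean(y, p_missing):
    """Mean of the values of y that remain after each y[i] is
    deleted with probability p_missing[i]."""
    kept = [yi for yi, p in zip(y, p_missing) if random.random() >= p]
    return mean(kept)


full = mean(income)

# MCAR: P(R=0|Y,X) = P(R=0), the same 30% chance for everyone.
mcar = observed_mean(income, [0.3] * n)

# MAR: P(R=0|Y,X) = P(R=0|X), missingness depends only on observed age.
mar = observed_mean(income, [0.6 if a > 40 else 0.1 for a in age])

# NMAR: missingness depends on the income value itself.
nmar = observed_mean(income, [0.6 if y > 3000 else 0.1 for y in income])

print(f"full {full:.0f}  MCAR {mcar:.0f}  MAR {mar:.0f}  NMAR {nmar:.0f}")
```

In this sketch the observed mean stays close to the full-data mean under MCAR, while under MAR and NMAR it drifts downward. The difference is that under MAR the drift is explained by age, which we have measured, whereas under NMAR it cannot be fully corrected from the observed data.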

Ignorable vs not ignorable missing data
● MAR (and, as a special case of it, MCAR) is ignorable, but NMAR is not ignorable.
● MCAR test: tests H0 that the data are MCAR. However, if the test is significant, it
remains unknown whether the data are NMAR or MAR
○ Usually you treat missing data as MAR, because it is a weaker assumption than MCAR
and the missingness can still be ignored. MAR itself cannot be told apart from NMAR
using the observed data alone.
○ You can check whether missingness is related to other variables in the data (by
seeing whether R depends on them). But it can still be that those data points are
missing because of other variables that are not in the data set, or that the
relationship is confounded by other variables.
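
A simple way to check whether missingness is related to an observed variable is to compare that variable between the cases with R = 1 and the cases with R = 0. The sketch below does this with a Welch-type t statistic on simulated data. It is not the formal MCAR test mentioned above, and the data and missingness probabilities are assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import mean, variance

random.seed(2)
n = 2000

# X: complete covariate (age).
age = [random.uniform(20, 60) for _ in range(n)]

# Hypothetical mechanism: respondents over 40 skip the item more often.
R = [0 if random.random() < (0.5 if a > 40 else 0.1) else 1 for a in age]

age_observed = [a for a, r in zip(age, R) if r == 1]
age_missing = [a for a, r in zip(age, R) if r == 0]

# Welch t statistic for the difference in mean age between the two groups.
se = sqrt(variance(age_missing) / len(age_missing)
          + variance(age_observed) / len(age_observed))
t = (mean(age_missing) - mean(age_observed)) / se

print(f"mean age if Y missing:  {mean(age_missing):.1f}")
print(f"mean age if Y observed: {mean(age_observed):.1f}")
print(f"t = {t:.1f}")
```

A large t value suggests that missingness depends on age, so MCAR is implausible. It does not tell you whether the data are MAR or NMAR, because variables outside the data set could still drive the missingness.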

Strategies to deal with missing data
There are different ways to deal with missing data: prevention, simple methods, likelihood
methods (EM), and multiple imputation. Each is discussed below.

Prevention
● Prevention is always the best option. For example, in Qualtrics, make an item a forced
response so people HAVE to answer before they are able to continue. That way you make
sure you do not have missing data and do not have to deal with it later on.

Simple methods
● Listwise deletion (complete-case analysis): as soon as someone is missing one data point,
they are excluded from the whole analysis
○ Advantages: simple (default in SPSS); correct standard errors and significance levels
under MCAR; works in some special NMAR cases (Little, 1993; Vach, 1994)
○ Disadvantages: wasteful; different analyses on the same data set end up with a
different n; unbiased only under MCAR; biased under MAR and in part of the NMAR cases
● Pairwise deletion (available-case analysis): a case is only left out of the calculations
for which its information is actually missing, and the rest of its data is still used
○ Advantages: uses all available information
○ Disadvantages: only works under MCAR; computational problems such as negative
variances and rank problems
○ Avoid!
● Mean substitution: you replace the missing data points with the mean of the observed values
○ Biased under MAR, underestimates the variance, disturbs the distribution
○ Avoid!
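
The variance problem of mean substitution is easy to see in a small simulation. This is a minimal sketch under assumed conditions: body weights drawn from a normal distribution and 40% of them deleted completely at random (MCAR). All numbers are made up for illustration.

```python
import random
from statistics import mean, variance

random.seed(3)

# Hypothetical complete body weights (kg).
y_full = [random.gauss(70, 10) for _ in range(1000)]

# Delete about 40% of the values completely at random; None marks missing.
y = [None if random.random() < 0.4 else v for v in y_full]

# Listwise deletion: keep only the observed values.
complete_cases = [v for v in y if v is not None]

# Mean substitution: fill every gap with the mean of the observed values.
fill = mean(complete_cases)
imputed = [fill if v is None else v for v in y]

print(f"n: complete cases {len(complete_cases)}, imputed {len(imputed)}")
print(f"variance: full {variance(y_full):.1f}, "
      f"complete cases {variance(complete_cases):.1f}, "
      f"mean-substituted {variance(imputed):.1f}")
```

Mean substitution leaves the mean unchanged but adds many identical values at the centre of the distribution, so the variance comes out too small. Listwise deletion keeps the variance roughly right here, at the cost of a smaller n.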

Author: fionabrosig, Universiteit Leiden