Chapter 6 – Empirical Estimates of Reliability
- Reliability is a theoretical property: only estimates of it can be made.
- Methods: all derive from the notion of parallel tests, but differ in the kind of data that are
available + the assumptions on which they rest. BUT: no single method gives completely
accurate estimates under all conditions.
1. Alternate/parallel forms reliability
- Obtain scores from 2 different forms of a test, compute the correlation between the forms
+ interpret this correlation as an estimate of reliability. Consistent differences between
test-takers across forms = reliable.
- Only interpret it this way if the 2 test forms are parallel: then correlation between forms = reliability.
o They’re measuring the same set of true scores.
o They have the same amount of error variance.
- Problems:
o Can never truly know if tests are parallel: not sure if both are measuring the same
psychological attribute (= both having the same true scores), because they contain
different items. If not parallel, the correlation is not a good estimate of reliability.
o Potential for carryover/contamination effects due to repeated testing: taking
one form of the test has an effect on performance on the second form, e.g., memory,
learning, attitudes, mood states, etc. Result: error scores of both forms are
correlated (COMPARE CTT assumption: errors affecting a test should be
random/uncorrelated).
- If forms have the same means + standard deviations and we think they measure the same
construct, then they could be close enough to parallel, so the correlation can be used
as an estimate of reliability.
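The alternate forms logic, and the carryover problem, can be illustrated with a small simulation. This is a minimal sketch, not a method from the chapter: the function names, the normal distributions, and all numeric values (true-score SD 10, error SD 5, carryover SD 5) are assumptions chosen for illustration. With no carryover the two forms are parallel and their correlation should land near the reliability implied by the chosen variances, 100 / (100 + 25) = 0.80. With a shared carryover component the errors of the two forms become correlated, and the correlation between forms overstates the reliability.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def simulate_alternate_forms(n=20000, true_sd=10.0, error_sd=5.0,
                             carryover_sd=0.0, seed=1):
    """Correlate two simulated forms that share the same true scores.

    carryover_sd > 0 adds an error component common to both forms
    (a hypothetical stand-in for memory, mood, etc.), which makes the
    error scores of the two forms correlated.
    """
    rng = random.Random(seed)
    form_a, form_b = [], []
    for _ in range(n):
        true_score = rng.gauss(100.0, true_sd)
        shared_error = rng.gauss(0.0, carryover_sd)
        form_a.append(true_score + shared_error + rng.gauss(0.0, error_sd))
        form_b.append(true_score + shared_error + rng.gauss(0.0, error_sd))
    return pearson(form_a, form_b)


# Parallel forms: reliability implied by the variances is 100 / 125 = 0.80.
r_parallel = simulate_alternate_forms()

# Carryover: total error variance is 25 + 25 = 50, so reliability is
# 100 / 150 (about 0.67), but the forms correlate at about 125 / 150 (0.83).
r_carryover = simulate_alternate_forms(carryover_sd=5.0)

print(round(r_parallel, 2), round(r_carryover, 2))
```

In the carryover case the shared error variance enters the covariance between forms as if it were true-score variance, which is why the correlation is inflated relative to the reliability.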
Recap: assumptions of CTT
- Each observed score is an additive function of a true score + an error score (Xo = Xt + Xe).
- True scores are completely identical across 2 forms.
- Error scores sum to 0 for each form.
- True scores are uncorrelated with error scores.
- Error variances are equal for 2 forms.
- Errors affecting the 2 forms should be random/uncorrelated with each other.
If all of these hold: both forms are equally reliable.
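Why these assumptions make the correlation between forms equal to reliability can be sketched in a few lines. The derivation uses the notes' notation (Xo, Xt, Xe) with subscripts 1 and 2 added for the two forms, and it takes reliability to be the ratio of true-score variance to observed-score variance. That definition is standard in CTT but is not stated in this section, so treat the symbol R_{XX} as an assumption of the sketch.

```latex
\begin{align*}
X_{o1} &= X_t + X_{e1}, \qquad X_{o2} = X_t + X_{e2} \\[4pt]
\operatorname{Cov}(X_{o1}, X_{o2})
  &= \operatorname{Var}(X_t)
   + \operatorname{Cov}(X_t, X_{e2})
   + \operatorname{Cov}(X_{e1}, X_t)
   + \operatorname{Cov}(X_{e1}, X_{e2}) \\
  &= s_t^2
  && \text{(true scores and errors uncorrelated; errors uncorrelated)} \\[4pt]
s_{o1}^2 &= s_t^2 + s_{e1}^2 = s_t^2 + s_{e2}^2 = s_{o2}^2 = s_o^2
  && \text{(equal error variances)} \\[4pt]
r_{o1 o2}
  &= \frac{\operatorname{Cov}(X_{o1}, X_{o2})}{s_{o1}\, s_{o2}}
   = \frac{s_t^2}{s_o^2}
   = R_{XX}
\end{align*}
```

Each assumption in the recap removes one term or equates two terms. If errors are correlated, the last covariance term does not vanish; if error variances differ, the two observed standard deviations differ and the ratio is no longer the reliability of either form.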
2. Test-retest reliability
- Avoids some problems of the alternate forms method + is potentially useful for measures
of stable psychological constructs (e.g., intelligence).
o No problem of whether 2 alternate forms are parallel, because the same people take
the same test on >1 occasion, so the correlation between 1st test scores + retest scores
can be interpreted as an estimate of the test’s reliability.
- BUT: applicability also rests on assumptions.
o Stability assumption: participants’ true scores are stable across 2 testing
occasions = being certain true scores don’t change.
o Error variance of 1st test is equal to error variance of 2nd test.
The 2 testing occasions must produce scores that are equally reliable; if so, then
correlation = estimate of the scores’ reliability.
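The role of the stability assumption can be illustrated with a short simulation. This is a sketch under assumed values, not a procedure from the chapter: the function names, the normal distributions, and the numbers (true-score SD 10, error SD 5, true-change SD 6) are all hypothetical. When true scores are stable, the test-retest correlation should land near the reliability implied by the variances, 0.80. When each person's true score changes by a different amount between occasions, the correlation drops even though the amount of measurement error is unchanged.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def simulate_test_retest(n=20000, true_sd=10.0, error_sd=5.0,
                         change_sd=0.0, seed=2):
    """Correlate scores from two occasions of the same simulated test.

    change_sd > 0 lets each person's true score shift by a different
    amount between occasions, violating the stability assumption.
    Error variance is the same on both occasions.
    """
    rng = random.Random(seed)
    test, retest = [], []
    for _ in range(n):
        true_1 = rng.gauss(100.0, true_sd)
        true_2 = true_1 + rng.gauss(0.0, change_sd)  # person-specific change
        test.append(true_1 + rng.gauss(0.0, error_sd))
        retest.append(true_2 + rng.gauss(0.0, error_sd))
    return pearson(test, retest)


# Stable true scores: correlation should be near 100 / 125 = 0.80.
r_stable = simulate_test_retest()

# Unstable true scores: covariance stays 100, but the retest variance grows
# to 100 + 36 + 25 = 161, so the correlation falls to about 0.70.
r_unstable = simulate_test_retest(change_sd=6.0)

print(round(r_stable, 2), round(r_unstable, 2))
```

A low test-retest correlation therefore cannot, by itself, separate measurement error from genuine change in the attribute, which is why the method suits stable constructs.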