Chapter 5 – Reliability: Conceptual Basis
Overview of reliability and classical test theory
- Reliability: extent to which test scores reflect real psychological differences
accurately. Reliability is a continuum, not an all-or-nothing property.
- Classical test theory (CTT): measurement theory that defines conceptual basis of
reliability + outlines procedures for estimating reliability of psychological measures.
o Observed scores: values obtained from measurement of characteristic.
o True scores (signal): real amount of characteristic. Observed scores are interpreted
as good estimates of true scores, because most research is intended to
reflect true psychological characteristics.
o Reliability (signal / (signal + noise)): reflects extent to which differences in
respondents’ test scores are a function of their true psychological differences, as
opposed to measurement error. It is an unobserved feature of test scores, so it
has to be estimated (compare intelligence).
o Measurement error (noise): extent to which ‘other’ characteristics
contribute to differences in observed scores, which creates inconsistency between
observed + true scores. A score is never perfectly reliable, there is always some error;
BUT: hard to find all sources of measurement error (e.g., measuring babies’
length is affected by amount of squirming, or by different nurses doing the measuring).
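The signal/noise shorthand above can be written out as a variance ratio. The symbols below are assumed notation, not taken from these notes: X_o, X_t, X_e stand for observed, true, and error scores, and s^2 for their variances across respondents. The second equality only holds if error is uncorrelated with true scores.

```latex
\[
X_o = X_t + X_e,
\qquad
R_{XX} = \frac{s_t^2}{s_o^2} = \frac{s_t^2}{s_t^2 + s_e^2}
\]
```

Read this way, reliability runs from 0 (observed differences are all noise) to 1 (observed differences are all signal).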
Observed scores, true scores, and measurement error
- Reliability depends on:
o Extent to which differences in test scores can be attributed to real inter- or
intra-individual differences.
o Extent to which differences in test scores are function of measurement error.
- CTT: observed score is true score + error
o Assumption about measurement error: error occurs as if it’s random = just as
likely to inflate any score as to deflate it. Scoring higher/lower due to error is
simply chance, so error is independent of true scores: it can
influence every true score, whether it is high or low. Two consequences:
Error tends to cancel itself out across respondents = inflates scores of
some respondents + deflates scores of others, so the average effect of error = 0.
Error scores are uncorrelated with true scores: error still affects observed
scores, but independently of the level of the true score.
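The CTT assumptions above can be checked with a small simulation. This is only a sketch under stated assumptions: the function name simulate_ctt, the normal distributions, and the standard deviations (15 for true scores, 5 for error) are illustrative choices, not values from the source. It builds observed = true + error and then looks at the mean error, the true-error correlation, and the share of observed variance that is true variance.

```python
import random


def mean(values):
    return sum(values) / len(values)


def variance(values):
    # Population variance across respondents.
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def correlation(xs, ys):
    # Pearson correlation computed from population moments.
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (variance(xs) ** 0.5 * variance(ys) ** 0.5)


def simulate_ctt(n_respondents=20_000, true_sd=15.0, error_sd=5.0, seed=0):
    # Hypothetical setup: true scores and random error are drawn independently.
    rng = random.Random(seed)
    true_scores = [rng.gauss(100.0, true_sd) for _ in range(n_respondents)]
    errors = [rng.gauss(0.0, error_sd) for _ in range(n_respondents)]
    # CTT: observed score = true score + error.
    observed = [t + e for t, e in zip(true_scores, errors)]
    return true_scores, errors, observed


true_scores, errors, observed = simulate_ctt()

mean_error = mean(errors)                            # should be near 0
corr_true_error = correlation(true_scores, errors)   # should be near 0
reliability = variance(true_scores) / variance(observed)

print(f"mean error:            {mean_error:.3f}")
print(f"corr(true, error):     {corr_true_error:.3f}")
print(f"true var / observed var: {reliability:.3f}")
```

With these assumed values the theoretical ratio is 15^2 / (15^2 + 5^2) = 0.9, so the simulated ratio should land close to that. In real data true scores are never observed, which is why this ratio can be computed directly only in a simulation and has to be estimated in practice.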