Week 1: Scaling and Norms
Psychometrics = The branch of psychology concerned with the design and use of
psychological tests
Goal: Systematically evaluate the characteristics of a psychological test
How: The application of statistical and mathematical techniques to
psychological testing
Psychological constructs cannot be observed (latent traits)
- Measure them by taking a systematic sample of behaviour (= a test!)
- Psychological test = systematic sample of behaviour
A test measures:
● Inter-individual differences (between different people)
● Intra-individual differences (within one person)
Measurement Errors:
● A difference in scores that does not reflect a difference in the construct
● Systematic factors causing measurement error
- Affect validity
● Random factors causing measurement error
- Affect reliability
Levels of Measurement: nominal, ordinal, interval, ratio
Scaling
- Transforming raw scores to scale scores
- Scale score = a person’s score on a test
Two ways:
- An average score, mean of all items
- A total score, summation of all items
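The two scaling options can be sketched in a few lines of Python. The function name `scale_scores` and the example responses are hypothetical, chosen only for illustration:

```python
from statistics import mean

def scale_scores(item_responses):
    """Return (total, average) scale scores for one person's item responses."""
    total = sum(item_responses)        # total score: summation of all items
    average = mean(item_responses)     # average score: mean of all items
    return total, average

# Hypothetical responses of one person on a 5-item questionnaire (1-5 scale).
responses = [4, 3, 5, 2, 4]
total_score, mean_score = scale_scores(responses)
print(total_score, mean_score)
```

The average score stays on the metric of the items (here 1 to 5), while the total score depends on the number of items.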
Norms = give meaning to (scale) scores
Two types of norms:
● Absolute norms: criterion referenced test
- Compare scores to a predetermined value
● Relative norms: norm referenced test
- Compare scores to that of a representative sample
- Raw score X → Relative norm score (z, T, percentile rank)
Three types of norm scores:
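● Z scores: z = (X - M) / SD, with M and SD the mean and standard deviation of the norm group (z has mean 0, SD 1)
● T scores: T = 50 + 10z (T has mean 50, SD 10)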
● Percentile scores (P_X) = percentage of people in the norm group who obtained a
score equal to or lower than a particular score X
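A minimal sketch of the three relative norm scores, assuming the usual conventions z = (X - mean) / SD and T = 50 + 10z. The function names and the norm sample are hypothetical, and the population SD of the norm sample is used here (some texts divide by n - 1 instead):

```python
from statistics import mean, pstdev

def z_score(x, norm_scores):
    """Distance of raw score x from the norm-group mean, in SD units."""
    return (x - mean(norm_scores)) / pstdev(norm_scores)

def t_score(x, norm_scores):
    """Linear rescaling of z to mean 50 and SD 10 (assumed convention)."""
    return 50 + 10 * z_score(x, norm_scores)

def percentile_rank(x, norm_scores):
    """Percentage of the norm group scoring equal to or lower than x."""
    at_or_below = sum(1 for s in norm_scores if s <= x)
    return 100 * at_or_below / len(norm_scores)

# Hypothetical raw scores of a (tiny) representative norm sample.
norm_sample = [10, 12, 14, 16, 18, 20, 22, 24, 26, 28]
raw = 24

print(round(z_score(raw, norm_sample), 2))
print(round(t_score(raw, norm_sample), 1))
print(percentile_rank(raw, norm_sample))
```

z and T scores are linear transformations of the raw score, so they keep the shape of the score distribution. Percentile ranks only keep the rank order.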
Week 2: Reliability
Reliability
- How accurate is my measurement?
- To what extent are test scores influenced by random measurement error?
- Can this test give an indication about individual differences?
Classical Test Theory
- Based on the idea that an observed score is the sum of a true score and an
error score: X_o = X_t + X_e (o = observed, t = true, e = error)
Assumptions:
● μ_e = 0 (mean of the error in the population is zero)
● r_{et} = 0 (errors are uncorrelated with true scores)
● r_{e_i e_j} = 0 (errors are uncorrelated with each other)
Variance of observed test scores: s_o^2 = s_t^2 + s_e^2 (o = observed, t = true, e = error)
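The variance decomposition follows from the assumptions listed above. A short derivation, writing X_o, X_t and X_e for the observed, true and error scores and s^2 for a variance (this notation is an assumption made for the sketch):

```latex
\begin{align*}
X_o   &= X_t + X_e \\
s_o^2 &= s_t^2 + s_e^2 + 2\, r_{et}\, s_t s_e \\
      &= s_t^2 + s_e^2 \qquad \text{(because } r_{et} = 0\text{)}
\end{align*}
```

The covariance term drops out only because errors are assumed to be uncorrelated with true scores.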
Reliability coefficients:
● Proportion of variance in observed scores accounted for by true scores: R_{XX} = s_t^2 / s_o^2
● Squared correlation between observed and true scores: R_{XX} = r_{ot}^2
● 0 ≤ R_{XX} ≤ 1 if the classical test theory assumptions are met
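The two definitions of reliability can be checked against each other with a small simulation. This is only a sketch: the means, standard deviations, sample size and the helper `pearson_r` are assumptions made for the example, and true scores are never observable in real data.

```python
import random
from statistics import mean, pvariance

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pvariance(x) ** 0.5 * pvariance(y) ** 0.5)

random.seed(1)
n = 20000
true = [random.gauss(100, 15) for _ in range(n)]    # true-score variance 225
error = [random.gauss(0, 10) for _ in range(n)]     # error variance 100, mean 0
observed = [t + e for t, e in zip(true, error)]     # observed = true + error

# Reliability as the proportion of observed variance due to true scores.
ratio = pvariance(true) / pvariance(observed)
# Reliability as the squared correlation between observed and true scores.
r_squared = pearson_r(observed, true) ** 2

print(round(ratio, 3), round(r_squared, 3))
```

Both values should land close to 225 / 325 ≈ 0.69. They differ slightly in any finite sample because the sampled errors are never exactly uncorrelated with the true scores.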
Estimation based on two measurements:
Requirements for parallel tests:
● Same true scores: X_{t1} = X_{t2}
● Identical error variances: s_{e1}^2 = s_{e2}^2
Consequences of parallelism:
● Identical observed variances: s_{o1}^2 = s_{o2}^2
● Identical correlations with true score: r_{o1t} = r_{o2t}
● Correlation between parallel tests = reliability of both tests: r_{o1o2} = R_{XX}
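A simulated pair of parallel tests illustrates why their correlation can stand in for the reliability. The numbers and the helper `pearson_r` are assumptions for this sketch. Both tests share the same true scores and have independent errors with the same variance:

```python
import random
from statistics import mean, pvariance

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pvariance(x) ** 0.5 * pvariance(y) ** 0.5)

random.seed(2)
n = 20000
true = [random.gauss(50, 8) for _ in range(n)]       # same true scores for both tests
test_1 = [t + random.gauss(0, 6) for t in true]      # error variance 36
test_2 = [t + random.gauss(0, 6) for t in true]      # identical error variance, independent errors

reliability = pvariance(true) / pvariance(test_1)    # needs true scores: unknown in practice
r_parallel = pearson_r(test_1, test_2)               # needs only observed scores

print(round(reliability, 3), round(r_parallel, 3))
```

Both values should be close to 64 / 100 = 0.64. The practical point is that `r_parallel` is computed from observed scores alone.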
Three types of parallel tests:
- Alternate forms
- Test-retest
- Split-half
Alternate forms
● Different tests measuring the same construct
● Correlation between scores on both tests gives reliability
Problems:
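● Never know whether the two forms are truly parallel
- Solutions:
- Domain sampling (draw the items of both forms from the same item domain)
- Check that both forms have similar means and standard deviations
● Carry-over effects
- Lead to under- or overestimation of reliability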
Test-retest
● Take the same test twice (parallelism guaranteed!)
● Correlation between the two sets of test scores is equal to the reliability
Problems:
● People change over time
- Hence, reliability is underestimated
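The underestimation can be illustrated with a simulation in which people's true scores shift by different amounts between the two occasions. All numbers and the helper `pearson_r` are assumptions made for this sketch:

```python
import random
from statistics import mean, pvariance

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pvariance(x) ** 0.5 * pvariance(y) ** 0.5)

random.seed(3)
n = 20000
true_t1 = [random.gauss(50, 8) for _ in range(n)]          # true-score variance 64
time_1 = [t + random.gauss(0, 4) for t in true_t1]         # error variance 16, reliability 0.80

# Case 1: the construct is perfectly stable between test and retest.
retest_stable = [t + random.gauss(0, 4) for t in true_t1]

# Case 2: each person's true score changes by a different amount.
true_t2 = [t + random.gauss(0, 5) for t in true_t1]
retest_changed = [t + random.gauss(0, 4) for t in true_t2]

r_stable = pearson_r(time_1, retest_stable)
r_changed = pearson_r(time_1, retest_changed)
print(round(r_stable, 3), round(r_changed, 3))
```

With a stable construct the test-retest correlation recovers the reliability of about 0.80. With individual change it drops to about 0.70, even though the measurement error is unchanged. A change that is identical for everyone would leave the correlation untouched. Only individual differences in change lower it.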