CHAPTER 5
5.1 FOUNDATIONS OF MEASUREMENT
Measurement involves thinking about how to translate abstract concepts into something we can
observe consistently and accurately
Levels of measurement = the relationship between numerical values on a measure.
– The level of measurement determines how you can treat the measure when
analysing it
. Nominal level of measurement = measuring a variable by assigning a number arbitrarily in
order to name it numerically and distinguish it from other objects (e.g. jersey numbers)
– Attributes only named - weakest
. Ordinal level of measurement = measuring a variable using rankings (e.g. class rank)
– Attributes ordered
. Interval level of measurement = measuring a variable on a scale where the distance
between numbers is interpretable (e.g. temperature in Fahrenheit or Celsius)
– Distance meaningful
. Ratio level of measurement = measuring a variable on a scale where the distance
between numbers is interpretable and there is an absolute zero value (e.g. weight, kelvin,
income)
– Absolute zero exists
– Can multiply and divide (e.g. 50 kelvin is half 100 kelvin)
Dummy variables = can only have values of 0 or 1 (usually treated as nominal, but because they
have an absolute 0 we can also think of them as ratio variables)
Importance:
– Helps decide how to interpret data from that variable
– Helps decide what statistical analysis is appropriate on the values that were assigned
– e.g. if a measure is nominal, any statistical analysis that depends on the average, or uses
it as part of its calculation, would not be appropriate
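A minimal Python sketch of the point above: the level of measurement decides which summary statistic makes sense. The mapping (mode for nominal, median for ordinal, mean for interval and ratio) is a common convention assumed here for illustration, and the names `SUMMARY_BY_LEVEL` and `summarise` are hypothetical.

```python
from statistics import mean, median, mode

# Assumed convention (not from the notes): one reasonable summary per level.
SUMMARY_BY_LEVEL = {
    "nominal": mode,     # attributes only named: count categories, never average
    "ordinal": median,   # attributes ordered: rank-based summaries are safe
    "interval": mean,    # distance meaningful: averaging is interpretable
    "ratio": mean,       # absolute zero exists as well: averaging is interpretable
}


def summarise(values, level):
    """Return a summary statistic suited to the given level of measurement."""
    if level not in SUMMARY_BY_LEVEL:
        raise ValueError(f"unknown level of measurement: {level!r}")
    return SUMMARY_BY_LEVEL[level](values)


# Made-up illustrative data, one list per level.
jersey_numbers = [7, 10, 10, 23]     # nominal
class_ranks = [1, 2, 3, 4, 5]        # ordinal
temps_celsius = [18.0, 20.0, 22.0]   # interval
weights_kg = [50.0, 60.0, 70.0]      # ratio

print(summarise(jersey_numbers, "nominal"))
print(summarise(class_ranks, "ordinal"))
print(summarise(temps_celsius, "interval"))
print(summarise(weights_kg, "ratio"))
```

Averaging the jersey numbers would run without error, but the result would not mean anything, which is why the level has to be checked before the analysis is chosen.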
5.2 QUALITY OF MEASUREMENT (two criteria - reliability and validity)
(5.2a)
. RELIABILITY
= the consistency or stability of an observation
Does the observation provide the same results each time?
– A reliable measure may or may not be construct valid - reliability is a necessary but not
sufficient condition for construct validity
Foundation of reliability:
● True score theory
= a theory that maintains that an observed score is the sum of two components: true ability of
the respondent and random error
X (observed score) = T (true score) + e (error), where e = e_r (random error) + e_s (systematic error)
○ The true score is essentially the score that a person would have received if the
measurement was perfectly accurate
○ e.g. a student gets an 85 on their math test - their true ability might have been 89,
so the error of -4 could be due to having a bad day, not eating breakfast etc.
○ Important: shows that most measurements will have an error component;
minimising measurement error is the key aim of developing reliable measures; the
model can also be used in computer simulations as the basis for generating observed
scores with certain known properties
○ No true score (all error) = zero reliability; no random error (all true score) = perfectly reliable
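The simulation use mentioned above can be sketched in Python as follows. This is only an illustration of X = T + e with random error: the function name `simulate_observed_scores`, the normal distributions and every parameter value are assumptions, not part of the theory.

```python
import random
from statistics import mean, pvariance


def simulate_observed_scores(n, true_mean=75.0, true_sd=10.0, error_sd=5.0, seed=0):
    """Generate true scores, random errors, and observed scores X = T + e."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    true_scores = [rng.gauss(true_mean, true_sd) for _ in range(n)]
    errors = [rng.gauss(0.0, error_sd) for _ in range(n)]  # random error, mean 0
    observed = [t + e for t, e in zip(true_scores, errors)]
    return true_scores, errors, observed


true_scores, errors, observed = simulate_observed_scores(10_000)

print("mean error:", round(mean(errors), 3))
print("Var(T):", round(pvariance(true_scores), 1))
print("Var(X):", round(pvariance(observed), 1))
```

Because the error is drawn independently of the true score, the observed scores spread out more than the true scores do, which is the "known property" a simulation like this lets you control.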
Errors:
● Measurement error
○ Random error = a component or part of the value of a measure that varies entirely
by chance
◆ Adds noise to a measure and obscures the true value
◆ Caused by any factors that randomly affect the measurement of the variable
across the sample
◆ Adds variability to the data but does not affect average performance for the
groups - considered as noise
○ Systematic error = a component of an observed score that consistently affects the
responses in the distribution
◆ e.g. loud traffic outside a class of people taking a test will affect all people’s
scores
◆ Tends to be consistently positive or consistently negative, so it can sometimes be
considered bias in measurement
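The difference between the two kinds of error can be checked with a small Python sketch. The score distribution, the noise level and the 3-point penalty standing in for the traffic example are all assumed values chosen for illustration.

```python
import random
from statistics import mean, pstdev

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical true scores for a large group of test takers.
true_scores = [rng.gauss(75.0, 10.0) for _ in range(10_000)]

# Random error: zero-mean noise that differs from person to person.
with_random = [t + rng.gauss(0.0, 5.0) for t in true_scores]

# Systematic error: the same shift for everyone (assumed 3 points lost to traffic noise).
with_systematic = [t - 3.0 for t in true_scores]

for label, scores in [("true", true_scores),
                      ("random error", with_random),
                      ("systematic error", with_systematic)]:
    print(f"{label:>16}: mean={mean(scores):.2f} sd={pstdev(scores):.2f}")
```

Random error leaves the group average roughly where it was but widens the spread, while the systematic shift moves the average and leaves the spread unchanged.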
Reducing measurement error:
● Pilot test your instruments to get feedback from your respondents regarding how easy or
hard the measure was, and the information about how the testing environment affected
their performance - always check if there was sufficient time to complete the measure
● If you are gathering measures using people to collect data (e.g. interviewers or
observers) - train them thoroughly so they aren’t introducing error - they should also be
trained to assure respondents about the confidentiality of their answers
● Double-check the data collected for your study thoroughly - all data entry for computer
analysis should be double-entered and verified so the computer checks that the data are
exactly the same - inadequate checking at this stage may lead to loss of data/omissions or
duplication
● Use of statistical procedures to adjust for measurement error - range from simple
formulas you can apply directly to data to complex procedures for modelling the error
and its effects
● Triangulate = combining multiple independent measures to get at a more accurate
estimate of a variable (especially for systematic error)
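The double-entry check from the list above can be sketched in Python. This is a minimal illustration, assuming each record is a dictionary and both passes list the records in the same order; `find_entry_mismatches` is a hypothetical helper, not a function from any statistics package.

```python
def find_entry_mismatches(first_pass, second_pass):
    """Return (record index, field, first value, second value) for every disagreement."""
    if len(first_pass) != len(second_pass):
        raise ValueError("the two entry passes have different numbers of records")
    mismatches = []
    for i, (a, b) in enumerate(zip(first_pass, second_pass)):
        # Compare every field that appears in either version of the record.
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                mismatches.append((i, field, a.get(field), b.get(field)))
    return mismatches


# Made-up data: the second pass transposes the digits of one score.
entry_1 = [{"id": 1, "score": 85}, {"id": 2, "score": 72}]
entry_2 = [{"id": 1, "score": 85}, {"id": 2, "score": 27}]

print(find_entry_mismatches(entry_1, entry_2))
```

Any mismatch reported here would be resolved by going back to the original paper record, not by trusting either pass.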
(5.2b) Theory of reliability
If your observation is reliable, you should pretty much get the same results each time you
measure it
– The two observations have their true score in common but different error scores, because
the error is random
Reliability is a ratio or fraction
Reliability can be expressed in terms of variances - Var(T) / Var(X)
But how do we calculate the variance of the true score? You can’t because you don’t know
the true score
– You estimate it
– The two observations X1 and X2 would be related to each other by sharing true
scores so you just use the simple formula for correlation:
r = Cov(X1, X2) / (Sd(X1) * Sd(X2))
(Sd = standard deviation) - that would be the same for the two observations, so
Sd(X1) * Sd(X2) = Sd^2, which is the variance (Var(X))
BUT what is the range of a reliability estimate? It should range from 0 to 1
– e.g. 0.8 reliability means variability is 80% true ability and 20% error
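A rough Python check of the idea above: when two observations share a true score and have independent random errors, their correlation comes out close to Var(T) / Var(X). The variances are assumed values (Var(T) = 100, Var(e) = 25) picked to match the 0.8 example, and `pearson_r` is a hand-written helper.

```python
import random
from statistics import mean, pvariance


def pearson_r(xs, ys):
    """Pearson correlation: Cov(X, Y) / (Sd(X) * Sd(Y)), population form."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (pvariance(xs) ** 0.5 * pvariance(ys) ** 0.5)


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 20_000

true_scores = [rng.gauss(75.0, 10.0) for _ in range(n)]  # Var(T) = 10^2 = 100
x1 = [t + rng.gauss(0.0, 5.0) for t in true_scores]      # Var(e) = 5^2 = 25
x2 = [t + rng.gauss(0.0, 5.0) for t in true_scores]      # same T, new random error

theoretical = 100.0 / (100.0 + 25.0)                     # Var(T) / Var(X)
variance_ratio = pvariance(true_scores) / pvariance(x1)  # only possible in a simulation
estimated = pearson_r(x1, x2)                            # what you can compute in practice

print("theoretical reliability:", theoretical)
print("Var(T) / Var(X1) in the sample:", round(variance_ratio, 3))
print("correlation of X1 and X2:", round(estimated, 3))
```

In real data only the last quantity is available, because the true scores are never observed; the simulation just shows that the correlation is a reasonable stand-in for the variance ratio under these assumptions.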