Lecture Notes Test Construction

Pages
34
Uploaded on
3 November 2021
Written in
2020/2021

All lecture notes of the course Test Construction. Some short sentences are written in Dutch; overall the document is written in English.


Type
Class notes
Professor(s)
Iris Egberink
Contains
All classes

Test construction lecture 1, 3 February
Introduction – developing maximum and typical performance tests (Iris Egberink)
Learning goals
- To know and understand the principles of test and questionnaire construction
- To know how tests and questionnaires for a particular aim and a particular group are
effectively constructed, evaluated and interpreted

Topics
- The process of test construction
- Methods for understanding psychometric properties
- The principles of various item response models and their application in practice
- Important issues of validity and norm-referencing

Exam
- Book, articles, lectures, exercises, practical
- Multiple choice
- Formulae
o Standard statistics by heart (mean, proportion, variance, SD, covariance, correlation,
variance of sum variable)
o Other, if necessary, will be provided at the exam – without name and explanation of
parameters
- Simple, non-programmable calculator
- Example questions on Nestor
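
The standard statistics listed under "Formulae" can be sketched in a few lines of Python. This is only an illustration: the two score vectors are made-up data, the function names are hypothetical, and population formulas (dividing by n) are assumed, whereas the course may use the n - 1 versions. The last lines check the identity Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y) for the variance of a sum variable.

```python
import math

def mean(xs):
    """Arithmetic mean; for a 0/1 item this equals the proportion of 1s."""
    return sum(xs) / len(xs)

def covariance(xs, ys):
    """Population covariance (divides by n; an assumption of this sketch)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def variance(xs):
    """Variance is the covariance of a variable with itself."""
    return covariance(xs, xs)

def sd(xs):
    return math.sqrt(variance(xs))

def correlation(xs, ys):
    return covariance(xs, ys) / (sd(xs) * sd(ys))

# Made-up scores of five persons on two items (illustration only).
x = [1, 0, 1, 1, 0]   # dichotomous item
y = [2, 1, 3, 2, 0]   # polytomous item
total = [a + b for a, b in zip(x, y)]   # sum variable X + Y

lhs = variance(total)
rhs = variance(x) + variance(y) + 2 * covariance(x, y)
print(mean(x), variance(x), correlation(x, y))
print(lhs, rhs)   # both sides of the identity should agree
```

Note that for the dichotomous item the mean is the proportion p of 1-scores and the population variance equals p(1 - p).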

Sincere advice
- Read course manual thoroughly
- Prepare each lecture, study the material after each lecture, use video-lectures if needed
- Be optimally prepared for the first exam – don’t postpone to the resit → be careful with ‘learning
to the test’

Psychological and educational tests
- Test construction – development and application
o What does the test look like?
o Instructions for administration, scoring and interpretation
o Actual administrations of tests
▪ What info does it give?
▪ What is the usefulness of this info, and for whom (individuals, policy)?
- Test theory – statistical theory about behavior of item scores and test scores – What can I do
with the outcome?
o Examples – Classical test theory, item response theory (quality)
o Important issues – quantitative measures for the quality of items and tests for target
groups of respondents
- Both are needed for a sensible use of tests

Use of tests – in practice
1. Human Resource Management – personnel selection and development
2. Education – individual development and performance of students
a. Identify deviating patterns of development (pupil assessment
system/leerlingvolgsysteem)
b. Prediction of most suitable type of high school (end of primary school/CITO-toets
groep 8)

3. Psychodiagnostics – neuropsychology, clinical psychology, developmental psychology
- Judgments on individuals

Use of tests – in research
- Testing of hypotheses and theories; theory building
- E.g. ‘location and size of brain damage determines type and severity of behavioral difficulties
in the long term’
- Variables
o Indicators of location and size of brain damage
o Behavioral difficulties – e.g. anxiety, aggression, childish behavior, apathy, lack of
insight
- Judgments on populations/groups

Definition of a test – a psychological or educational test is an instrument for the measurement of a
person’s maximum or typical performance under standardized conditions, where the performance is
assumed to reflect one or more latent attributes

Test types
- Typical performance test
o Typifies person – no correct answers
o E.g. personality, attitude, mental health
- Maximum performance test
o Person’s achievement
o E.g. intelligence, ability level

Standardization – very important aspect of testing
- Test conditions are fixed
o E.g. test material, instructions, administration procedure, score computing
- Aim – to ensure comparability of test performances between persons and test occasions
- Difficult to achieve perfect standardization – write out specific instructions to give to the
participants
- Which aspects to standardize depends on, for example, the test or the target population

Latent attribute
- Attribute that cannot be measured directly
o E.g. verbal ability, arithmetic skills, severity of depression
- Test score (X) should reflect the latent attribute of interest (T; true score)
o Causal relationship between attribute and test score
o Thus, if 2 persons differ on the attribute, the test scores differ as well, and the other
way around
- Test score is an indicator of the attribute
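
The idea that a test score X reflects a true score T can be illustrated with a small simulation. The sketch below assumes the classical test theory decomposition X = T + E, with E a random measurement error; the means, standard deviations, sample size and seed are arbitrary choices for illustration and are not taken from the lecture.

```python
import math
import random

rng = random.Random(0)   # fixed seed so the sketch is reproducible

n_persons = 1000
# T: true scores on the latent attribute (assumed normal, mean 50, SD 10).
true_scores = [rng.gauss(50, 10) for _ in range(n_persons)]
# E: random measurement error (assumed normal, mean 0, SD 5).
errors = [rng.gauss(0, 5) for _ in range(n_persons)]
# X = T + E: the observed test scores.
test_scores = [t + e for t, e in zip(true_scores, errors)]

def correlation(xs, ys):
    """Pearson correlation of two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# A high correlation means differences in X mostly reflect differences in T.
r_xt = correlation(test_scores, true_scores)
print(round(r_xt, 2))
```

With less error (a smaller SD for E), the observed scores track the latent attribute more closely and the correlation moves toward 1.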

Some important terminology
- Item
o Smallest test unit, on which person is scored
o Score can be the same as the person’s response
- Subtest (also denoted as subscale, or just scale)
o Independent part of a test
o Indicative of an attribute

o Consists of various items

Example of maximum performance test
- Bayley-III
o Aims to assess the developmental level of young children (1-42 months)
o Individual, standardized assessment
o Normed scores
o Assessing the developmental level by playing
o Aims of use
▪ For children with concerns about development
▪ Diagnosis of developmental delays, in order to plan and/or evaluate
interventions
o Consisting of 5 (or 7) subscales
▪ Administered with child interaction
- Cognition
- Language (reception, production)
- Motor (fine, gross)
▪ Parent questionnaires
- Social-emotional
- Adaptive behavior
o Example of item instruction – Gross motor
▪ Have the child kick a ball → successful: 1; unsuccessful (falling, not far enough): 0

Test construction
1. Define the construct of interest
a. Constructs → abstract, theoretical concepts
b. Literature search – what is intelligence? What part of intelligence?
c. Homogeneity and dimensionality – different dimensions could have different
subscales
d. …
2. Develop the test
a. Essential aspects
i. Measurement mode of the test
1. Self-performance mode
2. Self-evaluation mode
3. Other-evaluation mode
4. Example – SDQ
a. Strengths and Difficulties Questionnaire → brief behavioral
screening questionnaire about 3-16 year olds. Exists in several
versions to meet the needs of researchers, clinicians and
educationalists.
b. 25 items on psychological attributes – all versions of the SDQ
ask about 25 attributes, some positive and others negative
c. These 25 items are divided between 5 scales
i. Emotional symptoms – 5
ii. Conduct problems – 5
iii. Hyperactivity/inattention – 5

iv. Peer relationship problems – 5
v. Prosocial behavior – 5
▪ Scales i to iv are added to generate a total
difficulties score (based on 20 items)
d. Thus, either 2 subscales (total difficulties, prosocial), or 5
subscales
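
The SDQ structure described above (5 scales of 5 items, with the first four scales summed into a total difficulties score) can be sketched as a scoring function. Several things here are assumptions for illustration only: the 0/1/2 item coding, the consecutive grouping of items by scale, the scale labels used as dictionary keys, and the example responses. Items are assumed to be already recoded so that a higher score means more of the scale's attribute.

```python
# Scale order follows the notes: the first four are difficulties scales.
SCALES = ["emotional", "conduct", "hyperactivity", "peer", "prosocial"]
ITEMS_PER_SCALE = 5

def score_sdq(item_scores):
    """Return (scale scores, total difficulties) for 25 item scores.

    Assumes items are ordered scale by scale (a simplification for
    this sketch) and that each item score is 0, 1 or 2.
    """
    if len(item_scores) != len(SCALES) * ITEMS_PER_SCALE:
        raise ValueError("expected 25 item scores")
    scale_scores = {}
    for i, name in enumerate(SCALES):
        start = i * ITEMS_PER_SCALE
        scale_scores[name] = sum(item_scores[start:start + ITEMS_PER_SCALE])
    # Total difficulties: the four problem scales, excluding prosocial.
    total_difficulties = sum(scale_scores[name] for name in SCALES[:4])
    return scale_scores, total_difficulties

# Made-up responses of one child (illustration only).
responses = [1, 0, 2, 1, 0,   # emotional symptoms
             0, 0, 1, 0, 0,   # conduct problems
             2, 2, 1, 1, 2,   # hyperactivity/inattention
             0, 1, 0, 0, 1,   # peer relationship problems
             2, 2, 1, 2, 2]   # prosocial behavior
scales, total = score_sdq(responses)
print(scales, total)
```

Reporting `total` together with `scales["prosocial"]` corresponds to the two-subscale use; reporting all five entries of `scales` corresponds to the five-subscale use.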
ii. Objectives of the test
1. Research vs. practice
2. Individual or group level
3. Description vs. diagnosis vs. decision making
iii. Population and subpopulation of testees
1. Be as specific as possible
2. Inclusion and exclusion criteria
3. Too broad → implications for norm groups and their
representativeness
iv. Conceptual framework of the test
1. More specific than just definition; it helps to write items
2. Typical performance – three broad classes of strategies
a. Intuitive – rational, prototypical
b. Deductive
i. Construct method – use of theoretical framework
(e.g. Koster et al.)
ii. Facet design method – conceptual analysis of the
construct
c. Inductive – constructs to be measured cannot be defined
beforehand, but are identified using association measures
(e.g. correlations)
i. Internal – associations among items (how they are
related to one another)
ii. External – associations between items and external
criterion (predictive validity)
3. Example of an internal strategy
a. Sixteen Personality Factor Questionnaire (16PF) – Cattell and colleagues,
1940
b. Self-report measuring 16 primary traits
c. Based on factor analysis of variables describing broad range
of actual behaviors
i. FA – method to identify subgroups of variables
▪ With high correlations within the subgroups
▪ With low correlations between the
subgroups
d. Useful approach to describe differences between individuals
in personality characteristics
i. But it does NOT (and CANNOT) reveal sources
of/causes of differences in personality
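
The pattern that factor analysis looks for (high correlations within subgroups of variables, low correlations between them) can be made visible with a plain correlation matrix. The sketch below is not a factor analysis; it only computes the correlations that such an analysis would start from. The four variables and the scores of six persons are made-up data, constructed so that v1 and v2 form one subgroup and v3 and v4 another.

```python
import math

def correlation(xs, ys):
    """Pearson correlation of two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up scores of six persons on four variables (illustration only).
scores = {
    "v1": [1, 2, 3, 4, 5, 6],
    "v2": [2, 1, 4, 3, 6, 5],
    "v3": [3, 1, 2, 2, 1, 3],
    "v4": [3, 1, 3, 1, 1, 3],
}

names = list(scores)
corr = {(a, b): correlation(scores[a], scores[b]) for a in names for b in names}

# Print the correlation matrix row by row.
for a in names:
    print(a, " ".join(f"{corr[(a, b)]:6.2f}" for b in names))
```

In the printed matrix the v1-v2 and v3-v4 correlations are high, while all correlations across the two pairs are close to zero, which is the block structure a factor analysis would summarize as two factors.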
v. Item response mode
1. Many, see book
2. Frequently-used scales
a. Dichotomous = binary
i. E.g. yes/no, true/false, correct/incorrect
ii. Typically encoded as 0, 1
b. Ordinal polytomous – e.g. never/sometimes/often
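
The two response scales above can be turned into item scores with a simple lookup. This is a minimal sketch: the 0/1 coding for dichotomous items follows the notes, while the 0/1/2 coding for the ordinal categories and the function name `encode` are assumptions made for illustration.

```python
# Dichotomous (binary) coding, as in the notes: 0 or 1.
DICHOTOMOUS = {"no": 0, "yes": 1}
# Ordinal polytomous coding; consecutive integers are an assumption.
ORDINAL = {"never": 0, "sometimes": 1, "often": 2}

def encode(responses, coding):
    """Map raw response labels to numeric item scores (case-insensitive)."""
    return [coding[r.lower()] for r in responses]

binary_scores = encode(["Yes", "no", "yes"], DICHOTOMOUS)
ordinal_scores = encode(["never", "often", "sometimes"], ORDINAL)
print(binary_scores, ordinal_scores)
```

The integer codes for the ordinal scale only preserve the order of the categories; whether the distances between them are meaningful is a separate question.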
juliavonk28 Rijksuniversiteit Groningen