
M&D3 Complete Summary (P_BMD3IOD)


Complete and easy-to-read summary of M&D3 – Assessment & Selection. Covers all lectures, key concepts, and important authors (e.g., Schmidt & Hunter, Sackett et al., Van Iddekinge et al.). Includes topics such as reliability, validity, test construction, job analysis, interviews, assessment centers, fairness, and decision-making. Perfect for students preparing for the exam — all lectures combined into one clear and structured file. I used this summary to prepare for the exam and scored an 8/10!

Content preview

HC1 M&D3 03/09/2025
Topics in assessment & selection (Van Iddekinge et al., 2023)
- Reliability and validity of selection methods
- Building, developing, and validating methods
- Fairness and test bias in selection methods
- Utility and decision-making in selection
- Applicant reactions
- Gamification
- AI in selection

Course aim: solve ‘the supreme problem’
“Psychologists should help in the supreme problem of diagnosing each individual, and
steering him toward his fittest place” (Hall, 1917, p. 11)
- What constructs (e.g., cognitive ability, personality) can predict important outcomes
such as job performance? → Can we predict behaviour?
- What methods should be used to measure these constructs?
- How do we ensure these methods are fair and unbiased?
- How do we use the methods to make decisions?

What is a good measure?
Measures must meet a lot of criteria to be useful
- COTAN = Commissie Testaangelegenheden Nederland → Checks the key criteria

Principles of test construction:
- Quality of test material
- Quality of a manual
- Standardization and norms
- Reliability
- Construct validity
- Criterion validity (tomorrow’s lecture)

Reliability
Reliability: “The degree to which measures are free from error and yield consistent results”
- In classical test theory (CTT), X = T + E (observed score = true score + error)

Example: My friends and I went fishing. I caught a big one.
- We wanted to know the unknown fish’s true weight (true score, T)
- But we could only obtain the fish’s observed weight (observed score, X)
- We took multiple measurements that were not identical due to error (E)

Errors in personnel selection → environment, examiner (rater), method (instrument), etc.
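
To make X = T + E concrete, here is a minimal sketch of the fishing example (made-up numbers, not from the lecture): each observed weight is the true weight plus random error, so the mean of repeated measurements gets closer to T.

```python
import numpy as np

rng = np.random.default_rng(42)
true_weight = 4.2      # T: the fish's (unknown) true weight in kg
error_sd = 0.3         # spread of the measurement error E

# X = T + E: ten observed weights, each contaminated by random error
observed = true_weight + rng.normal(0, error_sd, size=10)

print(observed.round(2))   # individual measurements vary around T
print(observed.mean())     # the average is a better estimate of T
```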

Types of reliability
- Test-retest: consistency in scores over time
- Parallel forms: equivalence of two versions of the same test → correlation
- Internal consistency: how well the items in a test measure the same underlying concept.
   - Split-half approach → dividing the test into two halves and correlating the scores from these halves.
   - Coefficient alpha ⍺ (average of all possible split-halves)

Inter-rater reliability (IRR): degree to which raters give consistent scores of the same thing
(the consistency/agreement distinction is illustrated in the sketch below).
- Consistency (r)
- Agreement (Kappa)
- Intraclass correlation coefficients (ICCs; Shrout & Fleiss, 1979)
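
A minimal sketch of that distinction, with made-up ratings: two interviewers can be perfectly consistent yet never agree exactly, so r is high while kappa is low. Cohen's kappa is computed here with scikit-learn's cohen_kappa_score (one readily available implementation; the course does not prescribe software).

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

# Hypothetical 1-5 ratings of six candidates by two interviewers;
# rater B is consistently one point stricter than rater A.
rater_a = np.array([2, 3, 4, 5, 3, 4])
rater_b = rater_a - 1

r = np.corrcoef(rater_a, rater_b)[0, 1]       # consistency: perfect (r = 1.0)
kappa = cohen_kappa_score(rater_a, rater_b)   # agreement: poor (no exact matches)
print(r, kappa)
```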

Alpha coefficient ⍺
Is based on:
- A single administration of a test
- (Co-)variances of the items → variation within and relationships between test items.
- Number of items → Higher ⍺ if there are more items
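
The formula itself did not survive in this extract; the standard form is ⍺ = k/(k − 1) · (1 − Σσᵢ²/σₓ²), with k the number of items, σᵢ² the item variances, and σₓ² the variance of the total score. A minimal sketch with made-up responses:

```python
import numpy as np

def cronbach_alpha(items):
    """Coefficient alpha from a respondents x items score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                            # number of items
    item_vars = items.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)     # variance of the sum score
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Five respondents answering three Likert items
scores = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [3, 3, 3], [1, 2, 2]]
print(round(cronbach_alpha(scores), 2))
```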




Interpretation of alpha coefficient ⍺
Reliability is a characteristic of a measurement, not of a method (e.g., a questionnaire)

For individual diagnosis, the COTAN standards apply:
- rₓₓ < .80 is insufficient → too much measurement error for individual statements.
- .80 ≤ rₓₓ < .90 is sufficient → reliable enough for individual interpretation.
- rₓₓ ≥ .90 is good → highly reliable measurement, suitable for individual diagnosis.
→ rₓₓ = reliability coefficient, indicating the proportion of true-score variance in observed scores.

In research, rₓₓ = .60 or .70 can sometimes be used with caution.

What alpha is not
- A measure of unidimensionality (Schmitt, 1996).
- An indicator of the extent to which we measure what we want to measure.

Intraclass correlation coefficient (ICC)
The intraclass correlation coefficient (ICC) is a correlation coefficient that assesses the
consistency between measures of the same class. → Between raters.

How reliable are the ratings from multiple raters?
- 3 clinical psychologists rate the behavior of children with special needs
- 5 court judges estimate the likelihood that a defendant will reoffend
- 4 consultants rate candidates’ behavior in an interview

ICC differs by study design (Shrout & Fleiss, 1979)
- 6 ICC types: ICC(1, A), ICC(2, A), ICC(3, A), ICC(1, B), ICC(2, B), ICC(3, B)
- Each subject is rated by a different, randomly selected rater → 1
- A random sample of k raters rates all subjects → 2
- The same fixed set of raters rates all subjects → 3
- Are the ratings of all raters averaged at the end? → B (sometimes called k); if not → A

To determine A or B, check how ratings are used in practice
- A reliability study may use multiple raters, but in practice, only one rater may be
available (only one person conducts an interview) → A
- Panel interview: Multiple interviewers rate independently, and the total score is the
average across interviewers → B
→ Often raters do not want to rate individually; they want to consult each other.

ICC interpretation (Koo & Li, 2016)
- Below 0.50: poor
- Between 0.50 and 0.75: moderate
- Between 0.75 and 0.90: good
- Above 0.90: excellent
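
A minimal sketch of the panel-interview design above (Shrout & Fleiss type 2: a random sample of raters rates all subjects), computed directly from the ANOVA mean squares; the rating matrix is made up. The single-rater (A) and averaged (B) variants come out of the same mean squares.

```python
import numpy as np

# Hypothetical ratings: 6 candidates (rows) x 3 interviewers (columns)
X = np.array([[4, 5, 4],
              [2, 3, 3],
              [5, 5, 4],
              [3, 4, 3],
              [1, 2, 2],
              [4, 4, 5]], dtype=float)
n, k = X.shape

grand = X.mean()
ss_rows = k * ((X.mean(axis=1) - grand) ** 2).sum()   # subjects
ss_cols = n * ((X.mean(axis=0) - grand) ** 2).sum()   # raters
ss_err = ((X - grand) ** 2).sum() - ss_rows - ss_cols

msr = ss_rows / (n - 1)               # mean square, subjects
msc = ss_cols / (k - 1)               # mean square, raters
mse = ss_err / ((n - 1) * (k - 1))    # mean square, error

icc2_a = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)  # ICC(2, A): single rater
icc2_b = (msr - mse) / (msr + (msc - mse) / n)                      # ICC(2, B): average of k raters
print(round(icc2_a, 2), round(icc2_b, 2))
```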

Confidence intervals (CI)
- We are interested in the uncertainty of our estimate of the true score.
- With repeated sampling, the CI contains the true score a fixed percentage of the time (e.g., 95% for a 95% CI).

CI = X ± z * SEM
- X = Test score
- z = critical value of the standard normal distribution (e.g., 1.96 for a 95% CI)
- SEM = standard error of measurement = σ * √(1 - rₓₓ), where:
- σ is the standard deviation of observed test scores
- rₓₓ = reliability of the test

As reliability decreases, the confidence interval widens.
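
A worked example with made-up numbers (an IQ-style scale with σ = 15, not from the lecture), showing how a drop in reliability widens the interval around the same observed score:

```python
import math

x, sd = 110.0, 15.0          # observed score and SD of test scores
for rxx in (0.90, 0.70):     # a reliable vs. a less reliable test
    sem = sd * math.sqrt(1 - rxx)            # standard error of measurement
    lo, hi = x - 1.96 * sem, x + 1.96 * sem  # 95% CI around the score
    print(f"rxx={rxx}: SEM={sem:.1f}, 95% CI=({lo:.1f}, {hi:.1f})")
```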

Validity
Validity: “The extent to which a test measures what it should measure”.

Types of validity
- Face validity = “does it look like a measure relevant for job performance?”
- Content validity = “Does a measure represent all facets of a given construct?”
- Construct validity = how well a test measures the underlying theoretical concept.
   - Convergent validity
   - Discriminant/divergent validity
- Criterion-related validity = how well test scores relate to an external criterion or outcome.
   - Concurrent validity
   - Predictive validity

Construct validity
→ To what extent is the test a good measure of the underlying theoretical concept?

Internal structure
- Number of dimensions (factors) → factor analysis!
- Expected group differences (e.g., people high on neuroticism should be in therapy more often than people low on neuroticism).

External structure
- Convergent validity: correlation between two measures of constructs that
theoretically should be correlated → E.g.: workaholism and health problems.
- Divergent validity: no correlation between two measures of constructs that
theoretically should not be correlated → E.g.: cognitive ability and agreeableness.

Factor analysis (FA)
Factor analysis (FA) is useful for revealing (exploratory FA) or verifying (confirmatory FA) the
underlying dimensions of a newly developed measure.
- Does our scale measure separate subdimensions or is it unidimensional?

In this course, we will only cover exploratory FA
- Summarize data by grouping together variables that are correlated.
- Typically used in the early stages of research, to consolidate variables.

Types of exploratory FA:
- Principal Components (PC): all variance in observed variables is analyzed. Variables
‘cause’ components.
- Factor analysis (FA): Only shared variance is analyzed. Error variance is eliminated.
Factors ‘cause’ variables/items.

Imagine three items: “I can delay gratification”, “I avoid eating ice cream even if I would like
it”, “I don’t go clubbing if I have an exam the next day”. What could be an underlying factor
that ‘causes’ item responses? → Self-control




[Figure: path diagram of a latent ‘self-control’ factor with arrows pointing to the three items; with PC the arrows would be reversed.]
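
A minimal exploratory sketch of this idea using scikit-learn's FactorAnalysis (one possible tool; the course does not prescribe software). Responses to the three self-control items are simulated from a single latent factor, which EFA should recover as three substantial loadings on one factor:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n = 500
self_control = rng.normal(size=n)    # latent factor scores

# Each item = loading * factor + unique error (the FA model)
loadings = np.array([0.8, 0.7, 0.6])
items = self_control[:, None] * loadings + rng.normal(0, 0.5, size=(n, 3))

fa = FactorAnalysis(n_components=1)
fa.fit(items)
print(fa.components_)    # estimated loadings: all three items load on one factor
```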

We look for variables in a correlation matrix that ‘cluster together’
- So, matrices with correlation coefficients r close to 0 are problematic. No clusters!

How do we check whether our correlation matrix is appropriate for factor analysis?
- Bartlett’s test of sphericity: tests whether the correlations are zero, but it is notoriously
sensitive to N (→ with large N it is nearly always significant, so not very informative).
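
A minimal sketch of Bartlett's test from its standard chi-square approximation, χ² = −(n − 1 − (2p + 5)/6) · ln|R| with p(p − 1)/2 degrees of freedom (the lecture names the test but no implementation):

```python
import numpy as np
from scipy.stats import chi2

def bartlett_sphericity(data):
    """Test H0: the correlation matrix is an identity matrix."""
    data = np.asarray(data, dtype=float)
    n, p = data.shape
    R = np.corrcoef(data, rowvar=False)    # p x p correlation matrix
    stat = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    df = p * (p - 1) / 2
    return stat, chi2.sf(stat, df)         # test statistic, p-value

# Uncorrelated noise → high p-value: the matrix is not suited for FA
rng = np.random.default_rng(1)
print(bartlett_sphericity(rng.normal(size=(200, 5))))
```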