Exam (elaborations)

PSYCHOLOGICAL TESTING

Pages
99
Grade
A+
Uploaded on
28-12-2023
Written in
2023/2024

BASIC CONCEPTS

What a Test Is
• A test is a measurement device or technique used to quantify behavior or aid in the understanding and prediction of behavior.
• A test measures only a sample of behavior, and error is always associated with a sampling process.
• Test scores are not perfect measures of a behavior or characteristic, but they do add significantly to the prediction process.
• An item is a specific stimulus to which a person responds overtly; this response can be scored or evaluated (for example, classified, graded on a scale, or counted).
• A psychological test or educational test is a set of items that are designed to measure characteristics of human beings that pertain to behavior.
• Overt behavior is an individual’s observable activity.
• Behavior can also be covert; that is, it takes place within an individual and cannot be directly observed.
• Scores on tests may be related to traits, which are enduring characteristics or tendencies to respond in a certain manner.
• Test scores may also be related to the state, or the specific condition or status, of an individual.

Types of Tests
• Tests that can be given to only one person at a time are known as individual tests.
• A group test, by contrast, can be administered to more than one person at a time by a single examiner, such as when an instructor gives everyone in the class a test at the same time.
• Ability tests contain items that can be scored in terms of speed, accuracy, or both. On an ability test, the faster or the more accurate your responses, the better your scores on a particular characteristic.
• Achievement refers to previous learning.
• Aptitude, by contrast, refers to the potential for learning or acquiring a specific skill.
• Intelligence refers to a person’s general potential to solve problems, adapt to changing circumstances, think abstractly, and profit from experience.
• In view of the considerable overlap of achievement, aptitude, and intelligence tests, all three concepts are encompassed by the term human ability.
• Whereas ability tests are related to capacity or potential, personality tests are related to the overt and covert dispositions of the individual.
• Personality tests measure typical behavior.
• Structured personality tests provide a statement, usually of the “self-report” variety, and require the subject to choose between two or more alternative responses such as “True” or “False.”
• In a projective personality test, either the stimulus (test materials) or the required response, or both, is ambiguous.
• Projective tests assume that a person’s interpretation of an ambiguous stimulus will reflect his or her unique characteristics.
• Psychological testing refers to all the possible uses, applications, and underlying concepts of psychological and educational tests.

HISTORICAL PERSPECTIVE

Early Antecedents
• Evidence suggests that the Chinese had a relatively sophisticated civil service testing program more than 4000 years ago.
• By the Han Dynasty (206 B.C.E. to 220 C.E.), the use of test batteries (two or more tests used in conjunction) was quite common.
• Tests had become quite well developed by the Ming Dynasty (1368–1644 C.E.).
• The Western world most likely learned about testing programs through the Chinese.

Charles Darwin and Individual Differences
• An important step toward understanding individual differences came with the publication of Charles Darwin’s highly influential book, The Origin of Species, in 1859.
• Darwin also believed that those with the best or most adaptive characteristics survive at the expense of those who are less fit and that the survivors pass their characteristics on to the next generation. Through this process, he argued, life has evolved to its currently complex and intelligent levels.
• Sir Francis Galton, a relative of Darwin’s, soon began applying Darwin’s theories to the study of human beings.
• Galton set out to show that some people possessed characteristics that made them more fit than others, a theory he articulated in his book Hereditary Genius, published in 1869.
• He concentrated on demonstrating that individual differences exist in human sensory and motor functioning, such as reaction time, visual acuity, and physical strength.
• Galton’s work was extended by the U.S. psychologist James McKeen Cattell, who coined the term mental test (Cattell, 1890).

Experimental Psychology and Psychophysical Measurement
• Before psychology was practiced as a science, mathematical models of the mind were developed, in particular those of J. E. Herbart.
• Wilhelm Wundt, who set up a laboratory at the University of Leipzig in 1879, is credited with founding the science of psychology, following in the tradition of Weber and Fechner (Hearst, 1979).
• Whipple provided the basis for immense changes in the field of testing by conducting a seminar at the Carnegie Institute in 1919 attended by Thurstone, E. Strong, and other early prominent U.S. psychologists.
• Thus, psychological testing developed from at least two lines of inquiry: one based on the work of Darwin, Galton, and Cattell on the measurement of individual differences, and the other (more theoretically relevant and probably stronger) based on the work of the German psychophysicists Herbart, Weber, Fechner, and Wundt.
• Psychological tests also arose in response to important needs such as classifying and identifying the mentally and emotionally handicapped.
• One of the earliest tests resembling current procedures, the Seguin Form Board Test (Seguin, 1866/1907), was developed in an effort to educate and evaluate the mentally disabled.
• Working in conjunction with the French physician T. Simon, Alfred Binet developed the first major general intelligence test.
The Evolution of Intelligence and Standardized Achievement Tests
• The first version of the test, known as the Binet-Simon Scale, was published in 1905.
• This instrument contained 30 items of increasing difficulty and was designed to identify intellectually subnormal individuals.
• A representative sample is one that comprises individuals similar to those for whom the test is to be used.
• When the test is used for the general population, a representative sample must reflect all segments of the population in proportion to their actual numbers.
• The 1908 Binet-Simon Scale also determined a child’s mental age, thereby introducing a historically significant concept.
• In simplified terms, you might think of mental age as a measurement of a child’s performance on the test relative to other children of that particular age group.
• By 1916, L. M. Terman of Stanford University had revised the Binet test for use in the United States.
• Terman’s revision, known as the Stanford-Binet Intelligence Scale (Terman, 1916), was the only American version of the Binet test that flourished.

World War I
• The testing movement grew enormously in the United States because of the demand for a quick, efficient way of evaluating the emotional and intellectual functioning of thousands of military recruits in World War I.
• Robert Yerkes headed a committee of distinguished psychologists who soon developed two structured group tests of human abilities: the Army Alpha and the Army Beta. The Army Alpha required reading ability, whereas the Army Beta measured the intelligence of illiterate adults.
• World War I fueled the widespread development of group tests.

Achievement Tests
• Among the most important developments following World War I was the development of standardized achievement tests.
• Standardized achievement tests caught on quickly because of the relative ease of administration and scoring and the lack of subjectivity or favoritism that can occur in essay or other written tests.
• In 1923, the development of standardized achievement tests culminated in the publication of the Stanford Achievement Test by T. L. Kelley, G. M. Ruch, and L. M. Terman.

Rising to the Challenge
• A mere 2 years after the 1937 revision of the Stanford-Binet test, David Wechsler published the first version of the Wechsler intelligence scales, the Wechsler-Bellevue Intelligence Scale (WB) (Wechsler, 1939).
• The Wechsler-Bellevue scale contained several interesting innovations in intelligence testing.
• Among the various scores produced by the Wechsler test was the performance IQ.
• Wechsler’s inclusion of a nonverbal scale thus helped overcome some of the practical and theoretical weaknesses of the Binet test.

Personality Tests
• Just before and after World War II, personality tests began to blossom.
• Traits are relatively enduring dispositions (tendencies to act, think, or feel in a certain manner in any given circumstance) that distinguish one individual from another.
• One of the basic goals of traditional personality tests is to measure traits.
• The earliest personality tests were structured paper-and-pencil group tests.
• These tests provided multiple-choice and true–false questions that could be administered to a large group.
• The first structured personality test, the Woodworth Personal Data Sheet, was developed during World War I and was published in final form just after the war.
• The Rorschach test was first published by Hermann Rorschach of Switzerland in 1921.
• Adding to the momentum for the acceptance and use of projective tests was the development of the Thematic Apperception Test (TAT) by Henry Murray and Christina Morgan in 1935.
• The TAT purported to measure human needs and thus to ascertain individual differences in motivation.

The Emergence of New Approaches to Personality Testing
• In 1943, the Minnesota Multiphasic Personality Inventory (MMPI) began a new era for structured personality tests.
• The problem with early structured personality tests such as the Woodworth was that they made far too many assumptions that subsequent scientific investigations failed to substantiate.
• Factor analysis is a method of finding the minimum number of dimensions (characteristics, attributes), called factors, to account for a large number of variables.
• A factor analysis can identify how much a set of variables overlap and whether they can all be accounted for or subsumed under a single dimension (or factor) such as extroversion.
• By the end of the 1940s, R. B. Cattell had introduced the Sixteen Personality Factor Questionnaire (16PF); despite its declining popularity, it remains one of the most well-constructed structured personality tests and an important example of a test developed with the aid of factor analysis.

The Current Environment
• Neuropsychologists use tests in hospitals and other clinical settings to assess brain injury.
• Health psychologists use tests and surveys in a variety of medical settings.
• Forensic psychologists use tests in the legal system to assess mental state as it relates to an insanity defense, competency to stand trial or to be executed, and emotional damages.
• Child psychologists use tests to assess childhood disorders.
• Testing is indeed one of the essential elements of psychology.
• To study any area of human behavior effectively, one must understand the basic principles of measurement.

CHAPTER 2 - NORMS AND BASIC STATISTICS FOR TESTING

WHY WE NEED STATISTICS
• Statistical methods serve two important purposes in the quest for scientific understanding.
• First, statistics are used for purposes of description.
• Numbers provide convenient summaries and allow us to evaluate some observations relative to others (Cohen & Lea, 2004; Pagano, 2004).
• Second, we can use statistics to make inferences, which are logical deductions about events that cannot be observed directly.
• First comes the detective work of gathering and displaying clues, or what the statistician John Tukey calls exploratory data analysis.
• Then comes a period of confirmatory data analysis, when the clues are evaluated against rigid statistical rules. This latter phase is like the work done by judges and juries.
• Descriptive statistics are methods used to provide a concise description of a collection of quantitative information.
• Inferential statistics are methods used to make inferences from observations of a small group of people known as a sample to a larger group of individuals known as a population.

SCALES OF MEASUREMENT
• One may define measurement as the application of rules for assigning numbers to objects.
• The basic feature of these types of systems is the scale of measurement.

Properties of Scales
• Three important properties make scales of measurement different from one another: magnitude, equal intervals, and an absolute 0.

Magnitude
• Magnitude is the property of “moreness.”
• A scale has the property of magnitude if we can say that a particular instance of the attribute represents more, less, or equal amounts of the given quantity than does another instance.

Equal Intervals
• A scale has the property of equal intervals if the difference between two points at any place on the scale has the same meaning as the difference between two other points that differ by the same number of scale units.
• For example, the difference between inch 2 and inch 4 on a ruler represents the same quantity as the difference between inch 10 and inch 12: exactly 2 inches.
• A psychological test rarely has the property of equal intervals.
• When a scale has the property of equal intervals, the relationship between the measured units and some outcome can be described by a straight line or a linear equation in the form $Y = a + bX$.

Absolute 0
• An absolute 0 is obtained when nothing of the property being measured exists.
• For many psychological qualities, it is extremely difficult, if not impossible, to define an absolute 0 point.

Types of Scales
• Nominal scales are really not scales at all; their only purpose is to name objects.
• For example, the numbers on the backs of football players’ uniforms are nominal.
• Nominal scales are used when the information is qualitative rather than quantitative.
• A scale with the property of magnitude but not equal intervals or an absolute 0 is an ordinal scale.
• This scale allows you to rank individuals or objects but not to say anything about the meaning of the differences between the ranks.
• For example, if Fred was the tallest, Susan the second tallest, and George the third tallest, you would assign them the ranks 1, 2, and 3, respectively.
• For most problems in psychology, the precision to measure the exact differences between intervals does not exist.
• When a scale has the properties of magnitude and equal intervals but not absolute 0, we refer to it as an interval scale.
• The most common example of an interval scale is the measurement of temperature in degrees Fahrenheit.
• Because the scale does not have an absolute 0, we cannot make statements in terms of ratios.
• A scale that has all three properties (magnitude, equal intervals, and an absolute 0) is called a ratio scale.
• For instance, 0 miles per hour (mph) is the point at which there is no speed at all. If you are driving onto a highway at 30 mph and increase your speed to 60 when you merge, then you have doubled your speed.

Permissible Operations
• For nominal data, each observation can be placed in only one mutually exclusive category. For example, you are a member of only one gender.
• Ordinal measurements can be manipulated using arithmetic; however, the result is often difficult to interpret because it reflects neither the magnitudes of the manipulated observations nor the true amounts of the property that have been measured.
• With interval data, one can apply any arithmetic operation to the differences between scores.
• Statements about ratios are reserved for ratio scales, for which any mathematical operation is permissible.

FREQUENCY DISTRIBUTIONS
• A distribution of scores summarizes the scores for a group of individuals.
• The frequency distribution displays scores on a variable or a measure to reflect how frequently each value was obtained.
• Usually, scores are arranged on the horizontal axis from the lowest to the highest value.
• For most distributions of test scores, the frequency distribution is bell-shaped, with the greatest frequency of scores toward the center of the distribution and decreasing frequencies as the values become greater or less than the value in the center of the distribution.

PERCENTILE RANKS
• Percentile ranks replace simple ranks when we want to adjust for the number of scores in a group.
• A percentile rank answers the question “What percent of the scores fall below a particular score ($X_i$)?”
• The formula is $P_r = \frac{B}{N} \times 100$, where $B$ is the number of scores below $X_i$ and $N$ is the total number of scores.

PERCENTILES
• Percentiles are the specific scores or points within a distribution.
• Percentiles divide the total frequency for a set of observations into hundredths.
• Remember that a percentile rank is a measure of relative performance.

DESCRIBING DISTRIBUTIONS

Mean
• A variable is a score that can have different values.
• The arithmetic average score in a distribution is called the mean.
• To calculate the mean, we total the scores and divide the sum by the number of cases, or $N$.
• The capital Greek letter sigma ($\Sigma$) means summation.
• Thus, the formula for the mean, which we signify as $\bar{X}$, is $\bar{X} = \frac{\Sigma X}{N}$.

Standard Deviation
• The standard deviation is an approximation of the average deviation around the mean.
• One way to measure variability is to subtract the mean from each score ($X - \bar{X}$) and then total the deviations.
• In fact, the sum of the deviations around the mean will always equal 0.
Page | 7  You can obtain the average squared deviation around the mean, known as the variance. The formula for the variance is  The square root of the variance is the standard deviation (s), and it is represented by the following formula.  The standard deviation is thus the square root of the average squared deviation around the mean. Z Score  The Z score transforms data into standardized units that are easier to interpret.  A Z score is the difference between a score and the mean, divided by the standard deviation:  If a score is equal to the mean, then its Z score is 0.  If the score is greater than the mean, then the Z score is positive; if the score is less than the mean, then the Z score is negative. McCall’s T  One system was established in 1939 by W. A. McCall, who originally intended to develop a system to derive equal units on mental quantities.  In McCall’s system, called McCall’s T, the standard deviation was set at 10.  In effect, McCall generated a system that is exactly the same as standard scores (Z scores), except that the mean in McCall’s system is 50 rather than 0 and the standard deviation is 10 rather than 1.  Indeed, a Z score can be transformed to a T score by applying the linear transformation.  An example of a test developed using standardized scores is the Scholastic Aptitude Test (SAT).  If a distribution of scores is skewed before the transformation is applied, it will also be skewed after the transformation has been used. In other words, transformations standardize but do not normalize. Quartiles and Deciles  The quartile system divides the percentage scale into four groups, whereas the decile system divides the scale into 10 groups.  Quartiles are points that divide the frequency distribution into equal fourths.  The first quartile is the 25th percentile; the second quartile is the median, or 50th, percentile; and the third quartile is the 75th percentile. 
• The interquartile range is the interval of scores bounded by the 25th and 75th percentiles. In other words, the interquartile range spans the middle 50% of the distribution.
• Deciles are similar to quartiles except that they use points that mark 10% rather than 25% intervals.
• The stanine system converts scores to a nine-point scale; the term stanine comes from “standard nine.”
• The scale is standardized to have a mean of 5 and a standard deviation of approximately 2.

NORMS
• Norms refer to the performances by defined groups on particular tests.
• The norms for a test are based on the distribution of scores obtained by some defined sample of individuals.
• The mean is a norm, and the 50th percentile is a norm.
• Norms are used to give information about performance relative to what has been observed in a standardization sample.
• Norms are obtained by administering the test to a sample of people and obtaining the distribution of scores for that group.

Age-related Norms
• Most IQ tests use age-related norms.
• When the Stanford-Binet IQ test was originally created, distributions of the performance of random samples of children were obtained for various age groups.
• When applying an IQ test, the tester’s task is to determine the mental age of the person being tested.

Tracking
• The tendency to stay at about the same level relative to one’s peers is known as tracking.
• Height and weight are good examples of physical characteristics that track.

Criterion-Referenced Tests
• The purpose of establishing norms for a test is to determine how a test taker compares with others.
• A norm-referenced test compares each person with a norm.
• Criterion-referenced tests do not compare students with one another; they compare each student’s performance with a criterion or an expected level of performance.
• A criterion-referenced test describes the specific types of skills, tasks, or knowledge that the test taker can demonstrate such as mathematical skills.
• The results of such a test might demonstrate that a particular child can add, subtract, and multiply but has difficulty with both long and short division.
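
The descriptive statistics defined in the chapter on norms and basic statistics (percentile rank, mean, variance, standard deviation, Z score, and McCall's T) can be tried out on a small set of scores. The following is a minimal sketch, not part of the original notes: the function names and the eight sample scores are made up for illustration, and the variance divides by N because the notes define it as the average squared deviation around the mean.

```python
from math import sqrt

def percentile_rank(scores, x):
    """Percent of scores that fall strictly below x (B / N * 100)."""
    below = sum(1 for s in scores if s < x)
    return 100.0 * below / len(scores)

def mean(scores):
    """Sum of the scores divided by the number of cases N."""
    return sum(scores) / len(scores)

def variance(scores):
    """Average squared deviation around the mean (divides by N)."""
    m = mean(scores)
    return sum((s - m) ** 2 for s in scores) / len(scores)

def standard_deviation(scores):
    """Square root of the variance."""
    return sqrt(variance(scores))

def z_score(x, scores):
    """Difference between a score and the mean, in standard deviation units."""
    return (x - mean(scores)) / standard_deviation(scores)

def mccall_t(z):
    """Linear transformation of a Z score to a mean of 50 and SD of 10."""
    return 10 * z + 50

# Hypothetical raw scores, chosen so the arithmetic is easy to follow.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

m = mean(scores)
print("mean:", m)
print("sum of deviations:", sum(s - m for s in scores))
print("variance:", variance(scores))
print("standard deviation:", standard_deviation(scores))
print("Z score for 9:", z_score(9, scores))
print("McCall's T for 9:", mccall_t(z_score(9, scores)))
print("percentile rank of 7:", percentile_rank(scores, 7))
```

With these scores the deviations around the mean sum to 0, which is why the deviations are squared before averaging. Because the T score is only a linear rescaling of Z, it keeps the shape of the original distribution, in line with the point that such transformations standardize but do not normalize.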
