Chapter 1 – Overview and Descriptive Statistics
SHORT ANSWER
1. Give one possible sample of size 4 from each of the following populations:
a. All daily newspapers published in the United States
b. All companies listed on the New York Stock Exchange
c. All students at your college or university
d. All grade point averages of students at your college or university
ANS:
a. Houston Chronicle, Des Moines Register, Chicago Tribune, Washington Post
b. Capital One, Campbell Soup, Merrill Lynch, Pulitzer
c. John Anderson, Emily Black, Bill Carter, Kay Davis
d. 2.58, 2.96, 3.51, 3.69
PTS: 1
2. A Southern State University system consists of 23 campuses. An administrator wishes to make an inference about
the average distance between the hometowns of students and their campuses. Describe and discuss several different
sampling methods that might be employed. Would this be an enumerative or an analytic study? Explain your
reasoning.
ANS:
One could take a simple random sample of students from all students in the Southern State University system and
ask each student in the sample to report the distance from their hometown to campus. Alternatively, one could
take a stratified random sample by drawing a simple random sample from each of the 23
campuses and again asking each student in the sample to report the distance from their hometown to campus.
Certain problems might arise with self-reporting of distances, such as recording error or poor recall. This study is
enumerative because there exists a finite, identifiable population of objects from which to sample.
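The two sampling methods described in this answer can be sketched in Python. This is only an illustration under stated assumptions: the sampling frame below (campus labels, 200 student IDs per campus, sample sizes) is entirely made up, and a real study would draw from actual enrollment records.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: 23 campuses, 200 made-up student IDs each.
frame = {
    f"campus_{c:02d}": [f"c{c:02d}_s{i:03d}" for i in range(200)]
    for c in range(1, 24)
}

def simple_random_sample(frame, n, rng):
    """Draw n students without replacement from the pooled list of all students."""
    everyone = [s for students in frame.values() for s in students]
    return rng.sample(everyone, n)

def stratified_sample(frame, n_per_campus, rng):
    """Draw a separate simple random sample from each campus (stratum)."""
    return {
        campus: rng.sample(students, n_per_campus)
        for campus, students in frame.items()
    }

srs = simple_random_sample(frame, 46, rng)
strat = stratified_sample(frame, 2, rng)
```

Both samples contain 46 students here, but only the stratified sample guarantees that every campus is represented.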
PTS: 1
3. A Michigan city divides naturally into ten district neighborhoods. How might a real estate appraiser select a sample
of single-family homes that could be used as a basis for developing an equation to predict appraised value from
characteristics such as age, size, number of bathrooms, and distance to the nearest school? Is the study
enumerative or analytic?
ANS:
One could generate a simple random sample of all single-family homes in the city or a stratified random sample by
taking a simple random sample from each of the 10 district neighborhoods. From each of the homes in the sample
the necessary variables would be collected. This would be an enumerative study because there exists a finite,
identifiable population of objects from which to sample.
PTS: 1
4. An experiment was carried out to study how flow rate through a solenoid valve in an automobile’s pollution-control
system depended on three factors: armature lengths, spring load, and bobbin depth. Two different levels (low and
high) of each factor were chosen, and a single observation on flow was made for each combination of levels.
a. The resulting data set consisted of how many observations?
b. Is this an enumerative or analytic study? Explain your reasoning.
ANS:
a. The number of observations equals 2 × 2 × 2 = 8.
b. This could be called an analytic study because the data would be collected on an existing process.
There is no sampling frame.
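The count in part (a) can be checked by listing every combination of levels. The following is a minimal sketch; the factor names come from the question, while the variable names are illustrative.

```python
from itertools import product

factors = ["armature length", "spring load", "bobbin depth"]
levels = ["low", "high"]

# Every combination of one level per factor (a full 2 x 2 x 2 layout).
runs = list(product(levels, repeat=len(factors)))

for run in runs:
    print(dict(zip(factors, run)))

# One flow observation is made per combination of levels.
n_observations = len(runs)
print(n_observations)
```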
PTS: 1
5. The accompanying data are specific gravity values for various wood types used in construction.
.41 .41 .42 .42 .42 .42 .42 .43 .44
.54 .55 .58 .62 .66 .66 .67 .68 .75
.31 .35 .36 .36 .37 .38 .40 .40 .40
.45 .46 .46 .47 .48 .48 .48 .51 .54
Construct a stem-and-leaf display using repeated stems and comment on any interesting features of the display.
ANS:
One method of denoting the pairs of stems having equal values is to denote the first stem by L, for ‘low’, and the second
stem by H, for ‘high’. Using this notation, the stem-and-leaf display would appear as follows:
3L 1                 stem: tenths
3H 56678             leaf: hundredths
4L 000112222234
4H 5667888
5L 144
5H 58
6L 2
6H 6678
7L
7H 5
The stem-and-leaf display above shows that .45 is a good representative value for the data. In
addition, the display is not symmetric and appears to be positively skewed. The spread of the data is .75 - .31 = .44,
which is .44/.45 = .978 or about 98% of the typical value of .45. This constitutes a reasonably large amount of
variation in the data. The data value .75 is a possible outlier.
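A display with repeated stems can also be built programmatically. The sketch below assumes the convention used in this answer (leaves 0 through 4 go on the L row, leaves 5 through 9 on the H row); the function name is illustrative.

```python
from collections import defaultdict

data = [
    .41, .41, .42, .42, .42, .42, .42, .43, .44,
    .54, .55, .58, .62, .66, .66, .67, .68, .75,
    .31, .35, .36, .36, .37, .38, .40, .40, .40,
    .45, .46, .46, .47, .48, .48, .48, .51, .54,
]

def repeated_stem_display(values):
    """Return stem-and-leaf rows with each stem split into L (0-4) and H (5-9)."""
    rows = defaultdict(list)
    for v in sorted(values):
        hundredths = round(v * 100)          # work in whole hundredths
        stem, leaf = divmod(hundredths, 10)  # stem: tenths, leaf: hundredths
        half = "L" if leaf <= 4 else "H"
        rows[(stem, half)].append(str(leaf))
    stems = [s for s, _ in rows]
    lines = []
    for stem in range(min(stems), max(stems) + 1):
        for half in ("L", "H"):
            leaves = "".join(rows.get((stem, half), []))
            lines.append(f"{stem}{half} {leaves}".rstrip())
    return lines

lines = repeated_stem_display(data)
print("\n".join(lines))

# Range of the data, rounded to hundredths to avoid floating-point noise.
spread = round(max(data) - min(data), 2)
```

Empty rows such as 7L are kept so that gaps in the data remain visible in the display.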
PTS: 1
6. Temperature transducers of a certain type are shipped in batches of 50. A sample of 60 batches was selected, and
the number of transducers in each batch not conforming to design specifications was determined, resulting in the
following data:
0 4 2 1 3 1 1 3 4 1 2 3 2 2 8 4 5 1 3 1
2 1 2 4 0 1 3 2 0 5 3 3 1 3 2 4 7 0 2 3
5 0 2 3 2 1 0 6 4 2 1 6 0 3 3 3 6 1 2 3
a. Determine frequencies and relative frequencies for the observed values of x = number of nonconforming
transducers in a batch.
b. What proportion of batches in the sample has at most four nonconforming transducers? What proportion has
fewer than four? What proportion has at least four nonconforming units?
ANS:
a.
Number Nonconforming Frequency Relative Frequency
0 7 0.117
1 12 0.200
2 13 0.217
3 14 0.233
4 6 0.100
5 3 0.050
6 3 0.050
7 1 0.017
8 1 0.017
1.001
The relative frequencies don’t add up exactly to 1 because they have been rounded.
b. The number of batches with at most 4 nonconforming items is 7+12+13+14+6=52, which is a proportion of
52/60=.867. The proportion of batches with (strictly) fewer than 4 nonconforming items is 46/60=.767. The
proportion of batches with at least 4 nonconforming units is 1 - 46/60 = 14/60 = .233.
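The frequency table and the proportions in part (b) can be reproduced from the raw data. This is a minimal sketch using Python's standard library; the variable names are illustrative.

```python
from collections import Counter

batches = [
    0, 4, 2, 1, 3, 1, 1, 3, 4, 1, 2, 3, 2, 2, 8, 4, 5, 1, 3, 1,
    2, 1, 2, 4, 0, 1, 3, 2, 0, 5, 3, 3, 1, 3, 2, 4, 7, 0, 2, 3,
    5, 0, 2, 3, 2, 1, 0, 6, 4, 2, 1, 6, 0, 3, 3, 3, 6, 1, 2, 3,
]

n = len(batches)
freq = Counter(batches)

# Frequency and relative frequency for each observed value of x.
for x in sorted(freq):
    print(f"{x:>2} {freq[x]:>3} {freq[x] / n:.3f}")

# Proportions asked for in part (b).
at_most_4 = sum(c for x, c in freq.items() if x <= 4) / n
fewer_than_4 = sum(c for x, c in freq.items() if x < 4) / n
at_least_4 = sum(c for x, c in freq.items() if x >= 4) / n
```

Note that "fewer than four" and "at least four" are complementary, so their proportions sum to 1.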
PTS: 1
7. The number of contaminating particles on a silicon wafer prior to a certain rinsing process was determined for each
wafer in a sample of size 100, resulting in the following frequencies:
Number of particles Frequency Number of particles Frequency
0 1 8 12
1 2 9 4
2 3 10 5
3 12 11 3
4 11 12 1
5 15 13 2
6 18 14 1
7 10
a. What proportion of the sampled wafers had at least two particles? At least six particles?
b. What proportion of the sampled wafers had between four and nine particles, inclusive? Strictly between four and
nine particles?
ANS:
a. From this frequency distribution, the proportion of wafers that contained at least two particles is (100-1-2)/100 =
.97, or 97%. In a similar fashion, the proportion containing at least 6 particles is (100-1-2-3-12-11-15)/100 =
56/100 = .56, or 56%.
b. The proportion containing between 4 and 9 particles inclusive is (11+15+18+10+12+4)/100 = 70/100 = .70, or
70%. The proportion that contain strictly between 4 and 9 (meaning strictly more than 4 and strictly less than 9)
is (15+18+10+12)/100 = 55/100 = .55, or 55%.
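These proportions can be verified directly from the frequency distribution. The sketch below stores the table as a dictionary and uses a small helper, `proportion`, which is an illustrative name rather than anything from the source.

```python
# Number of particles -> number of wafers, copied from the table in the question.
freq = {
    0: 1, 1: 2, 2: 3, 3: 12, 4: 11, 5: 15, 6: 18, 7: 10,
    8: 12, 9: 4, 10: 5, 11: 3, 12: 1, 13: 2, 14: 1,
}

n = sum(freq.values())  # total number of sampled wafers

def proportion(condition):
    """Proportion of wafers whose particle count satisfies the condition."""
    return sum(count for x, count in freq.items() if condition(x)) / n

at_least_2 = proportion(lambda x: x >= 2)
at_least_6 = proportion(lambda x: x >= 6)
between_4_and_9_inclusive = proportion(lambda x: 4 <= x <= 9)
strictly_between_4_and_9 = proportion(lambda x: 4 < x < 9)
```

Writing each condition as an explicit inequality makes the difference between "inclusive" and "strictly between" easy to see.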