And The Sciences 9th Ed by Jay L. Devore.
Chapter 1 – Overview and Descriptive Statistics
SHORT ANSWER
1. Give one possible sample of size 4 from each of the following populations:
a. All daily newspapers published in the United States
b. All companies listed on the New York Stock Exchange
c. All students at your college or university
d. All grade point averages of students at your college or university
ANS:
a. Houston Chronicle, Des Moines Register, Chicago Tribune, Washington Post
b. Capital One, Campbell Soup, Merrill Lynch, Pulitzer
c. John Anderson, Emily Black, Bill Carter, Kay Davis
d. 2.58, 2.96, 3.51, 3.69
PTS: 1
2. A Southern State University system consists of 23 campuses. An administrator wishes to make an inference about the
average distance between the hometowns of students and their campuses. Describe and discuss several different
sampling methods that might be employed. Would this be an enumerative or an analytic study? Explain your
reasoning.
ANS:
One could take a simple random sample of students from all students in the state university system and ask
each student in the sample to report the distance from their hometown to campus. Alternatively, one could take a
stratified random sample by drawing a simple random sample from each of the 23 campuses and
again asking each student in the sample to report the distance from their hometown to campus.
Certain problems might arise with self-reporting of distances, such as recording error or poor recall. This study is enumerative
because there exists a finite, identifiable population of objects from which to sample.
PTS: 1
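The two sampling schemes described in the answer to question 2 can be sketched in Python. This is only an illustration: the sampling frame below (23 campuses with 100 made-up student IDs each), the sample sizes, and all variable names are assumptions, not part of the question.

```python
import random

# Hypothetical sampling frame: 23 campuses, each with 100 made-up student IDs.
frame = {campus: [f"c{campus:02d}-s{i:03d}" for i in range(100)]
         for campus in range(1, 24)}

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Simple random sample: pool every student, then draw without replacement.
all_students = [s for students in frame.values() for s in students]
srs = rng.sample(all_students, 46)

# Stratified random sample: a separate simple random sample from each campus.
stratified = {campus: rng.sample(students, 2)
              for campus, students in frame.items()}

print(len(srs), sum(len(v) for v in stratified.values()))
```

The stratified version guarantees that every campus is represented, which a simple random sample of the same size does not.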
3. A Michigan city divides naturally into ten district neighborhoods. How might a real estate appraiser select a sample of
single-family homes that could be used as a basis for developing an equation to predict appraised value from
characteristics such as age, size, number of bathrooms, distance to the nearest school, and so on? Is the study
enumerative or analytic?
ANS:
One could generate a simple random sample of all single-family homes in the city or a stratified random sample by taking
a simple random sample from each of the 10 district neighborhoods. From each of the homes in the sample the
necessary variables would be collected. This would be an enumerative study because there exists a finite, identifiable
population of objects from which to sample.
PTS: 1
4. An experiment was carried out to study how flow rate through a solenoid valve in an automobile’s pollution-control
system depended on three factors: armature length, spring load, and bobbin depth. Two different levels (low and
high) of each factor were chosen, and a single observation on flow was made for each combination of levels.
a. The resulting data set consisted of how many observations?
b. Is this an enumerative or analytic study? Explain your reasoning.
ANS:
a. The number of observations equals 2 × 2 × 2 = 8, one for each combination of the two levels of the three factors.
b. This could be called an analytic study because the data would be collected on an existing process.
There is no sampling frame.
PTS: 1
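The count in part (a) of question 4 can be checked by listing every combination of levels. A minimal Python sketch, in which the variable names are illustrative assumptions:

```python
from itertools import product

factors = ["armature length", "spring load", "bobbin depth"]
levels = ["low", "high"]

# One run per combination of levels: 2 x 2 x 2 combinations in total.
runs = list(product(levels, repeat=len(factors)))

for run in runs:
    print(dict(zip(factors, run)))
print(len(runs))
```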
5. The accompanying data are specific gravity values for various wood types used in construction.
.41 .41 .42 .42 .42 .42 .42 .43 .44
.54 .55 .58 .62 .66 .66 .67 .68 .75
.31 .35 .36 .36 .37 .38 .40 .40 .40
.45 .46 .46 .47 .48 .48 .48 .51 .54
Construct a stem-and-leaf display using repeated stems and comment on any interesting features of the display.
ANS:
One method of denoting the pairs of stems having equal values is to denote the first stem by L, for ‘low’, and the second stem
by H, for ‘high’. Using this notation, the stem-and-leaf display would appear as follows:
3L 1               stem: tenths
3H 56678           leaf: hundredths
4L 000112222234
4H 5667888
5L 144
5H 58
6L 2
6H 6678
7L
7H 5
The stem-and-leaf display shows that .45 is a good representative value for the data. In addition,
the display is not symmetric and appears to be positively skewed. The spread of the data is .75 - .31 = .44, which is
.44/.45 = .978 or about 98% of the typical value of .45. This constitutes a reasonably large amount of variation in the
data. The data value .75 is a possible outlier.
PTS: 1
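A display with repeated stems, as asked for in question 5, can be built mechanically: leaves 0 to 4 go on the L row of a stem and leaves 5 to 9 on the H row. The following Python sketch shows one way to do this. Storing the values as integer hundredths is an assumption made here to avoid floating-point rounding, and the variable names are illustrative.

```python
from collections import defaultdict

# Specific gravity values from question 5, in hundredths (.41 -> 41).
data = [41, 41, 42, 42, 42, 42, 42, 43, 44,
        54, 55, 58, 62, 66, 66, 67, 68, 75,
        31, 35, 36, 36, 37, 38, 40, 40, 40,
        45, 46, 46, 47, 48, 48, 48, 51, 54]

rows = defaultdict(str)
for value in sorted(data):
    stem, leaf = divmod(value, 10)    # stem: tenths, leaf: hundredths
    half = "L" if leaf < 5 else "H"   # leaves 0-4 on L, leaves 5-9 on H
    rows[f"{stem}{half}"] += str(leaf)

# Print every stem from 3L to 7H, including stems with no leaves.
for stem in range(3, 8):
    for half in "LH":
        label = f"{stem}{half}"
        print(label, rows.get(label, ""))
```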
6. Temperature transducers of a certain type are shipped in batches of 50. A sample of 60 batches was selected, and the
number of transducers in each batch not conforming to design specifications was determined, resulting in the
following data:
0 4 2 1 3 1 1 3 4 1 2 3 2 2 8 4 5 1 3 1
2 1 2 4 0 1 3 2 0 5 3 3 1 3 2 4 7 0 2 3
5 0 2 3 2 1 0 6 4 2 1 6 0 3 3 3 6 1 2 3
a. Determine frequencies and relative frequencies for the observed values of x = number of nonconforming
transducers in a batch.
b. What proportion of batches in the sample has at most four nonconforming transducers? What proportion has
fewer than four? What proportion has at least four nonconforming units?
ANS:
a.
Number Nonconforming   Frequency   Relative Frequency
0                      7           0.117
1                      12          0.200
2                      13          0.217
3                      14          0.233
4                      6           0.100
5                      3           0.050
6                      3           0.050
7                      1           0.017
8                      1           0.017
Total                  60          1.001
The relative frequencies don’t add up exactly to 1 because they have been rounded.
b. The number of batches with at most 4 nonconforming items is 7+12+13+14+6=52, which is a proportion of
52/60=.867. The proportion of batches with (strictly) fewer than 4 nonconforming items is 46/60=.767.
The number of batches with at least 4 nonconforming items is 6+3+3+1+1=14, which is a proportion of 14/60=.233.
PTS: 1
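The frequency table and the three proportions in question 6 can be recomputed directly from the 60 observations. A short Python sketch follows; the variable names are illustrative assumptions, and the order of the observations does not affect the result.

```python
from collections import Counter

# Number of nonconforming transducers in each of the 60 sampled batches.
batches = [0, 4, 2, 1, 3, 1, 1, 3, 4, 1, 2, 3, 2, 2, 8, 4, 5, 1, 3, 1,
           2, 1, 2, 4, 0, 1, 3, 2, 0, 5, 3, 3, 1, 3, 2, 4, 7, 0, 2, 3,
           5, 0, 2, 3, 2, 1, 0, 6, 4, 2, 1, 6, 0, 3, 3, 3, 6, 1, 2, 3]

counts = Counter(batches)
n = len(batches)

# Frequency and relative frequency for each observed value of x.
for x in sorted(counts):
    print(x, counts[x], round(counts[x] / n, 3))

at_most_4 = sum(c for x, c in counts.items() if x <= 4) / n
fewer_than_4 = sum(c for x, c in counts.items() if x < 4) / n
at_least_4 = sum(c for x, c in counts.items() if x >= 4) / n
print(round(at_most_4, 3), round(fewer_than_4, 3), round(at_least_4, 3))
```

Note that "at most four" and "at least four" both include the six batches with exactly four nonconforming units, so the two proportions sum to more than 1.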
7. The number of contaminating particles on a silicon wafer prior to a certain rinsing process was determined for each
wafer in a sample of size 100, resulting in the following frequencies:
Number of particles   Frequency     Number of particles   Frequency
0                     1             8                     12
1                     2             9                     4
2                     3             10                    5
3                     12            11                    3
4                     11            12                    1
5                     15            13                    2
6                     18            14                    1
7                     10
a. What proportion of the sampled wafers had at least two particles? At least six particles?
b. What proportion of the sampled wafers had between four and nine particles, inclusive? Strictly between four and
nine particles?
ANS:
a. From this frequency distribution, the proportion of wafers that contained at least two particles is (100-1-2)/100 =
.97, or 97%. In a similar fashion, the proportion containing at least 6 particles is (100-1-2-3-12-11-15)/100 =
56/100 = .56, or 56%.
b. The proportion containing between 4 and 9 particles inclusive is (11+15+18+10+12+4)/100 = 70/100 = .70, or
70%. The proportion that contain strictly between 4 and 9 (meaning strictly more than 4 and strictly less than 9) is
(15+18+10+12)/100 = 55/100 = .55, or 55%.
PTS: 1
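The proportions in question 7 are all sums of frequencies over a range of particle counts, divided by the sample size. A minimal Python sketch of that calculation, where the helper name `proportion` is an illustrative assumption:

```python
# Frequency of each particle count among the 100 sampled wafers.
freq = {0: 1, 1: 2, 2: 3, 3: 12, 4: 11, 5: 15, 6: 18, 7: 10,
        8: 12, 9: 4, 10: 5, 11: 3, 12: 1, 13: 2, 14: 1}
n = sum(freq.values())

def proportion(condition):
    """Share of wafers whose particle count satisfies the condition."""
    return sum(count for x, count in freq.items() if condition(x)) / n

print(proportion(lambda x: x >= 2))       # at least two particles
print(proportion(lambda x: x >= 6))       # at least six particles
print(proportion(lambda x: 4 <= x <= 9))  # between four and nine, inclusive
print(proportion(lambda x: 4 < x < 9))    # strictly between four and nine
```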
8. The cumulative frequency and cumulative relative frequency for a particular class interval are the sum of
frequencies and relative frequencies, respectively, for that interval and all intervals lying below it. Compute the
cumulative frequencies and cumulative relative frequencies for the following data:
75 89 80 93 64 67 72 70 66 85
89 81 81 71 74 82 85 63 72 81
81 95 84 81 80 70 69 66 60 83
85 98 84 68 90 82 69 72 87 88
ANS:
Class             Frequency   Relative    Cumulative   Cumulative
                              Frequency   Frequency    Relative Frequency
60 – under 65     3           .075        3            .075
65 – under 70     6           .15         9            .225
70 – under 75     7           .175        16           .40
75 – under 80     1           .025        17           .425
80 – under 85     12          .30         29           .725
85 – under 90     7           .175        36           .90
90 – under 95     2           .05         38           .95
95 – under 100    2           .05         40           1.0
PTS: 1
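The cumulative columns in question 8 are running totals of the class frequencies. The Python sketch below recomputes the table; the class width of 5 starting at 60 follows the answer key, while the variable names are illustrative assumptions.

```python
from itertools import accumulate

scores = [75, 89, 80, 93, 64, 67, 72, 70, 66, 85,
          89, 81, 81, 71, 74, 82, 85, 63, 72, 81,
          81, 95, 84, 81, 80, 70, 69, 66, 60, 83,
          85, 98, 84, 68, 90, 82, 69, 72, 87, 88]
n = len(scores)

# Lower class limits: 60, 65, ..., 95; each class is [lo, lo + 5).
lower_limits = list(range(60, 100, 5))
freqs = [sum(lo <= s < lo + 5 for s in scores) for lo in lower_limits]
cum_freqs = list(accumulate(freqs))  # running total of the frequencies

for lo, f, c in zip(lower_limits, freqs, cum_freqs):
    print(f"{lo} - under {lo + 5}: {f} {f / n:.3f} {c} {c / n:.3f}")
```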
9. Consider the following observations on shear strength of a joint bonded in a particular manner:
30.0 4.4 33.1 66.7 81.5 22.2 40.4 16.4 73.7 36.6 109.9
a. Determine the value of the sample mean.
b. Determine the value of the sample median. Why is it so different from the mean?
c. Calculate a trimmed mean by deleting the smallest and largest observations. What is the corresponding
trimming percentage? How does the value of this trimmed mean compare to the mean and median?
ANS:
a. The sum of the n = 11 data points is 514.90, so the sample mean is 514.90/11 = 46.81.
b. The sample size (n = 11) is odd, so there will be a middle value. Sorting from smallest to largest: 4.4 16.4 22.2
30.0 33.1 36.6 40.4 66.7 73.7 81.5 109.9. The sixth value, 36.6, is the middle, or median, value. The mean
differs from the median because the largest sample observations are much further from the median than are the
smallest values.
c. Deleting the smallest (x = 4.4) and largest (x = 109.9) values, the sum of the remaining 9 observations is 400.6. The
trimmed mean is 400.6/9 = 44.51. The trimming percentage is 100(1/11) = 9.1%. The trimmed mean lies between the mean
and median.
PTS: 1
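The three measures of center in question 9 can be verified with Python's standard `statistics` module. This is a small sketch; the variable names are illustrative assumptions.

```python
from statistics import mean, median

strengths = [30.0, 4.4, 33.1, 66.7, 81.5, 22.2, 40.4, 16.4, 73.7, 36.6, 109.9]

x_bar = mean(strengths)
x_med = median(strengths)

# Trimmed mean: drop one observation from each end of the sorted sample.
trimmed = sorted(strengths)[1:-1]
x_trimmed = mean(trimmed)
trim_pct = 100 * 1 / len(strengths)  # one of 11 values removed from each end

print(round(x_bar, 2), x_med, round(x_trimmed, 2), round(trim_pct, 1))
```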
10. A sample of 26 offshore oil workers took part in a simulated escape exercise, resulting in the accompanying data on time
(sec) to complete the escape:
373 370 364 366 364 325 339 393
356 359 363 375 424 325 394 402