Data Collection
Section 1.1
1. Statistics is the science of collecting, organizing, summarizing and analyzing information in order to draw conclusions and answer questions. In addition, statistics is about providing a measure of confidence in any conclusions.
2. The population is the group to be studied as defined by the research objective. A sample is any subset of the population.
3. Individual
4. Descriptive; Inferential
5. Statistic; Parameter
6. Variables
7. 18% is a parameter because it describes a population (all of the governors).
8. 72% is a parameter because it describes a population (the entire class).
9. 32% is a statistic because it describes a sample (the high school students surveyed).
10. 10.0% is a statistic because it describes a sample (the youths surveyed).
11. 0.366 is a parameter because it describes a population (all of Ty Cobb’s at-bats).
12. 39 years, 11 months, 15 days is a parameter because it describes a population (all the men who have walked on the moon).
13. 23% is a statistic because it describes a sample (the 6,076 adults studied).
14. 44% is a statistic because it describes a sample (the 100 adults interviewed).
15. Qualitative    16. Quantitative
17. Quantitative    18. Qualitative
19. Quantitative    20. Quantitative
21. Qualitative    22. Qualitative
23. Discrete    24. Continuous
25. Continuous    26. Discrete
27. Continuous    28. Continuous
29. Discrete    30. Continuous
31. Nominal    32. Ordinal
33. Ratio    34. Interval
35. Ordinal    36. Nominal
37. Ratio    38. Interval
39. The population consists of all teenagers 13 to 17 years old who live in the United States. The sample consists of the 1,028 teenagers 13 to 17 years old who were contacted by the Gallup Organization.
40. The population consists of all bottles of Coca-Cola filled by that particular machine on October 15. The sample consists of the 50 bottles of Coca-Cola that were selected by the quality control manager.
41. The population consists of all of the soybean plants in this farmer’s crop. The sample consists of the 100 soybean plants that were selected by the farmer.
42. The population consists of all households within the United States. The sample consists of the 50,000 households that are surveyed by the U.S. Census Bureau.
43. The population consists of all women 27 to 44 years of age with hypertension. The sample consists of the 7,373 women 27 to 44 years of age with hypertension who were included in the study.
44. The population consists of all full-time students enrolled at this large community college. The sample consists of the 128 full-time students who were surveyed by the administration.
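The parameter/statistic distinction in answers 7 to 14 and the population/sample distinction in answers 39 to 44 can be sketched in a few lines of Python. The scores below are hypothetical values invented for illustration (they do not come from any exercise), and the seed is an arbitrary choice that only makes the sketch repeatable.

```python
import random
from statistics import mean

# Hypothetical population: exam scores for an entire class of 25 students.
population = [62, 71, 55, 88, 93, 77, 68, 81, 74, 90,
              59, 84, 79, 66, 95, 73, 70, 87, 61, 82,
              76, 69, 91, 64, 85]

# A parameter summarizes the whole population.
parameter = mean(population)

# A sample is any subset of the population; here, 8 students chosen at random.
rng = random.Random(2014)  # arbitrary fixed seed so the sketch is repeatable
sample = rng.sample(population, 8)

# A statistic summarizes only the sample.
statistic = mean(sample)

print(f"Parameter (population mean): {parameter:.2f}")
print(f"Statistic (sample mean):     {statistic:.2f}")
```

The same summary (a mean) is a parameter when it is computed from every individual and a statistic when it is computed from a subset.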
Copyright © 2014 Pearson Education, Inc.
Chapter 1: Data Collection
45. Individuals: Motorola Droid X, Motorola Droid 2, Apple iPhone 4, Samsung Epic 4G, Samsung Captivate.
    Variables: Weight (ounces), Service Provider, Depth (inches).
    Data for weight: 5.47, 5.96, 4.8, 5.5, 4.5 (ounces);
    Data for service provider: Verizon, Verizon, ATT, Sprint, ATT;
    Data for depth: 0.39, 0.53, 0.37, 0.6, 0.39 (inches).
    The variable weight is continuous; the variable service provider is qualitative; the variable depth is continuous.
46. Individuals: 3 Series, 5 Series, 6 Series, 7 Series, X3, Z4 Roadster.
    Variables: Body Style, Weight (lb), Number of Seats.
    Data for body style: Coupe, Sedan, Convertible, Sedan, Sport utility, Coupe;
    Data for weight: 3362, 4056, 4277, 4564, 4012, 3505 (lb);
    Data for number of seats: 4, 5, 4, 5, 5, 2.
    The variable body style is qualitative; the variable weight is continuous; the variable number of seats is discrete.
47. Individuals: Alabama, Colorado, Indiana, North Carolina, Wisconsin.
    Variables: Minimum age for Driver’s License (unrestricted), mandatory belt-use seating positions, maximum allowable speed limit (rural interstate) in 2007.
    Data for minimum age for driver’s license: 17, 17, 18, 16, 18;
    Data for mandatory belt-use seating positions: front, front, all, all, all;
    Data for maximum allowable speed limit (rural interstate) 2007: 70, 75, 70, 70, 65 (mph).
    The variable minimum age for driver’s license is continuous; the variable mandatory belt-use seating positions is qualitative; the variable maximum allowable speed limit (rural interstate) 2007 is continuous (although only discrete values are typically chosen for speed limits).
48. Individuals: Apple iPod Touch, Zune HD, SanDisk Sansa Clip+, Sony X-Series Walkman, Apple iPod Nano.
    Variables: Rating, Memory (GB), Audio Playback Time (hours).
    Data for rating: Outstanding, Excellent, Excellent, Excellent, Very Good;
    Data for memory: 32, 32, 4, 16, 8 (GB);
    Data for audio playback time: 40, 33, 15, 33, 24 (hours).
    The variable rating is qualitative; the variable memory is discrete (because memory ultimately comes down to a finite number of bits available); the variable audio playback time is continuous (playback time is measured, but the data have been rounded to the nearest whole hour).
49. (a) The research objective is to determine if adolescents who smoke have a lower IQ than nonsmokers.
    (b) The population is all adolescents aged 18-21. The sample consisted of 20,211 18-year-old Israeli military recruits.
    (c) Descriptive statistics: The average IQ of the smokers was 94, and the average IQ of nonsmokers was 101.
    (d) The conclusion is that individuals with a lower IQ are more likely to choose to smoke.
50. (a) The research objective is to determine if the application of duct tape is as effective as cryotherapy in the treatment of common warts.
    (b) The population is all people with warts. The sample consisted of 51 patients with warts.
    (c) Descriptive statistics: 85% of patients in group 1 and 60% of patients in group 2 had complete resolution of their warts.
    (d) The conclusion is that duct tape is significantly more effective in treating warts than cryotherapy.
51. (a) The research objective is to determine the proportion of adult Americans who believe the Federal government wastes 51 cents or more of every dollar.
    (b) The population is all adult Americans aged 18 years or older.
    (c) The sample is the 1026 American adults aged 18 years or older who were surveyed.
Section 1.1: Introduction to the Practice of Statistics
    (d) Descriptive statistics: Of the 1026 individuals surveyed, 35% indicated that 51 cents or more is wasted.
    (e) From this study, one can infer that many Americans believe the Federal government wastes much of the money collected in taxes.
52. (a) The research objective is to determine what proportion of the population of employees in the United States are currently participating in the employer-sponsored automatic payroll deduction for a 401(k) plan to save for retirement.
    (b) The population is all employees in the United States.
    (c) The sample is the 1172 employees who were surveyed.
    (d) Descriptive statistics: 27% of the 1172 employees surveyed indicated that they were participating in the employer-sponsored automatic payroll deduction for a 401(k) plan to save for retirement.
    (e) The conclusion is that the majority of employees do not participate in the employer-sponsored automatic payroll deduction for a 401(k) plan to save for retirement.
53. Jersey number is nominal (the numbers generally indicate a type of position played). However, if the researcher feels that lower caliber players received higher numbers, then jersey number would be ordinal since players could be ranked by their number.
54. (a) Nominal; the ticket number is categorized as a winner or a loser.
    (b) Ordinal; the ticket number gives an indication as to the order of arrival of guests.
    (c) Ratio; the implication is that the ticket number gives an indication of the number of people attending the party.
55. (a) The research question is to determine the role that TV watching by children younger than 3 plays in future attention problems for the children.
    (b) The population of interest is all children under the age of 3 years.
    (c) The sample consisted of the 967 children whose parents answered questions about TV habits and behavior issues.
    (d) Descriptive statistic: The risk of attention problems five years later doubled for each hour per day kids under 3 watched violent child-oriented programs.
    (e) Inference: Children under the age of 3 years should not watch television. If they do watch, it should be educational and not violent child-oriented entertainment. Shows that are violent double the risk of attention problems for each additional hour watched each day. Even educational programs can result in a substantial risk for attention problems.
56. Quantitative variables are numerical measures such that meaningful arithmetic operations can be performed on the values of the variable. Qualitative variables describe an attribute or characteristic of the individual that allows researchers to categorize the individual.
57. The values of a discrete random variable result from counting. The values of a continuous random variable result from a measurement.
58. The four levels of measurement of a variable are nominal, ordinal, interval, and ratio. Examples: brand of clothing (nominal); size of a car, such as small, mid-size, or large (ordinal); temperature in degrees Celsius (interval); number of students in a class (ratio). (Examples will vary.)
59. We say data vary because when we draw a random sample from a population, we do not know which individuals will be included. If we were to take another random sample, we would have different individuals and therefore different data. This variability affects the results of a statistical analysis because the results would differ if the study were repeated.
60. The Process of Statistics is to (1) identify the research objective, which means to determine what should be studied and what we hope to learn; (2) collect the data needed to answer the research question, which is typically done by taking a random sample from a population; (3) describe the data, which is done by presenting descriptive statistics; and (4) perform inference, in which the results are generalized to a larger population.
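The point of answer 59, that repeating a study with a new random sample gives different data and therefore different results, can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population (the integers 1 to 1000), the sample size of 30, and the helper name `sample_mean` are all invented for illustration.

```python
import random
from statistics import mean

# Hypothetical population of 1,000 measurements (the values 1 to 1000).
population = list(range(1, 1001))

def sample_mean(seed, n=30):
    """Draw a simple random sample of size n and return its mean."""
    rng = random.Random(seed)  # the seed stands in for "which sample we happened to draw"
    return mean(rng.sample(population, n))

# Repeat the "study" five times, each time with a fresh random sample.
means = [sample_mean(seed) for seed in range(5)]

print("Population mean:", mean(population))  # 500.5
print("Sample means:", means)
```

Each run of `sample_mean` plays the role of one repetition of the study; the population mean stays fixed while the sample means move around it.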
61. Age could be considered a discrete random variable. A random variable can be “discretized” by allowing, for example, only whole numbers to be recorded.
Section 1.2
1. The response variable is the variable of interest in a research study. An explanatory variable is a variable that affects (or explains) the value of the response variable. In research, we want to see how changes in the value of the explanatory variable affect the value of the response variable.
2. An observational study uses data obtained by studying individuals in a sample without trying to manipulate or influence the variable(s) of interest. In a designed experiment, a treatment is applied to the individuals in a sample in order to isolate the effects of the treatment on a response variable. Only an experiment can establish causation between an explanatory variable and a response variable. Observational studies can indicate a relationship, but cannot establish causation.
3. Confounding exists in a study when the effects of two or more explanatory variables are not separated. So any relation that appears to exist between a certain explanatory variable and the response variable may be due to some other variable or variables not accounted for in the study. A lurking variable is a variable not accounted for in a study, but one that affects the value of the response variable.
4. The choice between an observational study and an experiment depends on the circumstances involved. Sometimes there are ethical reasons why an experiment cannot be conducted. Other times the researcher may conduct an observational study first to validate a belief prior to investing a large amount of time and money into a designed experiment. A designed experiment is preferred if ethics, time, and money are not an issue.
5. Cross-sectional studies collect information at a specific point in time (or over a very short period of time). Case-control studies are retrospective (they look back in time). Also, individuals that have a certain characteristic (such as cancer) in a case-control study are matched with those that do not have the characteristic. Case-control studies are typically superior to cross-sectional studies. They are relatively inexpensive, provide individual level data, and give longitudinal information not available in a cross-sectional study.
6. A cohort study identifies the individuals to participate and then follows them over a period of time. During this period, information about the individuals is gathered, but there is no attempt to influence the individuals. Cohort studies are superior to case-control studies because cohort studies do not require recall to obtain the data.
7. There is a perceived benefit to obtaining a flu shot, so there are ethical issues in intentionally denying certain seniors access to the treatment.
8. A retrospective study looks at data from the past either through recall or existing records. A prospective study gathers data over time by following the individuals in the study and recording data as they occur.
9. This is an observational study because the researchers merely observed existing data. There was no attempt by the researchers to manipulate or influence the variable(s) of interest.
10. This is an experiment because the researchers intentionally changed the value of the explanatory variable (medication dose) to observe a potential effect on the response variable (cancer growth).
11. This is an experiment because the explanatory variable (teaching method) was intentionally varied to see how it affected the response variable (score on proficiency test).
12. This is an observational study because no attempt was made to influence the variable of interest. Voting choices were merely observed.
13. This is an observational study because the survey only observed preference of Coke or Pepsi. No attempt was made to manipulate or influence the variable of interest.
14. This is an experiment because the researcher intentionally imposed treatments on individuals in a controlled setting.
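The description of confounding and lurking variables in answer 3 of Section 1.2 can be made concrete with a small worked example. The sketch below uses hypothetical counts invented purely for illustration (coffee drinking as the explanatory variable, a disease as the response, smoking as the lurking variable); none of the numbers come from the exercises, and the helper name `rate` is likewise an assumption.

```python
from fractions import Fraction

# Hypothetical counts, invented for illustration: (number with disease, group size),
# keyed by (lurking variable: smoking status, explanatory variable: coffee habit).
counts = {
    ("smoker", "coffee"):       (24, 80),
    ("smoker", "no coffee"):    (6, 20),
    ("nonsmoker", "coffee"):    (2, 20),
    ("nonsmoker", "no coffee"): (8, 80),
}

def rate(groups):
    """Disease rate pooled over the given (smoking, coffee) groups."""
    cases = sum(counts[g][0] for g in groups)
    total = sum(counts[g][1] for g in groups)
    return Fraction(cases, total)

# Ignoring smoking, coffee drinkers appear to have a higher disease rate.
coffee = rate([("smoker", "coffee"), ("nonsmoker", "coffee")])           # 26/100
no_coffee = rate([("smoker", "no coffee"), ("nonsmoker", "no coffee")])  # 14/100
print("overall:", float(coffee), "vs", float(no_coffee))

# Within each smoking group, the two rates are identical.
for smoking in ("smoker", "nonsmoker"):
    with_coffee = rate([(smoking, "coffee")])
    without_coffee = rate([(smoking, "no coffee")])
    print(f"{smoking}:", float(with_coffee), "vs", float(without_coffee))
```

In these made-up data the apparent relation between coffee and disease disappears once smoking is accounted for, which is the situation answer 3 describes: an observational study that ignored smoking could indicate a relationship without establishing causation.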