Data Mining Exam 2 Questions with
Complete Solutions
Which of the following is not a quantitative graphical method? A. Histogram B. Bar
Chart C. Scatter Diagram D. Box Plot E. Stem-and-Leaf Diagram - Answer-Answer:
B. Bar Chart. Bar charts summarize qualitative (categorical) data. (Chapter 4)
True or False: A histogram is a tabular method that is used on quantitative data. -
Answer-Answer: False. Histograms are used on quantitative data, but they are a
graphical method. (Slide 3 in Chapter 3)
All of the following are steps in Factor Analysis except for:
A. Factor extraction
B. Factor rotation
C. Compute correlation matrix for all variables
D. Final decision about underlying factors
E. All of the above are steps in factor analysis - Answer-Answer: E. All of the
provided answers are steps in factor analysis. (Slide 23 in Chapter 4)
QUESTION: Bar charts are useful for comparing multiple statistics across various
ranges. - Answer-ANSWER: FALSE - Bar charts are really only useful for comparing
a single statistic. The statistic that is compared can be an average, count, or
percentage.
QUESTION: Which of the following is NOT a step used when incorporating costs
and benefits into Lift Charts?
A) For each record, record the cost (benefit) associated with the actual outcome.
B) For the highest-probability (i.e., first) record, the value in step 2 is the y
coordinate of the first point on the lift chart. The x coordinate is index number 1.
C) Sort the records in order of predicted probability of success (where success =
belonging to the class of interest).
D) The reference line is a parabola from the origin to the point y = total net benefit
and x = N, where N = number of records. - Answer-ANSWER: D) - The correct step is
as follows: The reference line is a STRAIGHT LINE from the origin to the point
y = total net benefit and x = N, where N = number of records.
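The lift chart steps in the question above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `cost_benefit_lift`, the tuple layout, and the sample benefit values are hypothetical and do not come from the course material.

```python
def cost_benefit_lift(records):
    """Build a cost/benefit lift curve and its straight reference line.

    records: list of (predicted_probability, net_benefit_of_actual_outcome)
    tuples. Both the layout and the name are assumptions for this sketch.
    """
    if not records:
        return [], []
    # Sort by predicted probability of success, highest first.
    ranked = sorted(records, key=lambda r: r[0], reverse=True)
    points = []
    cumulative = 0.0
    for index, (_, benefit) in enumerate(ranked, start=1):
        # y is the cumulative net benefit; x is the record's index number.
        cumulative += benefit
        points.append((index, cumulative))
    n = len(ranked)
    total = cumulative
    # Reference line: straight from the origin to (N, total net benefit).
    reference = [(x, total * x / n) for x in range(n + 1)]
    return points, reference


# Hypothetical data: benefit 10 for an actual success, cost 1 otherwise.
sample = [(0.9, 10.0), (0.2, -1.0), (0.7, 10.0), (0.4, -1.0)]
points, reference = cost_benefit_lift(sample)
print(points)
print(reference)
```

The first point uses x = 1 and the benefit of the highest-probability record, and the reference line ends at the same total as the lift curve.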
Distribution Plots display how many of each value occur in a data set. - Answer-True
Multiple Choice
Descriptive Statistics are the ____, ____, and ____ methods used to summarize and
present data.
a.) graphical, quantitative, numerical
b.) tabular, graphical, and numerical
c.) numerical, statistical, graphical
d.) none of these - Answer-Answer: B
True or False: Qualitative data are numerical values that indicate how much or how
many. - Answer-Answer: False. Qualitative data use labels or names to identify
categories of like items. (Chap 3, slide 2)
For numerical measures, which of the following is not a measure of location?
A. Mean
B. Median
C. Quartiles
D. Interquartile range - Answer-Answer: D. The interquartile range is a measure of
variability. The measures of location are as follows: mean, median, mode,
percentiles, and quartiles. (Chap 4, slide 2)
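The distinction between location and variability can be checked with Python's standard `statistics` module. The data values below are made up for illustration, and `statistics.quantiles` uses its default "exclusive" method, which may differ slightly from the quartile rule used in the course slides.

```python
import statistics

# Hypothetical sample, used only for illustration.
data = [2, 4, 4, 5, 7, 9, 10, 12]

# Measures of location: where the data sit.
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
q1, q2, q3 = statistics.quantiles(data, n=4)  # three quartile cut points

# Measure of variability: how spread out the middle 50% of the data is.
iqr = q3 - q1

print("mean:", mean, "median:", median, "mode:", mode)
print("quartiles:", q1, q2, q3)
print("interquartile range:", iqr)
```

The quartiles locate points within the data, while the interquartile range is a difference between two of them, which is why it describes spread rather than location.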
Most accuracy measures are derived from the classification matrix. - Answer-
Answer: TRUE.
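As a minimal sketch of how accuracy measures follow from the classification matrix, the code below counts the four cells for a two-class problem and derives common measures from them. The function names and the sample labels are hypothetical, and the sketch assumes each class appears at least once so that no division by zero occurs.

```python
def classification_matrix(actual, predicted, positive=1):
    """Count the four cells of a two-class classification matrix."""
    cells = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if a == positive:
            cells["tp" if p == positive else "fn"] += 1
        else:
            cells["fp" if p == positive else "tn"] += 1
    return cells


def accuracy_measures(cells):
    """Derive common accuracy measures from the matrix cells."""
    total = sum(cells.values())
    accuracy = (cells["tp"] + cells["tn"]) / total
    return {
        "accuracy": accuracy,
        "error_rate": 1 - accuracy,
        "sensitivity": cells["tp"] / (cells["tp"] + cells["fn"]),
        "specificity": cells["tn"] / (cells["tn"] + cells["fp"]),
    }


# Hypothetical actual and predicted class labels (1 = class of interest).
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 0, 1, 0, 1]
cells = classification_matrix(actual, predicted)
measures = accuracy_measures(cells)
print(cells)
print(measures)
```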
Principal components analysis (PCA) is a useful procedure for reducing the number
of predictors in the model by analyzing the:
A. Output Variables
B. Input Variables
C. Input and Output Variables
D. Categorical Variables - Answer-Answer: B - Input Variables (Chapter 4, pg. 78)
True or False: Pivot tables can only be used for one variable. - Answer-Answer:
False. Pivot tables can be used for multiple variables. For categorical variables we
obtain a breakdown of the records by the combination of categories.
(Chapter 4, page 75, Pivot tables)
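The "breakdown of the records by the combination of categories" can be illustrated with a small count-based pivot over two categorical variables. The field names `region` and `owns_home`, the records, and the helper `pivot_counts` are hypothetical and serve only as an example.

```python
from collections import Counter

# Hypothetical records; field names and values are illustrative only.
records = [
    {"region": "East", "owns_home": "yes"},
    {"region": "East", "owns_home": "no"},
    {"region": "West", "owns_home": "yes"},
    {"region": "East", "owns_home": "yes"},
    {"region": "West", "owns_home": "no"},
    {"region": "West", "owns_home": "no"},
]


def pivot_counts(rows, row_var, col_var):
    """Count records for each observed combination of two categorical variables."""
    return Counter((row[row_var], row[col_var]) for row in rows)


counts = pivot_counts(records, "region", "owns_home")
for (region, owns_home), n in sorted(counts.items()):
    print(region, owns_home, n)
```

Each key in `counts` is one combination of categories, which corresponds to one cell of a two-variable pivot table.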
________ is a way to reduce the number of predictors in a model by analyzing input
variables.
a. Pivot Tables
b. Principal Component Analysis
c. Correlation Analysis
d. Dimension Reduction - Answer-Answer: b. Principal Component Analysis. Principal
Component Analysis is a way to reduce the number of predictors in a model by
analyzing input variables. It is especially useful when we have highly correlated
subsets of measurements.
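To see why highly correlated inputs are good candidates for PCA, the sketch below works through the two-variable case by hand: it builds the 2x2 covariance matrix of two input variables and uses the closed-form eigenvalues of a symmetric 2x2 matrix to find the share of total variance captured by the first principal component. The data and the function name are hypothetical, and the sketch uses raw covariances; in practice the variables are often standardized first.

```python
import math
import statistics


def first_component_share(x, y):
    """Share of total variance explained by the first principal component
    of two input variables, from the eigenvalues of their covariance matrix."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    var_x = statistics.variance(x)  # sample variance
    var_y = statistics.variance(y)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)
    # Eigenvalues of the symmetric matrix [[var_x, cov_xy], [cov_xy, var_y]].
    center = (var_x + var_y) / 2
    radius = math.sqrt(((var_x - var_y) / 2) ** 2 + cov_xy ** 2)
    largest, smallest = center + radius, center - radius
    return largest / (largest + smallest)


# Hypothetical, highly correlated input variables.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
share = first_component_share(x, y)
print(f"variance explained by first component: {share:.4f}")
```

When the two inputs move together this closely, one component carries almost all of the variance, so a single derived predictor could stand in for both.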