BUSINESS INTELLIGENCE & DATA
MINING EXAM QUESTIONS WITH
COMPLETE SOLUTIONS
True or False:
Learning things that are true but not useful can serve as a validation that the data
mining techniques are working and the data is reasonably accurate. - Answer-True
An insurance company tracks the total insurance claim amount (in dollars) for all
customer claims using a variable called Amount. How would you display the
distribution of the variable Amount?
Select one:
a. Use a bar chart with Amount given the role of X.
b. Use a histogram with Amount given the role of X.
c. Use a histogram with Amount given the role of Category
d. Use a bar chart with Amount given the role of Category - Answer-Use a histogram
with Amount given the role of X.
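A minimal sketch of why a histogram suits a continuous variable such as Amount: the values are grouped into equal-width bins and the count in each bin is plotted. This is plain Python, not the charting tool the question refers to; the function name `histogram_bins`, the bin width, and the sample claim amounts are all made up for illustration.

```python
from collections import Counter

def histogram_bins(values, bin_width):
    """Count how many values fall into each equal-width bin.

    Returns a dict mapping each bin's lower bound to its count.
    """
    counts = Counter(int(v // bin_width) for v in values)
    return {b * bin_width: counts[b] for b in sorted(counts)}

# Hypothetical claim amounts in dollars (illustrative only).
amount = [120, 450, 480, 900, 1250, 1300, 1320, 2100, 2150, 4800]

# Text rendering of the histogram: one '#' per claim in the bin.
for lower, count in histogram_bins(amount, 1000).items():
    print(f"{lower:>5}-{lower + 999:<5} {'#' * count}")
```

A bar chart, by contrast, would draw one bar per distinct value, which is only useful for categorical variables.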
True or False:
The most important difference between directed and undirected data mining is that
undirected data mining does not use a specific target variable. - Answer-True
True or False:
Undirected data mining requires even more human understanding than directed
techniques. - Answer-True
One advantage of undirected data mining is that it can be fully automated. - Answer-
False
In geometric terms, clusters should be widely spaced from each other. - Answer-
True;
Cluster members should be close to each other. Clusters, themselves, should be
widely spaced.
k-means clustering is one of the most popular techniques for clustering analysis. -
Answer-True;
K-means clustering is one of the most popular techniques for clustering analysis.
_____ for observations within each segment. This leads to segments that are
internally _____. - Answer-similar; homogeneous
_____ for observations between segments. This leads to segments that are
externally _______. - Answer-different; heterogeneous
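The idea of segments that are internally homogeneous and externally heterogeneous can be sketched with a bare-bones k-means loop. This is an illustrative implementation under simple assumptions (Euclidean distance, caller-supplied starting centroids, a fixed number of iterations), not the algorithm as implemented in any particular data mining package; the function name `kmeans` and the sample points are hypothetical.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Basic k-means: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Two visibly separated groups of hypothetical 2-D observations.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(1, 1), (8, 8)])

# Members sit close to their own centroid; the centroids are far apart.
print(clusters)
print(math.dist(centroids[0], centroids[1]))
```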
When selecting variables to use in a cluster analysis, you should choose variables
that (select all that apply) - Answer--are meaningful to the analysis objective
-have low correlation between variables
-are predominantly interval
-have skewness between -2 and 2
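The skewness and correlation criteria above can be checked numerically. The sketch below assumes the moment-based (population) definition of skewness and the Pearson correlation coefficient; a given software package may use a slightly different skewness formula. The helper names and the sample data are hypothetical.

```python
import statistics

def skewness(xs):
    """Moment-based (population) skewness: the mean cubed z-score."""
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum(((x - mean) / sd) ** 3 for x in xs) / len(xs)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical inputs: age is symmetric, income has one extreme value.
age = [20, 25, 30, 35, 40, 45, 50, 55]
income = [30, 32, 35, 38, 40, 41, 45, 400]
age_months = [a * 12 for a in age]  # redundant copy of age

for name, xs in [("age", age), ("income", income)]:
    ok = -2 <= skewness(xs) <= 2
    print(name, "skewness within [-2, 2]:", ok)

# Perfectly correlated inputs carry the same information; keep only one.
print("corr(age, age_months):", pearson(age, age_months))
```

A variable that fails the skewness check is usually transformed (for example with a log) or filtered rather than fed to the clustering step as-is.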
The filter node removes ____________________ from the analysis. - Answer-
observations
Market Segmentation - Answer-"...is the process of dividing customers into groups or
segments of customers whose needs vary little within the group but vary greatly
among groups..."
-The primary goal is to better satisfy customer needs or wants.
Tailor Marketing Strategies - Answer-Identify people with similar patterns of past
purchases or demographics
Class Levels Count Threshold - Answer--change this to "2" so that interval variables
with only 2 levels (1/2, 0/1) will be automatically assigned the level of binary.
Reject Levels Count Threshold - Answer-detects the number of class levels of nominal
variables and assigns a role of Rejected to those with class counts above the selected
threshold (default=20) (a column of first names of 1000 students would be rejected)
When we determine if a variable should be input, rejected, id, etc., we are
determining the variable's - Answer-role
When we are determining the variable type we are determining the - Answer-
Measurement Level
Nominal, Interval, Binary
Alphanumeric - Answer-2 levels--> binary
>20 levels--> rejected
2-20 levels--> nominal
Numeric - Answer-2 levels--> binary variable
>20 levels--> interval
2-20 levels--> nominal
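The level-count rules for alphanumeric and numeric variables above can be written as one small function. This is a sketch of the rules as stated on these cards, not the software's actual code; the function name `assign_level` and its parameter are hypothetical, and the rejected outcome is strictly a role rather than a measurement level.

```python
def assign_level(values, class_levels_threshold=20):
    """Apply the level-count rules from the cards above.

    Returns 'binary', 'nominal', 'interval', or 'rejected'.
    Assumption: a column counts as numeric only if every value
    is an int or float; anything else is treated as alphanumeric.
    """
    levels = len(set(values))
    numeric = all(isinstance(v, (int, float)) for v in values)
    if levels == 2:
        return "binary"
    if levels > class_levels_threshold:
        # Many distinct values: numeric becomes interval,
        # alphanumeric is rejected (for example, first names).
        return "interval" if numeric else "rejected"
    return "nominal"

print(assign_level([0, 1, 1, 0]))                         # two levels
print(assign_level(list(range(100))))                     # numeric, many levels
print(assign_level([f"name{i}" for i in range(1000)]))    # text, many levels
print(assign_level(["red", "green", "blue"]))             # text, few levels
```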
Fixed costs of a direct marketing campaign are variable. - Answer-False;
Fixed costs are a fixed number - not dependent on volume or other factors. Variable
costs are the ones that change with volume.
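The distinction can be shown with simple arithmetic: total campaign cost is the fixed cost plus a per-piece variable cost times the number of pieces sent. The function name and the dollar figures below are hypothetical.

```python
def campaign_cost(fixed_cost, cost_per_piece, volume):
    """Total cost = fixed cost + (variable cost per piece * volume)."""
    return fixed_cost + cost_per_piece * volume

# Hypothetical campaign: $5,000 setup, $0.75 per mail piece.
small = campaign_cost(5000, 0.75, 10_000)
large = campaign_cost(5000, 0.75, 20_000)

# Doubling the volume doubles only the variable part; the fixed part stays put.
print(small, large)
```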
Response modeling scores prospects on when they respond to a direct marketing
campaign. - Answer-False;
Response modeling scores prospects on their likelihood to respond to a direct
marketing campaign.
Which of the following would be considered nominal variables?