Data Mining Midterm Exam Questions and Answers
Market basket analysis - Answer-Support is the general measure of association
between the two item sets
Association rules are suitable for marketing and sales promotion
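As an illustration of the support measure named in the card above, here is a minimal sketch in Python. The function names `support` and `confidence` and the sample baskets are assumptions made for this example, not part of the exam material.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print(support(baskets, {"bread", "milk"}))       # 2 of 4 baskets contain both
print(confidence(baskets, {"bread"}, {"milk"}))  # of the bread baskets, share with milk
```

A rule such as bread -> milk is usually kept only when both its support and its confidence exceed chosen thresholds.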
Data clustering - Answer-The approach to identifying frequently occurring terms in a
document
What data mining task would be best for spam detection? - Answer-Classification
What is the best data mining technique for predicting sales amount? - Answer-
Regression
Classification is a predictive task - Answer-True
Result of KDD - Answer-Useful information
Meaningful data
Which of the following methods does not involve data mining?
AI
Statistics
Database
Information - Answer-Information
What would final data or mined data look like? - Answer-A few facts
Numbers
Texts
Which of the following is an example of data mining?
A database query that displays information from a table
The extraction of data itself
Query a web search engine for information about "Amazon"
What is sold more on a particular day than on other days? - Answer-What is sold
more on a particular day than on other days?
Documents can be better grouped by... - Answer-Clustering
A test set is used in ______ to determine the accuracy of the model - Answer-
Classification
What data mining method can be used to detect and resolve data value conflicts? -
Answer-Truth Discovery
Data mining vs data warehouse - Answer-Data mining is a process of extracting data
from large data sets
Data warehousing is a process of pooling all the relevant data together
Example of relational OLAP - Answer-Metacube
Operational data and data warehouse - Answer-Operational data is used as a source
for the data warehouse
A well-accepted multidimensional view of measuring data quality - Answer-Accuracy,
consistency, and accessibility
Major tasks in data preprocessing? - Answer-Data cleaning
Data integration
Data transformation
Data reduction
Data cleaning methods - Answer-Binning
Clustering
Regression
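Binning, the first cleaning method listed above, can be sketched as smoothing by bin means. This is one common variant (equal-frequency bins, each value replaced by its bin mean); the function name `smooth_by_bin_means` and the sample prices are assumptions for illustration.

```python
def smooth_by_bin_means(values, bin_size):
    """Sort the values, split them into bins of bin_size items,
    and replace every value with the mean of its bin."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        mean = sum(bin_values) / len(bin_values)
        smoothed.extend([mean] * len(bin_values))
    return smoothed


# Hypothetical price data containing some noise.
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
print(smooth_by_bin_means(prices, 3))
# [9.0, 9.0, 9.0, 22.0, 22.0, 22.0, 29.0, 29.0, 29.0]
```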
Concept Hierarchy - Answer-Used for multiple levels of abstraction
Non-parametric methods of numerosity data reduction techniques - Answer-
Histograms
Clustering
A data warehouse differs from an operational database because most data
warehouses have a product orientation and are tuned to handle transactions that update
the database - Answer-False
Human inspection is an important and appropriate method to handle noisy data -
Answer-False
OLTP captures, stores, and processes data from transactions in real time, while
OLAP uses complex queries to analyze aggregated historical data - Answer-True
Multiple warehouses are needed in a database-centric solution. However, integrating
the warehouses is a problem - Answer-True
If you want to handle noisy data, you can use regression - Answer-True
Normalization is used to do data transformation - Answer-True
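One common form of normalization is min-max scaling, sketched below. The card does not say which normalization is meant, so treating it as min-max scaling is an assumption, as are the function name `min_max_normalize` and the sample incomes.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so the smallest maps to new_min
    and the largest maps to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical; map them all to new_min.
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (v - lo) * scale for v in values]


# Hypothetical income attribute.
incomes = [12000, 73600, 98000]
print(min_max_normalize(incomes))
```

The minimum maps to 0, the maximum to 1, and 73600 lands at roughly 0.716.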
Feature selection is a dimensionality reduction technique - Answer-True
It is not necessary to have a target variable for applying dimensionality reduction in
data reduction - Answer-True
Discretization and concept hierarchy generation divide the range of continuous
attributes into intervals - Answer-True
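Dividing a continuous range into intervals can be sketched with equal-width binning, which is only one of several discretization schemes. The function name `equal_width_bins` and the sample ages are assumptions for illustration.

```python
def equal_width_bins(values, k):
    """Assign each value the index (0..k-1) of the equal-width
    interval it falls into between min(values) and max(values)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    labels = []
    for v in values:
        if width == 0:
            labels.append(0)
        else:
            # The maximum value would land in bin k, so clamp it to k - 1.
            labels.append(min(int((v - lo) / width), k - 1))
    return labels


# Hypothetical continuous attribute.
ages = [13, 15, 22, 25, 33, 35, 46, 52, 70]
print(equal_width_bins(ages, 3))
# [0, 0, 0, 0, 1, 1, 1, 2, 2]
```

Each interval index could then be given a label such as "young", "middle-aged", or "senior" to form one level of a concept hierarchy.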
Distributive functions - Answer-count()
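A function is distributive when applying it to each partition of the data and then combining the partial results gives the same answer as applying it to the whole data set. The sketch below shows this for count(); the helper names `partition` and `distributed_count` and the sample data are assumptions for illustration.

```python
def partition(data, n_parts):
    """Split data into n_parts disjoint partitions (round-robin)."""
    return [data[i::n_parts] for i in range(n_parts)]


def distributed_count(partitions):
    """Count each partition separately, then add the partial counts."""
    return sum(len(p) for p in partitions)


# Hypothetical measure values stored across three partitions.
sales = [120, 75, 310, 42, 98, 260, 15, 180, 77, 64]
parts = partition(sales, 3)

print(distributed_count(parts) == len(sales))  # True
```

sum(), min(), and max() combine across partitions in the same way, whereas a measure such as the median cannot be computed from per-partition results alone.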