DATA MINING MIDTERM EXAM
QUESTIONS AND ANSWERS
Concept Hierarchy - Answer-Used for multiple levels of abstraction
Non-parametric methods of numerosity reduction - Answer-
Histograms
Clustering
A data warehouse differs from an operational database because most data
warehouses have a product orientation and are tuned to handle transactions that update
the database - Answer-False
Human inspection is an important and appropriate method to handle noisy data -
Answer-False
OLTP captures, stores, and processes data from transactions in real time, while
OLAP uses complex queries to analyze aggregated historical data - Answer-True
Multiple warehouses are needed in a database-centric solution. However, integrating
the warehouses is a problem - Answer-True
If you want to handle noisy data, you can use regression - Answer-True
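As a minimal sketch of smoothing noisy data with regression, assuming a roughly linear relationship between two attributes, one could fit a least-squares line and replace each noisy value with its fitted value. The helper names (fit_line, smooth) and the sample values are hypothetical, chosen only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)  # assumes xs are not all equal
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def smooth(xs, ys):
    """Replace each observed y with the value predicted by the fitted line."""
    slope, intercept = fit_line(xs, ys)
    return [slope * x + intercept for x in xs]


# Hypothetical noisy measurements that roughly follow y = 2x
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]
smoothed = smooth(xs, ys)
```

The smoothed values lie exactly on the fitted line, so the random fluctuations in the original values are removed.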
Normalization is used for data transformation - Answer-True
Feature selection is a dimensionality reduction technique - Answer-True
It is not necessary to have a target variable for applying dimensionality reduction in
data reduction - Answer-True
Discretization and concept hierarchy generation divide the range of continuous
attributes into intervals - Answer-True
Distributive functions - Answer-count()
max()
sum()
Algebraic functions - Answer-avg()
min_N()
Holistic functions - Answer-mode()
rank()
median()
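The three categories above can be illustrated with a short sketch. It assumes the data is split into partitions (the partition contents are hypothetical). A distributive function can be computed per partition and then combined, an algebraic function is derived from a fixed number of distributive results, and a holistic function generally needs all of the data at once.

```python
from statistics import median

# Hypothetical data split across three partitions
partitions = [[4, 8, 15], [16, 23], [42]]
all_values = [v for part in partitions for v in part]

# Distributive: apply the function per partition, then combine the partial results
total = sum(sum(part) for part in partitions)
count = sum(len(part) for part in partitions)
largest = max(max(part) for part in partitions)

# Algebraic: avg() is computed from two distributive results, sum() and count()
average = total / count

# Holistic: the median of the per-partition medians is generally NOT the
# median of the full data set, so the whole data set is needed
true_median = median(all_values)
median_of_medians = median([median(part) for part in partitions])
```

With these values the median of the partition medians differs from the true median, which is why median() cannot be computed from small per-partition summaries.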
A data cube is generally used for easily smoothing data - Answer-False
A concept hierarchy defines a sequence of mappings from a set of
low-level concepts to higher-level, more general concepts - Answer-True
What are normalization methods? - Answer-Min-max
Z-score
Decimal scaling
Min-max normalization - Answer-A normalization technique in which values are
shifted and rescaled so that they end up ranging between 0 and 1
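The three normalization methods listed above can be sketched as follows. This is a minimal illustration using the standard textbook formulas; the function names are hypothetical, the z-score version assumes the population standard deviation, and the input values are assumed not to be all identical (otherwise a division by zero occurs).

```python
from statistics import mean, pstdev


def min_max(values, new_min=0.0, new_max=1.0):
    """Shift and rescale values into [new_min, new_max] (default [0, 1])."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


def z_score(values):
    """Express each value as its distance from the mean in standard deviations."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def decimal_scaling(values):
    """Divide by 10**j, where j is the smallest integer with max(|v|) / 10**j < 1."""
    largest = max(abs(v) for v in values)
    j = 0
    while largest / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]
```

For example, min_max([10, 20, 30]) maps the smallest value to 0 and the largest to 1, while decimal_scaling([-986, 917]) divides both values by 1000.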
Equal-width and equal-frequency partitioning are used in... - Answer-Binning
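A minimal sketch of both binning strategies, assuming k bins and a small illustrative list of sorted prices (the function names and data are not from the exam). Equal-width binning splits the value range into intervals of the same size; equal-frequency binning puts roughly the same number of values in each bin.

```python
def equal_width_bins(values, k):
    """Split the range [min, max] into k intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k  # assumes hi > lo
    bins = [[] for _ in range(k)]
    for v in sorted(values):
        # The maximum value would fall just outside the last interval,
        # so clamp its index to the last bin
        idx = min(int((v - lo) / width), k - 1)
        bins[idx].append(v)
    return bins


def equal_frequency_bins(values, k):
    """Split the sorted values into k bins holding about the same number of items."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[i * n // k:(i + 1) * n // k] for i in range(k)]


prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]  # illustrative sample values
width_bins = equal_width_bins(prices, 3)
freq_bins = equal_frequency_bins(prices, 3)
```

With these prices, equal-width bins each cover a range of 10 but hold 2, 3, and 4 values, whereas equal-frequency bins each hold exactly 3 values.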
Generalization - Answer-
Examples of supervised learning - Answer-Credit/loan approval
Fraud detection
What is the objective of unsupervised learning? - Answer-Determine data patterns
Determine data groupings
Examples of supervised learning algorithms - Answer-Neural network
Support vector machines
Reinforcement learning - Answer-Has a rewarding strategy
The ID3 algorithm can also be used for clustering - Answer-False
Two main processes in classification algorithms - Answer-Model construction
Model usage
An attribute used in DT can have both discrete and continuous values - Answer-True
Decision tree (DT) construction - Answer-A terminal (leaf) node holds a class label
An internal node in a DT denotes a test on an attribute
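The roles of internal nodes and leaf nodes can be shown with a small hand-built tree. This is only a sketch of the data structure, not a tree-induction algorithm such as ID3; the class names, attribute names, and values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Leaf:
    label: str  # a terminal node holds a class label


@dataclass
class Node:
    attribute: str  # an internal node tests one attribute
    # One child per possible value of the tested attribute
    children: Dict[str, Union["Node", Leaf]] = field(default_factory=dict)


def classify(tree, record):
    """Follow the attribute tests from the root until a leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.children[record[tree.attribute]]
    return tree.label


# Hypothetical tree for deciding whether to play outside
tree = Node("outlook", {
    "sunny": Node("humidity", {"high": Leaf("no"), "normal": Leaf("yes")}),
    "overcast": Leaf("yes"),
    "rain": Node("windy", {"true": Leaf("no"), "false": Leaf("yes")}),
})

prediction = classify(tree, {"outlook": "sunny", "humidity": "high"})
```

Classification walks one path from the root to a leaf, so only the attributes tested on that path need to be present in the record.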
The k-NN algorithm does more computation at test time than at training time - Answer-
Yes
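A minimal k-NN sketch makes the cost difference visible: training only stores the data, while every prediction computes the distance to all stored points. The class name, Euclidean distance, majority voting, and the sample points are assumptions made for illustration (math.dist requires Python 3.8 or later).

```python
import math
from collections import Counter


class KNN:
    def __init__(self, k=3):
        self.k = k

    def fit(self, points, labels):
        # "Training" just memorizes the data; no model is built
        self.points = list(points)
        self.labels = list(labels)
        return self

    def predict(self, query):
        # All the work happens here: one distance per stored point, then a sort
        distances = sorted(
            (math.dist(point, query), label)
            for point, label in zip(self.points, self.labels)
        )
        votes = Counter(label for _, label in distances[: self.k])
        return votes.most_common(1)[0][0]


# Hypothetical two-class training set
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = ["a", "a", "a", "b", "b", "b"]
model = KNN(k=3).fit(points, labels)
```

Calling model.predict((2, 2)) scans all six stored points before voting, which is why k-NN is often described as a lazy learner.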
For discriminative classifiers, what makes the difference between discriminative and
non-discriminative data? - Answer-P(Y)
Regarding the Bayesian Network as shown in the diagram, what do the nodes and
links in the Bayesian Network Classifier imply? - Answer-Random variables and their
dependencies
Market basket analysis - Answer-Support is a general measure of association
between two itemsets
Association rules are suitable for marketing and sales promotion
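Support, together with the related confidence measure, can be computed directly from a list of transactions. The sketch below uses the standard definitions: support is the fraction of transactions containing an itemset, and confidence of A -> B is support(A and B) divided by support(A). The function names and the basket contents are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical market baskets
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

s = support(transactions, {"diapers", "beer"})
c = confidence(transactions, {"diapers"}, {"beer"})
```

Here diapers and beer appear together in 3 of 5 baskets, and 3 of the 4 baskets containing diapers also contain beer, so a rule such as diapers -> beer could inform a sales promotion.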