QUESTIONS AND VERIFIED ANSWERS
◉ What are the main data mining application areas? Answer:
-Customer relationship management
-Banking
-Healthcare
-Insurance and medicine
What these areas have in common is the aim of solving pressing
problems, exploring emerging business opportunities, or creating a
sustainable competitive advantage.
◉ Why do we need a standardized data mining process? Answer: To
carry out data mining projects systematically. The CRISP-DM
process is the one generally followed.
◉ CRISP-DM. Answer: The Cross-Industry Standard Process for Data
Mining, a standard methodology used to carry out data mining
projects systematically.
◉ Difference between the two most commonly used data mining
processes. Answer: Process 1: CRISP-DM. Process 2: SEMMA (sample,
explore, modify, model, and assess).
CRISP-DM takes a more comprehensive approach, whereas SEMMA
assumes that the data mining project's goals and objectives have
already been identified and understood.
◉ Are data mining processes a mere sequential set of activities?
Answer: No. While the steps are listed sequentially, the whole
process can be very iterative.
◉ Why do we need data preprocessing? Answer: Preprocessing takes
the data that has been identified and prepares it for analysis by
data mining methods. It accounts for roughly 80% of the total time
spent on a data mining project.
It is needed because real-world data is generally incomplete.
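As an illustration of preparing incomplete data, the following is a
minimal Python sketch of one common technique, mean imputation. The
records, field names, and the impute_mean helper are hypothetical
and are not taken from the course material.

```python
from statistics import mean

# Hypothetical records; None marks a missing value.
records = [
    {"age": 34, "income": 52000},
    {"age": None, "income": 61000},
    {"age": 45, "income": None},
]

def impute_mean(rows, field):
    """Return copies of rows with missing `field` values replaced
    by the mean of the values that are present."""
    observed = [r[field] for r in rows if r[field] is not None]
    fill = mean(observed)
    return [
        {**r, field: fill if r[field] is None else r[field]}
        for r in rows
    ]

# Fill each field in turn so no record is left incomplete.
cleaned = impute_mean(impute_mean(records, "age"), "income")
```

Mean imputation is only one option. Dropping incomplete records or
using a more careful estimate may suit a given data set better.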
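◉ Discuss the reasoning behind the assessment of classification
models. Answer: Classification models learn patterns from past data
in order to place new instances in their respective groups or
classes. Because the model will be applied to instances it has not
seen, it is assessed on how accurately it classifies such data.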
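One simple way to assess a classifier is to compare its predictions
with the known classes of instances it was not trained on. The
sketch below assumes two hypothetical label lists and computes
accuracy plus the counts of each (actual, predicted) pair. The
function names are illustrative only.

```python
from collections import Counter

def accuracy(actual, predicted):
    """Fraction of instances whose predicted class matches the actual one."""
    matches = sum(a == p for a, p in zip(actual, predicted))
    return matches / len(actual)

def confusion_counts(actual, predicted):
    """Count how often each (actual, predicted) pair occurs."""
    return Counter(zip(actual, predicted))

# Hypothetical held-out instances and a model's predictions for them.
actual = ["yes", "no", "yes", "yes", "no"]
predicted = ["yes", "no", "no", "yes", "yes"]

score = accuracy(actual, predicted)
counts = confusion_counts(actual, predicted)
```

The pair counts show which kinds of mistakes the model makes, which
a single accuracy figure hides.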
◉ What is the main difference between classification and
clustering? Answer: Clustering uses one or more heuristics to
discover natural groupings, whereas classification learns the
function between the characteristics of things and their class
membership.
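The contrast can be sketched in Python on a toy one-dimensional
data set. The classifier is given class labels and learns one
centroid per class. The clustering routine sees only the values and
groups them with a two-means heuristic. The data and all function
names are assumptions chosen for illustration, and both methods are
deliberately simplified.

```python
from statistics import mean

# Hypothetical measurements, with known classes for the classifier only.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
labels = ["low", "low", "low", "high", "high", "high"]

def fit_centroids(xs, ys):
    """Classification: learn one centroid per known class label."""
    return {c: mean([x for x, y in zip(xs, ys) if y == c]) for c in set(ys)}

def classify(centroids, x):
    """Assign x to the class whose centroid is nearest."""
    return min(centroids, key=lambda c: abs(x - centroids[c]))

def two_means(xs, iters=10):
    """Clustering: split xs into two groups without using any labels."""
    centers = [min(xs), max(xs)]  # simple starting heuristic
    groups = [[], []]
    for _ in range(iters):
        groups = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            groups[nearest].append(x)
        # Keep the old center if a group happens to be empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return groups

model = fit_centroids(points, labels)
new_class = classify(model, 4.5)
groups = two_means(points)
```

The classifier returns a named class for a new value, while the
clustering routine returns unnamed groups that still need to be
interpreted.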