DATA MINING TEST 1 - CHAPTER 1
QUESTIONS AND ANSWERS
supervised learning - Answer-is basically a synonym for classification.
unsupervised learning - Answer-is essentially a synonym for clustering.
semi-supervised learning - Answer-is a class of machine learning techniques that
make use of both labelled and unlabelled examples when learning a model.
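One way to picture this definition is self-training, a common semi-supervised technique (chosen here for illustration; the notes do not name a specific method). The sketch below assumes one-dimensional data and a nearest-centroid classifier. The names `centroids`, `predict`, `self_train`, and the `margin` parameter are hypothetical.

```python
def centroids(labeled):
    """Return the mean feature value per class from (x, label) pairs."""
    sums, counts = {}, {}
    for x, y in labeled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(cents, x):
    """Assign x to the class whose centroid is nearest."""
    return min(cents, key=lambda y: abs(x - cents[y]))


def self_train(labeled, unlabeled, margin=1.0):
    """Grow the labelled set with confidently predicted unlabelled points.

    A point counts as confident when its nearest centroid is closer than
    the second nearest by at least `margin` (an assumed confidence rule).
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    while pool:
        cents = centroids(labeled)
        confident = []
        for x in pool:
            dists = sorted(abs(x - c) for c in cents.values())
            if len(dists) < 2 or dists[1] - dists[0] >= margin:
                confident.append(x)
        if not confident:
            break  # nothing left that the model is sure about
        for x in confident:
            labeled.append((x, predict(cents, x)))
        pool = [x for x in pool if x not in confident]
    # Return the final model and the points that stayed unlabelled.
    return centroids(labeled), pool


model, leftover = self_train(
    labeled=[(1.0, "low"), (9.0, "high")],
    unlabeled=[1.5, 2.0, 8.0, 8.5, 5.0],
)
```

Here two labelled examples seed the model, four unlabelled points are absorbed, and the ambiguous point 5.0 stays unlabelled.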
active learning - Answer-is a machine learning approach that lets users play an
active role in the learning process.
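As a rough illustration of the user's active role, the sketch below uses uncertainty sampling, one common active learning strategy (an assumption; the notes do not name a strategy). The learner repeatedly asks the user to label the point it is least sure about. The function `active_learn`, the `ask_user` callback, and the `budget` parameter are hypothetical, and the sketch assumes a one-dimensional threshold concept where the smallest point is negative and the largest is positive.

```python
def active_learn(points, ask_user, budget):
    """Learn a 1-D decision threshold by querying the most uncertain point.

    ask_user(x) stands in for the human: it returns True if x is positive.
    Assumes min(points) is negative and max(points) is positive.
    """
    lo, hi = min(points), max(points)
    pool = sorted(points)
    queried = []
    for _ in range(budget):
        boundary = (lo + hi) / 2
        # Only points strictly between lo and hi are still uncertain.
        candidates = [x for x in pool if lo < x < hi]
        if not candidates:
            break
        # The point nearest the current boundary is the most uncertain.
        x = min(candidates, key=lambda p: abs(p - boundary))
        queried.append(x)
        pool.remove(x)
        if ask_user(x):
            hi = x  # x is positive, so the threshold is at or below x
        else:
            lo = x  # x is negative, so the threshold is above x
    return (lo + hi) / 2, queried


# The "user" here is simulated: values of 37 or more are positive.
boundary, queried = active_learn(
    points=list(range(101)),
    ask_user=lambda x: x >= 37,
    budget=10,
)
```

With 101 candidate points, the learner needs only a handful of questions to locate the boundary, instead of asking the user to label everything.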
database systems research - Answer-focuses on the creation, maintenance, and use
of databases for organizations and end-users.
Information retrieval - Answer-is the science of searching for documents or
information in documents.
Major Issues in Data Mining - Answer-Mining methodology
user interaction
efficiency and scalability
diversity of data types
data mining and society
Why is mining methodology a major issue in data mining? - Answer-because mining
methods must handle data uncertainty, noise, and incompleteness.
Why is user interaction a major issue in data mining? - Answer-how to interact with a
data mining system, how to incorporate a user's background knowledge in mining,
and how to visualize and comprehend data mining results.
Efficiency and scalability of data mining algorithms: - Answer-Data mining algorithms
must be efficient and scalable in order to effectively extract information from huge
amounts of data in many data repositories or in dynamic data streams. In other
words, the running time of a data mining algorithm must be predictable, short, and
acceptable to applications. Efficiency, scalability, performance, optimization, and the
ability to execute in real time are key criteria that drive the development of many new
data mining algorithms.
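To make the data-stream point concrete, here is a minimal sketch of a one-pass computation: Welford's online algorithm for mean and variance. It is picked only as an example of an algorithm with predictable per-item cost and constant memory; the class name `RunningStats` is hypothetical.

```python
class RunningStats:
    """One-pass mean and population variance (Welford's online algorithm).

    Each update costs O(1) time and the object uses O(1) memory, so it
    can follow a stream without storing the values seen so far.
    """

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Population variance of the values seen so far."""
        return self._m2 / self.n if self.n else 0.0


stats = RunningStats()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:  # stands in for a data stream
    stats.update(value)
```

The same loop works whether the stream has eight values or eight billion, which is the sense in which the running time stays predictable.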
Data mining turns a large collection of data into - Answer-knowledge.
KDD - Answer-Knowledge discovery from data
Data cleaning - Answer-removes noise and inconsistent data.
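A small sketch of what data cleaning can look like in practice, under assumed rules: impossible ages are treated as noise, city names are normalized to remove inconsistent spacing and casing, and records that become identical after normalization are dropped. The function `clean`, the field names, and the `valid_age` range are all hypothetical.

```python
def clean(records, valid_age=(0, 120)):
    """Drop noisy records and normalize inconsistent city spellings."""
    cleaned = []
    seen = set()
    for rec in records:
        age = rec.get("age")
        city = rec.get("city")
        # Noise: missing or out-of-range ages are discarded.
        if age is None or not (valid_age[0] <= age <= valid_age[1]):
            continue
        if not city:
            continue
        # Inconsistency: collapse extra whitespace and unify the casing.
        city = " ".join(city.split()).title()
        key = (rec.get("name"), age, city)
        if key in seen:
            continue  # duplicate once the city name is normalized
        seen.add(key)
        cleaned.append({"name": rec.get("name"), "age": age, "city": city})
    return cleaned


raw = [
    {"name": "Ann", "age": 34, "city": "  new york"},
    {"name": "Ann", "age": 34, "city": "New York"},
    {"name": "Bob", "age": -5, "city": "Boston"},
    {"name": "Cy", "age": 51, "city": "BOSTON "},
]
tidy = clean(raw)
```

In this sample, the second Ann record is a duplicate once the city is normalized, and Bob's record is dropped because of its negative age.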