Clustering
The process of grouping a set of abstract objects into classes of similar
objects. Points to remember: first partition the set, then apply labels to each
partition.
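The "partition first, label second" idea can be sketched in Python. This is a minimal illustration only, assuming one-dimensional data and a k-means style procedure; the function name kmeans_1d, the starting centers, and the toy revenue figures are all made up for the example and are not part of the original notes.

```python
def kmeans_1d(values, centers, iterations=10):
    """Partition 1-D values around the given starting centers (k-means style)."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each value joins the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return clusters, centers


# Hypothetical toy data: yearly revenue (in millions) of six companies.
revenues = [1.0, 1.2, 0.8, 9.5, 10.1, 10.4]

# Step 1: partition the set.
clusters, centers = kmeans_1d(revenues, centers=[0.0, 5.0])

# Step 2: apply a label to each partition (label names are arbitrary).
labeled = dict(zip(["small", "large"], clusters))
print(labeled)
```

Real projects would normally use a library implementation and more than one feature; the point here is only the order of the two steps.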
Clustering (Identify)
What type of data mining task?:
Several sets of companies. Each of these sets includes companies that are
similar to each other. Companies belonging to different sets are dissimilar to
one another.
Association Rule Mining (Identify)
What type of data mining task?:
Purchases of (A) imply simultaneous purchases of (B)
Association Rule Mining
A very popular data mining (DM) method in business.
Finds interesting relationships (affinities) between variables (items or events).
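How strong a rule such as "purchases of A imply purchases of B" is can be quantified with two standard measures, support and confidence. The sketch below is an illustration under assumptions: the notes do not name these measures, and the function names and the four toy baskets are invented for the example.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Share of transactions containing the antecedent that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical toy data: each set is one customer's basket.
baskets = [{"A", "B"}, {"A", "B", "C"}, {"A"}, {"B", "C"}]

rule_support = support(baskets, {"A", "B"})         # A and B bought together
rule_confidence = confidence(baskets, {"A"}, {"B"})  # B bought given A was bought
print(rule_support, rule_confidence)
```

A rule is usually considered interesting only when both values exceed thresholds chosen by the analyst.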
Classification (Identify)
What type of data mining task?:
A model that maps a patient's clinical history to positive or negative diagnosis
of a specific disease.
Classification
Assigns items in a collection to target categories or classes.
Regression or Numeric Prediction (Identify)
What type of data mining task?:
A model that maps year-to-date world economic data to tomorrow's exchange
rate between two countries' currencies.
Classification; Regression or Numeric Prediction
What are two supervised learning tasks?
COMP682
Association Rule Mining; Clustering
What are two unsupervised learning tasks?
Supervised Data Mining Requirements
The data has a target variable with well-defined values; the values of the
target variable are available in both the training and the testing data.
Supervised Learning
Category of data-mining techniques in which an algorithm learns how to
predict or classify an outcome variable of interest.
Unsupervised Learning
A type of model creation, derived from the field of machine learning, that
does not have a defined target variable.
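Gini index
In data mining, a measure of the impurity of a set of class labels, used to
choose decision tree splits: 1 minus the sum of the squared class proportions.
It is 0 when every record belongs to the same class. It shares its name with
the Gini coefficient, a statistical formula that measures the amount of
inequality in a society; that scale ranges from 0 to 100, where 0 corresponds
to perfect equality and 100 to perfect inequality.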
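In a data mining course the Gini index usually refers to the impurity of the class labels at a decision tree node, which is a different quantity from the economic inequality coefficient. A minimal sketch of that impurity calculation follows; the function name gini_impurity and the example labels are assumptions made for illustration.

```python
from collections import Counter


def gini_impurity(labels):
    """Return 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


pure = gini_impurity(["yes", "yes", "yes", "yes"])  # one class only: 0.0
mixed = gini_impurity(["yes", "yes", "no", "no"])   # 50/50 split: 0.5
print(pure, mixed)
```

For two classes the value peaks at 0.5, so a split that lowers the weighted impurity of the child nodes is preferred.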
Entropy
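A measure of disorder or randomness. In data mining, it measures how mixed the
class labels in a set are: 0 when all records share one class, and higher as
the classes become more evenly mixed.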
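For class labels, disorder is commonly quantified with Shannon entropy, the negative sum of p * log2(p) over the class proportions p. The sketch below assumes that definition; the function name entropy and the example labels are made up for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum(
        (count / n) * math.log2(count / n)
        for count in Counter(labels).values()
    )


ordered = entropy(["yes", "yes", "yes", "yes"])  # no disorder at all
balanced = entropy(["yes", "yes", "no", "no"])   # maximum disorder for two classes
print(ordered, balanced)
```

Decision tree learners that use information gain pick the split that reduces this value the most.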
prop.table
R function that converts a table object (a table of counts) into a relative
frequency table
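prop.table itself is an R function. As a rough Python analogue of turning counts into relative frequencies, the calculation can be sketched as below; relative_frequencies is a hypothetical helper written for this example, not a library function, and the answers are toy data.

```python
from collections import Counter


def relative_frequencies(values):
    """Count each distinct value, then divide each count by the total."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


# Hypothetical toy data: four survey answers.
answers = ["yes", "no", "yes", "yes"]
freqs = relative_frequencies(answers)
print(freqs)
```

The proportions always sum to 1, which is the property that makes a relative frequency table easy to compare across data sets of different sizes.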
summary
R function that shows the distribution of each variable in a data frame
Data Science; Business Analytics; Knowledge Discovery from Data
Terms used interchangeably with data mining
Descriptive, Predictive, Prescriptive
What types of analytics should use data mining?