Data Analytics final exam
Three advanced data analytics techniques - ANS: decision trees
clustering
association rules
Decision trees - ANS: A classification method that determines the membership of cases, or the value of an
outcome variable, based on one or more predictor variables
What predicts whether a company will go bankrupt?
(using different variables to predict)
How likely is it to rain tomorrow?
Outcome variable is categorical
In a real situation, the decision tree software has to deal with instances where - ANS: The same set of
predictors results in different outcomes
Multiple paths result in the same outcome
Not every combination of predictors is in the training set
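A minimal sketch of how a tree can handle the three cases above, assuming a toy bankruptcy data set made up for illustration (the predictors `high_debt` and `profitable` and every row are hypothetical). It splits on the predictor with the lowest weighted entropy, uses a majority vote at each leaf, and falls back to the majority outcome of the parent node for combinations it never saw. Real decision tree software is more elaborate than this:

```python
from collections import Counter
from math import log2

# Hypothetical training data: predictor values -> outcome.
rows = [
    ({"high_debt": "yes", "profitable": "no"}, "bankrupt"),
    ({"high_debt": "yes", "profitable": "no"}, "bankrupt"),
    ({"high_debt": "yes", "profitable": "no"}, "survives"),  # same predictors, different outcome
    ({"high_debt": "no", "profitable": "yes"}, "survives"),
    ({"high_debt": "no", "profitable": "no"}, "survives"),   # different path, same outcome
]
# The combination high_debt=yes, profitable=yes never appears in training.


def majority(rows):
    """Most common outcome among the given rows."""
    return Counter(label for _, label in rows).most_common(1)[0][0]


def entropy(rows):
    """Shannon entropy of the outcomes (0 when all rows agree)."""
    counts = Counter(label for _, label in rows)
    total = len(rows)
    return -sum(c / total * log2(c / total) for c in counts.values())


def group_by(rows, feature):
    groups = {}
    for x, y in rows:
        groups.setdefault(x[feature], []).append((x, y))
    return groups


def build(rows, features):
    """Return a leaf (an outcome string) or a node (a dict)."""
    if len({label for _, label in rows}) == 1 or not features:
        return majority(rows)  # conflicting rows are settled by majority vote

    def split_entropy(feature):
        groups = group_by(rows, feature).values()
        return sum(len(g) / len(rows) * entropy(g) for g in groups)

    best = min(features, key=split_entropy)
    rest = [f for f in features if f != best]
    return {
        "feature": best,
        "default": majority(rows),  # used for values not seen in training
        "children": {v: build(g, rest) for v, g in group_by(rows, best).items()},
    }


def predict(tree, x):
    while isinstance(tree, dict):
        tree = tree["children"].get(x[tree["feature"]], tree["default"])
    return tree


tree = build(rows, ["high_debt", "profitable"])
print(predict(tree, {"high_debt": "yes", "profitable": "yes"}))  # unseen combination
```
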
Clustering - ANS: Used to determine distinct groups of data
Based on data across multiple dimensions
Uses:
-customer segmentation
-identifying patient care groups
-performance of business sectors
Steps:
1. define variables
2. install and load packages
3. read csv data
4. preprocessing data
-remove missing data
-normalizing
-remove outliers
5. cluster analysis
-comparing solutions with different numbers of clusters
-running cluster analysis with given number of clusters
-interpreting the results
Can we group our website visitors into types based on their online behaviors?
Can we identify different product markets based on customer demographics?
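The steps above can be sketched in plain Python, assuming k-means as the clustering method (the notes do not name one) and a made-up visitor data set; the column names `pages_per_visit` and `minutes_on_site` are hypothetical. Only the standard library is used, so step 2 needs no install, and outlier removal is left out to keep the sketch short:

```python
import csv
import io

# Step 1: define the variables to cluster on (hypothetical column names).
variables = ["pages_per_visit", "minutes_on_site"]

# Step 3: read CSV data. Inline text stands in for a file on disk.
raw = """visitor,pages_per_visit,minutes_on_site
a,2,1
b,3,2
c,2,2
d,20,30
e,22,28
f,21,
g,19,31
"""
records = list(csv.DictReader(io.StringIO(raw)))

# Step 4: preprocessing. Drop rows with missing values (visitor f).
complete = [r for r in records if all(r[v] for v in variables)]
points = [[float(r[v]) for v in variables] for r in complete]


def normalize(points):
    """Min-max scale every variable to the range 0..1."""
    cols = list(zip(*points))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(x - lo) / (hi - lo) if hi > lo else 0.0
         for x, lo, hi in zip(p, lows, highs)]
        for p in points
    ]


points = normalize(points)


def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]


def kmeans(points, k, iterations=20):
    # Naive start (an assumption for repeatability): the first k points.
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        labels = assign(points, centroids)
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    labels = assign(points, centroids)
    sse = sum(sq_dist(p, centroids[label]) for p, label in zip(points, labels))
    return labels, sse


# Step 5: compare solutions with different numbers of clusters.
results = {k: kmeans(points, k) for k in (1, 2, 3)}
for k, (labels, sse) in results.items():
    print(k, round(sse, 3), labels)
```

A sharp drop in the within-cluster sum of squares (`sse`) from one value of k to the next is one common signal for choosing the number of clusters; interpreting the groups still means looking at what the members of each cluster have in common.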
Association rules - ANS: Find out which events predict the occurrence of other events
Often used to see which products are bought together
Express relationships between itemsets
Uses:
-what products are bought together?
-Amazon's recommendation engine
-telephone calling patterns
-Facebook: who you may know
-where to place items on grocery store shelves
aka:
affinity analysis
market basket analysis
When someone gets an A in this class, what other classes do they get an A in?
(events of the same type predict one another)
If someone upgrades to an iPhone, do they also buy a new case?
(similar products that complement each other)
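A minimal sketch of how a rule such as "iPhone -> case" can be scored, assuming five made-up shopping baskets. The notes do not name any measures; support, confidence, and lift are the standard ones used in market basket analysis:

```python
# Hypothetical transactions: each basket is the set of items bought together.
baskets = [
    {"iphone", "case"},
    {"iphone", "case", "charger"},
    {"iphone"},
    {"case"},
    {"charger"},
]


def support(itemset):
    """Share of baskets that contain every item in the itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent):
    """Of the baskets with the antecedent, the share that also has the consequent."""
    return support(antecedent | consequent) / support(antecedent)


def lift(antecedent, consequent):
    """Confidence relative to how common the consequent is anyway (1 = no association)."""
    return confidence(antecedent, consequent) / support(consequent)


rule = ({"iphone"}, {"case"})
print("support:", support(rule[0] | rule[1]))
print("confidence:", round(confidence(*rule), 3))
print("lift:", round(lift(*rule), 3))
```

A lift above 1 suggests the two items are bought together more often than chance alone would give; with only five baskets the numbers here are purely illustrative.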
Advanced data analytics - ANS: The examination of data or content using sophisticated techniques and
tools to discover deeper insights, make predictions, or generate recommendations
Goals:
1. extraction of implicit, previously unknown, and potentially useful information from data
2. exploration and analysis of large data sets to discover meaningful patterns
3. prediction of future events based on historical data
NOT:
-sales analysis
-profitability analysis
-sales force analysis
How does advanced data analytics differ from OLAP analysis? - ANS: OLAP can tell you what is happening,
or what has happened
Summary of what happened in the past
-pivot tables
-sum
-avg
-min
-max
-time trend
Whatever can be done using pivot tables is NOT advanced data analytics
Advanced data analytics can tell you why it is happening, and help predict what will happen
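For contrast, a small sketch of the OLAP-style summary described above, assuming a made-up list of sales by region. It only aggregates what already happened (sum, avg, min, max), the way a pivot table would, and predicts nothing:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical past sales: (region, amount).
sales = [("East", 100), ("East", 150), ("West", 80), ("West", 120), ("West", 100)]

# Group the amounts by region, like the rows of a pivot table.
by_region = defaultdict(list)
for region, amount in sales:
    by_region[region].append(amount)

# One summary row per region.
summary = {
    region: {
        "sum": sum(amounts),
        "avg": mean(amounts),
        "min": min(amounts),
        "max": max(amounts),
    }
    for region, amounts in by_region.items()
}

for region, stats in summary.items():
    print(region, stats)
```
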