Exam (elaborations)

Data Analytics Journey

Pages
9
Grade
A+
Uploaded on
20-06-2024
Written in
2023/2024

Contains
Questions & answers

Business Understanding
Planning, Discovery - ANS-Scope the project. Identify stakeholders, research questions,
and KPIs. Identify the timeline, budget, and participants.
problems - A lack of clear focus on stakeholders, timeline, limitations, and budget can
derail an analysis.

Data acquisition
Extraction, Data gathering, Data query, Data collection, ETL (extract, transform, load),
Web scraping - ANS-Gather/collect data from a variety of sources. Provide structure to
data accessible via relational databases (SQL). Build a data pipeline (ETL). Use an API to
download data from an external source.
problems - The quality and type of data may make access more difficult.
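
The ETL idea above can be sketched in a few lines. This is only a minimal illustration using Python's standard library; the CSV text, the `orders` table, and the function names are made-up assumptions, not part of the original notes.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this might come from a file, an API, or a scraper.
RAW_CSV = """order_id,customer,amount
1, alice ,19.99
2,BOB,5.00
3,carol,12.50
"""

def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: tidy customer names and convert strings to numbers."""
    return [
        (int(r["order_id"]), r["customer"].strip().title(), float(r["amount"]))
        for r in rows
    ]

def load(records, conn):
    """Load: write the cleaned records into a relational (SQL) table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()

def run_pipeline(text=RAW_CSV):
    """Run extract -> transform -> load against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(text)), conn)
    return conn

conn = run_pipeline()
total = conn.execute("SELECT ROUND(SUM(amount), 2) FROM orders").fetchone()[0]
names = [row[0] for row in conn.execute("SELECT customer FROM orders ORDER BY order_id")]
```

Once loaded, the data can be queried with ordinary SQL, which is the "structure" that the acquisition step provides.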

Data cleaning
Wrangling, Scrubbing, Munging - ANS-Fixing improperly formatted values, Dealing with
duplicates, missing data, and outliers, Data reduction
problems - Some cleaning techniques can dramatically change the data and outcomes.
Outliers that are not dealt with can cause problems with statistical models due to
excessive variability.
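
A rough sketch of the three cleaning tasks named above (duplicates, missing data, outliers) is shown here. The records are invented, and median imputation and the 1.5 x IQR outlier rule are common choices assumed for illustration, not methods prescribed by the notes.

```python
import statistics

# Hypothetical raw records as (id, value); None marks a missing value.
raw = [(1, 10.0), (2, 12.0), (2, 12.0), (3, None),
       (4, 11.0), (5, 13.0), (6, 250.0), (7, 9.0)]

def drop_duplicates(rows):
    """Keep the first occurrence of each identical record."""
    seen, out = set(), []
    for row in rows:
        if row not in seen:
            seen.add(row)
            out.append(row)
    return out

def impute_median(rows):
    """Replace missing values with the median of the observed values."""
    med = statistics.median(v for _, v in rows if v is not None)
    return [(i, med if v is None else v) for i, v in rows]

def remove_outliers_iqr(rows, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles([v for _, v in rows], n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [(i, v) for i, v in rows if low <= v <= high]

imputed = impute_median(drop_duplicates(raw))
cleaned = remove_outliers_iqr(imputed)

# Comparing the mean before and after shows how much one outlier shifts a summary.
mean_before = statistics.fmean(v for _, v in imputed)
mean_after = statistics.fmean(v for _, v in cleaned)
```

The gap between `mean_before` and `mean_after` illustrates both warnings in the card: an untreated outlier inflates variability, and removing it changes the outcome, so the choice should be deliberate.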

Data exploration
Exploratory Data Analysis (EDA), Descriptive Statistics - ANS-Central Tendency/
Measures of center (e.g., mean, median, mode), variability (e.g., standard deviations
and quartiles) and distributions (e.g., normal, skewed, etc.), Identify basic correlations
between variables, Pattern discovery
problems - Skipping this step can lead to faulty perceptions of the data, which hurt
advanced analytics.
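
As a small illustration of the descriptive statistics listed above, the following sketch computes measures of center, variability, and a basic correlation. The two variables and their values are made up, and `describe` and `pearson` are hypothetical helper names.

```python
import math
import statistics

# Hypothetical paired observations (for example, ad spend vs. sales); values are invented.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.1]

def describe(values):
    """Measures of center and variability for one variable."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "q1": q1,
        "q3": q3,
    }

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    var_a = sum((ai - ma) ** 2 for ai in a)
    var_b = sum((bi - mb) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)

summary_x = describe(x)
r = pearson(x, y)
```

A value of `r` close to 1 signals a strong positive linear relationship, which is the kind of pattern a later modeling step might build on.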

Predictive Modeling
Data Modeling, Correlation based models, Regression models, Time series -
ANS-Estimate/project future values or likelihood of an event. Extend correlations found
in EDA to mathematical models. Predict/determine output values based on input values.
Cross-validation of predictive models to ensure accuracy.
problems - Too many input variables (predictors) can cause problems. Correlation does
not imply causation. Time series models often need sufficient time data to offer precise
trending. Predictive model accuracy should be assessed using cross-validation.

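
To make the cross-validation point concrete, here is a minimal sketch that fits a one-variable least-squares line and scores it with k-fold cross-validation. The data are invented (roughly y = 2x + 1), and `fit_line` and `k_fold_mae` are hypothetical names chosen for this example.

```python
import statistics

# Hypothetical data: y is roughly 2*x + 1 with small made-up noise.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
ys = [3.1, 4.9, 7.2, 9.0, 10.8, 13.1, 15.0, 16.9, 19.2, 21.0]

def fit_line(x, y):
    """Ordinary least squares for a single predictor; returns (slope, intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, my - slope * mx

def k_fold_mae(x, y, k=5):
    """Mean absolute error averaged over k held-out folds."""
    n = len(x)
    fold_errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th point is held out
        train_x = [x[i] for i in range(n) if i not in test_idx]
        train_y = [y[i] for i in range(n) if i not in test_idx]
        slope, intercept = fit_line(train_x, train_y)
        errors = [abs(y[i] - (slope * x[i] + intercept)) for i in test_idx]
        fold_errors.append(statistics.fmean(errors))
    return statistics.fmean(fold_errors)

slope, intercept = fit_line(xs, ys)
cv_mae = k_fold_mae(xs, ys)
```

Because each fold is predicted by a model that never saw it, `cv_mae` estimates error on new data rather than on the data used for fitting.
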
Data mining
Machine Learning, Deep Learning, AI (artificial intelligence), Supervised/ Unsupervised
Models - ANS-Creating training and testing datasets to build models from.
Identify/detect patterns. Determine if groups (clusters) exist in data. Classify data into
groups. Create models that "learn" and improve (e.g., machine/deep learning, AI, etc.)
problems - Building and evaluating a model on the entire dataset is problematic; the data
must be split into training and testing datasets.
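
The training/testing split can be sketched as follows. The two labelled groups are generated artificially, and the 1-nearest-neighbour classifier is just one simple supervised model assumed for illustration; all function names are hypothetical.

```python
import math
import random

def make_data(seed=0):
    """Hypothetical labelled 2-D points: group "a" near (1, 1), group "b" near (5, 5)."""
    rng = random.Random(seed)
    data = []
    for label, centre in (("a", 1.0), ("b", 5.0)):
        for _ in range(20):
            point = (centre + rng.uniform(-0.5, 0.5), centre + rng.uniform(-0.5, 0.5))
            data.append((point, label))
    return data

def train_test_split(data, test_fraction=0.25, seed=0):
    """Shuffle a copy of the data and split it into training and testing sets."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def predict_1nn(train, point):
    """Classify a point with the label of its nearest training example."""
    nearest = min(train, key=lambda item: math.dist(item[0], point))
    return nearest[1]

def accuracy(train, test):
    """Share of held-out points whose predicted label matches the true label."""
    correct = sum(predict_1nn(train, p) == label for p, label in test)
    return correct / len(test)

train, test = train_test_split(make_data())
test_accuracy = accuracy(train, test)
```

Accuracy is measured only on `test`, the points the model never saw, which is why the split matters.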

Reporting and visualization
Dashboards - ANS-Tell a story with data. Provide a summary of the analysis.
Provide insights to stakeholders. Create insightful graphs that showcase trends and
forecasts.
problems - Because reports may reach a large audience, mistakes can cause bad
business decisions and loss of revenue. Improper scales in graphs can push readers
toward inaccurate interpretations of the story.
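
The effect of an improper scale can be shown with simple arithmetic. In this sketch the revenue figures are invented and `apparent_ratio` is a hypothetical helper; it compares how tall two bars are drawn relative to each other when the y-axis starts at zero versus at a truncated value.

```python
def apparent_ratio(a, b, axis_start=0.0):
    """Ratio of drawn bar heights (b over a) when the y-axis starts at axis_start."""
    return (b - axis_start) / (a - axis_start)

# Hypothetical revenue figures for two quarters (made-up numbers).
q1, q2 = 100.0, 105.0

honest = apparent_ratio(q1, q2)           # axis starts at 0
truncated = apparent_ratio(q1, q2, 95.0)  # axis starts at 95
```

With the axis starting at 0, the second bar is drawn 1.05 times as tall as the first; with the axis starting at 95, it is drawn twice as tall, so a 5% increase looks like a doubling.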

Descriptive - ANS-Key focus: Observation
Main question: What happened?
Example: In a healthcare setting, an unusually high number of people are admitted to
the emergency room in a short period of time. Descriptive analytics tells you that this is
happening and provides real-time data with all the corresponding statistics (date of
occurrence, volume, patient details, etc.).

Diagnostic - ANS-Key focus: Explained reason
Main question: Why did it happen?
Example: In the healthcare example mentioned earlier, diagnostic analytics would
explore the data and make correlations. For instance, it may help you determine that all
of the patients' symptoms — high fever, dry cough, and fatigue — point to the same
infectious agent. You now have an explanation for the sudden spike in volume at the
ER.

Predictive - ANS-Key focus: Correlation
Main question: What will happen in the future?
Example: Back in our hospital example, predictive analytics may forecast a surge in
patients admitted to the ER in the next several weeks. Based on patterns in the data,
the illness is spreading at a rapid rate.

Prescriptive - ANS-Key focus: Causal/manipulate