WGU D 204 The Data Analytics Journey Questions With 100% Correct Answers.
Analyses for Data Science: Descriptive - Humans are good at finding patterns but have limited bandwidth, so we need to narrow the data down. Look at the data. 1) Visualize the data: graphs, histograms, bell curve. 2) Compute univariate descriptive statistics: mean (average), mode (most common value), median (splits the data into two equal halves); each summarizes one variable with ONE value. Spread is described by the range (high and low), quartiles, variance, and standard deviation. 3) Measures of association: the connection between the variables in your data, for example correlation coefficients and regression analysis. (This was the click-to-sales conversions question.) Be attentive to outliers, open-ended scores, and undefined scores (screen the data!). Use your words: explain and describe.

Data Analytics Lifecycle per course requirements - Discovery (business understanding), Data Acquisition, Data Exploration, Predictive Modeling, Data Mining, Data Reporting & Representation.

Data Analytics Lifecycle: Discovery - This is where you identify business needs and define the business reason that any analysis is needed. Work with stakeholders to help them ask better questions so that both they and you understand the outcome. Do you have the data you need to answer this question? If not, is there another way to accomplish it with what you do have?

Data Analytics Lifecycle: Data Acquisition - Collect data from various sources and clean it. Getting data may include writing SQL queries against tables or data warehouses. Cleaning is the most labor-intensive phase (in both time and effort). You may have to identify outliers here or detect missing values (null values in columns). An analyst will use SQL, Python, R, or Excel to perform various data modifications and transformations. When cleaning is skipped or ignored, the results of the analysis may become irrelevant.

Data Analytics Lifecycle: Data Exploration - Now that the data is cleaned, you can begin to familiarize yourself with it. You are beginning to understand the basic nature of the data and the relationships within it.
Exploration includes making histograms and generally understanding what is included in the data. Creating bar graphs gives you verification through visualization. It may also include applying a statistical formula, for example to obtain the average temperature of a city over the last 50 years. Poor attention to detail in this phase leaves you with a lack of insight into the structure of the data set.

Data Analytics Lifecycle: Predictive Modeling - This is where you take those insights and operationalize them to predict future outcomes (example: an oil company uses robots to detect corrosion over time, reducing shutdowns and interruptions). You go beyond describing to actual modeling. Churn analysis could be performed here: an evaluation of a company's customer loss rate in order to reduce it, which analyzes your product and how people use it. A common mistake is to develop a model before the research question is known. Python and R play key roles here.

Data Analytics Lifecycle: Data Mining - Find patterns and insights. Find correlations and test hypotheses. Example: a data analyst has identified combinations of sales transactions that frequently occur together in data from the past 5 years. A risk in this phase is reducing the data so much that the resulting sample size is too small. Some call this machine learning. Python and R also play key roles here.

Data Analytics Lifecycle: Data Reporting & Representation - The analyst creates a story to report on the data findings. Provide actionable insights that can inform decision makers, and present the conclusions of the analysis in an engaging manner. Effective data reporting excludes unrelated data. Here you may use Tableau or Power BI.

Data Science Pathway (per video) - Planning, Wrangling, Modeling, Applying the Model.

Data Science Pathway: Planning - 1. Define goals: what do you want to accomplish? 2. Organize resources: the right computers and people. 3. Coordinate people: it is a team effort.
4. Schedule the project: timeboxing, Gantt chart.

Data Science Pathway: Wrangling - 5. Get data: internal, external, APIs. These are the raw materials. 6. Clean data (an enormous task): get it ready to fit the application you are using. 7. Explore data: visualizations, numerical summaries. 8. Refine data: based on step 7, you may need to recategorize cases or combine variables into new scores.

Data Science Pathway: Modeling - 9. Create the model: statistical models, linear regressions, decision trees, neural networks. 10. Validate the model: how well does it generalize beyond the data set it was built on? When validation is left out, conclusions fall apart. 11. Evaluate the model: how well does it fit the data, what is the ROI, and how usable is it? 12. Refine the model: adjust processes.

Data Science Pathway: Applying - 13. Present the model: show investors and clients what you learned. 14. Deploy the model: put it online or in a dashboard to produce recommendations. 15. Revisit the model: how well is it performing? You may have to redo it. 16. Archive the assets: documentation of the project.

Analyses for Data Science: Predictive - Predict the future. 1) Use relevant past data. 2) Model the outcome. 3) Apply the model to new data. 4) Validate the model with new data (often neglected, but CRITICAL). Examples: predicting illness, predicting whether a debt will be paid off, recommending products in online shopping. One definition: future events, where present data is used to predict into the future. Prediction can also be used to explain alternative events (estimate). Methods: classification (K, centroid, clustering methods); decision trees and random forests; neural networks (machine learning loosely modeled on the brain). Regression analysis gives you an understandable equation to predict a single outcome based on multiple predictor variables. It is very flexible, usually linear, and easy to interpret.

Analyses for Data Science: Trend Analysis - A graph of changes over time: connect the dots to get a road map. Look at change over time.
Trends can be linear, exponential, logarithmic (grows rapidly at first, then more and more slowly), sigmoid (logistic function: slow start, rapid growth, then a ceiling), or sinusoidal (cyclical). Change points: changes in the resting state of the data. Decomposition: break the trend over time into several separate elements.

Analyses for Data Science: Clustering - Group cases that are similar in important ways. Place the cases in k-dimensional space and measure the distances between them. Methods: k-means, group centroid models, density models, distribution models.

1) The Data Analytics Life Cycle - Trace the phases of the data analytics life cycle. Summarize the discovery phase of the data analytics life cycle. Summarize the data acquisition phase of the data analytics life cycle. Summarize the data exploration phase of the data analytics life cycle. Summarize the predictive modeling phase of the data analytics life cycle. Summarize the data mining phase of the data analytics life cycle. Contextualize each phase within the data analytics life cycle.
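
Worked example for the Descriptive card: a minimal sketch of the univariate statistics, the measures of spread, and one measure of association (a Pearson correlation coefficient), using only Python's standard library. The click and sales numbers are made up for illustration, and `pearson_r` is a hypothetical helper name, not something from the course.

```python
import math
import statistics

# Hypothetical sample (made-up numbers): daily ad clicks and resulting sales.
clicks = [10, 20, 20, 30, 40, 50, 60]
sales = [1, 2, 2, 3, 4, 5, 6]

# Univariate descriptive statistics: one value each, for one variable.
mean_clicks = statistics.mean(clicks)
median_clicks = statistics.median(clicks)
mode_clicks = statistics.mode(clicks)

# Measures of spread for the same variable.
value_range = max(clicks) - min(clicks)
quartiles = statistics.quantiles(clicks, n=4)  # three cut points: Q1, Q2, Q3
variance = statistics.variance(clicks)         # sample variance
std_dev = statistics.stdev(clicks)             # sample standard deviation


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Measure of association between the two variables.
r = pearson_r(clicks, sales)
print(mean_clicks, median_clicks, mode_clicks, value_range, quartiles, r)
```

Because the made-up sales are an exact multiple of the clicks, the correlation here comes out at essentially 1; real click-to-sales data would be noisier.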
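
Worked example for the Data Acquisition card: a small sketch of the two cleaning checks the card names, detecting missing values and identifying outliers. The card does not say how outliers are identified; the 1.5 x IQR rule used here is an assumption chosen for illustration, and the temperature readings and function names are hypothetical.

```python
import statistics

# Hypothetical column with two missing values (None) and one suspicious reading.
temperatures = [21.0, 22.5, None, 23.0, 21.5, 95.0, 22.0, None, 20.5]


def find_missing(values):
    """Return the positions of missing (None) entries."""
    return [i for i, v in enumerate(values) if v is None]


def find_outliers_iqr(values, k=1.5):
    """Return non-missing values outside [Q1 - k*IQR, Q3 + k*IQR].

    The IQR rule is an illustrative assumption, not the course's stated method.
    """
    present = [v for v in values if v is not None]
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in present if v < low or v > high]


missing_positions = find_missing(temperatures)
outliers = find_outliers_iqr(temperatures)
print(missing_positions, outliers)
```

Whether a flagged value is dropped, corrected, or kept is a judgment call for the analyst; the sketch only surfaces candidates for screening.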
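
Worked example for the Predictive card: a sketch of the four steps (use past data, model the outcome, apply to new data, validate with new data). For brevity it fits a straight line with a single predictor, whereas the card describes regression with multiple predictors. All numbers, the function names, and the use of mean absolute error as the validation measure are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_absolute_error(actual, predicted):
    """Average size of the prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# 1) Relevant past data (hypothetical: ad spend vs. units sold).
past_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
past_units = [12.0, 14.0, 16.0, 18.0, 20.0]

# 2) Model the outcome.
slope, intercept = fit_line(past_spend, past_units)

# 3) Apply the model to new data.
new_spend = [6.0, 7.0]
predicted_units = [slope * x + intercept for x in new_spend]

# 4) Validate against outcomes observed afterwards (also hypothetical).
observed_units = [22.5, 23.0]
error = mean_absolute_error(observed_units, predicted_units)
print(slope, intercept, predicted_units, error)
```

Step 4 is the one the card flags as often neglected: the error is measured on data the model never saw while it was being fitted.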
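
Worked example for the Clustering card: a bare-bones k-means sketch that places cases in space, measures distances, and groups each case with its nearest centroid. The data points are hypothetical, and starting from the first k points is an illustrative simplification that keeps the run deterministic; practical implementations usually choose starting centroids differently.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means on a list of equal-length numeric tuples.

    Uses the first k points as starting centroids (an assumption made here
    so the example is reproducible).
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # empty cluster: keep as is
        if new_centroids == centroids:
            break  # assignments have stopped changing
        centroids = new_centroids
    return labels, centroids


# Hypothetical 2-D cases forming two visibly separate groups.
data = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = kmeans(data, k=2)
print(labels, centroids)
```

The same function works in more than two dimensions, since `math.dist` measures Euclidean distance between tuples of any equal length.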
Written for
- Institution: D204 The Data Analytics Journey
- Course: D204 The Data Analytics Journey
Document information
- Uploaded on: November 5, 2023
- Number of pages: 12
- Written in: 2023/2024
- Type: Exam (elaborations)
- Contains: Questions & answers