Data Transformation - correct answers.Data mapping: converting data from one format
to another.
Data deduplication: eliminating repeated or redundant data.
Derived variables: creating new variables from existing ones.
Data sorting or ordering: arranging data in a specific sequence.
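The four transformation types above can be sketched on a tiny in-memory data set. This is only an illustration: the records, the field names, and the choice of body mass index as the derived variable are assumptions, not part of the original card.

```python
# Hypothetical raw records; field names and values are made up for illustration.
raw = [
    {"id": 2, "name": "Bo", "height_cm": "180", "weight_kg": "81"},
    {"id": 1, "name": "Al", "height_cm": "170", "weight_kg": "65"},
    {"id": 2, "name": "Bo", "height_cm": "180", "weight_kg": "81"},  # repeated row
]

# Data mapping: convert the text fields to numbers (one format to another).
mapped = [
    {
        "id": r["id"],
        "name": r["name"],
        "height_cm": float(r["height_cm"]),
        "weight_kg": float(r["weight_kg"]),
    }
    for r in raw
]

# Data deduplication: keep only the first record seen for each id.
seen = set()
deduped = []
for r in mapped:
    if r["id"] not in seen:
        seen.add(r["id"])
        deduped.append(r)

# Derived variable: body mass index computed from two existing fields.
for r in deduped:
    r["bmi"] = round(r["weight_kg"] / (r["height_cm"] / 100) ** 2, 1)

# Data sorting: arrange the records by id.
transformed = sorted(deduped, key=lambda r: r["id"])
```

After these steps `transformed` holds one record per id, in id order, each with a numeric `bmi` field.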
Data Transformation occurs in the ----- phase, and the role this applies
to is the ----- - correct answers.Preparation phase, Data Analyst
----- is great at combining unstructured data feeds from multiple sources. -
correct answers.Hadoop
Examples of when to use -----: stream processing, fraud detection and prevention,
content management, risk management. - correct answers.Hadoop
Setting up a sandbox, extracting and transforming data, conditioning data, and exploring
it visually occur in the ----- phase - correct answers.Data Preparation
When you convert a Microsoft Word file to a PDF, for example, you are ----- data -
correct answers.transforming
Running a virtual machine with a Linux operating system on Windows is an example of
----- - correct answers.sandboxing
Some key features of an Analytical Sandbox may include tools and features for
c---------- and s---ing work with colleagues, flexibility to allow analysts to try out
different analytical approaches and techniques, and clear documentation and support
resources to help analysts get up to speed quickly. - correct answers.collaboration, sharing
Why is it important to collect data in a certain time frame? - correct answers.Result:
more precise findings than working with an open-ended timeframe.
----- testing works by randomly showing two versions of the same asset (ad, website,
pop-up, offer, etc.) to different users - correct answers.A/B
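The random assignment step of an A/B test can be sketched as below. The user ids, the 50/50 split, and the fixed seed are assumptions made for the illustration; a real test would also record an outcome per user and compare the two groups.

```python
import random


def assign_variant(rng):
    """Randomly pick which version of the asset a user is shown."""
    return rng.choice(["A", "B"])


rng = random.Random(42)  # fixed seed so the sketch is repeatable
users = [f"user_{i}" for i in range(1000)]  # hypothetical user ids

# Each user is shown exactly one of the two versions, chosen at random.
assignment = {user: assign_variant(rng) for user in users}

# Count how many users ended up in each group.
counts = {v: sum(1 for a in assignment.values() if a == v) for v in ("A", "B")}
```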
What does it mean for a dependent variable to be binary?
(this always applies in logistic regression) - correct answers.A binary variable is a
categorical variable that can take only one of two values, usually represented as a
Boolean (True or False) or an integer (0 or 1): yes or no, sick or not sick,
obese or not obese, etc. Which value it takes depends on the independent variables.
----- is used when you're looking to segment or categorize a dataset into groups
based on similarities, but aren't sure what those groups should be. - correct
answers.Cluster Analysis
Preprocessing (of data) - correct answers.the process of transforming raw data into an
understandable format
Bounce Rate - correct answers.the percentage of visitors to a particular website who
navigate away from the site after viewing only one page.
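The bounce-rate definition above amounts to a simple ratio. A minimal sketch, assuming each visit is summarized by a hypothetical count of pages viewed:

```python
def bounce_rate(pages_per_visit):
    """Percentage of visits in which only one page was viewed."""
    if not pages_per_visit:
        return 0.0  # no visits recorded, so report 0 rather than divide by zero
    bounces = sum(1 for pages in pages_per_visit if pages == 1)
    return 100.0 * bounces / len(pages_per_visit)


visits = [1, 3, 1, 5, 2, 1, 1, 4]  # hypothetical page counts, one per visit
rate = bounce_rate(visits)  # 4 of the 8 visits viewed a single page
```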
Logistic Regression - correct answers.A statistical analysis which determines an
individual's risk of the outcome as a function of a risk factor. The outcome of interest
has two categories (yes or no, obese or not obese, at risk of cancer or not at risk of
cancer, happens or does not happen, etc.).
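How a logistic model turns a risk factor into the probability of a two-category outcome can be sketched with the logistic (sigmoid) function. The intercept and slope here are made-up values, not fitted to any data, and the 0.5 cut-off is a common convention assumed for the example.

```python
import math


def predict_probability(x, intercept, slope):
    """Logistic model: P(outcome = 1 | x) = 1 / (1 + exp(-(intercept + slope * x)))."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))


# Hypothetical coefficients chosen only for illustration.
intercept, slope = -4.0, 0.1

p_low = predict_probability(20, intercept, slope)   # linear term is -2
p_mid = predict_probability(40, intercept, slope)   # linear term is 0
p_high = predict_probability(60, intercept, slope)  # linear term is +2

# Turn the probability into one of the two categories with a 0.5 cut-off.
label = 1 if p_high >= 0.5 else 0
```

The probability rises with the risk factor and crosses 0.5 where the linear term is zero.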
K-means clustering - correct answers.Informally, the goal is to find groups of points that
are close to each other but far from points in other groups.
• Each cluster is defined entirely and only by its centre, or mean value µ_k
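A minimal sketch of the usual k-means iteration (often called Lloyd's algorithm) shows how each cluster is carried only by its centre. The points, the starting centres, and the fixed iteration count are assumptions for the illustration.

```python
def kmeans(points, centres, iterations=10):
    """Alternate between assigning points to centres and recomputing the means."""
    for _ in range(iterations):
        clusters = [[] for _ in centres]
        for p in points:
            # Assign the point to the nearest centre (squared Euclidean distance).
            nearest = min(
                range(len(centres)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centres[k])),
            )
            clusters[nearest].append(p)
        # Move each centre to the mean of its points; keep it if the cluster is empty.
        centres = [
            tuple(sum(coord) / len(c) for coord in zip(*c)) if c else centres[k]
            for k, c in enumerate(clusters)
        ]
    return centres, clusters


# Hypothetical 2-D points forming two visible groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centres, clusters = kmeans(points, centres=[(1, 1), (8, 8)])
```

The result depends on the starting centres; practical implementations usually try several random starts and stop once the assignments no longer change.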
Random Forest - correct answers.An algorithm used for regression or classification that
uses a collection of decision trees; the trees "vote" on the best prediction.
Examples of when to use Random Forest - correct answers.In HC: to identify the correct
combination of components in medicine and to analyze a patient's medical history to
identify diseases (for example using symptoms to predict whether a person's symptoms
are more closely tied to malaria or a simple fever, another example can be a cold or a
sinus infection).
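The voting idea can be sketched with the malaria versus simple fever example above. The three "trees" are hand-written rules standing in for fitted decision trees, and the symptom fields are hypothetical; a real random forest learns its trees from random samples of the training data.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical stand-ins for fitted trees: each looks at different symptoms.
trees = [
    lambda s: "malaria" if s["fever_days"] > 3 else "simple fever",
    lambda s: "malaria" if s["chills"] else "simple fever",
    lambda s: "malaria" if s["travel"] and s["fever_days"] > 1 else "simple fever",
]

patient = {"fever_days": 2, "chills": True, "travel": True}  # made-up record
votes = [tree(patient) for tree in trees]
prediction = majority_vote(votes)
```

For regression the same collection of trees would average its numeric outputs instead of voting.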
Centroid Clustering - correct answers.Clusters are represented by their centroids,
e.g. hierarchical clustering with cluster distance defined by a centroid (assigned center).
----- has many applications in diverse fields such as face recognition, computer vision,
image compression, bioinformatics, and fraud detection. - correct answers.PCA
Density Clustering - correct answers.detecting areas where points are concentrated and
where they are separated by areas that are empty or sparse.
Data Wrangling is - correct answers.the process of removing errors and combining
complex data sets to make them more accessible and easier to analyze.
Data Wrangling Examples - correct answers.Merging several data sources into one
data-set for analysis.
Identifying gaps or empty cells in data and either filling or removing them.
Deleting irrelevant or unnecessary data.
Identifying severe outliers in data and either explaining the inconsistencies or deleting
them to facilitate analysis.
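The wrangling steps listed above can be sketched on two tiny sources. The records, the use of the median to fill gaps, and the outlier threshold are all assumptions chosen for the illustration.

```python
from statistics import median

# Two hypothetical data sources; None marks an empty cell.
source_a = [{"id": 1, "temp": 20.5, "note": "x"}, {"id": 2, "temp": None, "note": "y"}]
source_b = [{"id": 3, "temp": 21.0, "note": "z"}, {"id": 4, "temp": 500.0, "note": "w"}]

# Merge several data sources into one data set.
merged = source_a + source_b

# Delete irrelevant data: drop the "note" field.
trimmed = [{"id": r["id"], "temp": r["temp"]} for r in merged]

# Identify gaps and fill them, here with the median of the known values.
known = [r["temp"] for r in trimmed if r["temp"] is not None]
typical = median(known)
filled = [{**r, "temp": typical if r["temp"] is None else r["temp"]} for r in trimmed]

# Identify severe outliers and delete them (arbitrary threshold for this sketch).
MAX_DEVIATION = 100.0
cleaned = [r for r in filled if abs(r["temp"] - typical) <= MAX_DEVIATION]
```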
Maintaining Databases examples/purpose - correct answers.Routines meant to help
performance, free up disk space, check for data errors, check for hardware faults,
update internal statistics, and many other obscure (but important) things.
Maintaining DB2® and Oracle databases involves updating statistics, monitoring
database, server, and space utilization, and planning backup and recovery strategies.
This is an example of -----? - correct answers.Maintaining databases
Project initiation falls under what role? - correct answers.Project Sponsor
How to find p-value? - correct answers.-look at the alternative hypothesis
-if less than, find the area to the left of the z value
-if greater than, find the area to the right of the z value
-if not equal to: if z is positive, find the area to the right and double it; if z is
negative, find the area to the left and double it.
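The rules above can be sketched for a z statistic using the standard normal distribution. The function names and the labels for the alternative ("less", "greater", "not equal") are assumptions made for this example.

```python
import math


def normal_cdf(z):
    """Area to the left of z under the standard normal curve."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_value(z, alternative):
    """p-value of a z statistic for the given alternative hypothesis."""
    if alternative == "less":
        return normal_cdf(z)  # area to the left of z
    if alternative == "greater":
        return 1.0 - normal_cdf(z)  # area to the right of z
    if alternative == "not equal":
        # Double the tail area beyond |z|; this covers both the positive case
        # (right tail) and the negative case (left tail) by symmetry.
        return 2.0 * (1.0 - normal_cdf(abs(z)))
    raise ValueError("alternative must be 'less', 'greater', or 'not equal'")


p_two_sided = p_value(1.96, "not equal")  # close to 0.05
p_right = p_value(1.645, "greater")       # close to 0.05
```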
creating tables and establishing relationships between those tables according to rules
designed both to protect the data and to make the database more flexible by eliminating
redundancy and inconsistent dependency. - correct answers.normalization
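A small sketch of the idea, using Python's built-in sqlite3 module: customer details live in one table and orders refer to them by key, so a change is made in one place only. The table names, columns, and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the relationship between tables

# Customer details are stored once; each order points at a customer by id.
conn.executescript(
    """
    CREATE TABLE customers (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        city TEXT NOT NULL
    );
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        item TEXT NOT NULL
    );
    """
)
conn.execute("INSERT INTO customers VALUES (1, 'Ada', 'London')")
conn.executemany(
    "INSERT INTO orders (customer_id, item) VALUES (?, ?)",
    [(1, "keyboard"), (1, "mouse")],
)

# The city is updated in one row, and every order sees the new value.
conn.execute("UPDATE customers SET city = 'Paris' WHERE id = 1")
rows = conn.execute(
    "SELECT o.item, c.name, c.city FROM orders o "
    "JOIN customers c ON c.id = o.customer_id ORDER BY o.id"
).fetchall()
```

Had the city been copied into every order row, the update would need to touch each copy, and a missed row would leave the data inconsistent.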
OpenLayers - correct answers.