Guaranteed Success
What is the difference between scripting and programming languages used in data analytics?
Scripting languages are typically interpreted, and programming languages are typically compiled.
Describe the methods used to validate models. Cross-validation and testing on new data.
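A minimal sketch of k-fold cross-validation, assuming a tiny made-up data set and a baseline "model" that always predicts the training mean. The names `k_fold_indices`, `cross_validate`, `fit_mean`, and `predict_mean` are illustrative only, and real work would normally shuffle the rows before splitting.

```python
from statistics import mean

def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(xs, ys, k, fit, predict):
    """Return the mean squared error measured on each held-out fold."""
    scores = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        model = fit(train_x, train_y)  # train on the other k-1 folds
        errors = [(predict(model, xs[i]) - ys[i]) ** 2 for i in test_idx]
        scores.append(mean(errors))    # score on the fold the model never saw
    return scores

# Hypothetical baseline model: always predict the mean of the training targets.
def fit_mean(train_x, train_y):
    return mean(train_y)

def predict_mean(model, x):
    return model

xs = list(range(10))
ys = [2 * x + 1 for x in xs]  # made-up data
fold_scores = cross_validate(xs, ys, k=5, fit=fit_mean, predict=predict_mean)
print(fold_scores)  # one error score per fold
```

Averaging the per-fold scores gives a single estimate of how the model might perform on new data.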
A technique that allows us to predict an outcome based on a set of predictor variables.
Regression
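As an illustration, here is a sketch of the simplest case, a least-squares line with one predictor. The data (`hours`, `scores`) and the function name `fit_line` are made up for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data: exam score rises by 2 points per hour studied.
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]

slope, intercept = fit_line(hours, scores)
predicted = slope * 6 + intercept  # predict the outcome for a new predictor value
print(slope, intercept, predicted)
```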
Regression analysis in which a value is modeled as a function of time. Trend Analysis
A statistical tool that deals with a sequence of data in chronological order. This is a technique
that looks for trends in data over time. Time Series
Breaking time series data into components (such as trend, seasonality, and residual). Its
procedures are used in time series to describe the reasons for variations in trends. Decomposition
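A rough sketch of an additive decomposition (series = trend + seasonal + residual), assuming the seasonal period is known and odd so that a plain centered moving average can estimate the trend. The function names and the data are hypothetical.

```python
def moving_average(series, window):
    """Centered moving average; window is assumed odd. Ends are left as None."""
    half = window // 2
    out = [None] * len(series)
    for i in range(half, len(series) - half):
        out[i] = sum(series[i - half:i + half + 1]) / window
    return out

def decompose_additive(series, period):
    """Split a series into trend, seasonal, and residual components."""
    trend = moving_average(series, period)
    detrended = [y - t if t is not None else None for y, t in zip(series, trend)]
    # Average the detrended values at each position within the period.
    seasonal_means = []
    for pos in range(period):
        vals = [d for i, d in enumerate(detrended)
                if d is not None and i % period == pos]
        seasonal_means.append(sum(vals) / len(vals))
    # Center the seasonal effects so they sum to zero over one period.
    offset = sum(seasonal_means) / period
    seasonal_means = [s - offset for s in seasonal_means]
    seasonal = [seasonal_means[i % period] for i in range(len(series))]
    residual = [y - t - s if t is not None else None
                for y, t, s in zip(series, trend, seasonal)]
    return trend, seasonal, residual

# Made-up series: upward trend (0, 1, 2, ...) plus a repeating pattern (+1, 0, -1).
pattern = [1, 0, -1]
series = [i + pattern[i % 3] for i in range(9)]
trend, seasonal, residual = decompose_additive(series, period=3)
print(trend)
print(seasonal)
print(residual)
```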
What graph shows how the variation in a series is spread across frequencies, as the
frequency-domain counterpart of the time-domain autocovariance? Spectral Density
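One common estimate of the spectral density is the periodogram. The sketch below computes it with a plain discrete Fourier transform on a made-up sine wave; `periodogram` is an illustrative name, and the scaling used (squared magnitude divided by the series length) is one of several conventions.

```python
import cmath
import math

def periodogram(series):
    """Power at each frequency index k = 0..n/2, from a plain DFT."""
    n = len(series)
    mean = sum(series) / n
    centered = [y - mean for y in series]  # remove the mean before transforming
    power = []
    for k in range(n // 2 + 1):
        coef = sum(y * cmath.exp(-2j * math.pi * k * t / n)
                   for t, y in enumerate(centered))
        power.append(abs(coef) ** 2 / n)
    return power

# Made-up series: a sine wave completing 4 cycles over 32 observations.
series = [math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
power = periodogram(series)
peak_index = max(range(len(power)), key=lambda k: power[k])
print(peak_index)  # the frequency index holding the most power
```

Plotting `power` against the frequency index gives the graph the card describes: a spike at the frequency where the series repeats.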
Describe machine learning and artificial intelligence... Machine learning involves using
algorithms and statistical models to analyze and draw inferences from patterns in data. It
focuses on the development of computer programs that can access data and use it to learn for
themselves.
A technique in which the analyst wants to assign an item to a specific category based on various
conditions. Classification
A technique used when the groupings are unknown and the analyst wishes to determine whether the
objects belong to any group. Clustering
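The card does not name an algorithm. As one illustration, here is a sketch of k-means on one-dimensional data, assuming fixed starting centers so the result is repeatable. The function name and data are made up.

```python
def k_means_1d(values, centers, iterations=10):
    """Alternate between assigning points to the nearest center and re-averaging."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Made-up data with two obvious groups; the labels are not given to the algorithm.
values = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
centers, groups = k_means_1d(values, centers=[0.0, 5.0])
print(centers)
print(groups)
```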
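Relates the probability of a hypothesis, given the observed data, to the probability of
observing the data, given the hypothesis. It gives you the after-the-data (posterior) probability
of a hypothesis as a function of the likelihood of the data and the before-the-data (prior)
probability of the hypothesis. Bayes' Theorem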
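A small worked example, with made-up numbers: a condition affects 1% of people, a test detects it 90% of the time, and it gives a false positive 5% of the time. The function name `posterior` and its parameters are illustrative.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(hypothesis | data) = P(data | hypothesis) * P(hypothesis) / P(data)."""
    # P(data): the data can arise when the hypothesis is true or when it is false.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# Made-up numbers: 1% prior, 90% detection rate, 5% false positive rate.
p = posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05)
print(round(p, 3))  # about 0.154
```

Even after a positive test, the hypothesis stays fairly unlikely here because the prior was so small.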
Named after Thomas Bayes, an algorithm that applies Bayes' theorem, under the assumption that
the predictors are independent of one another, to estimate the conditional probability of an
outcome. Naive Bayes
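A minimal sketch of a Naive Bayes classifier for categorical features, assuming add-one (Laplace) smoothing so unseen values do not produce a zero probability. The data set and the function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Count how often each feature value occurs with each class label."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (label, position) -> value counts
    vocab = defaultdict(set)               # position -> distinct values seen
    for row, label in zip(rows, labels):
        for pos, value in enumerate(row):
            feature_counts[(label, pos)][value] += 1
            vocab[pos].add(value)
    return class_counts, feature_counts, vocab

def predict_naive_bayes(model, row):
    """Pick the label with the highest log prior plus summed log likelihoods."""
    class_counts, feature_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for pos, value in enumerate(row):
            seen = feature_counts[(label, pos)][value]
            # Add-one smoothing; features are treated as independent given the label.
            score += math.log((seen + 1) / (count + len(vocab[pos])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up data: (weather, temperature) -> whether people stayed in or went out.
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
        ("rainy", "cool"), ("sunny", "hot"), ("rainy", "cool")]
labels = ["out", "out", "in", "in", "out", "in"]

model = train_naive_bayes(rows, labels)
print(predict_naive_bayes(model, ("sunny", "hot")))
print(predict_naive_bayes(model, ("rainy", "cool")))
```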
Used when an analyst attempts to find out if the variables themselves group in any meaningful
way. It is a data reduction method that reduces the dimensionality of large data sets by
transforming a large set of variables into a smaller one that still contains most of the
information in the large set. Principal Component Analysis (PCA)
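A sketch of finding only the first principal component, using power iteration on the covariance matrix. Power iteration is swapped in here to keep the example dependency-free; libraries usually use an eigendecomposition or SVD instead. The data and the function name are made up, and the all-ones starting vector is assumed not to be orthogonal to the component being sought.

```python
import math

def first_principal_component(rows, iterations=100):
    """Return the direction of greatest variance and each row's score along it."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered variables.
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    v = [1.0] * d  # assumed starting vector
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project each centered row onto the component: one score replaces d variables.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores

# Made-up data: the second variable is exactly twice the first.
rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
component, scores = first_principal_component(rows)
print(component)
print(scores)
```

Because the two variables move together, one score per row captures all of the variation, which is the sense in which PCA shrinks a large set of variables into a smaller one.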
Reduces the number of variables and the amount of data, so that you deal with a single score
(or a few scores) rather than multiple scores or a lot of data. Dimensionality Reduction
Simply reducing the amount or volume of data held in storage or in a database. Data
Reduction
An algorithm that groups similar objects into groups called clusters, building a hierarchy of
nested clusters. Hierarchical Clustering
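A sketch of the bottom-up (agglomerative) form with single linkage on one-dimensional points: every point starts as its own cluster, and the two closest clusters are merged until the requested number remains. The function name and data are illustrative, and single linkage is only one of several possible merge rules.

```python
def single_linkage(points, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[p] for p in points]  # start with one cluster per point
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                dist = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Made-up points with three visible groups.
points = [1.0, 1.2, 5.0, 5.1, 9.0]
clusters = single_linkage(points, n_clusters=3)
print(clusters)
```

Recording the order of the merges would give the tree (dendrogram) that makes the method hierarchical.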
The identification of rare items, events or observations in a dataset which differ from the norm
or raise suspicions. Anomaly Detection
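One simple approach, sketched below, flags values whose z-score (distance from the mean in standard deviations) exceeds a threshold. The threshold of 2.0, the function name, and the readings are assumptions made for the example.

```python
from statistics import mean, stdev

def find_anomalies(values, threshold=2.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Made-up sensor readings with one value that differs from the norm.
readings = [10, 11, 9, 10, 12, 10, 11, 9, 10, 50]
anomalies = find_anomalies(readings)
print(anomalies)
```

A large outlier inflates the standard deviation itself, so the threshold often needs tuning, or a more robust measure such as the median can be used.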
An algorithm that mimics the operations of a human brain to recognize relationships between vast
amounts of data. Neural Network
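A sketch of the smallest building block, a single artificial neuron (a perceptron), learning the logical AND relationship from examples. Real networks stack many such units and train them differently; the function names, learning rate, and epoch count here are assumptions.

```python
def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum of the inputs exceeds zero, else stay off (0)."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

def train_perceptron(samples, epochs=10, learning_rate=1):
    """Perceptron rule: nudge the weights toward each example it gets wrong."""
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias

# Training examples for logical AND: the output is 1 only when both inputs are 1.
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(samples)
outputs = [predict(weights, bias, inputs) for inputs, _ in samples]
print(outputs)
```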
A type of neural network with many layers, capable of performing text classification. Deep Learning