IBM Data Science - (Google IBM) Exam With
Complete Solutions 100% Accurate
6 practices of EDA - ANSWER The process of exploring data sets and summarizing
their important features, generally by using methods such as data wrangling and
visualization. The six practices used by INSTRUCTOR: discover, structure, clean, join,
validate, present
Box plot - ANSWER A visualization that shows the locality, spread, and skew of groups
of values within quartiles
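A box plot is drawn from a five-number summary. As a minimal sketch (the function name five_number_summary is hypothetical, and the "inclusive" quartile method is only one of several conventions), the numbers behind the box could be computed like this:

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    # method="inclusive" assumes the data already contains the extreme values;
    # other quartile conventions can give slightly different cut points.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

summary = five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9])
iqr = summary[3] - summary[1]  # interquartile range: the length of the box
```

The box spans Q1 to Q3, the line inside it marks the median, and the whiskers extend toward the minimum and maximum.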
Categorical Data - ANSWER Data which is separated into a finite number of qualitative
groups
Collective outliers - ANSWER A subset of anomalous points, demonstrating a similar
behaviour and far away from the remainder of the population
Contextual outliers - ANSWER Data points that would be regarded as normal under
certain conditions but become anomalies under most other conditions
Discrete - ANSWER A mathematical concept indicating that a measure or dimension
has a finite and countable number of outcomes. Typically whole numbers, but not always:
shoe size is an example of a discrete measure with non-integer values
Continuous - ANSWER A mathematical concept indicating that a measure or dimension
has an infinite and uncountable number of outcomes
Docstrings - ANSWER Short for documentation strings: a group of text that describes
what a method or function does
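For illustration, a small Python function with a docstring might look like the sketch below. The function celsius_to_fahrenheit is a made-up example, not part of the course material.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

# The docstring is stored on the function object and is what help() displays.
doc = celsius_to_fahrenheit.__doc__
```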
Dummy variables - ANSWER Variables that have values of 0 or 1 which indicate if
something is or is not present
Global Outliers - ANSWER Values that are completely unlike the rest of the group of
data and do not relate to other outliers
Heat map - ANSWER A graphical representation that depicts the magnitude of an instance
or set of values using two colors
Histogram - ANSWER A data visualization that shows a rough approximation of the
distribution of values in a dataset
Input validation - ANSWER The process of closely examining and rechecking data to
ensure it is complete, error-free, and high quality
Int64 - ANSWER An example of a typical integer data type; it represents integers
between roughly negative nine quintillion and positive nine quintillion (-2^63 to 2^63 - 1).
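The "nine quintillion" limits come from storing a signed integer in 64 bits. A minimal sketch of checking those bounds in Python, where fits_in_int64 is a hypothetical helper name:

```python
import struct

INT64_MIN = -(2 ** 63)     # -9,223,372,036,854,775,808
INT64_MAX = 2 ** 63 - 1    #  9,223,372,036,854,775,807

def fits_in_int64(n):
    """Return True if n can be stored as a signed 64-bit integer."""
    try:
        struct.pack("<q", n)  # "q" is the format code for a signed 8-byte integer
        return True
    except struct.error:
        return False
```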
JSON file - ANSWER A file used for storing and saving data in a format as it would
appear in JavaScript.
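As a short sketch of how JSON text maps to and from Python objects, using the standard json module (the record contents are invented for illustration):

```python
import json

record = {"name": "Ada", "scores": [90, 85], "active": True}

text = json.dumps(record)     # Python dict -> JSON string
restored = json.loads(text)   # JSON string -> Python dict
```

Note that Python's True is written as true in the JSON text, matching JavaScript syntax.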
Label Encoding - ANSWER It is a technique for the transformation of data where each
category is assigned a unique number instead of a qualitative value.
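A minimal label-encoding sketch in plain Python follows. The helper label_encode is hypothetical, and assigning codes in alphabetical order is an assumption; libraries may order categories differently.

```python
def label_encode(values):
    """Map each distinct category to an integer code (alphabetical order)."""
    codes = {category: i for i, category in enumerate(sorted(set(values)))}
    return [codes[v] for v in values], codes

encoded, mapping = label_encode(["red", "green", "blue", "green"])
```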
Joining - ANSWER A way of integrating two (or more) different data frames along an
identified starting column(s)
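To make the idea concrete, here is a rough sketch of an inner join on a shared key column, with each table represented as a list of dictionaries. The inner_join helper and the sample data are assumptions for illustration; data frame libraries provide their own join or merge functions.

```python
def inner_join(left, right, key):
    """Combine rows from two tables (lists of dicts) that share a key value."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    # Keep only left rows whose key also appears in the right table.
    return [{**l, **r} for l in left for r in index.get(l[key], [])]

customers = [{"id": 1, "name": "Ana"}, {"id": 2, "name": "Ben"}]
orders = [{"id": 1, "total": 30}, {"id": 1, "total": 12}, {"id": 3, "total": 5}]
joined = inner_join(customers, orders, "id")
```

Only id 1 appears in both tables, so the result contains Ana's two orders and nothing for Ben or for the unmatched order.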
One-hot encoding - ANSWER A data transformation technique that converts one
categorical variable into several binary variables
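A minimal sketch of one-hot encoding in plain Python, where one_hot_encode is a hypothetical helper. Each resulting 0/1 column is a dummy variable in the sense defined earlier.

```python
def one_hot_encode(values):
    """Turn one categorical column into one 0/1 column per category."""
    categories = sorted(set(values))
    return {c: [1 if v == c else 0 for v in values] for c in categories}

columns = one_hot_encode(["cat", "dog", "cat", "bird"])
```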
Second-party data - ANSWER Data that was collected outside your organization but
directly from the original source
Third-party data - ANSWER Data collected outside your organization and aggregated
A/B Testing - ANSWER A method of testing two variants of something in order to decide
which one works better
Addition rule (for mutually exclusive events) - ANSWER The idea that if the events A and
B are mutually exclusive, then the probability of A or B occurring is equal to the sum of
the probabilities of A and B
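A small worked example of the addition rule, assuming one roll of a fair six-sided die (the variable names are illustrative):

```python
from fractions import Fraction

# "Roll a 1" and "roll a 2" cannot both happen on one roll,
# so the two events are mutually exclusive.
p_one = Fraction(1, 6)
p_two = Fraction(1, 6)
p_one_or_two = p_one + p_two  # P(A or B) = P(A) + P(B)
```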
Bayes' Theorem - ANSWER A mathematical formula stating that, for any two events
A and B, the probability of A given B equals the probability of A multiplied by the
probability of B given A, divided by the probability of B; also known as Bayes' rule.
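The formula can be sketched as a one-line function. The numbers below are made up for illustration (a 1% prior, a 90% chance of B when A is true, a 5% chance of B when A is false), and P(B) is obtained from the law of total probability.

```python
from fractions import Fraction

def bayes(p_a, p_b_given_a, p_b):
    """Return P(A|B) = P(A) * P(B|A) / P(B)."""
    return p_a * p_b_given_a / p_b

p_a = Fraction(1, 100)
p_b_given_a = Fraction(9, 10)
p_b_given_not_a = Fraction(5, 100)
# Law of total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A)
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
posterior = bayes(p_a, p_b_given_a, p_b)
```

With these assumed inputs the posterior works out to 2/13, about 15%, even though B is far more likely when A is true.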
Binomial distribution - ANSWER A discrete distribution modeling the probability of
events with two possible outcomes: success or failure
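As a sketch, the probability of exactly k successes in n independent trials can be computed directly. The helper name binomial_pmf is hypothetical; the example uses four fair coin flips.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# Chance of exactly two heads in four fair coin flips.
p_two_heads = binomial_pmf(2, 4, 0.5)
```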
Central Limit Theorem - ANSWER The idea that the sampling distribution of the mean
is approximately normally distributed if the sample size is sufficiently large
(a common rule of thumb is a sample size of at least 30)
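A rough simulation can illustrate the idea. The sketch below assumes a uniform(0, 1) population, which is not normal, and draws 2,000 samples of size 30; the seed and the counts are arbitrary choices.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of n draws from a uniform(0, 1) population."""
    return statistics.fmean(rng.random() for _ in range(n))

means = [sample_mean(30) for _ in range(2000)]
center = statistics.fmean(means)  # expected to be near the population mean, 0.5
spread = statistics.stdev(means)  # expected to be near sqrt(1/12) / sqrt(30)
```

A histogram of means would look roughly bell-shaped even though the underlying population is flat.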
Classical Probability - ANSWER A probability based on the analysis of the logical
structure of outcomes with equally likely events.
Cluster Random Sample - ANSWER A probability sampling method that first divides the
population into clusters and then randomly selects certain clusters, including in the
sample all the members from those selected clusters
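A minimal sketch of cluster random sampling, assuming the population is already grouped into named clusters. The cluster_sample helper and the school data are invented for illustration.

```python
import random

def cluster_sample(clusters, k, seed=None):
    """Randomly pick k clusters and return every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), k)
    return chosen, [member for name in chosen for member in clusters[name]]

schools = {
    "north": ["a", "b"],
    "south": ["c", "d", "e"],
    "east": ["f"],
    "west": ["g", "h"],
}
chosen, members = cluster_sample(schools, k=2, seed=1)
```

The randomness applies to which clusters are chosen; within a chosen cluster, no member is left out.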
Complement of an event - ANSWER In statistics, the complement of an event is the event
not occurring