descriptive statistics - Answers numerical data used to measure and describe characteristics of
groups. Includes measures of central tendency and measures of variation.
inferential statistics - Answers draws conclusions about a population based on sample data from
that population.
Statistic - Answers a number that describes a sample
Parameter - Answers a number that describes a population
time series - Answers a time-ordered sequence of observations taken at regular intervals
Which measure of center can be used for a categorical (qualitative) random variable? - Answers
Mode
If an observation is added to the low end of a distribution, the standard deviation - Answers
Increases
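A quick way to check the answer above is to compute the standard deviation before and after adding a low-end value. The sketch below uses made-up numbers and the population standard deviation (`pstdev`); both choices are assumptions for illustration, not part of the original card.

```python
from statistics import pstdev

# Hypothetical observations, invented for this example.
base = [10, 12, 14, 16, 18]

# Add one observation well below the rest of the data.
with_low = base + [2]

sd_before = pstdev(base)
sd_after = pstdev(with_low)

# The new point sits far from the mean, so the spread grows.
print(sd_before < sd_after)
```

The same comparison holds with the sample standard deviation (`stdev`) for this data.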
Interpret what it means for an observation from this distribution to have a Z-score of -2. -
Answers This observation is 2 standard deviations below average.
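To make the Z-score interpretation concrete, here is a minimal sketch on made-up data. The data set and the helper name `z_score` are hypothetical, and the population standard deviation is assumed.

```python
from statistics import mean, pstdev

# Hypothetical data with mean 5 and population standard deviation 2.
values = [2, 4, 4, 4, 5, 5, 7, 9]

mu = mean(values)
sigma = pstdev(values)

def z_score(x):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

# An observation of 1 is 4 units below the mean, i.e. 2 standard deviations.
print(z_score(1))
```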
coefficient of variation - Answers The standard deviation expressed as a percent of the mean.
Used to compare the variation of two distributions with very different means.
Kurtosis - Answers Measure of the fatness of the tails of a probability distribution relative to
that of a normal distribution. Indicates likelihood of extreme outcomes.
What is the baseline of kurtosis? - Answers 3, the kurtosis of a normal distribution
If an investor wanted a less risky stock, they should choose the one with - Answers Higher
return to risk
Return to risk - Answers 1/CV (the reciprocal of the coefficient of variation)
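The coefficient of variation and return to risk can be sketched as follows. The two return series are invented for illustration, the function names are hypothetical, and the sample standard deviation (`stdev`) is assumed because the cards do not say which version to use.

```python
from statistics import mean, stdev

# Hypothetical yearly returns (in percent) for two stocks with the same mean.
stock_a = [4, 6, 5, 7, 3]
stock_b = [1, 12, -2, 9, 5]

def coefficient_of_variation(xs):
    """Standard deviation expressed as a percent of the mean."""
    return stdev(xs) / mean(xs) * 100

def return_to_risk(xs):
    """Mean divided by standard deviation: the reciprocal of CV as a ratio."""
    return mean(xs) / stdev(xs)

for name, xs in [("A", stock_a), ("B", stock_b)]:
    print(name, round(coefficient_of_variation(xs), 1), round(return_to_risk(xs), 2))
```

Under these assumptions stock A has the smaller CV and the higher return to risk, so it would be the less risky choice. Note that the reciprocal relationship uses CV as a ratio, not as a percent.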
Kurtosis characteristics - Answers =3, tails similar to a normal distribution
>3, more extreme values
<3, fewer extreme values
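The comparison against 3 can be illustrated with a small sketch. It assumes the moment-based definition of kurtosis (fourth central moment divided by the squared variance), and the two data sets are made up. Some software reports excess kurtosis (kurtosis minus 3) instead, so check which version a given tool returns.

```python
def kurtosis(xs):
    """Moment-based kurtosis: m4 / m2**2 (a normal distribution gives 3)."""
    n = len(xs)
    mu = sum(xs) / n
    m2 = sum((x - mu) ** 2 for x in xs) / n  # population variance
    m4 = sum((x - mu) ** 4 for x in xs) / n  # fourth central moment
    return m4 / m2 ** 2

# Hypothetical data: evenly spread values, so no extreme outcomes.
flat = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Hypothetical data: mostly identical values plus two extreme ones.
heavy = [5, 5, 5, 5, 5, 5, 5, 5, -20, 30]

print(kurtosis(flat) < 3)   # fewer extreme values than a normal distribution
print(kurtosis(heavy) > 3)  # more extreme values than a normal distribution
```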
> mean(billing)
[1] NA - Answers There is at least one missing value in "billing"
The summary() function applied to a categorical variable will return - Answers A count of each
unique value
The summary() function applied to a numerical variable will return - Answers A five-number
summary and the mean
Identifier variable - Answers A unique number is assigned to each transaction in a database. We
would say that this transaction number is an identifier variable.
Data lake - Answers collection of structured, semi-structured, and unstructured data stored in a
single location
data warehouse - Answers stores structured data that is ready for data analytics.
Big data - Answers describes data sets so large that traditional methods of storage and analysis
are inadequate.
categorical nominal - Answers name of a category
ex. gender, hair color
categorical ordinal - Answers categories have a natural ordering
quantitative discrete - Answers countable, jumpable, whole numbers
quantitative continuous - Answers Measuring something with an infinite number of possible values
ex. weight, decimals
panel data - Answers A stack of time series for multiple observation units.
frequency vs relative frequency - Answers frequency refers to how many individuals are within a
certain category, whereas relative frequency is the frequency divided by the total number of
individuals.
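The distinction between frequency and relative frequency can be sketched with a short count. The hair-color sample below is made up for illustration.

```python
from collections import Counter

# Hypothetical categorical sample.
hair = ["brown", "black", "brown", "blond", "brown", "black", "red", "brown"]

# Frequency: how many individuals fall in each category.
freq = Counter(hair)

# Relative frequency: each count divided by the total number of individuals.
total = sum(freq.values())
rel_freq = {category: count / total for category, count in freq.items()}

for category in freq:
    print(category, freq[category], rel_freq[category])
```

The relative frequencies always sum to 1, which makes them easier to compare across samples of different sizes.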
Pie chart - Answers Categorical (qualitative) data. Displays parts of a whole. Not good when
there are too many categories. Don't ever make 3-d or tilted. Not good for comparisons.
bar graph - Answers Categorical (qualitative) data. Can be horizontal or vertical. Can display
parts of a whole or separate values. For nominal data, put bars in ascending order or
descending order. For ordinal data, put bars in order of categories. Bars should be same width;
heights are the thing to compare
line graph - Answers Quantitative data changing over time. Time should go on the horizontal
axis and variable on the vertical axis. Use different lines to denote separate categories or
groups. Beware of plotting on different scales.
histogram - Answers Works for medium to large quantitative data sets. Bins touch. Useful
because you can visualize the shape of distribution, even multimodality. Not good for side-by-