1. Chapter 1: Basics
Measurement Levels
Type      Distinction  Order  Unit  Origin
Nominal   x
Ordinal   x            x
Interval  x            x      x
Ratio     x            x      x     x
Distinction: able to distinguish categories
Order: can be ordered
Unit: has a unit
Origin: has an origin
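Statistics For Nominal/Ordinal Data
● Frequency: the number of times a category occurs
● Mode: the value that is observed the most
● Median: the middle value of the ordered observations (ordinal data only, since nominal
categories cannot be ordered)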
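The three statistics above can be sketched in a few lines of Python. This is only an illustration: the `answers` data and the `levels` ordering are made up, and the median is taken for an odd number of observations so that a single middle value exists.

```python
from collections import Counter

# Hypothetical ordinal data: survey answers with a natural order.
levels = ["low", "medium", "high"]  # assumed order of the categories
answers = ["low", "high", "medium", "medium", "high", "medium", "low"]

# Frequency: the number of times each category occurs.
frequency = Counter(answers)

# Mode: the category that is observed the most.
mode = frequency.most_common(1)[0][0]

# Median: order the observations by category rank, then take the middle one.
# This step needs an ordering, so it only applies to ordinal data.
ranked = sorted(answers, key=levels.index)
median = ranked[len(ranked) // 2]  # valid for an odd number of observations

print(dict(frequency), mode, median)
```

For nominal data the `levels` list would have no meaning, so only the frequency and the mode carry over.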
Plots For Nominal/Ordinal Data
● Bar chart
● Pie chart
Statistics For Interval/Ratio Data
● Mean: the average
● Mode: the value that is observed the most
● Median: the middle value
● Quantile: a cut point that divides the ordered data into intervals holding equal shares of
the observations (e.g. the quartiles at 25%, 50% and 75%)
● Range: the difference between the largest and smallest value
● Interquartile Range (IQR): the difference between the third and the first quartile
● Mean Absolute Deviation (MAD): the average absolute distance of the observations from
the mean
● Mean Squared Deviation (MSD): the average squared distance of the observations from
the mean
● Variance (S²): how far the observations are spread out around their mean, in squared units
● Standard Deviation (SD): the square root of the variance; the amount of variation in the
units of the data
● Skewness: the degree of asymmetry of the distribution around its mean
○ > 0 : skewed to the right (longer right tail)
○ < 0 : skewed to the left (longer left tail)
● Kurtosis: how much heavier or lighter the tails of a distribution are than the tails of a
normal distribution (excess kurtosis)
○ > 0 : heavy tails
○ < 0 : light tails
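As a rough sketch of how the statistics in this list relate to each other, the following Python computes them with the standard library only. The `data` values are made up. Two conventions are assumptions rather than something these notes state: MSD divides by n while the variance S² divides by n - 1, and skewness and kurtosis use the moment-based definitions with 3 subtracted from the kurtosis (excess kurtosis). Quartiles depend on the interpolation method; the default of `statistics.quantiles` is used here.

```python
import math
import statistics as st

data = [2, 4, 4, 5, 7, 9, 11]  # hypothetical interval/ratio sample
n = len(data)

mean = st.mean(data)
median = st.median(data)
mode = st.mode(data)

# Quartiles: three cut points that split the ordered data into four parts.
q1, q2, q3 = st.quantiles(data, n=4)
data_range = max(data) - min(data)
iqr = q3 - q1

# Deviations of every observation from the mean.
deviations = [x - mean for x in data]
mad = sum(abs(d) for d in deviations) / n   # mean absolute deviation
msd = sum(d ** 2 for d in deviations) / n   # mean squared deviation (assumed: divide by n)
variance = st.variance(data)                # sample variance (divide by n - 1)
sd = math.sqrt(variance)                    # standard deviation

# Moment-based skewness and excess kurtosis (assumed definitions).
m3 = sum(d ** 3 for d in deviations) / n
m4 = sum(d ** 4 for d in deviations) / n
skewness = m3 / msd ** 1.5
excess_kurtosis = m4 / msd ** 2 - 3

print(mean, median, mode, data_range, iqr)
print(mad, msd, variance, sd, skewness, excess_kurtosis)
```

With this sample the skewness comes out positive (a longer right tail) and the excess kurtosis negative (lighter tails than a normal distribution), which matches the sign rules above.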
Plots For Interval/Ratio Data
● Box plot
● Histogram plot
● Density plot
● Scatter plot
2. Chapter 2: Sampling
Representative Sample
A sample that has approximately the same distribution characteristics as the population.
● Simple Random Sampling: each unit in the population has the same probability of
ending up in a sample
● Systematic Sampling: the ordered population is divided into n equally sized groups, then
one random index is drawn and the unit at that index is taken from each group
● Stratified Sampling: the population is divided into n groups (strata), then a random sample
of a fixed percentage of units is taken from each group
● Cluster Sampling: the population is divided into clusters, then a random sample of the
clusters is taken
○ Single-stage: a random sample of clusters is taken and all units in every sampled
cluster are included
○ Multi-stage: the units within the sampled clusters are also randomly sampled
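The sampling schemes above can be sketched with Python's `random` module. Everything concrete here is an assumption made for illustration: a population of 100 numbered units, a sample size of 10, two strata of 60 and 40 units, and clusters of 10 consecutive units.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

population = list(range(100))  # hypothetical population of 100 unit ids
n = 10                         # desired sample size

# Simple random sampling: every unit has the same chance of being selected.
simple = random.sample(population, n)

# Systematic sampling: n consecutive groups of size k, one random index,
# and the unit at that index is taken from every group.
k = len(population) // n
start = random.randrange(k)
systematic = population[start::k]

# Stratified sampling: the same fraction is drawn at random from each stratum.
strata = {"A": population[:60], "B": population[60:]}  # assumed strata
fraction = 0.1
stratified = [
    unit
    for units in strata.values()
    for unit in random.sample(units, round(len(units) * fraction))
]

# Cluster sampling: first draw a random sample of whole clusters.
clusters = [population[i:i + 10] for i in range(0, len(population), 10)]
chosen = random.sample(clusters, 3)

# Single-stage: keep all units of every sampled cluster.
single_stage = [unit for cluster in chosen for unit in cluster]

# Multi-stage: also sample units within each sampled cluster.
multi_stage = [unit for cluster in chosen for unit in random.sample(cluster, 4)]

print(simple, systematic, stratified, single_stage, multi_stage, sep="\n")
```

Note the difference in what is randomized: stratified sampling draws from every group, while cluster sampling draws only some groups and then uses all or part of each.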
Non-representative Sample
A sample that doesn’t have approximately the same distribution characteristics as the
population.
● Convenience Sampling: a sample that is easy to obtain, e.g. psychology studies that
generalize results from an easily recruited sample to the entire population
● Haphazard Sampling: a sample that may look random but is not actually drawn at
random
● Purposive Sampling: a sample that is picked for a specific purpose, e.g. measuring
customer satisfaction