Topic: Use: Formula:
The mean of a set of numbers x1, …, xk is the sum of the set divided by its size.
Formula: $\frac{x_1 + \cdots + x_k}{k} = \frac{\sum_{i=1}^{k} x_i}{k}$
The median of a set of numbers $x_1 \le \cdots \le x_k$ (arranged in ascending order) is:
• If k is odd, the $\frac{k+1}{2}$th observation.
• If k is even, the average of the $\frac{k}{2}$th and $\frac{k+2}{2}$th observations.
The mode is the most frequent outcome from a set of observations.
• Typically used in the analysis of categorical data (e.g. gender, nationality, degree program)
• 1 mode: unimodal
• 2 modes: bimodal
• 3+ modes: multimodal
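As a rough illustration of the three definitions, here is a minimal Python sketch. The function names (`mean`, `median`, `modes`) and the example data are assumptions made for this illustration; they are not part of the original sheet.

```python
from collections import Counter

def mean(xs):
    # Sum of the set divided by its size.
    return sum(xs) / len(xs)

def median(xs):
    s = sorted(xs)          # arrange in ascending order
    k = len(s)
    if k % 2 == 1:
        # (k+1)/2-th observation (1-based), so index (k+1)/2 - 1
        return s[(k + 1) // 2 - 1]
    # average of the k/2-th and (k+2)/2-th observations (1-based)
    return (s[k // 2 - 1] + s[(k + 2) // 2 - 1]) / 2

def modes(xs):
    # All outcomes that share the highest frequency (1 = unimodal, 2 = bimodal, ...).
    counts = Counter(xs)
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]

data = [2, 3, 3, 5, 7, 10]   # illustrative data
print(mean(data), median(data), modes(data))
```

Because `modes` only counts occurrences, it also works on categorical data such as a list of nationalities.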
Range The difference between the largest and the smallest observations.
• +: easy to calculate.
• -: depends on only two values in the sample. This is problematic, especially when there are extreme observations.
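The sensitivity to extreme observations can be seen in a small Python sketch. The helper name `value_range` and the numbers are made up for illustration.

```python
def value_range(xs):
    # Largest observation minus smallest observation.
    return max(xs) - min(xs)

sample = [4, 5, 6, 7]            # illustrative data
with_outlier = sample + [100]    # same data plus one extreme observation

print(value_range(sample))        # 3
print(value_range(with_outlier))  # 96
```

A single extreme value moves the range from 3 to 96 even though the other observations are unchanged.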
Population variance The average of the squared differences between each observation and the population mean: the sum of the squared differences between each observation and the population mean, divided by the population size.
Formula: $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
Population Standard deviation The positive square root of the variance.
Formula: $\sigma = \sqrt{\sigma^2}$
Population Coefficient of Variation Expresses dispersion as a percentage of the mean.
Formula: $CV = \frac{\sigma}{\mu}$ (if $\mu > 0$)
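The three population measures build on each other, which a short Python sketch can make concrete. The function names and the data are assumptions for illustration; the data is treated as a complete population.

```python
import math

def population_variance(xs):
    N = len(xs)
    mu = sum(xs) / N                               # population mean
    return sum((x - mu) ** 2 for x in xs) / N      # divide by N

def population_std(xs):
    # Positive square root of the variance.
    return math.sqrt(population_variance(xs))

def coefficient_of_variation(xs):
    mu = sum(xs) / len(xs)
    if mu <= 0:
        raise ValueError("CV is only used here when the mean is positive")
    return population_std(xs) / mu

population = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative data, mean 5
print(population_variance(population))        # 4.0
print(population_std(population))             # 2.0
print(coefficient_of_variation(population))   # 0.4, i.e. 40% of the mean
```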
Sample variance The sum of the squared differences between each observation and the sample mean, divided by the sample size minus 1.
Formula: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
We divide by n - 1 instead of n because this gives an unbiased estimator of the population variance. (You don't need to understand this for now.)
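To see the n - 1 divisor in action, here is a minimal Python sketch. The function name `sample_variance` and the data are illustrative assumptions; the result is cross-checked against `statistics.variance` from the standard library, which also uses the n - 1 divisor.

```python
import statistics

def sample_variance(xs):
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    x_bar = sum(xs) / n                                   # sample mean
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)    # divide by n - 1

sample = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative data
print(sample_variance(sample))       # sum of squares 32 divided by 7
print(statistics.variance(sample))   # standard-library equivalent
```

For the same data, dividing by n would give 4.0; dividing by n - 1 gives a slightly larger value.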
Frequency distributions Rule 1: Classes are non-overlapping and inclusive.
Rule 2: Define k, the number of classes.
Rule 3: The class width is given by: $\frac{\text{Range}}{k} = \frac{\text{Largest Number} - \text{Smallest Number}}{\text{Number of Intervals}}$
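The three rules can be sketched in Python as follows. This is one possible reading, not a prescribed procedure: it assumes classes of the form [lower, upper), with the last class also including the largest observation so that every value falls in exactly one class. The names `class_width` and `frequency_table` are hypothetical.

```python
def class_width(xs, k):
    # Rule 3: range divided by the number of classes.
    return (max(xs) - min(xs)) / k

def frequency_table(xs, k):
    lo = min(xs)
    width = class_width(xs, k)
    if width == 0:
        raise ValueError("all observations are equal; no classes to form")
    counts = [0] * k
    for x in xs:
        # Rule 1: each value goes to exactly one class; the maximum
        # is placed in the last class instead of a (k+1)-th one.
        idx = min(int((x - lo) / width), k - 1)
        counts[idx] += 1
    return [(lo + i * width, lo + (i + 1) * width, counts[i]) for i in range(k)]

data = [1, 2, 2, 3, 5, 6, 8, 9, 9, 11]   # illustrative data
for lower, upper, freq in frequency_table(data, k=5):   # Rule 2: k = 5
    print(lower, upper, freq)
```

With this data the width is 2.0 and the class frequencies are 3, 1, 2, 1, 3, which are the bar heights a histogram of these classes would show.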
Histogram A histogram is a bar graph where each bar corresponds to a class and its height is proportional to the frequency in that class.
Numerical Summaries for Grouped Data: Population
$N = \sum_{i=1}^{k} f_i$
$\mu = \frac{\sum_{i=1}^{k} f_i m_i}{N}$
$\sigma^2 = \frac{\sum_{i=1}^{k} f_i (m_i - \mu)^2}{N} = \frac{\sum_{i=1}^{k} f_i m_i^2}{N} - \mu^2 = \sum_{i=1}^{k} \frac{f_i}{N} m_i^2 - \mu^2$
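A short Python sketch can check that the definitional and shortcut forms of the grouped population variance agree. It assumes the usual reading of the symbols, with f_i as the frequency of class i and m_i as its midpoint; the function name and the data are illustrative.

```python
def grouped_population_summary(midpoints, freqs):
    N = sum(freqs)                                                   # N = sum of f_i
    mu = sum(f * m for f, m in zip(freqs, midpoints)) / N            # grouped mean
    # Definitional form: weighted squared deviations from mu.
    var_definition = sum(f * (m - mu) ** 2 for f, m in zip(freqs, midpoints)) / N
    # Shortcut form: mean of squared midpoints minus mu squared.
    var_shortcut = sum(f * m ** 2 for f, m in zip(freqs, midpoints)) / N - mu ** 2
    return N, mu, var_definition, var_shortcut

midpoints = [2, 4, 6, 8, 10]   # illustrative class midpoints
freqs = [3, 1, 2, 1, 3]        # illustrative class frequencies
print(grouped_population_summary(midpoints, freqs))
```

Both variance values come out at about 10.4 here, up to floating-point rounding.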
Numerical Summaries for Grouped Data: Sample
$n = \sum_{i=1}^{k} f_i$
$\bar{x} = \frac{\sum_{i=1}^{k} f_i m_i}{n}$
$s^2 = \frac{\sum_{i=1}^{k} f_i (m_i - \bar{x})^2}{n - 1} = \frac{n}{n - 1} \sum_{i=1}^{k} \frac{f_i}{n} (m_i - \bar{x})^2$
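The sample version differs from the population version only in the n - 1 divisor, as this minimal Python sketch shows. As before, f_i is read as the class frequency and m_i as the class midpoint; the function name and the data are assumptions for illustration.

```python
def grouped_sample_summary(midpoints, freqs):
    n = sum(freqs)                                                   # n = sum of f_i
    if n < 2:
        raise ValueError("need at least two observations")
    x_bar = sum(f * m for f, m in zip(freqs, midpoints)) / n         # grouped sample mean
    # First form: weighted squared deviations divided by n - 1.
    s2 = sum(f * (m - x_bar) ** 2 for f, m in zip(freqs, midpoints)) / (n - 1)
    # Second form: relative frequencies f_i / n, rescaled by n / (n - 1).
    s2_alt = n / (n - 1) * sum((f / n) * (m - x_bar) ** 2 for f, m in zip(freqs, midpoints))
    return n, x_bar, s2, s2_alt

midpoints = [2, 4, 6, 8, 10]   # illustrative class midpoints
freqs = [3, 1, 2, 1, 3]        # illustrative class frequencies
print(grouped_sample_summary(midpoints, freqs))
```

For this data both forms give 104 / 9, roughly 11.56.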
Skewness Finding the shape of the distribution.
Sample Skewness Right skewness → Skewness > 0
Left skewness → Skewness < 0
Typically, right (left) skewness implies mean > (<) median > (<) mode.
Formula: $\text{Skewness} = \frac{n}{(n - 1)(n - 2)} \sum_{i=1}^{n} \frac{(x_i - \bar{x})^3}{s^3}$
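The sign convention can be checked with a small Python sketch of the sample skewness formula, where s is the sample standard deviation (n - 1 divisor). The function name and the data sets are illustrative assumptions.

```python
import math

def sample_skewness(xs):
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three observations")
    x_bar = sum(xs) / n
    s = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))   # sample std
    if s == 0:
        raise ValueError("skewness is undefined when all observations are equal")
    return n / ((n - 1) * (n - 2)) * sum(((x - x_bar) / s) ** 3 for x in xs)

right_skewed = [1, 2, 2, 3, 3, 3, 20]        # long tail to the right
left_skewed = [-x for x in right_skewed]     # mirror image, long tail to the left

print(sample_skewness(right_skewed))   # positive
print(sample_skewness(left_skewed))    # negative
```

In `right_skewed` the mean (about 4.86) lies above the median and mode (both 3), in line with the typical ordering stated above.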
Sample Covariance Measure of linear relationship between two variables.
Formula: $\operatorname{cov}(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$
• $\operatorname{cov}(x, y) < 0$ → negative linear association
• $\operatorname{cov}(x, y) > 0$ → positive linear association
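A minimal Python sketch of the sample covariance, with one increasing and one decreasing example to show the sign. The function name and the data are assumptions for illustration.

```python
def sample_covariance(xs, ys):
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of products of deviations, divided by n - 1.
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

xs = [1, 2, 3, 4, 5]
ys_up = [2, 4, 6, 8, 10]      # y rises with x
ys_down = [10, 8, 6, 4, 2]    # y falls as x rises

print(sample_covariance(xs, ys_up))     # 5.0  -> positive linear association
print(sample_covariance(xs, ys_down))   # -5.0 -> negative linear association
```

Note that the size of the covariance depends on the units of x and y, so only its sign is easy to interpret.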
Sample Coefficient of Correlation The coefficient of correlation ranges from -1 to +1 and
• rxy = 1 indicates a perfect positive linear relationship;
• rxy = 0 indicates no linear relationship;
• rxy = -1 indicates a perfect negative linear relationship.
Formula: $r_{xy} = \frac{\operatorname{cov}(x, y)}{s_x s_y}$
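The three reference values of the correlation coefficient can be reproduced with a short Python sketch. The function names and the data are illustrative assumptions; s_x and s_y are the sample standard deviations (n - 1 divisor).

```python
import math

def sample_std(xs):
    n = len(xs)
    x_bar = sum(xs) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))

def sample_covariance(xs, ys):
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    # r_xy = cov(x, y) / (s_x * s_y); assumes neither variable is constant.
    return sample_covariance(xs, ys) / (sample_std(xs) * sample_std(ys))

xs = [-2, -1, 0, 1, 2]
print(correlation(xs, [2 * x + 1 for x in xs]))    # close to +1: perfect positive line
print(correlation(xs, [-3 * x for x in xs]))       # close to -1: perfect negative line
print(correlation(xs, [x ** 2 for x in xs]))       # 0.0: no linear relationship
```

The last case is worth noting: y = x² is fully determined by x, yet r is 0, because the coefficient only measures linear association.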