Intro Info
- Make informed decisions
- The key element is data
- Overview of Topics:
- Descriptive Statistics: Describing and summarizing the data.
- Inferential Statistics: Generalizing the results to the population. The collected data
most likely only resembles the true reality.
- Basic Statistics (Statistical methods):
- Why do we have different statistical methods?
- Because the data come in different types. For example, blood pressure is measured
over a range of numbers, whereas a heart attack is a dichotomy: it either happened
or did not happen.
- For each type, a different analysis method is suitable.
Most important self-study to do:
- Video clips
- Course notes
- Book
Other:
- Lectures: link the videos and topics together
- No new material will be discussed in the lectures. Lectures will be recorded.
- Training: just practice, not mandatory
- Seminars: Discussions of the statistical problems
Final:
- Exam: revise clips, seminars and do formative questions.
- Team assignment
Descriptive Statistics
Lecture 1
Descriptive statistics: Turning Data into information
Week 1: Descriptive Statistics: parts
- Part 1: How to describe and summarize a single variable
- Part 2: How to describe and summarize the relation between two quantitative variables
Part 1:
Descriptive statistics: describing and summarizing a single variable.
- Types of data: nominal, ordinal, interval, ratio
- Depending on the level of measurement, we decide what type of summary statistic
is suitable
- Graphs: Histograms, bar charts
- Theoretical distribution: skewed, symmetric
- Measures of variability: Variance, standard deviation
- Standardized score: Z score.
- Frequency tables: For unordered (nominal) data, the cumulative percent cannot be
obtained. For data ordered from small to large, we can obtain the cumulative percent.
- Cumulative Frequency
- Histogram: the areas of the bars add up to 100%
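A frequency table with a cumulative percent column can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_table` and the ordinal scores are made up, and the values are assumed to be orderable from small to large.

```python
from collections import Counter

def frequency_table(values):
    """Return rows of (value, count, percent, cumulative percent), ordered small to large."""
    counts = Counter(values)
    total = len(values)
    rows = []
    cumulative = 0
    for value in sorted(counts):
        cumulative += counts[value]
        rows.append((
            value,
            counts[value],
            100 * counts[value] / total,   # percent of all observations
            100 * cumulative / total,      # cumulative percent up to this value
        ))
    return rows

# Hypothetical ordinal scores (1 = low ... 4 = high), invented for illustration.
scores = [1, 2, 2, 3, 3, 3, 4, 4, 2, 3]
for value, count, pct, cum_pct in frequency_table(scores):
    print(f"{value}: n={count}, {pct:.0f}%, cumulative {cum_pct:.0f}%")
```

The last row always reaches a cumulative percent of 100, which only makes sense because the values have an order.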
➔ Describing the shape of the distribution in the graph:
- Right/Positive skew
- Symmetric
- Left/negative skew
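The three shapes can also be told apart numerically. The sketch below uses the moment-based skewness coefficient (third central moment divided by the second central moment to the power 1.5), which is one common definition and not necessarily the one used in the course. The function names, the tolerance `tol` and the data are assumptions for illustration.

```python
from statistics import fmean

def skewness(values):
    """Moment-based skewness: m3 / m2**1.5, using population moments."""
    mean = fmean(values)
    m2 = fmean([(x - mean) ** 2 for x in values])
    m3 = fmean([(x - mean) ** 3 for x in values])
    return m3 / m2 ** 1.5

def describe_shape(values, tol=0.1):
    """Label the shape; tol is an arbitrary cut-off chosen for this sketch."""
    g = skewness(values)
    if g > tol:
        return "right/positive skew"
    if g < -tol:
        return "left/negative skew"
    return "symmetric"

# Made-up data: a long tail to the right, a mirrored copy, and a symmetric set.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 10]
left_tailed = [-x for x in right_tailed]
balanced = [1, 2, 3, 4, 5]

for name, data in [("right_tailed", right_tailed),
                   ("left_tailed", left_tailed),
                   ("balanced", balanced)]:
    print(name, "->", describe_shape(data))
```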
➔ Summary measures for a Single Variable are typically expressed by:
1. Measures of central tendency (or measures of center):
- Mean
- Median
- Mode
2. Measures of spread:
- Range
- IQR
- Standard Deviation (SD): How far the scores are from the mean
(how much the scores deviate from the center). With a low SD, the data cluster around
the mean, at the central peak of the graph. When comparing two groups with the same
average, the mean alone is not enough, so we need another measure for the comparison:
the standard deviation.
(If two means are identical, we compare the standard deviations and choose the group
with the smaller one, since its scores are more consistent.)
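As a small sketch of these measures, the Python standard library `statistics` module can compute all of them. The helper `summarize` and the two groups are hypothetical. The groups are constructed to have the same mean but a different spread. Note that quartile conventions differ between tools, so the IQR from `statistics.quantiles` may not match other software exactly.

```python
import statistics

def summarize(values):
    """Measures of center and spread for a single variable."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles (default 'exclusive' method)
    return {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "sd": statistics.stdev(values),  # sample standard deviation
    }

# Two made-up groups with identical means but different variability.
group_a = [48, 49, 50, 50, 51, 52]
group_b = [30, 40, 50, 50, 60, 70]

for name, group in [("A", group_a), ("B", group_b)]:
    s = summarize(group)
    print(f"group {name}: mean={s['mean']:.1f}, sd={s['sd']:.2f}, "
          f"range={s['range']}, iqr={s['iqr']:.2f}")
```

Both groups have the same mean, so only the spread measures show that group A is more tightly clustered around its center.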
➔ Measure of Center:
➔ Summary of the Distribution: Boxplot:
- Applies to one variable at a time. If we have 2 variables, we can draw 2 box plots
side by side for comparison.
- It is a more quantitative way of describing the distribution of a variable.
- IQR = Q3 - Q1. The median lies between Q1 and Q3 (exactly in the middle only when the
distribution is symmetric).
- If the whiskers are of unequal length, with one noticeably longer than the other, the
distribution is skewed.
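The numbers behind a boxplot can be sketched as a five-number summary. This sketch assumes the simplest kind of boxplot, where the whiskers run to the minimum and maximum. Many plotting tools instead cap the whiskers at 1.5 times the IQR and draw the rest as outliers. The function names and the data are made up.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max)."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, median, q3, max(values)

def whisker_lengths(values):
    """Lengths of the lower and upper whiskers, assuming whiskers reach min and max."""
    low, q1, _, q3, high = five_number_summary(values)
    return q1 - low, high - q3

# Made-up data with a long tail to the right.
data = [1, 2, 2, 3, 3, 3, 4, 10]
low, q1, median, q3, high = five_number_summary(data)
lower, upper = whisker_lengths(data)

print(f"min={low}, Q1={q1}, median={median}, Q3={q3}, max={high}, IQR={q3 - q1}")
if upper > lower:
    print("upper whisker is longer: suggests a right/positive skew")
elif lower > upper:
    print("lower whisker is longer: suggests a left/negative skew")
else:
    print("whiskers are equal: consistent with a symmetric distribution")
```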
➔ Bell-Shaped Distribution (Normal Distribution):
- Characteristic 1: Symmetrical: in a normal distribution, the mean, median and mode are
equal.
- Characteristic 2: Empirical Rule (68/95/99.7% rule): about 68% of the values lie within
1 SD of the mean, about 95% within 2 SD, and about 99.7% within 3 SD.
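The empirical rule can be checked with `statistics.NormalDist` from the Python standard library, and it ties in with the Z score, which expresses a value as a number of standard deviations from the mean. The helpers `z_score` and `share_within`, and the example score, are assumptions made for this sketch.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Standardized score: how many SDs the value x lies from the mean."""
    return (x - mean) / sd

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Share of a normal distribution lying within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {share_within(k):.1%}")

# Hypothetical example: a score of 130 on a scale with mean 100 and SD 15.
z = z_score(130, mean=100, sd=15)
print(f"z = {z}, so the score lies {z} SDs above the mean")
```

Because a Z score is measured in SD units, the same three percentages apply to any normally distributed variable, whatever its original scale.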