Section 1: Introduction to Statistics and Probability Theory
Statistics is the science of analyzing data in whose generation chance has played some part
Central to modern science
o Why? Science involves lots of randomness. Ex: blood pressure
Because of the randomness, statistics is tied to probability theory
Our intuition is often not prepared to handle probability
o Coin flip problem
o Monty Hall problem
Example 1: Is there a difference in the mean blood pressure between men and women?
Men: 123, 142, 122, 161, 119, 127, 136
Women: 144, 118, 122, 131, 155, 152, 110
There are person-to-person differences: not all men and not all women have the same blood pressure
A week later, we probably would not get exactly the same set of values from the same
people
This is due to the randomness of the sampling procedure
Example 2: What is the effect of the amount of water given to a plant on its eventual growth height?
Random differences due to sun, soil, humidity, etc.
lead to varied data
Probability Theory:
Example 1: Coin Flip Question
Q: You flip two coins, and at least one of them is H. What is the probability that both are H?
A: 1/3.
H H
H T
T H
T T This cannot be a possibility because we know at least one must be H.
Left with 3 equally likely outcomes, each with probability 1/3; both H is one of them, so the answer is 1/3.
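The count of outcomes above can be checked by enumerating the whole sample space in a short script (a sketch; variable names like `at_least_one_h` are just illustrative):

```python
from itertools import product

# All equally likely outcomes of flipping two coins
outcomes = list(product("HT", repeat=2))  # [('H','H'), ('H','T'), ('T','H'), ('T','T')]

# Condition on "at least one H": this drops ('T','T')
at_least_one_h = [o for o in outcomes if "H" in o]

# Among the remaining outcomes, how many are both H?
both_h = [o for o in at_least_one_h if o == ("H", "H")]

print(len(both_h), "/", len(at_least_one_h))  # 1 / 3
```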
Example 2: Monty Hall Problem
Question: There are 3 doors, with a prize behind one. You pick a door; the host, who knows where the prize is, opens a different door with no prize behind it and offers to let you switch to the remaining door. Should you switch?
Outcome: You are twice as likely to get the prize by switching doors as by staying.
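The 2-to-1 advantage of switching can be checked by simulation. This is a sketch, not the canonical analysis; the helper name `monty_hall` is my own, and the host is modeled as always opening a non-prize, non-picked door:

```python
import random

def monty_hall(switch, trials=100_000):
    """Simulate the Monty Hall game; return the fraction of wins."""
    wins = 0
    for _ in range(trials):
        prize = random.randrange(3)
        pick = random.randrange(3)
        # Host opens a door that is neither the contestant's pick nor the prize
        opened = next(d for d in range(3) if d != pick and d != prize)
        if switch:
            # Switch to the one remaining unopened door
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += pick == prize
    return wins / trials

print("stay:  ", monty_hall(switch=False))   # about 1/3
print("switch:", monty_hall(switch=True))    # about 2/3
```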
Probability vs. Statistics:
Probability is deductive: it uses deductions, or implications
If this coin is fair, the probability of getting 1072 or more H in 2000 flips is 0.0006
Starts with some assumption about reality; calculates probabilities associated with possible data
Statistics is inductive: it uses inductions, or inferences
I flip the coin 2000x and get 1072 H (data). Based on this probability calculation (0.0006), I have good evidence that this coin is not fair.
Starts with data; makes some inference about reality
Statistical inference / induction: any conclusion that we draw from data derived in a situation involving chance, or randomness
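The quoted tail probability can be recomputed exactly for a fair coin. This sketch assumes the 0.0006 figure refers to the one-sided tail P(X ≥ 1072) in 2000 flips; small rounding differences from the quoted value are expected:

```python
from math import comb

n, k = 2000, 1072

# P(X >= k) for X ~ Binomial(n, 1/2): sum the upper tail of the pmf,
# using exact integer arithmetic before the final division
tail = sum(comb(n, i) for i in range(k, n + 1)) / 2**n

print(f"P(at least {k} heads in {n} fair flips) is about {tail:.4f}")
```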
Section 2: Probability Theory
Event: something which does or does not happen when some experiment is performed
Example 1: Ask 2000 people who they will vote for in the upcoming election
Events that could occur
More say they will vote for Hillary than Donald
> 1200 vote Hillary
1124 vote Donald
Event Notation: Uppercase letters
A is the event ___ happens.
B is the event ___ happens.
S = event sure to happen
∅ = empty event = impossible
Union of Events: (D ∪ E)
At least one of D and E occurs (D, E, or both)
Intersection of Events: (D ∩ E)
Both D and E occur
Complement of D: Dᶜ
D does not occur
Section 3: Probabilities of Events
Derived Events
Prob (D ∪ E) = Prob (D) + Prob (E) – Prob (D ∩ E)
Prob (Dc) = 1 – Prob (D)
Prob (S) = 1
Prob (∅) = 0
Mutually Exclusive Events
Two events are mutually exclusive if they cannot occur together
Must be dependent
Prob (D ∪ E) = Prob (D) + Prob (E)
Prob (D ∩ E) = 0, since (D ∩ E) = ∅
Independent events
Two events are independent if Prob (D ∩ E) = Prob (D) x Prob (E)
If two events are independent, and you know that one of them has occurred, it does not change the
probability of the other event from happening
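Both the independence definition and the fact that mutually exclusive events must be dependent can be verified by exact enumeration over two die rolls. The events D, E, A, and B below are my own illustrative choices:

```python
from itertools import product
from fractions import Fraction

# Sample space for two fair die rolls: 36 equally likely outcomes
rolls = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event (a predicate on an outcome)."""
    return Fraction(sum(1 for r in rolls if event(r)), len(rolls))

D = lambda r: r[0] == 3          # first roll shows 3   (illustrative event)
E = lambda r: r[1] % 2 == 0      # second roll is even  (illustrative event)

# Independent: Prob(D ∩ E) = Prob(D) x Prob(E)
assert prob(lambda r: D(r) and E(r)) == prob(D) * prob(E)

# Mutually exclusive events, e.g. "sum is 2" and "sum is 12", are dependent:
A = lambda r: sum(r) == 2
B = lambda r: sum(r) == 12
assert prob(lambda r: A(r) and B(r)) == 0                  # cannot occur together
assert prob(lambda r: A(r) and B(r)) != prob(A) * prob(B)  # so not independent
print("checks passed")
```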
Example Probability Calculation
Q: A fair six-sided die is to be rolled twice. What is the probability that the sum of the two numbers to turn up is 6?
A: The sum can be 6 in 5 mutually exclusive ways:
1 and 5, 2 and 4, 3 and 3, 4 and 2, 5 and 1
The number to turn up on the first roll is independent of the number to turn up on the second roll. Thus, the probability of each of the above 5 events is: 1/6 × 1/6 = 1/36.
Prob (Sum = 6) = 1/36 + 1/36 + 1/36 + 1/36 + 1/36 = 5/36
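The same 5/36 answer falls out of a brute-force enumeration of all 36 equally likely pairs (a sketch; the variable names are illustrative):

```python
from itertools import product
from fractions import Fraction

rolls = list(product(range(1, 7), repeat=2))  # 36 equally likely pairs

# The mutually exclusive ways the sum can be 6
favorable = [r for r in rolls if sum(r) == 6]

print(favorable)                     # [(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)]
print(Fraction(len(favorable), 36))  # 5/36
```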
Conditional Probabilities
The probability of some event D, given that some other event E has already occurred.
Denoted Prob (D|E)
Prob (D|E) = Prob (D ∩ E) / Prob (E)
If events D and E are independent, Prob (D|E) = Prob (D)
If Prob (D|E) ≠ Prob (D), then D and E are not independent
Mutually exclusive events must be dependent
Coin Flip Example:
Q: A fair coin is to be flipped twice. What is the probability of a H on both flips, given that there
is at least one H?
D: both H → HH
E: at least one H → HH, HT, TH
(D ∩ E) = D
A: Prob (D|E) = Prob (D ∩ E) / Prob (E) = (1/2)(1/2) / [3(1/2)(1/2)] = (1/4) / (3/4) = 1/3
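The 1/3 answer can also be checked by simulation: generate many pairs of fair flips, keep only those with at least one H, and see how often both are H (a sketch; the counter names are illustrative):

```python
import random

trials = 100_000
e_count = 0   # times E (at least one H) occurs
d_count = 0   # times D (both H) occurs among those

for _ in range(trials):
    flips = (random.choice("HT"), random.choice("HT"))
    if "H" in flips:          # condition on E
        e_count += 1
        if flips == ("H", "H"):
            d_count += 1

# Conditional relative frequency approximates Prob(D|E)
print(d_count / e_count)  # close to 1/3
```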
Section 4: Probability, One Discrete Random Variable
A random variable (RV) is either discrete or continuous
Discrete RV – a conceptual and numerical quantity that, in some future experiment involving
chance, or randomness, will take one value from some discrete set of possible values
can only take one of a discrete set of numbers
Number of H we will get tomorrow if we flip n coins
the probabilities of these possible values can be known or unknown
A RV is “conceptual” because it is something that is only in our mind
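As a concrete discrete RV: let X be the number of H in 3 fair coin flips, so X takes one value from the discrete set {0, 1, 2, 3}, and here its probabilities are known. A short enumeration recovers them (a sketch; `dist` is an illustrative name):

```python
from itertools import product
from fractions import Fraction

n = 3
outcomes = list(product("HT", repeat=n))   # 8 equally likely sequences

# Build the distribution of X = number of H
dist = {}
for seq in outcomes:
    x = seq.count("H")
    dist[x] = dist.get(x, 0) + Fraction(1, len(outcomes))

for x in sorted(dist):
    print(x, dist[x])
# 0 1/8
# 1 3/8
# 2 3/8
# 3 1/8
```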