1. Regression, correlation and hypothesis testing
1. 1. Exponential models
Regression lines can be used to model linear relationships between two variables. Sometimes
experimental data doesn't fit a linear model but still shows a pattern. In that case, logarithms and coding
can be used to turn the relationship into a linear one and examine the trend.
- Data in the form y = ax^n → log y = log a + n log x (code using Y = log y and X = log x)
Some data can be modelled by exponential relationships of the form y = kb^x. For this, the data
has to be coded using Y = log y and X = x to get a linear relationship.
- If y = kb^x → log y = log k + x log b
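The two codings above can be tried out with a short sketch. It fits a least-squares line to the coded data and converts the intercept and gradient back into the model constants. The function names and the data are made up for illustration, and base-10 logs are an assumption (any base works if used consistently).

```python
import math

def fit_line(xs, ys):
    """Least-squares line Y = c + m*X; returns (c, m)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    c = mean_y - m * mean_x
    return c, m

def fit_exponential(xs, ys):
    """Fit y = k * b**x by coding Y = log y, X = x."""
    c, m = fit_line(xs, [math.log10(y) for y in ys])
    return 10 ** c, 10 ** m  # k = 10^intercept, b = 10^gradient

def fit_power(xs, ys):
    """Fit y = a * x**n by coding Y = log y, X = log x."""
    c, m = fit_line([math.log10(x) for x in xs],
                    [math.log10(y) for y in ys])
    return 10 ** c, m  # a = 10^intercept, n = gradient

# Made-up data lying exactly on y = 3 * 2**x and y = 5 * x**1.5
xs = [1, 2, 3, 4, 5]
k, b = fit_exponential(xs, [3 * 2 ** x for x in xs])
a, n = fit_power(xs, [5 * x ** 1.5 for x in xs])
```

Because the made-up data lie exactly on each curve, the fitted constants come back as the ones used to generate it (up to rounding error).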
1. 2. Measuring correlation
The strength of linear correlation between two variables can be quantified and measured with
the product moment correlation coefficient (r), which will be between -1 and 1.
- r = -1: perfect negative linear correlation
- r < 0: negative correlation
- r = 0: no linear correlation
- r > 0: positive correlation
- r = 1: perfect positive linear correlation
The closer r is to 1 or -1, the stronger the positive or negative correlation respectively.
PMCC (r) is calculated on the calculator: Menu 6 → 2 → OPTN → 4.
If a data value is missing (n/a), ignore that data pair; don't substitute 0 for it.
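As a check on the calculator, r can also be computed by hand. The sketch below uses the standard formula r = Sxy / √(Sxx × Syy), which these notes do not state. The function name and data are made up, and missing values are represented by None as an assumption.

```python
import math

def pmcc(pairs):
    """Product moment correlation coefficient r for a list of (x, y) pairs.

    Pairs with a missing value (None) are left out, not replaced by 0.
    """
    clean = [(x, y) for x, y in pairs if x is not None and y is not None]
    n = len(clean)
    mean_x = sum(x for x, _ in clean) / n
    mean_y = sum(y for _, y in clean) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in clean)
    syy = sum((y - mean_y) ** 2 for _, y in clean)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in clean)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: the pair with a missing y is skipped,
# and the remaining points lie exactly on y = 2x.
data = [(1, 2), (2, 4), (3, None), (4, 8), (5, 10)]
r = pmcc(data)
```

Substituting 0 for the missing value would add the point (3, 0), which is far from the line and would pull r well below 1.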
1. 3. Hypothesis testing for 0 correlation
You can use a hypothesis test to determine whether the value of r for a particular sample indicates that there is
likely to be a linear relationship within the whole population.
- r is used for the PMCC of a sample
- ρ is used for the PMCC of a whole population
- To test whether ρ is greater than 0 or less than 0, use a one-tailed test, either
H0 : ρ = 0 H1 : ρ > 0
or H0 : ρ = 0 H1 : ρ < 0
- To test whether ρ is different from 0, use a two-tailed test:
H0 : ρ = 0 H1 : ρ ≠ 0
The critical region for r for the hypothesis test can be found using the table of critical values
(formula booklet). The critical region depends on the significance level and the sample size.
1. State H0 and H1
2. Note the sample size
3. Find the critical value from the significance level (remember: for a two-tailed test, halve the significance level for each tail)
4. Find r (what you're testing)
5. Reject or do not reject H0, and explain what this means in context
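The decision in the final step can be sketched as a small function. This is an illustration under assumptions: the function name and argument names are made up, and the critical value is passed in by hand because it must be read from the table of critical values. The 0.55 used below is a placeholder, not a tabulated value.

```python
def reject_h0(r, critical_value, alternative):
    """Return True if the sample r lies in the critical region for H0: rho = 0.

    alternative: "greater"   for H1: rho > 0
                 "less"      for H1: rho < 0
                 "two-sided" for H1: rho != 0
    critical_value: positive value from the table for the sample size,
                    using the halved significance level if two-tailed.
    """
    if alternative == "greater":
        return r > critical_value
    if alternative == "less":
        return r < -critical_value
    if alternative == "two-sided":
        return abs(r) > critical_value
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")

# Placeholder numbers for illustration only
print(reject_h0(0.62, 0.55, "greater"))     # r is above the critical value
print(reject_h0(-0.30, 0.55, "less"))       # r is not below -critical value
print(reject_h0(-0.62, 0.55, "two-sided"))  # |r| is above the critical value
```

Returning True means there is evidence at the chosen significance level to reject H0. The conclusion still has to be written out in the context of the question.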
2. Conditional probability
2. 1. Set notation
● Mutually exclusive: P(A ∪ B) = P(A) + P(B)
● Independent: P(A ∩ B) = P(A) × P(B) (given)
○ When events are not independent, use tree diagrams instead of Venn diagrams
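Both rules can be checked on a small example. The sketch below uses one roll of a fair six-sided die. The events A, B and C and the helper prob are made up for illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

outcomes = set(range(1, 7))  # one roll of a fair six-sided die

def prob(event):
    """Probability of an event (a set of outcomes) when all are equally likely."""
    return Fraction(len(event & outcomes), len(outcomes))

A = {2, 4, 6}     # roll is even
B = {1, 3}        # shares no outcomes with A
C = {1, 2, 3, 4}  # roll is at most 4

# Mutually exclusive: P(A ∪ B) = P(A) + P(B)
mutually_exclusive = prob(A | B) == prob(A) + prob(B)

# Independent: P(A ∩ C) = P(A) × P(C)
independent = prob(A & C) == prob(A) * prob(C)

# A and B are mutually exclusive, so they cannot also be independent here
a_b_independent = prob(A & B) == prob(A) * prob(B)
```

Here A and B satisfy the addition rule and A and C satisfy the multiplication rule, while A and B fail the multiplication rule because P(A ∩ B) = 0.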
2. 2. Conditional probability
The probability of an event can change depending on the outcome of previous
events; this is modelled with conditional probability.
● Probability of B given A occurs: P(B | A)
○ Probability of B given A does not occur: P(B | A')
● Independent events: P(A | B) = P(A | B') = P(A) and P(B | A) = P(B | A') = P(B)
○ Since A or B occurring doesn't affect the other
● P(B | A) = P(A ∩ B) / P(A) → P(A ∩ B) = P(A) × P(B | A) (given)
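Reasoning:
In a Venn diagram of A and B, label the regions a - i (A only), i (A ∩ B) and b - i (B only), so A contains a in total and B contains b in total.
P(B | A) = i / ((a - i) + i) = i / a
Since i represents A ∩ B and a represents A:
P(B | A) = P(A ∩ B) / P(A)
→ P(A ∩ B) = P(A) × P(B | A)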
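The reasoning can be checked with numbers. In the sketch below, a, b and i are the region labels used above, treated as counts. The counts and the total of 50 are hypothetical values chosen for illustration.

```python
from fractions import Fraction

# Hypothetical counts: 50 outcomes in total, a = 20 in A, b = 15 in B,
# i = 5 in both A and B
total, a, b, i = 50, 20, 15, 5

p_a = Fraction(a, total)        # P(A)
p_a_and_b = Fraction(i, total)  # P(A ∩ B)

# Restricting attention to A: i of the (a - i) + i outcomes in A are also in B
p_b_given_a = Fraction(i, (a - i) + i)

# The same value from the formula P(B | A) = P(A ∩ B) / P(A)
p_b_given_a_formula = p_a_and_b / p_a

print(p_b_given_a, p_b_given_a_formula)
```

Both routes give the same fraction, and multiplying P(A) by P(B | A) recovers P(A ∩ B).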
2. 3. In Venn diagrams