MA317 - Class 1 - Week 17
1. Six people each had the length of their right thumb and right middle finger
measured in cm. The results are given in the following table:
Person                       :  A   B   C   D   E   F
Length of thumb (cm)         :  6   7   9   8  10   8
Length of middle finger (cm) :  7  11   8  10   9  12
It is hypothesised that length of middle finger is linearly dependent on length
of thumb.
(a) State which of the above two variables is the predictor (regressor) and
which is the response variable in the above dataset. Justify your answer.
Solution:
Since it is hypothesised that ‘length of middle finger is linearly dependent
on length of thumb’, the middle finger length depends on the thumb length.
Hence, length of thumb is the predictor (x) and length of middle finger is
the response (y).
(b) Calculate the least-squares estimates of the y-intercept a and of the slope b
for the simple linear regression y = a + bx.
Solution:
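The LSEs are given by $\hat{a} = \bar{y} - \hat{b}\bar{x}$ and $\hat{b} = S_{xy}/S_{xx}$.
\[ \bar{x} = \frac{1}{6}(6 + 7 + 9 + 8 + 10 + 8) = 8 \]
\[ \bar{y} = \frac{1}{6}(7 + 11 + 8 + 10 + 9 + 12) = 9.5 \]
\[ S_{xy} = \sum_i x_i y_i - n\bar{x}\bar{y} \]
\[ = [(6 \times 7) + (7 \times 11) + (9 \times 8) + (8 \times 10) + (10 \times 9) + (8 \times 12)] - 6 \times 8 \times 9.5 = 1 \]
\[ S_{xx} = \sum_i x_i^2 - n\bar{x}^2 \]
\[ = 6^2 + 7^2 + 9^2 + 8^2 + 10^2 + 8^2 - 6 \times 8^2 = 10 \]
\[ \hat{b} = \frac{1}{10} = 0.1, \qquad \hat{a} = 9.5 - 8 \times 0.1 = 8.7 \]
Hence, $\hat{y} = 8.7 + 0.1x$.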
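The hand calculation above can be checked numerically. The following is a minimal Python sketch using only the standard library; the variable names (thumb, finger, s_xy, s_xx and so on) are illustrative choices and are not notation from the lecture notes.

```python
# Thumb (x) and middle-finger (y) lengths in cm for persons A-F.
thumb = [6, 7, 9, 8, 10, 8]
finger = [7, 11, 8, 10, 9, 12]

n = len(thumb)
x_bar = sum(thumb) / n
y_bar = sum(finger) / n

# S_xy = sum(x_i * y_i) - n * x_bar * y_bar
s_xy = sum(x * y for x, y in zip(thumb, finger)) - n * x_bar * y_bar
# S_xx = sum(x_i^2) - n * x_bar^2
s_xx = sum(x * x for x in thumb) - n * x_bar ** 2

b_hat = s_xy / s_xx            # least-squares slope estimate
a_hat = y_bar - b_hat * x_bar  # least-squares intercept estimate

print(f"x_bar = {x_bar}, y_bar = {y_bar}")
print(f"S_xy = {s_xy}, S_xx = {s_xx}")
print(f"b_hat = {b_hat:.1f}, a_hat = {a_hat:.1f}")
```

Because the slope is computed from the two sums of squares rather than from a library fitting routine, each printed quantity can be compared directly with the corresponding line of the solution.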
(c) Work out $SS_R$, the sum of squares of the residuals.
Solution:
\[ SS_R = \sum_i (y_i - \hat{y}_i)^2 = \sum_i (y_i - 8.7 - 0.1x_i)^2 \]
\[ = (7 - 8.7 - 0.1 \times 6)^2 + (11 - 8.7 - 0.1 \times 7)^2 + \dots + (12 - 8.7 - 0.1 \times 8)^2 = 17.4 \]
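To see all six terms of the sum rather than only the first two and the last, here is a short Python sketch. It takes the estimates 8.7 and 0.1 from part (b) as given; the names residuals and ss_r are illustrative.

```python
# Thumb (x) and middle-finger (y) lengths in cm for persons A-F.
thumb = [6, 7, 9, 8, 10, 8]
finger = [7, 11, 8, 10, 9, 12]
a_hat, b_hat = 8.7, 0.1  # estimates taken from part (b)

# Residual for each person: observed y minus fitted value a_hat + b_hat * x.
residuals = [y - (a_hat + b_hat * x) for x, y in zip(thumb, finger)]
ss_r = sum(r ** 2 for r in residuals)

for person, r in zip("ABCDEF", residuals):
    print(f"{person}: residual = {r:+.1f}, squared = {r ** 2:.2f}")
print(f"SS_R = {ss_r:.1f}")
```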
(d) Carry out a t-test where the null hypothesis is $H_0 : b = 0$ and the
alternative hypothesis is $H_1 : b \neq 0$. State your conclusion clearly.
Solution:
The test statistic we use, derived in Problem 3.2 in the Week 17 Lecture Notes,
is
\[ T = \hat{b}\sqrt{\frac{(n-2)S_{xx}}{SS_R}} = 0.1\sqrt{\frac{(6-2) \times 10}{17.4}} = 0.1516 \]
We compare this to the critical value $t_{\nu;1-\gamma/2}$ of the t-distribution,
where $\nu = n - 2 = 4$ is the degrees of freedom in our model and $\gamma$ is
the significance level. If we assume we wish to test at the 5% significance
level (equivalently, 95% confidence), then we have $\gamma = 0.05$. Hence, we
look up the value $t_{4;0.975} = 2.776$. As $|T| = 0.1516 < 2.776 = t_{4;0.975}$,
we fail to reject $H_0$ (the null hypothesis) at the 5% significance level:
the data give no significant evidence that the slope b differs from 0.
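The test can be sketched in Python as below. The inputs are the values found in parts (b) and (c), and the critical value 2.776 is the table value quoted in the solution rather than something computed here; the names t_stat, t_crit and reject_h0 are illustrative.

```python
import math

# Values carried over from parts (b) and (c).
n, s_xx, ss_r, b_hat = 6, 10.0, 17.4, 0.1
t_crit = 2.776  # t_{4;0.975}, table value quoted in the solution

# T = b_hat * sqrt((n - 2) * S_xx / SS_R)
t_stat = b_hat * math.sqrt((n - 2) * s_xx / ss_r)

# Two-sided test: reject H0 only if |T| exceeds the critical value.
reject_h0 = abs(t_stat) > t_crit

print(f"T = {t_stat:.4f}, critical value = {t_crit}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

Comparing the absolute value of T with the critical value matches the two-sided alternative in the question; for a one-sided alternative the comparison and the quantile would both change.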
(e) Compute a 95% confidence interval for the slope parameter b, and interpret
this interval.
Solution:
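A CI for b is given at the bottom of page 16 in the lecture notes (week 17):
\[ \left[ \hat{b} - t_{n-2;1-\gamma/2}\sqrt{\frac{SS_R}{(n-2)S_{xx}}},\ \hat{b} + t_{n-2;1-\gamma/2}\sqrt{\frac{SS_R}{(n-2)S_{xx}}} \right] \]
All values have been found previously (note we are using the same $\gamma = 0.05$
as in part (d) and so we know $t_{n-2;1-\gamma/2} = t_{4;0.975} = 2.776$).
Substituting everything in gives the 95% CI as [−1.731, 1.931].
Since this interval contains 0, the data are consistent with b = 0, which
agrees with the conclusion of the t-test in part (d).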
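The substitution can be reproduced with a few lines of Python. As before, this is only a sketch: the inputs are the values from the earlier parts, 2.776 is the quoted table value, and the names se_b, half_width, lower and upper are illustrative.

```python
import math

# Values carried over from parts (b), (c) and (d).
n, s_xx, ss_r, b_hat = 6, 10.0, 17.4, 0.1
t_crit = 2.776  # t_{4;0.975}, table value quoted in the solution

# sqrt(SS_R / ((n - 2) * S_xx)) is the estimated standard error of b_hat.
se_b = math.sqrt(ss_r / ((n - 2) * s_xx))
half_width = t_crit * se_b
lower, upper = b_hat - half_width, b_hat + half_width

print(f"95% CI for b: [{lower:.3f}, {upper:.3f}]")
print("interval contains 0:", lower <= 0 <= upper)
```

The final line checks whether 0 lies inside the interval, which is the same question the two-sided t-test in part (d) answers at the matching level.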