Summary

Advanced Statistics (MAT20306) summary ALL LECTURES

Uploaded on

18-01-2026

Written in

2025/2026

Wageningen University, Advanced Statistics (MAT20306): summary of ALL lectures, with lecture notes and the examples mentioned during the lectures. This document covers confidence intervals and hypothesis testing; sample size calculations and Wilcoxon rank tests; one and two proportions; chi-square test and correlation; linear models: simple linear regression; multiple linear regression; one-way analysis of variance, pairwise comparisons and the non-parametric F-test; two-way ANOVA (factorial ANOVA); block design and relative efficiency (RE); and quantitative and categorical x-variables (ANCOVA / General Linear Model).

Connected book

Michael Longnecker, R. Ott: An Introduction to Statistical Methods and Data Analysis

ISBN: 9780357670620
Edition: 7

Written for

Institution: Wageningen University (WUR)
Study: Aquaculture and Marine Resource Management
Course: Advanced Statistics (MAT20306)

Document information

Summarized whole book?: No
Which chapters are summarized?: Everything covered during the course
Uploaded on: January 18, 2026
Number of pages: 73
Written in: 2025/2026
Type: Summary

Subjects

statistics
statistical analyses
regression analysis
linear models
non-parametric test
ANOVA
ANCOVA
chi-square
Wilcoxon
advanced statistics
general linear models
block

Content preview

Content
Lecture 1: Confidence intervals and Hypothesis Testing
Lecture 2: Sample size calculations, Wilcoxon rank tests
Lecture 3: One and two proportions
Lecture 4: Chi-square test & correlation
Lecture 5: Linear models: simple linear regression
Lecture 6: Multiple Linear Regression 1
Lecture 7: Multiple Linear Regression 2
Lecture 8: One-way analysis of variance, pairwise comparisons, non-parametric F-test
Lecture 9: Two-way ANOVA aka factorial ANOVA
Lecture 10: Block design & relative efficiency (RE)
Lecture 11: Quantitative and categorical x-variables: ANCOVA / General Linear Models

Lecture 1: Confidence intervals and Hypothesis Testing

What is a confidence interval? A confidence interval for a population parameter gives a range of
plausible values for that parameter based on the sample. Values inside the interval are plausible
parameter values given the observed sample.

Frequentist interpretation: A 1−α (for example, 95%) confidence interval procedure means: if we
repeated the exact sampling and interval-construction process many times (say 100 times), then
about 100×(1−α) of those intervals would contain the true population parameter.
So for a 95% CI we say: “We are 95% confident that the true parameter is inside this interval.” This is not the
same as saying there is a 95% probability that the particular interval you computed contains the
parameter: once computed, a given interval either contains the parameter or it does not. The probability statement refers to the procedure over repeated samples.
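
The repeated-sampling interpretation can be checked with a small simulation. The sketch below is illustrative only: the population values (mean 50, standard deviation 8), the sample size of 10 and the function name `coverage` are assumptions, and the critical value 2.262 is the tabulated two-sided 95% value of the t distribution with 9 degrees of freedom.

```python
import math
import random
import statistics


def coverage(n_reps=5000, n=10, mu=50.0, sigma=8.0, t_crit=2.262, seed=1):
    """Fraction of simulated 95% t-intervals that contain the true mean mu.

    t_crit = 2.262 is the tabulated two-sided 95% critical value for
    df = n - 1 = 9; change it if you change n.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        # Draw one sample from the (assumed) normal population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = statistics.mean(sample)
        se = statistics.stdev(sample) / math.sqrt(n)
        lower, upper = mean - t_crit * se, mean + t_crit * se
        # Each individual interval either contains mu or it does not.
        if lower <= mu <= upper:
            hits += 1
    return hits / n_reps


print(coverage())  # should be close to 0.95
```

Each simulated interval either covers the true mean or misses it; the 95% describes how often the procedure succeeds across repetitions.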

General formula for a two-sided t-based CI for a mean or difference of means
For many t-procedures the two-sided 100(1−α)% confidence interval has the form:

$$\text{estimate} \pm t_{df}(\alpha/2) \times \text{standard error}$$
• estimate = the point estimate (e.g., $\bar{x}$ for a single mean, or $\bar{x}_1 - \bar{x}_2$ for a difference of means).
• $t_{df}(\alpha/2)$ = critical value from the Student’s t distribution with the appropriate degrees of freedom, for the two-tailed α-level.
• standard error = depends on the problem (see formulas below).
Factors that make a CI narrower (more precise): larger sample size n ,
smaller variability in the data (smaller s), and lower confidence level
(smaller 1−α) — but lowering confidence level reduces reliability.
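
As a minimal sketch of the general form "estimate ± critical value × standard error", the code below builds a one-sample interval from raw data. The data values and the function names `t_interval` and `one_sample_ci` are hypothetical. The critical values 2.262 (95%) and 1.833 (90%) are the tabulated two-sided values for 9 degrees of freedom, and the standard error of a single mean is taken as s/√n.

```python
import math
import statistics


def t_interval(estimate, se, t_crit):
    """Two-sided interval: estimate ± t_crit × standard error."""
    return estimate - t_crit * se, estimate + t_crit * se


def one_sample_ci(sample, t_crit):
    """CI for a single mean, with SE = s / sqrt(n)."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    return t_interval(statistics.mean(sample), se, t_crit)


# Hypothetical measurements (n = 10, so df = 9).
weights = [4.1, 3.8, 4.4, 4.0, 3.9, 4.3, 4.2, 3.7, 4.0, 4.1]

low95, high95 = one_sample_ci(weights, t_crit=2.262)  # 95% level
low90, high90 = one_sample_ci(weights, t_crit=1.833)  # 90% level

print("95% CI:", (low95, high95))
print("90% CI:", (low90, high90))
```

The 90% interval is narrower than the 95% interval because its critical value is smaller, which is the trade-off between precision and confidence described above.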

The t distribution and degrees of freedom: The t distribution is
similar to the normal distribution but has heavier tails; it is used
when the population standard deviation σ is unknown and estimated
from the data. As the sample size (or degrees of freedom) grows, the t
distribution approaches the normal distribution. The degrees of freedom determine the exact shape of the
t-distribution.

Degrees of freedom (df) quantify how well the standard deviation s is estimated; more df → closer to
normal. Typical df:
o One-sample mean or paired differences: df = n − 1.
o Two-sample pooled t (equal variances assumed): df = n1 + n2 − 2.
o Welch’s (unequal variances): a complicated approximation (Welch–Satterthwaite
formula), typically non-integer. See formula below.
Intuitively: df reflect how much independent information you had to estimate variability.

Standard errors: formulas you must know
1. One-sample mean: $SE(\bar{x}) = \dfrac{s}{\sqrt{n}}$,
where s is the sample standard deviation and n is the sample size.

2. Paired t (differences)
o Convert paired observations to differences: $d_i = x_{i,\text{after}} - x_{i,\text{before}}$.
o Use the one-sample formulas on the differences: $\bar{d} = \dfrac{1}{n}\sum d_i$, $SE(\bar{d}) = \dfrac{s_d}{\sqrt{n}}$.
o df = n − 1, where n is the number of pairs.
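
The paired steps can be sketched as follows. The before/after values and the function name `paired_summary` are made up for illustration.

```python
import math
import statistics


def paired_summary(before, after):
    """Return (mean difference, SE of mean difference, df) for paired data."""
    if len(before) != len(after):
        raise ValueError("paired data need equally long lists")
    # Step 1: reduce each pair to a single difference (after - before).
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    # Step 2: apply the one-sample formulas to the differences.
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)
    se = s_d / math.sqrt(n)
    # Step 3: df is the number of pairs minus one.
    return d_bar, se, n - 1


# Hypothetical measurements on the same five subjects.
before = [12.0, 15.0, 11.0, 14.0, 13.0]
after = [14.0, 16.0, 11.0, 17.0, 15.0]

d_bar, se, df = paired_summary(before, after)
print(d_bar, se, df)
```

Working on the differences removes the between-subject variation, which is why n counts pairs rather than individual observations.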

3. Two-sample t with equal variances (pooled)
o Pool the sample variances to get a pooled standard deviation: $s_p^2 = \dfrac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}$ and $s_p = \sqrt{s_p^2}$.
o Standard error of the difference of means: $SE(\bar{x}_1 - \bar{x}_2) = s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$.
o df = n1 + n2 − 2.
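
A short sketch of the pooled calculation from summary statistics. The sample sizes, the standard deviations and the function name `pooled_se` are illustrative assumptions.

```python
import math


def pooled_se(n1, s1, n2, s2):
    """Pooled SD, SE of the difference of means, and df (equal variances assumed)."""
    df = n1 + n2 - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    sp = math.sqrt(sp2)
    se = sp * math.sqrt(1 / n1 + 1 / n2)
    return sp, se, df


# Hypothetical summary statistics for two groups.
sp, se, df = pooled_se(n1=12, s1=2.0, n2=8, s2=3.0)
print(sp, se, df)
```

The pooled standard deviation always lies between the two sample standard deviations, closer to the one from the larger sample.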

4. Two-sample t without equal variances (Welch’s t)
o Do not pool variances. Use: $SE = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$.
o Approximate degrees of freedom using Welch–Satterthwaite: $df \approx \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1-1} + \dfrac{(s_2^2/n_2)^2}{n_2-1}}$
(This yields a positive real number; statistical software uses this.)
o Welch’s test is the default in R (t.test) and is safer when variances differ.
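
The Welch standard error and the Welch–Satterthwaite degrees of freedom can be computed by hand as in the sketch below. The summary statistics and the function name `welch_se_df` are assumptions chosen for illustration.

```python
import math


def welch_se_df(n1, s1, n2, s2):
    """SE of the difference of means and Welch-Satterthwaite df (no pooling)."""
    v1 = s1**2 / n1  # estimated variance of the first sample mean
    v2 = s2**2 / n2  # estimated variance of the second sample mean
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return se, df


# Hypothetical summary statistics with clearly different spreads.
se, df = welch_se_df(n1=12, s1=2.0, n2=8, s2=3.0)
print(se, df)  # df is generally not an integer

# With equal sample sizes and equal SDs the df equals n1 + n2 - 2.
se_eq, df_eq = welch_se_df(n1=10, s1=2.0, n2=10, s2=2.0)
print(df_eq)
```

The Welch df always falls between min(n1, n2) − 1 and n1 + n2 − 2, so the test is never more optimistic about the available information than the pooled test.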

Sampling distribution of the difference between two sample means
We consider two independent samples:
• Sample 1: size $n_1$, sample mean $\bar{y}_1$, population variance $\sigma_1^2$
• Sample 2: size $n_2$, sample mean $\bar{y}_2$, population variance $\sigma_2^2$
We are interested in the statistic: $\bar{y}_1 - \bar{y}_2$
This is an estimator of the population difference: $\mu_1 - \mu_2$

The sampling distribution of $\bar{y}_1 - \bar{y}_2$ is approximately normal for large samples because of the
Central Limit Theorem (CLT): each sample mean is approximately normal when the sample size is
large or the population is normal, and the difference of two independent normally distributed variables is also
normally distributed. So: $\bar{y}_1 - \bar{y}_2 \approx$ Normal distribution

The expected value (mean) of $\bar{y}_1 - \bar{y}_2$ is: $\mu_{\bar{y}_1 - \bar{y}_2} = \mu_1 - \mu_2$

This makes intuitive sense because, on average, a sample mean estimates its population mean.
Therefore, the difference of two sample means estimates the difference of two population means.

The standard error of the sampling distribution is: $\sigma_{\bar{y}_1 - \bar{y}_2} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
Why this formula? The variance of a sample mean is $\sigma^2/n$. Since the samples are independent,
the variances add: $\mathrm{Var}(\bar{y}_1 - \bar{y}_2) = \sigma_1^2/n_1 + \sigma_2^2/n_2$. Taking the square root gives the standard error. This formula is the general case,
used when the variances are not assumed equal.
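
A simulation can make the mean and standard error of the difference concrete. In this sketch the population parameters, the sample sizes and the function name `simulate_differences` are arbitrary assumptions; the simulated values are compared with μ1 − μ2 and with the square-root formula.

```python
import math
import random
import statistics


def simulate_differences(n1, mu1, sigma1, n2, mu2, sigma2, n_reps=20000, seed=7):
    """Simulate ybar1 - ybar2 many times; return its empirical mean and SD."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_reps):
        # Two independent samples, one from each normal population.
        y1 = statistics.fmean(rng.gauss(mu1, sigma1) for _ in range(n1))
        y2 = statistics.fmean(rng.gauss(mu2, sigma2) for _ in range(n2))
        diffs.append(y1 - y2)
    return statistics.fmean(diffs), statistics.stdev(diffs)


# Assumed population values, for illustration only.
n1, mu1, sigma1 = 15, 20.0, 4.0
n2, mu2, sigma2 = 10, 17.0, 6.0

sim_mean, sim_sd = simulate_differences(n1, mu1, sigma1, n2, mu2, sigma2)
theory_se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

print("simulated mean:", sim_mean, "theory:", mu1 - mu2)
print("simulated SD:  ", sim_sd, "theory:", theory_se)
```

The simulated mean and standard deviation should land close to the theoretical values, with small differences due to simulation noise.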

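When we assume the two population variances are equal, we simplify: $\sigma_1^2 = \sigma_2^2 = \sigma^2$

In that case: $\sigma_{\bar{y}_1 - \bar{y}_2} = \sqrt{\sigma^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)} = \sigma\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$
But we do not know $\sigma^2$, because it is a population value, so we must estimate it using sample data. That’s
where the pooled standard deviation comes in.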

Since we assume that both populations have the same variance, the best estimate of that
common variance is a pooled (combined) estimate.

Definition shown in the slide: $s_p = \sqrt{\dfrac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}$

Meaning:
• We take each sample’s variance, $s_1^2$ and $s_2^2$
• Weight them by their degrees of freedom, $n_i - 1$
• Average them (divide the weighted sum by the total degrees of freedom)
• Then take the square root
This is a more accurate estimate of a shared variance than using either sample alone.
Degrees of freedom for the pooled variance: df = n1 + n2 − 2
This matches how many independent pieces of information were used in estimating the common
variance.
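
To see what weighting by degrees of freedom does, the sketch below pools the same two variances under different sample sizes. All numbers and the function name `pooled_variance` are hypothetical.

```python
def pooled_variance(n1, var1, n2, var2):
    """Average of two sample variances, weighted by degrees of freedom."""
    w1, w2 = n1 - 1, n2 - 1
    return (w1 * var1 + w2 * var2) / (w1 + w2)


# Same sample variances (4.0 and 9.0), different sample sizes.
equal_n = pooled_variance(10, 4.0, 10, 9.0)   # weights 9 and 9
large_n1 = pooled_variance(50, 4.0, 5, 9.0)   # weights 49 and 4

print(equal_n)   # plain average of the two variances
print(large_n1)  # pulled towards 4.0, the variance of the larger sample
```

With equal sample sizes the pooled variance is the ordinary mean of the two variances; with unequal sizes the larger sample dominates.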

Once you have $s_p$, the standard error of the sample difference becomes: $SE(\bar{y}_1 - \bar{y}_2) = s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$

This is the formula used for a pooled t-test or CI for two means with equal variances.

Confidence interval for μ1 − μ2 (equal variances)
The slide shows the formula:
Where:

Get to know the seller

lunafields
