Class notes

Advanced statistics - Notes from all the lectures

Pages: 106
Uploaded on: 28-10-2020
Written in: 2020/2021

This document contains notes from all the lectures (1 to 12), some interesting notes and tips from the computer practicals, and notes from the pen and paper practicals (PPP). Information from some knowledge clips is already included in the notes. R output is included, so it is easier to know what to look at for the computer practicals and for the exam. These are notes of the academic year, so based on the new version of the study guide. With these notes, you don't even need to watch the lectures, so they work for either period: the morning or the afternoon course.













Content preview

Lecture 1a – advanced statistics
Main aim: Inference (= draw conclusions about a population or about a general phenomenon based on a
limited number of observations, which are the sample data)

3 different situations for t-procedures (confidence interval and t-tests):
- one sample, one mean (e.g. the mean body weight of all 6-year-old boys in the Netherlands)
- paired observations, mean difference (e.g. data on twins, or a before-and-after study)
- two independent samples, difference in means (e.g. two populations: difference in exam scores between
males and females, which is a typical observational study)

1. Inference (1 sample)
Take a random sample (sample data that are representative of the whole population). Every sample contains
noise, and the noise differs from sample to sample, so two samples from the same population differ a bit.
→ Conclusions of inference are partly based on ‘noise’, which introduces a level of uncertainty in the conclusions.
That is why we do tests with ‘significance level α’ and use 0.95 confidence intervals (they account for the
uncertainty that comes with random sampling)

2. Confidence intervals
1) Explain what a confidence interval for a parameter means
2) Specify the general pattern of a confidence interval (the 4 elements of t-procedures)
a. Parameter of interest = what you want to know, what you want to draw a conclusion from
= something that describes the population
b. Estimator (= method of estimation) – how to estimate the parameter from the data (it’s a
method, a formula)
c. Standard error of the estimator (= how certain we can be about the estimate)
d. Degrees of freedom (= in estimating the spread) for the t-distribution
3) Apply this pattern to a specific problem (calculate the limits of the interval) → know “which
situation” to apply

Situation 1 – 1 sample situation
E.g. What is the mean body height of Wageningen students?
→ answered by constructing a confidence interval

Step 1: take a random sample of 25 male students
→ draw conclusions about a large population based on the 25 observations

Sampling terminology
• We are interested in the mean of one trait (body height) in one population (e.g. all male WUR
students)
• The students are the sampling units
• The response is body height, measured per student (so the student is also the observed or
measurement unit)
• The scientist draws a conclusion about the population mean (of body height) based on one random
sample = ‘one-sample situation’ = one population, one mean
• The population is a physical population
• The type of research is observational

Parameter of interest: mean body height of all male WUR students = mu or μy with y being the height

Step 2: to determine the confidence interval, we need the summary statistics of the data set
Sample size: n=25
Sample mean: ȳ (y-bar) = 184
Sample standard deviation: s=9 (= how variable the values are)
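
As a rough sketch of how these summary statistics feed the four elements listed earlier, the usual one-sample t-interval (estimate ± critical value × standard error, with standard error s/√n and n − 1 degrees of freedom) can be worked out as below. The critical value 2.064 is an assumption read from a standard t-table (df = 24, two-sided 0.95), and the variable names are illustrative rather than taken from the lecture.

```python
import math

# Summary statistics from the notes
n = 25          # sample size
y_bar = 184.0   # sample mean, the estimate of mu
s = 9.0         # sample standard deviation

# Standard error of the estimator y_bar
se = s / math.sqrt(n)

# Degrees of freedom for the t-distribution
df = n - 1

# Assumption: t(0.975, df=24) read from a t-table, not computed here
t_crit = 2.064

margin = t_crit * se
lower, upper = y_bar - margin, y_bar + margin

print(f"SE = {se:.2f}, df = {df}")
print(f"0.95 CI for mu: ({lower:.2f}, {upper:.2f})")
```

With these numbers the standard error is 1.8 and the interval runs from about 180.28 to 187.72.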



• A confidence interval is a range of values for a parameter that we have “confidence” in
• The confidence level (1- α) is often 0.95 (α is 0.05 = 5%)
• The width of a confidence interval reflects the precision of the estimate: precise estimate = narrow
interval
• Bounds or limits of the interval are random: they depend on the units that are drawn in the sample.

• The 0.95 (1- α): the interval is constructed such that the probability that the interval will contain the
true parameter value is 0.95. Imagine many repeats of the experiment. In each repeat we have new
data and a new interval. Of all these intervals, 95% will then contain the true parameter value. In
practice we only have one sample. The 0.95 is a property of the method, not of the one interval
we happen to obtain

• A CI is typically of the form: best guess (estimate) ± error margin
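
The "many repeats" interpretation of the 0.95 can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population is taken to be normal with mean 184 and standard deviation 9 (these values are chosen for illustration, not given as population values in the notes), and the critical value 2.064 comes from a t-table for df = 24.

```python
import math
import random
import statistics

random.seed(1)

# Assumptions for the simulation (hypothetical population, not from the notes)
TRUE_MU, TRUE_SIGMA = 184.0, 9.0
N = 25
T_CRIT = 2.064      # t(0.975, df=24) from a t-table
REPEATS = 10_000

def interval_covers_mu():
    """Draw one random sample, build its 0.95 CI, and check whether it contains TRUE_MU."""
    sample = [random.gauss(TRUE_MU, TRUE_SIGMA) for _ in range(N)]
    y_bar = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(N)
    return y_bar - T_CRIT * se <= TRUE_MU <= y_bar + T_CRIT * se

# Each repeat gives new data and a new interval; count how many contain the true mean
coverage = sum(interval_covers_mu() for _ in range(REPEATS)) / REPEATS
print(f"Share of intervals containing the true mean: {coverage:.3f}")
```

The printed share should land close to 0.95, while any single interval either contains the true mean or does not.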




E.g. Is there a difference in mean body height of male students compared to 1980 (when it was 180 cm)?
→ answered by doing a t-test
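
A minimal sketch of that t-test, assuming the same summary statistics as in the one-sample example (n = 25, ȳ = 184, s = 9) and a two-sided test of H0: μ = 180 at α = 0.05. The critical value 2.064 is again an assumption taken from a t-table (df = 24); the notes do not show this calculation.

```python
import math

n, y_bar, s = 25, 184.0, 9.0   # summary statistics from the notes
mu_0 = 180.0                   # 1980 value, used as the null hypothesis value

# Test statistic: how many standard errors the sample mean lies from mu_0
se = s / math.sqrt(n)
t_stat = (y_bar - mu_0) / se

# Assumption: two-sided critical value t(0.975, df=24) from a t-table
t_crit = 2.064
reject_h0 = abs(t_stat) > t_crit

print(f"t = {t_stat:.2f} on {n - 1} df; reject H0 at alpha = 0.05: {reject_h0}")
```

Here t is about 2.22, which exceeds 2.064, so under these assumptions H0 would be rejected. This agrees with the 0.95 confidence interval for the same data, which does not contain 180.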

Situation 2 – paired data
Blood pressure change: a physician records the blood pressure before (x) and after 2 weeks (y) of medication
use for 16 patients: d = x-y (regarded as a random sample)
Q1: What is (in general, or ‘in the population’) the change in mean blood pressure after medication use (μx – μy),
or what is the mean change in blood pressure (μd) after medication use?
→ μx – μy is the change in mean and μd is the mean change → the two are the same
→ we make a two-sided confidence interval for μd
→ parameter of interest is the difference in mean blood pressure before and after medication use μd

Q2: Does mean blood pressure in the population go down after medication use? = μx – μy > 0? or μd > 0? → we
need to do a one-sample t-test

NB1: for paired data, the observations (x and y) within the pair are not independent; they belong to the same
unit and will be correlated. This ‘problem’ is solved by using the d-values (values of the differences)
NB2: If the sample were random, the patients would be independent units. In this case it was not random,
which is why it matters that the sample is regarded as random

Paired data design = 1 sample situation for d
• Patients were not randomly selected. We should check gender, age, weight... to see if the sample
may well represent the population.
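
The reduction of paired data to a one-sample problem on d can be sketched as follows. The 16 blood pressure pairs are hypothetical values invented purely for illustration (the notes give no data), and the critical values 2.131 and 1.753 are assumptions read from a t-table for df = 15.

```python
import math
import statistics

# HYPOTHETICAL data for 16 patients (invented for illustration only)
before = [150, 142, 160, 155, 148, 162, 158, 145, 151, 149, 157, 163, 146, 152, 159, 154]
after = [144, 140, 151, 150, 147, 155, 150, 143, 148, 145, 149, 158, 144, 147, 152, 151]

# Work with the within-pair differences d = x - y; these form one sample
d = [x - y for x, y in zip(before, after)]
n = len(d)
df = n - 1

mean_d = statistics.mean(d)
se_d = statistics.stdev(d) / math.sqrt(n)

# The change in means equals the mean change
change_in_means = statistics.mean(before) - statistics.mean(after)

# Q1: two-sided 0.95 CI for mu_d (assumption: t(0.975, df=15) = 2.131 from a t-table)
lower, upper = mean_d - 2.131 * se_d, mean_d + 2.131 * se_d

# Q2: one-sided test of H0: mu_d = 0 against Ha: mu_d > 0
# (assumption: t(0.95, df=15) = 1.753 from a t-table)
t_stat = mean_d / se_d
reject_h0 = t_stat > 1.753

print(f"mean d = {mean_d}, change in means = {change_in_means}")
print(f"0.95 CI for mu_d: ({lower:.2f}, {upper:.2f})")
print(f"t = {t_stat:.2f} on {df} df; reject H0: {reject_h0}")
```

Only the d-values enter the standard error, which is how the correlation within pairs is handled.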




louise_s Wageningen University