Class notes

Advanced Statistics and 'R' - lecture - up to week 5

Uploaded on: 16-11-2024
Written in: 2024/2025

These are the notes taken during the Continuing Statistics and “R” lectures. The course is taught at Utrecht University.

Connected book

Michael C. Whitlock, Dolph Schluter. The Analysis of Biological Data

Edition: Unknown
ISBN: 9781319154219

Written for

Institution: Universiteit Utrecht (UU)
Study: Biologie
Course: Voortgezette Statistiek en R (BB2VSR)

Document information

Uploaded on: November 16, 2024
File latest updated on: January 14, 2025
Number of pages: 40
Written in: 2024/2025
Type: Class notes
Professor(s): Erika Tsingosi & Yann Hautier
Contains: All classes

Subjects

statistics
hypothesis
analysis
data
whitlock
schluter
third edition
advanced statistics and r
biology
advanced statistics
rstudio
function
formula
r

Content preview

LECTURE 1 – WEEK 1: about this course + R and RStudio VSR

SEMESTER 1 | ERIKA TSINGOSI + YANN HAUTIER

ABOUT THIS COURSE

● learning goals
○ expand statistics toolbox
○ reason about appropriate experimental approaches and statistical tools
○ critically evaluate analyses and outputs
○ learn the basics of data science
○ master the tools for creating a reproducible analysis in R
● final grade must be ≥ 5.5
○ attend all computer sessions
○ complete weekly quizzes
○ complete all 4 hand-in assignments (30%)
○ pass the exam with a grade ≥ 5.5 (70%)

R AND RSTUDIO

● ?lm or help(lm) gives you help on the lm function
● helpful sources
○ http://tryr.codeschool.com/
○ http://www.cookbook-r.com/
○ https://thecrashcourse.com/courses/what-is%20statistics-crash-course-statistics-1/
● an outlier has a huge impact on linear regression

● statistics are done in R, not RStudio; RStudio is only the tool
○ RStudio is an IDE for R
● you have to annotate your script using #
● RStudio automatically generates code when you click on Import Dataset; paste that code into your script and save it
● library() #get a list of all installed packages
● install.packages("ggplot2") #to install a package
● library("ggplot2") #to load a package
○ no need to install a package again after it has been installed, but you do have to load it again in every new R session
● hand-in assignments need to be in PDF
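
The install-once, load-every-session routine from the list above can be sketched as follows. This is a minimal illustration, not course material: the helper name is_installed is made up for this example, and the install.packages() call is left commented out because it needs an internet connection.

```r
# Hypothetical helper (not from the lecture): TRUE if a package is installed.
is_installed <- function(pkg) {
  pkg %in% rownames(installed.packages())
}

# Installing is needed only once per machine; loading is needed in every session.
if (!is_installed("ggplot2")) {
  # install.packages("ggplot2")   # uncomment and run once (needs internet)
  message("ggplot2 is not installed yet")
} else {
  library(ggplot2)
}

# lm lives in the stats package, which ships with R and is loaded by default.
# In an interactive session, ?lm and help(lm) open the same help page.
lm_args <- names(formals(lm))
print(head(lm_args, 3))  # first few arguments of lm
```

Checking before installing keeps a script re-runnable: a second run skips the download and goes straight to library().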

● ggplot2 = grammar of graphics
○ https://ggplot2.tidyverse.org/
○ http://www.cookbook-r.com/Graphs/
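
The Lecture 1 remark that an outlier has a huge impact on linear regression can be checked with a small experiment. The sketch below uses made-up numbers (not data from the course) and base R only: it fits the same line twice, once on clean data and once after a single value has been replaced by an extreme one.

```r
# Made-up data (not from the lecture): y rises by about 2 per unit of x.
x <- 1:10
y <- c(2.1, 3.9, 6.2, 8.0, 9.8, 12.1, 14.0, 16.2, 17.9, 20.1)

fit_clean <- lm(y ~ x)

# Replace the last observation with an extreme value to create one outlier.
y_outlier <- y
y_outlier[10] <- 60

fit_outlier <- lm(y_outlier ~ x)

# Compare the fitted slopes of the two models.
slope_clean <- unname(coef(fit_clean)["x"])
slope_outlier <- unname(coef(fit_outlier)["x"])
print(c(clean = slope_clean, with_outlier = slope_outlier))

# Plot both fitted lines on the data that contain the outlier.
plot(x, y_outlier, xlab = "x", ylab = "y", main = "Effect of one outlier")
abline(fit_clean, lty = 2)   # dashed: fit without the outlier
abline(fit_outlier, lty = 1) # solid: fit with the outlier
```

One changed point out of ten is enough to make the fitted line noticeably steeper, which is why plotting the data before fitting is worth the effort.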

LECTURE 2 – WEEK 1: fundamentals of statistics VSR

FUNDAMENTALS OF STATISTICS

SAMPLING

● flowchart of a study
○ execution; while you're collecting data, you should already make plots to see if the data makes sense

● statistics quantifies uncertainty; statistics is about making sense of the variation (in samples)
○ descriptive statistics quantify
⎯ location or central tendency of data; mean, median
⎯ spread of the data; range, standard deviation
○ comparative statistics
⎯ compare different groups
⎯ based on location and spread
⎯ How compatible is the sample with our expectation?
● population distribution; the ideal values we'd know if we had perfect knowledge of all individuals (almost always impossible to measure)
○ most common measures for location and spread for a population distribution are mean (μ) and standard deviation (σ)
⎯ μ: sum of all individual measurements divided by the total number of measurements
⎯ σ: subtract the mean from each measurement and square the difference → sum the squares → divide by the total number of measurements → take the square root; standard deviation is the square root of the variance
⎯ population μ and σ are constant (they are not random and don't change, because the population is always the same)

● sample distribution; random subset of the population
○ sample mean (Ȳ) and sample standard deviation (s)
○ n – 1 instead of n is used in the denominator of s
⎯ sample measurements are on average closer to their own mean than to the true mean of the population; dividing by n – 1 corrects for that bias, which matters most when the sample size is small
○ Ȳ is a random variable

● to find the distribution of Ȳ we sample multiple times
○ when you combine the means of the different samples, you get the sampling distribution of the sample mean Ȳ
○ when σ has to be estimated with s, the standardized sample mean (Ȳ – μ) / (s / √n) follows a t-distribution with n – 1 degrees of freedom (for normally distributed data)
⎯ t-distribution has a lower peak and fatter tails than the normal distribution
○ mu hat: mean of the sampling distribution of Ȳ, which is an estimate for the population mean (μ)
⎯ you sum the sample means and divide the sum by the number of samples
○ σȲ = standard deviation of the population / square root of the sample size
○ we usually don't have the population standard deviation σ → that's why we estimate the spread of Ȳ (standard error of the mean) with the sample standard deviation s
⎯ the estimate is the standard error of the mean = sample standard deviation / square root of the sample size
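
The sample mean, the sample standard deviation with its n – 1 denominator, and the standard error of the mean can be computed by hand and compared with R's built-in functions. This is a minimal sketch on made-up measurements; all variable names are illustrative assumptions.

```r
# Made-up sample of 8 measurements (illustrative values, not course data).
y <- c(4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4)
n <- length(y)

# Sample mean: sum of the measurements divided by their number.
y_bar <- sum(y) / n

# Sample standard deviation: squared deviations summed, divided by n - 1.
s_manual <- sqrt(sum((y - y_bar)^2) / (n - 1))

# R's sd() uses the same n - 1 denominator.
s_builtin <- sd(y)

# Dividing by n (the population formula) gives a smaller value on a sample.
sd_div_n <- sqrt(sum((y - y_bar)^2) / n)

# Standard error of the mean: sample standard deviation / sqrt(sample size).
se_mean <- s_builtin / sqrt(n)

print(c(mean = y_bar, s = s_builtin, sd_div_n = sd_div_n, se = se_mean))
```

The gap between sd_div_n and s shrinks as n grows, which is why the n – 1 correction matters most for small samples.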

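The idea of sampling many times to find the distribution of Ȳ can be tried out in a simulation. The sketch below assumes a normally distributed population with made-up values for μ and σ (in practice these are unknown) and checks that the sample means centre on μ with a spread close to σ / √n. The seed, sample size, and number of samples are arbitrary choices.

```r
set.seed(1)  # arbitrary seed so the simulation is repeatable

# Assumed population parameters and sampling setup (illustrative only).
mu <- 50
sigma <- 10
n <- 25            # size of each sample
n_samples <- 5000  # number of repeated samples

# Draw many samples and keep the mean of each one.
sample_means <- replicate(n_samples, mean(rnorm(n, mean = mu, sd = sigma)))

# Mean of the sample means: an estimate of the population mean.
mu_hat <- mean(sample_means)

# Spread of the sample means versus the theoretical value sigma / sqrt(n).
sd_of_means <- sd(sample_means)
sigma_y_bar <- sigma / sqrt(n)

print(c(mu_hat = mu_hat, sd_of_means = sd_of_means, theory = sigma_y_bar))

# t-distribution versus normal: lower peak and fatter tails.
peak_t <- dt(0, df = n - 1)
peak_norm <- dnorm(0)
tail_t <- pt(-2, df = n - 1)
tail_norm <- pnorm(-2)

# Histogram of the simulated sampling distribution of the sample mean.
hist(sample_means, breaks = 40, main = "Sampling distribution of the mean",
     xlab = "sample mean")
```

With a real data set only one sample is available, so σ / √n is replaced by the standard error s / √n.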
Seller: Humulus (Universiteit Utrecht)

Study strong

These summaries have always earned me high grades. They are now available to everyone, and I hope you get a lot out of them! Although I work carefully, an occasional mistake may slip in. Do let me know, and I will correct it. Enjoy!
