Summary

Summary 1st Master: RStudio commands + examples

Uploaded on

06-11-2024

Written in

2024/2025

For Applied Biostatistics in the 1st Master of Biomedical Sciences. All commands from the 6 videos written out in one document, with extra explanation and applied examples from the videos.

Content preview

RStudio commands

Contents
1. Simple linear regression: dataset with 1 x and 1 y
1.1. Displaying the regression line
1.2. Summary of the regression
1.3. F-test for simple linear regression
2. Simple linear regression: dataset with several x's and 1 y
2.1. Creating a subset
2.2. Simple linear regression on 2 variables
2.3. Multiple linear regression
2.4. Calculating the confidence interval for the coefficient of an x-variable
2.5. Calculating the prediction interval for y of 1 individual
2.6. Calculating the confidence interval for the mean y of a group of persons
3. Assumptions for simple & multiple linear regression
3.1. Normal distribution of the residuals for x1 and y
3.2. Residual plot
3.3. Cook's distance
3.3.1. When does an observation have high influence?
3.4. Simple linear regression on the new dataset without outliers
4. ANOVA, ANCOVA and correlation
4.1. Setting a column as a categorical variable
4.2. Creating a new dataset without certain values
4.3. ANOVA
4.4. T-tests
4.4.1. Without correction
4.4.2. With Bonferroni correction
4.5. Assumptions for ANOVA
4.5.1. Normality of the response variable in each group
4.5.2. Homogeneity of variances: Bartlett test
4.6. Kruskal-Wallis test: non-parametric test
4.7. ANCOVA
4.8. Calculating the correlation coefficient
5. Logistic regression, sensitivity and specificity
5.1. Logistic regression
5.2. Odds ratio, sensitivity and specificity
5.2.1. Odds ratio
5.2.2. Sensitivity & specificity

1. Simple linear regression: dataset with 1 x and 1 y
simple <- lm(NAME_y_variable ~ NAME_x_variable)
simple => shows the coefficients of the line (intercept and slope)
1.1. Displaying the regression line
abline(simple, col = "red") => adds the line to an existing scatter plot; col accepts any colour name ("pink", "red", …)

1.2. Summary of the regression
summary(simple)
1.3. F-test for simple linear regression
anova(simple)
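
The commands of section 1 can be tried out as in the following minimal sketch. It assumes R's built-in dataset `cars` as a stand-in for the course data (which is not reproduced here), with `dist` as the y-variable and `speed` as the x-variable.

```r
# Illustration only: the built-in dataset `cars` stands in for the course data
simple <- lm(dist ~ speed, data = cars)   # y = dist, x = speed
simple                                    # prints intercept and slope

plot(dist ~ speed, data = cars)           # a scatter plot must exist before abline()
abline(simple, col = "red")               # adds the fitted regression line

summary(simple)                           # coefficients, R-squared, overall F-test
anova(simple)                             # F-test for the regression
```

With a single x-variable, the F-statistic reported by `summary()` and the one in the `anova()` table are the same number.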

2. Simple linear regression: dataset with several x's and 1 y
2.1. Creating a subset
NAME_SUBSET <- subset(dataset, condition)
condition = what a row must satisfy to belong to this subset
2.2. Simple linear regression on 2 variables
plot(y ~ x1, data = NAME_DATASET_or_SUBSET)
simple_name <- lm(y ~ x1, data = NAME_DATASET_or_SUBSET)
summary(simple_name)

2.3. Multiple linear regression
multiple <- lm(y ~ x1 + x2 + … + xk, data = NAME_DATASET_or_SUBSET)
=> list every x you want to include in the multiple regression
2.4. Calculating the confidence interval for the coefficient of an x-variable
confint(multiple, "NAME_X", level = 0.…)
level = confidence level, written with a decimal point (not a comma), e.g. 0.95
2.5. Calculating the prediction interval for y of 1 individual
predict(multiple, data.frame(criteria), interval = "prediction", level = 0.…)
criteria = a value for every x-variable in the model
2.6. Calculating the confidence interval for the mean y of a group of persons
predict(multiple, data.frame(criteria), interval = "confidence", level = 0.…)
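
A minimal sketch of sections 2.1 to 2.6 follows. The built-in dataset `mtcars`, the subset condition, the chosen variables and the values in `new_obs` are all assumptions made for illustration; they do not come from the course videos.

```r
# Illustration only: the built-in dataset `mtcars` stands in for the course data
cars_sub <- subset(mtcars, cyl != 8)             # hypothetical condition: drop 8-cylinder cars
multiple <- lm(mpg ~ wt + hp, data = cars_sub)   # y = mpg, x1 = wt, x2 = hp

confint(multiple, "wt", level = 0.95)            # CI for the coefficient of wt

new_obs <- data.frame(wt = 2.5, hp = 100)        # one value for every x in the model
pred_int <- predict(multiple, new_obs, interval = "prediction", level = 0.95)
conf_int <- predict(multiple, new_obs, interval = "confidence", level = 0.95)
pred_int   # interval for the y of one individual with these x-values
conf_int   # interval for the mean y of all individuals with these x-values
```

The prediction interval is always wider than the confidence interval at the same x-values, because it also contains the variation of a single individual around the mean.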

3. Assumptions for simple & multiple linear regression
3.1. Normal distribution of the residuals for x1 and y
shapiro.test(residuals(simple_name))
H0: normal distribution vs H1: non-normal distribution

3.2. Residual plot
par(mfrow = c(NUMBER, NUMBER)) => number of rows and columns of plots per page, e.g. c(2, 2)
plot(simple_name)
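
The two checks above can be sketched as follows, again assuming the built-in dataset `cars` as a stand-in for the course data. The name `simple_cars` is hypothetical.

```r
# Illustration only: the built-in dataset `cars` stands in for the course data
simple_cars <- lm(dist ~ speed, data = cars)

sw <- shapiro.test(residuals(simple_cars))
sw$p.value             # a small p-value => reject H0 of normally distributed residuals

par(mfrow = c(2, 2))   # 2 x 2 grid so the four default diagnostic plots fit on one page
plot(simple_cars)      # residuals vs fitted, Q-Q plot, scale-location, leverage
par(mfrow = c(1, 1))   # restore the single-plot layout
```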

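3.3. Cook's distance
cooksd <- cooks.distance(simple_name)
cooksd
dev.off() => closes the graphics device, which resets the par(mfrow = …) layout
plot(simple_name, 4) => plot of Cook's distance per observation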
3.3.1. When does an observation have high influence?
3.3.1.1. Cook's distance > 4/(n-p-1)
= cutoff1 (in the example below: n = 654 observations, p = 1 x-variable)
cutoff1 <- 4/(654-1-1)
cutoff1
- You get a number => look at the plot to see where the data lie relative to it
- E.g. dataset 2: cutoff1 = 0.00613
- Removing outliers:
influential <- as.numeric(names(cooksd)[cooksd > cutoff1]) = row numbers of the observations with a Cook's distance > cutoff1
NAME_DATASETbis <- DATASET[-influential, ] = data minus the influential observations
nrow(NAME_DATASETbis) = number of rows left
nrow(DATASET)-nrow(NAME_DATASETbis) = number of observations removed
= removes all observations with a Cook's distance > 4/(n-p-1)
3.3.1.2. Cook's distance > 1
- Removing outliers:
influential <- as.numeric(names(cooksd)[cooksd > 1])
NAME_DATASETtre <- DATASET[-influential, ]
nrow(NAME_DATASETtre)
nrow(DATASET)-nrow(NAME_DATASETtre)
=> If this is too liberal, choose a different cutoff
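
The outlier-removal steps above can be sketched as below. This is an illustration under assumptions: it uses the built-in dataset `cars`, and it swaps `as.numeric(names(cooksd))` for `which()`, which returns row positions and therefore also works when the row names are not numbers. The guard for "nothing found" is likewise an addition, not part of the notes.

```r
# Illustration only: the built-in dataset `cars` stands in for the course data
simple_cars <- lm(dist ~ speed, data = cars)
cooksd <- cooks.distance(simple_cars)

n <- nrow(cars)                          # number of observations
p <- 1                                   # number of x-variables in the model
cutoff1 <- 4 / (n - p - 1)

influential <- which(cooksd > cutoff1)   # positions of the influential observations
# Guard: cars[-integer(0), ] would return zero rows, so only drop rows when some were found
cars_bis <- if (length(influential) > 0) cars[-influential, ] else cars

nrow(cars) - nrow(cars_bis)              # number of observations removed
simple_bis <- lm(dist ~ speed, data = cars_bis)   # refit without the influential observations
summary(simple_bis)
```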
3.4. Simple linear regression on the new dataset without outliers
- First choose a suitable cutoff (see the 2 scenarios above)
- Remove the outliers (see above) & give the dataset a different name
- Run the simple linear regression:
simple_name <- lm(y ~ x, data = NAME_DATASETnr….)
summary(simple_name)
- Are the residuals normally distributed?

4. ANOVA, ANCOVA and correlation
4.1. Setting a column as a categorical variable
NAME_DATASET$factor_column <- as.factor(NAME_DATASET$column)
4.2. Creating a new dataset without certain values
newdata <- subset(NAME_DATASET, x != number & x != number)

sort(newdata$x)

E.g. suppose you want all data except the values 99 and 13:

newdata <- subset(data, x != 99 & x != 13)

= specify which conditions a row must meet to belong to this subset

In dataset3, for example, you want all values in the column 'maxfwt' except 99 and 13, so x = maxfwt (given on the exam if needed)
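
As a minimal sketch of 4.1 and 4.2, the example below uses the built-in dataset `mtcars` instead of dataset3 (an assumption, since the course data is not available here). The column `carb` plays the role of x, and the excluded values 6 and 8 are chosen only for illustration.

```r
# Illustration only: the built-in dataset `mtcars` stands in for the course data
mtcars2 <- mtcars
mtcars2$cyl_factor <- as.factor(mtcars2$cyl)        # categorical version of the column cyl

newdata <- subset(mtcars2, carb != 6 & carb != 8)   # keep rows where carb is neither 6 nor 8
sort(newdata$carb)                                  # check: 6 and 8 no longer occur
levels(newdata$cyl_factor)                          # the groups of the factor
```

Sorting the column afterwards is a quick visual check that the excluded values are really gone.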

4.3. ANOVA
= to check whether there is a difference between several groups
model1 <- aov(response_variable ~ categorical_variable, data = newdata3)
summary(model1)
- Give the model a name: e.g. model1
- Between the parentheses: response variable ~ categorical variable (factor)
- summary gives the p-value, Between SS, Within SS, Between MS and Within MS

- R creates the dummies automatically => n groups give n-1 dummies when you run a regression instead of an ANOVA
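
The link between ANOVA and regression with dummies can be illustrated as follows. The sketch assumes the built-in dataset `mtcars`, with `mpg` as response variable and the number of cylinders as the categorical variable; these choices are not from the course material.

```r
# Illustration only: the built-in dataset `mtcars` stands in for the course data
mtcars2 <- mtcars
mtcars2$cyl_factor <- as.factor(mtcars2$cyl)   # 3 groups: 4, 6 and 8 cylinders

model1 <- aov(mpg ~ cyl_factor, data = mtcars2)
summary(model1)   # row cyl_factor = between groups, row Residuals = within groups

reg <- lm(mpg ~ cyl_factor, data = mtcars2)    # the same model fitted as a regression
coef(reg)         # intercept + 2 dummies for 3 groups
anova(reg)        # same F-test as summary(model1)
```

In the `summary()` output the "Between" quantities appear in the row named after the factor and the "Within" quantities in the row `Residuals`.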

4.4. T-tests
= check whether the individual groups differ from one another

4.4.1. Without correction
pairwise.t.test(NAME_DATASET$response_variable, NAME_DATASET$categorical_variable, p.adjust.method = "none")
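
A minimal sketch of the uncorrected pairwise t-tests, assuming the built-in dataset `mtcars` with `mpg` as response variable and the number of cylinders as grouping factor (both chosen for illustration only):

```r
# Illustration only: the built-in dataset `mtcars` stands in for the course data
mtcars2 <- mtcars
mtcars2$cyl_factor <- as.factor(mtcars2$cyl)   # 3 groups: 4, 6 and 8 cylinders

pw <- pairwise.t.test(mtcars2$mpg, mtcars2$cyl_factor, p.adjust.method = "none")
pw             # prints the table of pairwise comparisons
pw$p.value     # lower-triangular matrix of unadjusted p-values, one per pair of groups
```

With 3 groups there are 3 pairs, so the matrix has 2 rows and 2 columns with one empty (NA) cell above the diagonal.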


Written for

Institution: Katholieke Universiteit Leuven (KU Leuven)
Study: Biomedische Wetenschappen
Course: Toegepaste Biostatistiek


Document information

Uploaded on: November 6, 2024
Number of pages: 7
Written in: 2024/2025
Type: SUMMARY
