Lecture notes

Lecture/seminar/powerpoint notes statistics (LCX046B05)

Name: Lecture/seminar/powerpoint notes statistics (LCX046B05)
SKU: doc_1356795
Rating: 3.00 (1 reviews)
Author: hLianne

Uploaded on: 26-10-2021
Written in: 2021/2022

An overview of the theory for the course Statistics: a combination of notes from the lectures, seminars, practical assignments, and the PowerPoint slides. It also includes an overview of functions used in R, plus formulas and abbreviations.

Written for

Institution: Rijksuniversiteit Groningen (RuG)
Programme: Communicatie- En Informatiewetenschappen
Course: Statistics (LCX046B05)

Document information

Uploaded on: 26 October 2021
File last updated on: 3 November 2021
Number of pages: 20
Written in: 2021/2022
Type: Lecture notes
Lecturer(s): Multiple professors
Contains: All lectures

Topics

statistics
r studio
normal distribution
t test
z test
effect size
hypotheses
p value
non parametric tests

Preview of the content

Introduction to R
Statistics is used to make sense of data through (I) descriptive statistics, which summarize the data, and (II)
inferential statistics, which assess relationships in the data. For this, the data need to vary.

R has some basic functionalities:
• calculator: addition (5 + 5), multiplication (5 * 3), or division (5 / 3).
• <- to store a value in a variable (= assignment operator).
- E.g. storing the value 5 + 5 in variable a: a <- 5 + 5
• c() to create a vector.
• <- c() to store multiple values in a variable by using a vector.
- E.g. storing the values 2, 4, 5 in variable b: b <- c(2, 4, 5)
• mean() to calculate the mean.
• help() or ?... to get information about a specific function.
• setwd() to set your working directory.
• getwd() to see the current working directory, i.e. where R reads and writes files by default.
• # to start a comment: R ignores everything after it on that line.
• read.csv(…, sep = ";", dec = ".") to import data (make sure you save your Excel file as a
csv document).
• load("") to load a saved dataset.
• ls() to list the variables in your workspace.
• class() to see what kind of object something is (e.g. a data frame or a numeric vector).
• dim() to see the number of rows and columns (1st number is rows, i.e. observations; 2nd number is
columns, i.e. variables).
• nrow() to see the number of rows (= sample size).
• ncol() to see the number of columns.
• object.size() to see how much space the dataset is occupying in memory.
• names() to return a character vector of column (i.e. variable) names.
• head() to preview the top of the dataset.
• head(…, number) to get a certain number of first rows.
• tail() to preview the end of the dataset.
• tail(…, number) to get a certain number of end rows.
• dat[x, y] to see parts of the table: x indicating the selected rows, y the selected columns.
- E.g., dat[1, 5] to see the value in the first row, fifth column, or dat[c(1,2), c(1,2)] to
see the values in rows 1 and 2, columns 1 and 2. Or through using names: dat[c(1,2),
c("participant", "study")].
• summary() to see how each variable is distributed and how much of the dataset is missing.
• str() to see the structure of the dataset.
• rowMeans() to get the mean of each row (R is case-sensitive, so rowmeans() does not work).
• rowSums() to get the sum of each row.
• round() to round a number.
• $ to select a single column from the dataset.
- E.g. head(dat$study, 30) to show the study of the first 30 participants.
• relevel() to change the reference (first) level of a factor, which can be useful with one-sided tests.
• rbind(x, y) to combine two subsets by stacking their rows.
• rm() to remove an object.
• is.na() to test for missing data.
• apply(x, 1, mean) to apply mean to every row.
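
A few of the functions listed above can be tried out on a small example. This is only a sketch: the data frame dat below is made up for illustration (the column names follow the notes, the values are invented), so the numbers mean nothing.

```r
# Made-up example data; only the column names come from the notes.
dat <- data.frame(
  participant   = 1:6,
  gender        = c("F", "M", "F", "M", "F", "M"),
  study         = c("CIS", "IS", "IS", "CIS", "IS", "CIS"),
  english_grade = c(7.5, 6.0, 8.0, 5.0, 9.0, 7.5),
  english_score = c(7.0, 6.5, 8.5, 4.5, 8.0, 6.5)
)

a <- 5 + 5           # store the value 10 in variable a
b <- c(2, 4, 5)      # store a vector in variable b
mean_b <- mean(b)    # mean of 2, 4 and 5

dim(dat)             # number of rows (observations), then columns (variables)
names(dat)           # column (variable) names
head(dat, 3)         # first three rows
dat[c(1, 2), c("participant", "study")]  # rows 1 and 2, two columns by name
str(dat)             # structure of the dataset

# Row-wise mean of the two English columns, computed in two equivalent ways
scores     <- dat[, c("english_grade", "english_score")]
row_means  <- rowMeans(scores)
row_means2 <- apply(scores, 1, mean)
```

Both row_means and row_means2 hold one mean per participant; rowMeans() is simply the shorter spelling of the apply() call.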

In R, code is evaluated from the inside of the brackets outwards: R first computes the function in the innermost
brackets, and then works to the outside.
- E.g. round(mean(dat$english_grade)) to first compute the mean, and then
round this mean.
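
To see the inside-out order at work, the nested call can be split into its two steps. A small sketch with made-up grades (the vector english_grade stands in for dat$english_grade):

```r
# Made-up grades, only to illustrate the order of evaluation.
english_grade <- c(7.5, 6.0, 8.0, 5.0, 9.0, 6.4)

inner  <- mean(english_grade)         # step 1: the innermost call
outer  <- round(inner)                # step 2: round the result of step 1
nested <- round(mean(english_grade))  # both steps in one line

identical(outer, nested)              # the two versions give the same value
```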

Data can be accessed with conditional indexing, which selects parts of the data on the
basis of
• one condition,
- E.g. dat[dat$gender == "M", ] to select observations for men only.
• combined conditions (and: &, or: |), or
- E.g. dat[dat$gender == "M" & dat$study == "IS", ] to select
observations for male participants with the study IS.
• negated conditions (is not equal to: !=).
- E.g. dat[dat$gender != "M" | dat$english_grade > 7, ] to select
observations for everyone whose gender is not "M", plus everybody with an
English grade above 7.
You can also add columns to the dataset (here: dat) with the help of the operator $.
- E.g. dat$diff <- dat$english_grade - dat$english_score to add the column
diff, which holds the difference between the English grade and score.
- E.g. dat$pass_fail <- "PASS" to add the column pass_fail with the value PASS for
everyone, followed by dat[dat$english_grade < 5.5, ]$pass_fail <- "FAIL" to change
it to FAIL wherever the grade is lower than 5.5.
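
The selections and new columns described above can be sketched on a small made-up data frame (the values are invented for illustration). For the FAIL step this sketch indexes the column directly, dat$pass_fail[condition], which is an alternative spelling of the same idea.

```r
# Made-up example data; only the column names come from the notes.
dat <- data.frame(
  participant   = 1:6,
  gender        = c("F", "M", "F", "M", "F", "M"),
  study         = c("CIS", "IS", "IS", "CIS", "IS", "CIS"),
  english_grade = c(7.5, 6.0, 8.0, 5.0, 9.0, 7.5),
  english_score = c(7.0, 6.5, 8.5, 4.5, 8.0, 6.5)
)

men        <- dat[dat$gender == "M", ]                           # one condition
men_is     <- dat[dat$gender == "M" & dat$study == "IS", ]       # both must hold
not_m_or_7 <- dat[dat$gender != "M" | dat$english_grade > 7, ]   # at least one holds

# New columns: a difference score and a pass/fail label
dat$diff <- dat$english_grade - dat$english_score
dat$pass_fail <- "PASS"                            # everyone passes at first
dat$pass_fail[dat$english_grade < 5.5] <- "FAIL"   # overwrite where the grade is below 5.5
```

Note the comma before the closing bracket in dat[condition, ]: it leaves the column part empty, so all columns are kept.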

R shows NA when a value is missing: add na.rm = TRUE to the function to ignore missing values.
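
A short illustration of the effect of na.rm = TRUE, using a made-up vector with one missing value:

```r
# Made-up grades with one missing value.
grades <- c(7.5, NA, 8.0, 5.0)

mean_with_na    <- mean(grades)                # NA: the missing value spreads to the result
mean_without_na <- mean(grades, na.rm = TRUE)  # mean of the three known grades
n_missing       <- sum(is.na(grades))          # count the missing values
```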

R also has many options for basic visualization.
• barplot() to visualize frequencies of categorical variables. To do this, you first need to
create a frequency table through table(), and of this table you can create a barplot.
- E.g. 1. gender <- table(dat$gender) 2. barplot(gender).
• plot() to make a general plot, e.g. a scatterplot of two numerical variables.
• boxplot() to visualize the distribution of numerical variables (median, quartiles, and outliers).
• hist() to make a histogram, where the y-axis displays the absolute frequencies.
• qqnorm() and adding the qqline() to visualize the distribution and see if it is normally
distributed.
In addition, you can add various graphical parameters to customize your graphics.
• col = c() to give it a color.
• main = "" to give it a title.
• xlab = "" to name the x-axis.
• ylab = "" to name the y-axis.
• legend = c() to add a legend.
- E.g. barplot(gender, col = c("pink", "lightblue"), main = "Distribution
gender", xlab = "Gender", ylab = "Frequency", legend =
c("Female", "Male"))
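
The plotting functions above can be tried on made-up data as follows. This is a sketch under assumptions: the values are invented, and the legend is passed as legend.text, which is the full name of the barplot() argument that the notes write as legend.

```r
# Made-up data for illustration.
gender_values <- c("F", "M", "F", "M", "F", "F")
english_grade <- c(7.5, 6.0, 8.0, 5.0, 9.0, 6.4)

# Step 1: frequency table; step 2: barplot of that table
gender <- table(gender_values)
barplot(gender,
        col = c("pink", "lightblue"),
        main = "Distribution gender",
        xlab = "Gender",
        ylab = "Frequency",
        legend.text = c("Female", "Male"))

# Plots for a numerical variable
boxplot(english_grade, main = "English grade")
hist(english_grade, main = "English grade", xlab = "Grade")

# Q-Q plot with reference line to judge normality by eye
qqnorm(english_grade)
qqline(english_grade)
```

table() sorts the categories alphabetically, so "F" comes before "M"; the colors and legend labels are listed in that same order.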

A frequency table is made from one variable (one column): the upper row shows the values of the
variable, and the lower row shows the frequency. A cross-table is made from two variables.
- E.g. table(dat$gender, dat$study) showing the variables gender and study.
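
As a sketch of the difference between the two kinds of table, using made-up values for gender and study:

```r
# Made-up values for two categorical variables.
gender <- c("F", "M", "F", "M", "F", "M")
study  <- c("CIS", "IS", "IS", "CIS", "IS", "CIS")

freq  <- table(gender)         # frequency table: one variable
cross <- table(gender, study)  # cross-table: gender in rows, study in columns

freq
cross
cross["F", "IS"]               # one cell: number of women studying IS
```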

Refer to figures, graphs, or tables with a capital letter.
- E.g. Figure 1, Table 2.

Descriptive statistics
Samples of a population are used, because some populations are too big to study completely. The
sample is best if it is randomly drawn from the population. The characteristics of the individuals that one
studies are the variables, which have values.
- E.g., the variable gender can have the values male, female.
Collected data is organized in a dataset in which the individuals are represented in rows, and the
variables (with their values) in columns.

There are two types of variables, each having corresponding measurement levels.
• Categorical data: the values are categories.
o Nominal measurement level: just categorization, no ordering.
o Ordinal measurement level: categories have an order, but no known distances.
• Numerical data: the values are numbers.
o Interval measurement level: ordered values with known distances, but no
meaningful 0.
o Ratio measurement level: ordered values with known distances and a
meaningful 0.

The table below gives a few examples of measurement levels.

Variable                              Values                                 Measurement level
a. Amount of sports practice          A lot, little, nothing                 Ordinal
b. Political party preference         PvdA, CDA, VVD, other                  Nominal
c. Response time                      0, 1, 2, 3, .... milliseconds          Ratio
d. Attitude towards Europe            positive, neutral, negative            Ordinal
e. Age                                1, 2, 3, ... 25 years                  Ratio
f. Age                                0-10 years, 11-20 years, 21-30         Ordinal
                                      years, ...
g. Year of birth                      1920, 1921, 1922, ...                  Interval
h. Number of inhabitants per km²      0-100, 101-200, 201-300, ...           Ordinal
i. Gender                             male, female, non-binary               Nominal
j. Type of car                        truck, bus, passenger car              Nominal
k. Political opinion                  left, centre, right                    Nominal
Categorical variables can be characterized and visualized by using the function
• table() to create a table showing the absolute frequencies,
