Summary

Statistics 1 summary

Rating
-
Sold
2
Pages
51
Uploaded on
24-01-2023
Written in
2022/2023

Summary of all the lectures, computer practicals and seminars of Statistics 1 in the year 2022/2023. This includes personal notes, examples, and figures from the PowerPoint presentations.














Preview of the content

STATISTICS 1
Week 1 - Lecture 1 & 2

Statistics is a guessing game.
You never know the parameter (the truth about the population); you can only hope that your estimate is close.

Population = The group that you wish to describe (The entire set of elements)
Sample = The group for which you have data (A subset of elements from the population,
taken with the intention of making inferences about the population)

Why take a Sample?
› Describing the whole population is:
• Too expensive
• Impossible
• Destructive (measuring an element may destroy it)
• Impractical
• Unnecessary

Parameter = Numerical property of the population (based on the entire population/ the truth)
Statistic = Numerical property of a sample (based on the sample data)

Sampling error
› A difference between the value of a parameter and the statistic computed to estimate that
parameter
› Result of:
• Variability
• Sampling Bias
• Nonsampling Error
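
As a minimal sketch of the definition above, the sampling error can be written as statistic minus parameter. The population below (1,000 household incomes) is invented for illustration; in real research the parameter is unknown, so the error can only be estimated.

```python
import random
import statistics

rng = random.Random(3)  # fixed seed so the sketch is reproducible

# Hypothetical population: 1,000 household incomes (in thousands of euros).
population = [rng.gauss(35, 8) for _ in range(1000)]

parameter = statistics.mean(population)   # the "truth", normally unknown
sample = rng.sample(population, 25)       # simple random sample, n = 25
statistic = statistics.mean(sample)       # what we can actually compute

sampling_error = statistic - parameter
print(f"parameter: {parameter:.2f}, statistic: {statistic:.2f}, error: {sampling_error:.2f}")
```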

Reducing Sampling Error
› Variability (this Lecture)
- Increase n
› Sampling Bias (this Lecture)
- Design of sampling procedure
› Nonsampling Error
- Validity, Accuracy, Precision of variables
- Prevent coding errors
- Prevent interpretation errors
- Also: good labelling, metadata

➔ You do have control over variability, sampling bias and nonsampling error; you want to
minimize them.

Variability = The phenomenon whereby repeated sampling from the same population results in
different values for the statistic.

Example: ask 5 students in the course group their age and compute the average. Repeat with 5
different students. How much do the two average ages differ?
That difference is variability (sample size and diversity of the population are important). Statistically
you want it to be as low as possible, because that increases confidence in the result. The solution is
to increase the sample size.
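
The age example can be simulated. This is only a sketch: the course group of 200 students aged 18 to 30 is made up, and `sample_means` is a helper name chosen here. It repeats the sampling many times and shows that the averages vary less when n is larger.

```python
import random
import statistics

def sample_means(population, n, repeats, rng):
    """Draw `repeats` random samples of size n and return each sample's mean."""
    return [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]

rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Hypothetical course group: 200 students aged 18 to 30.
ages = [rng.randint(18, 30) for _ in range(200)]

# Spread of the sample mean over 1,000 repeated samples, for two sample sizes.
spread_n5 = statistics.stdev(sample_means(ages, 5, 1000, rng))
spread_n50 = statistics.stdev(sample_means(ages, 50, 1000, rng))

print(f"spread of the mean age, n=5:  {spread_n5:.2f}")
print(f"spread of the mean age, n=50: {spread_n50:.2f}")
```

The second number should come out clearly smaller than the first, which is the "increase n" advice in action.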


Sampling distribution = Describes how the statistic varies when sampling is repeated.
- In other words: describes (extent of) variability
- This is the basis for inference

Central Limit Theorem
Even if a variable X is not normally distributed in the population …
› … we may assume that …
Under certain conditions, such as a large number of cases n and a fixed standard deviation σ
› ... the Sampling Distribution of the mean is approximately normal with standard error: $\sigma_{\bar{x}} = \sigma / \sqrt{n}$
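
A rough check of the Central Limit Theorem, assuming the standard error meant here is the usual σ/√n. The skewed population (exponential values) and the sample size of 100 are choices made for this sketch, not taken from the lecture.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical skewed (clearly non-normal) population.
population = [rng.expovariate(1.0) for _ in range(20_000)]
sigma = statistics.pstdev(population)  # population standard deviation

n = 100
# Repeat the sampling 2,000 times (with replacement) and keep each sample mean.
means = [statistics.mean(rng.choices(population, k=n)) for _ in range(2000)]

observed_se = statistics.stdev(means)   # spread of the sampling distribution
predicted_se = sigma / math.sqrt(n)     # standard error according to the formula

print(f"observed: {observed_se:.3f}, predicted: {predicted_se:.3f}")
```

The two values should be close to each other, even though the population itself is far from normal.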




Sampling Bias = Result of procedures which favour the inclusion, in your sample, of elements from
the population with certain characteristics. (make sure you have the right people in your sample)

› Sources of Sampling Bias: (a combination of) the
- population
- researcher
- research design
- research topic
- respondent
› May result in:
- incomplete coverage: relevant elements not in sampling frame
- nonresponse: refusal or missing data

➔ Increasing the sample size does not fix this; it can make the problem worse, because a larger biased sample looks more trustworthy.
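
A small simulation of why a bigger sample does not remove sampling bias. Everything here is assumed for illustration: a population in which 30% cycle to work, and cyclists who are three times as likely to end up in the sample as everyone else.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the sketch is reproducible

# Hypothetical population: 30% commute by bike (1 = bike, 0 = other).
population = [1] * 3000 + [0] * 7000
true_proportion = statistics.mean(population)

# Assumed bias: cyclists are three times as likely to be selected.
weights = [3 if person == 1 else 1 for person in population]

def biased_sample(n):
    """Draw n people, favouring cyclists according to `weights`."""
    return rng.choices(population, weights=weights, k=n)

for n in (100, 1000, 10000):
    estimate = statistics.mean(biased_sample(n))
    print(f"n={n:>5}: estimated proportion of cyclists = {estimate:.3f}")
```

With more cases the estimate becomes more stable, but it settles well above the true 30%: the variability shrinks, the bias stays.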


Population: reluctant to participate, do not trust science.
Researcher: are we capable of seeing the whole population?




Difference between a probability and a non-probability sample: who makes the selection decisions.



Probability samples: driven by chance, which reduces sampling bias.
Non-probability samples: the researcher is in charge, which brings a risk of bias.
Judgemental: the researcher handpicks who to research, based on suitability.
Volunteer: people put themselves forward to take part in the research.
Convenience: laziness; only ask people who are easy to reach, for example people standing in a queue with nowhere else to go.
Cluster (random): assumes the population consists of groups that are similar to each other, so it
doesn’t really matter who you pick.
Stratified: the opposite of cluster; the groups are different. Maybe a different approach per group.
Systematic (random): the population is already ordered (example: student numbers). Select every 5th person, etc.
Simple random: the ideal case. A clear population, a perfect list, and random selection with the same probability for everyone.
Independent: a trick for small populations. To keep the probability of being selected the same, take
people out, ask the questions, and put them back in the group.
Quota: targets, such as "find me 100 people of this kind", without the intent of being representative. It is just about getting
the numbers. Not representative.
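
The probability designs above can be sketched with Python's `random` module. The sampling frame of 100 student numbers, the two faculties used as strata, and the clusters of 20 consecutive numbers are all hypothetical.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical sampling frame: 100 student numbers, each tagged with a faculty.
frame = [(number, "science" if number % 4 == 0 else "arts") for number in range(1, 101)]

# Simple random: every element has the same chance of being selected.
simple = rng.sample(frame, 10)

# Systematic (random): random start, then every k-th element of the ordered list.
k = 10
start = rng.randrange(k)
systematic = frame[start::k]

# Stratified: sample separately within each group, here 10% per stratum.
strata = {}
for student in frame:
    strata.setdefault(student[1], []).append(student)
stratified = [s for group in strata.values() for s in rng.sample(group, len(group) // 10)]

# Cluster (random): split the frame into groups and select whole groups at random.
clusters = [frame[i:i + 20] for i in range(0, len(frame), 20)]
cluster_sample = [s for cluster in rng.sample(clusters, 2) for s in cluster]

# "Independent" in the notes: sampling with replacement, so an element can be drawn again.
with_replacement = rng.choices(frame, k=10)

print(len(simple), len(systematic), len(stratified), len(cluster_sample), len(with_replacement))
```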

Difference between simple random and convenience: convenience takes the most convenient route,
disregarding the population you would like to cover. Simple random takes a different approach: you
work hard to cover the whole population and choose from that. If you are lucky, a convenience sample can still be representative.

Example Public Transport Bureau = stratification: different groups of commuters. A clustered design
within a stratified group is possible. Not systematic, because you would leave out all the people without passes.
➔ Exam: which groups do you want to research? Define the population and the sample: are they
different? Work your way up to the strategy you would choose, and cover each group.
+ Learn the definitions from the book. Do not memorise formulas; pick the right formula and apply it.


Geographic sampling:
- Traverse samples; lines
- Quadrat samples; squares
- Point samples; dots
You want it to be random.
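
One possible way to make the three geographic designs random, assuming a square study area of 10 km by 10 km and quadrats of 1 km by 1 km (both invented for this sketch):

```python
import random

rng = random.Random(5)  # fixed seed so the sketch is reproducible
SIZE = 10.0  # hypothetical square study area, 10 km by 10 km

# Point samples: random dots anywhere in the area.
points = [(rng.uniform(0, SIZE), rng.uniform(0, SIZE)) for _ in range(20)]

# Quadrat samples: random 1 km squares, stored by their lower-left corner.
quadrats = [(rng.uniform(0, SIZE - 1), rng.uniform(0, SIZE - 1)) for _ in range(5)]

# Traverse samples: lines between two random points in the area.
traverses = [
    ((rng.uniform(0, SIZE), rng.uniform(0, SIZE)),
     (rng.uniform(0, SIZE), rng.uniform(0, SIZE)))
    for _ in range(3)
]

print(len(points), len(quadrats), len(traverses))
```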

Processing of data
› How to deal with nonresponse
Distinguish:
• Choice of respondent
- Can still be regarded as a value
- “no opinion” still informs about the respondent’s opinion
- “don’t know” still informs about the reason of nonresponse
• Other causes
- “no answer” does not inform about the position of the respondent

Types of data
Qualitative (Non-numerical values)
› Categories
Quantitative (Numerical values: counts, measurements)
› Discrete; Range of possible values is limited (how many cars do you have? No decimals)
› Continuous; Intermediate values are also possible (height can be as specific as you like. Averages
too: if inhabitants have an average of 0.5 cars, the variable is the number of cars per household, and it
is no longer specifically about cars or inhabitants.)




Measurement levels
› Nominal
- Categorical, no ranking
› Ordinal
- Categorical, ranked (low-high, bad-good etc.)
- Degrees of a certain phenomenon
- Width of intervals unknown
› Ratio (& Interval) = scale in SPSS
- Width of intervals known (= equidistance)
- We can compute differences
Difference between interval and ratio: ratio has a natural/absolute/true zero point.
Example: Celsius = interval (zero degrees does not mean absence of temperature) and Kelvin = ratio.
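
A quick calculation shows what the true zero point changes. The two temperatures are arbitrary examples: differences are the same on both scales, but a ratio such as "twice as warm" only makes sense in Kelvin.

```python
def celsius_to_kelvin(celsius):
    """Convert degrees Celsius to Kelvin."""
    return celsius + 273.15

cold_c, warm_c = 10.0, 20.0  # arbitrary example temperatures

# Differences are meaningful on interval and ratio scales alike.
difference_c = warm_c - cold_c
difference_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cold_c)

# Ratios are only meaningful with a true zero point.
ratio_celsius = warm_c / cold_c  # 2.0, but 20 °C is not "twice as warm" as 10 °C
ratio_kelvin = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cold_c)  # about 1.035

print(difference_c, round(difference_k, 2), ratio_celsius, round(ratio_kelvin, 3))
```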




Example grey colours: ordinal.
Example countries: nominal.
Example German political parties: nominal. A more specific variable, such as number of seats or
degree of conservativeness, changes the measurement level.
Example satisfaction: ordinal. Opinion, width unknown.

Binary variables (a.k.a. Dummy or Boolean) (measurement level = nominal)
› Two possible values: True or not true, yes or no, 1 or 0, agree or disagree.
› Special case of a nominal variable: Mean = proportion of “1”. This makes it possible to calculate a
useful average!
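
A tiny check that the mean of a 0/1 variable equals the proportion of "1". The ten answers are invented for illustration.

```python
import statistics

# Hypothetical answers to "Do you own a bike?", coded as 1 = yes, 0 = no.
owns_bike = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]

mean_value = statistics.mean(owns_bike)               # mean of the dummy variable
proportion_yes = owns_bike.count(1) / len(owns_bike)  # share of "yes" answers

print(mean_value, proportion_yes)  # both are 0.7
```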

Choose suitable variables and measurement levels.

Exploratory Data Analysis
› Study data in order to describe key properties
- What do you see?
› For each variable
- Diagrams and / or tables
- Numerical summaries of distributions
› No single best way of doing EDA
- BUT: the starting point of any decent quantitative analysis!

Distributions (also a quality control: does the variable do what it is supposed to do?)
› Shape
› Center
› Spread
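
Numerical summaries for center, spread and shape can be sketched with Python's `statistics` module. The commuting distances are made up; one long-distance commuter is included on purpose to show its effect.

```python
import statistics

# Hypothetical commuting distances in km, with one long-distance commuter.
distances = [2, 3, 3, 4, 5, 5, 6, 7, 8, 47]

# Center
mean = statistics.mean(distances)      # sensitive to the outlier
median = statistics.median(distances)  # robust to the outlier

# Spread
spread = statistics.stdev(distances)   # standard deviation around the mean
q1, q2, q3 = statistics.quantiles(distances, n=4)
iqr = q3 - q1                          # interquartile range

# Shape: a mean well above the median hints at a right-skewed distribution.
right_skewed = mean > median

print(f"mean={mean}, median={median}, sd={spread:.1f}, IQR={iqr}, right-skewed={right_skewed}")
```

Here the mean (9) lies well above the median (5), which is a first sign that a diagram of this variable is worth a look.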

