Summary

Statistics 1 summary

Pages
51
Uploaded on
24-01-2023
Written in
2022/2023

Summary of all the lectures, computer practicals and seminars of Statistics 1 in the year 2022/2023. This includes personal notes and examples + figures from the PowerPoint presentations.

STATISTICS 1
Week 1 - Lecture 1 & 2

Statistics is a guessing game.
You never know the parameter (the truth about the population); you only hope that your estimate is close.

Population = The group that you wish to describe (The entire set of elements)
Sample = The group for which you have data (A subset of elements from the population,
taken with the intention of making inferences about the population)

Why take a Sample?
› Describing the whole population is often:
• Too expensive
• Impossible
• Destructive (taking the measurement may destroy the element)
• Impractical
• Unnecessary

Parameter = Numerical property of the population (based on the entire population/ the truth)
Statistic = Numerical property of a sample (based on the sample only/ an estimate of the parameter)

Sampling error
› A difference between the value of a parameter and the statistic computed to estimate that
parameter
› Result of:
• Variability
• Sampling Bias
• Nonsampling Error
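
The definitions above can be sketched in a few lines of Python. The population of student ages is made up for illustration, and the variable names are assumptions rather than anything from the lecture. In practice the parameter is unknown, so the sampling error cannot be computed like this.

```python
import random
import statistics

# Hypothetical population: ages of 1,000 students (made-up data).
rng = random.Random(42)
population = [rng.randint(18, 30) for _ in range(1000)]

# Parameter: computed from the entire population (normally unknown).
parameter = statistics.mean(population)

# Statistic: computed from a sample of n = 25 elements.
sample = rng.sample(population, 25)
statistic = statistics.mean(sample)

# Sampling error: difference between the statistic and the parameter.
sampling_error = statistic - parameter
print(f"parameter={parameter:.2f} statistic={statistic:.2f} error={sampling_error:+.2f}")
```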

Reducing Sampling Error
› Variability (this Lecture)
- Increase n
› Sampling Bias (this Lecture)
- Design of sampling procedure
› Nonsampling Error
- Validity, Accuracy, Precision of variables
- Prevent coding errors
- Prevent interpretation errors
- Also: good labelling, metadata

➔ You do have control over variability, sampling bias and nonsampling error; you want to
minimize them.

Variability = The phenomenon whereby repeated sampling from the same population results in
different values for the statistic.

Example: ask 5 students in a course group their age and compute the average. Ask again with 5
different students. How different are the two average ages?
That difference = variability (sample size and diversity of the population are important). Statistically
you want it to be as low as possible, to increase confidence in the result. The solution is to increase
the sample size.
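
A minimal simulation of this example, assuming a made-up population of student ages. The function name `spread_of_sample_means` is hypothetical. It repeats the sampling many times and measures how much the sample means differ for a small and a larger n.

```python
import random
import statistics

def spread_of_sample_means(population, n, repeats, rng):
    """Draw `repeats` samples of size n; return the standard deviation of their means."""
    means = [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)

rng = random.Random(1)
population = [rng.randint(18, 30) for _ in range(2000)]  # made-up ages

spread_small = spread_of_sample_means(population, 5, 500, rng)
spread_large = spread_of_sample_means(population, 50, 500, rng)
print(f"n=5: {spread_small:.2f}   n=50: {spread_large:.2f}")
```

The means of the larger samples should differ much less from each other, which is the sense in which increasing n reduces variability.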

Sampling distribution = Describes how the statistic varies when sampling is repeated.
- In other words: describes (extent of) variability
- This is the basis for inference

Central Limit Theorem
Even if a variable X is not normally distributed in the population …
› … we may assume that …
Under certain conditions, such as a large number of cases n and a fixed (finite) standard deviation σ
› ... the sampling distribution of the mean is approximately normal with standard error:

$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
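
As a rough check of the theorem, the sketch below draws repeated samples from a clearly skewed, made-up population and compares the spread of the sample means with the usual standard error of the mean, σ/√n. The exponential population, n = 40 and 2,000 repeats are arbitrary choices for illustration.

```python
import math
import random
import statistics

rng = random.Random(7)

# Skewed, non-normal population (exponential); made-up data.
population = [rng.expovariate(1.0) for _ in range(20000)]
sigma = statistics.pstdev(population)

n = 40
# rng.choices draws with replacement, so the draws are independent.
sample_means = [statistics.mean(rng.choices(population, k=n)) for _ in range(2000)]

observed_se = statistics.stdev(sample_means)   # spread of the sampling distribution
theoretical_se = sigma / math.sqrt(n)          # sigma / sqrt(n)
print(f"observed={observed_se:.3f} theoretical={theoretical_se:.3f}")
```

The two numbers should be close, and a histogram of `sample_means` would look roughly bell-shaped even though the population itself is not.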




Sampling Bias = Result of procedures which favour the inclusion, in your sample, of elements from
the population with certain characteristics. (make sure you have the right people in your sample)

› Sources of Sampling Bias: (a combination of) the
- population
- researcher
- research design
- research topic
- respondent
› May result in:
- incomplete coverage: relevant elements not in sampling frame
- nonresponse: refusal or missing data

➔ Increasing the sample size does not reduce sampling bias; with a biased procedure, a larger sample only increases the problem.
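
A small sketch of why a larger sample does not help here, assuming incomplete coverage: 30% of a made-up population is missing from the sampling frame and has systematically higher values. All numbers and the function name `biased_estimate` are hypothetical.

```python
import random
import statistics

rng = random.Random(3)

# Made-up population: 70% covered by the sampling frame, 30% not.
# The uncovered group has systematically higher values.
covered = [rng.gauss(50, 10) for _ in range(7000)]
uncovered = [rng.gauss(70, 10) for _ in range(3000)]
true_mean = statistics.mean(covered + uncovered)

def biased_estimate(n):
    """Mean of a sample drawn only from the covered part of the population."""
    return statistics.mean(rng.sample(covered, n))

for n in (50, 500, 5000):
    estimate = biased_estimate(n)
    print(f"n={n:5d} estimate={estimate:.1f} true mean={true_mean:.1f}")
```

The estimates settle ever more tightly around the mean of the covered group, not around the population mean, so the gap stays.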


Population: reluctant to participate, don’t trust science.
Researcher: are we able to see (reach) the whole population?




Difference between probability and non-probability samples: who makes the decisions (chance or the researcher).


Probability samples: driven by chance + reduced sampling bias.
Non-probability samples: researcher is in charge + risk of bias.
Judgemental: the researcher handpicks who to research, based on suitability.
Volunteer: people put themselves forward ("hey, I want to be in your research").
Convenience: the lazy option; only ask people who happen to be there, e.g. people queuing (easy to
reach and with nowhere else to go).
Cluster (random): assumes that the population consists of groups that are similar to each other. Then
it doesn’t really matter which groups you pick.
Stratified: opposite of cluster; the groups differ from each other, so you sample from every group.
Maybe different approaches per group.
Systematic (random): the population is already ordered, for example by student number. Take every
5th person etc.
Simple random: the ideal case; a perfect list in which everyone has the same probability of being
selected. Clear population + list + random selection.
Independent: a trick for small populations. To keep the probability of being selected the same for
every draw, take elements out, ask the questions, and put them back in the group (sampling with
replacement).
Quota: targets, e.g. "find me 100 people of this kind", without the intention of being representative.
It is just about getting the numbers. Not representative.
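
The probability designs in this list can be sketched with Python's `random` module. The sampling frame of 100 students, the faculties used as strata and the tutorial groups used as clusters are all hypothetical; this is one simple way to write each design, not a prescribed procedure.

```python
import random

rng = random.Random(0)

# Hypothetical sampling frame: 100 students, each with a faculty and a tutorial group.
FACULTIES = ("arts", "science", "law", "medicine")
frame = [{"id": i, "faculty": FACULTIES[i % 4], "group": i // 10} for i in range(100)]

def simple_random(n):
    """Every element has the same chance of selection (drawn without replacement)."""
    return rng.sample(frame, n)

def with_replacement(n):
    """'Independent' draws: each element is put back, so it can be picked again."""
    return rng.choices(frame, k=n)

def systematic(step):
    """Random start, then every `step`-th element of the ordered list."""
    return frame[rng.randrange(step)::step]

def stratified(n_per_faculty):
    """Random draw within every faculty (stratum), so each group is covered."""
    sample = []
    for faculty in FACULTIES:
        stratum = [e for e in frame if e["faculty"] == faculty]
        sample.extend(rng.sample(stratum, n_per_faculty))
    return sample

def cluster(n_groups):
    """Randomly pick whole tutorial groups (clusters) and keep all their members."""
    chosen = set(rng.sample(range(10), n_groups))
    return [e for e in frame if e["group"] in chosen]

print(len(simple_random(10)), len(systematic(5)), len(stratified(3)), len(cluster(2)))
```

In every function the selection is left to the random number generator, which is what separates these designs from judgemental, volunteer, convenience and quota samples.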

Difference between simple random and convenience: a convenience sample takes the most convenient
route and disregards the population you would like to cover. Simple random takes a different approach:
work hard to cover the whole population and choose randomly from that. If you are lucky, a
convenience sample can still be representative.

Example Public Transport Bureau = stratification; different groups of commuters. A clustered design
within a stratified group is possible. Not systematic, because you leave out all the people without passes.
➔ Exam: which groups do you want to research? Define the population and the sample; are they
different? Work your way up to the strategy you would choose, and cover each group.
+ Definitions from the book. You don’t need to memorise formulas: pick the right formula and apply it.


Geographic sampling:
- Traverse samples; lines
- Quadrat samples; squares
- Point samples; dots
You want it to be random.
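
A minimal sketch of the three geographic sample types with random placement. The rectangular study area of 100 by 60 km and all function names are assumptions for illustration.

```python
import random

rng = random.Random(11)
WIDTH, HEIGHT = 100.0, 60.0  # hypothetical study area, in km

def point_samples(n):
    """n random dots inside the study area."""
    return [(rng.uniform(0, WIDTH), rng.uniform(0, HEIGHT)) for _ in range(n)]

def quadrat_samples(n, side):
    """n random squares, given as (x_min, y_min, x_max, y_max), inside the area."""
    squares = []
    for _ in range(n):
        x = rng.uniform(0, WIDTH - side)
        y = rng.uniform(0, HEIGHT - side)
        squares.append((x, y, x + side, y + side))
    return squares

def traverse_samples(n):
    """n random lines, each joining two random points in the area."""
    return list(zip(point_samples(n), point_samples(n)))

print(point_samples(2))
print(quadrat_samples(1, side=5.0))
print(traverse_samples(1))
```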

Processing of data
› How to deal with nonresponse
Distinguish:
• Choice of respondent
- Can still be regarded as a value
- “no opinion” still informs about the respondent’s opinion
- “don’t know” still informs about the reason of nonresponse
• Other causes
- “no answer” does not inform about the position of the respondent

Types of data
Qualitative (Non-numerical values)
› Categories
Quantitative (Numerical values: counts, measurements)
› Discrete; Range of possible values is limited (how many cars do you have? No decimals)
› Continuous; Intermediate values are also possible (height can be measured as precisely as you
like. Also averages: inhabitants have an average of 0.5 cars; the variable is then the number of cars
per household, and is no longer specifically about cars or inhabitants.)



Measurement levels
› Nominal
- Categorical, no ranking
› Ordinal
- Categorical, ranked (low-high, bad-good etc.)
- Degrees of a certain phenomenon
- Width of intervals unknown
› Ratio (& Interval) = scale in SPSS
- Width of intervals known (= equidistance)
- We can compute differences
Difference between interval and ratio: ratio has a natural/absolute/true zero point.
Example: Celsius = interval (zero degrees does not mean absence of temperature) and Kelvin = ratio.
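
A short illustration of the Celsius and Kelvin example: differences can be computed on both scales, but ratios only make sense when the scale has a true zero. The two temperatures are arbitrary.

```python
def celsius_to_kelvin(celsius):
    """Kelvin has a true zero point; Celsius does not."""
    return celsius + 273.15

a_c, b_c = 10.0, 20.0
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences are meaningful on both scales and agree (10 degrees).
print(b_c - a_c, round(b_k - a_k, 2))

# Ratios are only meaningful on the ratio scale.
print(b_c / a_c, round(b_k / a_k, 3))
```

On the Celsius scale the ratio is 2, but 20 °C is not "twice as warm" as 10 °C: in Kelvin the ratio is only about 1.035.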




Example grey colours: ordinal.
Example countries: nominal.
Example German political parties: nominal. A more specific variable, such as number of seats or
degree of conservativeness, has a different measurement level.
Example satisfaction: ordinal. Opinion, width unknown.

Binary variables (a.k.a. Dummy or Boolean) (measurement level = nominal)
› Two possible values: true or not true, yes or no, 1 or 0, agree or disagree.
› Special case of a nominal variable: when coded as 0/1, Mean = proportion of “1”. > Possibility to
calculate a useful average!
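
A quick check of this property with made-up survey answers, coded 1 = yes and 0 = no. The variable name `owns_bicycle` is hypothetical.

```python
import statistics

# Hypothetical answers to "Do you own a bicycle?" coded as 1 = yes, 0 = no.
owns_bicycle = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]

mean_value = statistics.mean(owns_bicycle)
proportion_yes = owns_bicycle.count(1) / len(owns_bicycle)

# The mean of a 0/1 variable equals the share of ones.
print(mean_value, proportion_yes)
```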

Choose suitable variables and measurement levels.

Exploratory Data Analysis
› Study data in order to describe key properties
- What do you see?
› For each variable
- Diagrams and / or tables
- Numerical summaries of distributions
› No single best way of doing EDA
- BUT: the starting point of any decent quantitative analysis!

Distributions (> quality control, does the variable do what it is supposed to do)
› Shape
› Center
› Spread

Enya96 Rijksuniversiteit Groningen