Summary

Statistical Methods - Slides Summary

Pages: 65
Uploaded on: January 3, 2025
Written in: 2020/2021

A summary of all the slides for the course Statistical Methods, BSc AI.

Statistical Methods - Summary

Lecture 1
● Statistics: science of data, the study of collecting, organizing, analyzing, interpreting and
presenting data.
○ Statistics are used to gain information about a group of objects (population)
and/or to make decisions and predictions when randomness is involved.
● Census: collection of data from every member of a population.
○ Usually the population is too large for a census to be feasible.
○ Therefore a sample, a selected subcollection (or subset) of the population, is
studied.
■ A different sample results in different data and hence possibly different
conclusions about the population. A sample should be representative
(same characteristics as the population) and unbiased (no systematic
difference with the population).
○ Sample → Data → Analysis → Conclusion about population
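
The point that a different sample results in different data can be illustrated with a small R sketch. The population below is simulated and purely hypothetical; the names population_values, sample_a and sample_b are assumptions for illustration and do not come from the slides.

```r
# Hypothetical population: 10,000 simulated body lengths in cm (illustration only)
set.seed(1)
population_values <- rnorm(10000, mean = 170, sd = 10)

# Two different samples of size 50 drawn from the same population
sample_a <- sample(population_values, size = 50)
sample_b <- sample(population_values, size = 50)

# The sample means will typically differ from each other
# and from the population mean
mean(population_values)
mean(sample_a)
mean(sample_b)
```

Both sample means are estimates of the same population mean, which is why conclusions can change from one sample to the next.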

1.2 Statistical and critical thinking
● A statistical study consists of the following steps:
1. Prepare
a. Context
b. Source
c. Sampling method (how to obtain samples?)
2. Analyse
a. Graph data
b. Explore data
c. Apply statistical methods
3. Conclude

1.4 Collecting sample data:
● There are different methods to collect sample data
○ Voluntary response sample: subjects decide themselves to be included in the
sample.
○ Random sample: each member of the population has equal probability of being
selected.
○ Simple random sample: each sample of size n has equal probability of being
chosen.
○ Systematic sampling: after a starting point, select every k-th member.
○ Convenience sampling: use results that are easily available.
○ Stratified sampling: divide population into subgroups (strata) such that subjects
within groups have the same characteristics, then draw a (simple) random sample
from each group.





○ Cluster sampling: Divide population into sections (clusters), then randomly
select some of these clusters.
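
Several of the sampling methods above can be sketched in base R. This is a minimal sketch under stated assumptions: the data frame pop, its columns id and group, and the sample sizes are hypothetical, and the same grouping is reused as strata and as clusters only to keep the example short.

```r
# Hypothetical population of 100 members in 4 groups (illustration only)
set.seed(42)
pop <- data.frame(id = 1:100, group = rep(c("A", "B", "C", "D"), each = 25))

# Simple random sample of size n = 10: every subset of size 10 is equally likely
srs <- pop[sample(nrow(pop), 10), ]

# Systematic sample: random starting point, then every k-th member
k <- 10
start <- sample(k, 1)
systematic <- pop[seq(start, nrow(pop), by = k), ]

# Stratified sample: simple random sample of 3 members from each group (stratum)
strat_ids <- unlist(lapply(split(pop$id, pop$group), sample, size = 3))
stratified <- pop[pop$id %in% strat_ids, ]

# Cluster sample: randomly select 2 whole groups (clusters), keep all their members
chosen <- sample(unique(pop$group), 2)
cluster <- pop[pop$group %in% chosen, ]
```

Note the difference between the last two: stratified sampling takes a few members from every group, while cluster sampling takes every member from a few groups.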
● Important concepts:
○ Variable: quantity that may vary
● In cause and effect studies:
○ Explanatory (independent) variable: variable which might cause the effect
being studied.
○ Response (dependent) variable: variable that represents the effect being studied.
○ Confounding: occurs when the influences of different explanatory variables on
the response variable mix and cannot be distinguished anymore.
● Different types of study:
○ Observational study: characteristics of subjects are observed, but subjects are
not modified.
■ Retrospective (case-control): data from the past
■ Cross-sectional: data from one point in time
■ Prospective (longitudinal): data to be collected
○ Experiment: some treatment is applied to subjects.
■ Sometimes with a control group and a treatment group; can be single-blind or double-blind.
■ Placebo effect, experimenter effect.

1.3 Types of data
● Parameter: numerical measurement describing some characteristic of a population.
○ Notation: typically Greek symbols, e.g. μ, σ, ...
● Statistic: numerical measurement describing some characteristic of a sample.
○ Notation: small letters, e.g. x̄, s.
● Data is not only numbers
○ Quantitative (numerical) data: numbers representing counts or measurements
■ E.g., number of students’ siblings: 1, 0, 2, 2, 5...
○ Qualitative (categorical) data: names or labels (“1”, not 1) that do not represent
counts or measurements
■ E.g., quality of a course: good/fair/bad
● Quantitative data:
○ Discrete data: number of possible values is “countable”
■ E.g., word counts, number of coin tosses
○ Continuous data: collection of values is not countable
■ E.g., length, weight, distance
● Level of measurement of data is used to determine which statistical methods might apply
to the data.






○ Qualitative data:
■ Nominal: names, labels, categories (no ordering).
● E.g. gender, eye color. Cannot be used for computations.
■ Ordinal: categories with ordering, but no (meaningful) differences.
● E.g. U.S. grades (A-F), opinions (totally disagree / disagree / . . . /
totally agree)
○ Quantitative data:
■ Interval: ordering possible and differences between numbers are
meaningful, but there is no natural zero starting point.
● E.g. year of birth, temperatures (Celsius/Fahrenheit).
■ Ratio: ordering possible, differences are meaningful and there is a natural
starting point.
● E.g. body length, marathon times
● Determine the level of measurement for the following data:
○ M&M colours = nominal data (qualitative, no ordering)
○ Inauguration years of U.S. presidents = interval data (quantitative, no natural
starting point)
○ Brain volumes (in cm³) = ratio data (quantitative, natural starting point)
○ Level of lead in blood (low/medium/high) = ordinal data (qualitative, ordering)
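
The four levels of measurement map roughly onto R data types, as the sketch below suggests. The variable names and the example values for colours, lead levels and brain volumes are hypothetical; the three years are used only as sample inauguration years.

```r
# Nominal: unordered factor, categories without ordering
colour <- factor(c("red", "blue", "green", "blue"))

# Ordinal: ordered factor, comparisons are meaningful but differences are not
lead <- factor(c("low", "high", "medium", "low"),
               levels = c("low", "medium", "high"), ordered = TRUE)
lead[1] < lead[2]
max(lead)

# Interval: differences are meaningful, ratios are not (no natural zero)
years <- c(1789, 1861, 1933)
diff(years)

# Ratio: natural zero, so ratios are meaningful as well
volume_cm3 <- c(1000, 1250)
volume_cm3[2] / volume_cm3[1]
```

Storing ordinal data as an ordered factor lets R compare categories while still refusing arithmetic on them, which matches the restriction described above.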

Summarizing and graphing data
● From now on, we assume that data are from a representative and unbiased sample.
● Next: summarize data
○ Numerical summary
○ Graphical summary
● Every data set comes with a research question. Use your summary to answer your
research question.
● Typically we are interested in the data distribution: where do the data lie?
● Good summary shows:
○ what the data distribution looks like: location, spread/dispersion, range, extremes,
accumulations, gaps/holes, symmetry, . . .
● Depending on context and goal, also whether:
○ data could be sampled from a certain distribution
○ data is rounded
○ different groups are needed for further analysis
○ there are influences of other variables, e.g. time
○ there is dependence between variables.
● Summarise to describe or find structure in data distribution:
○ Graphical: tables, graphs, other figures of data distribution






○ Descriptive
■ Qualitative: describe shape, location and dispersion/variation of data
distribution
■ Quantitative: numerical summaries of location and variation
○ NB: the first step in every data analysis is to make some figures of the data (if
possible) for your own use. This can prevent a wrong choice of statistical methods.
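
Numerical summaries of location and variation can be computed with a few base R functions. The grades vector below is hypothetical and serves only as an illustration.

```r
# Hypothetical sample of exam grades (illustrative values only)
grades <- c(5.5, 6.0, 6.5, 7.0, 7.0, 7.5, 8.0, 8.5, 9.0, 3.0)

# Location
mean(grades)
median(grades)

# Variation
sd(grades)
range(grades)
IQR(grades)

# Minimum, quartiles, median, mean and maximum in one call
summary(grades)
```

The single low grade pulls the mean below the median here, which is the kind of feature a quick look at the summaries can reveal.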

Graphical summaries
→ Some of these summaries can only be used for some types of data.
● Frequency distribution (table)
○ Count occurrences of category or number of values in interval
○ # grades2 comes from the course material (not shown); column 2 is assumed to hold the grades
freq=cbind(table(grades2[,2]))
n=sum(freq[,1]) # total number of observations
freq=cbind(freq[,1],cumsum(freq[,1]),freq[,1]/n,cumsum(freq[,1])/n)
colnames(freq)=c("Frequency","Cumulative","Rel. frequency","Cum. rel. frequency")
options(digits=2)
print(freq)




● Bar chart
○ population=c(322,1372,147,127,65,81,1278,36,407,1111)
names(population)=c("US", "Chi", "Rus", "Jap", "GB",
"Ger", "Ind", "Can", "SAm","Afr")
par(mfrow=c(1,1))
barplot(population,main="Bar chart", ylab="Pop. size (mln)",col="red")




● Pareto bar chart
○ Orders the categories by frequency, from highest to lowest. Only applies to data
at the nominal level of measurement.
par(mfrow=c(1,1))
barplot(sort(population,decreasing = TRUE), main="Pareto bar chart", ylab="Pop. size (mln)", col="blue")




● Pie chart
○ Size of pieces of pie is determined by relative frequency of
category. Mainly used for qualitative data.
○ pie(population/sum(population), col=c("green", "yellow" , "brown",
"blue","red", "grey","purple", "orange", "pink", "black"))




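
Since the data set behind the frequency table is not included in these notes, a self-contained version may be useful. The sketch below builds the same four columns from a hypothetical vector of letter grades; the data and the name grades are assumptions for illustration.

```r
# Hypothetical letter grades (illustrative values, not course data)
grades <- c("B", "A", "C", "B", "B", "D", "A", "C", "B", "C")

counts <- table(grades)   # frequency per category
n <- sum(counts)          # total number of observations

# Frequency, cumulative, relative and cumulative relative frequency
freq <- cbind(Frequency = as.vector(counts),
              Cumulative = cumsum(as.vector(counts)),
              "Rel. frequency" = as.vector(counts) / n,
              "Cum. rel. frequency" = cumsum(as.vector(counts)) / n)
rownames(freq) <- names(counts)
print(freq, digits = 2)
```

The last row of the cumulative columns should equal the number of observations and 1 respectively, which is a quick sanity check on any frequency table.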

tararoopram Vrije Universiteit Amsterdam