Summary

Summary Meth. Meas. and Statistics (424023-b-6) (Statistics part only)

Pages
24
Uploaded on
09-12-2020
Written in
2020/2021

All statistics lectures given by Luc van Baest are included in this summary. The methods summary is also available on my account (you can also buy them in a bundle, which is cheaper). If you are looking for an overview of all the slides, with the additional information given in the lectures, this summary should fit your needs. Good luck!


Statistics lectures

Lecture 1
We use statistics to:
- Describe/summarize data: descriptive statistics
- Draw inferences about populations: inferential statistics
- Study complex multivariate relationships: statistical modeling

Measurement levels
1. Nominal data: numbers express group membership (in 2 or more categories).
Ex. Marital status: 1 = single, 2 = married, 3 = in a serious relationship, 4 = not specified.
Categories must be exhaustive (all possibilities are covered) and mutually
exclusive (i.e. every case fits into one category and one category only).
2. Ordinal data: numbers express an ordering (less/more).
Ex. Smoking intensity: 1 = never, 2 = occasionally, 3 = regularly, 4 = heavy.
The numbers express more or less of a quantity, but the difference between 1 and 2 is
not necessarily the same in quantity as the difference between 2 and 3, 3 and 4, etc.
3. Interval and ratio (scale level): numbers express differences in quantity using a
common unit. Ex. The difference between 70 and 80 IQ points is comparable to the
difference between 100 and 110: both span 10 units. Likewise, if the temperature is
30 degrees on Monday, 25 degrees on Tuesday and 15 degrees on Wednesday,
then we can say that the temperature drop between Tuesday and
Wednesday is twice as large as the drop between Monday and Tuesday.
- Interval: no natural zero point. The zero is arbitrarily chosen and can differ
across scales (Fahrenheit and Celsius), so zero degrees does not mean "no temperature".
- Ratio: has a natural zero point, so you can compare the relative magnitude of
things. You can say, for example, that one person is twice as tall as another (length and
income are examples).
All data that are not nominal or ordinal are at scale level.

Every analysis starts with data inspection: getting to know your data. In general, we want
to know more about:
- Central tendency: What are the most common values?
- Variability: How large are the differences between the subjects? Are there extreme
values in the sample?
- Bivariate association: for each pair of variables, do they associate/covary?

Graphs and statistics
- Bar charts (nominal and ordinal data)
- Histogram (scale data)
- Scatterplots
- Numerical summaries: Frequency tables

Central Tendency – Mode, Median and Mean
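- Mode: the score that is observed most frequently
- Median: the middle score of the ordered scores. Odd N: 4, 5, 6, 7, 8, 9, 9 gives 7. Even N: 4, 5, 6, 8, 9, 9 gives (6 + 8)/2 = 7
- Mean: $M = \frac{\sum X}{N}$ (sum of all scores divided by the total number of scores), i.e. the average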
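The three measures can be checked with a minimal Python sketch. It reuses the two score lists from the median example; the variable names are illustrative only.

```python
import statistics

# The two score lists used in the median example above.
scores_odd = [4, 5, 6, 7, 8, 9, 9]
scores_even = [4, 5, 6, 8, 9, 9]

mode_odd = statistics.mode(scores_odd)        # most frequent score: 9
median_odd = statistics.median(scores_odd)    # middle of 7 ordered scores: 7
median_even = statistics.median(scores_even)  # average of the two middle scores: 7.0
mean_even = statistics.mean(scores_even)      # sum of scores / number of scores

print(mode_odd, median_odd, median_even, mean_even)
```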

Deviation scores
= the difference between a score X and the mean score M: $X - M$.
What is variability? Differences between scores. The sum of the deviation scores from the mean is always 0, so the deviations are squared before they are summed:
$SS = \sum (X - M)^2$ (sum of squares)

Measures of Variability
Variance: $S^2 = \frac{\sum (X - M)^2}{N - 1} = \frac{SS}{df}$

Standard deviation (SD = S): $S = \sqrt{S^2}$
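As a sketch of the computation, the steps from deviation scores to SD can be written out in Python. The scores are hypothetical; the result is compared against the standard library's `statistics.variance` and `statistics.stdev`, which also divide by N - 1.

```python
import math
import statistics

scores = [4, 5, 6, 7, 8, 9, 9]  # hypothetical scores, for illustration only

n = len(scores)
m = sum(scores) / n                    # mean M
deviations = [x - m for x in scores]   # deviation scores X - M
ss = sum(d ** 2 for d in deviations)   # sum of squares SS
variance = ss / (n - 1)                # S^2 = SS / df, with df = N - 1
sd = math.sqrt(variance)               # S = sqrt(S^2)

print(round(sum(deviations), 10), variance, sd)
```

The deviation scores sum to zero (up to floating-point rounding), which is why they are squared before summing.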

Minimum, maximum, range and interquartile range
Minimum: lowest observed value
Maximum: highest observed value
Range: maximum – minimum
IQR: the range of scores that encompasses the middle 50% of the observations, thus excluding the
25% lowest and 25% highest observations
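A small Python sketch of these four measures, on hypothetical scores. Note that there are several conventions for computing quartiles; `statistics.quantiles` with its default method is only one of them, so other software may report a slightly different IQR for the same data.

```python
import statistics

scores = [4, 5, 6, 7, 8, 9, 9]  # hypothetical scores, for illustration only

minimum = min(scores)
maximum = max(scores)
value_range = maximum - minimum

# Three cut points that split the ordered scores into four equal parts.
q1, q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1  # spread of the middle 50% of the observations

print(minimum, maximum, value_range, q1, q3, iqr)
```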

Z-score = a measure of relative distance
A Z-score is closely related to the deviation score. However, a Z-score also tells us how extreme or normal a certain
score is: a deviation score of 3.75 can be very large or just normal, depending on the spread of the scores.
$Z = \frac{X - M}{S}$ (sample) or $Z = \frac{X - \mu}{\sigma}$ (population)
Because the deviation score is $X - M$, the Z-score is the deviation score divided by $S$.
The Z-score is the distance between a score and the mean score, relative to the variability of the
scores.
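To illustrate why the same deviation score can be extreme or ordinary, here is a minimal Python sketch. The function name `z_score` and all numbers are hypothetical and chosen only for illustration.

```python
def z_score(x, mean, sd):
    """Distance between a score and the mean, in units of the standard deviation."""
    return (x - mean) / sd

# Same deviation score (3.75) in two hypothetical samples with different spread.
z_small_spread = z_score(103.75, 100, 1.5)  # SD = 1.5: the score is far from the mean
z_large_spread = z_score(103.75, 100, 15)   # SD = 15: the score is close to the mean

print(z_small_spread, z_large_spread)
```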

The normal distribution
- Mathematical distribution (Gauss curve)
- Total area under the curve is equal to 1
- Symmetrical distribution

Example: what is the probability that a person is taller than 212 cm, given $\mu = 180$ cm and $\sigma = 20$ cm?
This involves 2 steps:
1. Compute the Z-score: $Z = \frac{X - \mu}{\sigma} = \frac{212 - 180}{20} = 1.6$
Hence, we need to know the area to the right of 1.6 under the standard normal
distribution, that is $P(Z > 1.6)$.
2. Use the table in the book (or a tool) to find the proportion:
$P(Z > 1.6) = 0.0548$, hence 5.48% of the population is taller than 212 cm.
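Instead of a printed table, the same two steps can be sketched in Python with `statistics.NormalDist` from the standard library. This is just one possible tool, not necessarily the one used in the course.

```python
from statistics import NormalDist

height = NormalDist(mu=180, sigma=20)  # population values from the example

# Step 1: Z-score of 212 cm.
z = (212 - height.mean) / height.stdev

# Step 2: area to the right of the score, computed two equivalent ways.
p_taller = 1 - height.cdf(212)     # directly on the height scale
p_via_z = 1 - NormalDist().cdf(z)  # on the standard normal scale

print(round(z, 2), round(p_taller, 4))
```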

Empirical rules for the normal distribution
- About 68% of the cases can be found within one SD (S) from the mean
- About 95% of the cases can be found within two SDs (2S) from the mean
Example: if you know that weekly salary is normally distributed with a mean of 300 and an SD of 15,
you know that about 68% have an income between 285 and 315, and about 95% between 270 and 330.
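The two percentages are rounded values. A short Python sketch on the salary example shows the more precise proportions under the normal distribution.

```python
from statistics import NormalDist

salary = NormalDist(mu=300, sigma=15)  # weekly salary from the example

# Proportion of cases within one and within two SDs of the mean.
within_one_sd = salary.cdf(315) - salary.cdf(285)
within_two_sd = salary.cdf(330) - salary.cdf(270)

print(round(within_one_sd, 4), round(within_two_sd, 4))
```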

Lecture 2: Introduction to inferential statistics
When we want to know something about a population, we observe a sample.

Sampling fluctuations = The variability across samples (they are always a little different)
Also, the sample values will always be a little different from the population value.
Differences between sample values and population values are known as sampling errors.
We therefore use inferential statistics to take sampling errors into account when drawing
conclusions about populations from sample results.

Sampling distribution of the mean: We use sample means to draw conclusions about
population means.
- However, each random sample has a different composition; by chance you may have
heavy social media users in the sample, but if you draw a new sample, you may by
chance have persons who use social media less intensively.
- Thus, each time we draw a new sample, the sample mean will vary.
- Using statistics we can predict the sampling fluctuations; that is, we can describe the
variation in means across all possible samples!

Theoretical sampling distribution of the mean = describes how sample means would vary
across samples if we repeated the sampling many times (with the same N, from
the same population).
- Sample means fluctuate around the population mean; hence, the mean of all sample
means IS the population mean!
- The variation in sample means depends on sample size: the larger the sample, the
smaller the variation in sample means.
- Statisticians have shown that if X is normally distributed with mean $\mu$ and standard deviation $\sigma$, the
sample mean (M) is normally distributed with mean equal to the population mean $\mu$
and standard deviation $\sigma_M = \frac{\sigma}{\sqrt{N}}$ (used for the z-score) or, when $\sigma$ is estimated by the sample value S, $SE_M = \frac{S}{\sqrt{N}}$ (used for the t-score).
The latter is called the standard error.
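The claims about the sampling distribution can be checked with a small simulation. The sketch below draws many samples of the same size from a normal population and compares the spread of the sample means with $\sigma / \sqrt{N}$. The population values, sample size, number of repetitions and seed are all hypothetical choices.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the simulation is reproducible

mu, sigma, n = 100, 15, 25  # hypothetical population mean, SD and sample size
repetitions = 5000          # number of samples drawn from the same population

# Draw many samples of size n and store the mean of each one.
sample_means = [
    statistics.mean([random.gauss(mu, sigma) for _ in range(n)])
    for _ in range(repetitions)
]

mean_of_means = statistics.mean(sample_means)  # should be close to mu
sd_of_means = statistics.stdev(sample_means)   # should be close to sigma / sqrt(n)
theoretical_se = sigma / math.sqrt(n)

print(round(mean_of_means, 2), round(sd_of_means, 2), theoretical_se)
```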

Standard errors = the standard deviation of the sampling distribution
They show the variation in sample values (means) across samples.
The larger the standard error, the more sensitive the results are to sampling fluctuations.
- In general: the larger the sample size, the smaller the standard error, and the less
sensitive sample results are to sampling fluctuations.
- The more heterogeneous the population (the larger $\sigma$), the more sensitive sample
results are to sampling fluctuations.
- The standard error is related to the errors we make if we use the sample value as an
estimate of the population value.
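The effect of sample size and heterogeneity on the standard error follows directly from the formula SD divided by the square root of N. A minimal sketch, with a hypothetical helper `standard_error` and made-up numbers:

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: SD divided by the square root of the sample size."""
    return sd / math.sqrt(n)

se_n25 = standard_error(15, 25)          # baseline
se_n100 = standard_error(15, 100)        # 4x larger sample: SE is halved
se_heterogeneous = standard_error(30, 25)  # 2x larger SD: SE is doubled

print(se_n25, se_n100, se_heterogeneous)
```

Note that quadrupling the sample size only halves the standard error, because N enters through its square root.
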

Central limit theorem
As a rule of thumb, with a sample of 30 or more observations it is safe to conclude that the sample means are
(approximately) normally distributed, even if X itself is not.
In small samples, you can only use the normal distribution for the means if you are willing to
assume that X is normally distributed in the population.
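A simulation can make the theorem concrete. The sketch below assumes a strongly right-skewed population (an exponential distribution with mean 1, chosen only for illustration) and checks whether the means of samples of size 30 behave roughly like a normal distribution, using the 68% rule as a rough check.

```python
import random
import statistics

random.seed(2)  # fixed seed so the simulation is reproducible

def sample_mean(n):
    # random.expovariate(1.0) draws from an exponential distribution with mean 1,
    # which is strongly right-skewed (clearly not normal).
    return statistics.mean([random.expovariate(1.0) for _ in range(n)])

means = [sample_mean(30) for _ in range(5000)]

mean_of_means = statistics.mean(means)  # close to the population mean of 1
sd_of_means = statistics.stdev(means)   # close to 1 / sqrt(30)

# Share of sample means within one SD of their mean; roughly 68% if near-normal.
within_one_sd = sum(
    abs(m - mean_of_means) <= sd_of_means for m in means
) / len(means)

print(round(mean_of_means, 3), round(sd_of_means, 3), round(within_one_sd, 3))
```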

Lecture 1: we describe individual scores, so we talk about the standard deviation.
Lecture 2: we describe sample means, so we talk about the standard error.
maudbressers Tilburg University