Class notes

MAT-22306 Lectures Quantitative Research Methodology and Statistics

Pages
31
Uploaded on
09-09-2021
Written in
2021/2022

Extensive lecture summary of the course Quantitative Research Methodology and Statistics (MAT) at Wageningen University (WUR). Slides included as examples to give an extensive overview.


Document information

Professor(s)
Jos Hageman
Contains
All classes

Content preview

MAT22306 - Quantitative research methodology and statistics
Lecture 1.1
Data types and distributions:
Variables must be able to vary (take different values), e.g. gender (can be male/female). Male is not a variable, as it
cannot vary. Male is a level of the variable gender.

Types of variables:
Categorical/nominal: there’s no order or magnitude. Solely distinguishes between levels.
Ordinal: distinguishes between levels, with a fixed order. Clear order, but no clear magnitude/difference between the values.
Interval: distinguishes between levels and values, with a fixed order and equal distances between adjacent values.
Ratio: distinguishes between levels and values, with a fixed order. Distances are equal, and now there’s a natural zero.

Describing findings of variables:
Categorical: reporting in percentages or frequencies (56 oranges, 60 apples)
Ordinal: reporting in percentages or frequencies.
Interval: infinitely many options (infinite categories). Report summary measures of central tendency (e.g. the mean) and
of the width of the distribution.
Ratio: infinitely many options (infinite categories). Report summary measures of central tendency (e.g. the mean) and
of the width of the distribution.
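
As a small illustration of reporting a categorical variable in frequencies and percentages, here is a minimal Python sketch. The list `fruit` is made-up data that simply reuses the counts from the example (56 oranges, 60 apples).

```python
from collections import Counter

# Hypothetical categorical observations, reusing the counts from the notes.
fruit = ["orange"] * 56 + ["apple"] * 60

counts = Counter(fruit)                 # frequency per level
total = sum(counts.values())            # total number of observations
percentages = {level: 100 * n / total for level, n in counts.items()}

for level in counts:
    print(f"{level}: n = {counts[level]}, {percentages[level]:.1f}%")
```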

Measures of central tendency:
How to summarize a group of people with one measure? E.g. describe the typical/average income in a group.
Mode: the most common value.
Median: the middle person (the middle value when the observations are ordered).
Mean: the average (sum of all values / number of observations).

In a normal distribution, all central tendency measures are the same.
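
The three measures can be computed with Python's standard `statistics` module. The sketch below uses a made-up list of incomes with one very high value, so it shows the opposite case of a normal distribution: in a skewed sample the mode, median and mean no longer coincide.

```python
import statistics

# Hypothetical monthly incomes; the last value is deliberately extreme.
incomes = [1800, 2000, 2000, 2200, 2500, 3100, 9000]

mode = statistics.mode(incomes)      # most common value
median = statistics.median(incomes)  # middle value of the ordered list
mean = statistics.mean(incomes)      # sum of values / number of values

print(f"mode = {mode}, median = {median}, mean = {mean:.2f}")
```

Here the single high income pulls the mean above the median, which is why the median is often preferred for skewed variables such as income.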

Measures of distribution:
Shows the difference/spread in the sample, used with percentiles (%) or % ranges

Standard deviation: roughly the average distance from the average.
Formula: square root of [ sum of (each individual observation - overall mean)² / total number of observations ]. So,
take the squared difference between each observation and the mean, average these, and take the square root.

Sum of Squares (SS): for every score you have, you calculate the difference to the mean (obs -
mean), and square it. Add all of these up. The more observations, the larger the sum.

Variance: the variation around the mean, made independent of the number of observations. Formula:
Sum of squares / total number of observations. The standard deviation is the square root of the variance.
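
The relation between sum of squares, variance and standard deviation can be checked step by step. This is a minimal sketch with made-up scores; the standard deviation is taken as the square root of the variance, and the division by N-1 at the end is the sample version commonly used when estimating a population value.

```python
import math

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations

n = len(scores)
mean = sum(scores) / n                               # 40 / 8 = 5.0

# Sum of squares: squared distance of every score to the mean, added up.
ss = sum((x - mean) ** 2 for x in scores)            # 32.0

variance = ss / n                                    # SS / N = 4.0
sd = math.sqrt(variance)                             # square root of variance = 2.0

# Sample versions divide by N - 1 instead of N.
sample_variance = ss / (n - 1)
sample_sd = math.sqrt(sample_variance)

print(ss, variance, sd, round(sample_sd, 3))
```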

Normal distribution notation: N(μ, σ)
Standard normal distribution (z-distribution) notation: N(0, 1). μ = 0, σ = 1. → Table in Field, p. 995-998.
Z-value: the number of standard deviations from the mean. The number in the table: what proportion of the total
observations is lower than the z-value?

Rules of thumb normal distribution:
Generally, 50% is lower than the mean.
68% is between + and - 1 standard deviation: within 1 SD of the mean lies about 2/3 of the sample (68%), etc.
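
These rules of thumb can be reproduced without a printed z-table. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8+); its `cdf(z)` gives the proportion of observations lower than `z`, which is the number a z-table lists. The 2 SD line is an extra illustration beyond the rules stated above.

```python
from statistics import NormalDist

z_dist = NormalDist(mu=0, sigma=1)   # standard normal distribution N(0, 1)

below_mean = z_dist.cdf(0)                        # proportion below the mean
within_1sd = z_dist.cdf(1) - z_dist.cdf(-1)       # between -1 and +1 SD
within_2sd = z_dist.cdf(2) - z_dist.cdf(-2)       # between -2 and +2 SD

print(f"below mean:  {below_mean:.3f}")
print(f"within 1 SD: {within_1sd:.3f}")
print(f"within 2 SD: {within_2sd:.3f}")
```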

Kurtosis: indicates the pointiness of the distribution (how high the top value is). Three possibilities:
Leptokurtic = very high point.
Mesokurtic = normal
Platykurtic = flattened.

Lack of symmetry: skewness. Can be tricky, as the mean can no longer be used as a central tendency value of the data.
Positive skewness = longer tail towards positive values
Negative skewness = longer tail towards negative values.
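
Skewness and kurtosis can be expressed as numbers. The sketch below uses the simple moment-based definitions (third and fourth central moment, scaled by the variance), which is an assumption on my part: statistical packages often apply small-sample corrections and will report slightly different values. The data sets are made up.

```python
def central_moment(data, k):
    """Average of (x - mean)^k over all observations."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)


def skewness(data):
    """0 for a symmetric sample, > 0 for a long right tail, < 0 for a long left tail."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def excess_kurtosis(data):
    """About 0 for normal (mesokurtic), > 0 leptokurtic, < 0 platykurtic."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3


symmetric = [1, 2, 3, 4, 5]              # hypothetical, evenly spread (flat)
right_tail = [1, 1, 2, 2, 3, 10]         # hypothetical, one high value
left_tail = [-x for x in right_tail]     # mirror image of right_tail

print(skewness(symmetric), skewness(right_tail), skewness(left_tail))
print(excess_kurtosis(symmetric))
```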




Checks for normal distribution/normality:
1) Histogram: does it look like a bell-shaped curve/ND?
2) Boxplot: the median is given, with around it a box containing 50% of all observations. Symmetric in box and whiskers? Whiskers
(the ends) should capture about 95% of the values.
3) Q-Q plot: are the residuals predicted under normality the same as the observed residuals (differences from the
mean)? Ideally all residuals should be on the straight line.

Fixing non-normality:
Many real-world situations have a lowest possible value of 0, e.g. income, distance, time spent on a task. Then you often get
a positively skewed distribution (figure above), which is called log-normal. In cases where it makes sense to think
about doubling distances or times (e.g. spending 1 or 2 secs on a task, or 1 or 2 minutes), you can calculate the
logarithm of such a scale. The skewed data can then transform to a normal distribution.
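
A minimal sketch of the log transformation, using made-up task times in which every value is double the previous one. The `skewness` helper is the simple moment-based version and is my own illustration, not part of the lecture. Before the transformation the data have a long right tail; after taking the logarithm the values are evenly spaced and the skewness disappears.

```python
import math


def skewness(data):
    """Moment-based skewness: > 0 means a longer tail towards high values."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    return m3 / m2 ** 1.5


# Hypothetical time spent on a task in seconds (doubling each step).
times = [1, 2, 4, 8, 16, 32, 64]

# Base-2 logarithm: each doubling becomes a step of exactly 1.
log_times = [math.log2(t) for t in times]

print(f"skewness raw:    {skewness(times):.2f}")
print(f"skewness logged: {skewness(log_times):.2f}")
```

Real data will rarely become perfectly symmetric, so the checks for normality (histogram, boxplot, Q-Q plot) should be repeated on the transformed variable.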

Sample and population:
Population = every case of interest
Sample = part of the population, from which we try to generalize to the population at large

Population estimates require random samples. Inferential statistics: making population claims based on sample.

Estimate values for population through sample:
μ: the sample mean (M or 𝑥̅ ) is an estimate of the population mean (μ)
σ: the sample SD (s) is an estimate of the population SD (σ). Dividing by N-1 instead of N is a correction for small samples

The sampling distribution (bell figure) will become narrower when the sample is larger. Meaning,
the larger the number of observations, the better the sample mean is as an estimate of the population mean.

Standard error of the mean (SE): the standard deviation of the sampling distribution. Larger sample, smaller SE.
Estimator formula: sample standard deviation / square root of N.
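
The estimator formula can be sketched in a few lines of Python. The sample values and the helper name `standard_error` are made up for illustration; `statistics.stdev` is the standard-library function for the sample SD with the N-1 correction.

```python
import math
import statistics


def standard_error(sample_sd, n):
    """Standard error of the mean: sample SD divided by the square root of N."""
    return sample_sd / math.sqrt(n)


sample = [4.1, 5.0, 5.5, 6.2, 4.8, 5.9, 5.3, 4.7, 5.6]   # hypothetical data
s = statistics.stdev(sample)            # sample SD (divides by N - 1)
se = standard_error(s, len(sample))
print(f"s = {s:.3f}, SE = {se:.3f}")

# Same SD, larger sample: the standard error shrinks.
se_n25 = standard_error(10, 25)     # 10 / 5  = 2.0
se_n100 = standard_error(10, 100)   # 10 / 10 = 1.0
print(se_n25, se_n100)
```

Note that the sample has to become four times as large to halve the standard error.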

Lecture 1.2
Sampling distribution: is normally distributed around the population mean, with an SD called the standard error (σ/√𝑛).
Standard error = the standard deviation of the sampling distribution.

When a sample is outside e.g. the 95% range, we conclude it does not belong to H0 (alpha = 0.05). Meaning, it is
unlikely that the sample was drawn from a population with that population mean μ.

Significance only indicates whether there’s evidence for a difference, however small. We conclude that something
does not belong to a general population. Says little/nothing about relevance.

Transform data to a z-distribution:
z = (sample mean - population mean) / standard deviation of the sampling distribution.
After this transformation, the sampling distribution follows N(0, 1).
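
A minimal sketch of this transformation, assuming the population SD is known so that the standard error is σ/√n. The function name `z_value` and all numbers are hypothetical.

```python
import math


def z_value(sample_mean, population_mean, population_sd, n):
    """(sample mean - population mean) / standard error of the mean."""
    standard_error = population_sd / math.sqrt(n)
    return (sample_mean - population_mean) / standard_error


# Hypothetical example: H0 says mu = 100 with sigma = 15; a sample of 100
# people has a mean of 103. The standard error is 15 / 10 = 1.5.
z = z_value(sample_mean=103, population_mean=100, population_sd=15, n=100)

outside_95_range = abs(z) > 1.96     # critical z-value for the 95% range
print(f"z = {z:.2f}, outside 95% range: {outside_95_range}")
```

In this made-up case the sample mean lies 2 standard errors above the hypothesized mean, just outside the 95% range.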




Figure: z-distribution and t-distribution


When the population SD is unknown, estimate the standard error from the sample: take the sample standard
deviation and divide by the square root of n. The smaller the
sample, the flatter the t-distribution.

Difference in critical values: for 95%, the critical value of the z-distribution is always ±1.96. In a t-distribution it depends on the number of
observations, and it gets closer to 1.96 as that number becomes larger. → book p. 999-1000

Df (degrees of freedom): total number of observations - number of parameters that had to be estimated.

The t-distribution has heavier tails and is a bit flatter than the ND (more probability in the extreme ranges). How flat the distribution is and how heavy the
tails are is determined by df. The t-distribution becomes the standard normal (z-)distribution if df becomes infinite.

Assumptions t-distribution:
• Data is measured on interval or ratio scale
• Observations follow the normal distribution
• Based on independent observations.

The more observations (df), the steeper t gets. Especially with a
small group (< 20), the t is really different from the z.

Rule inferential statistics: we can only conclude something at a
given confidence, not 100% certain. We decide the confidence.

Type 1 and Type 2 error
Type 1 error: when in reality the null hypothesis is true, but we
reject it. We incorrectly conclude something is going on, while it’s not.

Type 2 error: something is going on, but we didn’t see it based on the
sample. Beta depends on effect size, number of observations, and alpha (the accepted
chance of a type 1 error).

Problem: the more critical we are about not having false positives (type 1, alpha),
the larger the chance that we miss something (type 2, beta). A stricter alpha means we demand more compelling evidence.
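
The trade-off between the two errors can be made visible with a small simulation. This is only a sketch under assumptions of my own: a z-test with known σ, a made-up population (μ = 100, σ = 15), a made-up true effect (mean 106), and a fixed random seed. It draws many samples when H0 is true and when it is false, and counts how often each error occurs at α = 0.05 and α = 0.01.

```python
import math
import random

# All numbers below are hypothetical and only serve the illustration.
MU_H0, SIGMA, N = 100, 15, 25        # population under H0, sample size
MU_HA = 106                          # true mean when "something is going on"
Z_CRIT = {0.05: 1.96, 0.01: 2.576}   # two-sided critical z-values
RUNS = 2000

rng = random.Random(1)               # fixed seed so reruns give the same draws


def z_of_random_sample(true_mean):
    """Draw one sample of size N and return its z-value against MU_H0."""
    sample = [rng.gauss(true_mean, SIGMA) for _ in range(N)]
    sample_mean = sum(sample) / N
    return (sample_mean - MU_H0) / (SIGMA / math.sqrt(N))


z_under_h0 = [z_of_random_sample(MU_H0) for _ in range(RUNS)]
z_under_ha = [z_of_random_sample(MU_HA) for _ in range(RUNS)]

for alpha, crit in Z_CRIT.items():
    type1 = sum(abs(z) > crit for z in z_under_h0) / RUNS    # false positives
    type2 = sum(abs(z) <= crit for z in z_under_ha) / RUNS   # missed effects
    print(f"alpha = {alpha}: type 1 rate = {type1:.3f}, type 2 rate = {type2:.3f}")
```

The type 1 rate should land close to the chosen alpha, while the type 2 rate goes up when alpha is made stricter.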

In sum:
α (alpha) = critical p-value: the proportion of samples we accept as a risk. With α = 0.05: if a result lies among the most extreme 5% of samples
expected under the null hypothesis, we conclude it is probably not part of the null hypothesis.

Test statistic = calculated value (z or t). We have to find a reference point; critical t-value found with df.

Confidence interval = range in which a specific value is likely to be with given confidence. Complement of alpha: 1 – α

Rejection region = outcomes for the test statistic where we conclude H0 is not true (reject H0, support Ha). This is then
outside the 95% curve. The rejection region consists of the test statistic outcomes that fall beyond the level of significance/alpha. So with alpha = 0.10 and a
two-sided hypothesis, you have a total rejection region of 0.10, with 0.05 on the left side and 0.05 on the right side. One-sided: 0.10 on that side.
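
For a z-test, the border of the rejection region follows directly from alpha. The sketch below uses `statistics.NormalDist().inv_cdf` from the Python standard library; the helper name `critical_z` is made up. For a t-test the critical value would additionally depend on df and would have to come from a t-table or a statistics package.

```python
from statistics import NormalDist


def critical_z(alpha, two_sided=True):
    """Positive z-value beyond which the test statistic falls in the rejection region."""
    tail = alpha / 2 if two_sided else alpha   # two-sided: split alpha over both tails
    return NormalDist().inv_cdf(1 - tail)


two_sided_10 = critical_z(0.10, two_sided=True)    # 0.05 in each tail
one_sided_10 = critical_z(0.10, two_sided=False)   # 0.10 in one tail
two_sided_05 = critical_z(0.05, two_sided=True)    # the familiar 1.96

print(f"alpha 0.10 two-sided: reject if |z| > {two_sided_10:.3f}")
print(f"alpha 0.10 one-sided: reject if  z  > {one_sided_10:.3f}")
print(f"alpha 0.05 two-sided: reject if |z| > {two_sided_05:.3f}")
```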


Rejecting and accepting H0:
Outcome probability > alpha: we accept (do not reject) H0, Ha has not been shown
Outcome probability < (or equal to) alpha: we reject H0, Ha has been shown
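
The decision rule can be written out as a short sketch. It assumes a two-sided z-test, so the outcome probability (p-value) is twice the area beyond |z| under the standard normal curve; the function names are made up for illustration.

```python
from statistics import NormalDist


def two_sided_p(z):
    """Probability of a z-value at least this extreme (in either tail) if H0 is true."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def decide(p, alpha=0.05):
    """Apply the rule: p <= alpha means reject H0, otherwise do not reject."""
    return "reject H0" if p <= alpha else "do not reject H0"


for z in (1.0, 2.0):                      # hypothetical test statistics
    p = two_sided_p(z)
    print(f"z = {z}: p = {p:.4f} -> {decide(p)}")
```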

Statistical test-procedure:
