Class notes

Quantitative Data Analysis 2 Midterm Summary

Pages
19
Uploaded on
07-11-2021
Written in
2020/2021

A complete summary of all the lectures for the midterm exam. The first three weeks of the course.


Document information

Type
Class notes
Professor(s)
Roger Pruppers
Contains
All classes


QDA 2
LECTURES

WEEK 1

OV = Outcome Variable (Field)
- DV = Dependent Variable: test variable, variable to be explained
PV = Predictor Variable (Field)
- IV = Independent Variable: variable that explains
We are interested in the effect of a predictor variable on an outcome variable.

The p-value
- Stands for the probability of obtaining a result (or test-statistic value) equal to (or ‘more extreme’ than) what was actually
observed (the result you actually got), assuming that the null hypothesis is true.
- P ≤ 0.05
o Reject the null hypothesis and support the alternative hypothesis.
o Example conclusion: given the sample and the significance level of 5%, there is sufficient support that the mean weight differs from 12g.
o A low p-value indicates that the observed result would be unlikely if the null hypothesis were true.
- P > 0.05
o Do not reject the null hypothesis and do not support the alternative hypothesis.
o Example conclusion: given the sample and the significance level of 5%, there is not sufficient support that the mean weight differs from 12g.

What is a conceptual model?
- Visual representations of relations between theoretical constructs and variables of interest.
- Model: simplified description of reality.
- The boxes represent variables.
- Arrows represent relationships between variables.
- Arrows go from predictor variables to outcome variables.
- Hypotheses refer to specific arrows e.g. relationships/effects/differences.

Levels of measurement of variables
- Categorical: made up of categories that name distinct entities; subgroups are indicated by numbers.
o Nominal: two or more categories, in no particular order e.g. male and female.
o Ordinal: ordered categories e.g. small, medium, large.
- Quantitative: use numerical scales, with equal distances between values.
o Discrete: can take only certain values e.g. 1, 2, 3.
o Interval: equal intervals on the scale.
o Ratio: true and meaningful zero point e.g. time, income.
- In social sciences, we often treat ordinal scales as interval (pseudo) scales e.g. Likert scales (1 – 5 disagree to agree).

Moderation
- If the proposed effect is stronger in certain settings.
- Also called interaction.
- A moderator is a variable that affects the strength of the relation between
the predictor and outcome variable.

Mediation
- If the proposed relationship goes via another variable.
- A mediating variable explains the relation between the predictor and the
outcome variable.

Hypotheses
- H0: null hypothesis (rejected or not)
- H1: alternative/research hypothesis (supported or not)
- Hypotheses are developed prior to research. They are based on theory and previous research.
- Not all potential relationships need to be hypothesized:
o Every hypothesis refers to an arrow in the conceptual model.
o But not every potential arrow refers to a hypothesis.
- A hypothesis is a verbalized expression of an expected relationship between variables.




One vs. two-sided testing
- If the hypothesis is one-sided, check whether the direction of the hypothesis is in line with the results (e.g. mean plots).
- If they are in line (e.g. a positive effect is hypothesized and a positive effect is found), the one-sided p-value is the two-tailed p-value divided by 2.
- If they are not in line, the one-sided p-value is 1 – (two-tailed p-value / 2).
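
The conversion rule above can be sketched in a few lines of Python. This is only an illustration: the function name and the example p-value of 0.08 are made up for this sketch and do not come from the lecture.

```python
def one_sided_p(two_tailed_p: float, in_line_with_hypothesis: bool) -> float:
    """Convert a two-tailed p-value into a one-sided p-value.

    in_line_with_hypothesis: True if the observed effect points in the
    direction stated by the one-sided hypothesis.
    """
    if not 0.0 <= two_tailed_p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    half = two_tailed_p / 2
    # In line: halve the p-value. Not in line: take the complement of the half.
    return half if in_line_with_hypothesis else 1 - half


# Hypothetical two-tailed p-value of 0.08
print(one_sided_p(0.08, in_line_with_hypothesis=True))   # 0.04
print(one_sided_p(0.08, in_line_with_hypothesis=False))  # close to 0.96
```

With these made-up numbers, a result that is not significant two-sided (0.08) becomes significant one-sided (0.04), but only when the effect points in the hypothesized direction.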

Test Hypotheses
- Appropriate way to test hypotheses depends on:
o Nature of the relationship: derived from conceptual model.
• Main effects, moderation/interaction, mediation.
• Total, direct, and indirect effects.
o Nature of the data: not all of this is derived from conceptual model.
• Number of PVs, number of OVs
• How are variables operationalized?
• Data type PVs, data type OVs
• If there are multiple groups: number of groups, relationship between them (dependent/independent).

Independent and Paired Samples T-test
- Paired-samples t tests compare scores on two different variables but for the same group of cases.
- Independent-samples t tests compare scores on the same variable but for two different groups of cases.
o Use when there is one quantitative outcome variable and one categorical predictor variable with two mutually exclusive categories.
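
To make the difference between the two tests concrete, here is a minimal sketch that computes both t-statistics by hand using only the Python standard library. The formulas are the standard textbook ones (paired: mean difference divided by its standard error; independent: difference in group means divided by the pooled standard error, which assumes equal variances). The function names and all data values are hypothetical and do not come from the lecture.

```python
import math
import statistics


def paired_t(first, second):
    """t-statistic for two measurements on the same cases."""
    if len(first) != len(second):
        raise ValueError("paired samples need the same number of cases")
    diffs = [b - a for a, b in zip(first, second)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error


def independent_t(group_a, group_b):
    """t-statistic for one variable measured in two different groups
    (pooled variance, so equal variances are assumed)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error


# Hypothetical data: the same four people measured twice (paired) ...
print(paired_t([10, 12, 9, 11], [12, 14, 10, 14]))
# ... and two different groups of three people (independent).
print(independent_t([4, 5, 6], [7, 8, 9]))
```

The paired version works on the per-case differences, which is why it needs matching cases; the independent version only uses the group means and variances.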

Analysis of Variance – ANOVA
- With ANOVA, we are examining how much of the variance in our data can be explained by our predictor variable.
- Ideally 40 observations per group.

One-way independent ANOVA
- One-way independent ANOVA: when the participants are different (independent groups) and there is only one predictor
variable.
- Conditions:
o One quantitative outcome variable (when the OV is quantitative – test on the mean)
o One categorical predictor variable
o Two or more mutually exclusive categories/groups (independent groups)
- Assumptions: need to adhere to these assumptions, in order to prevent invalid outcomes.
o Variance is homogeneous across groups.
o Residuals are normally distributed.
o Groups are roughly equal sized.
- Distinguish between:
o Number of categories within one categorical predictor variable.
o Number of predictor variables.
- Hypotheses:
o H0: μ1 = μ2 = … = μi
• i = number of categories
• No difference in OV mean across the different categories in PV.
o H1: at least one μ differs
• There is at least one difference in OV mean score between PV categories.
- Based on an F-Test
o Test statistic: F-test
o The F-distribution looks different from the t-distribution.
o The F-value expresses how much variability is explained relative to how much is left unexplained.
- ANOVA decomposes total variability observed in OV into variation explained by the model and residual variation.
o Explained variability: how much is caused by differences between groups?
o Unexplained variability: how much is caused by differences within groups?
o Prefer a larger proportion of the variability to be explained than unexplained.

Variability measures
- Variance: the average of the squared differences from the mean.
- Sum of squares: the sum of the squared differences from the mean.
o Used for ANOVA analysis.
o Use squared deviations because we want positive outcomes: otherwise positive and negative deviations cancel each other out.
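
A small sketch of both measures, using made-up scores. One assumption to flag: the variance here divides by n – 1 (the sample variance that statistical software usually reports), whereas a plain "average" would divide by n.

```python
def sum_of_squares(values):
    """Sum of the squared differences from the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


def sample_variance(values):
    """Sum of squares divided by n - 1 (assumption: sample variance)."""
    return sum_of_squares(values) / (len(values) - 1)


scores = [2, 4, 4, 6]  # hypothetical scores, mean = 4
mean_score = sum(scores) / len(scores)

# Raw deviations always sum to zero, which is why they are squared first.
print(sum(x - mean_score for x in scores))  # 0.0
print(sum_of_squares(scores))               # 8.0
print(sample_variance(scores))              # 8 / 3
```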


Sums of squares
SStotal = SSmodel + SSresidual
- Total sum of squares
o Squared deviations from grand overall mean.
o Total variability to be explained.
- Model Sum of Squares
o Between SS: explained variability.
o Squared deviations group means from grand overall mean.
o How much variability can be explained by differences between groups?
- Residual sum of squares
o Unexplained variability: within SS.
o Squared deviations observations from group means.
o How much variation within groups?
o Thus, not explained by the groups we compare.
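
The decomposition can be checked numerically. The sketch below uses three hypothetical groups of three observations each; the function name and the data are illustrative only.

```python
def sums_of_squares(groups):
    """Return (SS_total, SS_model, SS_residual) for a list of groups."""
    observations = [x for group in groups for x in group]
    grand_mean = sum(observations) / len(observations)

    # Total: squared deviations of every observation from the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in observations)

    ss_model = 0.0
    ss_residual = 0.0
    for group in groups:
        group_mean = sum(group) / len(group)
        # Model (between): group mean vs. grand mean, once per observation.
        ss_model += len(group) * (group_mean - grand_mean) ** 2
        # Residual (within): each observation vs. its own group mean.
        ss_residual += sum((x - group_mean) ** 2 for x in group)

    return ss_total, ss_model, ss_residual


# Hypothetical data: group means 2, 5 and 8, grand mean 5.
ss_total, ss_model, ss_residual = sums_of_squares([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(ss_total, ss_model, ss_residual)  # 60.0 54.0 6.0
```

In this made-up example most of the variability sits between the groups (54 out of 60), so the grouping explains nearly all of the spread in the scores.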

How to use the sums of squares?
1. R²: proportion of total variance in our data that is “explained” by our model.
o $R^2 = \frac{SS_M}{SS_T}$
- Explained variability / total variability
- Model Sum of Squares / Total Sum of Squares
- An important and valuable indication but not a formal statistical test.
2. F-Test
- To investigate if the group means differ with an ANOVA, we do an F-test.
- This is a statistical test and checks the ratio of explained variability to unexplained variability.
o $F = \frac{\text{explained variability}}{\text{unexplained variability}} = \frac{\text{between-group variability}}{\text{within-group variability}}$
- We cannot divide the model sum of squares by the residual sum of squares because they are not based on the same number of observations/df.
- We therefore divide by the degrees of freedom to get Mean Squares (MS)
o $F = \frac{MS_M}{MS_R} = \frac{SS_M / df_M}{SS_R / df_R} = \frac{SS_M / (k-1)}{SS_R / (n-k)}$
- We want a large F value because this means that a larger proportion of the variability is explained.

- Degrees of freedom (df) one-way independent ANOVA:
o dfM = k-1
o dfR = n-k
o dfT = n-1
*k = number of categories
*n = number of observations
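
Putting the two uses of the sums of squares together, here is a minimal sketch that turns SS values into R² and an F-ratio. The input numbers (SS model = 54, SS residual = 6, n = 9, k = 3) are hypothetical and the function names are made up for this illustration.

```python
def r_squared(ss_model, ss_total):
    """Proportion of total variability explained by the model."""
    return ss_model / ss_total


def f_ratio(ss_model, ss_residual, n, k):
    """F = MS_model / MS_residual for a one-way independent ANOVA.

    n = number of observations, k = number of categories.
    """
    df_model = k - 1
    df_residual = n - k
    ms_model = ss_model / df_model          # mean square of the model
    ms_residual = ss_residual / df_residual  # mean square of the residual
    return ms_model / ms_residual


# Hypothetical sums of squares for 9 observations in 3 groups
ss_model, ss_residual = 54.0, 6.0
ss_total = ss_model + ss_residual

print(r_squared(ss_model, ss_total))             # 0.9
print(f_ratio(ss_model, ss_residual, n=9, k=3))  # 27.0
```

Here dfM = 2 and dfR = 6, so MS model = 27 and MS residual = 1, which gives F = 27.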

From F to p to conclusion H0
- F is a test statistic, which means it is evaluated against a null hypothesis and an alternative hypothesis.
- From test statistics to p-value:
o From F-ratio to p-value (depends on df)
o Look in F-table for critical value: dfR and dfM
- From critical value to conclusion H0
o If F-ratio > critical F-value (equivalently, p ≤ 0.05): reject H0
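
As a sketch of this last step: the observed F-ratio is compared with the critical value from the F-table. The critical value used below, 4.74, is the table value for a 5% significance level with dfM = 2 and dfR = 7; the two F-ratios are hypothetical and the function name is made up for this illustration.

```python
def anova_conclusion(f_ratio: float, critical_f: float) -> str:
    """Compare an observed F-ratio with the critical value from the F-table."""
    if f_ratio > critical_f:
        return "reject H0: at least one group mean differs"
    return "do not reject H0: no support for a difference in group means"


# F-table value for alpha = 0.05, dfM = 2, dfR = 7
CRITICAL_F = 4.74

print(anova_conclusion(6.10, CRITICAL_F))  # hypothetical F-ratio above the cut-off
print(anova_conclusion(2.30, CRITICAL_F))  # hypothetical F-ratio below the cut-off
```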

One-way independent ANOVA calculations example
Research question: is there a relation between shopping platform and customer satisfaction?
- PV = shopping platform (categorical) with 3 levels/categories:
o 1 Brick-and-mortar store
o 2 Web shop
o 3 Reseller
- OV = customer satisfaction (quantitative)
o Score from 1-50
- 10 observations – not realistic
- A 1-way independent ANOVA is appropriate because there is one quantitative outcome variable and one categorical
predictor variable with more than two mutually exclusive categories.

H0: μ1 = μ2 = μ3
H1: at least one μ differs
