Class notes

Quantitative Data Analysis 2 Midterm Summary

Uploaded on

07-11-2021

Written in

2020/2021

A complete summary of all the lectures for the midterm exam. The first three weeks of the course.

Connected book

Andy Field Discovering Statistics Using IBM SPSS

Edition:november 2017
ISBN:9781526419521
Edition:5

Written for

Institution: Universiteit van Amsterdam (UvA)
Study: Business Administration
Course: Quantitative Data Analysis 2

Document information

Uploaded on: November 7, 2021
Number of pages: 19
Written in: 2020/2021
Type: Class notes
Professor(s): Roger Pruppers
Contains: All classes

QDA 2
LECTURES

WEEK 1

OV = Outcome Variable (Field)
- DV = Dependent Variable: test variable, variable to be explained
PV = Predictor Variable (Field)
- IV = Independent Variable: variable that explains
We are interested in the effect of a predictor variable on an outcome variable.

The p-value
- The probability of obtaining a result (or test-statistic value) equal to or more extreme than the one actually observed, assuming that the null hypothesis is true.
- P ≤ 0.05
o Reject the null hypothesis and support the alternative hypothesis.
o E.g. when testing whether the mean weight differs from 12g: given the sample and a significance level of 5%, there is sufficient support that the mean weight differs from 12g.
o A low p-value indicates that the observed data would be unlikely if the null hypothesis were true.
- P > 0.05
o Do not reject the null hypothesis and do not support the alternative hypothesis.
o Given the sample and a significance level of 5%, there is not sufficient support that the mean weight differs from 12g.
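
The decision rule above can be sketched in a few lines of Python. The function name `decide` and the default `alpha` are illustrative choices, not part of the lecture material.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Return the decision about H0 for a p-value and a significance level."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    # P <= alpha: reject H0; P > alpha: do not reject H0.
    if p_value <= alpha:
        return "reject H0"
    return "do not reject H0"


print(decide(0.03))  # p below the 5% level
print(decide(0.20))  # p above the 5% level
```

Note that the boundary case p = 0.05 counts as "reject H0", matching the "P ≤ 0.05" rule in the notes.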

What is a conceptual model?
- A visual representation of the relations between theoretical constructs and variables of interest.
- Model: simplified description of reality.
- The boxes represent variables.
- Arrows represent relationships between variables.
- Arrows go from predictor variables to outcome variables.
- Hypotheses refer to specific arrows, i.e. relationships/effects/differences.

Levels of measurement of variables
- Categorical: made up of categories that name distinct entities; any numbers only label the subgroups.
o Nominal: two or more categories, in no particular order e.g. male and female.
o Ordinal: ordered categories e.g. small, medium, large.
- Quantitative: use numerical scales, with equal distances between values.
o Discrete: can take only certain values e.g. 1, 2, 3.
o Interval: equal intervals on the scale.
o Ratio: true and meaningful zero point e.g. time, income.
- In social sciences, we often treat ordinal scales as (pseudo-)interval scales e.g. Likert scales (1 – 5 disagree to agree).

Moderation
- If the proposed effect is stronger (or weaker) in certain settings.
- Also called interaction.
- A moderator is a variable that affects the strength of the relation between the predictor and outcome variable.

Mediation
- If the proposed relationship goes via another variable.
- A mediating variable explains the relation between the predictor and the outcome variable.

Hypotheses
- H0: null hypothesis (rejected or not)
- H1: alternative/research hypothesis (supported or not)
- Hypotheses are developed prior to research. They are based on theory and previous research.
- Not all potential relationships need to be hypothesized:
o Every hypothesis refers to an arrow in the conceptual model.
o But not every potential arrow refers to a hypothesis.
- A hypothesis is a verbalized expression of an expected relationship between variables.

One vs. two-sided testing
- If the hypothesis is one-sided, check whether the direction of the results (e.g. in mean plots) is in line with the hypothesis.
- If they are in line (e.g. a positive effect was predicted and a positive effect was found), the one-tailed p-value is the two-tailed p-value divided by 2.
- If they are not in line, the one-tailed p-value is 1 - (two-tailed p-value / 2).
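
A minimal sketch of this conversion, assuming the two-tailed p-value comes from a symmetric test statistic such as t. The function and parameter names are hypothetical.

```python
def one_sided_p(two_sided_p: float, in_predicted_direction: bool) -> float:
    """Convert a two-tailed p-value into a one-tailed p-value."""
    half = two_sided_p / 2
    # Result in the hypothesized direction: p/2; otherwise: 1 - p/2.
    return half if in_predicted_direction else 1 - half


# Two-tailed p = 0.08 is not significant at 5%, but the one-tailed p is 0.04
# when the effect is in the predicted direction.
print(one_sided_p(0.08, in_predicted_direction=True))
print(one_sided_p(0.08, in_predicted_direction=False))
```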

Test Hypotheses
- Appropriate way to test hypotheses depends on:
o Nature of the relationship: derived from conceptual model.
• Main effects, moderation/interaction, mediation.
• Total, direct, and indirect effects.
o Nature of the data: not all of this is derived from conceptual model.
• Number of PVs, number of OVs.
• How are variables operationalized?
• Data type of PVs, data type of OVs.
• If there are multiple groups: number of groups, relationship between them (dependent/independent).

Independent and Paired Samples T-test
- Paired-samples t-tests compare scores on two different variables but for the same group of cases.
- Independent-samples t-tests compare scores on the same variable but for two different groups of cases.
o Use when there is one quantitative outcome variable and one categorical predictor variable with two mutually exclusive categories.
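
To make the difference concrete, here is a small sketch that computes both t statistics by hand with Python's standard library. The data are invented, the function names are hypothetical, and the independent-samples version assumes equal variances (pooled variance), which is only one of the variants a statistics package may offer.

```python
from math import sqrt
from statistics import mean, variance


def paired_t(first, second):
    """t statistic and df for a paired-samples t-test (same cases, two scores)."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    # Mean difference divided by its standard error.
    return mean(diffs) / sqrt(variance(diffs) / n), n - 1


def independent_t(group_1, group_2):
    """t statistic and df for an independent-samples t-test (pooled variance)."""
    n1, n2 = len(group_1), len(group_2)
    pooled = ((n1 - 1) * variance(group_1) + (n2 - 1) * variance(group_2)) / (n1 + n2 - 2)
    return (mean(group_1) - mean(group_2)) / sqrt(pooled * (1 / n1 + 1 / n2)), n1 + n2 - 2


# Invented scores, used once as "before/after" and once as two separate groups.
a = [10, 12, 9, 11]
b = [12, 15, 10, 13]

print(paired_t(a, b))       # same four cases measured twice
print(independent_t(a, b))  # two different groups of four cases
```

The same numbers give a much larger t in the paired design, because differences between cases drop out when each case is compared with itself.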

Analysis of Variance – ANOVA
- With ANOVA, we are examining how much of the variance in our data can be explained by our predictor variable.
- Ideally 40 observations per group.

One-way independent ANOVA
- One-way independent ANOVA: when the participants are different (independent groups) and there is only one predictor variable.
- Conditions:
o One quantitative outcome variable (when the OV is quantitative – test on the mean)
o One categorical predictor variable
o Two or more mutually exclusive categories/groups (independent groups)
- Assumptions: these must hold for the results to be valid.
o Variance is homogeneous across groups.
o Residuals are normally distributed.
o Groups are roughly equal sized.
- Distinguish between:
o Number of categories within one categorical predictor variable.
o Number of predictor variables.
- Hypotheses:
o H0: μ1 = μ2 = … = μk
• k = number of categories
• No difference in OV mean across the different categories in PV.
o H1: at least one μ differs
• There is at least one difference in OV mean score between PV categories.
- Based on an F-test
o Test statistic: F-ratio
o The F-distribution looks different from the t-distribution: it is right-skewed and takes only non-negative values.
o The F-ratio compares explained variability with unexplained variability.
- ANOVA decomposes total variability observed in OV into variation explained by the model and residual variation.
o Explained variability: how much is caused by differences between groups?
o Unexplained variability: how much is caused by differences within groups?
o Prefer a larger proportion of the variability to be explained than unexplained.

Variability measures
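- Variance: the average of the squared differences from the mean (the sum of squares divided by n - 1 for a sample).
- Sum of squares: the sum of the squared differences from the mean.
o Used for ANOVA analysis.
o Use squared deviations because raw deviations from the mean cancel out (they sum to zero); squaring makes every contribution positive.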
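
A short sketch of both measures on a handful of invented observations. It divides by n - 1, i.e. it assumes we want the sample variance.

```python
from statistics import mean, variance

scores = [2, 4, 4, 6, 9]  # hypothetical observations
m = mean(scores)

deviations = [x - m for x in scores]
sum_of_squares = sum(d ** 2 for d in deviations)
sample_variance = sum_of_squares / (len(scores) - 1)

print(sum(deviations))   # raw deviations cancel out to zero
print(sum_of_squares)    # sum of squared deviations: 28
print(sample_variance)   # 28 / 4 = 7, same value as statistics.variance(scores)
```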

Sums of squares
SStotal = SSmodel + SSresidual
- Total sum of squares
o Squared deviations of the observations from the grand (overall) mean.
o Total variability to be explained.
- Model sum of squares
o Between SS: explained variability.
o Squared deviations of the group means from the grand (overall) mean, weighted by group size.
o How much variability can be explained by differences between groups?
- Residual sum of squares
o Within SS: unexplained variability.
o Squared deviations of the observations from their group means.
o How much variation is there within groups?
o This variation is not explained by the groups we compare.
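
The decomposition can be checked numerically. The sketch below uses three invented groups of three scores each; the group labels and values are illustrative only.

```python
from statistics import mean

# Hypothetical scores per group (invented for illustration).
groups = {
    "store": [4, 6, 8],
    "web": [7, 9, 11],
    "reseller": [1, 3, 5],
}

all_scores = [x for scores in groups.values() for x in scores]
grand_mean = mean(all_scores)

# Total SS: every observation against the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

# Model (between) SS: each group mean against the grand mean, weighted by group size.
ss_model = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values())

# Residual (within) SS: every observation against its own group mean.
ss_residual = sum((x - mean(s)) ** 2 for s in groups.values() for x in s)

# SS_total = 78, SS_model = 54, SS_residual = 24, and 54 + 24 = 78.
print(ss_total, ss_model, ss_residual)
```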

How to use the sums of squares?
1. R²: proportion of total variance in our data that is “explained” by our model.
o $R^2 = \frac{SS_M}{SS_T}$
- Explained variability / total variability
- Model Sum of Squares / Total Sum of Squares
- An important and valuable indication but not a formal statistical test.
2. F-Test
- To investigate if the group means differ with an ANOVA, we do an F-test.
- This is a statistical test that checks the ratio of explained variability to unexplained variability.
o $F = \frac{\text{explained variability}}{\text{unexplained variability}} = \frac{\text{between-group variability}}{\text{within-group variability}}$
- We cannot simply divide the model sum of squares by the residual sum of squares because they are not based on the same number of observations/df.
- We therefore divide each sum of squares by its degrees of freedom to get Mean Squares (MS).
o $F = \frac{MS_M}{MS_R} = \frac{SS_M / df_M}{SS_R / df_R} = \frac{SS_M / (k-1)}{SS_R / (n-k)}$
- We want a large F value because this means that a larger proportion of the variability is explained.

- Degrees of freedom (df) one-way independent ANOVA:
o dfM = k-1
o dfR = n-k
o dfT = n-1
*k = number of categories
*n = number of observations
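
Putting the formulas together, a sketch that goes from sums of squares to R², mean squares and the F-ratio. The function name `anova_summary` and the input values are hypothetical.

```python
def anova_summary(ss_model: float, ss_residual: float, n: int, k: int) -> dict:
    """R^2, mean squares and F-ratio for a one-way independent ANOVA."""
    df_model = k - 1      # dfM = k - 1
    df_residual = n - k   # dfR = n - k
    ss_total = ss_model + ss_residual
    ms_model = ss_model / df_model
    ms_residual = ss_residual / df_residual
    return {
        "r_squared": ss_model / ss_total,
        "df_model": df_model,
        "df_residual": df_residual,
        "df_total": n - 1,
        "ms_model": ms_model,
        "ms_residual": ms_residual,
        "f_ratio": ms_model / ms_residual,
    }


# Hypothetical sums of squares for 9 observations in 3 groups.
result = anova_summary(ss_model=54, ss_residual=24, n=9, k=3)
print(result["r_squared"])  # 54 / 78
print(result["f_ratio"])    # (54 / 2) / (24 / 6) = 6.75
```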

From F to p to conclusion H0
- F is a test statistic: it is used to test the null hypothesis (H0) against the alternative hypothesis (H1).
- From test statistic to p-value:
o From F-ratio to p-value (depends on dfM and dfR).
o Or look up the critical F-value in an F-table, using dfM and dfR.
- From p-value or critical F-value to conclusion about H0:
o If the F-ratio > critical F-value (equivalently, p ≤ 0.05): reject H0
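
The table lookup can be sketched as follows. The dictionary holds only a small excerpt of an F-table at the 5% level (rounded to two decimals); the names `CRITICAL_F_05` and `f_test_decision` are hypothetical, and other df combinations would need a full table.

```python
# Excerpt of an F-table at alpha = 0.05, keyed by (dfM, dfR).
CRITICAL_F_05 = {
    (2, 6): 5.14,
    (2, 7): 4.74,
    (2, 12): 3.89,
    (2, 27): 3.35,
}


def f_test_decision(f_ratio: float, df_model: int, df_residual: int) -> str:
    """Compare an F-ratio with the critical F-value for its degrees of freedom."""
    try:
        critical = CRITICAL_F_05[(df_model, df_residual)]
    except KeyError:
        raise ValueError("df combination not in this table excerpt") from None
    return "reject H0" if f_ratio > critical else "do not reject H0"


# F = 6.75 with dfM = 2 and dfR = 6 exceeds the critical value 5.14.
print(f_test_decision(6.75, 2, 6))
```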

One-way independent ANOVA calculations example
Research question: is there a relation between shopping platform and customer satisfaction?
- PV = shopping platform (categorical) with 3 levels/categories:
o 1 Brick-and-mortar store
o 2 Web shop
o 3 Reseller
- OV = customer satisfaction (quantitative)
o Score from 1-50
- 10 observations – not realistic
- A one-way independent ANOVA is appropriate because there is one quantitative outcome variable and one categorical predictor variable with more than two mutually exclusive categories.

H0: μ1 = μ2 = μ3
H1: at least one μ differs
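
As a rough illustration of how such a calculation runs end to end, the sketch below uses ten invented satisfaction scores; these are not the data from the lecture. The p-value uses a closed form that holds only when dfM = 2, which is the case for three groups; for other designs a statistics package would be needed.

```python
from statistics import mean

# Invented scores (1-50) for 10 customers; NOT the lecture's data.
satisfaction = {
    "brick-and-mortar": [30, 36, 40, 34],
    "web shop": [28, 35, 27],
    "reseller": [20, 31, 24],
}

scores = [x for g in satisfaction.values() for x in g]
n, k = len(scores), len(satisfaction)
grand_mean = mean(scores)

# Between-group (model) and within-group (residual) sums of squares.
ss_model = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in satisfaction.values())
ss_residual = sum((x - mean(g)) ** 2 for g in satisfaction.values() for x in g)

df_model, df_residual = k - 1, n - k
f_ratio = (ss_model / df_model) / (ss_residual / df_residual)
r_squared = ss_model / (ss_model + ss_residual)

# Special case dfM = 2 only: P(F > f) = (1 + 2f / dfR) ** (-dfR / 2).
if df_model != 2:
    raise ValueError("this closed-form p-value requires exactly three groups")
p_value = (1 + 2 * f_ratio / df_residual) ** (-df_residual / 2)

print(f"F({df_model}, {df_residual}) = {f_ratio:.2f}, p = {p_value:.3f}, R^2 = {r_squared:.2f}")
print("reject H0" if p_value <= 0.05 else "do not reject H0")
```

With these invented numbers F is about 3.97, which stays below the critical value of 4.74 for dfM = 2 and dfR = 7, so H0 is not rejected even though the group means differ by several points. That is one reason ten observations are not realistic.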