Statistics notes in thorough detail from first-class student

Lecture notes of 23 pages for the course PSY237 Research Methods & Statistics at SWAN (class notes)

Document information

Uploaded on
August 17, 2023
Number of pages
23
Written in
2020/2021
Type
Class notes
Professor(s)
Dr Playfoot
Contains
All classes


Recap year 1

Scientific method: observe -> question -> hypothesis -> experiment -> conclusion
Falsifiability of hypotheses is also important (it must be possible for a hypothesis to be shown to be incorrect)

Reliability - internal/external

Validity, confound, bias etc

Participant information sheet for research:

https://swanseachhs.eu.qualtrics.com/jfe/form/SV_9KwCUp9GA63clXD



Section 1

Type 1 error: say there’s a difference but there isn't
Type 2 error: say there's no difference but there is

An alpha level of p < .05 means we accept a 5% risk of making a Type 1 error, i.e. rejecting the null hypothesis when it is actually true.
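A quick simulation makes the Type 1 error rate concrete. This is my own illustration (a simple two-tailed z-test, not from the notes): both groups are drawn from the same population, so every "significant" result is a false positive, and the false-positive rate lands near the .05 alpha level.

```python
import random
from statistics import NormalDist, mean, stdev

random.seed(1)
alpha, n, runs = 0.05, 50, 2000
false_positives = 0
for _ in range(runs):
    # Both groups come from the SAME population: the null hypothesis is true
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    se = (stdev(a) ** 2 / n + stdev(b) ** 2 / n) ** 0.5
    z = (mean(a) - mean(b)) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))  # two-tailed p value
    if p < alpha:  # a "significant" result here is a Type 1 error
        false_positives += 1
print(false_positives / runs)  # close to the alpha level of .05
```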



Standard normal distribution -
The formula for creating the standard score, or z score, is: z = (X - μ) / σ

X represents the individual score. μ is the mean for all those scores in your sample. σ is the standard
deviation of the sample. All that means is that, for each participant, you take the average score away from
their own score and divide the result by the standard deviation of the sample.
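As a sketch, the z-score formula translates directly into Python (the function name is my own):

```python
from statistics import mean, stdev

def z_scores(scores):
    """Convert raw scores to z scores: z = (X - mean) / sd."""
    m, s = mean(scores), stdev(scores)
    return [(x - m) / s for x in scores]

zs = z_scores([12, 15, 17, 20, 21])
# standardized scores always have mean 0 and standard deviation 1
```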



Parametric assumptions -
Assumption of interval/ratio data (necessity for DV scores, easily verified)
Assumption of independent scores
Assumption of normality (central limit theorem)
Assumption of homogeneity of variance (Levene’s test)
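Levene's test is normally run by stats software (e.g. SPSS, or scipy.stats.levene); purely as an illustration, the statistic itself can be computed by hand. Each score is replaced by its absolute deviation from its group mean, and an ANOVA-style F ratio is computed on those deviations (a sketch only: getting a p value needs an F(k-1, N-k) table or a stats package).

```python
from statistics import mean

def levene_w(*groups):
    """Levene's test statistic W for homogeneity of variance (sketch)."""
    # absolute deviations of each score from its own group's mean
    z = [[abs(x - mean(g)) for x in g] for g in groups]
    zbars = [mean(zi) for zi in z]
    grand = mean([v for zi in z for v in zi])
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    between = sum(len(zi) * (zb - grand) ** 2 for zi, zb in zip(z, zbars))
    within = sum((v - zb) ** 2 for zi, zb in zip(z, zbars) for v in zi)
    return ((n_total - k) / (k - 1)) * between / within

# equal spread in both groups -> W near 0; unequal spread -> W large
print(levene_w([1, 2, 3, 4], [11, 12, 13, 14]))
print(levene_w([1, 2, 3], [0, 10, 20]))
```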



Non-parametric statistics are used when the parametric assumptions are too badly violated: for nominal or ordinal data, or for skewed distributions. They do not require the assumptions above.



Two correlation tests: Pearson's r (correlation based on z/standard scores, parametric) and Spearman's rank (correlation based on ranks, non-parametric).
Zero correlation can happen if the measure of one of the variables is too hard or too easy (floor or ceiling effects restrict its range).
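Both tests can be sketched from scratch to show how they relate: Spearman's rho is just Pearson's r applied to the ranks of the scores (function names are my own; in practice you'd use a stats package).

```python
import math
from statistics import mean

def pearson_r(x, y):
    """Pearson's r: correlation of the raw (interval) scores."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x)
                    * sum((b - my) ** 2 for b in y))
    return num / den

def ranks(xs):
    """Rank the scores (1 = smallest), averaging ranks over ties."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(xs):
        j = i
        while j + 1 < len(xs) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))

# a monotonic but non-linear relationship: Spearman is exactly 1,
# Pearson is high but below 1
print(spearman_rho([1, 2, 3, 4], [1, 4, 9, 16]))
print(pearson_r([1, 2, 3, 4], [1, 4, 9, 16]))
```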



Central limit theorem – in many situations, when independent random variables are summed, their properly normalized sum tends toward a normal distribution, even if the original variables themselves are not normally distributed.
The central limit theorem does not always hold in practice, so check the normality of your samples; otherwise the test might go wrong.
As sample size increases, the sample mean becomes normally distributed, and hypothesis tests become robust against violations of normality.
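A small simulation illustrates the theorem. The parent distribution here is exponential (chosen for illustration because it is heavily skewed), yet the means of repeated samples cluster symmetrically around the true mean.

```python
import random
from statistics import mean

random.seed(42)
# 2000 samples of n = 30 from an exponential distribution (mean = 1)
sample_means = [mean(random.expovariate(1) for _ in range(30))
                for _ in range(2000)]
print(mean(sample_means))  # close to the population mean of 1
below = sum(m < 1 for m in sample_means) / len(sample_means)
# roughly half the sample means fall on either side of the true mean,
# even though the parent distribution is strongly right-skewed
```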



Standard normal distribution has a mean of zero and a standard deviation of 1.
68% of scores fall within 1 standard deviation either side of the mean in a normal distribution,
95% within 2 standard deviations either side of the mean, and 99.7% within 3 standard
deviations of the mean - this is the empirical rule.
Our alpha level is .05, or 5%, meaning that a score is significantly different from the mean if it
falls more than ~2 standard deviations (precisely 1.96) away.
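The empirical rule can be checked directly with Python's statistics.NormalDist:

```python
from statistics import NormalDist

std_norm = NormalDist()  # the standard normal: mean 0, sd 1

def proportion_within(k):
    """Proportion of scores within k standard deviations of the mean."""
    return std_norm.cdf(k) - std_norm.cdf(-k)

print(round(proportion_within(1), 3))  # 0.683
print(round(proportion_within(2), 3))  # 0.954
print(round(proportion_within(3), 3))  # 0.997
# the exact cut-off enclosing the central 95%:
print(round(std_norm.inv_cdf(0.975), 2))  # 1.96
```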


A bigger sample size results in greater power.

It takes more power to find an effect that is small than an effect that is large.

The more conservative your cut-off for significance, the more power you'll need to reach it (i.e. it takes more
power to get to p < .001 than it does to get to p < .05).
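A rough simulation (my own sketch, using a z-test and an assumed true effect of 0.5 standard deviations) shows power climbing with sample size:

```python
import random
from statistics import NormalDist, mean, stdev

def simulated_power(n, effect=0.5, alpha=0.05, runs=1000):
    """Estimated power: the proportion of simulated experiments in which
    a true difference of `effect` sds is detected (z-test sketch)."""
    detections = 0
    for _ in range(runs):
        a = [random.gauss(0, 1) for _ in range(n)]
        b = [random.gauss(effect, 1) for _ in range(n)]  # real difference
        se = (stdev(a) ** 2 / n + stdev(b) ** 2 / n) ** 0.5
        z = abs(mean(a) - mean(b)) / se
        p = 2 * (1 - NormalDist().cdf(z))
        detections += p < alpha
    return detections / runs

random.seed(7)
power_small, power_large = simulated_power(20), simulated_power(80)
print(power_small, power_large)  # power rises sharply with sample size
```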

r value (correlation coefficient)    Interpretation
0.3                                  Weak
0.5                                  Moderate
0.7                                  Strong




Regression
Regression is a set of tests used to predict what will happen in the future: we expect
participants to behave in a certain way given the information we already have about them.
Method of least squares – for each line make note of the residuals (distance between line and
actual data point). Square each residual, add them up. Whichever line has the smallest total is
the best line.
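The method just described can be sketched in a few lines (made-up data points and two arbitrary candidate lines, for illustration only):

```python
# Score candidate lines by their total squared residuals;
# the line with the smallest total is the best line.
data = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)]  # (predictor, outcome)

def sum_squared_residuals(slope, intercept):
    # residual = distance between the line's prediction and the data point
    return sum((y - (slope * x + intercept)) ** 2 for x, y in data)

line_a = sum_squared_residuals(2.0, 0.0)  # candidate line y = 2x
line_b = sum_squared_residuals(1.0, 2.0)  # candidate line y = x + 2
print(line_a, line_b)  # line_a fits far better (smaller total)
```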
We use inferential regression to know if the regression equation describes a line that fits the
data well. We assess whether there is significantly less error when we predict scores based on
the regression than if we make the simplest possible prediction (that everyone scores the mean).
Strength of correlation: r
The more closely related two variables are, the stronger the correlation and the greater the r squared: more of the variance in one is explained by the variance in the other.

The R squared value tells us how much of the variance is shared between the predictor and the
outcome. It also tells us the proportion of the total variance by which you've improved your
prediction by including the additional information.


To see if the line we draw to guess what participants will score on the outcome variable results
in significantly less error if we include the predictor variable, we use Analysis of Variance
(ANOVA).
How to tell whether your regression has significantly improved the accuracy of your prediction:
1. Figure out the amount of error between baseline model (the mean) and an individual data
point, square it, do it again. Then add them all up (SStotal)
2. Figure out the amount of error between model including predictor and an individual data
point, square it, do it again. Then add them all up (SSresidual)
3. SStotal – SSresidual = SSmodel (amount of reduction in error after adding in new info)
Sums of squares depend on how many values you've added up, so each is divided by its degrees of freedom:
MSmodel = SSmodel / number of variables in the model (not including the constant)
MSresidual = SSresidual / (number of observations - number of betas being estimated)
In a simple linear regression:
The number of variables in the model that are not the mean is only ever 1
There will only ever be two betas
MSmodel / MSresidual gives the F ratio (same distribution of critical values that we use for ANOVA)
If the probability is less than .05, we have a significant F value, so the prediction we've made when
including our new variable is significantly better than if we were to just use the mean.
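The three steps above can be sketched in plain Python (hypothetical data and variable names of my own; in practice you'd use a stats package):

```python
from statistics import mean

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.2, 2.9, 4.1, 4.8, 6.3, 6.9, 8.2, 8.8]

# fit the least-squares line
mx, my = mean(x), mean(y)
slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
         / sum((a - mx) ** 2 for a in x))
intercept = my - slope * mx
predicted = [slope * a + intercept for a in x]

# step 1: error of the baseline model (everyone scores the mean)
ss_total = sum((b - my) ** 2 for b in y)
# step 2: error of the model that includes the predictor
ss_residual = sum((b - p) ** 2 for b, p in zip(y, predicted))
# step 3: reduction in error after adding the predictor
ss_model = ss_total - ss_residual

ms_model = ss_model / 1                    # 1 predictor in the model
ms_residual = ss_residual / (len(x) - 2)   # n observations - 2 betas
f_ratio = ms_model / ms_residual
r_squared = ss_model / ss_total            # proportion of variance explained
print(f_ratio, r_squared)
```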


Assumptions of simple linear regression:
Linearity: use a scatterplot; if the relationship isn't linear, the prediction won't hold and you
can't use the test without transforming the data to make it linear.
Independence: values of the outcome variable should come from separate participants.
We usually report simple linear regression by saying:
Whether the prediction was significantly improved by the inclusion of the predictor variable
& How much of the variance in the outcome variable is explained by the predictor
(e.g. "The addition of the predictor variable significantly improved the prediction [F(df, df) = whatever it
equals, p < .05]. The final model was able to explain x% of the variance in the outcome variable.")