Summary Discovering Statistics Using IBM SPSS Statistics Ch. 1-11 & 13 & 14 & 17 & 18

Name: Discovering Statistics Using IBM SPSS Statistics Ch. 1-11 & 13 & 14 & 17 & 18
SKU: doc_838350
Rating: 4.00 (7 reviews)
Author: jettejacobs

Extensive summary of the book: Discovering Statistics Using IBM SPSS Statistics by Andy Field. The summary includes chapter 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 17, and 18. It also includes several notes taken in class.

Connected book

Andy Field Discovering Statistics Using IBM SPSS

Edition: March 2013
ISBN: 9781446249185
Edition: 4

Written for

Institution: Maastricht University (UM)
Study: International Business
Course: Statistics

Document information

Summarized whole book?: No
Which chapters are summarized?: 1-11
Uploaded on: October 5, 2020
Number of pages: 67
Written in: 2017/2018
Type: Summary

Subjects

statistics
premaster zuyd
international business
andy field
andy
field
ib
maastricht university

Content preview

Discovering Statistics Using IBM SPSS Statistics
Chapter 1
Levels of measurement
Categorical (entities are divided into distinct categories):
- Nominal variable/categorical
• Binary (only two categories possible, e.g. married or not married, pregnant or not pregnant)
• With more than two categories (e.g. whether someone is an omnivore, vegetarian, vegan, or
fruitarian)
- Ordinal variable: The same as a nominal variable but the categories have a logical order from
lower to higher, smaller to larger
-e.g. whether people got a fail, a pass, a merit or a distinction in their exam
-Answers to statements on a 5-point or 7-point scale are typically ordinal
Continuous (entities get a distinct score):
- Interval variable: Equal intervals on the variable represent equal differences in the property
being measured
-e.g. Temperature in degrees Celsius: the difference between 6 and 8 is the same as
the difference between 13 and 15
- Ratio variable: The same as an interval variable, but the ratios of scores on the scale must
also make sense, which requires a true and meaningful zero point (0 money in your pocket means
you have no money at all, so money is a ratio variable. A temperature of 0 degrees Celsius does
not mean there is no temperature, so temperature in Celsius is only an interval variable)
-e.g. an income of 30000 dollars is twice as much as an income of 15000 dollars
➔ Often taken together as Interval-Ratio or Scale

Validity
Criterion validity = whether you can establish that an instrument measures what it claims to
measure through comparison to objective criteria
- Concurrent validity = when data are recorded simultaneously using the new instrument and
existing criteria
- Predictive validity = when data from the new instrument are used to predict observations at
a later point in time

Confounding variables/confounds = extraneous factors (factors other than the variables being studied that also influence the outcome)

Chapter 2
The degree to which a statistical model represents the data collected is known as the fit of the
model. We are interested in finding results that apply to an entire population. This is often not
possible, therefore we collect data from a small subset of the population → sample
Scientists tend to describe data with linear models → models based upon a straight line, linear =
straight, non-linear = curved

We want to have a good fit! We look at four things:
- Normal distribution
- Homogeneity → the nature of the data is the same throughout
- Variance → when the spread is the same across groups, the groups can be compared
- Linearity → to be able to predict (with a formula) we need a linear relationship. If there is
no linear relationship, the points in the scatterplot do not follow a straight line → difficult to predict

Populations and samples
• Population → all the things of interest; all the things we can measure
- The collection of units (be they people, plants, cities, etc.) to which we want to generalize a
set of findings or a statistical model
• Sample
- A smaller (but hopefully representative) collection of units from a population used to
determine truths about that population
• Random sample
- A sample drawn in such a way that each case in the population has the same chance of
being drawn into the sample (by "sample" we always mean a random sample unless stated
otherwise)
- We could use a numbered list of all the cases in the population (a sampling frame) and use
random numbers to select some cases
- Most sampling methods that you find discussed in the literature (stratified sampling,
systematic sampling, etc.) are sampling methods that are used when sampling frames are not
available (or too expensive) and that we hope result in more or less random samples
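
The idea of drawing a random sample from a sampling frame can be sketched in a few lines of Python. This is only an illustration: the frame of 1,000 numbered cases, the sample size of 20, and the helper name `draw_random_sample` are assumptions made for the example, not something taken from the book.

```python
import random


def draw_random_sample(sampling_frame, n, seed=None):
    """Draw n cases without replacement so that every case in the
    sampling frame has the same chance of being selected."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    return rng.sample(sampling_frame, n)


# Hypothetical sampling frame: a numbered list of all 1,000 cases.
sampling_frame = list(range(1, 1001))

# Use random numbers to select 20 cases from the frame.
sample = draw_random_sample(sampling_frame, n=20, seed=42)
```

Sampling without replacement means no case can end up in the sample twice.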

$\text{outcome}_i = (\text{model}) + \text{error}_i$
→ regression (simple regression/multiple regression)

Statistical models are made up of variables (measured constructs that vary across entities) and
parameters (not measured but estimated from the data, and usually constants, e.g. the mean)
- In statistics we fit models to our data (i.e. we use a statistical model to represent what is
happening in the real world)
- The mean is a hypothetical value (i.e. it doesn’t have to be a value that actually exists in the
data set) (e.g. the mean number of children that women have is 2.12)
- The mean is a simple statistical model

The mean
- The mean is the value from which the (squared) scores deviate least (it has the least error)
Mean: $\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n}$

$x_i$: the value for case i
$n$: the number of cases
$\sum$: sum (add them all up)

The mean as a model

• The mean is a model of what happens in the real world: the typical score
• It is not a perfect representation of the data
• How can we assess how well the mean represents reality?
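
A small numerical check can make the "least error" property of the mean concrete. The sketch below uses five made-up scores (an assumption for illustration) and compares the total squared deviation around the mean with the total squared deviation around two other candidate values.

```python
def mean(scores):
    """Arithmetic mean: add all scores up and divide by the number of cases."""
    return sum(scores) / len(scores)


def total_squared_deviation(scores, model_value):
    """Sum of squared differences between each score and a candidate value."""
    return sum((x - model_value) ** 2 for x in scores)


# Hypothetical scores, chosen only for illustration.
scores = [1, 2, 3, 3, 4]

m = mean(scores)
error_at_mean = total_squared_deviation(scores, m)
error_at_2 = total_squared_deviation(scores, 2)  # a value below the mean
error_at_3 = total_squared_deviation(scores, 3)  # a value above the mean
```

Any candidate value other than the mean gives a larger total squared deviation, which is what makes the mean a least-squares model of the data.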

The perfect fit

Calculating ‘Error’
• A deviation is the difference between the mean and an actual data point.
• Deviations can be calculated by taking each score and subtracting the mean from it:
deviation = $x_i - \bar{X}$

• Total Error
- We could just take the deviation of each data point from the mean and add them all up.

Sum of Squared Errors
• We could add the deviations to find out the total error.
• Deviations cancel out because some are positive and others negative.
• Therefore, we square each deviation.
• If we add these squared deviations we get the Sum of Squared Errors (SS).
• Although the SS is a good measure of the accuracy of our model, it depends on the amount
of data collected. To overcome this problem, we divide the SS by the degrees of freedom, where
N is the sample size and df = N-1 the degrees of freedom:
mean squared error = $\frac{SS}{df} = \frac{SS}{N-1}$
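
The steps in this list can be traced in a short Python sketch. The five scores are made up for illustration, and the "doubled" data set is just the same scores entered twice, to show that the SS grows with the amount of data while the mean squared error does not grow in the same way.

```python
# Hypothetical scores, chosen only for illustration.
scores = [1, 2, 3, 3, 4]
mean = sum(scores) / len(scores)

# Deviations: each score minus the mean.
deviations = [x - mean for x in scores]

# Adding the raw deviations gives (practically) zero: positives and negatives cancel.
total_error = sum(deviations)

# Squaring each deviation first gives the Sum of Squared Errors (SS).
ss = sum(d ** 2 for d in deviations)

# Dividing by the degrees of freedom (N - 1) gives the mean squared error.
df = len(scores) - 1
mse = ss / df

# The same scores entered twice: SS doubles, the mean squared error does not.
doubled = scores * 2
mean_doubled = sum(doubled) / len(doubled)
ss_doubled = sum((x - mean_doubled) ** 2 for x in doubled)
mse_doubled = ss_doubled / (len(doubled) - 1)
```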

• Sample → $\bar{X} = 10$
• Population → $\mu = 10$

The sum of squared error and the mean squared error are used to assess the fit of a
model. When the model is the mean, the mean squared error is called variance and the square
root of the variance is called the standard deviation (p.49). The mean squared error is the sum of
squared errors divided by the number of degrees of freedom – in the case of the variance divided
by N-1

Variance and Standard Deviation
• We call the mean squared error the variance when the model is the mean.
• The square root of the variance is called the standard deviation
Variance: $s^2 = MSE = \frac{SS}{df} = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1}$

SD: $s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1}}$
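
As a rough check of the variance and standard deviation formulas, the sketch below computes both by hand and compares them with Python's standard `statistics` module, whose `variance` and `stdev` functions also divide by n - 1. The six scores are made up for illustration.

```python
import math
import statistics

# Hypothetical scores, chosen only for illustration.
scores = [2, 4, 4, 5, 7, 8]
n = len(scores)
mean = sum(scores) / n

# Sum of squared errors around the mean.
ss = sum((x - mean) ** 2 for x in scores)

# Variance = SS / df, with df = n - 1.
variance = ss / (n - 1)

# Standard deviation = square root of the variance.
sd = math.sqrt(variance)

# Cross-check against the standard library (sample versions, n - 1 in the denominator).
matches_library = (
    math.isclose(variance, statistics.variance(scores))
    and math.isclose(sd, statistics.stdev(scores))
)
```

Note that `statistics.pvariance` and `statistics.pstdev` divide by n instead, so they would not match the formulas above.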

The Standard Error
• SD tells us how well the mean represents the sample data. The smaller the SD is, the better
the mean represents the sample data.
• But, if we want to estimate this parameter in the population, then we need to take into
account the SD of the population and the size of the sample that we used to estimate that
parameter: the larger the sample size, the more accurate our estimate.
When we want to compare means of samples, we tend to compare SE’s instead of SD’s

To estimate the mean of the population to the left with a certain accuracy a much larger sample is
required than for the population to the right.

The standard error of a statistic (e.g. the mean) is the standard deviation of the
sampling distribution of that statistic. The standard deviation of the population measures
how well the population mean fits the individual cases in the population. The standard error of
the mean measures how well the sample mean fits the population mean.

Samples vs. populations
• Sample
- Mean and SD describe only the sample from which they were calculated
• Population
- Mean and SD are intended to describe the entire population
• Sample to population:
- Mean and SD are obtained from a sample, but are used to estimate the mean and SD of the
population

Central Limit Theorem (CLT)
• The CLT tells us something important about how random samples behave.
• Suppose we drew many samples of a certain size (say n=20) from a given population and
calculated the mean of every sample. What would the frequency distribution of all these
sample means look like? We call this distribution the sampling distribution of the sample
means.

You should get an approximately normal distribution. The more samples you draw, the more clearly
the graph of sample means shows this shape, and the larger each sample is, the closer that shape is
to the normal distribution, even though the population itself may not be normally distributed.

If we draw many samples of size N from a population with standard deviation σ, then the
standard deviation of the sampling distribution of the sample mean is:

$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{N}}$
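
A small simulation can illustrate both the sampling distribution and the standard error. This is a sketch under stated assumptions: the population is an artificial, strongly skewed one (20,000 draws from an exponential distribution), the sample size of 30 and the 5,000 repeated samples are arbitrary choices, and `simulate_sample_means` is a hypothetical helper name.

```python
import math
import random
import statistics


def simulate_sample_means(population, sample_size, n_samples, seed=0):
    """Draw many random samples (with replacement) and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(n_samples)
    ]


# Artificial, clearly non-normal (right-skewed) population.
population_rng = random.Random(1)
population = [population_rng.expovariate(1.0) for _ in range(20_000)]

mu = statistics.fmean(population)      # population mean
sigma = statistics.pstdev(population)  # population standard deviation

sample_size = 30
sample_means = simulate_sample_means(population, sample_size, n_samples=5_000)

# The mean of the sample means should be close to the population mean.
mean_of_means = statistics.fmean(sample_means)

# The SD of the sample means (the standard error) should be close to sigma / sqrt(N).
empirical_se = statistics.stdev(sample_means)
theoretical_se = sigma / math.sqrt(sample_size)
```

Plotting a histogram of `sample_means` would show a roughly bell-shaped curve even though the population itself is skewed.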

Method of least squares → principle of minimizing the sum of squared error
Sampling variation → samples will vary because they contain different members of the population
Sampling distribution → frequency distribution of sample means from the same population
Standard deviation of sample means → standard error of the mean (SE) /standard error
Central limit theorem → as samples get large (greater than 30), the sampling distribution is
approximately normal, with a mean equal to the population mean
Confidence intervals → calculated boundaries within which we believe the population value (e.g. the population mean) will fall

Confidence intervals
