Class notes

Videolectures Statistics Radboud Universiteit

Pages
18
Uploaded on
11-03-2019
Written in
2017/2018

Notes of the videolectures Statistics (English)


Statistics videolectures
Videolecture 1
Stephen Toulmin’s model of argumentation
- Claim: the choice of a technique
- Ground: statistical output, type of research question, measurement levels (the data your
  decision is based on): the information that supports your conclusion
- Warrant: general rules, statistical principles

Purpose of data analysis: get information to answer research questions
- Numerical methods for describing sets of data:
  - Frequency table (can be used for every variable, regardless of measurement level)
  - Measures of central tendency and variability (choice depends on measurement level)
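
As a minimal sketch of the two numerical methods listed above, the Python snippet below builds a frequency table and then picks a measure of central tendency per measurement level. The data and variable names are invented for illustration and do not come from the lecture.

```python
from collections import Counter
import statistics

# Hypothetical data, invented for this sketch (not from the lecture)
regions = ["West", "North", "West", "South", "East", "West", "South"]  # nominal level
ages = [19, 21, 21, 23, 25, 30, 43]                                     # ratio level

# A frequency table works for every measurement level
frequency_table = Counter(regions)
for category, count in frequency_table.most_common():
    print(f"{category:<6} {count:>2}  {count / len(regions):.1%}")

# Nominal level: only the mode is meaningful
mode_region = statistics.mode(regions)

# Interval/ratio level: the median and the mean are available as well
median_age = statistics.median(ages)
mean_age = statistics.mean(ages)
print(mode_region, median_age, mean_age)
```

The same frequency table could be made for `ages`, but a mean of `regions` would be meaningless, which is why the choice of measure depends on the measurement level.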

First you collect the data you need. Then you analyze the data: you organize them and work out the
order of, and the relations between, the variables. This way you can answer the research questions.

Most of the time we deal with descriptive, static questions. Typically, no time indication is
involved, and each question deals with a single characteristic.

There is no median for nominal data, because the median depends on the order of the values and
nominal data have no order.

Variability (or dispersion) describes how the scores are spread around the measure of central
tendency. It does not apply to nominal variables, because there can be no dispersion around the mode.

When we look at the interquartile range, the median is the middle of the distribution. The
interquartile range covers the 25% of the observations just below the median and the 25% just above
it, which together form the middle 50% of the observations. IQR = the upper quartile minus the lower
quartile (75th percentile minus 25th percentile).
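
The calculation can be sketched in a few lines of Python. The scores are invented for illustration, and `statistics.quantiles` uses one particular convention for locating the quartiles, so other software may report slightly different values for the same data.

```python
import statistics

# Hypothetical scores, invented for this sketch
scores = [2, 4, 4, 5, 7, 8, 9, 10, 12]

# quantiles(n=4) returns the three cut points Q1, Q2 (the median) and Q3
q1, q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1  # width of the middle 50% of the observations
print(f"Q1={q1}, median={q2}, Q3={q3}, IQR={iqr}")
```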

When you work with the mean, you can use variance or standard deviation.

There are two ways to interpret the standard deviation, and the choice depends on the shape of your
distribution. For a normal (symmetric, bell-shaped) distribution we can use the empirical rule. When
the distribution is not symmetric and bell-shaped, we use Chebyshev’s rule.
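
The difference between the two rules can be made concrete with a short sketch. The notes do not list the percentages, so the figures below are the standard textbook results rather than values quoted from the lecture; the function names are made up for this example.

```python
from statistics import NormalDist

def chebyshev_lower_bound(k: float) -> float:
    """Minimum share of scores within k standard deviations of the mean (any shape)."""
    return 1 - 1 / k**2

def normal_share(k: float) -> float:
    """Share of scores within k standard deviations of the mean in a normal distribution."""
    standard_normal = NormalDist()
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"k={k}: normal {normal_share(k):.1%}, "
          f"Chebyshev at least {chebyshev_lower_bound(k):.1%}")
```

Chebyshev's rule gives only a guaranteed minimum, which is why it is weaker than the empirical rule but valid for any shape of distribution.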

The last part of the videolecture is about how to determine the skewness and shape of a distribution.
For this we compare the median and the mean, so it only applies to variables at interval or ratio
level. In a perfectly bell-shaped distribution, the mean and the median are the same (both lie in the
middle).

If the mean is smaller than the median, the distribution is negatively skewed, with a tail on the
left side. If the mean is larger than the median, the distribution is positively skewed, with a tail
on the right side.
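
A small sketch of this comparison, with invented data sets and a hypothetical helper name. Comparing the mean with the median is a rule of thumb, not a formal skewness statistic.

```python
import statistics

def skew_direction(values):
    """Classify skewness by comparing the mean with the median (rule of thumb)."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean < median:
        return "negatively skewed (tail on the left)"
    if mean > median:
        return "positively skewed (tail on the right)"
    return "symmetric (mean equals median)"

# Hypothetical data sets, invented for this sketch
print(skew_direction([1, 17, 18, 19, 20]))  # mean 15, median 18
print(skew_direction([1, 2, 3, 4, 20]))     # mean 6, median 3
print(skew_direction([1, 2, 3, 4, 5]))      # mean 3, median 3
```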




Videolecture 2
The process of estimation consists of four steps:
1. Determine the population: it is stated in the research question or hypothesis
2. Draw the sample: we draw samples because populations are too large to measure completely
3. Determine the sample value (X), for example the sample mean
4. Estimate and test by means of analyses

When we work with samples, we have to establish how confident we are about our estimates.
Determining a confidence interval is one way to do this: it is a range of values that we are confident
contains the population value (based on the empirical rule and the normal distribution). We are
allowed to do this because of the central limit theorem. We can draw many different samples of 100
from a population of 10,000. The elements in the samples vary: for example, one sample may contain
older or younger people than another. For each sample we can calculate a characteristic, for example
the mean.

We can gather all these means in a data set: a new variable that holds the mean of every sample of
that sample size. This is the sampling distribution. If the sampling distribution is normally
distributed, we know between which values the middle 90, 95 and 99 percent of the sample means lie.
We can then calculate confidence intervals and carry out tests on our estimates. ±1.65 means ‘1.65
standard deviations from the mean’.
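
The idea of collecting sample means in a new variable can be simulated. The sketch below assumes a made-up population of 10,000 ages and draws 2,000 samples of 100; the population, the number of samples and the seed are all invented for illustration.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 ages between 18 and 90 (invented for this sketch)
population = [rng.uniform(18, 90) for _ in range(10_000)]

# Draw many samples of 100 and store the mean of each one
sample_size = 100
sample_means = [
    statistics.mean(rng.sample(population, sample_size))
    for _ in range(2_000)
]

# The new "variable" of sample means approximates the sampling distribution
mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)
expected_se = statistics.pstdev(population) / math.sqrt(sample_size)

print(f"population mean      {statistics.mean(population):.2f}")
print(f"mean of sample means {mean_of_means:.2f}")
print(f"sd of sample means   {sd_of_means:.2f} (sigma / sqrt(n) = {expected_se:.2f})")
```

Although the population here is flat rather than bell-shaped, the sample means cluster around the population mean, which is the behaviour the central limit theorem describes.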


To work with the features of a sampling distribution, we need to know its standard deviation, here
called the ‘standard error of the mean’: the standard deviation of all possible sample means. This is
almost always unknown, so we have to estimate it. We do this by calculating the standard deviation
of our own sample (S) and dividing it by the square root of N.
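
As a minimal sketch, the estimate S divided by the square root of N looks like this in Python. The sample values are invented for illustration.

```python
import math
import statistics

# Hypothetical sample, invented for this sketch
sample = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(sample)
s = statistics.stdev(sample)        # sample standard deviation S (divides by n - 1)
standard_error = s / math.sqrt(n)   # estimated standard error of the mean
print(f"S = {s:.3f}, N = {n}, SE = {standard_error:.3f}")
```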

The confidence level is the probability that the randomly selected interval encloses the
unknown parameter. You don’t know the real parameter (for example, the mean). We
need to know two things: the confidence level and alpha. Alpha is the probability that
the randomly selected interval does not enclose the unknown parameter, that is, the uncertainty that
remains in the estimate. The confidence level is 1 - alpha. Alpha usually has the value 0.01, 0.05 or
0.10, so the confidence level has the following values: 99% (α = 0.01), 95% (α = 0.05), 90% (α = 0.10).
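
The link between alpha, the confidence level and the critical z-value can be sketched with the standard normal distribution from Python's standard library. The function name `critical_z` is made up for this example.

```python
from statistics import NormalDist

def critical_z(alpha: float) -> float:
    """Two-tailed critical z-value: alpha is split evenly over both tails."""
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f} -> confidence level {1 - alpha:.0%}, "
          f"z = {critical_z(alpha):.3f}")
```

Printed tables usually round these values, which is why the notes use 1.65 for the 90% level.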

To calculate confidence intervals, we use the normal distribution. When the sampling distribution is
normal (samples larger than 30), we can work with the z-value. With a small sample we need to use the
t-value, which depends on the degrees of freedom: the sample size minus one (N - 1).

Two-tailed means that the interval is spread equally over both sides of the distribution. Once you
have determined the critical z-value or t-value, you multiply it by the SE (standard error of the
mean). The product is added to and subtracted from the mean, and the two outcomes are the limits of
the confidence interval. As the degrees of freedom go to infinity, the t-values become equal to the
z-values.
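
Put together, a large-sample confidence interval for a mean might be computed as in the sketch below. The summary values (mean 50, S 10, N 100) and the function name are invented for illustration, and the sketch covers only the z-based case.

```python
import math
from statistics import NormalDist

def mean_confidence_interval(mean, s, n, alpha=0.05):
    """Large-sample (z-based) two-tailed confidence interval for a mean."""
    se = s / math.sqrt(n)                    # standard error of the mean
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical z-value
    margin = z * se                          # the product that is added and subtracted
    return mean - margin, mean + margin

# Hypothetical summary values, invented for this sketch
lower, upper = mean_confidence_interval(mean=50.0, s=10.0, n=100)
print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
```

For a small sample, the critical z-value would be replaced by the critical t-value for N - 1 degrees of freedom, taken from a t-table or statistical software.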

You can also calculate confidence intervals around a proportion (instead of the mean). For example,
you have a variable coded 1 for males and 0 for females (not male). You can ask: what is the mean
number of times that 1 (male) is scored? The result is the proportion. In a sample of 100 respondents
with 55 males, the proportion of 1s is 55 divided by 100 = 0.55. This is the estimated proportion.
Then we calculate the standard error of the proportion: the square root of p(1 - p)/N. In this
example: the square root of 0.55 (1 - 0.55)/100. To check whether the normal distribution may be
used, we multiply the standard error by 3 and subtract it from and add it to the proportion, which
gives two values. If the interval between these values does not contain zero or one, we can use the
normal distribution.
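
The example with 55 males out of 100 respondents can be worked through as follows. The function names are made up for this sketch; the formula and the 3 standard error check are the ones described above.

```python
import math

def proportion_se(p: float, n: int) -> float:
    """Standard error of a proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

def normal_approach_allowed(p: float, n: int) -> bool:
    """Check from the notes: p plus and minus 3 * SE must not contain 0 or 1."""
    se = proportion_se(p, n)
    return p - 3 * se > 0 and p + 3 * se < 1

p_hat = 55 / 100  # 55 males among 100 respondents
se = proportion_se(p_hat, 100)
print(f"p = {p_hat:.2f}, SE = {se:.4f}")
print(f"check interval: [{p_hat - 3 * se:.4f}, {p_hat + 3 * se:.4f}]")
print("normal approach allowed:", normal_approach_allowed(p_hat, 100))
```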

In SPSS: Analyze > Descriptive Statistics > Explore. Find the variable you want to analyze, put it in
the Dependent List and click ‘OK’. Make sure that the confidence interval is set to 95 percent
(Statistics > Descriptives).

When you calculate something, you also have to explain why it is correct: the warrant. The conclusion
‘I am 95 percent sure that the proportion of X lies between X and X’ is called the claim. The data you
use are the ground. Below are some examples from the lecture.

If a variable has 4 categories, you first have to reduce it to 2 categories. The category you want
to say something about becomes category 1, and the other 3 together become category 2 (use the option
Recode into Different Variables).
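
Outside SPSS, the same recoding step could look like the Python sketch below. The answers are invented, and coding the category of interest as 1 and the rest as 0 is an assumption made here so that the mean of the new variable equals the proportion.

```python
# Hypothetical answers with four categories, invented for this sketch
regions = ["North", "West", "South", "West", "East", "West", "North", "South"]

# Recode into a new variable: 1 = the category of interest, 0 = the other three combined
target = "West"
is_west = [1 if region == target else 0 for region in regions]

# The mean of a 0/1 variable is the proportion of 1s
proportion_west = sum(is_west) / len(is_west)
print(is_west, proportion_west)
```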

Example:
Claim (conclusion): We are 99% confident that the proportion of Dutch people that live in the West
part of the country lies between 0.413 and 0.465 (that is between 41.3% and 46.5%).

Ground (data): on the next page.

Warrant (explanation): This conclusion is justified, because a sample of 30 cases or more is
considered a large sample. We have a sample of 2384 Dutch people, so we can assume that the sampling
distribution is approximately normal. We may use the normal approach when the interval of the
proportion plus and minus 3 standard errors does not include 0 or 1. Here it is [.4087-.4697], so the
normal approach is allowed. The confidence level is 99%, so α = 1% = .01. This has to be divided by
two, because a confidence interval is two-sided. The appropriate z-value is then
Zα/2 = Z.005 = ±2.580. The SPSS output about the confidence interval shows

Romygerritsen Radboud Universiteit Nijmegen