STATISTISCHE MODELLEN VOOR
COMMUNICATIEONDERZOEK
week 1
micro lecture 2.1
★ empirical cycle → 5 phases
○ hypothetico-deductive approach
1. observation: sparks the idea for a hypothesis: a pattern, unexpected event,
or interesting relation we want to explain; the source is not important
(personal, shared, imagined, previous research) → observing a relation
in one or more instances
2. induction: general rule → with inductive reasoning, the relation in
specific instances is transformed into a general rule or hypothesis
○ inductive inference: relation holds in specific cases ⇒
relation holds in all cases
3. deduction: relation should hold in new instances,
expectation/prediction is deduced about new observations →
hypothesis is transformed with deductive reasoning and
specification of research setup into prediction about new
observations
○ determine research setup
○ define concepts, measurement instruments, procedures,
sample
4. testing: data collection, compare data to prediction, statistical
processing → new data are collected and, with the aid of statistics,
compared to the predictions
○ descriptive: summarize
○ inferential: decide
5. evaluation: interpret results in terms of hypothesis → hypothesis
supported, adjusted or rejected
○ prediction confirmed ⇒ hypothesis provisionally supported,
not proven
○ prediction disconfirmed ⇒ hypothesis not automatically
rejected → repeat with better research setup/adjust
hypothesis/reject hypothesis
literature chapter 1
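★ expected value = mean of sampling distribution
1. draw thousands of samples
2. calculate the sample statistic for each sample and take the mean of
these values
3. this mean approximates the true population value (if the estimator is
unbiased)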
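The steps above can be sketched as a small simulation. This is only an illustration under assumed numbers: the population of 10,000 candies with 20% yellow ones, the sample size of 10, and the names `population` and `sample_proportion` are all hypothetical and not taken from the literature.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 candies, 20% of them yellow (1 = yellow).
population = [1] * 2000 + [0] * 8000
population_proportion = statistics.mean(population)

def sample_proportion(sample_size):
    """Draw one random sample and return its proportion of yellow candies."""
    sample = random.sample(population, sample_size)
    return statistics.mean(sample)

# Draw thousands of samples and collect the sample statistic of each one.
sample_statistics = [sample_proportion(10) for _ in range(5000)]

# The mean of all sample statistics estimates the expected value.
expected_value_estimate = statistics.mean(sample_statistics)

print(population_proportion, round(expected_value_estimate, 3))
```

The printed estimate should lie close to the population proportion, which is what we expect when the sample proportion is an unbiased estimator.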
★ musts:
1. random samples
2. unbiased estimator
3. continuous versus discrete: probability density vs. probabilities
4. impractical
★ sample statistic: a number describing a characteristic of a sample
★ sampling space: all possible sample statistic values
★ sampling distribution: all possible sample statistic values and their
probabilities or probability densities
★ probability density: a means of getting the probability that a continuous
random variable (like a sample statistic) falls within a particular range
★ random variable: a variable with values that depend on chance
★ expected value/expectation: the mean of a probability distribution, such as a
sampling distribution
★ unbiased estimator: a sample statistic for which the expected value equals the
population value
★ statistical inference: concerns estimation and null hypothesis testing
★ simulation: letting a computer draw many random samples from a
population
★ inferential statistics: offers techniques for making statements about a larger
set of observations from data collected for a smaller set of observations
○ population = the large set of observations about which we want to make
a statement
○ sample = the smaller set
○ we want to generalize a statement about the sample to a statement
about the population from which the sample was drawn
★ the sample statistic is called a random variable; it is a variable because
different samples can have different scores, so its value may vary
from sample to sample; it is a random variable because the score depends on
chance, namely the chance that a particular sample is drawn.
★ probability distribution of the sample statistic: what we get if we change the
frequencies in the sampling distribution into proportions.
★ discrete probability distribution ⇒ only a limited number of outcomes are possible
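A minimal sketch of turning frequencies into proportions, assuming a tiny hypothetical population of five candies (two yellow) and samples of two candies. Because the population is so small, every possible sample can be listed instead of simulated.

```python
from collections import Counter
from itertools import combinations

# Hypothetical population of five candies; 1 = yellow, 0 = another colour.
population = [1, 1, 0, 0, 0]
sample_size = 2

# Sample statistic: the number of yellow candies in each possible sample.
frequencies = Counter(sum(sample) for sample in combinations(population, sample_size))

# Dividing each frequency by the number of samples gives the
# discrete probability distribution of the sample statistic.
total_samples = sum(frequencies.values())
probabilities = {value: freq / total_samples
                 for value, freq in sorted(frequencies.items())}

print(probabilities)  # {0: 0.3, 1: 0.6, 2: 0.1}
```

Only three outcomes (0, 1 or 2 yellow candies) are possible, which is what makes this distribution discrete.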
★ expected value = the average of the sampling distribution of a random
variable (also called: the expectation of a probability distribution)
★ unbiased estimator ⇒ a sample statistic whose expected value is equal to the
population statistic
○ we usually refer to the population statistic as a parameter
★ downward biased ⇒ the estimator tends to underestimate the value in the
population
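To make the difference concrete, here is a sketch that lists all possible samples of size 2 from a hypothetical population of four scores. The sample mean serves as an example of an unbiased estimator; the sample maximum is used here (as an assumed illustration, not an example from the literature) as a downward-biased estimator of the population maximum.

```python
from itertools import combinations
from statistics import mean

# Hypothetical population of four scores.
population = [2, 4, 6, 8]
all_samples = list(combinations(population, 2))  # every possible sample of size 2

# Expected value = mean of the sample statistic over all possible samples.
expected_sample_mean = mean(mean(sample) for sample in all_samples)
expected_sample_max = mean(max(sample) for sample in all_samples)

print(expected_sample_mean, mean(population))           # equal: unbiased
print(round(expected_sample_max, 2), max(population))   # smaller: downward biased
```

The expected value of the sample mean equals the population mean, while the expected value of the sample maximum lies below the population maximum.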
★ a sample is representative of a population (in the strict sense) if variables in
the sample are distributed in the same way as in the population
○ a random sample is representative in principle (representative in the
statistical sense): we expect it to resemble the population, but any
single sample is likely to differ from the real population
★ continuous variable: we can always think of a new value in between two values
★ probability density function: the vertical axis of a continuous probability
distribution is labelled ‘probability density’ instead of ‘probability’.
a probability density function can give us the probability of
values between two thresholds.
○ left-hand probability: the probability of values up to (and including) a
threshold value → used to calculate p values
○ right-hand probability: the probability of values above (and including) a
threshold value → used to calculate p values
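A short sketch of left-hand and right-hand probabilities, assuming the sampling distribution is normal. The mean of 2.8, the standard deviation of 0.1 and the thresholds are hypothetical values chosen for illustration; `NormalDist` comes from Python's standard `statistics` module.

```python
from statistics import NormalDist

# Hypothetical sampling distribution of a sample mean (assumed to be normal).
sampling_distribution = NormalDist(mu=2.8, sigma=0.1)
threshold = 3.0

# Left-hand probability: values up to (and including) the threshold.
left_hand = sampling_distribution.cdf(threshold)

# Right-hand probability: values above the threshold. For a continuous
# variable, including the threshold itself does not change the probability.
right_hand = 1 - left_hand

# Probability between two thresholds: difference of two left-hand probabilities.
between = sampling_distribution.cdf(2.9) - sampling_distribution.cdf(2.7)

print(round(left_hand, 4), round(right_hand, 4), round(between, 4))
```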
★ we can use probability distributions in two ways:
○ we can use them to say how likely or unlikely we are to draw a sample
with the sample statistic value in a particular range
○ we can use them to find the threshold values that separate the top ten
percent or the bottom five percent in a distribution
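The second use, finding threshold values, can be sketched with the inverse of the cumulative distribution. A standard normal distribution is assumed here purely for illustration; the notes do not prescribe a particular distribution.

```python
from statistics import NormalDist

# Assumed distribution for the sketch: standard normal (mean 0, sd 1).
distribution = NormalDist(mu=0, sigma=1)

# Threshold that separates the top ten percent: 90% of values lie below it.
top_ten_threshold = distribution.inv_cdf(0.90)

# Threshold that separates the bottom five percent: 5% of values lie below it.
bottom_five_threshold = distribution.inv_cdf(0.05)

print(round(top_ten_threshold, 3), round(bottom_five_threshold, 3))
```

Going the other way, `distribution.cdf(top_ten_threshold)` returns 0.90 again (up to rounding), which links the two uses of a probability distribution.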
lecture 1
★ statistical literacy
○ knowledge (basic understanding of concepts)
■ identify
■ describe
○ skills (ability to work with statistical tools)
■ translate
■ interpret
■ read
■ compute
★ statistical reasoning
○ understanding
■ explain why
■ explain how
★ statistical thinking
○ apply
■ what methods to use in a specific situation
○ critique
■ comment and reflect on work of others
○ evaluate
■ assigning value to work
○ generalize
■ what does variation mean in the large scheme of life
★ difference between the binomial distribution and the normal distribution:
○ with a normal distribution, all values can occur (e.g. also 2.99)
○ with a binomial distribution, they cannot
★ the mean of the sampling distribution is the same as the population
proportion
★ the expected value is always the same as the mean of the sampling
distribution and, for an unbiased estimator, the mean in the population
★ the parameter is the population value (here: the mean in the population)
★ sampling distribution ⇒ the cases there are the samples
★ representative in the strict sense = identical
★ the mean of the sampling distribution is an unbiased estimator
★ a discrete variable has fixed outcomes, so you use probabilities instead of
probability density
○ probability density for a continuous variable
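The contrast between probabilities and probability density can be sketched as follows. The binomial settings (10 draws, success probability 0.2) and the normal settings (mean 3.0, standard deviation 0.5) are hypothetical choices for illustration only.

```python
from math import comb
from statistics import NormalDist

# Discrete: a binomial variable has a fixed set of outcomes (0, 1, ..., n),
# and each outcome has its own probability.
n, p = 10, 0.2
binomial_probabilities = {k: comb(n, k) * p**k * (1 - p)**(n - k)
                          for k in range(n + 1)}

# Continuous: a normal variable can take any value, such as 2.99.
# The height of the curve at 2.99 is a density, not a probability.
normal = NormalDist(mu=3.0, sigma=0.5)
density_at_299 = normal.pdf(2.99)

# A probability only exists for a range of values, e.g. 2.985 to 2.995.
probability_near_299 = normal.cdf(2.995) - normal.cdf(2.985)

print(round(binomial_probabilities[2], 4))
print(round(density_at_299, 4), round(probability_near_299, 4))
```

A value such as 2.5 is simply not among the binomial outcomes, whereas for the normal variable the density at 2.99 is well defined but only a range around it has a probability.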