Theory = a set of principles that explains a general broad phenomenon
Hypothesis = a proposed explanation for a fairly narrow phenomenon or set of observations.
Not a guess, but a theory-driven attempt to explain what has been observed
● Hypothesis ≠ prediction. Instead, a hypothesis is an explanatory statement
Falsification = act of disproving a hypothesis or theory
Independent variable: a variable that we think is a cause, because its value does not depend
on any other variables ( = predictor variable)
Dependent variable: a variable that we think is an effect, because the value of this variable
depends on the cause (= outcome variable)
Categorical variable = made up of distinct categories
● Binary variable = only two categories (e.g. male-female)
● Nominal variable = things that are equivalent in some sense are given the same
name/number, with more than two possible categories. Only the frequency of each
category can meaningfully be considered
● Ordinal variable = the categories are ordered. The order does not, however, tell us
the size of the difference between values
Continuous variable = one that gives us a score for each person and can take on any value
on the measurement scale that we are using
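● Interval variable = equal intervals on the scale represent equal differences; on a 1-5
scale the difference between 3 and 4 is the same as the difference between 1 and 2.
● Ratio variable = ratios of scores must also be meaningful, which requires a true zero.
In the 1-5 example a 4 would have to be twice as good as a 2, but a rating scale has no
true zero. Reaction time qualifies: differences are meaningful and there is a true zero
(absence of time)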
● Discrete variable = can take on only certain values, e.g. on a 1-5 rating you can’t give
4.35 even though it would be a sensible rating. A truly continuous variable can be
measured to any level of precision, e.g. age 34y 7m 21d 10h 55min 10s 100ms 63µs 1ns.
Measurement error = discrepancy between the numbers used to represent the thing
measured and the actual value of the thing measured
Validity = whether an instrument measures what it sets out to measure.
● Criterion validity: whether you can establish that an instrument measures what it
claims to measure through comparison to objective criteria
○ Concurrent validity: when data are recorded simultaneously using the new
instrument and existing criteria.
○ Predictive validity: when data from the new instrument are used to predict
observations at a later point in time
Reliability = whether an instrument can be interpreted consistently across different
situations.
● Test-retest reliability: assessing reliability by testing the same people twice
Correlational research method: observing natural events, either by taking a snapshot of
many variables at a single point in time (cross-sectional research) or by measuring
variables repeatedly at different time points (longitudinal research).
Unsystematic variation: variation that results from random factors that exist between the
experimental conditions
Systematic variation: variation due to the experimenter doing something in one condition
but not in the other condition
The two most important sources of systematic variation in a repeated-measures design (the
same participants take part in both conditions) are:
- Practice effects: participants may perform differently in the second condition because
of familiarity with the experimental situation and/or the measures being used
- Boredom effects: participants may perform differently in the second condition
because they are tired or bored from having completed the first condition
The risk is minimised by counterbalancing: randomising the order in which participants
complete the conditions, so that some do condition 1 before condition 2 and others the
reverse.
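A minimal sketch of counterbalancing, assuming two conditions and a made-up list of participant IDs. The function name `counterbalance` and the `seed` parameter are illustrative assumptions, not from the course material:

```python
import random

def counterbalance(participants, seed=None):
    """Randomly give half the participants each order of two conditions."""
    rng = random.Random(seed)          # seed only makes the example repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)              # random assignment to an order
    half = len(shuffled) // 2
    orders = {}
    for i, person in enumerate(shuffled):
        if i < half:
            orders[person] = ("condition 1", "condition 2")
        else:
            orders[person] = ("condition 2", "condition 1")
    return orders

# Hypothetical participant IDs
orders = counterbalance(["P1", "P2", "P3", "P4", "P5", "P6"], seed=1)
for person, order in sorted(orders.items()):
    print(person, "->", " then ".join(order))
```

Because each order is used equally often, practice and boredom effects should affect both conditions about equally instead of always favouring one.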
Analysing data
Frequency distribution (histogram) = graph of each score against how many times it occurred.
● Normal distribution = bell-shaped and symmetrical: draw a line through the middle and
both sides look the same
● Skew: lack of symmetry
○ Positive skew: the frequent scores are clustered at the lower end and the tail
points towards the higher or more positive scores
○ Negative skew: the frequent scores are clustered at the higher end and the
tail points toward the lower or more negative scores
● Kurtosis: pointiness/degree to which scores cluster at the ends (tails) of the distribution
○ Leptokurtic: positive kurtosis; many scores in the tails and a pointy peak
○ Platykurtic: negative kurtosis; relatively few scores in the tails and flatter
than normal
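Skew and kurtosis can also be put into numbers. The sketch below uses the common moment-based formulas (skewness = m3 / m2^1.5, excess kurtosis = m4 / m2^2 - 3, where m_k is the k-th central moment). These formulas and all data sets are assumptions added for illustration; they are not given in the notes, and software packages may use slightly different corrections:

```python
def central_moment(scores, k):
    """Average of (score - mean)^k over all scores."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** k for x in scores) / len(scores)

def skewness(scores):
    # > 0: tail points to higher scores; < 0: tail points to lower scores
    return central_moment(scores, 3) / central_moment(scores, 2) ** 1.5

def excess_kurtosis(scores):
    # > 0: leptokurtic; < 0: platykurtic; 0 for a normal distribution
    return central_moment(scores, 4) / central_moment(scores, 2) ** 2 - 3

# Made-up data sets
positive_skew = [1, 1, 1, 2, 2, 3, 4, 9]   # scores cluster low, tail to the right
negative_skew = [1, 6, 7, 8, 8, 9, 9, 9]   # mirror image: tail to the left
flat = [1, 2, 3, 4, 5]                     # evenly spread
peaked = [0, 5, 5, 5, 5, 5, 5, 10]         # pointy centre plus extreme scores

print(skewness(positive_skew), skewness(negative_skew))
print(excess_kurtosis(flat), excess_kurtosis(peaked))
```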
Mode = score that occurs most frequently in the data set
- Bimodal = two values tie for most frequent
- Multimodal = datasets with more than two modes
Median = the middle score when scores are ranked in order of magnitude
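- Calculate: rank the scores; the median is the score at position (n+1)/2. With an even
number of scores this position falls between two scores, so average the two middle scores.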
Mean = the average score: add up all scores then divide by the total number of scores
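A short sketch of the three measures of central tendency on made-up scores. `median_by_position` is a hypothetical helper that follows the (n+1)/2 rule; the other calls come from Python's standard `statistics` module:

```python
import statistics

def median_by_position(scores):
    """Median via the (n + 1) / 2 position rule."""
    ranked = sorted(scores)
    position = (len(ranked) + 1) / 2          # 1-based position of the median
    if position.is_integer():
        return ranked[int(position) - 1]      # odd n: a single middle score
    lower = ranked[int(position) - 1]         # even n: average the two
    upper = ranked[int(position)]             # scores around the position
    return (lower + upper) / 2

scores = [2, 3, 3, 5, 7, 8, 8, 9, 10]         # made-up data, n = 9

print("modes: ", statistics.multimode(scores))   # 3 and 8 both occur twice: bimodal
print("median:", median_by_position(scores))     # position (9 + 1) / 2 = 5
print("mean:  ", statistics.mean(scores))        # 55 / 9
```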
Dispersion in distribution:
- range: highest - lowest
- Interquartile range: range of middle 50% of scores.
- lower quartile is the median of lower half
- upper quartile is the median of upper half
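A sketch of the range and interquartile range using the "median of each half" rule above. It assumes that, with an odd number of scores, the overall median is left out of both halves (one common convention; others exist). The function name and data are made up:

```python
import statistics

def quartiles_by_halves(scores):
    """Lower and upper quartile as the medians of the lower and upper half."""
    ranked = sorted(scores)
    half = len(ranked) // 2
    lower_half = ranked[:half]       # for odd n the middle score is excluded
    upper_half = ranked[-half:]
    return statistics.median(lower_half), statistics.median(upper_half)

scores = [2, 3, 3, 5, 7, 8, 8, 9, 10]           # made-up data
score_range = max(scores) - min(scores)          # highest - lowest
lower_q, upper_q = quartiles_by_halves(scores)
iqr = upper_q - lower_q                          # spread of the middle 50%

print("range:", score_range)
print("quartiles:", lower_q, upper_q)
print("IQR:", iqr)
```

Unlike the range, the interquartile range is not affected by a single extreme score at either end.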
Deviance = difference between each score and the mean: $x_i - \bar{x}$
Total deviance = sum of all deviances. This always comes out as 0, because positive and
negative deviances cancel. Therefore:
Sum of squared errors: $SS = \sum_{i=1}^{N} (x_i - \bar{x})^2$
- SS is a total, so it grows with the number of scores. Therefore we use the average dispersion:
Variance: $s^2 = \frac{SS}{N - 1}$ // Standard deviation: $s = \sqrt{\frac{SS}{N - 1}}$
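The chain from deviance to standard deviation, worked through step by step on five made-up scores (the data are an assumption chosen so the numbers come out round):

```python
import math

scores = [2, 4, 4, 6, 9]                       # made-up data
n = len(scores)
mean = sum(scores) / n                         # 25 / 5 = 5

deviances = [x - mean for x in scores]         # -3, -1, -1, 1, 4
total_deviance = sum(deviances)                # cancels out to 0
ss = sum(d ** 2 for d in deviances)            # 9 + 1 + 1 + 1 + 16 = 28
variance = ss / (n - 1)                        # 28 / 4 = 7
sd = math.sqrt(variance)                       # square root of 7

print("total deviance:", total_deviance)
print("SS:", ss)
print("variance:", variance)
print("standard deviation:", round(sd, 3))
```

Taking the square root brings the measure back to the original units of the scores, which is why the standard deviation is usually easier to interpret than the variance.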
Probability: chance of something happening. 0=no chance // 1=will definitely happen
- Probability distribution: (x) value of the variable against (y) the probability of it
occurring.
Simplest form of the normal distribution: mean = 0 and std. deviation = 1. Any score can
be converted to this scale as a z-score: $z = \frac{x - \bar{x}}{s}$
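A minimal sketch of converting scores to z-scores, assuming the sample standard deviation (dividing by N - 1) is used for s. The function name `z_scores` and the data are made up for illustration:

```python
import statistics

def z_scores(scores):
    """Convert each score to z = (x - mean) / s."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)      # sample SD, divides by N - 1
    return [(x - mean) / sd for x in scores]

scores = [2, 4, 4, 6, 9]               # made-up data, mean 5
z = z_scores(scores)

print([round(value, 2) for value in z])
print("mean of z:", round(statistics.mean(z), 10))
print("SD of z:  ", round(statistics.stdev(z), 10))
```

After the conversion the scores have a mean of 0 and a standard deviation of 1, whatever the original units were.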
Week 2: CHP 2: What is the SPINE of statistics