Bachelor in Social Sciences: 1
2018-2019
Table of Contents
Introduction: statistics
o What are statistics?
o Variables
§ Categorical variables
- Nominal
- Ordinal
§ Metric variables
- Interval
- Ratio
§ Continuous variable
§ Discrete variable
Univariate, descriptive statistics
o Frequencies and visual presentations
o Measures of center
§ Mean y̅
§ Median M
§ Mode
o Measures of variability
§ Range
§ Variance s²
§ Standard deviation s
§ Variation coefficient V
o Measures of position
§ Percentiles
§ Interquartile range IQR
§ Boxplot
§ Outlier
o From sample to population distribution
§ Normal distribution
o Shape of distributions
§ Skewness
§ Kurtosis
o Standard normal distribution
o Z-transformations
Univariate, inferential statistics
o Simple random sampling
o Sampling variability
o Sample error of a statistic
o Sampling distribution of a sample statistic
§ Sampling distribution of the sample mean
- Mean of sampling distribution of the sample mean
- Standard error of the sample mean
- Law of large numbers
- Central limit theorem
- Normal distribution (Gauss-curve)
- Point estimate
- Interval estimate
- Confidence interval
- Z-scores
- T-distribution
- T-scores
§ Sampling distribution of the sample proportion
- Central limit theorem
- Mean of sampling distribution of the proportion
- Standard error of the sample proportion
- Z-scores
- Confidence interval
§ Sampling distributions of the median
o Significance testing
§ Assumptions
§ Hypotheses
§ Test value
§ P-value
§ Conclusion and interpretation
Bivariate, inferential statistics
o Symmetrical relations
o Asymmetrical relations
o Relationships between two categorical variables
- Cross-table or contingency table
- Chi-squared test of independence (with significance testing)
- Residual analysis
- Measures of association
o Relationships between two metric variables
- Scatterplot
- Covariance
- Correlation
- Regression analysis
1. Determining X and Y
2. Calculating the linear relationship
3. Residual analysis
! We can apply the logic of significance testing to linear regression analysis
Multivariate, linear regression analysis
o Basic model
o Mediating model
o Interaction or moderation model
Statistics
What are statistics?
= a body of methods for obtaining and analyzing data
Population
= the total set of subjects of interest in a study
- Number of subjects = population size = N
!!! Requires a clear definition: who’s in and who’s out?
Sample
= a smaller subset of subjects selected from the research population
- Simple random selection → representative sample
- Number of sampled subjects = sample size = n
- n is always smaller than or equal to N (n ≤ N)
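As a rough illustration of the notation above, the following Python sketch draws a simple random sample from a made-up population. The population of 1000 numbered subjects, the sample size of 50 and the seed are assumptions chosen for the example only.

```python
import random

# Hypothetical population: subjects numbered 1..1000 (illustrative only)
population = list(range(1, 1001))
N = len(population)                  # population size

rng = random.Random(42)              # fixed seed so the example is repeatable
n = 50                               # sample size, must satisfy n <= N
sample = rng.sample(population, n)   # simple random selection without replacement

print(N, n, n <= N)
```

Because the selection is without replacement, no subject can appear twice in the sample, which is why n can never exceed N.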
Descriptive statistics summarize the information of the data
Inferential statistics provide predictions about an entire population, based on data from a
sample of that population
Variables
Subjects in a population/sample vary from each other with respect to a characteristic
⇒ a variable = a characteristic that can vary in value among subjects/statistical units in
a sample or population
- Notation: X, Y, Z, x, y, z, …
- Each subject has a particular value on a variable: X1, X2, …, Xn
- Number of different values a variable can take: m
- Measurement scale: the level at which a variable is measured (nominal, ordinal, interval or ratio)
- In a sample, the number of different observed values m is always smaller than or equal to n (m ≤ n)
- Univariate statistics: 1 variable
- Bivariate statistics: association between 2 variables
- Multivariate statistics: associations between more than 2 variables
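To make the symbols n and m concrete, here is a small Python sketch. The variable X (number of siblings) and its eight values are invented for the example and do not come from the course.

```python
# Hypothetical values of a variable X (number of siblings) for 8 subjects
X = [0, 2, 1, 2, 3, 0, 1, 2]

n = len(X)         # sample size: one value per subject
m = len(set(X))    # number of different values observed in the sample

print(n, m)        # prints: 8 4
```

Several subjects share the same value, so m ends up smaller than n here.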
Types of variables
Categorical variables: the values are categories. You cannot do arithmetic with them
(e.g. male + female = ???, which makes no sense); the values are labels, not quantities
e.g. hair color, gender, religion, favorite show, political preferences, …
= qualitative variable
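Although adding categories together is meaningless, it is still possible to count how often each category occurs. A minimal Python sketch, using invented hair colors for six subjects:

```python
from collections import Counter

# Hypothetical hair colors of 6 subjects (illustrative data)
hair_color = ["brown", "blond", "brown", "black", "brown", "blond"]

frequencies = Counter(hair_color)   # how often each category occurs

print(frequencies["brown"])         # prints: 3
```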