Intro to statistics
Lecture 1
Statistics is composed of 3 main facets
1. Design – how to collect data
2. Description – visualising and summarising data
3. Inference – making predictions about the wider world & the future
Variables
Continuous (interval) – measured on a continuous scale, eg income, height
Categorical (discrete) – takes distinct categories, eg ethnicity
Within categorical: nominal – no order to categories, eg ethnicity
and ordinal – implicit order, eg education: GCSE, A level, degree
Count – eg number of cars
Continuous variables
- Theoretically infinite number of outcomes
- Meaningful to do arithmetic with
- Can be meaningfully subdivided indefinitely eg age intervals
Categorical variables
- Values taken are category codes
- Not ‘real’ numbers, so cannot do arithmetic with them
- Codes are not necessarily numerical but can be
- Ordinal – there exists a logical ordering/ranking of categories – note: the magnitude of the difference between categories is not always meaningful
- Nominal – no obvious order/ranking, eg country of birth – an extrinsic ordering (eg alphabetical) can be imposed, but there is no intrinsic ordering
Count variables
- Have characteristics of both ordinal and continuous data
- Have a natural order
- Difference between values is meaningful, can do basic arithmetic with values
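The distinctions above can be sketched in Python by looking at which summaries make sense for each variable type. This is only an illustration: the data values and variable names are made up, not taken from the lecture.

```python
from statistics import mean, median_low, mode

# Hypothetical illustrative data (assumed, not from the lecture).
heights_cm = [162.5, 170.1, 181.3, 175.0]           # continuous
country_of_birth = ["UK", "France", "UK", "India"]  # nominal
education = ["GCSE", "degree", "A level", "GCSE"]   # ordinal
cars_owned = [0, 2, 1, 1]                           # count

# Continuous: arithmetic such as the mean is meaningful.
mean_height = mean(heights_cm)

# Nominal: no order, so only frequencies (eg the mode) make sense.
most_common_country = mode(country_of_birth)

# Ordinal: categories can be ranked, so a median category exists,
# but the gaps between ranks are not necessarily equal.
education_order = ["GCSE", "A level", "degree"]
ranks = [education_order.index(e) for e in education]
median_education = education_order[median_low(ranks)]

# Count: ordered, and sums and differences are meaningful.
total_cars = sum(cars_owned)
```

Averaging the education ranks would run without error, but the result would be hard to interpret, which is the point of the note on ordinal magnitudes.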
Data types
Aggregation level
- Micro level – data are usually collected at the micro level (the smallest possible unit)
- Macro / aggregated form – summed over a number of smaller units
- At more advanced levels aggregation may cause methodological problems
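As a rough sketch of moving from micro to macro level, the Python below sums household records into regional totals. The records, field names and regions are hypothetical.

```python
from collections import defaultdict

# Hypothetical micro-level records: one entry per household (made-up values).
households = [
    {"region": "North", "income": 21000},
    {"region": "North", "income": 35000},
    {"region": "South", "income": 28000},
    {"region": "South", "income": 52000},
    {"region": "South", "income": 31000},
]

# Macro / aggregated form: sum the micro units within each region.
totals = defaultdict(int)
for household in households:
    totals[household["region"]] += household["income"]
regional_income = dict(totals)
```

Once the data are summed, the variation between households within a region is no longer visible, which may be one source of the methodological problems mentioned above.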
Dimension of data collection
- Cross section data – different entities observed at one point in time, eg different exchange rates all measured at the same time
- Time series data – same entity observed at different points in time
- Panel data – combination of cross section and time series: a number of entities observed over time
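One way to picture the three dimensions is to store a small panel and slice it, as in the Python sketch below. The entities "A" and "B", the years and the values are all made up, and the helper names are assumptions for illustration.

```python
# Hypothetical panel: (entity, year) -> observed value (made-up numbers).
panel = {
    ("A", 2020): 10,
    ("A", 2021): 12,
    ("B", 2020): 20,
    ("B", 2021): 18,
}

def cross_section(data, year):
    """Different entities observed at one point in time."""
    return {entity: value for (entity, y), value in data.items() if y == year}

def time_series(data, entity):
    """The same entity observed at different points in time."""
    return {y: value for (e, y), value in data.items() if e == entity}

snapshot_2020 = cross_section(panel, 2020)
series_a = time_series(panel, "A")
```

Fixing the year gives a cross section, fixing the entity gives a time series, and keeping both dimensions gives the panel.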
Time series data
Growth rate = (new – old)/old * 100
Gives a percentage
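The growth rate formula can be sketched in Python as follows. The function names and the example series are hypothetical, and the zero check is an added assumption, since the formula is undefined when the old value is 0.

```python
def growth_rate(old, new):
    """Percentage growth from old to new: (new - old) / old * 100."""
    if old == 0:
        raise ValueError("growth rate is undefined when the old value is 0")
    return (new - old) / old * 100

def growth_rates(series):
    """Period-on-period growth rates for a time series of values."""
    return [growth_rate(old, new) for old, new in zip(series, series[1:])]

# Hypothetical time series: one entity observed at three points in time.
values = [100, 110, 99]
rates = growth_rates(values)
```

Here the series rises by 10% and then falls by 10%, yet ends below where it started, because each rate is measured relative to a different old value.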