Introduction to Econometrics – Economics and Business Economics
Lecture 1:
Learning via tradition: little effort
Learning via experts: little effort
Learning via own experience: causal chain
Problems: inaccurate observations, overgeneralization, illogical reasoning
Scientific learning: extend existing knowledge, learning via scientific methods, theory/data/analysis
Association: useful to describe facts, but not useful when we want to decide what to do about something
Causal effects: understanding relationships between variables and effectiveness of policy
interventions
Omitted variable bias: leaving out an important variable and therefore drawing a wrong
conclusion
Types of data:
- Experimental = designed to estimate causal effects
- Observational = not designed for estimating causal effects
- Time dimension = a. time series, b. cross-section; combined they give c. panel data
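Omitted variable bias and the difference between experimental and observational data can be illustrated with a small simulation. This is only a sketch under assumed numbers: the true effect of X on Y (2.0), the effect of the omitted variable Z (3.0) and all variable names are hypothetical, and the slope is computed with the OLS formula that is introduced in Lecture 2.

```python
import random


def ols_slope(x, y):
    """Slope of a simple regression of y on x (OLS)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


random.seed(0)
n = 5000
TRUE_EFFECT = 2.0  # assumed causal effect of X on Y (hypothetical)
Z_EFFECT = 3.0     # assumed effect of the omitted variable Z on Y (hypothetical)

z = [random.gauss(0, 1) for _ in range(n)]      # omitted variable
noise = [random.gauss(0, 1) for _ in range(n)]  # remaining unobserved factors

# Observational data: X is related to Z, and Z is left out of the regression.
x_obs = [zi + random.gauss(0, 1) for zi in z]
y_obs = [TRUE_EFFECT * xi + Z_EFFECT * zi + ei
         for xi, zi, ei in zip(x_obs, z, noise)]

# Experimental data: X is assigned at random, so it is unrelated to Z.
x_exp = [random.gauss(0, 1) for _ in range(n)]
y_exp = [TRUE_EFFECT * xi + Z_EFFECT * zi + ei
         for xi, zi, ei in zip(x_exp, z, noise)]

slope_obs = ols_slope(x_obs, y_obs)  # biased: also picks up the effect of Z
slope_exp = ols_slope(x_exp, y_exp)  # close to the true effect

print(f"observational slope: {slope_obs:.2f}")
print(f"experimental slope:  {slope_exp:.2f}")
```

In this setup the observational slope overstates the effect because X and Z move together, while the randomized version recovers something close to the assumed true effect.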
Ecological fallacy = erroneously drawing conclusions about lower levels of aggregation from higher
levels of aggregation
Conceptualization = the process through which we specify what we mean when we use particular
terms in research
Operationalization = the development of specific research procedures that will result in empirical
observations representing those concepts in the real world; it concerns the measurement of a
theoretical concept
Ordinal: categories have an order
Nominal: categories have no order
Reliability = the quality of a measurement method that suggests the same data would have been
collected each time in repeated observations of the same phenomenon
Validity = a term describing a measure that accurately reflects the concept it is intended to measure
Lecture 2:
Regression models: any relationship between two variables
Linear regression = continuous dependent Y as function of any kind of variable X
Scatterplot: X and Y variables
Positive relationship = the points slope upward to the right
Negative relationship = the points slope downward to the right
Covariance:
- Tells us if X and Y tend to move in the same (+) or opposite (-) directions
- Units = Units of X x Units of Y
Correlation:
- Always between -1 and 1
- Strength of the linear relationship between X and Y
- 0 = uncorrelated
- 1 = perfectly positively correlated
- -1 = perfectly negatively correlated
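The properties of covariance and correlation listed above can be checked with a short calculation. This is a minimal sketch that assumes the sample versions (divisor n - 1) and uses made-up data; the variable names hours and grade are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def sample_covariance(x, y):
    """Sample covariance of x and y, using the n - 1 divisor."""
    x_bar, y_bar = mean(x), mean(y)
    total = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return total / (len(x) - 1)


def correlation(x, y):
    """Covariance divided by the product of the sample standard deviations."""
    s_x = math.sqrt(sample_covariance(x, x))
    s_y = math.sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (s_x * s_y)


# Hypothetical data: hours studied (X) and exam grade (Y).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
grade = [2.0, 4.0, 5.0, 4.0, 5.0]

cov_hours = sample_covariance(hours, grade)  # units: hours x grade points
corr_hours = correlation(hours, grade)       # unit-free, between -1 and 1

# Measuring X in minutes changes the covariance but not the correlation.
minutes = [60.0 * h for h in hours]
cov_minutes = sample_covariance(minutes, grade)
corr_minutes = correlation(minutes, grade)

print(cov_hours, cov_minutes)
print(round(corr_hours, 3), round(corr_minutes, 3))
```

The covariance scales with the units of X (here by a factor of 60), which is why the correlation is the easier number to interpret.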
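Linear regression model: Yi = B0 + B1*Xi + Ui
B0 = intercept, B1 = slope
Ui = error term = vertical distance between the regression line and the observation point
What if we don't know B0 and B1?
OLS estimator: chooses the estimates of B0 and B1 that minimize the sum of squared residuals
Predicted (estimated) values get a hat on top, written here as Yhat_i = B0hat + B1hat*Xi
Residuals = uhat_i = Yi – Yhat_i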
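The OLS estimates, predicted values and residuals can be computed by hand for a small data set. The sketch below uses the standard closed-form formulas for simple regression (slope = sum of cross-deviations divided by sum of squared deviations of X, intercept = mean of Y minus slope times mean of X); the data and the function name ols_fit are hypothetical.

```python
def ols_fit(x, y):
    """Return (intercept, slope) of the OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data: hours studied (X) and exam grade (Y).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
grade = [2.0, 4.0, 5.0, 4.0, 5.0]

b0_hat, b1_hat = ols_fit(hours, grade)  # approximately 2.2 and 0.6

# Predicted values (the "hat" values) and residuals.
predicted = [b0_hat + b1_hat * xi for xi in hours]
residuals = [yi - yhat for yi, yhat in zip(grade, predicted)]

print(b0_hat, b1_hat)
print(residuals)
```

With an intercept in the model, the OLS residuals sum to zero (up to rounding), which is a quick sanity check on the calculation.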
Linear regression = tries more explicitly to assess a cause-and-effect relationship and to quantify
that causal effect
Goodness of fit measures:
- R2 (R-squared)
= proportion of the sample variance of Yi that is explained by Xi
= always between 0 and 1
- Standard error of the regression (SER)
= a large SER means the predictions are often very different from the actual values
In STATA, SER = Root MSE
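Both goodness of fit measures follow directly from the residuals. The sketch below assumes the usual definitions for simple regression: R-squared = 1 - SSR/TSS, and SER = square root of SSR/(n - 2), where SSR is the sum of squared residuals and TSS the total sum of squares of Y. The data are hypothetical.

```python
import math


def goodness_of_fit(x, y):
    """Return (r_squared, ser) for the OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = y_bar - slope * x_bar

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ssr = sum(u ** 2 for u in residuals)      # sum of squared residuals
    tss = sum((yi - y_bar) ** 2 for yi in y)  # total sum of squares

    r_squared = 1.0 - ssr / tss
    ser = math.sqrt(ssr / (n - 2))  # n - 2: two coefficients were estimated
    return r_squared, ser


# Hypothetical data: hours studied (X) and exam grade (Y).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
grade = [2.0, 4.0, 5.0, 4.0, 5.0]

r_squared, ser = goodness_of_fit(hours, grade)
print(round(r_squared, 3), round(ser, 3))
```

The SER is measured in the units of Y, so it can be read as a typical size of the prediction error, while R-squared is unit-free.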
Lecture 3:
OLS assumptions for the simple linear regression model:
1. Zero conditional mean (ZCM): Xi is uncorrelated with Ui, i.e. with the other factors that influence Yi
Most of the time it is difficult to know whether ZCM holds
ZCM also does not hold when there is simultaneous causality (if Y also affects X)
2. Observations are independent and identically distributed (i.i.d.)
Holds if the sample is drawn by simple random sampling
Does NOT hold for time series data or panel data (the same units are observed more than once)
3. Large outliers in X and Y are unlikely
If a large outlier does occur, drop that observation
Otherwise the assumption becomes implausible
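Why large outliers matter for OLS can be seen by adding one extreme observation to a small data set. This is a sketch with hypothetical numbers: the extra point (X = 20, Y = 0) stands in for something like a data entry error.

```python
def ols_slope(x, y):
    """Slope of the OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


# Hypothetical data: hours studied (X) and exam grade (Y).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
grade = [2.0, 4.0, 5.0, 4.0, 5.0]

# Same data plus one large outlier (hypothetical, e.g. a typing error).
hours_outlier = hours + [20.0]
grade_outlier = grade + [0.0]

slope_clean = ols_slope(hours, grade)                    # positive
slope_outlier = ols_slope(hours_outlier, grade_outlier)  # negative

print(round(slope_clean, 3), round(slope_outlier, 3))
```

Because OLS minimizes squared residuals, a single point far from the rest gets a very large weight; here it is enough to flip the sign of the estimated slope.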