Three drawbacks of linear models - correct answer ✔✔1. Heteroskedasticity
2. Meaningless residuals
3. Because model is arbitrary, poor fitted values
AR(1) - correct answer ✔✔Only the immediately preceding value, y_{t-1}, is used to predict y_t
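A minimal sketch of the AR(1) idea, assuming the model y_t = b0 + b1*y_{t-1} + e_t fitted by ordinary least squares on the lagged series. The function names (`fit_ar1`, `forecast_next`) and the noise-free toy series are illustrative assumptions, not part of the original notes.

```python
def fit_ar1(y):
    """Least-squares fit of y_t = b0 + b1 * y_{t-1} + e_t."""
    lagged = y[:-1]    # y_{t-1}
    current = y[1:]    # y_t
    n = len(lagged)
    lag_bar = sum(lagged) / n
    cur_bar = sum(current) / n
    sxx = sum((a - lag_bar) ** 2 for a in lagged)
    sxy = sum((a - lag_bar) * (c - cur_bar) for a, c in zip(lagged, current))
    b1 = sxy / sxx
    b0 = cur_bar - b1 * lag_bar
    return b0, b1


def forecast_next(y, b0, b1):
    """One-step forecast: only the most recent value is used."""
    return b0 + b1 * y[-1]


# Toy series generated without noise from y_t = 2 + 0.5 * y_{t-1}
series = [10.0]
for _ in range(20):
    series.append(2.0 + 0.5 * series[-1])

b0, b1 = fit_ar1(series)
next_value = forecast_next(series, b0, b1)
```

Because the toy series has no noise, the fit should recover the generating coefficients up to rounding error.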
stationary - correct answer ✔✔The behavior of the series does not depend on time: the mean and variance do not depend
on t and are the same at every time point;
the covariance between y_t and y_s depends only on the lag t-s
Control Chart - correct answer ✔✔a time-ordered diagram that is used to determine whether observed
variations are abnormal; it can also detect trends over time, which indicate nonstationarity
xbar chart - correct answer ✔✔A plot of sample means over time used to assess whether a process is in
control.
R-chart - correct answer ✔✔a plot of sample ranges over time used to monitor the variability of a process (time series)
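A rough sketch of how the xbar chart and R-chart might be computed together, assuming subgroups of size 5 and the usual tabulated constants for that size (A2 = 0.577, D3 = 0, D4 = 2.114, rounded). The control-limit formulas, the function names, and the data are assumptions added for illustration; the notes themselves only define the charts.

```python
# Tabulated control-chart constants for subgroups of size 5 (rounded).
A2, D3, D4 = 0.577, 0.0, 2.114


def chart_limits(baseline):
    """Return (LCL, center, UCL) for the xbar chart and for the R-chart."""
    means = [sum(s) / len(s) for s in baseline]
    ranges = [max(s) - min(s) for s in baseline]
    grand_mean = sum(means) / len(means)
    r_bar = sum(ranges) / len(ranges)
    xbar_limits = (grand_mean - A2 * r_bar, grand_mean, grand_mean + A2 * r_bar)
    r_limits = (D3 * r_bar, r_bar, D4 * r_bar)
    return xbar_limits, r_limits


def in_control(sample, xbar_limits, r_limits):
    """True if both the sample mean and the sample range fall inside the limits."""
    mean = sum(sample) / len(sample)
    rng = max(sample) - min(sample)
    return (xbar_limits[0] <= mean <= xbar_limits[2]
            and r_limits[0] <= rng <= r_limits[2])


# Hypothetical baseline subgroups used to set the limits
baseline = [[10, 12, 11, 9, 13], [11, 10, 12, 10, 12], [9, 11, 10, 12, 13]]
xbar_limits, r_limits = chart_limits(baseline)

steady = in_control([11, 12, 10, 11, 12], xbar_limits, r_limits)
shifted = in_control([20, 21, 19, 22, 23], xbar_limits, r_limits)
```

Here `steady` stays inside both sets of limits, while `shifted` has a sample mean far above the upper limit of the xbar chart.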
Logit and probit graphs - correct answer ✔✔Are very similar
Pearson chi square statistic - correct answer ✔✔Pearson Residual = ei = (yi-µi)/√(∅v(µi))
Pearson Chi-Square Statistic = ∑ei²
- a large value means overdispersion is more severe
- to address overdispersion, inflate the variance by a factor δ
- δ = Pearson Chi-Square Statistic/(n-p-1)
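The formulas above can be sketched as follows, assuming a Poisson-type variance function v(µ) = µ with ∅ = 1 by default. The function name `pearson_dispersion` and the fitted means are hypothetical values chosen to keep the arithmetic easy to follow.

```python
import math


def pearson_dispersion(y, mu, p, variance=lambda m: m, phi=1.0):
    """Pearson residuals, Pearson chi-square statistic, and dispersion estimate.

    p is the number of predictors, so n - p - 1 is the residual degrees of freedom.
    """
    residuals = [(yi - mi) / math.sqrt(phi * variance(mi)) for yi, mi in zip(y, mu)]
    chi_square = sum(e ** 2 for e in residuals)
    delta = chi_square / (len(y) - p - 1)
    return residuals, chi_square, delta


y = [2, 0, 5, 3]
mu = [1.0, 1.0, 4.0, 4.0]   # hypothetical fitted means, not from a real fit
residuals, chi_square, delta = pearson_dispersion(y, mu, p=1)
```

With these numbers the residuals are 1, -1, 0.5 and -0.5, so the statistic is 2.5 and δ = 2.5/2 = 1.25; a value of δ well above 1 would suggest overdispersion.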
Cook's distance - correct answer ✔✔Di = [ei²hi]/[MSE(p+1)(1-hi)²]
when given the standardized residual ri:
Di = [ri²hi]/[(p+1)(1-hi)]
where standardized residual ri = ei/(s√(1-hi)) and s² = MSE
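A quick numeric check that the residual form and the standardized-residual form of Cook's distance agree. The input values (residual, leverage, MSE, p) are made up for illustration, and the function names are assumptions.

```python
import math


def cooks_from_residual(e, h, mse, p):
    """Cook's distance from the raw residual e and leverage h."""
    return e ** 2 * h / (mse * (p + 1) * (1 - h) ** 2)


def cooks_from_standardized(r, h, p):
    """Cook's distance from the standardized residual r and leverage h."""
    return r ** 2 * h / ((p + 1) * (1 - h))


e, h, mse, p = 2.0, 0.2, 4.0, 1               # hypothetical inputs
r = e / (math.sqrt(mse) * math.sqrt(1 - h))   # standardized residual

d_raw = cooks_from_residual(e, h, mse, p)
d_std = cooks_from_standardized(r, h, p)
```

Both forms give 0.15625 here, since substituting r² = e²/(MSE(1-h)) into the second form reproduces the first.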
Lag k autocorrelation of white noise process - correct answer ✔✔Is always zero, for every lag k ≥ 1
In SLR, the F-statistic is always.. - correct answer ✔✔the square of the t-statistic
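A small check of this identity on made-up data, assuming the t-statistic in question is the one for the slope. Variable names and data are illustrative only.

```python
import math

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx                  # slope
b0 = y_bar - b1 * x_bar         # intercept

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
sst = sum((yi - y_bar) ** 2 for yi in y)
mse = sse / (n - 2)

f_stat = (sst - sse) / mse             # regression SS has 1 df in SLR
t_stat = b1 / math.sqrt(mse / sxx)     # t-statistic for the slope
```

For this data the slope is 0.6, F = 4.5 and t ≈ 2.1213, so t² matches F.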
Tweedie distribution - correct answer ✔✔E[Y]= µ
Var[Y]=∅µ^d, 1<d<2
Inverse Gaussian distribution - correct answer ✔✔E[Y]=(-2θ)^(-1/2)
Var[Y]=∅(-2θ)^(-3/2)
Leverage formula - correct answer ✔✔SLR:
hi = 1/n + [(xi-x_bar)²]/[∑(xj-x_bar)²]
MLR:
hi = SE(ŷi)²/MSE
1/n ≤ hi ≤ 1
∑hi = p+1
hi are the diagonal values of the hat matrix X(X^TX)^(-1)X^T
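A short sketch of the SLR leverage formula on a made-up predictor with one outlying x value, to illustrate the bounds and the sum ∑hi = p+1 = 2. The function name `slr_leverages` is an assumption.

```python
def slr_leverages(x):
    """Leverage of each observation in simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]


x = [1, 2, 3, 4, 10]     # the last point sits far from the mean of x
h = slr_leverages(x)
total = sum(h)
```

Here the leverages are 0.38, 0.28, 0.22, 0.20 and 0.92: the point at the mean of x has the minimum leverage 1/n, and the outlying point has leverage close to 1.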
AIC formula - correct answer ✔✔Linear model:
[SSE+2p*MSE]/[n*MSE]
Non-linear model:
-2l(b)+2(# of estimated parameters)
BIC formula - correct answer ✔✔Linear model:
[SSE + ln(n)*p*MSE]/[n*MSE]
Non-linear model:
-2l(b)+ln(n)*(# of estimated parameters), where l(b) is the maximized log-likelihood
Mallows Cp - correct answer ✔✔[SSE + 2p*MSE]/[n]
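The linear-model versions of the three criteria can be sketched side by side, taking Cp = [SSE + 2p*MSE]/n and treating MSE as the error-variance estimate. The function names and input numbers are hypothetical.

```python
import math


def mallows_cp(sse, mse, n, p):
    return (sse + 2 * p * mse) / n


def aic_linear(sse, mse, n, p):
    return (sse + 2 * p * mse) / (n * mse)


def bic_linear(sse, mse, n, p):
    return (sse + math.log(n) * p * mse) / (n * mse)


sse, mse, n, p = 100.0, 5.0, 20, 3   # hypothetical values
cp = mallows_cp(sse, mse, n, p)
aic = aic_linear(sse, mse, n, p)
bic = bic_linear(sse, mse, n, p)
```

With these inputs Cp = 6.5 and AIC = 1.3, so AIC is just Cp divided by MSE. BIC is larger than AIC because ln(n) > 2 whenever n > 7, which penalizes extra predictors more heavily.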
VIF - correct answer ✔✔VIFj = 1/[1-Rj²] = [sxj²(n-1)/MSE]*se(bj)²
where Rj² is the R² from regressing xj on the other predictors
tolerance is the reciprocal of VIF
** test for collinearity **
VIF > 10 signals severe multicollinearity
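A minimal sketch for the two-predictor case, where the R² from regressing one predictor on the other is simply their squared correlation. The data and function names are made up for illustration.

```python
import math


def correlation(a, b):
    """Sample correlation between two equal-length lists."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when the model has exactly two."""
    r_squared = correlation(x1, x2) ** 2
    return 1 / (1 - r_squared)


x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
vif = vif_two_predictors(x1, x2)
tolerance = 1 / vif
```

The correlation here is 0.8, so the VIF is 1/0.36 ≈ 2.78 and the tolerance is 0.36, well below the VIF > 10 rule of thumb.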
box plot median - correct answer ✔✔is the bold line in IQR box
matrix for coefficients (SLR) - correct answer ✔✔b = (X^TX)^(-1)X^Ty
- the column of all 1's in X corresponds to the intercept
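A hand-rolled sketch of b = (X^TX)^(-1)X^Ty for SLR, where X has a column of 1's and a column of x values, so X^TX is a 2x2 matrix that can be inverted directly. The function name and toy data are assumptions.

```python
def slr_normal_equations(x, y):
    """Solve b = (X^T X)^(-1) X^T y for X = [1, x]."""
    n = len(x)
    sum_x = sum(x)
    sum_xx = sum(xi * xi for xi in x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))

    # X^T X = [[n, sum_x], [sum_x, sum_xx]],  X^T y = [sum_y, sum_xy]
    det = n * sum_xx - sum_x * sum_x
    b0 = (sum_xx * sum_y - sum_x * sum_xy) / det   # intercept (column of 1's)
    b1 = (n * sum_xy - sum_x * sum_y) / det        # slope
    return b0, b1


x = [0, 1, 2, 3]
y = [1, 3, 5, 7]       # generated exactly from y = 1 + 2x
b0, b1 = slr_normal_equations(x, y)
```

Since the data lie exactly on a line, the solution recovers intercept 1 and slope 2.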
Lasso Regression - correct answer ✔✔goal is to minimize SSE + λ∑|bj|
equivalently, minimize SSE subject to ∑|bj| <= a
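A simplified sketch of the lasso penalty at work, assuming a single predictor and no intercept so the minimizer of SSE + λ|b| has a closed form: b = S(∑xy, λ/2)/∑x², where S is the soft-thresholding operator. The one-predictor setup, function names, and data are illustrative assumptions; real lasso fits with several predictors are typically computed iteratively.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t, returning exactly 0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_one_predictor(x, y, lam):
    """Minimizer of sum((y - b*x)^2) + lam*|b| for one predictor, no intercept."""
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    return soft_threshold(sum_xy, lam / 2) / sum_xx


x = [1, 2, 3]
y = [2, 4, 6]          # exactly y = 2x, so the unpenalized slope is 2

b_ols = lasso_one_predictor(x, y, lam=0)       # no penalty
b_mid = lasso_one_predictor(x, y, lam=14)      # moderate penalty
b_zero = lasso_one_predictor(x, y, lam=56)     # penalty large enough to zero out b
```

The slope shrinks from 2 to 1.5 and then to exactly 0 as λ grows, which is why lasso can perform variable selection.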