Exam (elaborations) · STAT310 · 25 pages · Grade A · Uploaded 03-02-2025 · Written in 2024/2025
STAT310 Exam Terminology Questions and
Comprehensive Answers Graded A 2024-2025
How are parameters estimated for linear regression? - ✔️✔️Method of least squares

What is the multiple linear regression model? - ✔️✔️Yi = B0 + B1Xi1 + ... + BpXip + ei

- ei are the random errors
- Yi is the response for the ith case
- There are p predictor variables X1, X2, ..., Xp

Least squares estimates - ✔️✔️The parameter values that minimize the residual sum of squares

Residual - ✔️✔️Actual value minus predicted value (Yi − μ̂i)

Residual sum of squares (RSS) - ✔️✔️The squared residuals (Yi − μ̂i)² summed over all cases:

RSS = Σ (Yi − μ̂i)²

We want a small RSS, which indicates a better-fitting model (minimize the sum of squares)
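These definitions can be checked numerically. A minimal sketch with simulated data (the trend, noise level, and all variable names here are illustrative, not from the course):

```python
import numpy as np

# Simulated data with a known linear trend plus noise (illustrative only).
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=50)

# Least squares picks the coefficients that minimize the RSS.
X = np.column_stack([np.ones_like(x), x])        # intercept + predictor
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

fitted = X @ beta_hat
residuals = y - fitted                 # residual = actual - predicted
rss = np.sum(residuals ** 2)           # residual sum of squares
```

With an intercept in the model, the residuals sum to (numerically) zero, and any other coefficient vector would give a larger RSS.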

Factor - ✔️✔️A categorical variable that can be incorporated into regression models via dummy variables. A factor with k levels is coded with k − 1 dummy variables, since one level serves as the reference level
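The k − 1 coding can be sketched with pandas (the factor values and column names below are made up for illustration):

```python
import pandas as pd

# A factor with k = 3 levels (A, B, C); drop_first makes "A" the
# reference level, leaving k - 1 = 2 dummy variables.
treatment = pd.Series(["A", "B", "C", "B", "A"], name="treatment")
dummies = pd.get_dummies(treatment, prefix="treatment", drop_first=True)
# Rows where treatment == "A" have zeros in both dummy columns.
```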

What is matrix formulation of the linear model? - ✔️✔️Y = XB + e

- Y is the response vector
- X is the design matrix
- B is the vector of p+1 (including intercept) regression parameters
- e is the vector of error terms

Ex: Y = [Y1, Y2, ...], B = [B0, B1, ...], e = [e1, e2, ...]
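In this matrix form the least squares estimate has the standard closed form B̂ = (X'X)⁻¹X'Y (the normal equations). A sketch with simulated data (the true coefficients and noise scale are made up):

```python
import numpy as np

# Simulate Y = XB + e and recover B via the normal equations.
rng = np.random.default_rng(1)
n, p = 100, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])  # design matrix
beta = np.array([1.0, 2.0, -1.0])                           # [B0, B1, B2]
e = rng.normal(scale=0.1, size=n)                           # error vector
Y = X @ beta + e

# B_hat = (X'X)^{-1} X'Y, computed via a linear solve rather
# than an explicit inverse.
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
```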

F-test hypotheses - ✔️✔️Ho: the response is not related to any of the predictors (B1 = B2 = ... = Bp = 0)

Ha: the response Y is related to at least one of the predictors in the model
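The F statistic behind these hypotheses compares the intercept-only (null) model to the full model via F = [(RSS_null − RSS_full)/p] / [RSS_full/(n − p − 1)]; this is the standard formula, and the data below are simulated for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 60, 3
X = rng.normal(size=(n, p))
y = 1.0 + 2.0 * X[:, 0] + rng.normal(size=n)   # only one real predictor

Xd = np.column_stack([np.ones(n), X])
beta_hat, *_ = np.linalg.lstsq(Xd, y, rcond=None)
rss_full = np.sum((y - Xd @ beta_hat) ** 2)
rss_null = np.sum((y - y.mean()) ** 2)          # intercept-only model

# A large F is evidence against Ho (compare to an F(p, n-p-1) distribution).
F = ((rss_null - rss_full) / p) / (rss_full / (n - p - 1))
```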

Bias-Variance tradeoff - ✔️✔️As model complexity increases (more predictors), bias decreases but variance increases

- We want to control both bias and variance

Bias - ✔️✔️Bias(μ̂) = E[μ̂] − μ

Bias arises from model misspecification (e.g. the data follow a banana-shaped curve but we fit a linear regression): the WRONG MODEL

Variance - ✔️✔️Var(μ̂) = SE(μ̂)²

Arises due to noise when estimating the regression coefficients

Mean squared error (MSE) - ✔️✔️The overall error in estimation:

MSE(μ̂) = E[(μ̂ − μ)²] = Bias(μ̂)² + Var(μ̂)
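The decomposition can be verified by simulation. Here a deliberately biased estimator (the sample mean shrunk by 0.8) is used purely as an illustration; all values are made up:

```python
import numpy as np

# Monte Carlo check of MSE = Bias^2 + Variance.
rng = np.random.default_rng(3)
mu, sigma, n, reps = 5.0, 2.0, 20, 20000

samples = rng.normal(mu, sigma, size=(reps, n))
estimates = 0.8 * samples.mean(axis=1)          # mu_hat, one per replicate

bias = estimates.mean() - mu                    # Bias(mu_hat) = E[mu_hat] - mu
variance = estimates.var()                      # Var(mu_hat)
mse = np.mean((estimates - mu) ** 2)            # E[(mu_hat - mu)^2]
# mse equals bias**2 + variance up to floating-point error.
```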

Xij means - ✔️✔️The value of variable j for the ith individual (record i, variable j)

What happens to bias and variance when the sample size (n) increases? - ✔️✔️The
variance decreases and bias stays the same

"Sample size increase squashes variance but does nothing to bias"

Collinearity - ✔️✔️One of the columns in the design matrix is a linear combination of the
others. This makes it very difficult to distinguish between the effects of the variables in
the model. It is hard to get good estimates of the parameters with least squares.

- Collinearity gets worse as the ratio of p/n increases

- HIGH VARIANCE AS A RESULT

- You have perfect collinearity when p >= n
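One way to see the resulting high variance is through the condition number of X'X, which blows up when a column is nearly a linear combination of the others. A sketch with simulated data (the 0.01 noise scale and variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)   # x2 is almost a copy of x1

X_collinear = np.column_stack([np.ones(n), x1, x2])
X_clean = np.column_stack([np.ones(n), x1, rng.normal(size=n)])

# A huge condition number of X'X means least squares estimates
# are numerically unstable and have inflated variance.
cond_collinear = np.linalg.cond(X_collinear.T @ X_collinear)
cond_clean = np.linalg.cond(X_clean.T @ X_clean)
```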

P > n - ✔️✔️When you have more predictors (p) than observations (n), you cannot get standard errors for the regression coefficients or unique least squares parameter estimates (see MRNA data example)

- The method of least squares for parameter estimation will not work: you will get a residual sum of squares of zero and no estimates of the standard errors or parameters. Regularized regression techniques like ridge and lasso will give you a unique parameter estimate even when p > n. VERY HELPFUL!!!
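The ridge fix can be sketched directly in the matrix formulation: when p > n, X'X is rank deficient and (X'X)⁻¹X'y does not exist, but (X'X + λI)⁻¹X'y is always unique for λ > 0. The data and λ below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 20, 50                      # more predictors than observations
X = rng.normal(size=(n, p))
y = X[:, 0] + rng.normal(scale=0.1, size=n)

# X'X is singular here (rank at most n < p), yet the ridge system
# (X'X + lam*I) is invertible, so beta_ridge is uniquely defined.
lam = 1.0
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
```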

Each time we add a variable to the model, RSS will... - ✔️✔️never increase.

So adding a variable to a model will improve the fit (less bias) but will increase variance. So there is a tradeoff!
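The "RSS never increases" claim is easy to demonstrate: even a pure-noise predictor cannot raise RSS, because least squares could always set its coefficient to zero. A sketch with simulated data:

```python
import numpy as np

def rss(X, y):
    """RSS of the least squares fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ beta) ** 2)

rng = np.random.default_rng(6)
n = 40
X_small = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X_small @ np.array([1.0, 2.0]) + rng.normal(size=n)

# Append a predictor that is pure noise: RSS still cannot go up.
X_big = np.column_stack([X_small, rng.normal(size=n)])
rss_small, rss_big = rss(X_small, y), rss(X_big, y)
```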

Principle of Parsimony (Occam's Razor) - ✔️✔️With all things being equal, simple models
are better than complex ones

Complex models have what? - ✔️✔️High variance, but low bias

Variable selection - ✔️✔️It is a way of trying to find the balance between model fit and
model complexity

Information Criteria - ✔️✔️- Can think of improvement in model fit in terms of the information it provides about the response

- Only add a variable if it contributes sufficient additional information about the response to warrant the additional model complexity

Q is the overall quality of the model; we want to minimize Q (small Q = better model):

Q = Badness of fit + k × Number of predictors + constant

AIC - ✔️✔️Akaike Information criterion

For a linear model with unknown error variance:

AIC = nlog(RSS/n) + 2p + constant

So adding a single numeric predictor to the regression model cannot increase the AIC
by more than 2 units

- AIC for linear models is equivalent to Mallows' Cp variable selection. WE CHOOSE THE MODEL WITH THE SMALLEST AIC
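The AIC formula above is simple to apply directly. In this sketch the RSS values are hypothetical, chosen so that adding one predictor buys only a tiny fit improvement, which the +2 complexity penalty outweighs:

```python
import numpy as np

def aic_linear(rss, n, p):
    """AIC for a linear model with unknown error variance,
    up to an additive constant: n*log(RSS/n) + 2p."""
    return n * np.log(rss / n) + 2 * p

# Hypothetical RSS values for two nested models on n = 50 cases.
n = 50
aic_1 = aic_linear(rss=120.0, n=n, p=1)
aic_2 = aic_linear(rss=118.0, n=n, p=2)   # one extra predictor, tiny gain
# Since RSS never increases when a predictor is added, the log term
# can only shrink, so AIC can rise by at most the 2-unit penalty.
```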

Stepwise Variable Selection with AIC - ✔️✔️- An alternative to exhaustively choosing the subset of predictors with minimum AIC over all possible models. The exhaustive approach is very time-consuming because there are 2^p different models to consider, so with p = 10 predictors there are 2^10 = 1024 different models

- DO STEPWISE VARIABLE SELECTION

1. Choose initial model (typically null model with no predictors or full model with all
predictors)
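A minimal forward stepwise sketch in this spirit: start from the null (intercept-only) model and greedily add the predictor that lowers AIC most, stopping when no addition helps. All data and names here are illustrative, and p in the AIC counts the model's columns including the intercept, which only shifts every AIC by the same constant:

```python
import numpy as np

def aic(X, y):
    n, p = X.shape                      # p counts columns incl. intercept
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    return n * np.log(rss / n) + 2 * p

def forward_stepwise(X, y):
    """Greedy forward selection driven by AIC."""
    n, p = X.shape
    selected, remaining = [], list(range(p))
    current = np.ones((n, 1))           # null model: intercept only
    best_aic = aic(current, y)
    improved = True
    while improved and remaining:
        improved = False
        # Score every candidate one-variable extension of the model.
        scores = [(aic(np.column_stack([current, X[:, [j]]]), y), j)
                  for j in remaining]
        cand_aic, j = min(scores)
        if cand_aic < best_aic:         # keep the addition only if AIC drops
            best_aic = cand_aic
            current = np.column_stack([current, X[:, [j]]])
            selected.append(j)
            remaining.remove(j)
            improved = True
    return selected, best_aic

# Only predictors 2 and 4 carry signal in this simulated example.
rng = np.random.default_rng(7)
X = rng.normal(size=(80, 6))
y = 3.0 * X[:, 2] - 2.0 * X[:, 4] + rng.normal(size=80)
selected, best = forward_stepwise(X, y)
```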