
ISYE 6414 - Midterm Exam Questions and Answers, Well Explained (Latest 2024/2025 Update)

Pages: 11
Grade: A+
Uploaded on: 01-09-2024
Written in: 2024/2025

Regression Estimator Properties - Unbiasedness: the property that the expectation of the estimator is exactly the true parameter. What this means is that Beta_1_hat is an unbiased estimator of Beta_1.

Model Parameter Interpretation - A positive value of Beta_1 is consistent with a direct relationship between the predicting variable X and the response variable Y.

Regression Analysis - Regression analysis is a simple way to investigate the relationship between two or more variables in a non-deterministic way.

Response/Target Variable (Y) - The variable we are interested in understanding, modeling, or testing. It is a random variable: it varies with changes in the predictor(s).

Predicting/Explanatory (Independent) Variables (Xs: X1, X2, ...) - Variables we think might be useful in predicting or modeling the response variable. These are treated as fixed: they do not change with the response.

Simple Linear Regression - We fit a straight line that does not pass through the points perfectly. The objective is to fit a non-deterministic linear model between the predicting variable and Y. In simple linear regression, we have 3 parameters to estimate: Beta_0, Beta_1, and the error variance sigma_squared.

Multiple Linear Regression - With two predictors, the fitted model is a plane.

Polynomial Regression - Captures a nonlinear relationship between X and Y.

Objectives of Linear Regression -
1. Prediction: we want to see how the response variable behaves in different settings.
2. Modeling: we are interested in modeling the relationship between the response variable and the explanatory/predicting variables.
3. Testing: we are also interested in testing hypotheses of association relationships.
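The unbiasedness property above can be checked numerically: if we repeatedly simulate data from a known model and refit, the average slope estimate should approach the true Beta_1. A minimal numpy sketch (all parameter values and names here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_slr(x, y):
    """Least-squares estimates for y = beta0 + beta1*x + error."""
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    return b0, b1

# True parameters (chosen for illustration only).
beta0, beta1, sigma = 2.0, 0.5, 1.0
x = np.linspace(0, 10, 50)

# Average the slope estimate over many simulated datasets:
# unbiasedness means this average approaches the true Beta_1.
slopes = []
for _ in range(2000):
    y = beta0 + beta1 * x + rng.normal(0, sigma, size=x.size)
    slopes.append(fit_slr(x, y)[1])

print(round(float(np.mean(slopes)), 2))  # close to 0.5
```

Each individual estimate scatters around 0.5, but the average over many replications sits essentially on top of the true value, which is exactly what E[Beta_1_hat] = Beta_1 claims.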
Simple Linear Regression Assumptions -
• Linearity/Mean-Zero Assumption: the expected value of the errors is zero.
• Constant Variance Assumption: the variance of the error term, sigma_squared, is the same across all error terms.
• Independence Assumption: the error terms are independent random variables, i.e. the deviances (response variables Ys) are independently drawn from the data-generating process. Whether the model under-predicts Y for one particular case tells you nothing about what it does for any other case.
• Normality Assumption: the errors are assumed to be normally distributed.

Linearity Assumption - A violation of this assumption leads to difficulties in estimating Beta_0 and means that your model does not include a necessary systematic component.

Constant Variance Assumption - The model cannot be more accurate in some parts and less accurate in other parts; the variance has to be constant. A violation of this assumption means the estimates are not as efficient as they could be in estimating the true parameters (better estimates could be calculated), and it also results in poorly calibrated prediction intervals.

Independence Assumption - Whether the model under-predicts Y for one particular case must tell you nothing about what it does for any other case. This violation most often occurs in data ordered in time (time series data), where observations near each other in time are similar to each other. Violating this assumption can lead to very misleading assessments of the strength of the regression.

Normality Assumption - Needed if we want to construct any confidence or prediction intervals, or run hypothesis tests. If this assumption is violated, hypothesis tests and confidence and prediction intervals can be very misleading. This assumption is needed for statistical inference.
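Two of these assumptions are easy to probe from the residuals. A rough sketch (simulated data chosen to satisfy the assumptions; the split-in-half variance comparison is a crude stand-in for a residuals-vs-fitted plot):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data satisfying the assumptions (values are illustrative).
x = np.linspace(0, 10, 200)
y = 2.0 + 0.5 * x + rng.normal(0, 1.0, size=x.size)

# Fit by least squares and compute residuals.
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
resid = y - (b0 + b1 * x)

# Mean-zero: least-squares residuals average to ~0 by construction,
# so this check is really about the population errors, not the fit.
print(abs(resid.mean()) < 1e-10)  # True

# Crude constant-variance check: compare residual spread in the two
# halves of the x range; similar values are consistent with the assumption.
lo, hi = resid[x < 5].std(), resid[x >= 5].std()
print(0.5 < lo / hi < 2.0)  # True
```

With heteroscedastic errors (e.g. noise scaled by x), the two half-sample standard deviations would diverge sharply, flagging a constant-variance violation.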
Autocorrelation - Time-related correlation is often called autocorrelation.

Error Term - The error term's variance is also a parameter in linear regression. Epsilon is the deviance of the data from the linear model. The error term is also normally distributed.

Sample Variance Estimation - In simple linear regression, the estimator of sigma_squared has a chi-squared distribution with n - 2 degrees of freedom (one degree of freedom is lost for each estimated coefficient). For the ordinary sample variance, we lose only one degree of freedom, ending up with a chi-squared distribution with n - 1 degrees of freedom.

Model Parameter Interpretation (continued) - A negative value of Beta_1 is consistent with an inverse relationship between X and Y. When Beta_1 is close to zero, we interpret that there is not a significant association between the predicting variable X and the response variable Y.

Interpreting Estimated Coefficients - Beta_1_hat is the estimated expected change in Y associated with a 1-unit change in X. Beta_0_hat is the estimated expected value of Y when X equals 0. Beta_1_hat is a combination of normally distributed random variables, and is thus normally distributed.

Confidence Interval - This is based on the assumption that we have normality, i.e. the data/deviances are normally distributed.

Hypothesis Testing - Null hypothesis: Beta_1 is equal to 0. Alternative hypothesis: Beta_1 is not equal to zero. To test whether Beta_1 is equal to 0 or not, we use the t-test (to test for statistical significance). If the t-value is large, reject the null hypothesis that Beta_1 equals zero; if the null hypothesis is rejected, we interpret Beta_1 as statistically significant. To test whether Beta_1 equals some other constant versus not equal to that constant, we use the corresponding p-value.
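The variance estimate with n - 2 degrees of freedom and the t-test for Beta_1 fit together in a few lines. A sketch using simulated data (true parameter values are made up for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Illustrative data with a genuine nonzero slope.
x = np.linspace(0, 10, 30)
y = 1.0 + 0.8 * x + rng.normal(0, 1.0, size=x.size)
n = x.size

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()
resid = y - (b0 + b1 * x)

# Variance estimate: divide by n - 2, one df lost per estimated coefficient.
sigma2_hat = np.sum(resid ** 2) / (n - 2)

# t-statistic for H0: Beta_1 = 0, and its two-sided p-value
# from the t distribution with n - 2 degrees of freedom.
se_b1 = np.sqrt(sigma2_hat / sxx)
t_stat = b1 / se_b1
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)

print(p_value < 0.01)  # True: the slope is statistically significant
```

To test Beta_1 against a nonzero constant c instead, the only change is the numerator of the t-statistic: (b1 - c) / se_b1, with the same n - 2 degrees of freedom.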
If the p-value is small (e.g. p < 0.01), reject the null hypothesis. If we want to test whether Beta_1 is positive or negative, we use one tail of the p-value: Beta_1 > 0 --> right tail; Beta_1 < 0 --> left tail.

Statistical Significance - Statistical significance means that Beta_1 is statistically different from 0. If we reject the null hypothesis that Beta_1 is zero, that means Beta_1 is statistically significant.

P-value - The p-value is a measure of how rejectable the null hypothesis is: the smaller the p-value, the more rejectable the null hypothesis is for the observed data. The p-value is NOT the probability of rejecting the null hypothesis, and it is NOT the probability that the null hypothesis is true.

Estimation - If x* is one of the observed settings of the predicting variable, we use estimation. The estimated regression line gives the average estimated mean response over all settings under which the predicting variable equals x*. In estimation, we average across all possible settings.

Prediction - If x* is a new observation of the predicting variable under a new setting, we use prediction: the estimated mean response for one particular setting under which the predicting variable equals x*. In prediction, we focus on one particular setting. Prediction contains 2 sources of uncertainty: (1) the new observation itself, and (2) the parameter estimates of Beta_0 and Beta_1 (the same as in estimation).

Estimation vs Prediction - The uncertainty in estimation comes from the parameter estimation alone, whereas the uncertainty in prediction comes from both the estimation of the regression parameters and the newness of the observation.

Outliers - Any data point that is far from the majority of the data (in X, Y, or both).

Leverage Points - Data points that are far from the mean of the X's.

Influential Points - Data points whose removal substantially changes the fitted model; these are typically far from the mean of both the X's and the Y's.
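The estimation-vs-prediction distinction shows up directly in the interval formulas: the prediction standard error carries an extra "+1" term for the new observation's own noise, so prediction intervals are always wider. A sketch at an illustrative point x* = 5 (data simulated for demonstration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

x = np.linspace(0, 10, 40)
y = 1.0 + 0.8 * x + rng.normal(0, 1.0, size=x.size)
n = x.size

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()
sigma2_hat = np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2)

x_star = 5.0
t_crit = stats.t.ppf(0.975, df=n - 2)

# Standard error of the estimated mean response at x_star ...
se_mean = np.sqrt(sigma2_hat * (1 / n + (x_star - x.mean()) ** 2 / sxx))
# ... versus a new single observation: the leading "1 +" is the extra
# uncertainty contributed by the new observation's own error term.
se_pred = np.sqrt(sigma2_hat * (1 + 1 / n + (x_star - x.mean()) ** 2 / sxx))

ci_half = t_crit * se_mean  # confidence (estimation) interval half-width
pi_half = t_crit * se_pred  # prediction interval half-width

print(pi_half > ci_half)  # True: prediction intervals are always wider
```

As n grows, the confidence half-width shrinks toward zero, but the prediction half-width levels off near t_crit * sigma: the new observation's noise never averages away.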

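Leverage has a closed form in simple linear regression, h_i = 1/n + (x_i - x_bar)^2 / Sxx, which makes the "far from the mean of the X's" definition concrete. A sketch with one deliberately far-out x value (the data are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)

# Mostly clustered x values, plus one point far from the x mean:
# a high-leverage point by construction.
x = np.append(rng.normal(5.0, 1.0, size=20), 15.0)
n = x.size

# Leverage of observation i in simple linear regression:
# h_i = 1/n + (x_i - x_bar)^2 / Sxx
sxx = np.sum((x - x.mean()) ** 2)
h = 1 / n + (x - x.mean()) ** 2 / sxx

# The far-out point has by far the largest leverage.
print(int(np.argmax(h)) == n - 1)  # True
```

A high-leverage point is only *influential* if its y value also pulls the fitted line away from the rest of the data; leverage measures potential influence, which is why the notes treat the two as separate concepts.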
Institution: ISYE 6414
Course: ISYE 6414











Document information
Type: Exam (elaborations)
Contains: Questions & answers

