ISYE 6501 Midterm EXAM QUESTIONS AND SOLUTIONS LATEST UPDATE 2023/2024
Factor-based models
- Used for classification, clustering, and regression. Implicitly assumes we have a lot of candidate factors for the final model.

Why limit the number of factors in a model? Two reasons:
- Overfitting: when the number of factors is close to or larger than the number of data points, the model may fit too closely to random effects.
- Simplicity: simpler models are usually better.

Classical variable selection approaches (greedy algorithms)
1. Forward selection
2. Backward elimination
3. Stepwise regression

Forward selection (variable selection; classical)
- Start with a model with no factors; at each step find the best new factor and add it to the model.
- Continue until no remaining factor is good enough to add, or a threshold on the number of factors is satisfied.
- Remove factors at the end that turn out not to be good enough.

Backward elimination (variable selection; classical)
- Opposite of forward selection. Start with a model containing all factors; at each step find the worst factor and remove it from the model.
- Continue until no remaining factor is bad enough to remove, or a threshold on the number of factors is satisfied.

Stepwise regression (variable selection; classical)
- Combination of forward selection and backward elimination. Start with all factors or with no factors; at each step add or remove a factor.
- After adding a new factor, immediately eliminate any factors that are no longer good enough, because goodness values change as new factors enter the model. This helps the model adjust as factors are added. (A greedy forward-selection sketch follows this section.)

Ways of determining whether factors are good enough in variable selection
- p-value, R-squared, AIC, BIC

Greedy algorithm
- At each step, does the one thing that looks best without taking future options into consideration. Good for initial analysis.
- Forward selection, backward elimination, and stepwise regression are all greedy.

Global variable selection approaches
1. LASSO
2. Elastic Net
- Slower, but tend to give better predictive models.

LASSO (variable selection; global)
- SCALE the data (as with any constrained sum of coefficients).
- Add a constraint to the standard regression: minimize the sum of squared errors subject to the sum of the coefficients' absolute values being at most T.
- T = limit or "budget" on how large the sum of absolute coefficient values can get; the budget will be spent on the most important coefficients.
- Method for limiting the number of variables in a model by limiting the sum of all coefficients' absolute values. Can be very helpful when the number of data points is less than the number of factors.

Elastic Net (variable selection; global)
- SCALE the data (as with any constrained sum of coefficients).
- T = limit or "budget" on a weighted combination of the sum of the coefficients' absolute values and the sum of their squares; the budget will be spent on the most important coefficients.
- Combination of LASSO and ridge regression: the variable-selection benefits of LASSO plus the predictive benefits of ridge regression.

Ridge regression
- Method of regularization that limits the sum of the squares of the coefficients. It reduces the magnitude of the coefficients, not the number of variables chosen.
- The quadratic constraint tends to shrink the coefficient values: whatever the basic regression coefficients would be, the quadratic constraint pushes them toward zero, i.e., regularizes them. (A sketch comparing LASSO, ridge, and elastic net follows this section.)
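The greedy selection procedures above can be written in a few lines. Below is a minimal sketch of forward selection, assuming synthetic data and using Python with statsmodels and AIC as the goodness measure (the course itself works in R; the factor names x0..x7 are made up for illustration).

```python
# Minimal greedy forward-selection sketch (illustration only, not the course's R code).
# AIC is used as the "goodness" measure; p-values, R-squared, or BIC would also work.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, p = 100, 8
X = pd.DataFrame(rng.normal(size=(n, p)), columns=[f"x{i}" for i in range(p)])
y = 3 * X["x0"] - 2 * X["x3"] + rng.normal(size=n)      # only x0 and x3 matter

selected, remaining = [], list(X.columns)
best_aic = sm.OLS(y, np.ones((n, 1))).fit().aic          # intercept-only baseline

while remaining:
    # Greedy step: try adding each remaining factor and keep the single best one.
    trial_aic = {f: sm.OLS(y, sm.add_constant(X[selected + [f]])).fit().aic
                 for f in remaining}
    best_factor = min(trial_aic, key=trial_aic.get)
    if trial_aic[best_factor] >= best_aic:               # no factor good enough to add
        break
    best_aic = trial_aic[best_factor]
    selected.append(best_factor)
    remaining.remove(best_factor)

print("selected factors:", selected)                     # typically ['x0', 'x3']
```

Backward elimination and stepwise regression follow the same pattern, starting from the full model or alternating add/remove steps.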
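The following is a minimal sketch of the LASSO / ridge / elastic net comparison described above, using scikit-learn as an assumption (the lectures use R's glmnet). Note that the data is scaled first, as required for any constrained-sum-of-coefficients method; the data and penalty values are made up.

```python
# Minimal sketch comparing LASSO, ridge, and elastic net (scikit-learn).
# Scaling matters: the constraint budget is shared across coefficients,
# so all factors must be on comparable scales.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Lasso, Ridge, ElasticNet

rng = np.random.default_rng(1)
n, p = 80, 15
X = rng.normal(size=(n, p)) * rng.uniform(1, 50, size=p)   # factors on mixed scales
y = 0.2 * X[:, 0] + 0.1 * X[:, 1] + rng.normal(size=n)

X_scaled = StandardScaler().fit_transform(X)                # SCALE the data first

models = {
    "lasso":       Lasso(alpha=0.5),                        # shrinks AND zeroes coefficients
    "ridge":       Ridge(alpha=10.0),                       # shrinks but keeps all coefficients
    "elastic net": ElasticNet(alpha=0.5, l1_ratio=0.5),     # mix of both penalties
}
for name, model in models.items():
    coef = model.fit(X_scaled, y).coef_
    print(f"{name:12s} nonzero coefficients: {np.sum(np.abs(coef) > 1e-8)} of {p}")
```

Ridge keeps every coefficient (smaller in magnitude), while LASSO and elastic net drive some coefficients exactly to zero, which is why only the latter two perform variable selection.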
Design of Experiments (DOE)
- How can we still have a representative sample of each combination of factors while only surveying 600 people?
- How do we determine which of several factors are most important for predicting someone's answers?
- Key ideas: comparison (to measure differences), control (for other factors and effects), and blocking (factors that account for variation between factors, e.g., the red sports car vs. red minivan example).

A/B testing
- Used whenever we want to choose between two alternatives, as long as three things are true:
  1. We can collect a lot of data quickly enough to get an answer in time to use it.
  2. The data we collect comes from a representative sample of the whole population.
  3. The amount of data we collect is small compared to the total population we want to apply the answer to.
- Used before modeling and before collecting data. (A sketch of a two-proportion comparison follows this section.)

(Full) factorial design
- Test every combination of factor levels in an experiment to find each factor's effect, and the interaction effects, on the outcome.

Fractional factorial design
- Test only a subset of combinations, selected so that they give the same information as a full factorial design, i.e., a balanced design.
- Used before modeling and before collecting data. (A sketch of full and fractional factorial enumeration follows this section.)

What approach should we take if we believe the factors we can change are independent?
- Factorial design: test a subset of combinations and use regression to estimate the effect of each choice.
- Used before modeling and before collecting data.
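A minimal sketch of an A/B comparison, assuming the two alternatives are page variants with recorded conversion counts (the numbers are made up) and using a two-proportion z-test from statsmodels; the specific test is an illustration, not something prescribed by the notes above.

```python
# Minimal A/B test sketch: compare conversion rates of two alternatives
# with a two-proportion z-test (statsmodels). All counts below are made up.
from statsmodels.stats.proportion import proportions_ztest

conversions = [430, 495]       # successes for alternative A and alternative B
visitors    = [10000, 10000]   # trials collected for each alternative

z_stat, p_value = proportions_ztest(count=conversions, nobs=visitors)
print(f"z = {z_stat:.2f}, p-value = {p_value:.4f}")
if p_value < 0.05:
    print("Detectable difference between A and B at the 5% level.")
else:
    print("No detectable difference; collect more data or keep either option.")
```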
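Below is a minimal sketch of full versus fractional factorial enumeration for three two-level factors, using coded -1/+1 levels. The half-fraction is chosen with the classic defining relation I = ABC; the factor names and levels are illustrative, not from the notes.

```python
# Full factorial vs. a half-fraction for three two-level factors (coded -1/+1).
from itertools import product

factors = ["color", "engine", "trim"]            # illustrative factor names

# Full factorial: every combination of levels -> 2^3 = 8 runs.
full = list(product([-1, +1], repeat=len(factors)))

# Half-fraction 2^(3-1) design: keep the runs satisfying the defining relation
# I = ABC, i.e. the product of the coded levels is +1 -> 4 balanced runs
# (each factor appears twice at each level).
fraction = [run for run in full if run[0] * run[1] * run[2] == +1]

print("full factorial runs:    ", len(full))      # 8
print("fractional design runs: ", len(fraction))  # 4
for run in fraction:
    print(dict(zip(factors, run)))
```

The fraction is balanced in the sense the notes describe: every level of every factor appears equally often, so regression on the reduced set of runs can still estimate each factor's main effect.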