ISYE 6501 Midterm 1 EXAM QUESTIONS WITH 100% SOLUTIONS LATEST UPDATE 2023/2024
True or false: In a regression tree, every leaf of the tree has a different regression model that might use different attributes, have different coefficients, etc. - ANSWER True - Each leaf's individual model is tailored to the subset of data points that follow all of the branches leading to the leaf.

True or false: Tree-based approaches can be used for other models besides regression. - ANSWER True - For example, a classification tree might have a different SVM or KNN model at each leaf. It might even use SVM at some leaves and KNN at others (though that's probably rare).

A common rule of thumb is to stop branching if a leaf would contain less than 5% of the data points. Why not keep branching and allow models to find very close fits to each very small subset of data? - ANSWER Fitting to very small subsets of data will cause overfitting. - With too few data points, the models will fit to random patterns as well as real ones.

True or False: When using a random forest model, it's easy to interpret how its results are determined. - ANSWER False - Unlike a model like regression where we can show the result as a simple linear combination of each attribute times its regression coefficient, in a random forest model there are so many different trees used simultaneously that it's difficult to interpret exactly how any factor or factors affect the result.

A logistic regression model can be especially useful when the response... - ANSWER
- ...is a probability (a number between zero and one).
- ...is binary (either zero or one).
- Logistic regressions can be useful for either situation.

A model is built to determine whether data points belong to a category or not. A "true negative" result is: - ANSWER A data point that is not in the category, and the model correctly says so.
- 'True' and 'false' refer to whether the model is correct or not, and 'positive' and 'negative' refer to whether the model says the point is in the category.

True or False: The most useful classification models are the ones that correctly classify the highest fraction of data points. - ANSWER False - Sometimes the cost of a false positive is so high that it's worth accepting more false negatives, or vice versa.

Adjusted R-squared/Adjusted R² - ANSWER Variant of R² that encourages simpler models by penalizing the use of too many variables.

Akaike information criterion (AIC) - ANSWER Model selection technique that trades off between model fit and model complexity. When comparing models, the model with lower AIC is preferred. Generally penalizes complexity less than BIC.

Algorithm - ANSWER Step-by-step procedure designed to carry out a task.

Area under curve/AUC - ANSWER Area under the ROC curve; an estimate of the classification model's accuracy. Also called concordance index.

ARIMA - ANSWER Autoregressive integrated moving average.

Attribute - ANSWER A characteristic or measurement - for example, a person's height or the color of a car. Generally interchangeable with "feature", and often with "covariate" or "predictor". In the standard tabular format, a column of data.

Autoregression - ANSWER Regression technique using past values of time series data as predictors of future values.

Autoregressive integrated moving average (ARIMA) - ANSWER Time series model that uses differences between observations when data is nonstationary. Also called Box-Jenkins.

Bayes' theorem/Bayes' rule - ANSWER Fundamental rule of conditional probability:
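As a reference sketch, the standard statement of the rule for two events $A$ and $B$ with $P(B) > 0$ is shown here; the event names $A$ and $B$ are illustrative assumptions and do not come from the notes themselves.

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
```

In this form, $P(A)$ is the prior probability of $A$, $P(B \mid A)$ is the probability of observing $B$ when $A$ holds, and $P(A \mid B)$ is the updated probability of $A$ once $B$ has been observed.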
School, study and subject
- Institution
- ISYE 6501
- Course
- ISYE 6501
Document information
- Uploaded on
- October 6, 2023
- Number of pages
- 6
- Written in
- 2023/2024
- Type
- Exam
- Contains
- Questions and answers