ISYE 6501 MIDTERM 1 FINAL PRACTICE TEST
True or false: In a regression tree, every leaf of the tree has a different regression model that might use different attributes, have different coefficients, etc. - CORRECT ANSWER - True - Each leaf's individual model is tailored to the subset of data points that follow all of the branches leading to the leaf.

True or false: Tree-based approaches can be used for other models besides regression. - CORRECT ANSWER - True - For example, a classification tree might have a different SVM or KNN model at each leaf. It might even use SVM at some leaves and KNN at others (though that's probably rare).

A common rule of thumb is to stop branching if a leaf would contain less than 5% of the data points. Why not keep branching and allow models to find very close fits to each very small subset of data? - CORRECT ANSWER - Fitting to very small subsets of data will cause overfitting. - With too few data points, the models will fit to random patterns as well as real ones.

True or false: When using a random forest model, it's easy to interpret how its results are determined. - CORRECT ANSWER - False - Unlike a model like regression, where we can show the result as a simple linear combination of each attribute times its regression coefficient, a random forest model uses so many different trees simultaneously that it's difficult to interpret exactly how any factor or factors affect the result.

A logistic regression model can be especially useful when the response... - CORRECT ANSWER - ...is a probability (a number between zero and one); ...is binary (either zero or one). - Logistic regression can be useful in either situation.

A model is built to determine whether data points belong to a category or not. A "true negative" result is: - CORRECT ANSWER - A data point that is not in the category, and the model correctly says so.
- 'True' and 'false' refer to whether the model is correct or not, and 'positive' and 'negative' refer to whether the model says the point is in the category.

True or false: The most useful classification models are the ones that correctly classify the highest fraction of data points. - CORRECT ANSWER - False - Sometimes the cost of a false positive is so high that it's worth accepting more false negatives, or vice versa.

Adjusted R-squared/Adjusted R² - CORRECT ANSWER - Variant of R² that encourages simpler models by penalizing the use of too many variables.

Akaike information criterion (AIC) - CORRECT ANSWER - Model selection technique that trades off between model fit and model complexity. When comparing models, the model with lower AIC is preferred. Generally penalizes complexity less than BIC.

Algorithm - CORRECT ANSWER - Step-by-step procedure designed to carry out a task.

Area under curve/AUC - CORRECT ANSWER - Area under the ROC curve; a summary measure of how well the classification model separates the two classes. Also called the concordance index.

ARIMA - CORRECT ANSWER - Autoregressive integrated moving average.

Attribute - CORRECT ANSWER - A characteristic or measurement, for example a person's height or the color of a car. Generally interchangeable with "feature", and often with "covariate" or "predictor". In the standard tabular format, a column of data.

Autoregression - CORRECT ANSWER - Regression technique using past values of time series data as predictors of future values.

Autoregressive integrated moving average (ARIMA) - CORRECT ANSWER - Time series model that uses differences between observations when data is nonstationary. Also called Box-Jenkins.

Bayes' theorem/Bayes' rule - CORRECT ANSWER - Fundamental rule of conditional probability:
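For reference, the standard statement of Bayes' rule is P(A|B) = P(B|A) P(A) / P(B), where P(B) can be expanded as P(B|A) P(A) + P(B|not A) P(not A). The sketch below works through one case in Python. The function name, the event labels, and all of the numbers are hypothetical and chosen only for illustration; they are not taken from the course material.

```python
def bayes_posterior(prior_a, prob_b_given_a, prob_b_given_not_a):
    """Return P(A|B) using Bayes' rule.

    P(B) is expanded with the law of total probability:
    P(B) = P(B|A) P(A) + P(B|not A) P(not A).
    """
    prob_b = prob_b_given_a * prior_a + prob_b_given_not_a * (1.0 - prior_a)
    return prob_b_given_a * prior_a / prob_b


# Hypothetical numbers: 1% of data points are in the category (event A).
# The model flags a point (event B) for 90% of true members
# and for 5% of non-members.
posterior = bayes_posterior(
    prior_a=0.01,
    prob_b_given_a=0.90,
    prob_b_given_not_a=0.05,
)
print(round(posterior, 4))  # 0.1538
```

Under these assumed numbers, only about 15% of flagged points are actually in the category, even though the model catches 90% of true members. This is one way to see why the fraction of correct classifications alone does not settle which classifier is most useful.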
School, study and subject
- Institution: ISYE 6501
- Course: ISYE 6501
Document information
- Uploaded on: January 9, 2024
- Number of pages: 6
- Written in: 2023/2024
- Type: Exam
- Contains: Questions and answers