DATA MINING EXAM 2 REVIEW
PRACTICE QUIZ QUESTIONS AND
ANSWERS
Cross-validation is used to estimate - Answer-Test error using training data
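As an illustration of the idea, here is a minimal k-fold sketch in plain Python. The mean-only "model", the helper names, and the toy data are assumptions made for this example; they are not part of the quiz.

```python
def k_fold_cv_mse(xs, ys, k, fit, predict):
    """Estimate test MSE by k-fold cross-validation on the training data only."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        # Every k-th point (offset by `fold`) is held out for validation.
        val_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in val_idx]
        train_y = [ys[i] for i in range(n) if i not in val_idx]
        model = fit(train_x, train_y)
        sq_errs = [(ys[i] - predict(model, xs[i])) ** 2 for i in sorted(val_idx)]
        fold_errors.append(sum(sq_errs) / len(sq_errs))
    # The CV estimate is the average of the held-out errors across folds.
    return sum(fold_errors) / k


def fit_mean(train_x, train_y):
    # Hypothetical baseline "model": the mean of the training targets.
    return sum(train_y) / len(train_y)


def predict_mean(model, x):
    # The baseline ignores x and always predicts the stored mean.
    return model


xs = [1, 2, 3, 4, 5, 6]
ys = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
cv_mse = k_fold_cv_mse(xs, ys, k=3, fit=fit_mean, predict=predict_mean)
print(cv_mse)
```

Each point is predicted by a model that never saw it, which is why the averaged error approximates test error even though only training data is used.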
What is the fundamental difference between an artificial neural network (ANN) and a
deep neural network (DNN)? - Answer-No. of hidden layers (a DNN has more hidden
layers, which lets it learn features automatically from unstructured data)
Which of the following is not a significant reason for the recent popularity of neural
networks?
Neural Networks were developed recently
More data that can be stored efficiently now
More computation power available now
Backpropagation algorithm and other better algos to train models now - Answer-
Neural Networks were developed recently
Suppose we are building a neural network with 2 hidden layers and 2 neurons in
each hidden layer.
We have 5 features (input nodes) and we are solving a regression problem.
How many parameters do we need to learn for this neural network? - Answer-21
Suppose we are building a neural network with 2 hidden layers and 10 neurons in
each layer, we have 5 features (input nodes) and we are solving a regression
problem. How many parameters do we need to learn for this neural network? -
Answer-1 hidden layer: 10*(5 + 1) + (10 + 1) = 71
2 hidden layers: 10*(5 + 1) + 10*(10 + 1) + (10 + 1) = 181
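The parameter counts in the two questions above can be checked with a short sketch. It assumes a fully connected network, one bias per neuron, and a single output node for regression; the function name is made up for this example.

```python
def count_parameters(layer_sizes):
    """Count weights plus biases of a fully connected network.

    layer_sizes lists the node count per layer, input first, output last.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        # Each of the n_out neurons has n_in weights and one bias.
        total += n_out * (n_in + 1)
    return total


small = count_parameters([5, 2, 2, 1])          # 2 hidden layers, 2 neurons each
one_hidden = count_parameters([5, 10, 1])       # 1 hidden layer, 10 neurons
two_hidden = count_parameters([5, 10, 10, 1])   # 2 hidden layers, 10 neurons each
print(small, one_hidden, two_hidden)  # prints: 21 71 181
```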
How can we increase the complexity of a neural network structure?
i) Increase the no. of neurons in the output layer
ii) Increase the no. of neurons in the hidden layer.
iii) Increase the no. of neurons in the input layer.
ii
i & ii
i & iii
ii & iii - Answer-ii
Which of the following statements is correct about boosting methods? - Answer-
Boosting involves building multiple models in a sequence
Which of the following statements is correct?
i) Each model in boosting is a weak learner.
ii) Each model in boosting is learned in a way that only a small set of features are
available at each step of making splits.
iii) There is only one kind of boosting method.
ii
i and ii
ii and iii
i only - Answer-i only
Which of the following is an integral component of Gradient Boosting?
Weights given to different data points
Additive Model
Maximum Likelihood
Least Squares - Answer-Additive Model
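To make "additive model" and "models built in a sequence" concrete, here is a minimal gradient boosting sketch for squared-error loss on one feature. The one-split stump as weak learner, the function names, and the toy data are assumptions for illustration, not the method of any particular library.

```python
def fit_stump(xs, residuals):
    """Weak learner: a one-split regression stump on a single feature."""
    best = None
    # Try every distinct value except the largest, so both sides are non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def stump_predict(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_trees, learning_rate):
    base = sum(ys) / len(ys)          # start from a constant prediction
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_trees):
        # For squared error, the negative gradient is the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Additive update: add a shrunken copy of the new stump.
        preds = [p + learning_rate * stump_predict(stump, x)
                 for p, x in zip(preds, xs)]
    return base, stumps


def boosted_predict(model, x, learning_rate):
    base, stumps = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)


def training_mse(xs, ys, n_trees, learning_rate=0.1):
    model = gradient_boost(xs, ys, n_trees, learning_rate)
    errs = [(y - boosted_predict(model, x, learning_rate)) ** 2
            for x, y in zip(xs, ys)]
    return sum(errs) / len(errs)


xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
mse_0 = training_mse(xs, ys, n_trees=0)
mse_1 = training_mse(xs, ys, n_trees=1)
mse_50 = training_mse(xs, ys, n_trees=50)
print(mse_0, mse_1, mse_50)
```

Training error keeps falling as stumps are added, which is why the number of trees and the learning rate are tuned on held-out data rather than on the training set.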
Which of the following statements are correct about the gradient boosting method?
i) There's no restriction on the size of each tree model we build.
ii) The number of trees must not be too large as it can lead to overfitting.
iii) Learning rate is not a hyperparameter for gradient boosting.
ii and iii
i and ii
iii only
ii only - Answer-ii only
Which of the following statements is correct?
i) Gradient Boosting is a state-of-the-art model for tabular (structured) data.
ii) Gradient Boosting is quite a flexible model and can easily overfit.
iii) The Gradient Boosting model is more flexible than one decision tree model.
ii and iii
i and iii
iii only
All (i, ii, iii) - Answer-All (i, ii, iii)
Which of the following is not a machine learning approach to estimate the true
function f?
Parametric Methods
Dynamic Programming
Non-parametric methods - Answer-Dynamic Programming
An ESPN Analyst is estimating the viewership number for the Iowa vs. Iowa State
football game. The model built has the form
f(X) = b0 + b1 Hawkeye_Fans + b2 IowaState_Fans + b3 Historical_Viewers + ...
Which of the following is not a correct way to describe this model f?
None of the other options
Parametric Model
Linear Regression
Non-Parametric Model - Answer-Non-Parametric Model
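A parametric model like the one above is fully described by a fixed set of coefficients. The sketch below keeps only the three named predictors (the source formula continues with further terms), and the coefficient and input values are hypothetical.

```python
def predict_viewers(coefs, hawkeye_fans, iowa_state_fans, historical_viewers):
    """Linear regression prediction: f(X) = b0 + b1*x1 + b2*x2 + b3*x3."""
    b0, b1, b2, b3 = coefs
    return (b0
            + b1 * hawkeye_fans
            + b2 * iowa_state_fans
            + b3 * historical_viewers)


# Hypothetical coefficients; in practice they are estimated from data.
coefs = (1000.0, 0.5, 0.25, 0.75)
estimate = predict_viewers(coefs, 200_000, 150_000, 1_000_000)
print(estimate)
```

Once the four numbers in `coefs` are learned, the training data is no longer needed to make a prediction, which is what makes the model parametric.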
Which of the following methods does not involve measuring distances?
Linear Regression
KNN
K-Means Clustering
Hierarchical Clustering - Answer-Linear Regression
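For contrast with linear regression, here is a minimal KNN regression sketch showing where the distance computation enters. Euclidean distance, the function names, and the toy points are assumptions for this example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def knn_predict(train_points, train_targets, query, k):
    """Average the targets of the k training points closest to the query."""
    ranked = sorted(zip(train_points, train_targets),
                    key=lambda pair: euclidean(pair[0], query))
    nearest = ranked[:k]
    return sum(target for _, target in nearest) / k


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
targets = [1.0, 2.0, 3.0, 10.0]
prediction = knn_predict(points, targets, query=(0.2, 0.1), k=3)
print(prediction)
```

The far point (5.0, 5.0) is excluded purely because of its distance to the query. K-means and hierarchical clustering rely on the same kind of distance computation, while linear regression predicts from fitted coefficients alone.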
Which of the following is not an example of Unsupervised Learning?