Data Science EXAM WITH ACTUAL QUESTIONS AND CORRECT ANSWERS NEWEST 2025
Kernel trick - CORRECT ANSWER-Using the kernel trick to find separating hyperplanes in higher-dimensional space.
To solve a nonlinear problem using an SVM, we transform the training data onto a higher-dimensional feature space via a mapping function $\phi(\cdot)$ and train a linear SVM model to classify the data in this new feature space. Then we can use the same mapping function $\phi(\cdot)$ to transform new, unseen data to classify it using the linear SVM model.
However, one problem with this mapping approach is that the construction of the new features is computationally very expensive, especially if we are dealing with high-dimensional data. This is where the so-called kernel trick comes into play. Although we didn't go into much detail about how to solve the quadratic programming task to train an SVM, in practice all we need is to replace the dot product $x^{(i)\top} x^{(j)}$ by $\phi(x^{(i)})^\top \phi(x^{(j)})$. In order to save the expensive step of calculating this dot product between two points explicitly, we define a so-called kernel function: $K(x^{(i)}, x^{(j)}) = \phi(x^{(i)})^\top \phi(x^{(j)})$.
One of the most widely used kernels is the Radial Basis Function kernel (RBF kernel) or Gaussian kernel: $K(x^{(i)}, x^{(j)}) = \exp\left(-\gamma \lVert x^{(i)} - x^{(j)} \rVert^2\right)$, where $\gamma$ is a free parameter to be optimized.
The trick is to choose a transformation so that the kernel can be computed without actually
computing the transformation.
In other words, the kernel trick means replacing the dot-product function with a new function that returns what the dot product would have been if the data had first been transformed to a higher-dimensional space. This is usually done using the radial basis function kernel.
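A minimal sketch of the trick in Python (the function names are illustrative, not from any particular library), using a homogeneous degree-2 polynomial kernel because, unlike the RBF kernel, its feature map is finite and easy to write out:

import numpy as np

def phi(x):
    # Explicit feature map for the degree-2 polynomial kernel on
    # 2-D inputs: phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2).
    return np.array([x[0]**2, np.sqrt(2) * x[0] * x[1], x[1]**2])

def poly_kernel(x, y):
    # Kernel function: the dot product in the mapped space,
    # computed without ever constructing phi(x) or phi(y).
    return np.dot(x, y) ** 2

x = np.array([1.0, 2.0])
y = np.array([3.0, 0.5])

print(np.dot(phi(x), phi(y)))  # explicit transformation -> 16.0
print(poly_kernel(x, y))       # kernel shortcut         -> 16.0

Both routes print the same value; the second never touches the higher-dimensional space, which is exactly the saving the kernel trick buys.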
Radial Basis Function kernel (RBF kernel) - CORRECT ANSWER-Used for Kernel Trick in SVMs
Gaussian Kernel - CORRECT ANSWER-Used for Kernel Trick in SVMs
Types of Kernels for Kernel Trick - CORRECT ANSWER-Fisher kernel
Graph kernels
Kernel smoother
Polynomial kernel
RBF kernel
String kernels
What are kernels? - CORRECT ANSWER-A kernel is a similarity function. It is a function that you, as
the domain expert, provide to a machine learning algorithm. It takes two inputs and spits out how
similar they are.
Kernels offer an alternative. Instead of defining a slew of features, you define a single kernel function
to compute similarity between images. You provide this kernel, together with the images and labels
to the learning algorithm, and out comes a classifier.
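As a hedged sketch of this workflow in scikit-learn (SVC accepts a user-supplied callable as its kernel; the toy data here is illustrative):

import numpy as np
from sklearn.svm import SVC

def my_kernel(X, Y):
    # User-defined similarity function: SVC calls this with two data
    # matrices and expects the matrix of pairwise similarities.
    # Here, a simple degree-2 polynomial similarity.
    return np.dot(X, Y.T) ** 2

X = np.array([[0.0, 0.0], [1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
y = np.array([0, 0, 1, 1])  # XOR-style labels, nonlinear in the input space

# Provide the kernel together with the data and labels,
# and out comes a classifier.
clf = SVC(kernel=my_kernel).fit(X, y)
print(clf.predict(X))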
SVM - Strengths and Weaknesses - CORRECT ANSWER-...
Decision tree classifiers - Strengths and Weaknesses - CORRECT ANSWER-Strengths:
1) Feature scaling is not a requirement for decision tree algorithms.
2) The trained tree can be visualized (e.g., using GraphViz).
Weaknesses:
1) We have to be careful, since the deeper the decision tree, the more complex the decision boundary becomes, which can easily result in overfitting.
Note:
A random forest combines many individually weak tree learners into one stronger ensemble learner (see the sketch below).
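A minimal scikit-learn sketch (the dataset and the max_depth value are illustrative assumptions): export_graphviz writes a GraphViz .dot file of the fitted tree, and capping max_depth is one simple guard against overfitting.

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz

X, y = load_iris(return_X_y=True)

# No feature scaling needed; capping the depth limits how complex
# the decision boundary (and the risk of overfitting) can get.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Write a GraphViz description of the tree; render it with, e.g.,
#   dot -Tpng tree.dot -o tree.png
export_graphviz(tree, out_file='tree.dot', filled=True)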
Information gain (IG) - CORRECT ANSWER-Information gain is simply the difference between the impurity of the parent node and the sample-weighted sum of the child node impurities: the lower the impurity of the child nodes, the larger the information gain.
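In formula form (the standard definition, stated here for completeness):

IG(D_p, f) = I(D_p) - \sum_{j=1}^{m} \frac{N_j}{N_p} I(D_j)

where $f$ is the feature used for the split, $D_p$ and $D_j$ are the parent and the $j$-th child node, $I$ is the impurity measure, $N_p$ is the number of samples at the parent node, and $N_j$ is the number of samples at the $j$-th child node.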
Gini index - CORRECT ANSWER-...
Entropy - CORRECT ANSWER-...
Classification error - CORRECT ANSWER-This is a useful criterion for pruning but not recommended for growing a decision tree, since it is less sensitive to changes in the class probabilities of the nodes.
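A self-contained sketch of why that matters (the split counts are an illustrative textbook-style example, not from the source): for the two candidate splits below, classification error rates both splits equally, while Gini impurity prefers the second.

def gini(p):
    # Gini impurity for a binary node with class-1 probability p.
    return 1.0 - (p**2 + (1 - p)**2)

def error(p):
    # Misclassification error for a binary node.
    return 1.0 - max(p, 1 - p)

def info_gain(impurity, parent, children):
    # IG = parent impurity minus the sample-weighted child impurities.
    # Each node is a tuple of class counts, e.g. (40, 40).
    n_parent = sum(parent)
    ig = impurity(parent[0] / n_parent)
    for child in children:
        n = sum(child)
        ig -= (n / n_parent) * impurity(child[0] / n)
    return ig

parent = (40, 40)
split_a = [(30, 10), (10, 30)]
split_b = [(20, 40), (20, 0)]

# Classification error rates both splits the same (IG = 0.25 each) ...
print(info_gain(error, parent, split_a), info_gain(error, parent, split_b))
# ... while Gini impurity prefers split B (about 0.167 vs 0.125).
print(info_gain(gini, parent, split_a), info_gain(gini, parent, split_b))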
Parametric versus nonparametric models - CORRECT ANSWER-Machine learning algorithms can be
grouped into parametric and nonparametric models. Using parametric models, we estimate
parameters from the training dataset to learn a function that can classify new data points without
requiring the original training dataset anymore. Typical examples of parametric models are the
perceptron, logistic regression, and the linear SVM. In contrast, nonparametric models can't be
characterized by a fixed set of parameters, and the number of parameters grows with the training
data. Two examples of nonparametric models that we have seen so far are the decision tree
classifier/random forest and the kernel SVM.
KNN belongs to a subcategory of nonparametric models that is described as instance-based learning.
Models based on instance-based learning are characterized by memorizing the training dataset, and
lazy learning is a special case of instance-based learning that is associated with no (zero) cost during
the learning process.
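A small scikit-learn sketch of lazy learning (the dataset is illustrative): KNeighborsClassifier's fit step essentially just stores and indexes the training set, and the real work is deferred to prediction time.

from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# "Training" a lazy learner: fit() memorizes (and indexes) X and y;
# no fixed set of parameters is estimated.
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)

# The cost is paid at query time: predict() searches the stored
# training set for each query point's 5 nearest neighbors.
print(knn.predict(X[:3]))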
Nonparametric - CORRECT ANSWER-Nonparametric models can't be characterized by a fixed set of
parameters, and the number of parameters grows with the training data.
Two examples of nonparametric models that we have seen so far are the decision tree
classifier/random forest and the kernel SVM.
Parametric - CORRECT ANSWER-Using parametric models, we estimate parameters from the training
dataset to learn a function that can classify new data points without requiring the original training
dataset anymore.
Typical examples of parametric models are the perceptron, logistic regression, and the linear SVM.
Free parameter - CORRECT ANSWER-...
Slack variable - CORRECT ANSWER-...