Multiple Choice Questions
Q1. What is the primary reason for using convolutional layers in image processing tasks?
A) To reduce the number of training epochs
B) To introduce non-linearity
C) To reduce the number of parameters by local connectivity and weight sharing
D) To enforce regularization
Answer: C) To reduce the number of parameters by local connectivity and weight sharing
Explanation: Convolutional layers leverage spatial locality by connecting each neuron to only a
local region of the input, and reusing the same weights (filters), which reduces the total
parameter count.
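The saving can be made concrete with a rough count. The sizes below (a 32×32 input, 16 filters of size 3×3, one output unit per pixel for the dense case) are illustrative assumptions, not values from the question:

```python
# Illustrative parameter count: fully connected vs. convolutional layer.
# All sizes are example assumptions (32x32 input, 3x3 filters, 16 filters).
input_h, input_w = 32, 32
num_units = input_h * input_w  # one output unit per input pixel, for comparison

# Fully connected: every output unit sees every input pixel, plus a bias.
fc_params = num_units * (input_h * input_w) + num_units

# Convolutional: 16 filters of size 3x3, each with one bias, shared everywhere.
num_filters, k = 16, 3
conv_params = num_filters * (k * k + 1)

print(fc_params)    # 1049600
print(conv_params)  # 160
```

The convolutional layer uses four orders of magnitude fewer parameters because the same 3×3 weights are reused at every spatial position.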
Q2. What does the "stride" parameter control in a convolution operation?
A) The number of layers in the CNN
B) The padding of the input
C) The number of filters applied
D) The step size of the filter as it moves over the input
Answer: D) The step size of the filter as it moves over the input
Explanation: Stride determines how much the filter moves when sliding over the input matrix.
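A minimal 1-D sketch makes the effect visible: a larger stride means fewer filter positions, so a shorter output. The kernel values here are arbitrary:

```python
# 1-D convolution sketch showing how stride changes the output length.
def conv1d(signal, kernel, stride=1):
    out = []
    # The filter starts at positions 0, stride, 2*stride, ... while it fits.
    for start in range(0, len(signal) - len(kernel) + 1, stride):
        window = signal[start:start + len(kernel)]
        out.append(sum(w * k for w, k in zip(window, kernel)))
    return out

x = [1, 2, 3, 4, 5, 6]
k = [1, 0, -1]                     # simple difference-style kernel
print(conv1d(x, k, stride=1))      # [-2, -2, -2, -2]  (length 4)
print(conv1d(x, k, stride=2))      # [-2, -2]          (length 2)
```

In general the output length is `(n - k) // stride + 1` for an unpadded input of length `n` and kernel of length `k`.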
Q3. Which of the following best describes the function of dropout?
A) It removes irrelevant input features permanently
B) It reduces overfitting by randomly deactivating neurons during training
C) It increases the number of training samples
D) It guarantees convergence during training
Answer: B) It reduces overfitting by randomly deactivating neurons during training
Explanation: Dropout prevents co-adaptation of neurons by forcing the network to learn
redundant representations.
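A common way to implement this is "inverted" dropout, sketched below in pure Python: surviving activations are rescaled by 1/(1 − p) so their expected value is unchanged, and at inference the layer is a no-op. The keep probability and seed are arbitrary choices for the example:

```python
import random

# Inverted dropout sketch: randomly zero activations during training and
# rescale the survivors so the expected activation is unchanged.
def dropout(activations, p=0.5, training=True, rng=random):
    if not training or p == 0.0:
        return list(activations)
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)
acts = [0.5, 1.0, 1.5, 2.0]
dropped = dropout(acts, p=0.5, rng=rng)
# Each surviving activation is doubled (divided by keep = 0.5); the rest are 0.
```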
Q4. Batch Normalization helps deep networks by:
A) Increasing the model's size
B) Reducing internal covariate shift
C) Removing outliers
D) Scaling gradients uniformly
Answer: B) Reducing internal covariate shift
Explanation: By normalizing the inputs to each layer, batch normalization stabilizes and
accelerates training.
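The training-mode computation is just a per-feature standardization followed by a learnable scale and shift. A single-feature pure-Python sketch:

```python
# Batch-norm sketch (training mode): normalize one feature over the batch,
# then apply a learnable scale (gamma) and shift (beta).
def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [gamma * (x - mean) / (var + eps) ** 0.5 + beta for x in batch]

out = batch_norm([1.0, 2.0, 3.0, 4.0])
# The normalized batch has (approximately) zero mean and unit variance.
```

`gamma` and `beta` are trained along with the rest of the network, so the layer can undo the normalization if that helps the loss.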
True/False Questions
Q5. Batch normalization is applied only during training, not during inference.
Answer: False
Explanation: During inference, batch norm uses moving averages of mean and variance
computed during training.
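Those moving averages are updated once per training batch and then frozen for inference. A sketch of the bookkeeping, with a momentum value of 0.9 assumed for illustration (frameworks differ on the default):

```python
# Running statistics that batch norm keeps for inference. Each training batch
# updates exponential moving averages of the batch mean and variance; at
# inference those fixed averages replace the per-batch statistics.
momentum = 0.9                    # illustrative; framework defaults vary
running_mean, running_var = 0.0, 1.0

# (batch_mean, batch_var) pairs stand in for statistics of successive batches.
for batch_mean, batch_var in [(0.5, 2.0), (0.7, 1.8), (0.6, 2.2)]:
    running_mean = momentum * running_mean + (1 - momentum) * batch_mean
    running_var = momentum * running_var + (1 - momentum) * batch_var

# Inference then uses: x_hat = (x - running_mean) / sqrt(running_var + eps)
```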
Q6. The Adam optimizer combines momentum and RMSProp.
Answer: True
Explanation: Adam uses momentum-like first-moment estimation and RMSProp-style second-
moment estimation.
Short Answer Questions
Q7. What are the main hyperparameters of the Adam optimizer?
Answer:
Learning rate (α)
β1: Exponential decay rate for the first moment estimates
β2: Exponential decay rate for the second moment estimates
ε: Small constant to prevent division by zero
Q8. Explain how zero-padding in CNNs affects the output size.
Answer:
Zero-padding allows us to control the spatial dimensions of the output. With "same" padding, we
can preserve the input size, while "valid" padding results in a smaller output. Padding helps
retain information at the borders of the input.
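The relationship can be stated as a formula. For input size n, kernel size k, stride s, and padding p, the output size is (n + 2p − k) // s + 1; a quick check with illustrative sizes:

```python
# Convolution output size: out = (n + 2*p - k) // s + 1
def conv_output_size(n, k, s=1, p=0):
    return (n + 2 * p - k) // s + 1

n, k = 32, 3                          # example sizes, not from the question
print(conv_output_size(n, k, p=0))    # "valid" padding: 30 (output shrinks)
print(conv_output_size(n, k, p=1))    # "same" padding:  32 (size preserved
                                      #  for stride 1 when p = (k - 1) // 2)
```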