CERTIFIED 106 QUESTIONS WITH 100% CORRECT ANSWERS GUARANTEED EXCELLENT SCORES
"Effect of channels on output size - CORRECT ANSWER=> It doesn't have effect on the output size:
we perform the dot product for each channels and summing them up."
"Effect of channels on parameters - CORRECT ANSWER=> Each channel might have its own
weights with respect to the same kernel.
M x (Ch x K1 x K2 + 1)"
"Effect of multiple kernels (feature extraction) on output size. - CORRECT ANSWER=> The kernel
size should be equal (K1 x K2) for each kernel within the layer. The output size:
(H - K1 + 1) x (W - K2 + 1) x Number of Kernels"
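A minimal PyTorch sketch of the output-size rule above (the 32x32x3 input and 5x5 kernels are illustration values, not from the notes); it also shows that the number of input channels leaves the spatial output size unchanged, as stated two cards earlier:

    import torch
    import torch.nn as nn

    H, W, Ch, M = 32, 32, 3, 8                    # input height/width, channels, number of kernels
    K1, K2 = 5, 5                                 # kernel height/width
    conv = nn.Conv2d(Ch, M, kernel_size=(K1, K2), stride=1, padding=0)

    x = torch.randn(1, Ch, H, W)                  # one image, channels-first layout
    y = conv(x)
    print(y.shape)                                # torch.Size([1, 8, 28, 28])
    print((M, H - K1 + 1, W - K2 + 1))            # (8, 28, 28): spatial size does not depend on Ch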
"Effect of multiple kernels (feature extraction) on parameters - CORRECT ANSWER=> Each kernel,
each channel has its own set of weights, but each kernel has only 1 bias term.
(K1 x K2 x Channels + 1) x M
where M is the number of kernels"
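A quick check of the parameter-count formula from the two cards above, again assuming PyTorch and using arbitrary illustration sizes:

    import torch.nn as nn

    Ch, M, K1, K2 = 3, 8, 5, 5
    conv = nn.Conv2d(Ch, M, kernel_size=(K1, K2))

    print(sum(p.numel() for p in conv.parameters()))   # 608
    print(M * (Ch * K1 * K2 + 1))                      # 608: Ch x K1 x K2 weights + 1 bias per kernel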
"What is the purpose of pooling layer? - CORRECT ANSWER=> Dimensionality reduction"
"How many learned parameters does a max pooling layer have? - CORRECT ANSWER=> None"
, "Invariance - CORRECT ANSWER=> If the feature changes, moves or rotates slightly on the image,
the output value remains the same. (For example, we classify the image of a cat regardless of
where the cat is in the image)"
"Equivariance - CORRECT ANSWER=> If the feature translates or moves a little bit, the output
values move by the same translation and can be detected in the new location."
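A toy numpy/scipy illustration of both cards (the 10x10 image and 2x2 blob are made-up values): shifting the feature shifts the cross-correlation response by the same amount (equivariance), while a global max over the response map is unchanged (invariance):

    import numpy as np
    from scipy.signal import correlate2d

    img = np.zeros((10, 10)); img[4:6, 2:4] = 1.0      # a single bright 2x2 "feature"
    shifted = np.roll(img, 3, axis=1)                   # same feature, moved 3 pixels right

    kernel = np.ones((2, 2))                            # a detector that responds to the 2x2 blob
    r1 = correlate2d(img, kernel, mode='valid')
    r2 = correlate2d(shifted, kernel, mode='valid')

    # Equivariance: the peak of the response map moves by the same 3-pixel shift.
    print(np.unravel_index(r1.argmax(), r1.shape))      # (4, 2)
    print(np.unravel_index(r2.argmax(), r2.shape))      # (4, 5)

    # Invariance: a global max over the response map ignores where the feature is.
    print(r1.max(), r2.max())                           # 4.0 4.0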
"Why different kernels would learn different features? - CORRECT ANSWER=> Because we
initialize them to different values, and the local minima on the weight space will different, and
so the gradient will be different --> kernels are learning different features."
"If cross-correlation is the forward pass,
then gradient w.r.t. the input is ... - CORRECT ANSWER=> CONVOLUTION between the upstream
and the kernel weights"
"If cross-correlation is the forward pass,
then gradient w.r.t the kernel is ... - CORRECT ANSWER=> CROSS-CORRELATION between the
upstream gradient and the input"
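Both gradient cards can be checked numerically with autograd; a hedged sketch assuming PyTorch (F.conv2d implements cross-correlation), with arbitrary 6x6 input and 3x3 kernel sizes:

    import torch
    import torch.nn.functional as F

    x = torch.randn(1, 1, 6, 6, requires_grad=True)     # input
    w = torch.randn(1, 1, 3, 3, requires_grad=True)     # kernel

    y = F.conv2d(x, w)                                   # forward pass: cross-correlation
    g = torch.randn_like(y)                              # an arbitrary upstream gradient dL/dy
    y.backward(g)

    # Gradient w.r.t. the kernel: cross-correlation of the input with the upstream gradient.
    print(torch.allclose(w.grad, F.conv2d(x, g), atol=1e-5))                            # True

    # Gradient w.r.t. the input: convolution of the upstream gradient with the kernel,
    # i.e. cross-correlation with the kernel flipped in both spatial dimensions ("full" mode).
    print(torch.allclose(x.grad, F.conv2d(g, w.flip((-2, -1)), padding=2), atol=1e-5))  # True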
"LeNet - CORRECT ANSWER=> simple conv architecture:
Conv - MaxPool - Conv - MaxPool - FC - FC - Gaussian (=MSE loss)"
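A rough LeNet-style sketch in PyTorch following the card above; the layer widths (6/16 feature maps, 120/84 hidden units) follow the classic LeNet-5 description, and a plain linear output layer stands in for the original Gaussian (RBF) output with its MSE-style loss:

    import torch
    import torch.nn as nn

    lenet = nn.Sequential(
        nn.Conv2d(1, 6, kernel_size=5), nn.Tanh(), nn.MaxPool2d(2),
        nn.Conv2d(6, 16, kernel_size=5), nn.Tanh(), nn.MaxPool2d(2),
        nn.Flatten(),
        nn.Linear(16 * 5 * 5, 120), nn.Tanh(),
        nn.Linear(120, 84), nn.Tanh(),
        nn.Linear(84, 10),                       # stands in for the Gaussian (RBF) output layer
    )

    x = torch.randn(1, 1, 32, 32)                # LeNet expects 32x32 single-channel images
    print(lenet(x).shape)                        # torch.Size([1, 10])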
"What led to the success of deep CNN? - CORRECT ANSWER=> The large scale of data.
In 2012, ImageNet contained 1.2 billion labeled examples over 1000 categories. AlexNet was
much more successful than any other traditional and hand-engineered models."
"AlexNet key aspects - CORRECT ANSWER=> 7 layers, alternating Conv, MaxPooling,
Normalization and FC layers
- First neural network architecture that used ReLU instead of sigmoid or tanh
- Normalization layer - not common these days
- PCA based data augmentation (reduce var caused by lighting)
- Dropout - regularization
- Ensemble - 7 CNN, weighted sum of probabilities"
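For reference, torchvision ships an AlexNet implementation; it differs slightly from the 2012 paper (no local response normalization, no two-GPU grouped convolutions), but the overall structure of alternating Conv/ReLU/MaxPool stages followed by dropout and FC layers matches the card:

    import torchvision

    model = torchvision.models.alexnet()                 # untrained, default arguments
    print(model)                                         # prints the Conv/ReLU/MaxPool/FC stack
    print(sum(p.numel() for p in model.parameters()))    # roughly 61 million parameters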
"VGG key aspects - CORRECT ANSWER=> >> Formed blocks of Conv-MaxPool, alternating those
>> small kernel size 3x3, stride 1 (AlexNet stride 4, kernel 11x11)
>> very large number of parameters (>100 millions)"
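A small sketch of why VGG's 3x3 kernels matter: three stacked 3x3 convs cover the same 7x7 receptive field as a single 7x7 conv but with fewer parameters and more nonlinearities (the channel count 64 is an illustration value); the last line, assuming torchvision is available, confirms the >100 million total for VGG-16:

    import torch.nn as nn
    import torchvision

    C = 64
    one_7x7   = nn.Conv2d(C, C, kernel_size=7, padding=3)
    three_3x3 = nn.Sequential(*[nn.Conv2d(C, C, kernel_size=3, padding=1) for _ in range(3)])

    count = lambda m: sum(p.numel() for p in m.parameters())
    print(count(one_7x7))                                  # 200,768 = 64 * (64*7*7 + 1)
    print(count(three_3x3))                                # 110,784 = 3 * 64 * (64*3*3 + 1)

    print(count(torchvision.models.vgg16()))               # roughly 138 million parameters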