MaxPooling preserves detected features and
downsamples the feature map (image).
True
If the input volume of an image is 227x227x3 and we
apply 96 11x11 filters with stride 4, how many
parameters does this layer have?
(11x11x3)x96
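The arithmetic behind this answer can be checked with a short sketch (pure Python, using the standard conv parameter formula; note the quiz answer counts weights only, while a bias term would add one parameter per filter):

```python
def conv_param_count(in_ch, out_ch, k, bias=False):
    """Parameters of a 2D conv layer: one k x k x in_ch kernel per output filter."""
    weights = k * k * in_ch * out_ch
    return weights + (out_ch if bias else 0)

# 96 filters of size 11x11 over a 3-channel input
print(conv_param_count(3, 96, 11))        # (11*11*3)*96 = 34848 weights
print(conv_param_count(3, 96, 11, True))  # +96 biases = 34944
```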
In a CNN, two conv layers cannot be connected
directly; we must use a pooling layer in between.
False
In the design of a CNN, fully connected layers usually
contain many more parameters than conv layers.
True
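A quick back-of-the-envelope comparison shows why this is true (the layer shapes below are illustrative, not from the quiz; the fully connected shape mirrors a VGG-style classifier head):

```python
# Hypothetical shapes: a 3x3 conv with 64 filters over 3 input channels,
# vs. a fully connected layer mapping a flattened 7x7x512 feature map to 4096 units.
conv_params = 3 * 3 * 3 * 64       # kernels are small and shared across positions
fc_params = (7 * 7 * 512) * 4096   # every input unit connects to every output unit

print(conv_params)  # 1728
print(fc_params)    # 102760448 -- several orders of magnitude larger
```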
What is the purpose of the ReLU activation function
in a CNN?
To introduce non-linearity
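A minimal sketch of the non-linearity ReLU introduces (elementwise, clipping negatives to zero):

```python
def relu(x):
    """ReLU: passes positive values through, zeroes out negatives (non-linear)."""
    return max(0.0, x)

print([relu(v) for v in [-2.0, -0.5, 0.0, 1.5]])  # [0.0, 0.0, 0.0, 1.5]
```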
Which of the following statements is true about the
convolution layer?
The convolution layer is non-linear and can be used
to extract rich features through its non-linearity.
Neural networks can be built by simply stacking many
convolution layers and fully connected layers to
achieve good performance.
The convolution layer is linear and is often used
together with an activation function.
The convolution layer is linear like fully connected
layers but requires more parameters to achieve better
performance in vision tasks.
The convolution layer is linear and is often used
together with an activation function.
What is the main advantage of using dropout in a
CNN?
Preventing overfitting
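A minimal sketch of how dropout regularizes (inverted dropout, as in `nn.Dropout`: units are zeroed with probability p during training and survivors are rescaled so the expected activation is unchanged; at inference the layer is the identity):

```python
import random

def dropout(xs, p, training=True):
    """Inverted dropout: zero each unit with probability p, scale survivors
    by 1/(1-p); acts as the identity at inference time."""
    if not training or p == 0.0:
        return list(xs)
    return [0.0 if random.random() < p else x / (1.0 - p) for x in xs]

random.seed(0)
print(dropout([1.0, 2.0, 3.0, 4.0], p=0.5))   # random mask, survivors doubled
print(dropout([1.0, 2.0], p=0.5, training=False))  # [1.0, 2.0] -- unchanged
```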
Given two stacked dilated convolution layers with
kernel size 3x3, stride 1, and dilation 2 (one
pixel of space between kernel points), what is the
size of the receptive field?
9x9
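The answer follows from the standard receptive-field recurrence for stacked conv layers, sketched below (each layer's effective kernel size is d*(k-1)+1, so a 3x3 kernel with dilation 2 acts like a sparse 5x5 kernel):

```python
def receptive_field(layers):
    """Receptive field of stacked conv layers.
    Each layer is (kernel, stride, dilation); effective kernel = d*(k-1)+1."""
    rf, jump = 1, 1
    for k, s, d in layers:
        eff_k = d * (k - 1) + 1
        rf += (eff_k - 1) * jump
        jump *= s
    return rf

# Two 3x3 convs, stride 1, dilation 2: effective kernel 5 -> RF = 1 + 4 + 4 = 9
print(receptive_field([(3, 1, 2), (3, 1, 2)]))  # 9
```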
Given a convolution layer with 3 input channels, 64
output channels, kernel size 4x4, stride 2,
dilation 3, and padding 1, what is the parameter size
of this convolution layer?
3x64x4x4
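The point of this question is that stride, dilation, and padding change the output size but not the weight count, as the arithmetic below confirms:

```python
# Each of the 64 filters still holds one 4x4 kernel per input channel,
# regardless of stride, dilation, or padding.
params = 3 * 64 * 4 * 4
print(params)  # 3072 (plus 64 if a bias term is included)
```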
In PyTorch (import torch.nn as nn), which of the
following layers downsamples the input size into
half?
nn.Linear(in_features=6, out_features=2)
nn.Conv2d(in_channels=3, out_channels=64,
kernel_size=3, stride=1, padding=1, dilation=1)
nn.Conv2d(in_channels=3, out_channels=64,
kernel_size=3, stride=1, padding=1, dilation=2)
nn.Conv2d(in_channels=3, out_channels=64,
kernel_size=3, stride=2, padding=1, dilation=1)
nn.Conv2d(in_channels=3, out_channels=64,
kernel_size=3, stride=2, padding=1, dilation=1)
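The options can be checked against the output-size formula from the `nn.Conv2d` documentation, sketched here in pure Python (using a 32x32 input as an illustrative size):

```python
import math

def conv_out(size, k, s, p, d=1):
    """Output spatial size of nn.Conv2d:
    floor((size + 2p - d*(k-1) - 1) / s) + 1."""
    return math.floor((size + 2 * p - d * (k - 1) - 1) / s) + 1

print(conv_out(32, k=3, s=1, p=1, d=1))  # 32 -- same size
print(conv_out(32, k=3, s=1, p=1, d=2))  # 30 -- shrinks slightly, does not halve
print(conv_out(32, k=3, s=2, p=1, d=1))  # 16 -- stride 2 halves the input
```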
In the design of an auto-encoder, the encoder and
decoder should follow exactly the same structure.
False
Compared to classical machine learning, deep
learning often requires
lots of training data and lots of computations.
In order to reduce the loss step by step, in what
direction does the gradient descent algorithm take a
step at every iteration?
negative gradient
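A minimal sketch of this update rule, minimizing the toy function f(x) = x^2 (whose gradient is 2x; the learning rate and step count are illustrative choices):

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Repeatedly step in the negative gradient direction: x <- x - lr * grad(x)."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

# f(x) = x^2 has its minimum at 0; starting from 5.0 we converge toward it
x_min = gradient_descent(lambda x: 2 * x, x0=5.0)
print(round(x_min, 6))  # 0.0
```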