On Translation Invariance in CNNs: Convolutional Layers can Exploit Absolute Spatial Location
Introduction
CNNs are commonly assumed to be translation invariant. This means that the
network is able to recognize patterns or features regardless of their
position in the input image.
💡 Convolution is translation equivariant. When it is followed by an operator
that does not depend on position (e.g. global max or average pooling), the
result is translation invariant.
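A minimal 1D sketch of this property in plain Python. The helpers `correlate_circular` and `shift` are hypothetical names introduced for illustration, and the wrap-around boundary is an assumption made here so that no boundary effects occur and equivariance holds exactly.

```python
def correlate_circular(signal, kernel):
    """Cross-correlate with wrap-around boundaries (assumed, to avoid boundary effects)."""
    n = len(signal)
    return [sum(kernel[k] * signal[(i + k) % n] for k in range(len(kernel)))
            for i in range(n)]


def shift(signal, s):
    """Circularly shift a signal s positions to the right."""
    n = len(signal)
    return [signal[(i - s) % n] for i in range(n)]


signal = [0, 0, 1, 3, 1, 0, 0, 0]
kernel = [1, 2, 1]

response = correlate_circular(signal, kernel)
response_shifted = correlate_circular(shift(signal, 3), kernel)

# Equivariance: shifting the input shifts the response by the same amount.
equivariant = response_shifted == shift(response, 3)
# Invariance: the global maximum does not depend on where the response occurs.
invariant = max(response) == max(response_shifted)
```

Both `equivariant` and `invariant` evaluate to `True` under these assumptions. With real image borders instead of wrap-around, the first property no longer holds exactly.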
CNNs can encode absolute spatial location by exploiting image boundary
effects.
Related Works and Relevance
Boundary effects allow CNNs to learn filters whose output falls outside
the image depending on their absolute position in the image. Position is
encoded by keeping filter outputs only for specific absolute positions: for
example, the network can learn filters that fire only for the top of the
image, while responses for the bottom are placed outside the image boundary
and discarded.
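The mechanism can be sketched in 1D with zero-padded "same" cross-correlation. This is an illustrative toy, not the paper's experiment: the helper `correlate_same` and the filter `[1, 0, 0]` are assumptions chosen so that the response to a pattern at the bottom lands one position past the end of the output.

```python
def correlate_same(signal, kernel):
    """Zero-padded cross-correlation; output has the input's length (odd kernels)."""
    pad = len(kernel) // 2
    padded = [0] * pad + list(signal) + [0] * pad
    return [sum(kernel[k] * padded[i + k] for k in range(len(kernel)))
            for i in range(len(signal))]


# Hypothetical learned filter: only its first tap is non-zero,
# so its response appears one position after the pattern.
kernel = [1, 0, 0]

top = [1, 0, 0, 0, 0]     # pattern at the top of a 1D "image"
bottom = [0, 0, 0, 0, 1]  # the same pattern at the bottom

top_response = correlate_same(top, kernel)        # [0, 1, 0, 0, 0]
bottom_response = correlate_same(bottom, kernel)  # [0, 0, 0, 0, 0]

# Even after global max pooling the two positions stay distinguishable.
top_score = max(top_response)        # 1
bottom_score = max(bottom_response)  # 0
```

The identical pattern yields a score of 1 at the top and 0 at the bottom, so a position-independent pooling step after this layer no longer gives a position-independent result.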
Encoding absolute location has an effect on cropping: a region cropped after
the CNN can still carry absolute position information.
Robustness to image transformations can be achieved through
geometric data augmentation.
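As one example of geometric augmentation, a random translation can be sketched in plain Python. The function name `random_translate`, the zero fill for exposed pixels, and the list-of-lists image format are assumptions for illustration. A real pipeline would typically use a library transform instead.

```python
import random


def random_translate(image, max_shift, rng):
    """Shift a 2D list-of-lists image by a random offset, filling exposed pixels with 0."""
    height, width = len(image), len(image[0])
    dy = rng.randint(-max_shift, max_shift)  # vertical offset, inclusive bounds
    dx = rng.randint(-max_shift, max_shift)  # horizontal offset, inclusive bounds
    shifted = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            sy, sx = y - dy, x - dx  # source pixel that moves to (y, x)
            if 0 <= sy < height and 0 <= sx < width:
                shifted[y][x] = image[sy][sx]
    return shifted


rng = random.Random(0)  # seeded only to make the sketch repeatable
image = [[0, 0, 0, 0],
         [0, 1, 2, 0],
         [0, 3, 4, 0],
         [0, 0, 0, 0]]
augmented = random_translate(image, max_shift=1, rng=rng)
```

Applying a fresh random offset to every training sample exposes the network to the same content at different absolute positions.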
Pooling layers, such as max pooling or average pooling, reduce the spatial
dimensions of feature maps, which can cause the network to lose
precise spatial information and thus reduce translation equivariance.
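A small sketch of why strided pooling weakens equivariance. The helper `max_pool` and the window size and stride of 2 are assumptions for illustration: shifting the input by one position produces a pooled output that is not a shifted copy of the original pooled output.

```python
def max_pool(signal, size=2, stride=2):
    """1D max pooling over non-overlapping windows (for size == stride)."""
    return [max(signal[i:i + size])
            for i in range(0, len(signal) - size + 1, stride)]


signal = [0, 0, 5, 1, 0, 0, 0, 0]
shifted = [0] + signal[:-1]  # the same signal moved one step to the right

pooled = max_pool(signal)           # [0, 5, 0, 0]
pooled_shifted = max_pool(shifted)  # [0, 5, 1, 0]
```

The one-step shift changes how the values fall into pooling windows, so the value 1 survives in the second case but not in the first.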