PointNet: Deep Learning on
Point Sets for 3D Classification
and Segmentation
Point cloud is an important type of geometric data structure. Due to its irregular
format, most researchers transform such data to regular 3D voxel grids or
collection of images. This, however, renders data unnecessarily voluminous
and causes issues. This paper proposes a type of neural network that directly
consumes point clouds, which well respects the permutation invariance of
points in the input.
💡 Permutation Invariant refers to a property of a model where the output
remains unchanged regardless of the order of the input elements.
Introduction
Typical convolutional architectures require highly regular input data formats,
like those of image grids or 3D voxels, in order to perform weight sharing and
other kernel optimizations. This however renders the resulting data
unnecessarily voluminous.
Key Contributions:
We design a novel deep net architecture suitable for consuming unordered
point sets in 3D
We show how such a net can be trained to perform 3D shape classification,
shape part segmentation and scene semantic parsing tasks
We provide thorough empirical and theoretical analysis on the stability and
efficiency of our method
We illustrate the 3D features computed by the selected neurons in the net
and develop intuitive explanations for its performance.
Related Works
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation 1
, Point Cloud Features, Deep Learning on 3D Data, Deep Learning on
Unordered Sets
Problem Statement
A point cloud is represented as a set of 3D points where each point P is a
vector of its (x,y,z) coordinate plus extra feature channels such as color, normal
etc.
Deep Learning Point Sets
Property of Point Sets in R^n
The input is a subset of points from an Euclidean space. The 3 main properties:
1. Unordered. Unlike pixel arrays in images or voxels point cloud is a set of
points without specific order.
2. Interaction among points. The points are from a space with a distance
metric. It means that points are not isolated, and neighboring points form a
meaningful subset. The model needs to be able to capture local structures
from nearby points.
3. Invariance under transformations: As a geometric object, the learned
representation of the points set should be invariant to certain
transformations.
PointNet Architecture
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation 2
Point Sets for 3D Classification
and Segmentation
Point cloud is an important type of geometric data structure. Due to its irregular
format, most researchers transform such data to regular 3D voxel grids or
collection of images. This, however, renders data unnecessarily voluminous
and causes issues. This paper proposes a type of neural network that directly
consumes point clouds, which well respects the permutation invariance of
points in the input.
💡 Permutation Invariant refers to a property of a model where the output
remains unchanged regardless of the order of the input elements.
Introduction
Typical convolutional architectures require highly regular input data formats,
like those of image grids or 3D voxels, in order to perform weight sharing and
other kernel optimizations. This however renders the resulting data
unnecessarily voluminous.
Key Contributions:
We design a novel deep net architecture suitable for consuming unordered
point sets in 3D
We show how such a net can be trained to perform 3D shape classification,
shape part segmentation and scene semantic parsing tasks
We provide thorough empirical and theoretical analysis on the stability and
efficiency of our method
We illustrate the 3D features computed by the selected neurons in the net
and develop intuitive explanations for its performance.
Related Works
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation 1
, Point Cloud Features, Deep Learning on 3D Data, Deep Learning on
Unordered Sets
Problem Statement
A point cloud is represented as a set of 3D points where each point P is a
vector of its (x,y,z) coordinate plus extra feature channels such as color, normal
etc.
Deep Learning Point Sets
Property of Point Sets in R^n
The input is a subset of points from an Euclidean space. The 3 main properties:
1. Unordered. Unlike pixel arrays in images or voxels point cloud is a set of
points without specific order.
2. Interaction among points. The points are from a space with a distance
metric. It means that points are not isolated, and neighboring points form a
meaningful subset. The model needs to be able to capture local structures
from nearby points.
3. Invariance under transformations: As a geometric object, the learned
representation of the points set should be invariant to certain
transformations.
PointNet Architecture
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation 2