Structure from Motion (SfM)

Blog Post #5

Reconstructing a 3D point cloud and the camera poses from a given image set.

In this project, we implement the full Structure from Motion pipeline: two-view reconstruction, triangulation, PnP, and bundle adjustment. The camera intrinsic matrix K is assumed to be known.

1.1: Introduction


Structure from Motion (SfM) estimates the 3D structure of a scene from a set of 2D images. It reconstructs the scene and, at the same time, recovers the poses of a monocular camera with respect to that scene. The main pipeline of this project consists of the following steps:

Initialization:

  1. Detecting feature points in the first two images and matching them. The SIFT detector is used in this project.
  2. Finding the essential matrix E, which satisfies (y')^T E y = 0 for corresponding points y and y' in normalized image coordinates
  3. Recovering the camera pose from E to obtain the rotation R and translation T
  4. Obtaining the 3D points by triangulation
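
As a rough illustration of steps 2 to 4, here is a minimal NumPy sketch on synthetic data. It is not the project's code: the intrinsics, pose, and 3D point are made-up values, the helper names (`skew`, `project`, `triangulate_dlt`) are hypothetical, and it assumes the convention that a point X in the first camera frame maps to R X + T in the second, so that E = [T]x R. It checks the epipolar constraint for one correspondence and then recovers the 3D point by linear (DLT) triangulation.

```python
import numpy as np

def skew(t):
    """Return the 3x3 cross-product matrix [t]x, so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def project(P, X):
    """Project a 3D point X with a 3x4 projection matrix P to pixel coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def triangulate_dlt(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point seen at pixels x1 and x2."""
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The homogeneous solution is the right singular vector with the
    # smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

# Synthetic setup: every number below is an assumption for illustration only.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
theta = np.deg2rad(10.0)
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
T = np.array([-1.0, 0.0, 0.1])

P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])  # first camera at the origin
P2 = K @ np.hstack([R, T.reshape(3, 1)])           # second camera

X_true = np.array([0.3, -0.2, 5.0])
x1 = project(P1, X_true)
x2 = project(P2, X_true)

# Epipolar constraint (y')^T E y = 0 in normalized image coordinates.
E = skew(T) @ R
K_inv = np.linalg.inv(K)
y = K_inv @ np.append(x1, 1.0)
y_prime = K_inv @ np.append(x2, 1.0)
epipolar_residual = float(y_prime @ E @ y)

X_est = triangulate_dlt(P1, P2, x1, x2)
print("epipolar residual:", epipolar_residual)
print("triangulated point:", X_est)
```

On real images, E would instead be estimated from the SIFT matches. OpenCV provides `cv2.findEssentialMat`, `cv2.recoverPose`, and `cv2.triangulatePoints` for these steps, though whether they match what this project used is not stated here.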

Incremental Optimization (append more images to extend the reconstruction):

  1. Finding correspondences between the new image and the previous frame
  2. Estimating the new camera pose with Perspective-n-Point (PnP)
  3. Adding new 3D points by triangulation
  4. Minimizing the reprojection error with Bundle Adjustment (BA)
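
The quantity that bundle adjustment minimizes can be sketched as follows. This is only an illustration under assumptions, not the project's implementation: the function names `reprojection_residuals` and `rmse` are hypothetical, the pose is taken to map world points X to R X + t in the camera frame, and the data is synthetic. The sketch computes per-point pixel residuals and shows that a slightly wrong pose yields a larger error than the true one.

```python
import numpy as np

def reprojection_residuals(K, R, t, points_3d, points_2d):
    """Return (N, 2) pixel residuals: projected minus observed positions."""
    cam = points_3d @ R.T + t        # world frame -> camera frame, shape (N, 3)
    proj = cam @ K.T                 # apply the intrinsic matrix
    uv = proj[:, :2] / proj[:, 2:3]  # perspective division
    return uv - points_2d

def rmse(residuals):
    """Root-mean-square reprojection error in pixels."""
    return float(np.sqrt(np.mean(np.sum(residuals ** 2, axis=1))))

# Synthetic data: all values are assumptions for illustration only.
rng = np.random.default_rng(0)
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R_true = np.eye(3)
t_true = np.array([0.1, -0.05, 0.2])
points_3d = np.column_stack([
    rng.uniform(-1.0, 1.0, 20),
    rng.uniform(-1.0, 1.0, 20),
    rng.uniform(4.0, 6.0, 20),   # keep all points in front of the camera
])

# Noise-free observations generated with the true pose.
observations = reprojection_residuals(K, R_true, t_true, points_3d,
                                      np.zeros((20, 2)))

error_true = rmse(reprojection_residuals(K, R_true, t_true,
                                         points_3d, observations))
t_wrong = t_true + np.array([0.05, 0.0, 0.0])  # perturbed translation
error_wrong = rmse(reprojection_residuals(K, R_true, t_wrong,
                                          points_3d, observations))
print("RMSE with true pose:", error_true)
print("RMSE with perturbed pose:", error_wrong)
```

In a full BA step, residuals like these would be stacked over all cameras and points and handed to a nonlinear least-squares solver, for example `scipy.optimize.least_squares`, with the camera poses and 3D points as the unknowns.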

1.2: Input


The TempleRing dataset is used. It contains 46 images, 00.png to 45.png, each showing the object from a different viewpoint.


input images

1.3: Outputs



3D reconstruction of TempleRing and the camera poses, visualized in Python with the pyntcloud library. Green dots represent the TempleRing points; red dots represent the camera positions.


I also saved the 3D points in .ply format, which can be opened in any free online PLY file viewer.
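
For readers who want to produce a similar file, here is a minimal sketch of an ASCII PLY writer in plain Python. It is an assumption about how such an export could look, not the code used for this post: the `write_ply` helper, the sample coordinates, and the output file name are all hypothetical. It follows the colour scheme described above, green for scene points and red for camera positions.

```python
import os
import tempfile

import numpy as np

def write_ply(path, points, colors):
    """Write N coloured 3D points to an ASCII PLY file.

    points: (N, 3) float coordinates; colors: (N, 3) RGB values in 0..255.
    """
    points = np.asarray(points, dtype=float)
    colors = np.asarray(colors, dtype=np.uint8)
    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(points)}",
        "property float x",
        "property float y",
        "property float z",
        "property uchar red",
        "property uchar green",
        "property uchar blue",
        "end_header",
    ]
    with open(path, "w") as f:
        f.write("\n".join(header) + "\n")
        for (x, y, z), (r, g, b) in zip(points, colors):
            f.write(f"{x:.6f} {y:.6f} {z:.6f} {int(r)} {int(g)} {int(b)}\n")

# Made-up sample data: two scene points (green) and one camera centre (red).
scene_points = np.array([[0.0, 0.1, 5.0], [0.2, -0.1, 5.2]])
camera_centres = np.array([[0.0, 0.0, 0.0]])
points = np.vstack([scene_points, camera_centres])
colors = np.vstack([
    np.tile([0, 255, 0], (len(scene_points), 1)),
    np.tile([255, 0, 0], (len(camera_centres), 1)),
])

ply_path = os.path.join(tempfile.mkdtemp(), "reconstruction.ply")
write_ply(ply_path, points, colors)
print("wrote", ply_path)
```

The ASCII variant is larger on disk than binary PLY, but it is easy to inspect in a text editor and is widely supported by viewers.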


A video of the reconstructed 3D object.


ideal ground truth

Hope you enjoy this :)

Reference:
Project 4 coursework from HKUST ELEC6910A - First Principles of Computer Vision, instructor: Prof. Ping TAN, 2023 Fall