3D-Image Shape Representation

Photo by Reno Laithienne on Unsplash

Most real-world objects are three-dimensional, so it is important that we can represent shapes in 3D and train models on three-dimensional data.

3D vision is an active research area in computer vision, and it now combines traditional algorithms with deep learning. In this post, I will share the existing representations of 3D shapes.

These representations include:

1. Depth Map

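A depth map gives, for each pixel, the distance from the camera to the object in the real world. When a depth map is merged with the source (input) image, it forms an RGB-D image, often described as 2.5D because it only captures the surfaces visible from the camera. Depth maps can also be recorded directly by some types of 3D sensors. To create a depth map from a 2D image, a fully convolutional network can be used to predict it. However, the major problem with depth maps predicted from a single image is scale ambiguity: a small object close to the camera and a large object far away can look identical. This can be reduced by using a scale-invariant loss.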

Depth Map (Source: Google Images)
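To make the idea of a scale-invariant loss concrete, here is a minimal NumPy sketch. It assumes the commonly used log-depth formulation, where the squared mean of the log differences is subtracted from their mean square; the function name, the `lam` parameter and the toy depth values are my own choices for illustration, and depths are assumed to be strictly positive.

```python
import numpy as np

def scale_invariant_loss(pred_depth, true_depth, lam=1.0):
    """Scale-invariant error computed in log-depth space.

    lam=0 gives a plain mean squared log error; lam=1 makes the loss
    ignore any global scaling of the prediction.
    Assumes all depths are strictly positive.
    """
    d = np.log(pred_depth) - np.log(true_depth)
    return np.mean(d ** 2) - lam * np.mean(d) ** 2

# Toy 2x2 depth map (hypothetical values, in metres)
true_depth = np.array([[1.0, 2.0], [4.0, 8.0]])
# Prediction with the right relative structure but the wrong global scale
scaled_pred = 3.0 * true_depth

print(scale_invariant_loss(scaled_pred, true_depth, lam=1.0))  # close to zero
print(scale_invariant_loss(scaled_pred, true_depth, lam=0.0))  # penalises the scale
```

With `lam=1.0` the prediction is not punished for being three times too large everywhere, which is exactly the ambiguity a single image cannot resolve.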

2. Voxel Grid

A voxel grid represents a shape with a V x V x V grid of occupancies, where each cell records whether the shape occupies it. It’s like a segmentation mask in Mask R-CNN, but in 3D. Voxel shapes can be generated from a 2D input image by using a combination of 2D and 3D CNNs. They can also be generated by using a voxel tube representation model. Because memory grows cubically with V, high resolutions are expensive, and octrees can be used to scale voxel grids.

Voxel Grid (Source: Google Images)
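As a small illustration of what a V x V x V occupancy grid looks like in practice, the sketch below voxelizes a sphere with NumPy. The function name, the choice of a sphere and the default values are assumptions made only for this example.

```python
import numpy as np

def voxelize_sphere(V=16, radius=0.4):
    """Return a V x V x V boolean occupancy grid of a sphere
    centred in the unit cube [0, 1]^3."""
    # Coordinates of the voxel centres along one axis
    coords = (np.arange(V) + 0.5) / V
    x, y, z = np.meshgrid(coords, coords, coords, indexing="ij")
    dist = np.sqrt((x - 0.5) ** 2 + (y - 0.5) ** 2 + (z - 0.5) ** 2)
    # A voxel is occupied if its centre lies inside the sphere
    return dist <= radius

grid = voxelize_sphere(V=16)
print(grid.shape)
print(grid.sum(), "of", grid.size, "voxels occupied")
```

Doubling V multiplies the number of cells by eight, which is why high-resolution voxel grids quickly become expensive.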

3. Implicit Surface

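An implicit surface representation learns a function that classifies arbitrary points as inside or outside the shape. The term implicit means that the surface is not listed explicitly as points or faces. Instead, it is the set of points where the function crosses a chosen value, so the equation is not solved for any variable.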

Implicit Surface (Source: Google Images)
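The sketch below shows the inside/outside idea with a hand-written function instead of a learned one. It uses the signed distance to a sphere, which is negative inside and positive outside; in a learned implicit surface a neural network would play the role of `sphere_sdf`. All names here are hypothetical.

```python
import numpy as np

def sphere_sdf(points, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance to a sphere: negative inside, zero on the
    surface, positive outside."""
    points = np.asarray(points, dtype=float)
    return np.linalg.norm(points - np.asarray(center), axis=-1) - radius

def is_inside(points, f=sphere_sdf):
    """Classify arbitrary 3D points as inside (True) or outside (False)."""
    return f(points) < 0.0

queries = np.array([
    [0.0, 0.0, 0.0],   # centre of the sphere
    [0.5, 0.5, 0.5],   # still within the unit radius
    [1.0, 1.0, 1.0],   # beyond the surface
])
print(is_inside(queries))
```

Because the function can be queried at any point, the shape is not tied to a fixed resolution the way a voxel grid is.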

4. Point Cloud

A point cloud represents a shape as a set of P points in 3D space. Self-driving cars use a point cloud representation of the world around them. A popular neural network architecture for point clouds is PointNet.

Point Cloud (Source: Google Photos)
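A point cloud is an unordered set, so a network that consumes it should give the same answer no matter how the points are ordered. The toy sketch below illustrates the core idea behind PointNet-style models: apply the same layer to every point, then pool with a symmetric function such as max. This is not the actual PointNet architecture; the weights are random and every name is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

P = 128
cloud = rng.uniform(-1.0, 1.0, size=(P, 3))  # P points, each (x, y, z)

# One shared layer with random weights (a stand-in for a trained MLP)
W = rng.normal(size=(3, 64))
b = rng.normal(size=64)

def global_feature(points):
    """Apply the same layer to every point, then max-pool over points."""
    per_point = np.maximum(points @ W + b, 0.0)  # shape (P, 64), ReLU
    return per_point.max(axis=0)                 # shape (64,)

# Shuffling the points must not change the pooled feature
shuffled = cloud[rng.permutation(P)]
print(np.allclose(global_feature(cloud), global_feature(shuffled)))
```

Max pooling is what removes the dependence on point order; a sum or mean would work in the same way.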

5. Mesh

Meshes are commonly used in computer graphics. A mesh represents a 3D shape as a set of triangles, defined by vertices and the faces that connect them. A popular deep learning architecture for meshes is Pixel2Mesh. Graph convolutions can be used to predict triangle meshes.

Mesh (Source: Google Photos)
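Here is a minimal sketch of how a triangle mesh can be stored as two arrays, vertices and faces, together with a generic graph convolution step over the mesh edges. The update rule used (each vertex mixes its own feature with the sum of its neighbours' features) is one common way to write a graph convolution; whether it matches the exact layer in Pixel2Mesh is an assumption, and all names and weights are hypothetical.

```python
import numpy as np

# A unit square in the z = 0 plane, split into two triangles
vertices = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], dtype=float)
faces = np.array([[0, 1, 2], [0, 2, 3]])  # each row indexes three vertices

def surface_area(vertices, faces):
    """Sum of triangle areas, using half the cross-product norm."""
    a, b, c = (vertices[faces[:, i]] for i in range(3))
    return 0.5 * np.linalg.norm(np.cross(b - a, c - a), axis=1).sum()

def adjacency(num_vertices, faces):
    """Symmetric 0/1 matrix marking vertices that share a mesh edge."""
    A = np.zeros((num_vertices, num_vertices))
    for i, j in ((0, 1), (1, 2), (2, 0)):
        A[faces[:, i], faces[:, j]] = 1.0
        A[faces[:, j], faces[:, i]] = 1.0
    return A

def graph_conv(features, A, W_self, W_neigh):
    """Each vertex combines its own feature with its neighbours' sum."""
    return features @ W_self + A @ features @ W_neigh

A = adjacency(len(vertices), faces)
rng = np.random.default_rng(0)
W_self = rng.normal(size=(3, 8))
W_neigh = rng.normal(size=(3, 8))

print(surface_area(vertices, faces))
print(graph_conv(vertices, A, W_self, W_neigh).shape)
```

Here the vertex positions themselves are used as input features, so the layer maps each vertex from 3 to 8 dimensions while respecting the mesh connectivity.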




I write on Computer Vision, Deep Learning and Machine Learning techniques.

Ayoola Olaleye