Title: Object recognition from point cloud data


A major problem for visually impaired or blind people is grasping an object without knocking it over, e.g., picking up a cup of coffee. Grasping an object requires the following information: the size and position of the object, and a very simplified description of its shape. This project aims to detect, recognize and describe such objects from range data. The main purpose of this work is to provide visually impaired people with the information needed to grasp objects, in a highly simplified form, as shown in Fig. 1.


Unlike conventional techniques for 3-D object recognition, this work gives particular importance to the recognition of cylindrical objects such as cups, glasses, bottles and pencils, and of objects composed of planar segments, e.g., a small box. We construct a simplified geometric model of the objects from an unstructured set of feature points extracted from real-time video and range data. The recognition has to be fast and reliable, and its result must be conveyed to the user in a direct, intuitive way.

 Fig. 1: An example of 3-D object detection and recognition

We seek to approximate point clouds by a few very simple geometric primitives. One solution would be to use existing methods for accurate shape extraction and then apply geometric simplification techniques, e.g., based on triangulation [Ronfard96]. In this project, we propose instead to fit simple models directly to the data. This means we need to allow much larger tolerances than usual, which complicates algorithms such as RANSAC. One part of the solution is to perform the fitting using L1 norms rather than L2 norms, but this requires radically different algorithms [Veelaert13]. A second part of the solution is to exploit geometric relationships between objects in the model fitting. For example, indoor scenes often contain several almost parallel planes, perpendicular planes, coplanar poles, concentric cylinders, or other incidence relations.
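To illustrate why the choice of norm matters, here is a minimal sketch, not the algorithm of [Veelaert13]. It fits the simplest possible model, a horizontal plane z = c, to a handful of height samples. The sample values and function names are hypothetical and chosen only for illustration. For this one-parameter model, the L2 fit is the mean, the L1 fit is the median, and the L-infinity fit is the midrange.

```python
import statistics

def fit_height_l2(zs):
    """L2 fit: the constant c minimizing sum((z - c)**2) is the mean."""
    return sum(zs) / len(zs)

def fit_height_l1(zs):
    """L1 fit: the constant c minimizing sum(|z - c|) is the median."""
    return statistics.median(zs)

def fit_height_linf(zs):
    """L-infinity fit: the constant c minimizing max(|z - c|) is the midrange."""
    return (min(zs) + max(zs)) / 2.0

# Hypothetical height samples (metres): seven points on a table top at about
# 0.75 m, plus two points that actually belong to a cup standing on the table.
heights = [0.74, 0.75, 0.75, 0.76, 0.75, 0.74, 0.76, 0.85, 0.86]

l2 = fit_height_l2(heights)
l1 = fit_height_l1(heights)
linf = fit_height_linf(heights)

print(f"L2 (mean):           {l2:.3f}")
print(f"L1 (median):         {l1:.3f}")
print(f"L-infinity (midrange): {linf:.3f}")
```

In this toy case the two cup points pull the L2 estimate upward, while the L1 estimate stays on the table top. The L-infinity estimate is the most sensitive to the extreme points, but it is the natural choice when the question is whether all points lie within a given tolerance of the model.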

Work description:


  • Study the Kinect SDK and depth data
  • Study geometric relations between objects, e.g., three planes that meet at a point and are perpendicular to each other
  • Study relationships between objects whose position and shape are not precisely known.
  • Study finding (RANSAC, generalized Hough transform) and fitting (L1, L-infinity) algorithms


  • A method that takes advantage of geometric simplification
  • A method that embeds the results in applications in real environments, such as recognizing a cup on a table when a blind person reaches for a coffee cup at breakfast.
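The RANSAC and geometric-relation items above can be sketched on synthetic data. The following is a minimal illustration under stated assumptions, not the method to be developed in the project. The scene (a table top and a wall), the tolerance, the iteration count, and all function names are hypothetical. The sketch fits one plane with a deliberately loose tolerance, removes its inliers, fits a second plane, and then measures the angle between the two planes to test for near-perpendicularity.

```python
import math
import random

def plane_through(p, q, r):
    """Plane through three points as (unit normal n, offset d) with n . x = d.
    Returns None when the points are (nearly) collinear."""
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    length = math.sqrt(sum(c * c for c in n))
    if length < 1e-12:
        return None
    n = tuple(c / length for c in n)
    return n, sum(n[i] * p[i] for i in range(3))

def distance(plane, pt):
    """Unsigned distance from a point to a plane (n, d)."""
    n, d = plane
    return abs(n[0] * pt[0] + n[1] * pt[1] + n[2] * pt[2] - d)

def ransac_plane(points, tolerance, iterations=200, seed=0):
    """Basic RANSAC: keep the 3-point plane with the most inliers."""
    rng = random.Random(seed)
    best_plane, best_inliers = None, []
    for _ in range(iterations):
        plane = plane_through(*rng.sample(points, 3))
        if plane is None:
            continue
        inliers = [pt for pt in points if distance(plane, pt) <= tolerance]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers

def angle_between_planes_deg(n1, n2):
    """Angle in [0, 90] degrees between two planes given their unit normals."""
    c = abs(sum(a * b for a, b in zip(n1, n2)))
    return math.degrees(math.acos(min(1.0, c)))

# Hypothetical synthetic scene (metres): a table top near z = 0 and a wall
# near x = 0, each with a few millimetres of noise.
gen = random.Random(1)
table = [(gen.uniform(0.1, 1.0), gen.uniform(0.0, 1.0), gen.uniform(-0.002, 0.002))
         for _ in range(200)]
wall = [(gen.uniform(-0.002, 0.002), gen.uniform(0.0, 1.0), gen.uniform(0.1, 1.0))
        for _ in range(120)]
points = table + wall

TOLERANCE = 0.02  # assumed loose tolerance, much larger than the noise

plane1, inliers1 = ransac_plane(points, TOLERANCE)
used = set(inliers1)
rest = [pt for pt in points if pt not in used]
plane2, inliers2 = ransac_plane(rest, TOLERANCE)

angle = angle_between_planes_deg(plane1[0], plane2[0])
print(f"plane 1: {len(inliers1)} inliers, plane 2: {len(inliers2)} inliers")
print(f"angle between planes: {angle:.1f} degrees")
```

A fuller treatment would feed the detected relation back into the fit, for example by constraining the second normal to be perpendicular to the first. This sketch only detects the relation after two independent fits.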


Student prerequisites

This subject is intended for Vietnamese as well as foreign students at Master's level in the Signal and Image Processing option. Students with a fairly good knowledge of image processing and C++ programming are preferred.


Dr. Vu Hai


[Ronfard96] R. Ronfard et al., “Full-range approximation of triangulated polyhedra”, Proc. Eurographics, Computer Graphics Forum, 1996, 15: C67-C76

[Veelaert13] P. Veelaert et al., “Concurrency relations between digital planes”, Proc. Discrete Geometry for Computer Imagery, 2013, pp. 347-335