Title: Learning and Matching of Dynamic Manifold from RGB-D Data for Human Action Recognition


Human action recognition enables many applications, ranging from video surveillance to gaming. However, accurate recognition is highly challenging because of cluttered backgrounds, occlusions and viewpoint variations. Conventional methods extract handcrafted spatial and motion features to represent actions, then apply a recognition algorithm to classify them [1-4]. In real-world scenarios, it is very difficult to know which features are important, since the choice of features is highly problem-dependent. In addition, in a system using multiple sensors, the human shape as well as the action appears different from each sensor's viewpoint.

The human body is an articulated object with many degrees of freedom. However, despite the high dimensionality of the configuration space, human actions lie on a low-dimensional manifold. We rely on this observation and, in this work, examine the use of manifolds to represent the dynamic shape of the human body during an action [7, 8].
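The low-dimensionality claim can be illustrated with a toy example. The sketch below is not taken from [7] or [8]: it assumes synthetic 20-dimensional "pose" vectors driven by a single phase angle, and it uses plain PCA (power iteration with deflation) as a stand-in for a manifold learning method. All function names are hypothetical.

```python
import math
import random

def synthetic_poses(n_frames=60, dim=20, seed=0):
    """Toy 'pose' vectors: every frame is driven by one phase angle (assumption)."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(dim)]
    b = [rng.gauss(0, 1) for _ in range(dim)]
    frames = []
    for t in range(n_frames):
        theta = 2 * math.pi * t / n_frames
        frames.append([a[i] * math.cos(theta) + b[i] * math.sin(theta)
                       for i in range(dim)])
    return frames

def covariance(frames):
    """Sample covariance matrix of the frames (rows are observations)."""
    n, dim = len(frames), len(frames[0])
    mean = [sum(f[i] for f in frames) / n for i in range(dim)]
    centered = [[f[i] - mean[i] for i in range(dim)] for f in frames]
    return [[sum(c[i] * c[j] for c in centered) / n for j in range(dim)]
            for i in range(dim)]

def top_eigenvalues(cov, k, iters=500):
    """Leading k eigenvalues by power iteration with deflation."""
    dim = len(cov)
    cov = [row[:] for row in cov]  # work on a copy
    rng = random.Random(1)
    values = []
    for _ in range(k):
        v = [rng.gauss(0, 1) for _ in range(dim)]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate for direction v
        lam = sum(v[i] * sum(cov[i][j] * v[j] for j in range(dim))
                  for i in range(dim))
        values.append(lam)
        # Deflate: remove the component just found
        for i in range(dim):
            for j in range(dim):
                cov[i][j] -= lam * v[i] * v[j]
    return values

frames = synthetic_poses()
cov = covariance(frames)
total_variance = sum(cov[i][i] for i in range(len(cov)))
explained = sum(top_eigenvalues(cov, 2)) / total_variance
print(f"variance explained by 2 of 20 dimensions: {explained:.4f}")
```

By construction the toy data spans only a two-dimensional subspace, so the explained fraction should be close to 1. Real silhouettes or depth maps are not linear in this way, which is why nonlinear manifold learning is of interest.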

The main objective is to determine which type of manifold model is suitable for representing human actions, and then to learn this model from examples for recognition. Variations in viewpoint and scale should also be analyzed.

Previous works that use manifolds for human action representation rely on conventional RGB data. In this work, we would like to use RGB-D data captured by a Kinect sensor. The difficulty is how to combine depth information with RGB to obtain an efficient human action recognition algorithm. The developed method will be tested on benchmark datasets such as MSRAction3D and MSRGestures, and on fall actions [6].
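One simple baseline for combining the two modalities is feature-level (early) fusion. The sketch below is only an illustration under stated assumptions, not the method this project will propose: each modality is assumed to already yield a per-frame feature vector, and the function names, the `depth_weight` parameter and the sample values are all hypothetical.

```python
import math

def l2_normalise(vec):
    """Scale a vector to unit length so both modalities are comparable."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def fuse_rgbd(rgb_feat, depth_feat, depth_weight=0.5):
    """Early fusion: weighted concatenation of per-modality unit vectors.

    depth_weight in [0, 1] is a hypothetical tuning parameter; the square
    roots keep the fused vector at unit length.
    """
    w_rgb = math.sqrt(1.0 - depth_weight)
    w_depth = math.sqrt(depth_weight)
    return ([w_rgb * x for x in l2_normalise(rgb_feat)]
            + [w_depth * x for x in l2_normalise(depth_feat)])

rgb_silhouette = [0.0, 3.0, 4.0]   # hypothetical RGB silhouette descriptor
depth_profile = [1200.0, 900.0]    # hypothetical depth values in millimetres
fused = fuse_rgbd(rgb_silhouette, depth_profile)
print(fused)
```

Normalising each modality first prevents the raw depth values, which are much larger in magnitude, from dominating the fused descriptor. The fused vectors could then be passed to a manifold learning step.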


Figure 1: Human daily activities from Kinect

(Source: [8])


Work description:


  • Study manifolds and their use for human action recognition
  • Propose a method that combines depth and RGB images for learning and matching action manifolds
  • Implement the proposed method in C/C++ and Matlab with OpenCV and PCL
  • Test and evaluate the method
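The matching step in the list above can be sketched as nearest-neighbour classification of embedded trajectories. This is a minimal illustration, assuming each action has already been projected to a sequence of low-dimensional points; the mean Hausdorff-style distance is one possible choice and is not claimed to be the measure used in [8]. The labels "wave" and "walk" and the toy trajectories are hypothetical.

```python
import math

def directed_mean_distance(traj_a, traj_b):
    """Average distance from each point of traj_a to its nearest point of traj_b."""
    return sum(min(math.dist(p, q) for q in traj_b) for p in traj_a) / len(traj_a)

def trajectory_distance(traj_a, traj_b):
    """Symmetric, Hausdorff-style distance between two embedded trajectories."""
    return max(directed_mean_distance(traj_a, traj_b),
               directed_mean_distance(traj_b, traj_a))

def classify(query, gallery):
    """Return the label of the gallery trajectory closest to the query."""
    return min(gallery, key=lambda label: trajectory_distance(query, gallery[label]))

def circle(radius, n=30, phase=0.0):
    """Toy closed trajectory, standing in for a periodic action."""
    return [(radius * math.cos(2 * math.pi * t / n + phase),
             radius * math.sin(2 * math.pi * t / n + phase)) for t in range(n)]

def line(length, n=30):
    """Toy open trajectory, standing in for a non-periodic action."""
    return [(length * t / (n - 1), 0.0) for t in range(n)]

gallery = {"wave": circle(1.0), "walk": line(2.0)}
# Query differs from the "wave" exemplar in scale, frame count and start phase
query = circle(1.1, n=25, phase=0.3)
print(classify(query, gallery))
```

Because the distance compares point sets rather than frame-by-frame positions, it tolerates different sequence lengths and starting phases, at the cost of ignoring temporal order.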


Student prerequisites

Knowledge of image processing, computer vision, machine learning

Capable of programming in C/C++ with Microsoft Visual Studio


Dr. Tran Thi Thanh Hai


[1] Md. Atiqur Rahman Ahad, J. K. Tan, H. Kim, Motion history image: its variants and applications, Machine Vision and Applications.

[2] J.K. Aggarwal and Lu Xia, Spatio-Temporal Depth Cuboid Similarity Feature for Activity Recognition Using Depth Camera, 24th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Portland, Oregon, June 2013. 

[3] Lu Xia, Chia-Chih Chen and J.K. Aggarwal, View Invariant Human Action Recognition Using Histograms of 3D Joints, International Workshop on Human Activity Understanding from 3D Data in conjunction with 23rd IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, June 2012.

[4] J.K. Aggarwal and Lu Xia, Human Activity Recognition From 3D Data: A Review, Pattern Recognition Letters Special Issue, 2014, http://research.microsoft.com/en-us/um/people/zliu/ActionRecoRsrc/

[5] Ji et al, 3D Convolutional Neural Networks for Human Action Recognition, ICML 2010, http://www.dbs.ifi.lmu.de/~yu_k/icml2010_3dcnn.pdf

[6] http://research.microsoft.com/en-us/um/people/zliu/actionrecorsrc/

[7] Elgammal, A., & Lee, C.-S. (2008). The Role of Manifold Learning in Human Motion Analysis. In Human Motion, Computational Imaging and Vision series, 36, 25–56.

[8] Wang, L., & Suter, D. (2007). Learning and Matching of Dynamic Shape Manifolds for Human Action Recognition. IEEE Transactions on Image Processing, 16(6), 1646–1661.