Online robust action recognition based on a hierarchical model
- Original Article
- Volume 30, pages 1021–1033 (2014)
- Xinbo Jiang, Fan Zhong, Qunsheng Peng & Xueying Qin
Abstract
Action recognition based solely on video data is known to be very sensitive to background activity and lacks the ability to discriminate complex 3D motion. With the development of commercial depth cameras, skeleton-based action recognition is becoming more and more popular. However, the skeleton-based approach remains very challenging because of the large variation in human actions and temporal dynamics. In this paper, we propose a hierarchical model for action recognition. To handle confusing motions, a motion-based grouping method is proposed that efficiently assigns each video a group label; then, for each group, a pre-trained classifier is used for frame labeling. Unlike previous methods, we adopt a bottom-up approach that first performs action recognition for each frame. The final action label is obtained by fusing the classifications of its frames, with the effect of each frame adaptively adjusted according to its local properties. To achieve online real-time performance and suppress noise, a bag-of-words representation is used for the classification features. The proposed method is evaluated on two challenging datasets captured with a Kinect. Experiments show that our method can robustly recognize actions in real time.
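To make the two-level pipeline concrete, the following is a minimal sketch of the scheme the abstract describes: a group label is assigned first, a group-specific classifier labels each frame, and the per-frame results are fused into a sequence-level decision. All names (recognize_sequence, group_classifier, frame_classifiers) and the confidence-vote fusion are illustrative assumptions; the paper's actual features, classifiers, and adaptive weighting are not reproduced here.

```python
import numpy as np

def recognize_sequence(frames, group_classifier, frame_classifiers, n_actions):
    """frames: (T, D) array of per-frame skeleton features (hypothetical)."""
    # Level 1: assign the whole sequence to a motion group, so that
    # easily confused motions are separated before fine classification.
    g = group_classifier(frames)
    # Level 2: label every frame with the group's pre-trained classifier,
    # then fuse the per-frame results into a sequence-level decision.
    votes = np.zeros(n_actions)
    for f in frames:
        label, weight = frame_classifiers[g](f)
        # 'weight' stands in for the paper's adaptive adjustment of each
        # frame's effect based on its local properties.
        votes[label] += weight
    return int(np.argmax(votes))

# Toy usage with stand-in classifiers:
rng = np.random.default_rng(0)
frames = rng.random((30, 60))                         # 30 frames, 60-D features
group_clf = lambda seq: 0                             # always picks group 0
frame_clfs = {0: lambda f: (int(f[0] * 4) % 4, 1.0)}  # 4 actions, unit weight
print(recognize_sequence(frames, group_clf, frame_clfs, n_actions=4))
```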
Notes
In our implementation, we adopt the standard KNN implementation in OpenCV.
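For reference, here is a minimal, self-contained example of the standard KNN classifier in OpenCV's cv2.ml module (Python bindings). The 32-dimensional random training data and the choice of k = 5 are synthetic placeholders, not the features or parameters used in the paper.

```python
import numpy as np
import cv2

# Synthetic stand-in data: 100 training vectors of dimension 32, 4 classes.
train = np.random.rand(100, 32).astype(np.float32)
labels = np.random.randint(0, 4, (100, 1)).astype(np.int32)

# OpenCV's standard KNN classifier, as referenced in the note.
knn = cv2.ml.KNearest_create()
knn.train(train, cv2.ml.ROW_SAMPLE, labels)

# Classify one query vector using its 5 nearest neighbours.
query = np.random.rand(1, 32).astype(np.float32)
_, result, neighbours, dist = knn.findNearest(query, 5)
print("predicted class:", int(result[0, 0]))
```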
Acknowledgments
The authors gratefully acknowledge the anonymous reviewers for their comments, which helped improve this paper, and also thank Guofeng Wang for his enormous help in revising it. This work was supported by the 973 Program of China (No. 2009CB320802), the NSF of China (Nos. U1035004, 61173070, 61202149), and Key Projects in the National Science & Technology Pillar Program (No. 2013BAH39F00).
Cite this article
Jiang, X., Zhong, F., Peng, Q. et al. Online robust action recognition based on a hierarchical model. Vis Comput 30, 1021–1033 (2014). https://doi.org/10.1007/s00371-014-0923-8