Bux, Allah and Angelov, Plamen and Habib, Zulfiqar (2017) Vision-based human action recognition using machine learning techniques. PhD thesis, Lancaster University.
2017AllahBuxPhD.pdf - Published Version
Available under License Creative Commons Attribution-NoDerivs.
Download (19MB)
Abstract
The focus of this thesis is on automatic recognition of human actions in videos. Human action recognition is defined as automatic understating of what actions occur in a video performed by a human. This is a difficult problem due to the many challenges including, but not limited to, variations in human shape and motion, occlusion, cluttered background, moving cameras, illumination conditions, and viewpoint variations. To start with, The most popular and prominent state-of-the-art techniques are reviewed, evaluated, compared, and presented. Based on the literature review, these techniques are categorized into handcrafted feature-based and deep learning-based approaches. The proposed action recognition framework is then based on these handcrafted and deep learning based techniques, which are then adopted throughout the thesis by embedding novel algorithms for action recognition, both in the handcrafted and deep learning domains. First, a new method based on handcrafted approach is presented. This method addresses one of the major challenges known as “viewpoint variations” by presenting a novel feature descriptor for multiview human action recognition. This descriptor employs the region-based features extracted from the human silhouette. The proposed approach is quite simple and achieves state-of-the-art results without compromising the efficiency of the recognition process which shows its suitability for real-time applications. Second, two innovative methods are presented based on deep learning approach, to go beyond the limitations of handcrafted approach. The first method is based on transfer learning using pre-trained deep learning model as a source architecture to solve the problem of human action recognition. It is experimentally confirmed that deep Convolutional Neural Network model already trained on large-scale annotated dataset is transferable to action recognition task with limited training dataset. The comparative analysis also confirms its superior performance over handcrafted feature-based methods in terms of accuracy on same datasets. The second method is based on unsupervised deep learning-based approach. This method employs Deep Belief Networks (DBNs) with restricted Boltzmann machines for action recognition in unconstrained videos. The proposed method automatically extracts suitable feature representation without any prior knowledge using unsupervised deep learning model. The effectiveness of the proposed method is confirmed with high recognition results on a challenging UCF sports dataset. Finally, the thesis is concluded with important discussions and research directions in the area of human action recognition.