Sense Ph.D. student Igor Bastos was selected to present his novel approach for gesture recognition at the 15th IEEE International Conference on Advanced Video and Signal-based Surveillance (AVSS). Sponsored by the Institute of Electrical and Electronics Engineers (IEEE), AVSS 2018 will be held in Auckland, New Zealand, from 27 to 30 November.
Because of its applicability in contexts such as surveillance and biometric authentication, gesture recognition has been widely studied, often with a focus on capturing motion and appearance in videos. However, most previous work does not properly exploit the well-defined temporal structure of gestures and does not scale well to large numbers of gestures.
In his research, Bastos proposes MultiOutput Recurrent Autoencoders (MORA), an approach that represents each gesture class independently. MORA employs a dedicated autoencoder model per class, composed of a 3D convolutional layer and a Gated Recurrent Unit (GRU) layer, which allows it to extract spatiotemporal information and to scale with the number of classes.
“We used a somewhat different strategy from what researchers typically use for gesture recognition. Usually, discriminative models based on traditional supervised classification are employed. In our case, we adapt unsupervised models and set up a kind of competition between them. The main advantages of the approach are its scalability with respect to the number of gestures and its robustness to class imbalance. It is a very different strategy from the literature and it has achieved good results”, explains Igor.
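The idea of one model per class competing for each sample can be illustrated with a minimal Python sketch. It is not the MORA implementation: the `MeanReconstructor` class is a hypothetical toy stand-in for a per-class autoencoder (it has no 3D convolutional or GRU layer), the feature vectors are made up, and the rule that the lowest reconstruction error wins is an assumption, since the article does not say how the competition is decided.

```python
from statistics import fmean


class MeanReconstructor:
    """Toy stand-in for one per-class autoencoder (hypothetical, not MORA's model).

    It "reconstructs" any input as the mean of its own class's training samples.
    """

    def fit(self, samples):
        # samples: equal-length feature vectors belonging to ONE gesture class
        self.mean = [fmean(column) for column in zip(*samples)]
        return self

    def reconstruction_error(self, x):
        # Mean squared error between the input and this model's reconstruction
        return fmean((a - b) ** 2 for a, b in zip(x, self.mean))


def fit_per_class(training_data):
    # One independent model per class; each sees only its own class's data
    return {label: MeanReconstructor().fit(samples)
            for label, samples in training_data.items()}


def classify(models, x):
    # Assumed competition rule: the class whose model reconstructs x best wins
    return min(models, key=lambda label: models[label].reconstruction_error(x))


# Made-up feature vectors, purely for illustration
training_data = {
    "wave": [[1.0, 0.0, 1.0, 0.0], [0.9, 0.1, 1.1, 0.0]],
    "circle": [[0.0, 1.0, 0.0, 1.0], [0.1, 0.9, 0.0, 1.1]],
}
models = fit_per_class(training_data)
prediction = classify(models, [1.0, 0.1, 0.9, 0.0])

# Adding a gesture means fitting one extra model; existing models are untouched
models["push"] = MeanReconstructor().fit([[1.0, 1.0, 1.0, 1.0]])
prediction_new = classify(models, [0.9, 1.0, 1.1, 1.0])
```

Because every model is trained only on its own class, adding a gesture does not require retraining the others, and a class with few samples is not directly outvoted by larger classes during training. That is one way to read the scalability and class-imbalance claims in the quote.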
To validate MORA, experiments were conducted on the SKIG and ChaLearn IsoGD datasets, on which the approach achieved accuracies comparable to those of state-of-the-art methods.
The research received financial support from the National Council for Scientific and Technological Development (CNPq), the Minas Gerais State Agency for Research and Development (FAPEMIG), and the Coordination for the Improvement of Higher Education Personnel (CAPES) through the DeepEyes Project.
Igor Bastos holds a degree in Computer Engineering from the State University of Feira de Santana (UEFS) and a Master’s Degree in Computer Science from the Federal University of Bahia (UFBA), with the dissertation “Libras Sign Recognition Using Form Descriptors and Artificial Neural Networks”. He has experience in Computer Vision, working mainly on image processing, neural networks and cephalometry.
Led by Igor Bastos, the research also had contributions from Ph.D. students Victor Hugo Melo and Gabriel Resende Goncalves. Professor William Robson Schwartz, coordinator of the Smart Sense Laboratory, guided the efforts. The publication can be accessed here.
Now in its 15th edition, the International Conference on Advanced Video and Signal-based Surveillance is the premier conference in the field of video and signal-based surveillance. It brings together experts from academia, industry, and government to advance theories, methods, systems, and applications related to surveillance. Since its creation, the event has been held on four continents.