Source code of the spatiotemporal feature descriptor proposed in "Optical Flow Co-occurrence Matrices: A Novel Spatiotemporal Feature Descriptor" (ICPR 2016). Aiming to capture more information from the optical flow, this work proposes a novel spatiotemporal local feature descriptor called Optical Flow Co-occurrence Matrices (OFCM). The method extracts Haralick features from co-occurrence matrices computed over the optical flow field.
Our hypothesis in designing the OFCM is that the motion information in a video sequence can be described by the spatial relationships contained in local neighborhoods of the flow field. More specifically, we assume that the motion information is adequately captured by a set of magnitude and orientation co-occurrence matrices computed for various angular relationships at a given offset between neighboring vector pairs in the optical flow. Therefore, matrices obtained by modifying the spatial relationship (different angles or distances between the magnitudes and orientations of neighboring flow vectors) provide complementary information.
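The idea above can be sketched in a few lines of NumPy. This is a minimal illustration, not the released implementation: the number of quantization levels, the set of offsets, and the particular Haralick statistics shown here are assumptions chosen for brevity.

```python
import numpy as np

def cooccurrence_matrix(q, levels, offset):
    """Normalized co-occurrence matrix of a quantized 2-D field
    for a single (dy, dx) spatial offset."""
    dy, dx = offset
    h, w = q.shape
    # values at positions (y, x) and their neighbors at (y+dy, x+dx)
    a = q[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    b = q[max(0, dy):h - max(0, -dy), max(0, dx):w - max(0, -dx)]
    m = np.zeros((levels, levels))
    np.add.at(m, (a.ravel(), b.ravel()), 1)
    return m / max(m.sum(), 1.0)

def haralick_features(p):
    """A few classic Haralick statistics of a co-occurrence matrix:
    contrast, energy, entropy, homogeneity."""
    i, j = np.indices(p.shape)
    contrast = np.sum(p * (i - j) ** 2)
    energy = np.sum(p ** 2)
    entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
    homogeneity = np.sum(p / (1.0 + np.abs(i - j)))
    return np.array([contrast, energy, entropy, homogeneity])

def ofcm_descriptor(flow, mag_levels=8, ori_levels=8,
                    offsets=((0, 1), (1, 1), (1, 0), (1, -1))):
    """OFCM-style descriptor for one dense flow field (H x W x 2):
    quantize flow magnitude and orientation, build one co-occurrence
    matrix per field per offset, and concatenate Haralick features."""
    mag = np.hypot(flow[..., 0], flow[..., 1])
    ori = np.arctan2(flow[..., 1], flow[..., 0])          # in [-pi, pi]
    qm = np.minimum((mag / (mag.max() + 1e-8) * mag_levels).astype(int),
                    mag_levels - 1)
    qo = ((ori + np.pi) / (2 * np.pi) * ori_levels).astype(int) % ori_levels
    feats = []
    for off in offsets:
        feats.append(haralick_features(cooccurrence_matrix(qm, mag_levels, off)))
        feats.append(haralick_features(cooccurrence_matrix(qo, ori_levels, off)))
    return np.concatenate(feats)
```

In practice the flow field would come from a dense optical flow method (e.g. OpenCV's Farnebäck flow), and the descriptor would be computed per spatiotemporal block rather than over the whole frame.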
To demonstrate the effectiveness of the OFCM, we evaluate it on the action recognition task, a challenging problem that has attracted the attention of the research community for several years due to its practical, real-world applications. For instance, it can be employed in surveillance systems to detect and prevent abnormal or suspicious activities, and in health care systems to monitor patients performing activities of daily living. Although we evaluate our method on the action recognition task, it is important to emphasize that, since the OFCM is a spatiotemporal feature descriptor, it can also be applied to other computer vision applications involving video description. According to the experimental results, the proposed feature, aggregated using a standard visual recognition pipeline (Bag-of-Words followed by an SVM classifier), is able to recognize actions accurately on three well-known datasets: KTH, UCF Sports and HMDB51. The OFCM outperforms the results achieved by several widely employed spatiotemporal descriptors available in the literature.
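The aggregation pipeline mentioned above (Bag-of-Words followed by an SVM) can be sketched with scikit-learn. This is an illustrative toy setup with random descriptors standing in for per-video OFCM features; the codebook size, kernel, and data are assumptions, not the parameters used in the paper.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def bow_histogram(descriptors, codebook):
    """Quantize local descriptors against a learned codebook and
    return an L1-normalized bag-of-words histogram."""
    words = codebook.predict(descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)

# Toy data: 10 "videos", each a set of 50 local 16-D descriptors.
# Two synthetic classes are separated by a mean shift.
rng = np.random.default_rng(0)
train_desc = [rng.normal(size=(50, 16)) + shift
              for shift in (0.0, 3.0) for _ in range(5)]
train_labels = np.array([0] * 5 + [1] * 5)

# Learn the visual vocabulary on all local descriptors pooled together.
codebook = KMeans(n_clusters=8, n_init=10, random_state=0)
codebook.fit(np.vstack(train_desc))

# One BoW histogram per video, then a standard SVM on top.
X = np.array([bow_histogram(d, codebook) for d in train_desc])
clf = SVC(kernel="rbf").fit(X, train_labels)
```

In the real pipeline, the local descriptors would be OFCM features extracted around sampled spatiotemporal points, and the SVM would be trained and tested using each dataset's standard protocol.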
A considerable improvement was obtained with the OFCM, which reaches 96.30% accuracy on the KTH dataset and 92.80% on the UCF Sports dataset. This is an improvement of 2.10 percentage points (p.p.) on KTH and 3.80 p.p. on UCF Sports over the Dense Trajectories (DT) method of Wang et al.
For the experiments on the HMDB51 dataset, we used the parameters learned on the KTH dataset, which, according to the literature, are general enough to yield accurate results. On this dataset the OFCM also achieves the best result, reaching 56.91% accuracy, an improvement of 5.41 p.p. over the MBH feature descriptor.
Download the code from the SSIGLib library.
Contact: Carlos Caetano (firstname.lastname@example.org)
If you use this code or parts of it, please cite the following paper.