Less is More: Micro-expression Recognition from Video using Apex Frame (1606.01721v3)

Published 6 Jun 2016 in cs.CV

Abstract: Despite recent interest and advances in facial micro-expression research, there is still plenty of room for improvement in terms of micro-expression recognition. Conventional feature extraction approaches for micro-expression video consider either the whole video sequence or a part of it, for representation. However, with the high-speed video capture of micro-expressions (100-200 fps), are all frames necessary to provide a sufficiently meaningful representation? Is the luxury of data a bane to accurate recognition? A novel proposition is presented in this paper, whereby we utilize only two images per video: the apex frame and the onset frame. The apex frame of a video contains the highest intensity of expression changes among all frames, while the onset is the perfect choice of a reference frame with neutral expression. A new feature extractor, Bi-Weighted Oriented Optical Flow (Bi-WOOF) is proposed to encode essential expressiveness of the apex frame. We evaluated the proposed method on five micro-expression databases: CAS(ME)$^2$, CASME II, SMIC-HS, SMIC-NIR and SMIC-VIS. Our experiments lend credence to our hypothesis, with our proposed technique achieving a state-of-the-art F1-score recognition performance of 61% and 62% in the high frame rate CASME II and SMIC-HS databases respectively.

Citations (285)


Summary

The paper introduces the Bi-WOOF feature extractor, which utilizes both onset and apex frames to capture subtle micro-expressive dynamics.
It demonstrates improved recognition performance with F1-scores reaching up to 0.62 on high frame rate datasets.
The approach reduces computational load, opening avenues for real-time applications in security and psychological assessment.

Overview of "Less is More: Micro-expression Recognition from Video using Apex Frame"

The paper "Less is More: Micro-expression Recognition from Video using Apex Frame," by Sze-Teng Liong et al., presents a method for recognizing micro-expressions in video using only two frames per sequence: the onset frame and the apex frame. It challenges conventional approaches that extract features from the entire video sequence (or a large part of it), arguing that a compact two-frame representation is enough to capture the subtle facial motion involved.

Key Contribution

The principal contribution of this paper is the Bi-Weighted Oriented Optical Flow (Bi-WOOF) feature extractor. Bi-WOOF encodes the motion between the onset frame, taken as the neutral baseline, and the apex frame, where the expression's intensity peaks. Optical flow, which estimates pixel-level displacement between two frames, is weighted in two ways. Locally, each pixel's contribution is weighted by its optical flow magnitude. Globally, each block of the face is weighted by optical strain, a quantity derived from the spatial derivatives of the flow that indicates how strongly the face deforms in that region. Together, the two weightings emphasize the small but meaningful movements that characterize micro-expressions.
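The two weighting schemes described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the onset-to-apex flow field `(u, v)` has already been computed by some optical flow method, and the function name `bi_woof`, the 5x5 block grid, the 8 orientation bins, and the use of the mean strain per block as the global weight are all illustrative assumptions.

```python
import numpy as np


def bi_woof(u, v, grid=(5, 5), bins=8):
    """Illustrative Bi-WOOF-style descriptor (hypothetical helper).

    u, v : 2-D arrays with the horizontal and vertical optical flow
           from the onset frame to the apex frame.
    grid : number of blocks along (rows, cols); an assumed default.
    bins : number of orientation bins; an assumed default.
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)

    # Local weight: flow magnitude at each pixel.
    magnitude = np.hypot(u, v)
    # Flow direction in [-pi, pi].
    orientation = np.arctan2(v, u)

    # Optical strain magnitude from the spatial derivatives of the flow.
    # np.gradient returns derivatives along axis 0 (y) and axis 1 (x).
    du_dy, du_dx = np.gradient(u)
    dv_dy, dv_dx = np.gradient(v)
    shear = 0.5 * (du_dy + dv_dx)
    strain = np.sqrt(du_dx**2 + dv_dy**2 + 2.0 * shear**2)

    # Map each orientation to a histogram bin; pi itself goes to the last bin.
    bin_idx = np.floor((orientation + np.pi) / (2 * np.pi) * bins).astype(int)
    bin_idx = np.clip(bin_idx, 0, bins - 1)

    row_groups = np.array_split(np.arange(u.shape[0]), grid[0])
    col_groups = np.array_split(np.arange(u.shape[1]), grid[1])

    features = []
    for rows in row_groups:
        for cols in col_groups:
            block = np.ix_(rows, cols)
            # Orientation histogram, locally weighted by flow magnitude.
            hist = np.bincount(
                bin_idx[block].ravel(),
                weights=magnitude[block].ravel(),
                minlength=bins,
            )
            # Global weight: mean optical strain of the block (assumption).
            features.append(strain[block].mean() * hist)

    return np.concatenate(features)
```

With the assumed defaults, the descriptor has 5 x 5 x 8 = 200 dimensions. Note that a rigid translation of the whole face has zero strain, so under this sketch it contributes nothing to the descriptor, which is the intent of the global weighting: regions that deform count more than regions that merely shift.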

Empirical Evidence

The paper's hypothesis is tested on five micro-expression datasets: CAS(ME)$^2$, CASME II, SMIC-HS, SMIC-NIR, and SMIC-VIS. The results show that apex-based frame selection with Bi-WOOF outperforms several state-of-the-art recognition methods on the high frame rate datasets, with F1-scores of 0.61 on CASME II and 0.62 on SMIC-HS. Because only two frames per video are processed, the apex-based method also requires much less processing time than whole-sequence methods, which makes it more practical to deploy at scale.
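For readers unfamiliar with the metric, the following is a minimal sketch of how an F1-score can be computed from predicted and true emotion labels. It assumes macro-averaging over classes, which is one common convention; the averaging scheme used for the figures above is not stated in this summary, and the function name `macro_f1` is hypothetical.

```python
import numpy as np


def macro_f1(y_true, y_pred, n_classes):
    """Macro-averaged F1 over integer class labels 0..n_classes-1.

    Macro-averaging is an assumption made for illustration only.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    scores = []
    for k in range(n_classes):
        tp = np.sum((y_true == k) & (y_pred == k))  # true positives
        fp = np.sum((y_true != k) & (y_pred == k))  # false positives
        fn = np.sum((y_true == k) & (y_pred != k))  # false negatives
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))
```

F1 is often preferred over plain accuracy for micro-expression datasets because the emotion classes tend to be imbalanced, and accuracy alone can hide poor performance on the rarer classes.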

Implications and Future Directions

This research is relevant to applications where detecting micro-expressions is beneficial, such as psychological assessment and national security. Identifying emotional states from only two frames per video could lead to more efficient systems that are capable of real-time operation. The findings also suggest that hardware demands could be reduced by using less demanding video capture setups without degrading recognition performance.

The paper presents several avenues for future research. One is to refine automatic apex frame spotting across various contexts, which would improve the robustness and generalizability of the approach. Another is to study the individual facial action units involved in micro-expressions more closely, potentially combining apex frame analysis with advances in neural networks for finer-grained emotion recognition.

In their conclusion, the authors return to the phrase "less is more," arguing that the apex frame, together with the onset frame as a reference, is largely sufficient for recognition. The paper offers a practical foundation for further work on micro-expression analysis in computer vision and related fields.
