
Multi-Modality Fusion based on Consensus-Voting and 3D Convolution for Isolated Gesture Recognition (1611.06689v2)

Published 21 Nov 2016 in cs.CV

Abstract: Depth sensors such as Kinect have recently made depth videos easily available, yet their advantages have not been fully exploited. This paper investigates how to exploit, for gesture recognition, the complementary spatial and temporal information embedded in RGB and depth sequences. We propose a convolutional two-stream consensus voting network (2SCVN) which explicitly models both the short-term and long-term structure of the RGB sequences. To alleviate distractions from the background, a 3D depth-saliency ConvNet stream (3DDSN) is aggregated in parallel to identify subtle motion characteristics. Combined in a unified framework, these two components significantly improve recognition accuracy. On the challenging Chalearn IsoGD benchmark, our proposed method outperforms the first-place entry on the leaderboard by a large margin (10.29%) while also achieving the best result on the RGBD-HuDaAct dataset (96.74%). Both quantitative experiments and qualitative analysis show the effectiveness of our proposed framework, and the code will be released to facilitate future research.
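The abstract does not say how the consensus vote or the fusion of the two streams is computed. As an illustration only, the sketch below assumes the simplest common choice: per-segment class scores from the RGB stream are averaged into a video-level consensus, and that consensus is then combined with the depth stream's scores by a weighted sum. The function names (`consensus_vote`, `fuse_streams`, `predict`), the `rgb_weight` parameter, and the example scores are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def consensus_vote(segment_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average per-segment class scores into one video-level score vector.

    Averaging is an assumed consensus function; the paper may use another.
    """
    if not segment_scores:
        raise ValueError("need at least one segment")
    num_classes = len(segment_scores[0])
    if any(len(scores) != num_classes for scores in segment_scores):
        raise ValueError("all segments must score the same number of classes")
    count = len(segment_scores)
    return [sum(scores[c] for scores in segment_scores) / count
            for c in range(num_classes)]


def fuse_streams(rgb_scores: Sequence[float],
                 depth_scores: Sequence[float],
                 rgb_weight: float = 0.5) -> List[float]:
    """Late fusion by weighted sum (an assumption, not the paper's stated rule)."""
    if len(rgb_scores) != len(depth_scores):
        raise ValueError("streams must score the same number of classes")
    return [rgb_weight * r + (1.0 - rgb_weight) * d
            for r, d in zip(rgb_scores, depth_scores)]


def predict(scores: Sequence[float]) -> int:
    """Return the index of the highest-scoring class."""
    return max(range(len(scores)), key=lambda c: scores[c])


# Made-up scores for three RGB segments over three gesture classes.
rgb_segments = [
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.6, 0.1],
]
depth_scores = [0.2, 0.7, 0.1]  # made-up depth-stream scores

rgb_consensus = consensus_vote(rgb_segments)
fused = fuse_streams(rgb_consensus, depth_scores)
predicted_class = predict(fused)
```

In this toy example the first segment alone would favor class 0, but the consensus over all three segments favors class 1, which is the kind of long-range agreement a segment-level vote is meant to capture.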

Authors (5)
  1. Jiali Duan (14 papers)
  2. Shuai Zhou (13 papers)
  3. Jun Wan (79 papers)
  4. Xiaoyuan Guo (14 papers)
  5. Stan Z. Li (222 papers)
Citations (35)
