
Deep Learning for Visual Speech Analysis: A Survey (2205.10839v2)

Published 22 May 2022 in cs.CV

Abstract: Visual speech, referring to the visual domain of speech, has attracted increasing attention due to its wide applications, such as public security, medical treatment, military defense, and film entertainment. As a powerful AI strategy, deep learning techniques have extensively promoted the development of visual speech learning. Over the past five years, numerous deep-learning-based methods have been proposed to address various problems in this area, especially automatic visual speech recognition and generation. To push forward future research on visual speech, this paper presents a comprehensive review of recent progress in deep learning methods for visual speech analysis. We cover different aspects of visual speech, including fundamental problems, challenges, benchmark datasets, a taxonomy of existing methods, and state-of-the-art performance. We also identify gaps in current research and discuss promising future research directions.

Authors (8)
  1. Changchong Sheng (1 paper)
  2. Gangyao Kuang (8 papers)
  3. Liang Bai (9 papers)
  4. Chenping Hou (11 papers)
  5. Yulan Guo (89 papers)
  6. Xin Xu (187 papers)
  7. Matti Pietikäinen (28 papers)
  8. Li Liu (311 papers)
Citations (26)