
Some observations on computer lip-reading: moving from the dream to the reality (1710.01084v1)

Published 3 Oct 2017 in cs.CV and eess.IV

Abstract: In the quest for greater computer lip-reading performance there are a number of tacit assumptions which are either present in the datasets (high resolution for example) or in the methods (recognition of spoken visual units called visemes for example). Here we review these and other assumptions and show the surprising result that computer lip-reading is not heavily constrained by video resolution, pose, lighting and other practical factors. However, the working assumption that visemes, which are the visual equivalent of phonemes, are the best unit for recognition does need further examination. We conclude that visemes, which were defined over a century ago, are unlikely to be optimal for a modern computer lip-reading system.

Authors (4)
  1. Helen L. Bear (23 papers)
  2. Gari Owen (1 paper)
  3. Richard Harvey (9 papers)
  4. Barry-John Theobald (34 papers)
Citations (13)
