Video Rewrite: driving visual speech with audio

Authors:
Christoph Bregler
Interval Research Corporation, 1801 Page Mill Road, Building C, Palo Alto, CA

Michele Covell
Interval Research Corporation, 1801 Page Mill Road, Building C, Palo Alto, CA

Malcolm Slaney
Interval Research Corporation, 1801 Page Mill Road, Building C, Palo Alto, CA

SIGGRAPH '97: Proceedings of the 24th annual conference on Computer graphics and interactive techniques, August 1997, Pages 353–360
https://doi.org/10.1145/258734.258880

Published: 03 August 1997


ABSTRACT

Video Rewrite uses existing footage to automatically create new video of a person mouthing words that she did not speak in the original footage. This technique is useful in movie dubbing, for example, where the movie sequence can be modified to sync the actors' lip motions to the new soundtrack.

Video Rewrite automatically labels the phonemes in the training data and in the new audio track. Video Rewrite reorders the mouth images in the training footage to match the phoneme sequence of the new audio track. When particular phonemes are unavailable in the training footage, Video Rewrite selects the closest approximations. The resulting sequence of mouth images is stitched into the background footage. This stitching process automatically corrects for differences in head position and orientation between the mouth images and the background footage.
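The selection step described above can be illustrated with a small sketch. It is not the system's actual matching procedure: the lip-shape groups in `VISEME_GROUPS`, the `select_clips` function, and the clip identifiers are all hypothetical, chosen only to show how an exact phoneme match might fall back to a visually similar substitute when the training footage lacks that phoneme.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

# Hypothetical groups of phonemes that look alike on the lips.
# This grouping is an illustrative assumption, not taken from the paper.
VISEME_GROUPS: List[FrozenSet[str]] = [
    frozenset({"p", "b", "m"}),
    frozenset({"f", "v"}),
    frozenset({"t", "d", "s", "z"}),
    frozenset({"aa", "ae", "ah"}),
    frozenset({"uw", "ow"}),
]


def viseme_group(phoneme: str) -> Optional[FrozenSet[str]]:
    """Return the lip-shape group containing the phoneme, if any."""
    for group in VISEME_GROUPS:
        if phoneme in group:
            return group
    return None


def select_clips(
    target: List[str], library: Dict[str, List[str]]
) -> List[Tuple[str, str]]:
    """Pick one training clip per target phoneme.

    Prefers an exact phoneme match; otherwise substitutes a phoneme
    from the same (assumed) lip-shape group. Returns pairs of
    (phoneme actually used, clip id).
    """
    chosen: List[Tuple[str, str]] = []
    for phoneme in target:
        if library.get(phoneme):
            chosen.append((phoneme, library[phoneme][0]))
            continue
        substitute = None
        group = viseme_group(phoneme)
        if group is not None:
            # Sorted order keeps the fallback choice deterministic.
            for candidate in sorted(group):
                if library.get(candidate):
                    substitute = candidate
                    break
        if substitute is None:
            raise LookupError(f"no clip or approximation for {phoneme!r}")
        chosen.append((substitute, library[substitute][0]))
    return chosen


# Toy training library: phoneme label -> clip ids found in the footage.
library = {"m": ["clip_07"], "aa": ["clip_02"], "t": ["clip_11"]}
result = select_clips(["b", "aa", "d"], library)
print(result)
```

In this toy run, "b" and "d" do not occur in the library, so the sketch substitutes "m" and "t" from the same assumed lip-shape groups. A fuller treatment would also weigh neighboring phonemes and clip durations, which this sketch ignores.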

Video Rewrite uses computer-vision techniques to track points on the speaker's mouth in the training footage, and morphing techniques to combine these mouth gestures into the final video sequence. The new video combines the dynamics of the original actor's articulations with the mannerisms and setting dictated by the background footage. Video Rewrite is the first facial-animation system to automate all the labeling and assembly tasks required to resync existing footage to a new soundtrack.
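Correcting for differences in head position and orientation amounts to finding a geometric transform between tracked points in the mouth image and the corresponding points in the background frame. The sketch below is one simple way to do that, assuming a 2D similarity transform (rotation, uniform scale, translation) fitted by least squares; the abstract does not specify the transform the system uses, and the function names `fit_similarity` and `apply_similarity` are hypothetical.

```python
import cmath
from typing import List, Tuple

Point = Tuple[float, float]


def fit_similarity(src: List[Point], dst: List[Point]) -> Tuple[complex, complex]:
    """Least-squares fit of a, b so that a * p + b approximates q.

    Points are treated as complex numbers x + iy, so the single complex
    coefficient a encodes rotation (its phase) and uniform scale (its
    magnitude), and b encodes translation.
    """
    if len(src) != len(dst) or len(src) < 2:
        raise ValueError("need at least two matching point pairs")
    p = [complex(x, y) for x, y in src]
    q = [complex(x, y) for x, y in dst]
    p_mean = sum(p) / len(p)
    q_mean = sum(q) / len(q)
    denom = sum(abs(pi - p_mean) ** 2 for pi in p)
    if denom == 0:
        raise ValueError("source points must not all coincide")
    a = sum(
        (qi - q_mean) * (pi - p_mean).conjugate() for pi, qi in zip(p, q)
    ) / denom
    b = q_mean - a * p_mean
    return a, b


def apply_similarity(a: complex, b: complex, pts: List[Point]) -> List[Point]:
    """Map points through the fitted transform."""
    out = []
    for x, y in pts:
        z = a * complex(x, y) + b
        out.append((z.real, z.imag))
    return out


# Toy example: mouth points rotated by 90 degrees and shifted by (5, 5).
mouth_pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
background_pts = [(5.0, 5.0), (5.0, 6.0), (4.0, 5.0)]
a, b = fit_similarity(mouth_pts, background_pts)
scale, angle = abs(a), cmath.phase(a)  # angle is in radians
aligned = apply_similarity(a, b, mouth_pts)
```

The complex-number formulation keeps the fit in closed form with no external dependencies. A similarity transform cannot model out-of-plane head rotation, so a real pipeline would likely need a richer warp.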


Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Interest point and salient region detections


Published in
SIGGRAPH '97: Proceedings of the 24th annual conference on Computer graphics and interactive techniques
August 1997
512 pages
ISBN: 0897918967
Chairmen: G. Scott Owen (Georgia State Univ., Atlanta), Turner Whitted (Numerical Design Ltd.), Barbara Mones-Hattal (George Mason Univ., Fairfax, VA)

Seminal Graphics Papers: Pushing the Boundaries, Volume 2
August 2023
893 pages
ISBN: 9798400708978
DOI: 10.1145/3596711
Editor: Mary C. Whitton (Department of Computer Science, UNC Chapel Hill, USA)
Publisher
ACM Press/Addison-Wesley Publishing Co.
United States
Badges
- Seminal Paper
Author Tags
facial animation
lip sync
Qualifiers
- Article
Acceptance Rates
SIGGRAPH '97 Paper Acceptance Rate: 48 of 265 submissions, 18%
Overall Acceptance Rate: 1,822 of 8,601 submissions, 21%