Audio-Driven Speech Animation with Text-Guided Expression
Date
2024
Authors
Sunjin Jung, Sewhan Chun, Junyong Noh
Publisher
The Eurographics Association
Abstract
We introduce a novel method for generating expressive speech animations of a 3D face, driven by both audio and text descriptions. Many previous approaches have focused on generating facial expressions using pre-defined emotion categories. In contrast, our method can generate facial expressions from text descriptions unseen during training, without being limited to specific emotion classes. Our system employs a two-stage approach. In the first stage, an auto-encoder is trained to disentangle content and expression features from facial animations. In the second stage, two transformer-based networks predict the content and expression features from audio and text inputs, respectively. These features are then passed to the decoder of the pre-trained auto-encoder, yielding the final expressive speech animation. By accommodating diverse forms of natural language, such as emotion words or detailed facial expression descriptions, our method offers an intuitive and versatile way to generate expressive speech animations. Extensive quantitative and qualitative evaluations, including a user study, demonstrate that our method can produce natural expressive speech animations that correspond to the input audio and text descriptions.
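Since this record carries only the abstract, the following is a minimal PyTorch sketch of how such a two-stage pipeline might be wired, based solely on the description above. Every class name, feature dimension, and backbone choice (GRU and Transformer encoders, wav2vec-style audio features, a sentence-embedding text input) is a hypothetical assumption for illustration, not the authors' implementation.

# Hypothetical sketch (not the paper's code): stage 1 disentangles content and
# expression features from facial animation; stage 2 predicts those features
# from audio and text, then reuses the frozen stage-1 decoder.
import torch
import torch.nn as nn


class DisentanglingAutoEncoder(nn.Module):
    """Stage 1: split an animation sequence into content and expression
    features, then reconstruct it from the two."""
    def __init__(self, n_verts=5023 * 3, content_dim=128, expr_dim=64):
        super().__init__()
        self.content_enc = nn.GRU(n_verts, content_dim, batch_first=True)
        self.expr_enc = nn.GRU(n_verts, expr_dim, batch_first=True)
        self.decoder = nn.GRU(content_dim + expr_dim, n_verts, batch_first=True)

    def encode(self, anim):                      # anim: (B, T, n_verts)
        content, _ = self.content_enc(anim)      # per-frame content features
        _, expr_h = self.expr_enc(anim)          # sequence-level expression code
        expr = expr_h[-1].unsqueeze(1).expand(-1, anim.size(1), -1)
        return content, expr

    def decode(self, content, expr):
        out, _ = self.decoder(torch.cat([content, expr], dim=-1))
        return out                                # reconstructed animation

    def forward(self, anim):
        return self.decode(*self.encode(anim))


class AudioToContent(nn.Module):
    """Stage 2a: predict content features from per-frame audio features."""
    def __init__(self, audio_dim=768, content_dim=128):
        super().__init__()
        self.proj = nn.Linear(audio_dim, content_dim)
        layer = nn.TransformerEncoderLayer(content_dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)

    def forward(self, audio_feats):               # (B, T, audio_dim)
        return self.encoder(self.proj(audio_feats))


class TextToExpression(nn.Module):
    """Stage 2b: predict an expression code from an embedding of a free-form
    expression description, broadcast over all frames."""
    def __init__(self, text_dim=512, expr_dim=64):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(text_dim, 256), nn.ReLU(),
                                 nn.Linear(256, expr_dim))

    def forward(self, text_emb, n_frames):        # text_emb: (B, text_dim)
        expr = self.mlp(text_emb)
        return expr.unsqueeze(1).expand(-1, n_frames, -1)


# Inference: combine predicted content and expression with the frozen decoder.
if __name__ == "__main__":
    B, T = 1, 100
    ae = DisentanglingAutoEncoder()
    a2c, t2e = AudioToContent(), TextToExpression()
    audio_feats = torch.randn(B, T, 768)          # e.g. wav2vec-style features
    text_emb = torch.randn(B, 512)                # e.g. sentence embedding
    anim = ae.decode(a2c(audio_feats), t2e(text_emb, T))
    print(anim.shape)                             # (1, 100, 15069)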
Description
CCS Concepts: Computing methodologies → Animation; Neural networks
@inproceedings{10.2312:pg.20241290,
booktitle = {Pacific Graphics Conference Papers and Posters},
editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
title = {{Audio-Driven Speech Animation with Text-Guided Expression}},
author = {Jung, Sunjin and Chun, Sewhan and Noh, Junyong},
year = {2024},
publisher = {The Eurographics Association},
ISBN = {978-3-03868-250-9},
DOI = {10.2312/pg.20241290}
}