Learning-Based Facial Animation
This thesis proposes a novel approach for automated 3D speech animation from audio. An end-to-end system is presented which proceeds in three principal phases. In the acquisition phase, dynamic articulation motions are recorded and amended. The learning phase studies the correlation of these motions in their phonetic context in order to understand the visual nature of speech. Finally, for the synthesis phase, an algorithm is proposed that carries as much of the natural behavior as possible from the acquired data to the final animation.

The selection of motion segments for the synthesis of animations relies on a novel similarity measure, based on a Locally Linear Embedding representation of visemes, which closely relates to viseme categories defined in the articulatory phonetics literature. This measure offers a relaxed selection of visemes without reducing the quality of the animation.

Along with a general hierarchical substitution procedure which can be reused directly in other speech animation systems, our algorithm performs optimal segment concatenation in order to create new utterances with natural coarticulation effects.
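To make the idea of an embedding-based viseme similarity concrete, the sketch below embeds toy motion descriptors with Locally Linear Embedding (via scikit-learn) and compares segments by distance in the embedded space. The feature dimensions, neighbor count, and the `viseme_similarity` helper are illustrative assumptions, not the thesis's actual data or measure.

```python
# Hypothetical sketch of an LLE-based viseme similarity measure.
# Feature sizes and parameters are illustrative, not the thesis's values.
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding

rng = np.random.default_rng(0)
# Toy stand-in for viseme motion descriptors: 60 segments, 30 dimensions
features = rng.normal(size=(60, 30))

# Embed into a low-dimensional space that preserves local neighborhoods,
# so nearby points correspond to articulatorily similar visemes
lle = LocallyLinearEmbedding(n_neighbors=8, n_components=3, random_state=0)
embedded = lle.fit_transform(features)  # shape (60, 3)

def viseme_similarity(i, j):
    """Similarity of two segments as negative Euclidean distance in LLE space."""
    return -float(np.linalg.norm(embedded[i] - embedded[j]))
```

A thresholded version of such a similarity is what permits a "relaxed" selection: any candidate whose embedded distance to the target viseme is small enough may substitute for it.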
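Optimal segment concatenation of this kind is typically posed as a shortest-path problem: choose one candidate segment per viseme slot so that the sum of per-slot target costs and pairwise join costs is minimized. The dynamic-programming sketch below illustrates that formulation under assumed cost arrays; the function name and cost shapes are hypothetical, not the thesis's implementation.

```python
# Illustrative Viterbi-style selection of motion segments for
# concatenative synthesis. Costs here are toy inputs, not the
# thesis's actual target or concatenation measures.
import numpy as np

def select_segments(target_costs, join_costs):
    """target_costs: (T, K) cost of candidate k at slot t.
    join_costs: (T, K, K); join_costs[t][p][k] is the cost of joining
    candidate p at slot t-1 with candidate k at slot t (t=0 unused).
    Returns (path, total_cost) for the minimum-cost candidate sequence."""
    T, K = target_costs.shape
    best = target_costs[0].copy()          # best cost ending at each candidate
    back = np.zeros((T, K), dtype=int)     # backpointers for path recovery
    for t in range(1, T):
        # total cost of arriving at candidate k via candidate p at t-1
        trans = best[:, None] + join_costs[t]          # shape (K, K)
        back[t] = trans.argmin(axis=0)
        best = trans.min(axis=0) + target_costs[t]
    # backtrack the optimal path from the cheapest final candidate
    path = [int(best.argmin())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1], float(best.min())
```

With zero join costs the procedure degenerates to picking the cheapest candidate per slot; nonzero join costs are what encourage smooth transitions and hence natural coarticulation.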