VOCA: Capture, Learning, and Synthesis of 3D Speaking Styles Dataset

VOCA is a simple and generic speech-driven facial animation framework that works across a range of identities. This codebase demonstrates how to synthesize realistic character animations given an arbitrary speech signal and a static character mesh. For details, please see the scientific publication.

Audio-driven 3D facial animation has been widely explored, but achieving realistic, human-like performance is still unsolved. This is due to the lack of available 3D datasets, models, and standard evaluation metrics. To address this, we introduce a unique 4D face dataset with about 29 minutes of 4D scans captured at 60 fps and synchronized audio from 12 speakers. We then train a neural network on our dataset that factors identity from facial motion. The learned model, VOCA (Voice Operated Character Animation), takes any speech signal as input (even speech in languages other than English) and realistically animates a wide range of adult faces. Conditioning on subject labels during training allows the model to learn a variety of realistic speaking styles. VOCA also provides animator controls to alter speaking style, identity-dependent facial shape, and pose (i.e., head, jaw, and eyeball rotations) during animation. To our knowledge, VOCA is the only realistic 3D facial animation model that is readily applicable to unseen subjects without retargeting. This makes VOCA suitable for tasks like in-game video, virtual reality avatars, or any scenario in which the speaker, speech, or language is not known in advance. We make the dataset and model available for research purposes.
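
To make the figures in the abstract concrete, the following is a minimal sketch, not the VOCA implementation. It estimates the dataset's frame count from the stated capture rate and duration, and illustrates one common way to condition on a subject label: a one-hot vector appended to each frame of audio features. The function names, the feature width of 29, and the choice of one label per speaker are assumptions made for illustration only; the publication describes the actual encoding and training split.

```python
# Figures taken from the abstract above.
NUM_SPEAKERS = 12      # speakers in the dataset
CAPTURE_FPS = 60       # 4D scan capture rate
TOTAL_MINUTES = 29     # "about 29 minutes", so the result is approximate


def approx_total_frames(minutes=TOTAL_MINUTES, fps=CAPTURE_FPS):
    """Rough number of captured 4D frames: minutes * 60 seconds * fps."""
    return int(minutes * 60 * fps)


def one_hot_subject(subject_index, num_subjects=NUM_SPEAKERS):
    """Hypothetical helper: encode a subject label as a one-hot list."""
    if not 0 <= subject_index < num_subjects:
        raise ValueError("subject_index out of range")
    return [1.0 if i == subject_index else 0.0 for i in range(num_subjects)]


def condition_features(audio_frames, subject_index, num_subjects=NUM_SPEAKERS):
    """Hypothetical helper: append the same one-hot label to every frame.

    audio_frames is a list of per-frame feature lists. Choosing a different
    subject_index at inference time would select a different learned style.
    """
    label = one_hot_subject(subject_index, num_subjects)
    return [list(frame) + label for frame in audio_frames]


if __name__ == "__main__":
    print("approximate frames:", approx_total_frames())
    # Five frames of placeholder audio features, 29 values each (assumed width).
    frames = [[0.0] * 29 for _ in range(5)]
    conditioned = condition_features(frames, subject_index=3)
    print("conditioned frame width:", len(conditioned[0]))
```

Because the label is an explicit input rather than something inferred from the audio, speaking style can be switched independently of the speech signal, which is consistent with the animator control over speaking style described above.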

Referencing VOCA

@inproceedings{VOCA2019,
    title = {Capture, Learning, and Synthesis of {3D} Speaking Styles},
    author = {Cudeiro, Daniel and Bolkart, Timo and Laidlaw, Cassidy and Ranjan, Anurag and Black, Michael},
    booktitle = {Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR)},
    pages = {10101--10111},
    year = {2019},
    url = {http://voca.is.tue.mpg.de/}
}

License

Free for non-commercial and scientific research purposes. By using this code, you acknowledge that you have read the license terms (https://voca.is.tue.mpg.de/license), understand them, and agree to be bound by them. If you do not agree with these terms and conditions, you must not use the code.
