
amphion/Emilia-Dataset

Text To Speech | Automatic Speech Recognition | ZH, EN, JA | cc-by-4.0

amphion/Emilia-Dataset is a text-to-speech and automatic speech recognition dataset covering Chinese (ZH), English (EN), and Japanese (JA), published by amphion in 2024. It has 79.3K downloads and 460 likes, is released under the cc-by-4.0 license, and falls in the 10M<n<100M size category.

About amphion/Emilia-Dataset

Emilia: An Extensive, Multilingual, and Diverse Speech Dataset for Large-Scale Speech Generation

This is the official repository 👑 for the Emilia dataset and the source code for the Emilia-Pipe speech data preprocessing pipeline. Ne...

Details

Task: Text To Speech, Automatic Speech Recognition
Language: ZH, EN, JA
Format: Parquet
Rows / instances: N/A
Size: 10M<n<100M
Creator: amphion
Year: 2024
License: cc-by-4.0
Downloads: 79,279
Likes: 460
