HKUSTAudio/Audio-FLAN-Dataset
Text To Speech | Text To Audio | Automatic Speech Recognition | EN, ZH | apache-2.0
HKUSTAudio/Audio-FLAN-Dataset is an English and Chinese (EN, ZH) dataset for text to speech, text to audio, and automatic speech recognition, published by HKUSTAudio in 2025. It has 5,370 downloads and 45 likes. It is released under the apache-2.0 license and falls in the 10M<n<100M size category.
About HKUSTAudio/Audio-FLAN-Dataset
Audio-FLAN Dataset (Paper)
(The full audio files and JSONL files are still being updated.)
An Instruction-Tuning Dataset for Unified Audio Understanding and Generation Across Speech, Music, and Sound.
1. Dataset Structure
The Audio-FLAN-D...
Details
- Task: Text To Speech, Text To Audio, Automatic Speech Recognition
- Language: EN, ZH
- Format: Parquet
- Rows / instances: N/A
- Size: 10M<n<100M
- Creator: HKUSTAudio
- Year: 2025
- License: apache-2.0
- Downloads: 5370
- Likes: 45
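
The Size field above is a range tag, not an exact row count. As a minimal sketch of how such a tag could be turned into numeric bounds, the Python snippet below parses it into a lower and an upper limit. The helper name `parse_size_category` and the suffix table are illustrative assumptions, not part of the dataset or of any library.

```python
import re

# Assumed meaning of the suffixes used in range tags such as "10M<n<100M".
_SUFFIX = {"": 1, "K": 1_000, "M": 1_000_000, "B": 1_000_000_000, "T": 1_000_000_000_000}


def parse_size_category(tag: str) -> tuple[int, int]:
    """Parse a range tag like '10M<n<100M' into (lower, upper) instance counts.

    Hypothetical helper for illustration only.
    """
    match = re.fullmatch(r"(\d+)([KMBT]?)<n<(\d+)([KMBT]?)", tag.strip())
    if match is None:
        raise ValueError(f"unrecognized size tag: {tag!r}")
    lo_num, lo_suffix, hi_num, hi_suffix = match.groups()
    return int(lo_num) * _SUFFIX[lo_suffix], int(hi_num) * _SUFFIX[hi_suffix]


lower, upper = parse_size_category("10M<n<100M")
print(f"{lower:,} < n < {upper:,}")  # prints "10,000,000 < n < 100,000,000"
```

Read this way, the listing says the dataset holds between 10 million and 100 million instances; the exact count is not given (Rows / instances is N/A).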