HKUSTAudio/Audio-FLAN-Dataset

Text To Speech, Text To Audio, Automatic Speech Recognition | EN, ZH | apache-2.0

HKUSTAudio/Audio-FLAN-Dataset is an English and Chinese (EN, ZH) dataset for text to speech, text to audio, and automatic speech recognition, published by HKUSTAudio in 2025. It has about 5.4K downloads and 45 likes. It is released under the apache-2.0 license and falls in the 10M<n<100M size category.

About HKUSTAudio/Audio-FLAN-Dataset

Audio-FLAN Dataset (Paper)
(The FULL audio files and jsonl files are still being updated.)
An Instruction-Tuning Dataset for Unified Audio Understanding and Generation Across Speech, Music, and Sound.
1. Dataset Structure
The Audio-FLAN-D...

Details

Task: Text To Speech, Text To Audio, Automatic Speech Recognition
Language: EN, ZH
Format: Parquet
Rows / instances: N/A
Size: 10M<n<100M
Creator: HKUSTAudio
Year: 2025
License: apache-2.0
Downloads: 5370
Likes: 45
