malaysia-ai/pseudolabel-dialects-youtube-whisper-large-v3
General NLP, English
malaysia-ai/pseudolabel-dialects-youtube-whisper-large-v3 is a General NLP dataset in English, distributed in Parquet format.
About malaysia-ai/pseudolabel-dialects-youtube-whisper-large-v3
Pseudolabels for malaysia-ai/malaysian-dialects-youtube, generated using openai/whisper-large-v3.
How to prepare the dataset
huggingface-cli download --repo-type dataset \
--include '*.zip' \
...
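The command above is truncated in the source, so the repository ID and any later steps are not shown. As a minimal sketch only, the same filtered download can be expressed in Python with `huggingface_hub.snapshot_download`, followed by unpacking the archives. The extraction step is an assumption (the source shows only that `*.zip` files are fetched), and the helper names `download_zips` and `extract_zips` are hypothetical. The repository ID is left as an argument because the source does not state it.

```python
import zipfile
from pathlib import Path
from typing import List


def download_zips(repo_id: str, local_dir: str) -> str:
    """Download only the *.zip files of a dataset repo.

    Mirrors `--repo-type dataset --include '*.zip'` from the CLI command.
    `repo_id` is NOT given in the source; the caller must supply it.
    """
    # Imported lazily so extract_zips works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        repo_type="dataset",
        allow_patterns=["*.zip"],
        local_dir=local_dir,
    )


def extract_zips(src_dir: str, dest_dir: str) -> List[str]:
    """Extract every .zip found under src_dir into dest_dir.

    Assumed follow-up step; returns the names of the archives processed.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    processed = []
    # sorted() materializes the list before extraction starts.
    for archive in sorted(Path(src_dir).rglob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
        processed.append(archive.name)
    return processed
```

Keeping the download and the extraction in separate functions means the extraction step can be rerun on already-downloaded archives without touching the network.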
Details
- Task: General NLP
- Language: English
- Format: Parquet
- Rows / instances: N/A
- Creator: malaysia-ai
- Year: 2025