Sinoosoida/SpeechRu

Automatic Speech Recognition, Text To Speech, Audio Classification, RU

Sinoosoida/SpeechRu is an automatic speech recognition-focused dataset in Russian (RU), distributed in Parquet format.

About Sinoosoida/SpeechRu

Russian Podcasts (unlabeled)

~186k unlabeled Russian-language podcast episodes scraped from the web, packaged as Parquet shards with the audio bytes embedded. The audio has no transcripts; this is an unsupervised / self-supervised audio corpus...
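Because the audio bytes are embedded in each row rather than stored as separate files, a consumer typically decodes them from memory. The sketch below illustrates that idea using only the Python standard library. It rests on assumptions that this page does not confirm: the row layout (`audio` with `bytes` and `path` keys), the file name, and the WAV encoding are all hypothetical stand-ins, and the real shards may use a different schema or codec.

```python
import io
import wave


def make_silent_wav(seconds: float, sample_rate: int = 16000) -> bytes:
    """Build an in-memory mono 16-bit WAV of silence.

    This is a synthetic stand-in for the embedded audio bytes of one row.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as writer:
        writer.setnchannels(1)        # mono
        writer.setsampwidth(2)        # 16-bit samples
        writer.setframerate(sample_rate)
        writer.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    return buffer.getvalue()


def wav_duration_seconds(audio_bytes: bytes) -> float:
    """Decode WAV bytes from memory and return the duration in seconds."""
    with wave.open(io.BytesIO(audio_bytes), "rb") as reader:
        return reader.getnframes() / reader.getframerate()


# Hypothetical row layout (an assumption, not the dataset's documented schema).
row = {"audio": {"bytes": make_silent_wav(2.0), "path": "episode_0001.wav"}}

duration = wav_duration_seconds(row["audio"]["bytes"])
print(f"{row['audio']['path']}: {duration:.1f} s")
```

With real shards, the bytes would come from a Parquet reader instead of `make_silent_wav`, and a compressed codec such as MP3 would need a third-party decoder rather than the `wave` module.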

Details

Task: Automatic Speech Recognition, Text To Speech, Audio Classification
Language: RU
Format: Parquet
Rows / instances: N/A
Creator: Sinoosoida
Year: 2026