Russian Datasets
We catalog 16 Russian datasets for NLP and machine learning. Russian is a high-resource Slavic language that is widely covered in multilingual datasets. Browse the list below or narrow down by task.
Updated June 2026
- Russian Commitment Bank (RCB) (SuperGLUE) | Natural Language Inference (NLI) | Russian
- Choice of Plausible Alternatives for Russian language (PARus) (SuperGLUE) | Commonsense | Russian
- Russian Multi-Sentence Reading Comprehension (MuSeRC) (SuperGLUE) | Question Answering | Russian
- Textual Entailment Recognition for Russian (TERRa) (SuperGLUE) | Natural Language Inference (NLI) | Russian
- Words in Context (RUSSe) (SuperGLUE) | Word Sense Disambiguation | Russian
- The Winograd Schema Challenge Russian (RWSD) (SuperGLUE) | Coreference Resolution | Russian
- DaNetQA (SuperGLUE) | Binary Question Answering | Russian
- Russian Reading Comprehension with Commonsense reasoning (RuCoS) (SuperGLUE) | Commonsense | Russian
- CC100-Russian | Text Corpora | Russian
- artur-muratov/multilingual-speech-commands-15lang | General NLP | EN, RU, KK
- SberQuAD | Question Answering, Reading Comprehension | Russian
- OpenAssistant/oasst1 | General NLP | EN, ES, RU
- IlyaGusev/gpt_roleplay_realm | Text Generation | RU, EN
- Den4ikAI/russian_dialogues | General NLP | RU
- Vikhrmodels/GrandMaster-PRO-MAX | Text Generation | RU, EN
- IlyaGusev/ru_turbo_alpaca | Text Generation | RU
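
For readers working from a plain-text copy of this list, narrowing down by task can be sketched in a few lines of Python. This is a minimal illustration, not part of the directory itself: the `CATALOG` tuple layout and the `filter_by_task` helper are hypothetical names, and only a handful of entries from the list above are copied in.

```python
# A few entries copied from the list above as (name, tasks, languages).
# The tuple layout and helper name are illustrative assumptions.
CATALOG = [
    ("Russian Commitment Bank (RCB)", ["Natural Language Inference (NLI)"], ["Russian"]),
    ("DaNetQA", ["Binary Question Answering"], ["Russian"]),
    ("SberQuAD", ["Question Answering", "Reading Comprehension"], ["Russian"]),
    ("OpenAssistant/oasst1", ["General NLP"], ["EN", "ES", "RU"]),
    ("IlyaGusev/ru_turbo_alpaca", ["Text Generation"], ["RU"]),
]


def filter_by_task(catalog, task):
    """Return names of entries tagged with `task` (case-insensitive, exact label match)."""
    wanted = task.casefold()
    return [
        name
        for name, tasks, _languages in catalog
        if any(label.casefold() == wanted for label in tasks)
    ]


if __name__ == "__main__":
    for name in filter_by_task(CATALOG, "Text Generation"):
        print(name)
```

The match is on the whole task label, so "Question Answering" does not also return entries tagged "Binary Question Answering"; a substring match would be the alternative if broader grouping is wanted.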