Hindi Datasets

We catalog 5 Hindi datasets for NLP and machine learning, including 1 benchmark. Browse the list below or narrow down by task.

This page covers Hindi, one of the most widely spoken languages in India and a key low-to-mid-resource language.

Updated June 2026

What tasks do Hindi datasets cover?

Datasets in other languages

Frequently asked questions