Text Classification Datasets

There are 33 text classification datasets in our directory, 1 of which is a benchmark. Each links to its source, paper, and download; browse the full list below or filter by language.

Text classification is the task of assigning predefined categories or labels to a piece of text, for example labelling it by topic or intent.

Updated June 2026

What languages do text classification datasets cover?

Explore other dataset tasks

Frequently asked questions