
Clustering Datasets

There are 9 clustering datasets in our directory, 1 of which is a benchmark. Each links to its source, paper, and download. Browse the full list below or filter by language.

Clustering is the task of grouping similar items together without predefined labels.

Updated June 2026

What languages do clustering datasets cover?


Frequently asked questions