Text Generation Datasets

Our directory lists 137 text generation datasets, 4 of which are benchmarks. Each entry links to its source, paper, and download. Browse the full list below or filter by language.

Text generation is the task of producing new, coherent text from a prompt. It is the core capability behind chatbots and writing assistants.

Updated June 2026

What languages do text generation datasets cover?

Frequently asked questions