wikimedia/wikipedia
Text Generation, Fill Mask | AB, ACE, ADY
wikimedia/wikipedia is a text generation and fill-mask dataset from wikimedia with 61,614,907 records in Parquet format, covering languages including AB, ACE, and ADY.
About wikimedia/wikipedia
Dataset Card for Wikimedia Wikipedia
Dataset Summary
Wikipedia dataset containing cleaned articles of all languages.
The dataset is built from the Wikipedia dumps (https://dumps.wikimedia.org/)
with one subset per language, e...
Details
- Task: Text Generation, Fill Mask
- Language: AB, ACE, ADY
- Format: Parquet
- Rows / instances: 61,614,907
- Creator: wikimedia
- Year: 2026
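Since the summary describes one subset per language, a single language can in principle be loaded on its own rather than all 61,614,907 rows. The following is a minimal sketch, not an official recipe. It assumes the Hugging Face `datasets` library is installed, that subsets are named `<dump_date>.<language>`, that a dump dated `20231101` exists, and that each subset has a `train` split. None of these details are stated on this page, and the helpers `subset_name` and `load_language` are hypothetical names introduced for illustration.

```python
def subset_name(language: str, dump_date: str = "20231101") -> str:
    """Build a per-language subset name.

    ASSUMPTION: subsets follow a '<dump_date>.<language>' scheme and the
    default dump date exists; verify both against the dataset card.
    """
    return f"{dump_date}.{language.strip().lower()}"


def load_language(language: str, dump_date: str = "20231101", streaming: bool = True):
    """Load one language subset of wikimedia/wikipedia.

    ASSUMPTION: the subset exposes a 'train' split. Streaming avoids
    downloading the full Parquet files before iterating.
    """
    # Imported lazily so subset_name() stays usable without the library.
    from datasets import load_dataset

    return load_dataset(
        "wikimedia/wikipedia",
        subset_name(language, dump_date),
        split="train",
        streaming=streaming,
    )
```

Calling `load_language("ab")` would then return an iterable over the AB subset, provided the assumptions above hold. The column names of each record are not listed here, so inspect the first record before relying on any particular field.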