legacy-datasets/wikipedia
Text Generation | Fill Mask | AA, AB, ACE
Created by legacy-datasets in 2022, legacy-datasets/wikipedia is a text generation and fill-mask dataset in the languages AA, AB, and ACE, distributed in Parquet format.
About legacy-datasets/wikipedia
Wikipedia dataset containing cleaned articles in all languages.
The datasets are built from the Wikipedia dump
(https://dumps.wikimedia.org/) with one split per language. Each example
contains the content of one full Wikipedia article with cleanin...
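The per-language layout described above could be accessed roughly as follows. This is a minimal sketch, not the dataset's documented loading procedure. It assumes the Hugging Face `datasets` library is installed, that configurations are named `<dump date>.<language code>` (for example `20220301.simple`), that each configuration has a single `train` split, and that records carry `title` and `text` fields. None of these details is stated on this page, and behavior may vary with the library version. The helper names `config_name` and `load_articles` are hypothetical.

```python
def config_name(dump_date: str, language: str) -> str:
    """Build a configuration name such as '20220301.simple' (assumed naming scheme)."""
    return f"{dump_date}.{language}"


def load_articles(language: str, dump_date: str = "20220301"):
    """Stream the articles of one language; requires network access."""
    # Imported lazily so config_name works without the library installed.
    from datasets import load_dataset

    return load_dataset(
        "legacy-datasets/wikipedia",
        config_name(dump_date, language),
        split="train",   # assumption: one "train" split per configuration
        streaming=True,  # iterate lazily instead of downloading everything first
    )


if __name__ == "__main__":
    articles = load_articles("simple")  # assumption: this configuration exists
    first = next(iter(articles))
    print(first["title"])       # assumed field name
    print(first["text"][:200])  # assumed field name
```

Streaming is used here only to keep the example lightweight; a full download would work the same way with `streaming=False`.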
Details
- Task: Text Generation, Fill Mask
- Language: AA, AB, ACE
- Format: Parquet
- Rows / instances: N/A
- Creator: legacy-datasets
- Year: 2022