graelo/wikipedia
Text Generation | Fill Mask | AB, ACE, ADY | cc-by-sa-3.0
graelo/wikipedia is a dataset for text generation and fill-mask tasks, listed under the language codes AB, ACE, and ADY. It provides 122,847,731 examples in Parquet format, is released under the cc-by-sa-3.0 license, falls in the 100M<n<1B size category, and has been downloaded about 1.8K times.
About graelo/wikipedia
Wikipedia dataset containing cleaned articles of all languages.
The datasets are built from the Wikipedia dump
(https://dumps.wikimedia.org/) with one split per language. Each example
contains the content of one full Wikipedia article with cleanin...
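As a rough sketch of how a single language might be loaded, the snippet below uses the Hugging Face `datasets` library. The `<dump_date>.<language>` config naming, the default dump date `20230601`, and the `train` split name are assumptions that do not appear in the description above, and the helper names `config_name` and `load_language` are hypothetical; check the dataset page for the configs that actually exist.

```python
def config_name(dump_date: str, language: str) -> str:
    """Build a '<dump_date>.<language>' config name.

    Assumption: configs follow this naming scheme; verify on the dataset page.
    """
    return f"{dump_date}.{language.lower()}"


def load_language(language: str, dump_date: str = "20230601", streaming: bool = True):
    """Load one language of graelo/wikipedia.

    The default dump date and the 'train' split name are assumptions.
    Streaming avoids downloading every Parquet file up front.
    """
    # Imported lazily so config_name() works without the library installed.
    from datasets import load_dataset

    return load_dataset(
        "graelo/wikipedia",
        config_name(dump_date, language),
        split="train",
        streaming=streaming,
    )
```

Calling `load_language("AB")` would then request the assumed `20230601.ab` config. With `streaming=True` the result can be iterated article by article without fetching the full dataset first.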
Details
- Task: Text Generation, Fill Mask
- Language: AB, ACE, ADY
- Format: Parquet
- Rows / instances: 122,847,731
- Size: 100M<n<1B
- Creator: graelo
- Year: 2023
- License: cc-by-sa-3.0
- Downloads: 1,775
- Likes: 71
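The Size entry is a bucket rather than an exact count. The sketch below shows one way to check that the listed row count falls inside that bucket. It handles only the `low<n<high` form with `K`, `M`, `B`, and `T` suffixes, and the helper names `to_int` and `in_size_category` are hypothetical, not part of any library.

```python
import re

# Multipliers for the suffixes used in size-category strings such as "100M<n<1B".
_UNITS = {"K": 10**3, "M": 10**6, "B": 10**9, "T": 10**12}


def to_int(token: str) -> int:
    """Convert a token such as '100M' or '1B' to an integer."""
    match = re.fullmatch(r"(\d+)([KMBT]?)", token.strip())
    if match is None:
        raise ValueError(f"unrecognized size token: {token!r}")
    number, unit = match.groups()
    return int(number) * _UNITS.get(unit, 1)


def in_size_category(rows: int, category: str) -> bool:
    """Return True if rows lies strictly inside a 'low<n<high' category."""
    low, high = category.split("<n<")
    return to_int(low) < rows < to_int(high)
```

For the values listed above, `in_size_category(122_847_731, "100M<n<1B")` checks that 100,000,000 < 122,847,731 < 1,000,000,000, which holds.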