graelo/wikipedia

Text Generation | Fill Mask | AB, ACE, ADY | cc-by-sa-3.0

graelo/wikipedia is a dataset for text generation and fill-mask tasks, listed under the language codes AB, ACE, and ADY. It provides 122,847,731 examples in Parquet format. It is distributed under the cc-by-sa-3.0 license, falls in the 100M<n<1B size category, and has been downloaded about 1.8K times.

About graelo/wikipedia

Wikipedia dataset containing cleaned articles of all languages. The datasets are built from the Wikipedia dump (https://dumps.wikimedia.org/) with one split per language. Each example contains the content of one full Wikipedia article with cleanin...
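As a rough illustration of the per-language layout described above, the sketch below loads one language with the Hugging Face `datasets` library. The configuration name pattern `<dump_date>.<language_code>`, the default dump date `20230601`, and the `train` split name are assumptions not stated on this page, and the helper names `config_name` and `load_language` are hypothetical. Check the dataset homepage for the actual configuration names before relying on it.

```python
def config_name(dump_date: str, lang: str) -> str:
    """Build a configuration name such as '20230601.ace'.

    Assumption: configurations are named '<dump_date>.<language_code>'.
    """
    return f"{dump_date}.{lang.strip().lower()}"


def load_language(lang: str, dump_date: str = "20230601", streaming: bool = True):
    """Load the articles for one language (hypothetical helper).

    Streaming avoids downloading every Parquet file up front, which
    matters for a dataset in the 100M<n<1B size category.
    """
    # Imported lazily so config_name() works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(
        "graelo/wikipedia",
        config_name(dump_date, lang),
        split="train",  # assumption: each configuration exposes a 'train' split
        streaming=streaming,
    )


# Example usage (requires network access and the `datasets` package):
#   articles = load_language("ACE")
#   first = next(iter(articles))
#   print(first.keys())
```

With `streaming=True`, records are fetched as they are iterated, so a small language such as ACE can be inspected without pulling the full dataset.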

Details

Task: Text Generation, Fill Mask
Language: AB, ACE, ADY
Format: Parquet
Rows / instances: 122,847,731
Size: 100M<n<1B
Creator: graelo
Year: 2023
License: cc-by-sa-3.0
Downloads: 1,775
Likes: 71