olm/wikipedia

Text Generation · Fill Mask · AA, AB, ACE

olm/wikipedia is a dataset for text generation and fill-mask tasks. It covers the languages AA, AB, and ACE and is distributed in Parquet format.

About olm/wikipedia

Wikipedia dataset containing cleaned articles of all languages. The datasets are built from the Wikipedia dump (https://dumps.wikimedia.org/) with one split per language. Each example contains the content of one full Wikipedia article with cleanin...
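
The "one split per language" layout described above can be illustrated with a minimal sketch that maps each language code to a dump directory under https://dumps.wikimedia.org/. The helper names `dump_dir_url` and `dump_dirs` and the date `20220920` are hypothetical, chosen only for illustration. The `<lang>wiki/<YYYYMMDD>/` path layout reflects how the public dump site is organized and is an assumption here, not something this page states.

```python
from urllib.parse import urljoin

# Root of the Wikimedia dump site, as given in the dataset description.
DUMPS_ROOT = "https://dumps.wikimedia.org/"


def dump_dir_url(language: str, date: str) -> str:
    """Return the dump directory URL for one language edition.

    Assumption: directories follow <root>/<lang>wiki/<YYYYMMDD>/, with any
    hyphen in the language code replaced by an underscore.
    """
    wiki = language.lower().replace("-", "_") + "wiki"
    return urljoin(DUMPS_ROOT, f"{wiki}/{date}/")


def dump_dirs(languages, date):
    """Map each lowercased language code to its dump directory URL."""
    return {lang.lower(): dump_dir_url(lang, date) for lang in languages}


if __name__ == "__main__":
    # "20220920" is a placeholder date, not taken from this page.
    for lang, url in dump_dirs(["AA", "AB", "ACE"], "20220920").items():
        print(lang, url)
```

Each key of the returned dictionary corresponds to one language split. Fetching and cleaning the articles themselves is outside the scope of this sketch.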

Details

Task: Text Generation, Fill Mask
Language: AA, AB, ACE
Format: Parquet
Rows / instances: N/A
Creator: olm
Year: 2022