oscar-corpus/OSCAR-2201

Fill Mask | Text Generation | AF, SQ, AM

oscar-corpus/OSCAR-2201 is a fill-mask and text-generation dataset in Afrikaans (AF), Albanian (SQ), and Amharic (AM), published by oscar-corpus in Parquet format.

About oscar-corpus/OSCAR-2201

The Open Super-large Crawled Aggregated coRpus (OSCAR) is a huge multilingual corpus obtained by language classification and filtering of the Common Crawl corpus using the Ungoliant architecture.

Details

Task: Fill Mask, Text Generation
Language: AF, SQ, AM
Format: Parquet
Rows / instances: N/A
Creator: oscar-corpus
Year: 2022
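
To make the Fill Mask task listed above concrete, here is a minimal sketch of how one line of raw text might be turned into a fill-mask example. It is an illustration only, not the corpus's or any library's actual preprocessing: the function name `make_fill_mask_example`, the `[MASK]` token string, the whitespace tokenization, and the sample sentence are all assumptions introduced here.

```python
from typing import Tuple


def make_fill_mask_example(
    text: str, mask_index: int, mask_token: str = "[MASK]"
) -> Tuple[str, str]:
    """Replace the whitespace-delimited word at mask_index with mask_token.

    Returns a (masked_text, target_word) pair. Both the splitting rule and
    the mask token are assumptions made for illustration.
    """
    words = text.split()
    if not 0 <= mask_index < len(words):
        raise IndexError("mask_index is outside the sentence")
    target = words[mask_index]
    words[mask_index] = mask_token
    return " ".join(words), target


# Hypothetical sample sentence, not taken from the corpus.
masked, target = make_fill_mask_example("Die kat sit op die mat", 1)
print(masked)  # Die [MASK] sit op die mat
print(target)  # kat
```

A model trained on this task would be asked to predict the target word from the masked text. Real pipelines typically mask subword tokens chosen at random rather than a single whole word at a fixed position.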
FAQ