oscar-corpus/OSCAR-2201
Fill Mask, Text Generation; AF, SQ, AM
oscar-corpus/OSCAR-2201 is a dataset for fill-mask and text-generation tasks in Afrikaans (AF), Albanian (SQ), and Amharic (AM), published by oscar-corpus in Parquet format.
About oscar-corpus/OSCAR-2201
The Open Super-large Crawled Aggregated coRpus (OSCAR) is a huge multilingual corpus obtained by language classification and filtering of the Common Crawl corpus using the Ungoliant architecture.
Details
- Task: Fill Mask, Text Generation
- Language: AF, SQ, AM
- Format: Parquet
- Rows / instances: N/A
- Creator: oscar-corpus
- Year: 2022
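As a rough illustration of the fill-mask task listed above, the sketch below turns a raw text record into a masked example by hiding one whitespace-delimited word. Everything in it is an assumption made for illustration: the `text` field name, the `[MASK]` placeholder, the helper `make_fill_mask_example`, and the sample sentence are hypothetical and are not taken from the dataset itself.

```python
import random

MASK_TOKEN = "[MASK]"  # assumed placeholder; real tokenizers define their own mask token


def make_fill_mask_example(text, rng):
    """Replace one whitespace-delimited word with MASK_TOKEN.

    Returns (masked_text, target_word), or None when the text has no words.
    """
    words = text.split()
    if not words:
        return None
    index = rng.randrange(len(words))  # pick the position to hide
    target = words[index]
    words[index] = MASK_TOKEN
    return " ".join(words), target


# Hypothetical rows standing in for corpus records; the "text" key is an assumption.
rows = [{"text": "Die son skyn helder vandag"}, {"text": ""}]
rng = random.Random(0)  # seeded so repeated runs pick the same position
examples = [make_fill_mask_example(row["text"], rng) for row in rows]
```

Splitting on whitespace keeps the sketch dependency-free. A real training pipeline would more likely mask subword tokens produced by the model's own tokenizer.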