saheedniyi/naijaweb

Text Generation | EN, YO, HA

Created by saheedniyi in 2024, saheedniyi/naijaweb is a text generation dataset in EN, YO, and HA, distributed in Parquet format.

About saheedniyi/naijaweb

Naijaweb Dataset 🇳🇬 Naijaweb is a dataset that contains more than 270,000 documents, totaling approximately 230 million GPT-2 tokens. The data was scraped from web pages popular among Nigerians, providing a rich resource for modeling Nigerian l...
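
For a sense of scale, the two headline figures above imply an average document length, and the dataset can be sampled without downloading it in full. The sketch below is illustrative only: the helper names `approx_tokens_per_document` and `peek_naijaweb` are hypothetical, the `train` split is an assumption, and the column names are not listed on this page, so nothing is assumed about them. It relies on the Hugging Face `datasets` library.

```python
from itertools import islice


def approx_tokens_per_document(total_tokens: int, num_documents: int) -> float:
    """Average tokens per document, from the two figures quoted in the description."""
    return total_tokens / num_documents


def peek_naijaweb(n: int = 3) -> list:
    """Return the first n records as dicts, streamed rather than fully downloaded.

    Requires network access and `pip install datasets`.
    """
    # Imported lazily so the arithmetic helper works without the library installed.
    from datasets import load_dataset

    # Assumption: the dataset exposes a "train" split.
    stream = load_dataset("saheedniyi/naijaweb", split="train", streaming=True)
    return list(islice(stream, n))


# Roughly 230 million tokens over roughly 270,000 documents.
average = approx_tokens_per_document(230_000_000, 270_000)
print(round(average))  # 852
```

Calling `peek_naijaweb()` and inspecting the keys of the returned records is one way to discover the actual column names before writing any processing code. Since both input figures are approximate, the average is a rough estimate.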

Details

Task: Text Generation
Language: EN, YO, HA
Format: Parquet
Rows / instances: N/A
Creator: saheedniyi
Year: 2024