ClusterlabAi/101_billion_arabic_words_dataset

Text Generation · AR · apache-2.0

ClusterlabAi/101_billion_arabic_words_dataset is an Arabic (AR) text generation dataset with 33,059,988 rows, distributed in Parquet format under the apache-2.0 license. It falls in the 10M<n<100M size category and has been downloaded about 1.1K times.
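With tens of millions of rows stored as Parquet, it can be convenient to stream a few rows before committing to a full download. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and that the repository exposes a split named `train`. Neither the split name nor the column names are stated on this page, and the helper names `stream_rows` and `take` are hypothetical.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List

REPO_ID = "ClusterlabAi/101_billion_arabic_words_dataset"


def stream_rows(split: str = "train") -> Iterator[Dict[str, Any]]:
    """Lazily iterate over rows instead of downloading every Parquet shard.

    Assumption: the repository has a split called "train".
    """
    # Imported inside the function so that `take` below can be used
    # even when the `datasets` library is not installed.
    from datasets import load_dataset

    return iter(load_dataset(REPO_ID, split=split, streaming=True))


def take(rows: Iterable[Dict[str, Any]], n: int) -> List[Dict[str, Any]]:
    """Return the first n rows from any iterable of row dictionaries."""
    return list(islice(rows, n))
```

Calling `take(stream_rows(), 3)` would fetch only the first three rows over the network. Because the column names are not listed here, inspect the keys of a returned row before relying on any particular field.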

About ClusterlabAi/101_billion_arabic_words_dataset

101 Billion Arabic Words Dataset

Updates
Maintenance Status: Actively Maintained
Update Frequency: Weekly updates to refine data quality and expand coverage.

Upcoming Version
More Cleaned Version: A more cleaned version...

Details

Task: Text Generation
Language: AR
Format: Parquet
Rows / instances: 33,059,988
Size: 10M<n<100M
Creator: ClusterlabAi
Year: 2024
License: apache-2.0
Downloads: 1,133
Likes: 72