
Jr23xd23/ArabicText-Large

Text Generation, Fill Mask, Text Classification, AR, apache-2.0

Jr23xd23/ArabicText-Large is an Arabic (AR) dataset aimed primarily at text generation and distributed in Parquet format. It is released under the apache-2.0 license, falls in the 100K<n<1M size category, and has been downloaded 413 times.

About Jr23xd23/ArabicText-Large

ArabicText-Large: High-Quality Arabic Corpus for LLM Training

Dataset Summary

ArabicText-Large is a comprehensive, high-quality Arabic text corpus comprising 743,288 articles with over 244 million words, specifically curated fo...

Details

Task: Text Generation, Fill Mask, Text Classification
Language: AR
Format: Parquet
Rows / instances: N/A
Size: 100K<n<1M
Creator: Jr23xd23
Year: 2025
License: apache-2.0
Downloads: 413
Likes: 69
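
Because the corpus is published as Parquet under the repository name Jr23xd23/ArabicText-Large, one plausible way to inspect it is to stream a few rows with the Hugging Face `datasets` library instead of downloading every file first. The sketch below is illustrative only: it assumes `datasets` is installed, that a split named "train" exists, and that the article body lives in a column called "text". None of these is confirmed by this page, so check the actual schema before relying on it.

```python
from itertools import islice
from typing import Iterable, Iterator

DATASET_ID = "Jr23xd23/ArabicText-Large"  # repository name taken from this page


def stream_articles(split: str = "train") -> Iterator[dict]:
    """Lazily yield rows from the Hub without fetching the whole corpus.

    Assumption: a split named "train" exists (not stated on this page).
    """
    # Imported here so the helper below can be used without `datasets` installed.
    from datasets import load_dataset

    return iter(load_dataset(DATASET_ID, split=split, streaming=True))


def count_words(rows: Iterable[dict], text_column: str = "text") -> int:
    """Count whitespace-separated tokens across rows.

    Assumption: "text" is a hypothetical column name; rows without it count as 0.
    """
    return sum(len(str(row.get(text_column, "")).split()) for row in rows)


if __name__ == "__main__":
    # Requires network access; inspect the real column names before counting.
    sample = list(islice(stream_articles(), 100))
    print(sorted(sample[0].keys()))
    print(count_words(sample))
```

Streaming keeps memory use small for a corpus of this size; printing the keys of the first row is a quick way to confirm the real column names before running any counts.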

