
EleutherAI/fineweb-edu-dedup-10b

General NLP, English

Created by EleutherAI in 2025, EleutherAI/fineweb-edu-dedup-10b is a General NLP dataset in English containing 9,508,400 records in Parquet format. It has 17,536 downloads and 4 likes, and its row count places it in the 1M<n<10M size category.
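To make the size label concrete, here is a minimal sketch of how a row count could be mapped to a bucket such as 1M<n<10M. The function name `size_category`, the full list of bucket labels, and the treatment of exact boundary values are assumptions for illustration; only the 1M<n<10M label and the 9,508,400 row count come from this page.

```python
def size_category(n_rows: int) -> str:
    """Map a row count to a power-of-ten size bucket label.

    Assumption: a count exactly equal to a boundary (e.g. 1,000,000)
    is placed in the higher bucket; the page does not specify this.
    """
    # (exclusive upper bound, label) pairs in ascending order.
    buckets = [
        (1_000, "n<1K"),
        (10_000, "1K<n<10K"),
        (100_000, "10K<n<100K"),
        (1_000_000, "100K<n<1M"),
        (10_000_000, "1M<n<10M"),
        (100_000_000, "10M<n<100M"),
        (1_000_000_000, "100M<n<1B"),
    ]
    for upper, label in buckets:
        if n_rows < upper:
            return label
    # Hypothetical catch-all label for anything at or above one billion rows.
    return "n>1B"


# Row count reported on this page.
print(size_category(9_508_400))  # prints "1M<n<10M"
```

The lookup is a simple ascending scan, so adding or renaming buckets only requires editing the list.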

Details

Task: General NLP
Language: English
Format: Parquet
Rows / instances: 9,508,400
Size: 1M<n<10M
Creator: EleutherAI
Year: 2025
Downloads: 17,536
Likes: 4
