EleutherAI/fineweb-edu-dedup-10b
General NLP, English
EleutherAI/fineweb-edu-dedup-10b is an English-language General NLP dataset published by EleutherAI in 2025. It contains 9,508,400 records in Parquet format, which places it in the 1M<n<10M size category. It has 17,536 downloads and 4 likes.
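Because the records are stored as Parquet, one plausible way to inspect them is to stream a few rows with the Hugging Face `datasets` library rather than downloading everything. The sketch below is a minimal example under assumptions: the helper name `peek_rows` is hypothetical, the split name `"train"` is a guess that is not stated on this page, and the column names are unknown, so the usage note only prints the keys it finds.

```python
from itertools import islice

# Dataset identifier as given on this page.
DATASET_ID = "EleutherAI/fineweb-edu-dedup-10b"


def peek_rows(n=3, split="train"):
    """Return the first n records as dicts, streaming instead of downloading.

    The split name "train" is an assumption; adjust it if the dataset
    exposes a different split.
    """
    # Imported lazily so this module can be imported without the library.
    from datasets import load_dataset  # requires: pip install datasets

    # streaming=True yields records on demand from the remote Parquet files.
    stream = load_dataset(DATASET_ID, split=split, streaming=True)
    return list(islice(stream, n))
```

Calling `peek_rows(2)` and printing `sorted(row.keys())` for each returned row would show which columns the dataset provides; this needs network access and the `datasets` package.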
Details
- Task: General NLP
- Language: English
- Format: Parquet
- Rows / instances: 9,508,400
- Size: 1M<n<10M
- Creator: EleutherAI
- Year: 2025
- Downloads: 17,536
- Likes: 4
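
The Size value is a bucket derived from the row count. As a rough illustration of how 9,508,400 rows lands in 1M<n<10M, the sketch below maps a row count to a power-of-ten bucket label. The function name `size_bucket`, the treatment of exact boundary values, and the `n>=1B` fallback label are assumptions made for this example, not a documented rule.

```python
def size_bucket(num_rows: int) -> str:
    """Map a row count to a power-of-ten size label (illustrative only)."""
    # (exclusive upper bound, label) pairs in ascending order.
    bounds = [
        (1_000, "n<1K"),
        (10_000, "1K<n<10K"),
        (100_000, "10K<n<100K"),
        (1_000_000, "100K<n<1M"),
        (10_000_000, "1M<n<10M"),
        (100_000_000, "10M<n<100M"),
        (1_000_000_000, "100M<n<1B"),
    ]
    for upper, label in bounds:
        if num_rows < upper:
            return label
    # Placeholder for anything larger; real catalogs may subdivide further.
    return "n>=1B"


# Row count taken from the Details list.
bucket = size_bucket(9_508_400)
```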