HuggingFaceFW/fineweb-edu-score-2
Text Generation · EN · odc-by
HuggingFaceFW/fineweb-edu-score-2 is an English text-generation dataset published by HuggingFaceFW in 2024. It has 46.7K downloads and 87 likes, is released under the odc-by license, and falls in the 10B<n<100B size category.
About HuggingFaceFW/fineweb-edu-score-2
FineWeb-Edu-score-2
1.3 trillion tokens of the finest educational data the web has to offer
What is it?
The FineWeb-Edu dataset consists of 1.3T tokens (FineWeb-Edu) and 5.4T tokens of educational web pages filtered fro...
Details
- Task: Text Generation
- Language: EN
- Format: Parquet
- Rows / instances: N/A
- Size: 10B<n<100B
- Creator: HuggingFaceFW
- Year: 2024
- License: odc-by
- Downloads: 46701
- Likes: 87
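
Since the data is distributed as Parquet and is large, one reasonable way to inspect it is to stream a few rows rather than download everything. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed, that the default configuration exposes a `train` split, and that each row has a `text` field; none of these details are stated in the listing above. The helper names `first_texts`, `open_stream`, and `preview` are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator


def first_texts(rows: Iterable[dict], n: int, field: str = "text") -> Iterator[str]:
    """Yield the `field` value from the first n rows of any iterable of dict rows."""
    for row in islice(rows, n):
        yield row[field]


def open_stream():
    """Open the dataset lazily. Needs `pip install datasets` and network access."""
    # Imported here so first_texts stays usable without the library installed.
    from datasets import load_dataset

    # streaming=True iterates over the remote files instead of downloading them all.
    # split="train" is an assumption about how this dataset is laid out.
    return load_dataset(
        "HuggingFaceFW/fineweb-edu-score-2",
        split="train",
        streaming=True,
    )


def preview(n: int = 3, width: int = 200) -> None:
    """Print the first `width` characters of the first n documents."""
    for text in first_texts(open_stream(), n):
        print(text[:width])
```

Calling `preview()` prints the start of the first three documents. Keeping `first_texts` independent of the loader means it can be tried on any list of dictionaries before touching the network.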