HuggingFaceFW/fineweb-edu-score-2

Text Generation · EN · odc-by

HuggingFaceFW/fineweb-edu-score-2 is an English (EN) text generation dataset published by HuggingFaceFW in 2024. It has about 46.7K downloads and 87 likes. It is released under the odc-by license and falls in the 10B<n<100B size category.

About HuggingFaceFW/fineweb-edu-score-2

📚 FineWeb-Edu-score-2: 1.3 trillion tokens of the finest educational data the 🌐 web has to offer. What is it? The 📚 FineWeb-Edu dataset consists of 1.3T tokens (FineWeb-Edu) and 5.4T tokens of educational web pages filtered fro...

Details

Task: Text Generation
Language: EN
Format: Parquet
Rows / instances: N/A
Size: 10B<n<100B
Creator: HuggingFaceFW
Year: 2024
License: odc-by
Downloads: 46701
Likes: 87
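
Given the Parquet format and the 10B<n<100B size category listed above, streaming rows is usually more practical than downloading everything first. The following is a minimal sketch, not an official recipe: it assumes the Hugging Face `datasets` package is installed, that the repository exposes a "train" split, and that each row has a "text" field. The helper names `stream_rows` and `take_texts` are hypothetical and exist only for this illustration.

```python
from itertools import islice
from typing import Any, Iterable, Iterator

# Repository name as given on this page.
DATASET_ID = "HuggingFaceFW/fineweb-edu-score-2"


def stream_rows(split: str = "train") -> Iterator[dict[str, Any]]:
    """Lazily iterate over rows without downloading the full dataset.

    Assumption: the `datasets` package is installed and a "train" split exists.
    """
    # Imported inside the function so take_texts() stays usable without `datasets`.
    from datasets import load_dataset

    return iter(load_dataset(DATASET_ID, split=split, streaming=True))


def take_texts(rows: Iterable[dict[str, Any]], n: int, field: str = "text") -> list[str]:
    """Return the `field` value of the first `n` rows.

    Assumption: "text" is the name of the column holding the page content.
    """
    return [row[field] for row in islice(rows, n)]
```

A typical call would be `take_texts(stream_rows(), 3)`, which needs network access. Keeping `take_texts` independent of the loader lets it be tried on any iterable of dictionaries first.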
