HuggingFaceFW/fineweb-2
Text Generation | AAI, AAK, AAU | odc-by
The HuggingFaceFW/fineweb-2 dataset is a text generation resource published by HuggingFaceFW in 2024, with AAI, AAK, and AAU among its listed languages. It has 99.1K downloads and 827 likes. It is released under the odc-by license and falls in the 1B<n<10B size category.
About HuggingFaceFW/fineweb-2
🥂 FineWeb2
A sparkling update with 1000s of languages
What is it?
This is the second iteration of the popular 🍷 FineWeb dataset, bringing high-quality pretraining data to over 1000 🗣️ languages.
The 🥂 FineWeb2 dataset is fu...
Details
- Task: Text Generation
- Language: AAI, AAK, AAU
- Format: Parquet
- Rows / instances: N/A
- Size: 1B<n<10B
- Creator: HuggingFaceFW
- Year: 2024
- License: odc-by
- Downloads: 99,062
- Likes: 827