m-a-p/FineFineWeb
Text Classification, Text Generation, EN
Created by m-a-p in 2026, m-a-p/FineFineWeb is an English-language dataset for text classification and text generation, distributed in Parquet format.
About m-a-p/FineFineWeb
FineFineWeb: A Comprehensive Study on Fine-Grained Domain Web Corpus
arXiv: Coming Soon
Project Page: Coming Soon
Blog: Coming Soon
Data Statistics
Domain (#tokens/#samples) | Iteration 1 Tokens | Iteration 2 Tokens | Iterati...
Details
- Task: Text Classification, Text Generation
- Language: EN
- Format: Parquet
- Rows / instances: N/A
- Creator: m-a-p
- Year: 2026