OctoThinker/MegaMath-Web-Pro-Max
General NLP | English
The OctoThinker/MegaMath-Web-Pro-Max dataset is an English General NLP resource from OctoThinker, dated 2025.
About OctoThinker/MegaMath-Web-Pro-Max
OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling
The Curation of MegaMath-Web-Pro-Max
Step 1: Uniformly and randomly sample millions of documents from the MegaMath-Web corpus, stratified by publication year;
Ste...
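The sampling described in Step 1 can be illustrated with a minimal sketch. It is not the OctoThinker pipeline: the `year` field, the `stratified_sample` helper, and the choice of applying the same sampling fraction to every year are all assumptions made for illustration, since the card does not say how the sample was allocated across years.

```python
import random
from collections import defaultdict


def stratified_sample(docs, fraction, seed=0):
    """Draw a uniform random sample within each publication-year stratum.

    Hypothetical helper: `docs` is a list of dicts with a "year" key
    (an assumed schema), and `fraction` is the share of each stratum
    to keep, between 0 and 1.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible

    # Group documents into strata keyed by publication year.
    by_year = defaultdict(list)
    for doc in docs:
        by_year[doc["year"]].append(doc)

    sample = []
    for year in sorted(by_year):
        group = by_year[year]
        # Keep at least one document per year so no stratum vanishes.
        k = max(1, round(len(group) * fraction))
        # random.sample draws k distinct items uniformly without replacement.
        sample.extend(rng.sample(group, k))
    return sample
```

With proportional allocation like this, the sample keeps the year distribution of the source corpus. Drawing a fixed number of documents per year instead would flatten that distribution, so the choice between the two depends on what the downstream step needs.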
Details
- Task: General NLP
- Language: English
- Format: Parquet
- Rows / instances: N/A
- Creator: OctoThinker
- Year: 2025