
OctoThinker/MegaMath-Web-Pro-Max

General NLP, English

OctoThinker/MegaMath-Web-Pro-Max is an English General NLP dataset released by OctoThinker in 2025.

About OctoThinker/MegaMath-Web-Pro-Max

OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling

The Curation of MegaMath-Web-Pro-Max

Step 1: Uniformly and randomly sample millions of documents from the MegaMath-Web corpus, stratified by publication year; Ste...
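As a rough illustration of Step 1, the sketch below draws a uniform random sample within each publication year. It is not the authors' pipeline: the `year` field name, the `stratified_sample` helper, the equal per-year quota, and the toy documents are all assumptions made for this example, and the description above does not say how the sample is allocated across years.

```python
import random
from collections import defaultdict


def stratified_sample(docs, per_year, seed=0):
    """Draw up to `per_year` documents uniformly at random from each year.

    `docs` is an iterable of dicts with a hypothetical "year" key.
    Equal allocation per year is an assumption, not the documented method.
    """
    rng = random.Random(seed)

    # Group documents into strata keyed by publication year.
    by_year = defaultdict(list)
    for doc in docs:
        by_year[doc["year"]].append(doc)

    sample = []
    for year in sorted(by_year):
        group = by_year[year]
        k = min(per_year, len(group))  # never request more than the stratum holds
        sample.extend(rng.sample(group, k))  # uniform sampling without replacement
    return sample


# Toy corpus: 30 documents spread evenly over three years.
docs = [{"id": i, "year": 2014 + i % 3} for i in range(30)]
picked = stratified_sample(docs, per_year=4)
```

Sampling within each year separately keeps sparsely represented years from being crowded out, which a single uniform draw over the whole corpus would not guarantee.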

Details

Task: General NLP
Language: English
Format: Parquet
Rows / instances: N/A
Creator: OctoThinker
Year: 2025
