allenai/dolma3_dolmino_mix-100B-1125

General NLP | EN | odc-by

Created by allenai in 2025, allenai/dolma3_dolmino_mix-100B-1125 is an English-language General NLP dataset distributed in Parquet format. It has 36.7K downloads and 21 likes, and is released under the odc-by license.

About allenai/dolma3_dolmino_mix-100B-1125

Dolma 3 Dolmino dataset pool for Olmo 3 stage 2 annealing training. This dataset contains the high-quality pool of data considered for the second stage of Olmo 3 32B. Dataset Sources Source Category TinyMATH Mind Math (syn...

Details

Task: General NLP
Language: EN
Format: Parquet
Rows / instances: N/A
Creator: allenai
Year: 2025
License: odc-by
Downloads: 36691
Likes: 21
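
Because the dataset is distributed as Parquet and the row count is not listed, it can help to inspect a few records before committing to a full download. The following is a minimal sketch, not an official loading recipe. It assumes the Hugging Face `datasets` package is installed, that network access is available, and that the repository exposes a split named "train" with a default configuration; none of these are stated on this page. The helper name `peek` is hypothetical.

```python
from itertools import islice

# Repository id exactly as it appears on this page.
DATASET_ID = "allenai/dolma3_dolmino_mix-100B-1125"


def peek(dataset_id=DATASET_ID, n=3, split="train"):
    """Return the first n records without downloading the full dataset.

    Assumptions (not confirmed by this page): the `datasets` package is
    installed, the split is named "train", and no configuration name is needed.
    """
    # Imported lazily so that defining this helper does not require the package.
    from datasets import load_dataset

    # streaming=True iterates over records lazily instead of fetching every shard.
    stream = load_dataset(dataset_id, split=split, streaming=True)

    # islice stops after n records, so only a small amount of data is read.
    return list(islice(stream, n))
```

Calling `peek()` returns a list of dictionaries, one per record. Since the column names are not listed here, printing `sorted(record.keys())` for the first record is a reasonable first step. If the repository requires a configuration name, `load_dataset` raises an error that lists the available ones.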