nvidia/Nemotron-ClimbMix

Text Generation, EN

nvidia/Nemotron-ClimbMix is an English-language text generation dataset distributed in Parquet format.

About nvidia/Nemotron-ClimbMix

ClimbMix Dataset: Creating the highest-quality pre-training datasets for LLMs. Figure 1: Continuously training a 1B model yields a 2.0% imp...

Details

Task: Text Generation
Language: EN
Format: Parquet
Rows / instances: N/A
Creator: nvidia
Year: 2025