nvidia/OpenCodeInstruct
Created by nvidia in 2025, nvidia/OpenCodeInstruct is an English-language (EN) text generation dataset distributed in Parquet format.
About nvidia/OpenCodeInstruct
OpenCodeInstruct: A Large-scale Instruction Tuning Dataset for Code LLMs
Dataset Description
We introduce OpenCodeInstruct, the largest open-access instruction tuning dataset, comprising 5 million diverse samples. OpenCodeInstruct is...
Details
- Task: Text Generation
- Language: EN
- Format: Parquet
- Rows / instances: N/A
- Creator: nvidia
- Year: 2025
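Because the listing gives a repository-style identifier and a Parquet format, previewing a few rows could be sketched as below. This is only a sketch under assumptions that the listing does not confirm: that the dataset is hosted on the Hugging Face Hub under the id `nvidia/OpenCodeInstruct`, that the `datasets` library is installed, and that a split named `train` exists. The helper name `preview_rows` is hypothetical.

```python
from itertools import islice

# Repository id exactly as it appears in the listing.
DATASET_ID = "nvidia/OpenCodeInstruct"


def preview_rows(n=3, split="train"):
    """Return the first n rows as dicts without downloading the full dataset.

    Assumptions: the dataset is reachable on the Hugging Face Hub under
    DATASET_ID and exposes a split called `split` (default "train").
    """
    # Imported lazily so this module can be imported without the dependency.
    from datasets import load_dataset

    # streaming=True iterates over the remote files lazily instead of
    # fetching every Parquet shard up front.
    stream = load_dataset(DATASET_ID, split=split, streaming=True)
    return list(islice(stream, n))
```

Calling `preview_rows(2)` would return a list of two row dictionaries. The column names are not stated in the listing, so inspecting `row.keys()` on a returned row is the safest first step.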