
OpenCoder-LLM/opc-fineweb-code-corpus

General NLP | English

OpenCoder-LLM/opc-fineweb-code-corpus is an English-language dataset for general NLP tasks, distributed in Parquet format.

About OpenCoder-LLM/opc-fineweb-code-corpus

OpenCoder Dataset

The OpenCoder dataset is composed of the following datasets:

opc-sft-stage1: the SFT data used for OpenCoder sft-stage1
opc-sft-stage2: the SFT data used for OpenCoder sft-stage2
opc-annealing-corpus: the synthetic data & alg...

Details

Task: General NLP
Language: English
Format: Parquet
Rows / instances: N/A
Creator: OpenCoder-LLM
Year: 2024