
Third-Space/code_bagel

General NLP, EN

Third-Space/code_bagel is an English-language General NLP dataset from Third-Space, distributed in Parquet format.

About Third-Space/code_bagel

This is an unofficial reupload of Code_bagel. You can find the original dataset here: https://huggingface.co/datasets/rombodawg/code_bagel

A coding bagel, with everything coding related. Around 800 million tokens of unique coding data. 1...

Details

Task: General NLP
Language: EN
Format: Parquet
Rows / instances: N/A
Creator: Third-Space
Year: 2024
