Third-Space/code_bagel
Third-Space/code_bagel is an English-language general NLP dataset from Third-Space, distributed in Parquet format.
About Third-Space/code_bagel
This is an unofficial reupload of Code_bagel. You can find the original dataset here:
https://huggingface.co/datasets/rombodawg/code_bagel
A coding bagel, with everything coding related
Around 800 million tokens of unique coding data
1...
Details
- Task: General NLP
- Language: EN
- Format: Parquet
- Rows / instances: N/A
- Creator: Third-Space
- Year: 2024
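
Since the dataset is distributed as Parquet, one way to take a first look is to stream a few rows instead of downloading everything. The sketch below assumes the repository is reachable on the Hugging Face Hub under the ID `Third-Space/code_bagel` and that the `datasets` library is installed. The helper name `peek_rows` is hypothetical. Split and column names are not listed on this page, so the code discovers them at run time rather than hard-coding them.

```python
from itertools import islice

# Assumption: the dataset is hosted on the Hugging Face Hub under this ID.
DATASET_ID = "Third-Space/code_bagel"


def peek_rows(dataset_id: str = DATASET_ID, n: int = 3) -> dict:
    """Return the first n rows of every split, keyed by split name.

    Hypothetical helper for illustration. Streaming reads rows lazily,
    so the full dataset is not downloaded up front.
    """
    # Imported lazily so the module can be loaded without the library.
    from datasets import load_dataset

    # With streaming=True and no split argument, load_dataset returns a
    # mapping from split name to an iterable dataset.
    splits = load_dataset(dataset_id, streaming=True)
    return {name: list(islice(split, n)) for name, split in splits.items()}
```

Calling `peek_rows()` needs network access; printing the keys of the first returned row in each split shows which columns the Parquet files actually contain.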