bigcode/the-stack-smol
Text Generation, CODE
bigcode/the-stack-smol is a text-generation dataset of source code (language tag: CODE), distributed in Parquet format. It falls in the 100K<n<1M size category and has been downloaded 12.4K times.
About bigcode/the-stack-smol
Dataset Description
A small subset (~0.1%) of the-stack dataset, in which each programming language has 10,000 random samples drawn from the original dataset. The dataset contains 2.6GB of text (code).
Languages
The dataset contains 30 programming ...
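As a quick sanity check on the figures above, the row count can be estimated from the per-language sample size and compared with the listed size category. This is only a sketch: it assumes the truncated "30 programming ..." refers to 30 languages and that every language has exactly 10,000 samples, neither of which the card confirms. The function names are hypothetical.

```python
# Assumptions (not confirmed by the card): the truncated "30 programming ..."
# refers to 30 languages, and every language has exactly 10,000 samples.
NUM_LANGUAGES = 30
SAMPLES_PER_LANGUAGE = 10_000


def estimated_rows(num_languages: int, samples_per_language: int) -> int:
    """Estimate total rows as languages x samples per language."""
    return num_languages * samples_per_language


def in_size_category(rows: int, low: int = 100_000, high: int = 1_000_000) -> bool:
    """Check the strict bounds of the 100K<n<1M size category."""
    return low < rows < high


rows = estimated_rows(NUM_LANGUAGES, SAMPLES_PER_LANGUAGE)
print(rows, in_size_category(rows))
```

Under these assumptions the estimate is 300,000 rows, which sits inside the listed 100K<n<1M category.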
Details
- Task: Text Generation
- Language: CODE
- Format: Parquet
- Rows / instances: N/A
- Size: 100K<n<1M
- Creator: bigcode
- Year: 2022
- Downloads: 12390
- Likes: 84