bigcode/the-stack-dedup

Text Generation · CODE

The bigcode/the-stack-dedup dataset is a source-code (CODE) dataset for text generation, released by bigcode in 2022. It has about 18K downloads and 398 likes. Its license is listed as "other", and it falls in the 100M<n<1B size category.

About bigcode/the-stack-dedup

Dataset Card for The Stack. Changelog, release v1.0: Initial release of the Stack. Included 30 programming languages and 18 permissive licenses. Note: Three included licenses (MPL/EPL/LGPL) are considered weak co...

Details

Task: Text Generation
Language: CODE
Format: Parquet
Rows / instances: N/A
Size: 100M<n<1B
Creator: bigcode
Year: 2022
License: other
Downloads: 17966
Likes: 398