bigcode/the-stack

Text Generation · CODE

bigcode/the-stack is a text-generation dataset of source code (language tag: CODE), distributed in Parquet format. Its license is listed as "other", it falls in the 100M<n<1B size category, and it has been downloaded 22.9K times.
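The size category above reads most naturally as a pair of bounds on the number of examples n. The following is a minimal sketch of that reading, not part of the dataset or of any library: the helper names `parse_size_category` and `in_category` are hypothetical, the set of suffixes (K, M, B, T) is an assumption, and treating both bounds as strict simply takes the tag literally.

```python
import re
from typing import Tuple

# Multipliers for the suffixes assumed to appear in size-category tags.
_SUFFIX = {"": 1, "K": 10**3, "M": 10**6, "B": 10**9, "T": 10**12}


def _to_int(token: str) -> int:
    """Convert a token such as '100M' or '1B' to an integer."""
    match = re.fullmatch(r"(\d+)([KMBT]?)", token)
    if match is None:
        raise ValueError(f"unrecognized size token: {token!r}")
    return int(match.group(1)) * _SUFFIX[match.group(2)]


def parse_size_category(tag: str) -> Tuple[int, int]:
    """Return (lower, upper) bounds for a tag shaped like 'LOW<n<HIGH'."""
    low, middle, high = tag.split("<")
    if middle != "n":
        raise ValueError(f"unexpected tag shape: {tag!r}")
    return _to_int(low), _to_int(high)


def in_category(n: int, tag: str) -> bool:
    """Check whether n lies strictly between the bounds of the tag."""
    low, high = parse_size_category(tag)
    return low < n < high


low, high = parse_size_category("100M<n<1B")
print(low, high)  # prints: 100000000 1000000000
```

Under these assumptions, the tag places the dataset between 100 million and 1 billion examples. The tag gives only a range, not an exact row count.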

About bigcode/the-stack

Dataset Card for The Stack

Changelog

Release: v1.0
Description: Initial release of the Stack. Included 30 programming languages and 18 permissive licenses. Note: Three included licenses (MPL/EPL/LGPL) are considered weak co...

Details

Task: Text Generation
Language: CODE
Format: Parquet
Rows / instances: N/A
Size: 100M<n<1B
Creator: bigcode
Year: 2022
License: other
Downloads: 22932
Likes: 1025