hasankursun/github-code-2025-language-split
General NLP, English
Created by hasankursun in 2025, hasankursun/github-code-2025-language-split is a General NLP dataset in English, distributed in Parquet format. It has 15.8K downloads and 10 likes. Its license is listed as "other", and it falls in the 100M<n<1B size category.
About hasankursun/github-code-2025-language-split
📜 Source Data & Attribution
This dataset is a processed derivative of nick007x/github-code-2025.
Origination
The original data was aggregated by nick007x from public GitHub repositories. We have retained the original content, fil...
Details
- Task: General NLP
- Language: English
- Format: Parquet
- Rows / instances: N/A
- Size: 100M<n<1B
- Creator: hasankursun
- Year: 2025
- License: other
- Downloads: 15819
- Likes: 10