NeelNanda/pile-10k

General NLP, English

NeelNanda/pile-10k is a General NLP dataset in English from NeelNanda in Parquet format.

About NeelNanda/pile-10k

The first 10K elements of The Pile, useful for debugging models trained on it. See the Hugging Face page for the full Pile for more information. Inspired by stas' great resource that does the same for OpenWebText.
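
For a quick debugging pass, the subset could be loaded and inspected roughly as follows. This is a minimal sketch, not an official recipe: it assumes the Hugging Face `datasets` library is installed, that the data sits in a `train` split, and that each row carries its document in a `text` column. The helper names `load_pile_10k` and `preview` are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Iterator


def load_pile_10k(split: str = "train"):
    """Fetch the dataset from the Hugging Face Hub (requires network access).

    The split name "train" is an assumption; adjust it if the Hub page differs.
    """
    # Imported lazily so that preview() below works without the library installed.
    from datasets import load_dataset

    return load_dataset("NeelNanda/pile-10k", split=split)


def preview(
    rows: Iterable[dict],
    n: int = 3,
    width: int = 80,
    column: str = "text",  # assumed column name
) -> Iterator[str]:
    """Yield the first `n` documents, each truncated to `width` characters."""
    for i, row in enumerate(rows):
        if i >= n:
            break
        yield row[column][:width]
```

Under those assumptions, `for snippet in preview(load_pile_10k()): print(snippet)` would print the opening characters of the first three documents, which is usually enough to confirm that a tokenizer or data pipeline is reading the text as expected.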

Details

Task: General NLP
Language: English
Format: Parquet
Rows / instances: N/A
Creator: NeelNanda
Year: 2022