EleutherAI/the_pile_deduplicated
General NLP, English
EleutherAI/the_pile_deduplicated is a General NLP dataset in English from EleutherAI, distributed in Parquet format. It falls in the 100M<n<1B size category and has been downloaded 16.4K times.
Details
- Task: General NLP
- Language: English
- Format: Parquet
- Rows / instances: N/A
- Size: 100M<n<1B
- Creator: EleutherAI
- Year: 2022
- Downloads: 16356
- Likes: 114
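
The card gives the dataset ID and format but no usage, so here is a minimal sketch of previewing a few records. It rests on several assumptions that the card does not state: that the Hugging Face `datasets` library is installed, that the dataset is hosted on the Hugging Face Hub under this ID, and that a `train` split exists. The helper name `preview_records` is hypothetical. Streaming is used because a dataset in the 100M<n<1B size category is impractical to download in full just to inspect it.

```python
from itertools import islice

# Dataset ID as given on the card.
DATASET_ID = "EleutherAI/the_pile_deduplicated"


def preview_records(n=3, split="train"):
    """Return the first n records of the dataset as a list of dicts.

    Assumptions: the dataset lives on the Hugging Face Hub under DATASET_ID
    and exposes the requested split ("train" by default is a guess).
    """
    # Imported lazily so this module loads even without the library installed.
    from datasets import load_dataset

    # streaming=True iterates over the remote files instead of downloading
    # the whole dataset to disk first.
    stream = load_dataset(DATASET_ID, split=split, streaming=True)
    return list(islice(stream, n))
```

Calling `preview_records(3)` needs network access; printing `sorted(record.keys())` for each returned record is a quick way to discover the column names without assuming them in advance.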