EleutherAI/pile
Text Generation, Fill Mask, EN
The EleutherAI/pile dataset is an English (EN) resource for text generation and fill-mask tasks, published by EleutherAI in 2022. It has about 6K downloads and 500 likes. Its license is listed as "other", and it falls in the 100B<n<1T size category.
About EleutherAI/pile
The Pile is an 825 GiB diverse, open source language modelling dataset that consists of 22 smaller, high-quality datasets combined together.
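The 825 GiB figure uses binary units, so it can help to see it in bytes and decimal gigabytes. The sketch below is only illustrative arithmetic: the helper name `gib_to_bytes` is hypothetical, and the final check assumes the 100B<n<1T size category is read as a byte count, which the listing does not confirm.

```python
GIB = 2**30  # bytes in one gibibyte (binary unit)
GB = 10**9   # bytes in one gigabyte (decimal unit)


def gib_to_bytes(gib: float) -> int:
    """Convert a size in GiB to bytes (hypothetical helper for illustration)."""
    return int(gib * GIB)


pile_bytes = gib_to_bytes(825)  # stated size of the Pile
pile_gb = pile_bytes / GB       # same size in decimal gigabytes

# Assumption: interpret the 100B<n<1T category as a number of bytes.
in_size_bucket = 100 * 10**9 < pile_bytes < 10**12

print(pile_bytes, round(pile_gb, 1), in_size_bucket)
```

Under that assumption, the stated size sits inside the listed category; the individual sizes of the 22 component datasets are not given here, so no per-component breakdown is attempted.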
Details
- Task: Text Generation, Fill Mask
- Language: EN
- Format: Parquet
- Rows / instances: N/A
- Size: 100B<n<1T
- Creator: EleutherAI
- Year: 2022
- License: other
- Downloads: 5950
- Likes: 500