EleutherAI/proof-pile-2

Text Generation | EN

Created by EleutherAI in 2023, EleutherAI/proof-pile-2 is an English-language text generation dataset distributed in Parquet format. It has 16.4K downloads and 226 likes, and falls in the 10B<n<100B size category.

About EleutherAI/proof-pile-2

<img src="proofpile_logo.jpg" width="500">

[ArXiv](http://arxiv.org/abs/2310.10631) | [Models](https://huggingface.co/EleutherAI/llemma_34b) | [Data](https://huggingface.co/datasets/EleutherAI/proof-pile-2) | [Code](https://github.com/EleutherAI/math-lm) | [Blog](https://blog.eleuther.ai/llemma/) | [Sample Explorer](https://llemma-demo.github.io/)

[Zhangir Azerbayev](https://zhangir-azerbayev.github.io/), [Hailey Schoelkopf](https://github.com/haileyschoelkopf), [Keiran Paster](https://keirp.com), [Marco Dos Santos](https://github.com/dsantosmarco), [Stephen McAleer](https://www.andrew.cmu.edu/user/smcaleer/), [Albert Q. Jiang](https://albertqjiang.github.io/), [Jia Deng](https://www.cs.princeton.edu/~jiadeng/), [Stella Biderman](https://www.stellabiderman.com/), [Sean Welleck](https://wellecks.com/)…

Details

Task: Text Generation
Language: EN
Format: Parquet
Rows / instances: N/A
Size: 10B<n<100B
Creator: EleutherAI
Year: 2023
Downloads: 16,448
Likes: 226
