jondurbin/gutenberg-dpo-v0.1
General NLP | EN | cc-by-4.0
Created by jondurbin in 2024, jondurbin/gutenberg-dpo-v0.1 is an English-language General NLP dataset distributed in Parquet format. It has 622 downloads and 166 likes, and is released under the cc-by-4.0 license. It falls in the n<1K size category (fewer than 1,000 rows).
About jondurbin/gutenberg-dpo-v0.1
Gutenberg DPO
Overview
This dataset is meant to enhance the novel-writing capabilities of LLMs by using public domain books from Project Gutenberg.
Process
First, each book is parsed, split into chapters, cleaned up fro...
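The chapter-splitting step can be illustrated with a minimal sketch. The description above does not say how chapters are detected, so the heading pattern and the function name `split_into_chapters` here are hypothetical, not the dataset author's actual code; real Project Gutenberg texts use varied heading styles and would need a more tolerant pattern.

```python
import re

# Assumption: chapters begin with a line such as "CHAPTER I" or "CHAPTER 12".
# This pattern is illustrative only; the dataset's real rules are not documented here.
CHAPTER_HEADING = re.compile(r"^CHAPTER[ \t]+[IVXLCDM\d]+\b.*$", re.MULTILINE)


def split_into_chapters(book_text: str) -> list[str]:
    """Return one string per chapter, dropping any text before the first heading."""
    starts = [m.start() for m in CHAPTER_HEADING.finditer(book_text)]
    chapters = []
    for i, start in enumerate(starts):
        # A chapter runs until the next heading, or to the end of the book.
        end = starts[i + 1] if i + 1 < len(starts) else len(book_text)
        chapters.append(book_text[start:end].strip())
    return chapters


sample = (
    "Front matter\n\n"
    "CHAPTER I\nIt was a dark night.\n\n"
    "CHAPTER II\nMorning came.\n"
)
chapters = split_into_chapters(sample)
```

Each element of `chapters` keeps its heading line, which makes it easy to label chapters in later steps.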
Details
- Task: General NLP
- Language: EN
- Format: Parquet
- Rows / instances: N/A
- Size: n<1K
- Creator: jondurbin
- Year: 2024
- License: cc-by-4.0
- Downloads: 622
- Likes: 166