jondurbin/gutenberg-dpo-v0.1

General NLP | EN | cc-by-4.0

Created by jondurbin in 2024, jondurbin/gutenberg-dpo-v0.1 is an English-language General NLP dataset distributed in Parquet format. It has 622 downloads and 166 likes. It is released under the cc-by-4.0 license and falls in the n<1K size category.

About jondurbin/gutenberg-dpo-v0.1

Gutenberg DPO

Overview: This is a dataset meant to enhance the novel-writing capabilities of LLMs by using public domain books from Project Gutenberg.

Process: First, each book is parsed, split into chapters, cleaned up fro...
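The chapter-splitting step mentioned above can be illustrated with a minimal sketch. This is not the dataset author's actual pipeline, which the truncated description does not show. The heading pattern, the `split_into_chapters` helper, and the sample text are all hypothetical, and the sketch assumes plain-text books whose chapters begin with lines such as "CHAPTER I" or "Chapter 12".

```python
import re

# Assumed heading format (hypothetical): a line starting with "Chapter"
# followed by a Roman or Arabic numeral, optionally followed by a title.
CHAPTER_HEADING = re.compile(
    r"^chapter\s+(?:[ivxlcdm]+|\d+)\b.*$",
    re.IGNORECASE | re.MULTILINE,
)


def split_into_chapters(book_text: str) -> list[dict[str, str]]:
    """Split plain text at chapter headings.

    Text before the first heading (front matter) is dropped.
    """
    matches = list(CHAPTER_HEADING.finditer(book_text))
    chapters = []
    for i, match in enumerate(matches):
        # A chapter runs from the end of its heading to the next heading,
        # or to the end of the book for the last chapter.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(book_text)
        chapters.append({
            "title": match.group(0).strip(),
            "text": book_text[match.end():end].strip(),
        })
    return chapters


# Made-up sample text, used only to exercise the helper.
sample = (
    "Front matter\n\n"
    "CHAPTER I\nIt was a dark night.\n\n"
    "CHAPTER II\nMorning came.\n"
)
chapters = split_into_chapters(sample)
for chapter in chapters:
    print(chapter["title"], "->", chapter["text"])
```

Real Project Gutenberg texts vary widely in how chapters are marked, so a single regular expression like this one would likely need per-book adjustment.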

Details

Task: General NLP
Language: EN
Format: Parquet
Rows / instances: N/A
Size: n<1K
Creator: jondurbin
Year: 2024
License: cc-by-4.0
Downloads: 622
Likes: 166
