unalignment/toxic-dpo-v0.1

General NLP, English

unalignment/toxic-dpo-v0.1 is an English-language general NLP dataset distributed in Parquet format.

About unalignment/toxic-dpo-v0.1

Toxic-DPO: This is a highly toxic, "harmful" dataset meant to illustrate how direct preference optimization (DPO) can be used to de-censor/unalign a model quite easily with very few examples. Most of the examples still contain some am...

Details

Task: General NLP
Language: English
Format: Parquet
Rows / instances: N/A
Creator: unalignment
Year: 2023