unalignment/toxic-dpo-v0.1
General NLP, English
unalignment/toxic-dpo-v0.1 is an English-language General NLP dataset distributed in Parquet format.
About unalignment/toxic-dpo-v0.1
Toxic-DPO
This is a highly toxic, "harmful" dataset meant to illustrate how direct preference optimization (DPO) can be used to de-censor/unalign a model quite easily with very few examples.
Most of the examples still contain some am...
Details
- Task: General NLP
- Language: English
- Format: Parquet
- Rows / instances: N/A
- Creator: unalignment
- Year: 2023