unalignment/toxic-dpo-v0.2
General NLP, English
Created by unalignment in 2024, unalignment/toxic-dpo-v0.2 is an English-language General NLP dataset distributed in Parquet format.
About unalignment/toxic-dpo-v0.2
Toxic-DPO
This is a highly toxic, "harmful" dataset meant to illustrate how easily direct preference optimization (DPO) can be used to de-censor/unalign a model with very few examples.
Many of the examples still contain some am...
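For readers unfamiliar with DPO, the sketch below shows the standard per-pair DPO objective in generic form. It is not taken from this dataset or its author's training code. The function name `dpo_loss`, its parameter names, and the default `beta=0.1` are illustrative assumptions. The inputs are summed log-probabilities of a preferred ("chosen") and a dispreferred ("rejected") response under the model being trained and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss (hypothetical helper; names and beta are assumptions).

    Computes -log(sigmoid(beta * (chosen_logratio - rejected_logratio))),
    where each log-ratio compares the trained policy to the reference model.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(m)) == log(1 + exp(-m)); branch to avoid overflow in exp().
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# When the policy matches the reference, the margin is zero and the loss is log(2).
baseline = dpo_loss(-12.0, -15.0, -12.0, -15.0)

# Raising the chosen response's likelihood relative to the reference lowers the loss.
improved = dpo_loss(-10.0, -15.0, -12.0, -15.0)
```

The loss depends only on how far the policy has moved from the reference on each pair, which is why a small number of preference pairs can measurably shift a model's behavior.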
Details
- Task: General NLP
- Language: English
- Format: Parquet
- Rows / instances: N/A
- Creator: unalignment
- Year: 2024