Question 1

What is the unalignment/toxic-dpo-v0.2 dataset?

Accepted Answer

Toxic-DPO

This is a highly toxic, "harmful" dataset meant to illustrate how easily direct preference optimization (DPO) can be used to de-censor/unalign a model with very few examples.
Many of the examples still contain some am...
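
For readers unfamiliar with DPO, the following is a minimal sketch of the standard per-example DPO loss, included only to show what "preference optimization" computes. It is not taken from the dataset card: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions made for illustration. The inputs are summed log-probabilities of the preferred ("chosen") and dispreferred ("rejected") responses under the model being trained and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss from summed sequence log-probabilities.

    All names and the default beta are illustrative assumptions.
    """
    # How much more the policy favors each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp

    # Scaled preference gap; positive when the policy leans toward "chosen".
    logits = beta * (chosen_margin - rejected_margin)

    # -log(sigmoid(logits)) written in a numerically stable form:
    # softplus(-x) = max(-x, 0) + log(1 + exp(-|x|))
    return max(-logits, 0.0) + math.log1p(math.exp(-abs(logits)))


if __name__ == "__main__":
    # Policy identical to the reference: the loss equals log(2).
    print(dpo_loss(-10.0, -10.0, -10.0, -10.0))
    # Policy shifted toward the chosen response: the loss drops below log(2).
    print(dpo_loss(-8.0, -12.0, -10.0, -10.0))
```

The loss falls as the policy assigns relatively more probability to the chosen response than the reference model does, which is why a small set of preference pairs can shift a model's behavior.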

Question 2

Is unalignment/toxic-dpo-v0.2 a benchmark?

Accepted Answer

No. unalignment/toxic-dpo-v0.2 is a dataset for training or evaluation; it isn't tracked as a standard LLM benchmark in our catalog.

Question 3

Where can I download unalignment/toxic-dpo-v0.2?

Accepted Answer

unalignment/toxic-dpo-v0.2 is available at its source: https://huggingface.co/datasets/unalignment/toxic-dpo-v0.2.
