Question 1

What is the unalignment/toxic-dpo-v0.1 dataset?

Accepted Answer

Toxic-DPO

This is a highly toxic, "harmful" dataset meant to illustrate how easily direct preference optimization (DPO) can de-censor/unalign a model using very few examples.
Most of the examples still contain some am...
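
For readers unfamiliar with DPO in general, here is a minimal sketch of the standard per-example DPO loss. It is not taken from the dataset card. The function name `dpo_loss`, its parameter names, and the default `beta=0.1` are illustrative assumptions. The inputs are the summed log-probabilities that the policy being trained and a frozen reference model assign to the preferred ("chosen") and dispreferred ("rejected") responses for one prompt.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss (hypothetical helper; beta=0.1 is an assumed default).

    loss = -log(sigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r))))
    """
    # How much more (or less) likely each response is under the policy
    # than under the frozen reference model, in log space.
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(margin)) equals log(1 + exp(-margin)); branch on the
    # sign of the margin so exp() never receives a large positive input.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy matches the reference model, the margin is zero and the loss equals log 2. The loss falls as the policy shifts probability toward the chosen response relative to the rejected one, which is why a preference dataset of chosen/rejected pairs is all that DPO needs.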

Question 2

Is unalignment/toxic-dpo-v0.1 a benchmark?

Accepted Answer

unalignment/toxic-dpo-v0.1 is a dataset for training or evaluation; it isn't tracked as a standard LLM benchmark in our catalog.

Question 3

Where can I download unalignment/toxic-dpo-v0.1?

Accepted Answer

unalignment/toxic-dpo-v0.1 is available at its source: https://huggingface.co/datasets/unalignment/toxic-dpo-v0.1.
