ai-safety-institute/AgentHarm
General NLP, English
The ai-safety-institute/AgentHarm dataset is an English General NLP resource published by ai-safety-institute in 2024. It has 3,952 downloads and 57 likes. Its license is listed as "other", and it falls in the n<1K size category (fewer than 1,000 rows).
About ai-safety-institute/AgentHarm
AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents
Maksym Andriushchenko1,†,*, Alexandra Souly2,*
Mateusz Dziemian1, Derek Duenas1, Maxwell Lin1, Justin Wang1, Dan Hendrycks1,§, Andy Zou1,¶,§, Zico Kolter1,¶, Matt Fredrikson1,¶,*
...
Details
- Task: General NLP
- Language: English
- Format: Parquet
- Rows / instances: N/A
- Size: n<1K
- Creator: ai-safety-institute
- Year: 2024
- License: other
- Downloads: 3952
- Likes: 57