obliteratus

Skill | MIT | by erfanzar

Remove refusal behaviors from open-weight LLMs using OBLITERATUS, which applies mechanistic interpretability techniques (diff-in-means, SVD, whitened SVD, LEACE, SAE decomposition, etc.) to excise guardrails while preserving reasoning. It provides 9 CLI methods, 28 analysis modules, 116 model presets across 5 compute tiers, tournament evaluation, and telemetry-driven recommendations. Use when a user wants to uncensor, abliterate, or remove refusal from an LLM.

Details

Path: src/python/xerxes/skills/inference/obliteratus/SKILL.md
License: MIT
Dependencies: 4