hqq-quantization
Half-Quadratic Quantization (HQQ) for LLMs, which requires no calibration data. Use it to quantize models to 4-, 3-, or 2-bit precision without a calibration dataset, for fast quantization workflows, or when deploying with vLLM or HuggingFace Transformers.
Details
- Path: 10-optimization/hqq/SKILL.md
- License: MIT
- Dependencies: 4
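
To make "no calibration data" concrete, here is a minimal sketch of group-wise round-to-nearest quantization at 4, 3, and 2 bits. It derives the scale and zero-point from the weights alone, which is why no calibration dataset is involved. This is only the plain baseline, not HQQ's half-quadratic solver, and it does not use the hqq library's API. The function names, the group size of 64, and the random weights are illustrative assumptions.

```python
import random


def quantize_group(weights, nbits):
    """Round-to-nearest affine quantization of one group of floats.

    Scale and zero-point come from the group's own min and max, so no
    calibration data is needed. Baseline only, not HQQ's optimizer.
    """
    levels = 2 ** nbits - 1
    w_min, w_max = min(weights), max(weights)
    # Guard against a constant group, where max == min.
    scale = (w_max - w_min) / levels if w_max > w_min else 1.0
    codes = [min(levels, max(0, round((w - w_min) / scale))) for w in weights]
    return codes, scale, w_min


def dequantize_group(codes, scale, zero):
    """Map integer codes back to approximate float weights."""
    return [c * scale + zero for c in codes]


def mean_abs_error(weights, nbits, group_size=64):
    """Average reconstruction error when quantizing group by group."""
    total = 0.0
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        codes, scale, zero = quantize_group(group, nbits)
        restored = dequantize_group(codes, scale, zero)
        total += sum(abs(a - b) for a, b in zip(group, restored))
    return total / len(weights)


random.seed(0)
# Hypothetical stand-in for one row of a weight matrix.
weights = [random.gauss(0.0, 0.02) for _ in range(1024)]
errors = {nbits: mean_abs_error(weights, nbits) for nbits in (4, 3, 2)}
for nbits, err in errors.items():
    print(f"{nbits}-bit mean abs error: {err:.6f}")
```

The reconstruction error grows as the bit width shrinks, which is the trade-off to weigh when choosing between 4-, 3-, and 2-bit precision.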