How to Parameterize Asymmetric Quantization Ranges for Quantization-Aware Training (2404.16898v1)

Published 25 Apr 2024 in cs.LG and cs.AI

Abstract: This paper investigates three different parameterizations of asymmetric uniform quantization for quantization-aware training: (1) scale and offset, (2) minimum and maximum, and (3) beta and gamma. We perform a comprehensive comparative analysis of these parameterizations' influence on quantization-aware training, using both controlled experiments and real-world LLMs. Our particular focus is on their changing behavior in response to critical training hyperparameters, namely bit width and learning rate. Based on our investigation, we propose best practices to stabilize and accelerate quantization-aware training with learnable asymmetric quantization ranges.
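For readers unfamiliar with the parameterizations being compared, the sketch below illustrates asymmetric uniform fake quantization under the first two parameterizations: scale/offset and minimum/maximum (the beta/gamma form is defined in the paper and is not reproduced here). This is a minimal PyTorch illustration with assumed parameter names and an 8-bit default, not the authors' implementation.

```python
import torch

# Illustrative asymmetric uniform fake quantization (8-bit by default).
# Parameter names (scale, offset, x_min, x_max) are assumptions for this
# sketch and do not reproduce the paper's exact formulation.

def fake_quant_scale_offset(x, scale, offset, bits=8):
    # Parameterization (1): learnable scale s and offset (zero-point) z.
    qmin, qmax = 0, 2 ** bits - 1
    q = torch.clamp(torch.round(x / scale) + offset, qmin, qmax)
    return scale * (q - offset)  # dequantize back to the real domain

def fake_quant_min_max(x, x_min, x_max, bits=8):
    # Parameterization (2): learnable range endpoints; scale and offset
    # are derived from (min, max) rather than learned directly.
    qmin, qmax = 0, 2 ** bits - 1
    scale = (x_max - x_min) / (qmax - qmin)
    offset = torch.round(-x_min / scale)
    q = torch.clamp(torch.round(x / scale) + offset, qmin, qmax)
    return scale * (q - offset)

# During QAT, the non-differentiable round() is typically bypassed with a
# straight-through estimator, e.g. x + (torch.round(x) - x).detach().
if __name__ == "__main__":
    x = torch.randn(4)
    print(fake_quant_scale_offset(x, torch.tensor(0.02), torch.tensor(128.0)))
    print(fake_quant_min_max(x, torch.tensor(-1.0), torch.tensor(1.0)))
```

Because the parameterizations describe the same quantization grid through different learnable variables, gradients flow to the range parameters differently, which is the behavior the paper analyzes across bit widths and learning rates.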

