Summary

  • LDLQ: a more efficient OPTQ.
  • Incoherence Processing: transform weights and activations with random orthogonal matrices, each built as a 2-factor Kronecker product.

Methodology

LDLQ

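LDLQ rounds the weights column by column with linear feedback, $\hat{W} = Q(W + (W - \hat{W})U)$, where $U$ is strictly upper triangular. The optimal $U = \grave{U}$ satisfies the LDL decomposition of the Hessian $H$:

$$H = (\grave{U} + I)\, D\, (\grave{U} + I)^T$$
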
This is equivalent to the OPTQ algorithm but more efficient: it requires no matrix inversion and only one Cholesky decomposition.
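
The LDLQ procedure can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the helper names `upper_ldl` and `ldlq` are hypothetical, the quantizer is assumed to be plain rounding to the nearest integer (no clipping to a bit width), and the weights are assumed to be already scaled to that grid. The factor with a unit upper-triangular shape is obtained from a single Cholesky call on the index-reversed Hessian.

```python
import numpy as np

def upper_ldl(H):
    """Return (U_ldl, d) with H = (U_ldl + I) diag(d) (U_ldl + I)^T.

    U_ldl is strictly upper triangular, d holds the diagonal of D.
    H must be symmetric positive definite.
    """
    # Cholesky of the index-reversed matrix gives a lower factor L.
    # Reversing L back yields an upper-triangular R with H = R R^T.
    L = np.linalg.cholesky(H[::-1, ::-1])
    R = L[::-1, ::-1]
    r = np.diag(R)
    # Scale each column by its diagonal entry to get a unit diagonal,
    # then subtract the identity to keep only the strict upper part.
    U_ldl = R / r[None, :] - np.eye(H.shape[0])
    return U_ldl, r ** 2

def ldlq(W, H, quantize=np.rint):
    """Round W column by column, feeding earlier rounding errors forward."""
    U_ldl, _ = upper_ldl(H)
    W_hat = np.zeros_like(W, dtype=float)
    for k in range(W.shape[1]):
        # Column k only sees the errors of columns 0..k-1, because
        # U_ldl is strictly upper triangular.
        feedback = (W[:, :k] - W_hat[:, :k]) @ U_ldl[:k, k]
        W_hat[:, k] = quantize(W[:, k] + feedback)
    return W_hat

def proxy_loss(W_hat, W, H):
    """Quadratic proxy objective tr((W_hat - W) H (W_hat - W)^T)."""
    delta = W_hat - W
    return np.trace(delta @ H @ delta.T)
```

Calling `ldlq(W, H)` and `np.rint(W)` on the same inputs and comparing `proxy_loss` for both gives a quick feel for the effect of the feedback term. When `H` is the identity, the feedback matrix is zero and the sketch reduces to plain nearest rounding.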

Quantization with Incoherence Processing

  • Quantization is performed in the rotated domain.
  • Therefore, both the Hessian and the weights are multiplied by random orthogonal matrices $U$ and $V$ before quantization.
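
A minimal NumPy sketch of the rotation step, under stated assumptions: the weight matrix `W` is m×n, the Hessian `H` is n×n, and the convention used is `W_rot = U W V^T`, `H_rot = V H V^T` (the transpose placement is an assumption here, as are the helper names). Each orthogonal matrix is built as a Kronecker product of two smaller orthogonal factors taken from a QR decomposition of a Gaussian matrix. For clarity the Kronecker product is formed densely, which a practical implementation would avoid.

```python
import numpy as np

def random_orthogonal(n, rng):
    """Orthogonal n x n matrix from the QR decomposition of a Gaussian matrix."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return Q

def kron_orthogonal(p, q, rng):
    """Orthogonal (p*q) x (p*q) matrix as a 2-factor Kronecker product.

    The Kronecker product of two orthogonal matrices is orthogonal.
    """
    return np.kron(random_orthogonal(p, rng), random_orthogonal(q, rng))

def rotate(W, H, U, V):
    """Move weights and Hessian into the rotated domain."""
    return U @ W @ V.T, V @ H @ V.T

def unrotate(W_rot, U, V):
    """Map (quantized) rotated weights back to the original domain."""
    return U.T @ W_rot @ V
```

Because `U` and `V` are orthogonal, the quadratic form `tr(W H W^T)` has the same value before and after `rotate`, so the proxy objective is unchanged by the transform. A quantizer would be applied to `W_rot` using `H_rot`, and the result passed through `unrotate`.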