Summary

  • Static channel-wise weight pruning that exploits computational invariance.
  • Apply PCA (eigendecomposition) to a calibration dataset to obtain an orthogonal transformation matrix.
  • The transformation matrix is merged into the linear weights.
  • LayerNorm is converted to RMSNorm.
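
The four points above can be illustrated with a small NumPy sketch. It is a toy under stated assumptions, not the method's actual implementation: the symbol Q for the transformation matrix, the helper names (`rms_norm`, `layer_norm`, `pca_rotation`), the toy dimensions, and the use of an uncentered second-moment matrix for the PCA are all choices made here for illustration. It checks two identities: an orthogonal Q can be folded into the weights without changing the output of an RMSNorm followed by a linear layer, and a LayerNorm followed by a linear layer can be rewritten as a mean-subtraction, an RMSNorm, and a linear layer with rescaled weights.

```python
import numpy as np

def rms_norm(x, eps=1e-6):
    # RMSNorm without a learnable scale: divide each row by its RMS.
    return x / np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)

def layer_norm(x, gamma, beta, eps=1e-6):
    # Standard LayerNorm with scale gamma and shift beta.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps) * gamma + beta

def pca_rotation(calib):
    # Eigendecomposition of the (uncentered, an assumption) second-moment
    # matrix of the calibration rows; columns sorted by decreasing eigenvalue.
    eigvals, eigvecs = np.linalg.eigh(calib.T @ calib)
    order = np.argsort(eigvals)[::-1]
    return eigvecs[:, order]

rng = np.random.default_rng(0)
d, d_out, n, k = 8, 5, 256, 6            # hypothetical toy sizes
calib = rng.normal(size=(n, d)) * np.linspace(3.0, 0.1, d)
x = calib[:4]                            # a few rows to test with
W = rng.normal(size=(d, d_out))          # weights of the following linear layer

# 1) Computational invariance: rotating the input by an orthogonal Q and
#    folding Q^T into W leaves RMSNorm + linear unchanged.
Q = pca_rotation(calib)
y_ref = rms_norm(x) @ W
y_rot = rms_norm(x @ Q) @ (Q.T @ W)

# 2) Channel-wise pruning: keep the k leading principal directions.
#    This is an approximation; the merged weight shrinks from d to k rows.
W_sliced = Q[:, :k].T @ W
y_sliced = (rms_norm(x @ Q)[:, :k]) @ W_sliced

# 3) LayerNorm -> RMSNorm: mean subtraction is the linear map M, and
#    gamma/beta are absorbed into the following linear layer.
gamma = rng.normal(size=d)
beta = rng.normal(size=d)
M = np.eye(d) - np.ones((d, d)) / d
ln_ref = layer_norm(x, gamma, beta) @ W
ln_as_rms = rms_norm(x @ M) @ (gamma[:, None] * W) + beta @ W
```

In this sketch the rotation is exact while the slicing step is lossy; the PCA ordering is what makes dropping the trailing channels a reasonable approximation on data resembling the calibration set.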