#technical-report-reading
1 total
- Attention Residuals: Making Residual Connections Attention-Like
A reading of the Kimi Team's Attention Residuals technical report: why residual connections should themselves become attention-like, and how Full AttnRes and Block AttnRes turn that idea into a trainable, deployable system.