Huang et al.** argue that "transformers incur a quadratic attention cost, limiting their ability to model long spatial and temporal sequences..."
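To make the "quadratic" point concrete, here is a rough sketch of how the operation count grows with sequence length. It assumes the textbook cost model for standard softmax attention and for generic kernelized linear attention. The function names and the head dimension of 64 are my own illustrative choices, and this is not a description of how LADY itself is implemented.

```python
def softmax_attention_cost(n: int, d: int) -> int:
    """Approximate multiply-adds for standard attention.

    Q @ K^T costs n*n*d, and (scores) @ V costs another n*n*d.
    """
    return 2 * n * n * d


def linear_attention_cost(n: int, d: int) -> int:
    """Approximate multiply-adds for generic kernelized linear attention.

    K^T @ V costs n*d*d, and Q @ (K^T V) costs another n*d*d.
    Assumption: feature-map and normalization costs are ignored.
    """
    return 2 * n * d * d


if __name__ == "__main__":
    d = 64  # hypothetical head dimension, chosen only for illustration
    for n in (1_000, 2_000, 4_000):
        print(n, softmax_attention_cost(n, d), linear_attention_cost(n, d))
```

Under this model, doubling the sequence length quadruples the cost of standard attention but only doubles the cost of the linear variant, which is why long spatial and temporal sequences become expensive so quickly.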
* See my blog post of 1 Feb 2025.
** Jihao Huang et al., "LADY: Linear Attention for Autonomous Driving Efficiency without Transformers," arXiv:2512.15038v1, 17 Dec 2025.