# Dynamic Sparse Attention for Scalable Transformer Acceleration

**Date:** January 1, 2022
**Authors:** Liu Liu, Zheng Qu, Zhaodong Chen, Fengbin Tu, Yufei Ding, Yuan Xie
**Type:** Journal article
**Publication:** IEEE Transactions on Computers
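For background on the title's topic: "dynamic sparse attention" generally means pruning the attention matrix at runtime based on the input, rather than using a fixed sparsity pattern. The sketch below illustrates one common variant, top-k sparse attention, where each query attends only to its k highest-scoring keys. This is a generic NumPy illustration of the concept, not the specific mechanism from this paper; the function and variable names are my own.

```python
# Illustrative top-k "dynamic sparse attention" sketch (generic concept,
# NOT the method proposed in the paper).
import numpy as np

def topk_sparse_attention(Q, K, V, k):
    """Q, K, V: (seq_len, d) arrays; k: number of keys kept per query."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # (seq_len, seq_len) attention logits
    # Dynamic sparsity: per row, keep only the k largest logits.
    kth = np.sort(scores, axis=-1)[:, -k][:, None]  # k-th largest per row
    masked = np.where(scores >= kth, scores, -np.inf)
    # Softmax over the surviving entries; masked entries get weight 0.
    e = np.exp(masked - masked.max(axis=-1, keepdims=True))
    probs = e / e.sum(axis=-1, keepdims=True)
    return probs @ V

rng = np.random.default_rng(0)
Q = rng.standard_normal((8, 16))
V = rng.standard_normal((8, 16))
out = topk_sparse_attention(Q, Q, V, k=2)
print(out.shape)  # (8, 16)
```

With k much smaller than the sequence length, only k attention weights per query are nonzero, which is what hardware-oriented work in this area exploits to skip computation and memory traffic.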