PF-LLM: Large Language Model Hinted Hardware Prefetching

March 22, 2026
Ceyu Xu, Xiangfeng Sun, Weihang Li, Chen Bai, Bangyan Wang, Mengming Li, Zhiyao Xie, Yuan Xie
Type: Publication
Proceedings of the 31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) 🏆 Best Paper Award
Authors

Ceyu Xu
Research Assistant Professor
Ceyu "Entropy" Xu (徐策羽) obtained his Ph.D. in the Computer Science Department at Duke University. His research focuses on the intersection of computer architecture and machine learning, with two main directions: designing efficient computer architectures for machine learning algorithms and developing machine learning algorithms for efficient computer architectures. His interests include machine learning hardware accelerators, large language models, compression and quantization, and AI for EDA. HKUST Official Profile Page
Xiangfeng Sun
Ph.D. Student
Xiangfeng Sun is a Ph.D. student at the Hong Kong University of Science and Technology specializing in computer architecture and VLSI. He aims for his research to be both meaningful and interesting.
Chen Bai
Postdoctoral Researcher
Chen Bai’s research interests include computer architecture and electronic design automation. He received best paper awards from ICCAD 2021 and ISPD 2024.
Bangyan Wang
Postdoctoral Researcher
Bangyan Wang (王邦彦) is a postdoc in the FACT Lab. His current research interests focus on applying AI and formal methods to hardware verification and generation. Previously, his main focus was computer architecture, including multicore scheduling and instruction-extension design for semi-general-purpose scenarios.
Yuan Xie
Fang Professor of Engineering | Chair Professor | IEEE/ACM/AAAS Fellow