Featured Publications
(2025). AccelStack: A Cost-Driven Analysis of 3D-Stacked LLM Accelerators. 2025 IEEE/ACM International Conference on Computer Aided Design (ICCAD).
(2025). LLM.265: Video Codecs are Secretly Tensor Codecs. Proceedings of the 58th IEEE/ACM International Symposium on Microarchitecture®. 🏆Best Paper Award.
(2025). H2-LLM: Hardware-Dataflow Co-Exploration for Heterogeneous Hybrid-Bonding-based Low-Batch LLM Inference. Proceedings of the 52nd Annual International Symposium on Computer Architecture. 🏆Best Paper Award.
(2025). UniNDP: A Unified Compilation and Simulation Tool for Near DRAM Processing Architectures. 2025 IEEE International Symposium on High Performance Computer Architecture (HPCA). 🏆Best Paper Honorable Mention Award.
Recent Publications

A comprehensive list of publications is maintained on Prof. Xie’s Google Scholar profile.

(2026). PF-LLM: Large Language Model Hinted Hardware Prefetching. Proceedings of the 31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems.
(2025). AccelStack: A Cost-Driven Analysis of 3D-Stacked LLM Accelerators. 2025 IEEE/ACM International Conference on Computer Aided Design (ICCAD).
(2025). LLM.265: Video Codecs are Secretly Tensor Codecs. Proceedings of the 58th IEEE/ACM International Symposium on Microarchitecture®. 🏆Best Paper Award.
(2025). H2-LLM: Hardware-Dataflow Co-Exploration for Heterogeneous Hybrid-Bonding-based Low-Batch LLM Inference. Proceedings of the 52nd Annual International Symposium on Computer Architecture. 🏆Best Paper Award.
(2025). UniNDP: A Unified Compilation and Simulation Tool for Near DRAM Processing Architectures. 2025 IEEE International Symposium on High Performance Computer Architecture (HPCA). 🏆Best Paper Honorable Mention Award.
(2024). Enabling efficient sparse multiplications on GPUs with heuristic adaptability. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
(2024). EVT: Accelerating deep learning training with epilogue visitor tree. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3.
(2024). Klotski v2: Improved DNN Model Orchestration Framework for Dataflow Architecture Accelerators. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
(2024). NoCFuzzer: Automating NoC Verification in UVM. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
(2024). Salus: A Practical Trusted Execution Environment for CPU-FPGA Heterogeneous Cloud Platforms. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 4.
(2023). ArchExplorer: Microarchitecture exploration via bottleneck analysis. Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture.
(2023). Dynamic N:M fine-grained structured sparse attention mechanism. Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming.
(2023). Klotski: DNN model orchestration framework for dataflow architecture accelerators. 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD).
(2023). RM-STC: Row-merge dataflow inspired GPU sparse tensor core for energy-efficient sparse acceleration. Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture.
(2023). Spada: Accelerating sparse matrix multiplication with adaptive dataflow. Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2.
(2023). TT-GNN: Efficient on-chip graph neural network training via embedding reformation and hardware optimization. Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture.
(2022). AI-assisted synthesis in next generation EDA: Promises, challenges, and prospects. 2022 IEEE 40th International Conference on Computer Design (ICCD).
(2022). AutoComm: A framework for enabling efficient communication in distributed quantum programs. 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO).
(2022). Compact Multi-level Sparse Neural Networks with Input Independent Dynamic Rerouting. 2022 IEEE 34th International Conference on Tools with Artificial Intelligence (ICTAI).
(2022). SaARSP: An architecture for systolic-array acceleration of recurrent spiking neural networks. ACM Journal on Emerging Technologies in Computing Systems (JETC).
(2022). Toward robust spiking neural network against adversarial perturbation. Advances in Neural Information Processing Systems.