Lijie (Derrick) Yang

Department of Computer Science, Princeton University

I am a CS PhD student at Princeton University, fortunate to be advised by Prof. Ravi Netravali. I obtained my bachelor's degree in Computer Science from Carnegie Mellon University, where I was advised by Prof. Zhihao Jia and worked closely with Prof. Tianqi Chen.

Research Interests: My work focuses on building efficient deep learning systems through hardware-software co-design. I'm particularly passionate about exploring the potential of state-of-the-art AI models, such as language models, in reasoning and long-context tasks.

news

Aug 13, 2025 Our paper on sparse attention for efficient reasoning, LessIsMore, is on arXiv!
Jul 21, 2025 Graduated from CMU and more than excited to start my PhD at Princeton University :tada:!
May 15, 2025 Honored to receive The Allen Newell Award for Research Excellence, Honorable Mention :tada:!
Jan 20, 2025 TidalDecode is accepted to ICLR 2025, see you in Singapore :tada:!
Nov 14, 2024 Gave a talk at CMU Catalyst Lab on TidalDecode
Oct 08, 2024 Our project on sparse attention for long-context models, TidalDecode, is on arXiv!
Aug 24, 2024 Honored to be an early inductee into Phi Beta Kappa (ΦΒΚ), Class of 2025 :tada:!
Jul 03, 2024 BWE is accepted to LCN 2024 :tada:!
May 01, 2024 RalmSpec is accepted to ICML 2024 :tada:!
Mar 03, 2024 SpecInfer is accepted to ASPLOS 2024 :tada:!

selected publications

  1. arXiv
    Less Is More: Training-Free Sparse Attention with Global Locality for Efficient Reasoning
    Lijie Yang*, Zhihao Zhang*, Arti Jain, Shijie Cao, Baihong Yuan, Yiwei Chen, Zhihao Jia, and Ravi Netravali
    2025
  2. ICLR
    TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention
    Lijie Yang*, Zhihao Zhang*, Zhuofu Chen, Zikun Li, and Zhihao Jia
    In Proceedings of International Conference on Learning Representations, 2025
  3. LCN
    Blocking-Waived Estimation: Improving the Worst-Case End-To-End Delay Analysis in Switched Ethernet
    Lijie Yang, Théo Docquier, Ludovic Thomas, and Ye-Qiong Song
    In Proceedings of Local Computer Networks, 2024
  4. ICML
    Accelerating Retrieval-augmented Language Model Serving with Speculation
    Zhihao Zhang, Alan Zhu, Lijie Yang, Yihua Xu, Lanting Li, Phitchaya Mangpo Phothilimthana, and Zhihao Jia
    In Proceedings of International Conference on Machine Learning, 2024
  5. ASPLOS
    SpecInfer: Accelerating Large Language Model Serving with Tree-based Speculative Inference and Verification
    Xupeng Miao*, Gabriele Oliaro*, Zhihao Zhang*, Xinhao Cheng*, Zeyu Wang, Zhengxin Zhang, Rae Ying Yee Wong, Alan Zhu, Lijie Yang, and 6 more authors
    In Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3, Apr 2024