publications

publications by category in reverse chronological order.

2025

  1. IoT
    From Machine Learning-Based to LLM-Enhanced: An Application-Focused Analysis of How Social IoT Benefits from LLMs
    Lijie Yang and Runbo Su
    IEEE Internet of Things Journal, Apr 2025
  2. ICLR
    TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention
    Lijie Yang*, Zhihao Zhang*, Zhuofu Chen, Zikun Li, and Zhihao Jia
    In Proceedings of the International Conference on Learning Representations, Apr 2025

2024

  1. LCN
    Blocking-Waived Estimation: Improving the Worst-Case End-To-End Delay Analysis in Switched Ethernet
    Lijie Yang, Théo Docquier, Ludovic Thomas, and Ye-Qiong Song
    In Proceedings of Local Computer Networks, Apr 2024
  2. ICML
    Accelerating Retrieval-augmented Language Model Serving with Speculation
    Zhihao Zhang, Alan Zhu, Lijie Yang, Yihua Xu, Lanting Li, Phitchaya Mangpo Phothilimthana, and Zhihao Jia
    In Proceedings of the International Conference on Machine Learning, Apr 2024
  3. ASPLOS
    SpecInfer: Accelerating Large Language Model Serving with Tree-based Speculative Inference and Verification
    Xupeng Miao*, Gabriele Oliaro*, Zhihao Zhang*, Xinhao Cheng*, Zeyu Wang, Zhengxin Zhang, Rae Ying Yee Wong, Alan Zhu, Lijie Yang, and 6 more authors
    In Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3, Apr 2024

2023

  1. HAL
    Technical Report: Worst-case Delay Analysis: a Simulation-based Comparison between Flow Aggregation and CPA
    Lijie Yang
    Jan 2023