Research Interests

🧠
Large Language Models
Instruction tuning, alignment, long-context reasoning, and domain-specific LLMs.
Efficient AI
KV cache compression, speculative decoding, prompt compression, and quantization.
🖼️
Multimodal Models
Vision-language, document intelligence, audio-language, and music AI.
📝
NLP & Structure Parsing
Information extraction, syntactic/semantic parsing, and structured reasoning.

Selected Publications

(# Equal Contribution; * Corresponding Author)

2025 – 2026
SmallKV: Small Model Assisted Compensation of KV Cache Compression for Efficient LLM Inference
Y Zhao, Y Peng, CT Nguyen, Zuchao Li, X Wang, H Zhao, X Fu
NeurIPS 2025
🔬
What Limits Bidirectional Model's Generative Capabilities? A Uni-Bi-Directional Mixture-of-Expert Method For Bidirectional Fine-tuning
Zuchao Li, Y Hei, Q Li, L Zhang, P Wang, B Qi, L Guoming
ICML 2025
📖
Segment First or Comprehend First? Explore the Limit of Unsupervised Word Segmentation with Large Language Models
Zihong Zhang, Liqi He, Zuchao Li, L Zhang, Hai Zhao, Bo Du
ACL 2025 (Oral)
🚀
Scaling LLM Speculative Decoding: Non-Autoregressive Forecasting in Large-Batch Scenarios
L Shi, Zuchao Li, L Zhang, B Qi, G Liu, H Zhao
AAAI 2026
2024
♾️
SirLLM: Streaming Infinite Retentive LLM
Yao Yao, Zuchao Li*, Hai Zhao*
ACL 2024
✂️
Sparse is Enough in Fine-tuning Pre-trained Large Language Models
Weixi Song, Zuchao Li*, Lefei Zhang, Hai Zhao, Bo Du
ICML 2024

News

  • Jan 2026: 3 papers accepted to AAAI 2026: Scaling LLM Speculative Decoding, End-to-end Contrastive Language-Speech Pretraining, and Ghost in the Transformer.
  • Dec 2025: Paper accepted to NeurIPS 2025: SmallKV (small-model-assisted KV cache compression).
  • Nov 2025: 6 papers accepted to EMNLP 2025: ToM, XQuant, Faster In-Context Learning, CoViPAL, and more.
  • Jul 2025: 6 papers accepted to ACL 2025 (1 Oral) and 2 papers accepted to NAACL 2025.
  • May 2025: Paper accepted to ICML 2025: Uni-Bi-Directional Mixture-of-Expert method.
  • Feb 2025: 2 papers accepted to AAAI 2025: Imitate Before Detect and SongSong.
  • Jul 2024: 3 papers accepted to ACL 2024: SirLLM, Hypergraph Document Understanding, and Selective Prefix Tuning.
  • May 2024: Paper accepted to ICML 2024: SIFT (Sparse is Enough in Fine-tuning Pre-trained LLMs).
  • Jan 2024: 2 papers accepted to AAAI 2024: PromptKD and Prompt Compression.
  • Dec 2023: 5 papers accepted to EMNLP 2023.
  • May 2023: 3 papers accepted to ACL 2023.