Research Interests
Large Language Models
Instruction tuning, alignment, long-context reasoning, and domain-specific LLMs.
Efficient AI
KV cache compression, speculative decoding, prompt compression, and quantization.
Multimodal Models
Vision-language, document intelligence, audio-language, and music AI.
NLP & Structure Parsing
Information extraction, syntactic/semantic parsing, and structured reasoning.
Selected Publications
(# Equal Contribution; * Corresponding Author)
2025 – 2026
📖
Segment First or Comprehend First? Explore the Limit of Unsupervised Word Segmentation with Large Language Models
ACL 2025 (Oral)
🚀
Scaling LLM Speculative Decoding: Non-Autoregressive Forecasting in Large-Batch Scenarios
AAAI 2026
2024
News
- Jan 2026: 3 papers accepted to AAAI 2026 (Scaling LLM Speculative Decoding, End-to-end Contrastive Language-Speech Pretraining, and Ghost in the Transformer).
- Dec 2025: Paper accepted to NeurIPS 2025 (SmallKV: small-model-assisted KV cache compression).
- Nov 2025: 6 papers accepted to EMNLP 2025, including ToM, XQuant, Faster In-Context Learning, and CoViPAL.
- Jul 2025: 6 papers accepted to ACL 2025 (1 Oral) and 2 papers accepted to NAACL 2025.
- May 2025: Paper accepted to ICML 2025 (Uni-Bi-Directional Mixture-of-Expert method).
- Feb 2025: 2 papers accepted to AAAI 2025 (Imitate Before Detect and SongSong).
- Jul 2024: 3 papers accepted to ACL 2024 (SirLLM, Hypergraph Document Understanding, and Selective Prefix Tuning).
- May 2024: Paper accepted to ICML 2024 (SIFT: Sparse is Enough in Fine-tuning Pre-trained LLMs).
- Jan 2024: 2 papers accepted to AAAI 2024 (PromptKD and Prompt Compression).
- Dec 2023: 5 papers accepted to EMNLP 2023.
- May 2023: 3 papers accepted to ACL 2023.