Efficient LLMs
KV cache compression, inference acceleration, speculative decoding, context compression, and parameter-efficient fine-tuning for large language models.
KV Cache Compression
Ongoing
Reducing the memory footprint of the KV cache in LLMs through dimension-level reduction, quantization, and layer-aware strategies, enabling efficient long-context inference with minimal quality loss.
Representative works: KV-Latent (ACL 2025), SpindleKV (ACL 2025), XQuant (EMNLP 2025), SmallKV (NeurIPS 2025), CoViPAL: Layer-wise Contextualized Visual Token Pruning (EMNLP Findings 2025)
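As a minimal illustration of the quantization direction (a sketch of per-token absmax int8 quantization, not the KV-Latent, SpindleKV, or XQuant algorithms themselves), the example below shrinks a float32 KV cache slice to one quarter of its size while keeping reconstruction error small:

```python
import numpy as np

def quantize_kv(kv):
    """Per-token absmax int8 quantization of a KV cache slice.

    kv: float array of shape (seq_len, head_dim).
    Returns int8 codes plus a per-token scale for dequantization.
    """
    scale = np.abs(kv).max(axis=-1, keepdims=True) / 127.0
    scale = np.where(scale == 0, 1.0, scale)  # avoid divide-by-zero on all-zero rows
    codes = np.clip(np.round(kv / scale), -127, 127).astype(np.int8)
    return codes, scale

def dequantize_kv(codes, scale):
    return codes.astype(np.float32) * scale

kv = np.random.randn(16, 64).astype(np.float32)
codes, scale = quantize_kv(kv)
recovered = dequantize_kv(codes, scale)
# int8 storage is 4x smaller than float32; per-element error is bounded by scale/2
err = np.abs(recovered - kv).max()
```

Published methods go further (e.g. pairing quantization with dimensional reduction or layer-aware budgets), but the storage/accuracy trade-off has this basic shape.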
Inference Acceleration & Speculative Decoding
Ongoing
Speeding up LLM inference via speculative decoding, attention mapping between models of different scales, and N-gram-trie-based acceleration of in-context learning in large-batch scenarios.
Representative works: IAM (ACL 2025), Faster In-Context Learning via N-Gram Trie Speculative Decoding (EMNLP 2025), Scaling LLM Speculative Decoding (AAAI 2026)
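The draft-then-verify loop at the heart of speculative decoding can be sketched with greedy toy models (the `draft_fn`/`target_fn` interface below is illustrative, not the API of the cited systems): a cheap draft model proposes several tokens, and the target model keeps the longest agreeing prefix, so multiple tokens can be accepted per expensive target step.

```python
def speculative_decode(draft_fn, target_fn, prefix, k=4, steps=3):
    """Greedy draft-then-verify speculative decoding sketch.

    draft_fn / target_fn map a token sequence to the next token (greedy).
    The draft model proposes k tokens; the target model verifies them,
    accepting the longest matching prefix plus one corrected token.
    """
    out = list(prefix)
    for _ in range(steps):
        # 1. draft k tokens cheaply with the small model
        draft = []
        for _ in range(k):
            draft.append(draft_fn(out + draft))
        # 2. verify with the target model; stop at the first mismatch
        for t in draft:
            expected = target_fn(out)
            out.append(expected)  # on mismatch, the target's token is kept
            if expected != t:
                break
    return out

# toy deterministic "models": next token = previous token + 1
next_tok = lambda seq: seq[-1] + 1
out = speculative_decode(next_tok, next_tok, prefix=[0], k=4, steps=3)
```

When draft and target agree (as in the toy run), every drafted token is accepted and the target model is queried once per emitted token but in parallelizable verification batches; real systems exploit exactly that batching.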
Context & Prompt Compression
Ongoing
Compressing long contexts and prompts to reduce computational cost while preserving semantic fidelity, including dynamic attention-aware prompt compression and task-agnostic approaches for streaming LLMs.
Representative works: DAC (ACL 2025), SirLLM (ACL 2024), Compressing Context to Enhance Inference Efficiency (EMNLP 2023), ToM (EMNLP 2025)
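The core operation in importance-based prompt compression can be shown in a few lines (a sketch only; the scoring here is supplied by the caller, whereas methods like DAC derive it dynamically from attention): keep the highest-scoring tokens and preserve their original order.

```python
def compress_prompt(tokens, scores, keep_ratio=0.5):
    """Importance-based prompt compression sketch.

    Keeps the highest-scoring tokens (importance scores, e.g.
    attention-derived, are assumed given) in their original order,
    trading context length for inference cost.
    """
    k = max(1, int(len(tokens) * keep_ratio))
    # indices of the top-k scores, re-sorted to preserve token order
    top_k = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)[:k]
    return [tokens[i] for i in sorted(top_k)]

tokens = ["the", "model", "was", "trained", "on", "long", "contexts"]
scores = [0.1, 0.9, 0.2, 0.8, 0.1, 0.5, 0.7]
compressed = compress_prompt(tokens, scores, keep_ratio=0.5)
```

The research questions sit in how the scores are obtained and how compression interacts with the downstream task, not in the selection step itself.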
Parameter-Efficient Fine-Tuning
Ongoing
Exploring sparse fine-tuning, selective prefix tuning, and bidirectional model fine-tuning with mixture-of-experts methods to adapt large pre-trained models with minimal parameter updates.
Representative works: Sparse is Enough in Fine-tuning Pre-trained Large Language Models (ICML 2024), Selective Prefix Tuning (ACL Findings 2024), Uni-Bi-Directional MoE (ICML 2025)
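To make the "minimal parameter updates" idea concrete, here is a toy sparse update step (loosely inspired by sparse fine-tuning, not the algorithm of the ICML 2024 paper): only the small fraction of parameters with the largest gradient magnitude is updated, and the rest stay frozen.

```python
import numpy as np

def sparse_update(params, grads, lr=0.1, density=0.05):
    """Sparse fine-tuning step sketch: update only the `density`
    fraction of parameters with the largest gradient magnitude."""
    flat = np.abs(grads).ravel()
    k = max(1, int(flat.size * density))
    threshold = np.partition(flat, -k)[-k]   # k-th largest |grad|
    mask = np.abs(grads) >= threshold        # parameters allowed to move
    return params - lr * grads * mask, mask

np.random.seed(0)
params = np.zeros((10, 10))
grads = np.random.randn(10, 10)
new_params, mask = sparse_update(params, grads, density=0.05)
```

Only the masked entries need optimizer state and storage, which is where the memory savings of parameter-efficient methods come from.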
Language Understanding and Reasoning
LLM reasoning, computational linguistics, and a broad range of NLP tasks including information extraction, machine translation, and sentiment analysis.
LLM Reasoning
Ongoing
Enhancing LLM reasoning capabilities through structured thought processes such as graph-of-thought, tree-oriented MapReduce, knowledge transfer, and reference-trustable decoding for more reliable inference.
Representative works: GoT: Effective Graph-of-Thought Reasoning (NAACL Findings 2024), Reference Trustable Decoding (NeurIPS 2024), GKT (ACL Findings 2024), ToM (EMNLP 2025)
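Structured reasoning methods share a common skeleton that a short sketch can convey (a generic beam-style tree search, not the published graph-of-thought or tree-oriented MapReduce procedures): expand each partial "thought" into candidates, score them, and keep only the most promising ones at each level.

```python
def thought_search(expand, score, root, beam=2, depth=3):
    """Level-by-level thought search sketch.

    expand: thought -> list of child thoughts
    score:  thought -> number (higher is better)
    Keeps the best `beam` candidates per level, then returns the
    highest-scoring thought at the final depth.
    """
    frontier = [root]
    for _ in range(depth):
        children = [c for t in frontier for c in expand(t)]
        frontier = sorted(children, key=score, reverse=True)[:beam]
    return max(frontier, key=score)

# toy problem: thoughts are bit lists, score is their sum
expand = lambda t: [t + [0], t + [1]]
best = thought_search(expand, score=sum, root=[], beam=2, depth=3)
```

Real systems replace `expand` and `score` with LLM calls, and graph-of-thought additionally allows merging branches rather than keeping a strict tree.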
Computational Linguistics
Completed
Developing unified and syntax-aware frameworks for semantic role labeling and dependency parsing, as well as neural machine translation with explicit sentence compression and unsupervised approaches guided by universal grammar.
Representative works: A Unified Syntax-aware Framework for SRL (EMNLP 2018), Syntax-Guided SRL (ACL 2022), Global Greedy Dependency Parsing (AAAI 2020), Explicit Sentence Compression for NMT (AAAI 2020)
NLP Tasks
Ongoing
Advancing a broad range of NLP tasks including universal information extraction, named entity recognition, aspect-based sentiment analysis, and retrieval-augmented question answering.
Representative works: Label Drop for Universal Information Extraction (NAACL 2025), Named Entity Recognition as Corpus Aware Holistic Structure Parsing (COLING 2022), A Novel Energy Based Model for Multi-Modal ABSA (AAAI 2024), Unsupervised NMT with Universal Grammar (EMNLP 2021)
Multimodal AI
Multimodal understanding and generation, vision-language models, speech recognition, music AI, and document intelligence.
Vision-Language Models
Ongoing
Building and improving large vision-language models for multimodal understanding and generation, including visual token pruning, cross-modal adaptation, and multi-modal auto-regressive modeling.
Representative works: Multi-modal Auto-regressive Modeling via Visual Tokens (ACM Multimedia 2024), AMIA: Automatic Masking and Joint Intention Analysis Makes LVLMs Robust Jailbreak Defenders (EMNLP Findings 2025)
Speech & Audio Understanding
Ongoing
Advancing automatic speech recognition with multimodal signals (vision hotwords) and joint structure learning, as well as contrastive language-speech pretraining for long-form spoken question answering.
Representative works: VHASR: A Multimodal Speech Recognition System With Vision Hotwords (EMNLP 2024), Joint Automatic Speech Recognition And Structure Learning (ICASSP 2025)
Music AI
Ongoing
Applying AI to music understanding, generation, and evaluation — including symbolic music understanding, music notation comprehension for LLMs, large-scale music evaluation benchmarks, and lyric generation.
Representative works: NOTA: Multimodal Music Notation Understanding (NAACL Findings 2025), N-gram Unsupervised Compoundation for Symbolic Music Understanding (AAAI 2024), The Music Maestro Benchmark (ACL Findings 2024), SongSong (AAAI 2025)
Document Intelligence
Ongoing
Understanding complex document structures using hypergraph-based representations and dependency-aware attention for semantic entity recognition, information extraction, and document-level understanding.
Representative works: Hypergraph based Understanding for Document Semantic Entity Recognition (ACL 2024), Multi-Modal Latent Space Learning for Chain-of-Thought Reasoning (AAAI 2024)