Efficient LLMs

KV cache compression, inference acceleration, speculative decoding, context compression, and parameter-efficient fine-tuning for large language models.

KV Cache Compression
Ongoing
Reducing the memory footprint of the KV cache in LLMs through dimension-level reduction, quantization, and layer-aware strategies, enabling efficient long-context inference with minimal quality loss.
Representative works: KV-Latent (ACL 2025), SpindleKV (ACL 2025), XQuant (EMNLP 2025), SmallKV (NeurIPS 2025), CoViPAL: Layer-wise Contextualized Visual Token Pruning (EMNLP Findings 2025)
KV Cache · Quantization · Long-Context
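The quantization strand of this direction can be pictured with a minimal sketch: per-channel symmetric int8 quantization of a KV cache slice. This is a generic illustration, not the method of any cited paper; the function names, shapes, and the per-channel scaling choice are illustrative assumptions.

```python
import numpy as np

def quantize_kv(kv: np.ndarray):
    """Per-channel symmetric int8 quantization of a KV cache slice.

    kv: float32 array of shape (seq_len, head_dim).
    Returns int8 codes plus one scale per channel for dequantization.
    """
    # Choose each channel's scale so its max magnitude maps to +/-127.
    scales = np.abs(kv).max(axis=0) / 127.0
    scales = np.where(scales == 0, 1.0, scales)  # guard all-zero channels
    codes = np.clip(np.round(kv / scales), -127, 127).astype(np.int8)
    return codes, scales

def dequantize_kv(codes: np.ndarray, scales: np.ndarray) -> np.ndarray:
    return codes.astype(np.float32) * scales

rng = np.random.default_rng(0)
kv = rng.standard_normal((128, 64)).astype(np.float32)
codes, scales = quantize_kv(kv)
# int8 storage is 4x smaller than float32; the round-trip error stays
# within half a quantization step per channel.
error = np.abs(dequantize_kv(codes, scales) - kv).max()
```

Real systems add finer-grained groups, asymmetric zero points, or mixed precision per layer; the trade-off between memory saved and reconstruction error is the same.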
Inference Acceleration & Speculative Decoding
Ongoing
Speeding up LLM inference via speculative decoding, attention mapping between models of different scales, and N-gram-trie-based acceleration of in-context learning in large-batch scenarios.
Representative works: IAM (ACL 2025), Faster In-Context Learning via N-Gram Trie Speculative Decoding (EMNLP 2025), Scaling LLM Speculative Decoding (AAAI 2026)
Speculative Decoding · Attention · Inference Speed
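The draft-then-verify idea behind speculative decoding can be sketched with toy stand-in models. Everything here is a hypothetical interface (the `target_batch` and `draft_next` callables and the integer "vocabulary" are illustrative, not any cited system): a cheap draft proposes k tokens sequentially, and the expensive target checks all of them in one parallel pass.

```python
def speculative_decode(target_batch, draft_next, prompt, n_tokens, k=4):
    """Greedy speculative decoding over toy stand-in 'models'.

    draft_next(seq)      -> next token from a cheap draft model.
    target_batch(seq, d) -> the target model's greedy next token at each
                            of the k+1 prefixes seq, seq+d[:1], ..., seq+d,
                            in ONE call -- the parallel verification pass
                            that yields the speed-up.
    """
    out = list(prompt)
    target_calls = 0
    while len(out) - len(prompt) < n_tokens:
        draft = []
        for _ in range(k):                    # cheap sequential drafting
            draft.append(draft_next(out + draft))
        preds = target_batch(out, draft)      # one target "forward pass"
        target_calls += 1
        n_ok = 0
        while n_ok < k and draft[n_ok] == preds[n_ok]:
            n_ok += 1                         # longest agreeing prefix
        out.extend(draft[:n_ok])
        out.append(preds[n_ok])               # target's own next token
    return out[len(prompt):len(prompt) + n_tokens], target_calls

# Toy models: the "target" counts 0..9 cyclically; the "draft" agrees
# except right after a 7, so most drafted tokens get accepted.
def target_next(seq):
    return (seq[-1] + 1) % 10

def target_batch(seq, draft):
    cur, preds = list(seq), [target_next(seq)]
    for tok in draft:
        cur.append(tok)
        preds.append(target_next(cur))
    return preds

def draft_next(seq):
    return 0 if seq[-1] == 7 else (seq[-1] + 1) % 10

tokens, calls = speculative_decode(target_batch, draft_next, [0, 1, 2], 12)
# 12 tokens are produced with only 4 target calls instead of 12.
```

The output is identical to plain greedy decoding with the target alone; the saving comes entirely from batching verification, which is why draft quality governs the achievable speed-up.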
Context & Prompt Compression
Ongoing
Compressing long contexts and prompts to reduce computational cost while preserving semantic fidelity, including dynamic attention-aware prompt compression and task-agnostic approaches for streaming LLMs.
Representative works: DAC (ACL 2025), SirLLM (ACL 2024), Compressing Context to Enhance Inference Efficiency (EMNLP 2023), ToM (EMNLP 2025)
Prompt Compression · Long Context · Streaming
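A minimal sketch of importance-based token dropping, the common core of many prompt-compression methods. The scoring rule here (unigram self-information estimated from the prompt itself) is a crude stand-in for the model-based scores the cited works use, and `compress_prompt` with its `protected` set is an illustrative interface, not any paper's API.

```python
import math
from collections import Counter

def compress_prompt(tokens, keep_ratio=0.5, protected=frozenset()):
    """Drop the least informative tokens from a prompt.

    Importance is unigram self-information, -log p(token), estimated
    from the prompt itself. Surviving tokens keep their original order.
    """
    counts = Counter(tokens)
    total = len(tokens)

    def score(tok):
        if tok in protected:
            return math.inf          # never drop task-critical tokens
        return -math.log(counts[tok] / total)

    n_keep = max(1, int(round(len(tokens) * keep_ratio)))
    ranked = sorted(range(len(tokens)),
                    key=lambda i: score(tokens[i]), reverse=True)
    kept = sorted(ranked[:n_keep])   # restore original order
    return [tokens[i] for i in kept]

prompt = ("the the the answer to the question about the the capital "
          "of France is Paris").split()
# Repeated function words carry little information and are dropped first.
short = compress_prompt(prompt, keep_ratio=0.6)
```

Production methods replace the unigram score with attention weights or a small LM's perplexity and compress at phrase or sentence granularity, but the select-by-importance skeleton is the same.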
Parameter-Efficient Fine-Tuning
Ongoing
Exploring sparse fine-tuning, selective prefix tuning, and uni-/bi-directional model fine-tuning with mixture-of-experts methods to adapt large pre-trained models with minimal parameter updates.
Representative works: Sparse is Enough in Fine-tuning Pre-trained Large Language Models (ICML 2024), Selective Prefix Tuning (ACL Findings 2024), Uni-Bi-Directional MoE (ICML 2025)
PEFT · Sparse Fine-tuning · MoE
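The sparse fine-tuning idea can be illustrated on a toy linear model (an illustrative stand-in, not the cited papers' setup): rank weights by gradient magnitude on the new data, freeze everything else, and train only that small subset.

```python
import numpy as np

def sparse_finetune(W, x, y, sparsity=0.05, lr=0.2, steps=150):
    """Fine-tune only the weights with the largest initial gradients.

    A toy linear model y = W @ x is adapted by (1) ranking entries of W
    by gradient magnitude on the new data, (2) freezing all but the top
    `sparsity` fraction, and (3) running SGD through that fixed mask.
    """
    grad = (W @ x - y) @ x.T / x.shape[1]        # initial gradient
    k = max(1, int(sparsity * W.size))
    thresh = np.sort(np.abs(grad).ravel())[-k]
    mask = np.abs(grad) >= thresh                # trainable entries only
    W = W.copy()
    for _ in range(steps):
        grad = (W @ x - y) @ x.T / x.shape[1]
        W -= lr * grad * mask                    # frozen entries untouched
    return W, mask

rng = np.random.default_rng(0)
W0 = rng.standard_normal((8, 8))
x = rng.standard_normal((8, 32))
# "New task": the same model with two perturbed weights.
W_task = W0.copy()
W_task[0, 0] += 3.0
W_task[3, 5] -= 3.0
y = W_task @ x
# ~5% of the 64 weights are updated; the rest stay exactly frozen.
W1, mask = sparse_finetune(W0, x, y, sparsity=0.05)
```

The gradient-magnitude criterion tends to select exactly the weights the new task actually shifted, which is why a tiny trainable subset can recover most of full fine-tuning's gain.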

Language Understanding and Reasoning

LLM reasoning, computational linguistics, and a broad range of NLP tasks including information extraction, machine translation, and sentiment analysis.

LLM Reasoning
Ongoing
Enhancing LLM reasoning capabilities through structured thought processes such as graph-of-thought, tree-oriented MapReduce, knowledge transfer, and reference-trustable decoding for more reliable inference.
Representative works: GoT: Effective Graph-of-Thought Reasoning (NAACL Findings 2024), Reference Trustable Decoding (NeurIPS 2024), GKT (ACL Findings 2024), ToM (EMNLP 2025)
Reasoning · RAG · Knowledge Transfer
Computational Linguistics
Completed
Developing unified and syntax-aware frameworks for semantic role labeling and dependency parsing, as well as neural machine translation with explicit sentence compression and unsupervised approaches guided by universal grammar.
Representative works: A Unified Syntax-aware Framework for SRL (EMNLP 2018), Syntax-Guided SRL (ACL 2022), Global Greedy Dependency Parsing (AAAI 2020), Explicit Sentence Compression for NMT (AAAI 2020)
SRL · Dependency Parsing · Machine Translation
NLP Tasks
Ongoing
Advancing a broad range of NLP tasks including universal information extraction, named entity recognition, aspect-based sentiment analysis, and retrieval-augmented question answering.
Representative works: Label Drop for Universal Information Extraction (NAACL 2025), Named Entity Recognition as Corpus Aware Holistic Structure Parsing (COLING 2022), A Novel Energy Based Model for Multi-Modal ABSA (AAAI 2024), Unsupervised NMT with Universal Grammar (EMNLP 2021)
Information Extraction · NER · Sentiment Analysis

Multimodal AI

Multimodal understanding and generation, vision-language models, speech recognition, music AI, and document intelligence.

Vision-Language Models
Ongoing
Building and improving large vision-language models for multimodal understanding and generation, including visual token pruning, cross-modal adaptation, and multi-modal auto-regressive modeling.
Representative works: Multi-modal Auto-regressive Modeling via Visual Tokens (ACM Multimedia 2024), AMIA: Automatic Masking and Joint Intention Analysis Makes LVLMs Robust Jailbreak Defenders (EMNLP Findings 2025)
VLM · Visual Tokens · Multimodal Generation
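Visual token pruning can be sketched generically: score every image patch token by its attention from a text query and keep only the top few. This is an illustrative relevance-based sketch, not the procedure of the cited works; the function name and the pooled-query scoring are assumptions.

```python
import numpy as np

def prune_visual_tokens(visual_tokens, text_query, keep=8):
    """Keep the visual tokens most attended to by the text query.

    visual_tokens: (n_tokens, d) patch embeddings.
    text_query:    (d,) pooled text embedding used as the attention query.
    Tokens are scored by scaled dot-product attention from the query;
    the top `keep` survive, in their original spatial order.
    """
    d = visual_tokens.shape[1]
    scores = visual_tokens @ text_query / np.sqrt(d)
    scores = np.exp(scores - scores.max())
    scores /= scores.sum()                          # attention weights
    keep_idx = np.sort(np.argsort(scores)[-keep:])  # top-k, original order
    return visual_tokens[keep_idx], keep_idx

rng = np.random.default_rng(1)
tokens = rng.standard_normal((64, 16))
query = rng.standard_normal(16)
tokens[5] += 3 * query     # make two patches strongly query-relevant
tokens[20] += 3 * query
# 64 visual tokens shrink to 8, an 8x reduction in sequence length.
pruned, idx = prune_visual_tokens(tokens, query, keep=8)
```

Since attention and FFN cost scale with sequence length, dropping low-relevance visual tokens cuts LVLM inference cost roughly in proportion to the pruning ratio.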
Speech & Audio Understanding
Ongoing
Advancing automatic speech recognition with multimodal signals (vision hotwords) and joint structure learning, as well as contrastive language-speech pretraining for long-form spoken question answering.
Representative works: VHASR: A Multimodal Speech Recognition System With Vision Hotwords (EMNLP 2024), Joint Automatic Speech Recognition And Structure Learning (ICASSP 2025)
ASR · Speech · Audio-Language
Music AI
Ongoing
Applying AI to music understanding, generation, and evaluation, including symbolic music understanding, music notation comprehension for LLMs, large-scale music evaluation benchmarks, and lyric generation.
Representative works: NOTA: Multimodal Music Notation Understanding (NAACL Findings 2025), N-gram Unsupervised Compoundation for Symbolic Music Understanding (AAAI 2024), The Music Maestro Benchmark (ACL Findings 2024), SongSong (AAAI 2025)
Music Generation · Symbolic Music · Music LLM
Document Intelligence
Ongoing
Understanding complex document structures using hypergraph-based representations and dependency-aware attention for semantic entity recognition, information extraction, and document-level understanding.
Representative works: Hypergraph based Understanding for Document Semantic Entity Recognition (ACL 2024), Multi-Modal Latent Space Learning for Chain-of-Thought Reasoning (AAAI 2024)
Document Understanding · Hypergraph · Entity Recognition