All Projects
-
On-device Large Models
Deploying foundation models on edge devices via quantization, distillation, structured sparsity, and system co-design for memory- and latency-efficient inference.
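Of the techniques listed, quantization is the most self-contained to illustrate. The sketch below shows symmetric per-tensor int8 weight quantization in its simplest form; it is a generic illustration, not this project's actual method, and the function names are made up for the example.

```python
import numpy as np

def quantize_int8(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric per-tensor int8 quantization: w ~= scale * q."""
    scale = float(np.abs(w).max()) / 127.0  # map the largest magnitude to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float tensor from int8 codes."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.27, 0.01, 1.0], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)  # reconstruction error is bounded by scale / 2
```

Storing `q` (1 byte/weight) plus one scale instead of float32 weights gives a 4x memory reduction, which is the basic budget argument behind on-device deployment; real systems refine this with per-channel scales, lower bit-widths, and calibration.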
-
Model Training Optimization
General-purpose optimization techniques for training deep models and LLMs.
-
Model Inference Optimization
Efficient LLM inference via KV cache reduction, prompt compression, streaming retention, and token-efficient decoding.
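"Streaming retention" here can be pictured as an eviction policy over the KV cache. A common pattern (as in sink-plus-sliding-window schemes) keeps the first few "attention sink" positions plus a recent window and evicts the middle. The sketch below is a minimal illustration of that policy only, with hypothetical parameter names; it is not necessarily the scheme this project uses.

```python
def retained_indices(seq_len: int, n_sink: int = 4, window: int = 8) -> list[int]:
    """Positions whose KV entries survive sink + sliding-window retention.

    Keeps the first n_sink positions (attention sinks) and the most recent
    `window` positions; everything in between is evicted, so cache size is
    bounded by n_sink + window regardless of sequence length.
    """
    if seq_len <= n_sink + window:
        return list(range(seq_len))  # nothing to evict yet
    return list(range(n_sink)) + list(range(seq_len - window, seq_len))

# At step 20 the cache holds only 12 entries instead of 20.
kept = retained_indices(20)
```

The point of the bound `n_sink + window` is that decoding cost and memory stay constant as generation length grows, which is what makes long streaming sessions feasible.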
-
Document Recognition and Intelligence
Layout analysis, structure understanding, and intelligent document systems across scanned and born-digital documents.
-
Language Structure Parsing in the LLM Era
Syntactic and semantic structure induction with and for LLMs: in-context parsing, alignment, and robust structured reasoning.