On the Shoulders of LLMs: From LLM Optimization to LLM Agents
ICPRAI 2024 Tutorial
Website Slides

Summary

The tutorial will start with the basic architecture of LLMs and move on to advanced optimization techniques. Participants will learn how to enhance the performance of LLMs during the inference stage by optimizing the KV cache for quicker response times, extended memory capabilities, and improved answer quality. Furthermore, the tutorial will cover diverse LLM reasoning methods, including Chain-of-Thought (CoT) reasoning, to enhance the interpretability, controllability, and flexibility of LLMs.
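To make the KV-cache idea concrete: during autoregressive decoding, the keys and values of already-processed tokens are stored and reused, so each step only projects the newest token instead of recomputing attention inputs for the whole prefix. Below is a minimal single-head NumPy sketch of this mechanism; the class and variable names are illustrative assumptions, not taken from the tutorial materials.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class KVCache:
    """Minimal single-head KV cache: keys/values of past tokens are
    stored, so each decoding step appends one row instead of
    recomputing projections for the entire prefix."""

    def __init__(self, d_model: int):
        self.keys = np.empty((0, d_model))
        self.values = np.empty((0, d_model))

    def step(self, q, k, v):
        # Append this step's key/value, then attend over the full cache.
        self.keys = np.vstack([self.keys, k])
        self.values = np.vstack([self.values, v])
        scores = q @ self.keys.T / np.sqrt(q.shape[-1])
        return softmax(scores) @ self.values
```

Feeding tokens one at a time through `step` produces, at step t, the same output as full causal attention over tokens 0..t, while the per-step cost stays linear in the prefix length rather than quadratic.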
With a robust understanding of LLM enhancements, we turn to the emerging field of language agents powered by LLMs. The emergence of LLMs has significantly accelerated the evolution of AI agents, pushing the field closer to the long-standing goal of building intelligent, autonomous agents that can learn and act in diverse environments. This session will guide attendees through the concept of agents, how LLMs empower them, and the challenges these agents may face in the future, aiming to equip participants with the knowledge to implement and manage LLMs in a responsible and efficient manner.

Resources

  • Igniting Language Intelligence: The Hitchhiker’s Guide From Chain-of-Thought Reasoning to Language Agents [Paper]
  • R-Judge: Benchmarking Safety Risk Awareness for LLM Agents [Paper]
  • Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science [Paper]
  • You Only Look at Screens: Multimodal Chain-of-Action Agents [Paper]
  • Multimodal Chain-of-Thought Reasoning in Language Models [Paper]
  • Automatic Chain of Thought Prompting in Large Language Models [Paper]
  • Identifying the Risks of LM Agents with an LM-Emulated Sandbox [Paper]
  • A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents [Paper]
  • Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations [Paper]
  • ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors [Paper]
  • Efficient Memory Management for Large Language Model Serving with PagedAttention [Paper]
  • GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints [Paper]
  • Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Language Models [Paper]
  • SirLLM: Streaming Infinite Retentive LLM [Paper]

Tutorial Organizers


Mr. Zuchao Li

Associate Researcher

Wuhan University


Ms. Yao Yao

Ph.D. Candidate

Shanghai Jiao Tong University


Mr. Zhuosheng Zhang

Associate Professor

Shanghai Jiao Tong University


Tutorial Contributors


Mr. Teng Xiao

Ph.D. Candidate

Wuhan University


Mr. Luohe Shi

Undergraduate Student

Wuhan University


Mr. Tongxin Yuan

Graduate Student

Shanghai Jiao Tong University