Li Xiangtian - Research in AI & Machine Learning

About Me

I am a researcher at Shanghai Artificial Intelligence Laboratory (Shanghai AI Lab), working on AI safety, large language models, and multi-agent systems. My current research interests include:

  • LLM Safety & Alignment: Developing robust safety frameworks for large language models, including jailbreak defense, safety-oriented RL post-training, and multi-agent safeguarding
  • Reinforcement Learning: Exploring reinforcement learning techniques for LLM post-training, self-evolving agents, and multi-agent coordination
  • Multi-agent Systems: Investigating safety, coordination, and collaboration mechanisms in multi-agent environments
  • Large Language Model Post-training: Developing efficient methods for fine-tuning and aligning large language models
  • Multimodal AI: Working on integrating vision and language in multimodal large language models (MLLMs)

Research Projects

My research spans the following areas:

Multi-round Reinforcement Learning and Self-Evolving Agents

Exploring multi-round reinforcement learning and self-evolving agent architectures for building adaptive AI systems. This ongoing research focuses on agents that learn across multiple rounds of interaction, continuously refining their strategies and capabilities.

Reinforcement Learning Framework Integration

Integrating and improving reinforcement learning frameworks, including veRL, OpenRLHF, and slime, with algorithmic enhancements aimed at better training efficiency and performance.

Automated Evaluation Framework

Improving automated evaluation frameworks based on lmms-eval for comprehensive assessment of language models and multimodal models.

Multi-agent Research Assistant

Developing intelligent multi-agent systems for research assistance, enabling collaborative problem-solving among multiple AI agents.

Multimodal Large Language Models

Working on Chain-of-Thought (CoT) reasoning and Monte Carlo Tree Search (MCTS) methods to improve reasoning capabilities in multimodal settings.

Research Tools & Infrastructure

Building MCP (Model Context Protocol) servers for research workflows, including arXiv integration and other research tools.

Text-to-Scene Generation

Previous work on generating 3D scenes from natural language descriptions, bridging the gap between language understanding and 3D scene representation.

Publications & Talks

Please see the Publications and Talks pages for details about my research contributions.

Contact