Posts by Collection

portfolio

AI-Powered Operator Network Operations Automation

Published: September 01, 2021

Developed AI-based automation systems for operator network operations, improving efficiency and reducing manual intervention in network management.

AI-Based Base Station Error Handling and Diagnosis

Published: March 01, 2022

Developed intelligent systems for automated base station error detection, diagnosis, and resolution, reducing downtime and maintenance costs.

Road Detection and Autonomous Driving

Published: June 01, 2023

Research and development on road detection and autonomous driving systems, focusing on computer vision and perception for autonomous vehicles.

Robot Reinforcement Learning

Published: September 01, 2023

Research on reinforcement learning for robotic control and manipulation, developing intelligent agents for robotic tasks.

Text-to-Scene Generation

Published: March 01, 2024

Generating 3D scenes from natural language descriptions, bridging the gap between language understanding and 3D scene representation.

MLLM-MCTS-CoT: Monte Carlo Tree Search for Multimodal Reasoning

Published: November 26, 2024

Combining Monte Carlo Tree Search (MCTS) with Chain-of-Thought (CoT) to strengthen reasoning in multimodal large language models.

Multimodal Large Language Model Chain-of-Thought

Published: December 04, 2024

Research on Chain-of-Thought (CoT) reasoning for multimodal large language models, improving reasoning capabilities in vision-language tasks.

Automated Evaluation Framework Improvements (lmms-eval)

Published: January 01, 2025

Improving automated evaluation frameworks based on lmms-eval for comprehensive assessment of multimodal and language models.

openRLHF and lightRLHF Framework Improvements

Published: January 01, 2025

Algorithm improvements and enhancements for the openRLHF reinforcement learning framework, including lightRLHF, a lightweight version built on those improvements.

veRL Framework Improvements

Published: March 01, 2025

Framework integration and improvements for veRL (Volcano Engine Reinforcement Learning), a flexible, efficient and production-ready RL training library for large language models.

Research MCP Servers

Published: August 01, 2025

A collection of Model Context Protocol (MCP) servers for research workflows, including arXiv integration and other research tools.

Multi-Agent Research Assistant

Published: August 14, 2025

An intelligent multi-agent system for research assistance, enabling collaborative problem-solving among multiple AI agents.

Multi-round Reinforcement Learning and Self-Evolving Agents

Published: November 01, 2025

Exploring multi-round reinforcement learning and self-evolving agent architectures for developing adaptive and continuously improving AI systems.

Agent Harness: Long-Term Memory for Adaptive AI Agents

Published: March 15, 2026

A memory-augmented agent harness framework that enables AI agents to continuously learn and adapt through long-term episodic and semantic memory mechanisms.

Self-Evolving Multi-Agent Systems with Hierarchical Memory

Published: May 01, 2026

A self-evolving multi-agent framework that enables agents to autonomously improve strategies through multi-round interactions and hierarchical memory structures.

publications

Collaborative Multi-Agent Reinforcement Learning for Complex Task Solving

Under review, 2025

We propose a collaborative multi-agent reinforcement learning framework that enables agents to efficiently coordinate and solve complex tasks through emergent communication and adaptive role assignment.

Harnessing Long-Term Memory for Adaptive AI Agents

Under review, 2025

We present a memory-augmented agent architecture that harnesses long-term episodic and semantic memory to enable adaptive behavior and continual learning in dynamic environments.

Self-Evolving Multi-Agent Systems with Hierarchical Memory

Under review, 2025

We propose a self-evolving multi-agent framework with hierarchical memory structures that enables agents to continuously improve their strategies through multi-round interactions and experience replay.

SafeWork-R1: Coevolving Safety and Intelligence under the AI-45° Law

Published in arXiv, 2025

We introduce SafeWork-R1, a cutting-edge multimodal reasoning model that demonstrates the coevolution of capabilities and safety. It is developed by our proposed SafeLadder framework, which incorporates large-scale, progressive, safety-oriented reinforcement learning post-training, supported by a suite of multi-principled verifiers.

Recommended citation: Shanghai AI Lab et al. (2025). "SafeWork-R1: Coevolving Safety and Intelligence under the AI-45° Law." arXiv preprint arXiv:2507.18576.

Native Reasoning Models: Training Language Models to Reason on Unverifiable Data

Published in ICLR 2026 Poster, 2026

We propose a novel approach for training language models to reason on unverifiable data, enabling native reasoning capabilities without requiring ground-truth supervision.

Recommended citation: Wang, Y., Liu, Z., Li, X., Lu, C., & Yang, C. (2026). "Native Reasoning Models: Training Language Models to Reason on Unverifiable Data." ICLR 2026 Poster.

TrinityGuard: A Unified Framework for Safeguarding Multi-Agent Systems

Published in arXiv, 2026

We present TrinityGuard, a unified safety framework for multi-agent systems that ensures robust and trustworthy coordination among AI agents through multi-layered safeguarding mechanisms.

Reflector: Internalizing Step-wise Reflection against Indirect Jailbreaks

Published in ICML 2026, 2026

We propose Reflector, a framework that internalizes step-wise reflection mechanisms to defend against indirect jailbreak attacks on large language models.

Recommended citation: Ma, J., Zhang, J., Li, X., Zou, B., Lu, C., & Yang, C. (2026). "REFLECTOR: Internalizing Step-wise Reflection against Indirect Jailbreak." ICML 2026.

Li Xiangtian
