Publications

You can also find my articles on my Google Scholar profile.

Journal Articles

TrinityGuard: A Unified Framework for Safeguarding Multi-Agent Systems

Published in arXiv, 2026

We present TrinityGuard, a unified safety framework for multi-agent systems that ensures robust and trustworthy coordination among AI agents through multi-layered safeguarding mechanisms.

Download Paper

SafeWork-R1: Coevolving Safety and Intelligence under the AI-45° Law

Published in arXiv, 2025

We introduce SafeWork-R1, a cutting-edge multimodal reasoning model that demonstrates the coevolution of capabilities and safety. It is developed by our proposed SafeLadder framework, which incorporates large-scale, progressive, safety-oriented reinforcement learning post-training, supported by a suite of multi-principled verifiers.

Recommended citation: Shanghai AI Lab et al. (2025). "SafeWork-R1: Coevolving Safety and Intelligence under the AI-45° Law." arXiv preprint arXiv:2507.18576.
Download Paper

Self-Evolving Multi-Agent Systems with Hierarchical Memory

Under review, 2025

We propose a self-evolving multi-agent framework with hierarchical memory structures that enables agents to continuously improve their strategies through multi-round interactions and experience replay.

Download Paper

Harnessing Long-Term Memory for Adaptive AI Agents

Under review, 2025

We present a memory-augmented agent architecture that harnesses long-term episodic and semantic memory to enable adaptive behavior and continual learning in dynamic environments.

Download Paper

Collaborative Multi-Agent Reinforcement Learning for Complex Task Solving

Under review, 2025

We propose a collaborative multi-agent reinforcement learning framework that enables agents to efficiently coordinate and solve complex tasks through emergent communication and adaptive role assignment.

Download Paper

Conference Papers

Reflector: Internalizing Step-wise Reflection against Indirect Jailbreaks

Published in ICML, 2026

We propose Reflector, a framework that internalizes step-wise reflection mechanisms to defend against indirect jailbreak attacks on large language models.

Recommended citation: Ma, J., Zhang, J., Li, X., Zou, B., Lu, C., & Yang, C. (2026). "REFLECTOR: Internalizing Step-wise Reflection against Indirect Jailbreak." ICML 2026.
Download Paper

Native Reasoning Models: Training Language Models to Reason on Unverifiable Data

Published in ICLR (poster), 2026

We propose a novel approach for training language models to reason on unverifiable data, enabling native reasoning capabilities without requiring ground-truth supervision.

Recommended citation: Wang, Y., Liu, Z., Li, X., Lu, C., & Yang, C. (2026). "Native Reasoning Models: Training Language Models to Reason on Unverifiable Data." ICLR 2026 Poster.
Download Paper

Li Xiangtian