Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

Posts

Future Blog Post

less than 1 minute read

Published: January 01, 2199

This post will show up by default. To stop future-dated posts from being published, edit _config.yml and set future: false.

Blog Post number 4

less than 1 minute read

Published: August 14, 2015

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

Published: August 14, 2014

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

Published: August 14, 2013

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

Published: August 14, 2012

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Portfolio

AI-Powered Operator Network Operations Automation

Published: September 01, 2021

Developed AI-based automation systems for operator network operations, improving efficiency and reducing manual intervention in network management.

AI-Based Base Station Error Handling and Diagnosis

Published: March 01, 2022

Developed intelligent systems for automated base station error detection, diagnosis, and resolution, reducing downtime and maintenance costs.

Road Detection and Autonomous Driving

Published: June 01, 2023

Research and development on road detection and autonomous driving systems, focusing on computer vision and perception for autonomous vehicles.

Robot Reinforcement Learning

Published: September 01, 2023

Research on reinforcement learning for robotic control and manipulation, developing intelligent agents for robotic tasks.

Text-to-Scene Generation

Published: March 01, 2024

Generating 3D scenes from natural language descriptions, bridging the gap between language understanding and 3D scene representation.

MLLM-MCTS-CoT: Monte Carlo Tree Search for Multimodal Reasoning

Published: November 26, 2024

Combining Monte Carlo Tree Search (MCTS) with Chain-of-Thought (CoT) prompting to strengthen reasoning in multimodal large language models.

Multimodal Large Language Model Chain-of-Thought

Published: December 04, 2024

Research on Chain-of-Thought (CoT) reasoning for multimodal large language models, improving reasoning capabilities in vision-language tasks.

Automated Evaluation Framework Improvements (lmms-eval)

Published: January 01, 2025

Improving automated evaluation frameworks based on lmms-eval for comprehensive assessment of multimodal and language models.

openRLHF and lightRLHF Framework Improvements

Published: January 01, 2025

Algorithmic improvements to the openRLHF reinforcement learning framework, including lightRLHF, a lightweight version built on those improvements.

veRL Framework Improvements

Published: March 01, 2025

Framework integration and improvements for veRL (Volcano Engine Reinforcement Learning), a flexible, efficient and production-ready RL training library for large language models.

Research MCP Servers

Published: August 01, 2025

A collection of Model Context Protocol (MCP) servers for research workflows, including arXiv integration and other research tools.

Multi-Agent Research Assistant

Published: August 14, 2025

An intelligent multi-agent system for research assistance, enabling collaborative problem-solving among multiple AI agents.

Multi-round Reinforcement Learning and Self-Evolving Agents

Published: November 01, 2025

Exploring multi-round reinforcement learning and self-evolving agent architectures for developing adaptive and continuously improving AI systems.

Agent Harness: Long-Term Memory for Adaptive AI Agents

Published: March 15, 2026

A memory-augmented agent harness framework that enables AI agents to continuously learn and adapt through long-term episodic and semantic memory mechanisms.

Self-Evolving Multi-Agent Systems with Hierarchical Memory

Published: May 01, 2026

A self-evolving multi-agent framework that enables agents to autonomously improve strategies through multi-round interactions and hierarchical memory structures.

Publications

Collaborative Multi-Agent Reinforcement Learning for Complex Task Solving

Under review, 2025

We propose a collaborative multi-agent reinforcement learning framework that enables agents to efficiently coordinate and solve complex tasks through emergent communication and adaptive role assignment.


Harnessing Long-Term Memory for Adaptive AI Agents

Under review, 2025

We present a memory-augmented agent architecture that harnesses long-term episodic and semantic memory to enable adaptive behavior and continual learning in dynamic environments.


Self-Evolving Multi-Agent Systems with Hierarchical Memory

Under review, 2025

We propose a self-evolving multi-agent framework with hierarchical memory structures that enables agents to continuously improve their strategies through multi-round interactions and experience replay.


SafeWork-R1: Coevolving Safety and Intelligence under the AI-45° Law

Published in arXiv, 2025

We introduce SafeWork-R1, a cutting-edge multimodal reasoning model that demonstrates the coevolution of capabilities and safety. It is developed by our proposed SafeLadder framework, which incorporates large-scale, progressive, safety-oriented reinforcement learning post-training, supported by a suite of multi-principled verifiers.

Recommended citation: Shanghai AI Lab et al. (2025). "SafeWork-R1: Coevolving Safety and Intelligence under the AI-45° Law." arXiv preprint arXiv:2507.18576.

Native Reasoning Models: Training Language Models to Reason on Unverifiable Data

Published in ICLR 2026 (Poster)

We propose a novel approach for training language models to reason on unverifiable data, enabling native reasoning capabilities without requiring ground-truth supervision.

Recommended citation: Wang, Y., Liu, Z., Li, X., Lu, C., & Yang, C. (2026). "Native Reasoning Models: Training Language Models to Reason on Unverifiable Data." ICLR 2026 Poster.

TrinityGuard: A Unified Framework for Safeguarding Multi-Agent Systems

Published in arXiv, 2026

We present TrinityGuard, a unified safety framework for multi-agent systems that ensures robust and trustworthy coordination among AI agents through multi-layered safeguarding mechanisms.


Reflector: Internalizing Step-wise Reflection against Indirect Jailbreaks

Published in ICML 2026

We propose Reflector, a framework that internalizes step-wise reflection mechanisms to defend against indirect jailbreak attacks on large language models.

Recommended citation: Ma, J., Zhang, J., Li, X., Zou, B., Lu, C., & Yang, C. (2026). "REFLECTOR: Internalizing Step-wise Reflection against Indirect Jailbreak." ICML 2026.

Teaching

Teaching experience 1

Undergraduate course, University 1, Department, 2014

This is a description of a teaching experience. You can use markdown like any other post.

Teaching experience 2

Workshop, University 1, Department, 2015

This is a description of a teaching experience. You can use markdown like any other post.

Li Xiangtian
