Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.

Pages

Posts

Future Blog Post

less than 1 minute read

Published:

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.
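
As a minimal sketch, assuming a standard Jekyll setup where the site configuration lives in `_config.yml` at the project root, the setting mentioned above might look like this:

```yaml
# _config.yml (Jekyll site configuration; file location is an assumption)
# With future set to false, posts dated in the future are held back
# until their publication date has passed.
future: false
```

When running a local `jekyll serve`, changes to the configuration file typically require restarting the server before they take effect.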

Blog Post number 4

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Portfolio

Road Detection and Autonomous Driving

Published:

Research and development on road detection and autonomous driving systems, focusing on computer vision and perception for autonomous vehicles.

Robot Reinforcement Learning

Published:

Research on reinforcement learning for robotic control and manipulation, developing intelligent agents for robotic tasks.

Text-to-Scene Generation

Published:

Generating 3D scenes from natural language descriptions, bridging the gap between language understanding and 3D scene representation.

OpenRLHF and lightRLHF Framework Improvements

Published:

Algorithm improvements for the OpenRLHF reinforcement learning framework, including lightRLHF, a lightweight variant built on those improvements.

veRL Framework Improvements

Published:

Framework integration and improvements for veRL (Volcano Engine Reinforcement Learning), a flexible, efficient and production-ready RL training library for large language models.

Research MCP Servers

Published:

A collection of Model Context Protocol (MCP) servers for research workflows, including arXiv integration and other research tools.

Multi-Agent Research Assistant

Published:

An intelligent multi-agent system for research assistance, enabling collaborative problem-solving among multiple AI agents.

Publications

SafeWork-R1: Coevolving Safety and Intelligence under the AI-45° Law

Published in arXiv, 2025

We introduce SafeWork-R1, a cutting-edge multimodal reasoning model that demonstrates the coevolution of capabilities and safety. It is developed by our proposed SafeLadder framework, which incorporates large-scale, progressive, safety-oriented reinforcement learning post-training, supported by a suite of multi-principled verifiers.

Recommended citation: Shanghai AI Lab et al. (2025). "SafeWork-R1: Coevolving Safety and Intelligence under the AI-45° Law." arXiv preprint arXiv:2507.18576.
Download Paper

Native Reasoning Models: Training Language Models to Reason on Unverifiable Data

Published in ICLR 2026 Poster, 2026

We propose a novel approach for training language models to reason on unverifiable data, enabling native reasoning capabilities without requiring ground-truth supervision.

Recommended citation: Wang, Y., Liu, Z., Li, X., Lu, C., & Yang, C. (2026). "Native Reasoning Models: Training Language Models to Reason on Unverifiable Data." ICLR 2026 Poster.
Download Paper

Reflector: Internalizing Step-wise Reflection against Indirect Jailbreaks

Published in ICML 2026, 2026

We propose Reflector, a framework that internalizes step-wise reflection mechanisms to defend against indirect jailbreak attacks on large language models.

Recommended citation: Ma, J., Zhang, J., Li, X., Zou, B., Lu, C., & Yang, C. (2026). "Reflector: Internalizing Step-wise Reflection against Indirect Jailbreaks." ICML 2026.
Download Paper

Talks

Teaching

Teaching experience 1

Undergraduate course, University 1, Department, 2014

This is a description of a teaching experience. You can use markdown like any other post.

Teaching experience 2

Workshop, University 1, Department, 2015

This is a description of a teaching experience. You can use markdown like any other post.