🔥 News

2026.03: Started a one-year visit at the PINE Lab, School of EEE, Nanyang Technological University, directed by Prof. Ziwei Wang.
2026.01: One paper on “Evaluating LLM Safety” is accepted by ICLR 2026!
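2025.12: Selected for the 2025 Young Scientific and Technological Talents Cultivation Project Doctoral Program of the China Association for Science and Technology (sponsoring society: China Computer Federation).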
2025.12: We released one work on reinforcement learning for LLM reasoning.
2025.12: Received funding from the 2025 National Natural Science Foundation of China Basic Research Program for Young Students (Ph.D. students).
2025.11: One paper on “Multi-agent reinforcement learning” is accepted by IEEE Internet of Things Journal.
2025.11: One paper on “Offline reinforcement learning” is accepted by T-NNLS.
2025.11: One paper on “Causal Multi-agent reinforcement learning” is accepted by AAAI 2026.
2025.11: One paper on “Safety of large language models” is accepted by Machine Learning.
2025.11: One paper on “Multi-agent reinforcement learning” is accepted by Neural Networks.
2025.10: Received the 2025 National Scholarship for Graduate Students.
2025.06: One paper on “Benchmarking High-Dimensional Bayesian Optimization” is accepted by IJCNN 2025.
2025.04: One paper on “Causal reinforcement learning” is accepted by SCIENCE CHINA Information Sciences (SCIS).
2025.02: We released one work on evaluating LLM safety.
2025.01: Two papers on “Causal reinforcement learning” are accepted by ICLR 2025.
2024.12: One paper on “Multi-agent reinforcement learning” is accepted by AAAI 2025 (Oral).
2024.10: One paper on “Bayesian Optimization” is accepted by the Journal of Software (in Chinese).
2024.06: One paper on “Causal Reinforcement Learning” is accepted by ICML 2024 Workshop: Foundations of Reinforcement Learning and Control.
2024.04: One paper on “eXplainable Reinforcement Learning” is accepted by Chinese Journal of Computers (in Chinese).
2024.04: One paper on “Multi-agent reinforcement learning” is accepted by the Journal of Software (in Chinese).
2023.12: Two papers on “Multi-agent reinforcement learning” are accepted by ICASSP 2024.
2023.10: I chaired the Reinforcement Learning Algorithms session at ECAI 2023.
2023.09: I received the Excellent Master’s Thesis award from Northwestern Polytechnical University.
2023.07: One paper on “Reinforcement learning” is accepted by ECAI 2023.

📝 Selected Publications

SafeDialBench: A Fine-Grained Safety Benchmark for Large Language Models in Multi-Turn Dialogues with Diverse Jailbreak Attacks
Hongye Cao, Yanming Wang, Sijia Jing, Ziyue Peng, Zhixin Bai, Zhe Cao, Meng Fang, Fan Feng, Boyan Wang, Jiaheng Liu, Tianpei Yang, Jing Huo, Yang Gao, Fanyu Meng, Xi Yang, Chao Deng, Junlan Feng.
ICLR, 2026
Project Page GitHub
Efficient Reinforcement Learning with Semantic and Token Entropy for LLM Reasoning
Hongye Cao, Zhixin Bai, Ziyue Peng, Boyan Wang, Tianpei Yang, Jing Huo, Yuyao Zhang, Yang Gao
arXiv, 2025
Paper Code
Model-Based Offline Reinforcement Learning with Adversarial Data Augmentation
Hongye Cao, Fan Feng, Jing Huo, Shangdong Yang, Meng Fang, Tianpei Yang, and Yang Gao
T-NNLS, 2026
Causality-Aware Efficient Exploration for Cooperative Multi-Agent Reinforcement Learning
Hongye Cao, Tianpei Yang, Fan Feng, Hammadi Rafik Ouariachi, Yali Du, Meng Fang, Jing Huo, and Yang Gao.
AAAI, 2026
Causal Action Empowerment for Efficient Reinforcement Learning in Embodied Agents
Hongye Cao, Fan Feng, Jing Huo, and Yang Gao.
SCIENCE CHINA Information Sciences (SCIS), 2025
Paper Code
Towards Empowerment Gain through Causal Structure Learning in Model-Based RL
Hongye Cao, Fan Feng, Meng Fang, Shaokang Dong, Tianpei Yang, Jing Huo, and Yang Gao.
ICML 2024 Workshop: Foundations of Reinforcement Learning and Control, 2024
ICLR, 2025
OpenReview Project Page
Causal Information Prioritization for Efficient Reinforcement Learning
Hongye Cao, Fan Feng, Tianpei Yang, Jing Huo, and Yang Gao.
ICLR, 2025
OpenReview Project Page
A Survey of Interpretability Research Methods for Reinforcement Learning
Hongye Cao, Xiao Liu, Shaokang Dong, Shangdong Yang, Jing Huo, Wenbin Li, Yang Gao.
Chinese Journal of Computers, 2024
PDF
Enhancing OOD Generalization in Offline Reinforcement Learning with Energy-Based Policy Optimization
Hongye Cao, Shangdong Yang, Jing Huo, Xingguo Chen, Yang Gao.
European Conference on Artificial Intelligence (ECAI), 2023 (Acceptance Rate: 24% = 391/1631)
PDF

🍀 Projects

National Natural Science Foundation for Ph.D. students: “Research and Application of Reinforcement Learning Integrating Causal Discovery”

🏆 Honors and Awards

2025.12, Young Scientific and Technological Talents Cultivation Project Doctoral Program, China Association for Science and Technology
2025.10, National Scholarship for Graduate Students, Nanjing University
2025.10, Outstanding Scientific Research and Innovation Project, Nanjing University
2023.09, Excellent Master’s Thesis of Northwestern Polytechnical University
2023.06, Outstanding Graduates of Shaanxi Province
2023.04, Outstanding Graduate Representative of Northwestern Polytechnical University
2022.11, Northwestern Polytechnical University Graduate Model Candidate
2022.10, National Scholarship for Graduate Students, Northwestern Polytechnical University
2021.10, National Scholarship for Graduate Students, Northwestern Polytechnical University
2021.08, National Second Prize of the 10th China Software Cup Competition (21/5543)
2019.10, China Aerospace Science and Technology Corporation second-class scholarship

👨‍🎓 Education

2023.09 - present, Ph.D. student, Department of Computer Science and Technology, Nanjing University, Nanjing.
2020.09 - 2023.04, Master, School of Software, Northwestern Polytechnical University, Xi’an.
2016.09 - 2020.06, Bachelor, School of Software, Northwestern Polytechnical University, Xi’an.

💬 Reviewer

ICML-26, ICLR-26, AAAI-26, ACL ARR
IEEE Transactions on Knowledge and Data Engineering, IEEE Transactions on Neural Networks and Learning Systems, IEEE Transactions on Artificial Intelligence, Knowledge and Information Systems, Journal of Selected Topics in Applied Earth Observations and Remote Sensing.
Journal of Software (in Chinese)

💬 Chair

Session Chair: ECAI 2023, Session: Reinforcement Learning Algorithms

💬 Talk

2025.08, CCF-AI

Hongye Cao