Xin Chen, Cynthia

Hi! I’m currently a Direct PhD student at ETH Zurich, supervised by Prof. Andreas Krause. I am deeply passionate about developing AI solutions that benefit society and align with human values. I graduated from The University of Hong Kong, and before ETH I also spent time at UC Berkeley (Center for Human-Compatible AI), Stanford University, and Columbia University. You can call me Cynthia or Chen Xin (陈欣). My current research interests span:

  • Improving robustness and reliability in decision making algorithms,
  • Learning the right human preferences, and
  • Reward hacking.

I am grateful to be supported by the Open Phil AI Fellowship and the Vitalik Buterin PhD Fellowship for my research.

To help grow the AI alignment research field, I am among the main organizers of the SafeAI workshop at AAAI and the AISafety workshop at IJCAI. The best way to reach me is through my email.


I was previously supported by the HKU Foundation Scholarship for my undergraduate studies (ranked in the top 0.02% in the Chinese National College Entrance Exam, Gaokao). Out of concern for poverty, animal welfare, and climate change, I donate a considerable portion of my income to the most effective charities every year. Outside of my academic pursuits, I enjoy reading, playing the piano, and spending time in nature.


  • Oct 2023: I helped facilitate the first International Dialogues on AI Safety at Ditchley Park, Oxford with Profs. Yoshua Bengio, Andrew Yao, Stuart Russell, and Ya-Qin Zhang. We published a statement here. I am honored to have authored the Chinese statement.

  • Nov 2023: Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback got accepted to TMLR! Many thanks to Casper for leading the effort.

  • Dec 2023: I will be at the Alignment Workshop and at NeurIPS. We are also organizing the Socially Responsible Language Modelling Research Workshop on Dec 16. Come find us at SoLaR, or feel free to shoot me an email if you’re interested in chatting.

Recent Publications

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback (TMLR)

Stephen Casper, Xander Davies, Claudia Shi, Thomas Krendl Gilbert, Jérémy Scheurer, Javier Rando, Rachel Freedman, Tomasz Korbak, David Lindner, Pedro Freire, Tony Wang, Samuel Marks, Charbel-Raphaël Segerie, Micah Carroll, Andi Peng, Phillip Christoffersen, Mehul Damani, Stewart Slocum, Usman Anwar, Anand Siththaranjan, Max Nadeau, Eric J Michaud, Jacob Pfau, Dmitrii Krasheninnikov, Xin Chen, Lauro Langosco, Peter Hase, Erdem Bıyık, Anca Dragan, David Krueger, Dorsa Sadigh, Dylan Hadfield-Menell


Learning Safety Constraints from Demonstrations with Unknown Rewards (AISTATS2024)

David Lindner, Xin Chen, Sebastian Tschiatschek, Katja Hofmann, Andreas Krause


Arch-Graph: Acyclic Architecture Relation Predictor for Task-Transferable Neural Architecture Search (CVPR2022)

Minbin Huang, Zhijian Huang, Changlin Li, Xin Chen, Hang Xu, Zhenguo Li, Xiaodan Liang


An Empirical Investigation of Representation Learning for Imitation (NeurIPS2021) Undergraduate thesis

Xin Chen*, Sam Toyer*, Cody Wild*, Scott Emmons, Ian Fischer, Kuang-Huei Lee, Neel Alex, Steven H Wang, Ping Luo, Stuart Russell, Pieter Abbeel, Rohin Shah

[Paper] [Code] [Talk]

Exploring Geometry-aware Contrast and Clustering Harmonization for Self-supervised 3D Object Detection (ICCV2021)

Hanxue Liang*, Dapeng Feng*, ChenHan Jiang, Xin Chen, Hang Xu, Xiaodan Liang, Zhenguo Li, Wei Zhang, Luc Van Gool


TransNAS-Bench-101: Improving Transferability and Generalizability of Cross-Task Neural Architecture Search (CVPR2021)

Yawen Duan*, Xin Chen*, Hang Xu, Zewei Chen, Xiaodan Liang, Tong Zhang, Zhenguo Li

[Paper] [Benchmark]

CATCH: Context-based Meta Reinforcement Learning for Transferrable Architecture Search (ECCV2020)

Xin Chen*, Yawen Duan*, Zewei Chen, Hang Xu, Zihao Chen, Xiaodan Liang, Tong Zhang, Zhenguo Li

[Paper] [Website]

Blog Posts

Suggestions on Research-based Thesis