Xin Chen, Cynthia

Hi! I’m currently a Direct PhD student at ETH Zurich, jointly supervised by Prof. Andreas Krause and Prof. Florian Tramèr. I am deeply passionate about developing AI solutions that benefit society and align with human values. I graduated from The University of Hong Kong, and before joining ETH I spent time at UC Berkeley (Center for Human-Compatible AI), Stanford University, and Columbia University. You can call me Cynthia or Chen Xin (陈欣). My current research interests include:

Understanding the science of LLM Alignment,
Learning human values, and
Reward hacking.

I am grateful to be supported by the Open Phil AI Fellowship and the Vitalik Buterin PhD Fellowship for my research.

I believe coordinated global efforts are crucial to mitigating the risks of AI, so I help organize the International Dialogues on AI Safety. The best way to reach me is by email at xin.chen@inf.ethz.ch.

I was previously supported by the HKU Foundation Scholarship for my undergraduate studies (ranked in the top 0.02% in the Chinese National College Entrance Exam, Gaokao). Out of concern for poverty, animal welfare, and climate change, I donate a considerable portion of my income to the most effective charities every year. For most of my adult life, I have tried to read 50 books a year.

Recent Publications

Learning Safety Constraints for Large Language Models (ICML2025 Spotlight, top 2.6%)

Xin Chen, Yarden As, Andreas Krause

[Paper] [Code]

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback (TMLR, Outstanding Paper Finalist)

Stephen Casper, Xander Davies, Claudia Shi, Thomas Krendl Gilbert, Jérémy Scheurer, Javier Rando, Rachel Freedman, Tomasz Korbak, David Lindner, Pedro Freire, Tony Wang, Samuel Marks, Charbel-Raphaël Segerie, Micah Carroll, Andi Peng, Phillip Christoffersen, Mehul Damani, Stewart Slocum, Usman Anwar, Anand Siththaranjan, Max Nadeau, Eric J Michaud, Jacob Pfau, Dmitrii Krasheninnikov, Xin Chen, Lauro Langosco, Peter Hase, Erdem Bıyık, Anca Dragan, David Krueger, Dorsa Sadigh, Dylan Hadfield-Menell

[Paper]

Learning Safety Constraints from Demonstrations with Unknown Rewards (AISTATS2024)

David Lindner, Xin Chen, Sebastian Tschiatschek, Katja Hofmann, Andreas Krause

[Paper]

Arch-Graph: Acyclic Architecture Relation Predictor for Task-Transferable Neural Architecture Search (CVPR2022)

Minbin Huang, Zhijian Huang, Changlin Li, Xin Chen, Hang Xu, Zhenguo Li, Xiaodan Liang

[Paper]

An Empirical Investigation of Representation Learning for Imitation (NeurIPS2021) Undergraduate thesis

Xin Chen*, Sam Toyer*, Cody Wild*, Scott Emmons, Ian Fischer, Kuang-Huei Lee, Neel Alex, Steven H Wang, Ping Luo, Stuart Russell, Pieter Abbeel, Rohin Shah

[Paper] [Code] [Talk]

Exploring Geometry-aware Contrast and Clustering Harmonization for Self-supervised 3D Object Detection (ICCV2021)

Hanxue Liang*, Dapeng Feng*, ChenHan Jiang, Xin Chen, Hang Xu, Xiaodan Liang, Zhenguo Li, Wei Zhang, Luc Van Gool

[Paper]

TransNAS-Bench-101: Improving Transferability and Generalizability of Cross-Task Neural Architecture Search (CVPR2021)

Yawen Duan*, Xin Chen*, Hang Xu, Zewei Chen, Xiaodan Liang, Tong Zhang, Zhenguo Li

[Paper] [Benchmark]

CATCH: Context-based Meta Reinforcement Learning for Transferrable Architecture Search (ECCV2020)

Xin Chen*, Yawen Duan*, Zewei Chen, Hang Xu, Zihao Chen, Xiaodan Liang, Tong Zhang, Zhenguo Li

[Paper] [Website]

Blog Posts

Suggestions on Research-based Thesis