Hi, I'm Hantao Lou!

I am a final-year undergraduate student ('27) at Yuanpei College, Peking University, and a member and the monitor of Tong Class, a pilot class in Artificial Intelligence.

I work on large language model alignment at the Alignment and Interaction Lab and the Center of AI Safety and Governance, Institute of Artificial Intelligence, Peking University, advised by Prof. Yaodong Yang. I was also a scholar at the MATS program, where I worked on interpretability of large language models under Evan Hubinger.

My research aims to provide and utilize supervision signals for scalable AI alignment. I have contributed to six papers, three of them as first or co-first author, at venues including ICML, ACL, NeurIPS, and AAAI, and I serve as a reviewer for major machine learning conferences. I am also an active open-source contributor, creating and maintaining alignment and formal-verification projects with 4,500+ stars.

Most of my current research is driven by a single question:

How can we make advanced, complex AI systems reliable and scalable?

To answer this question, I'm exploring the intersection of large language model alignment, mechanistic interpretability, scalable oversight, and formal verification.

Please feel free to get in touch!