Hello, I’m Jan Wehner, a PhD student at CISPA Helmholtz Center for Information Security working on AI Safety. I believe AI has enormous potential but also poses grave risks that need to be addressed urgently. Thus, I research methods for improving the safety of frontier AI systems. I am especially interested in developing methods to understand, monitor and control the behavior of LLMs. I’ve done research on Representation Engineering, Harmful fine-tuning attacks and Inverse Reinforcement Learning.

My PhD is supervised by Prof. Mario Fritz. As part of the ELLIS PhD Program, I am co-supervised by Prof. David Krueger at MILA. I previously obtained my master’s degree at TU Delft and my bachelor’s degree at Otto-Friedrich-University Bamberg.

Please don’t hesitate to reach out to me at jan.wehner@cispa.de!