Sitemap
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
Pages
Posts
Future Blog Post
This post will show up by default. To disable scheduling of future posts, edit config.yml and set future: false.
Blog Post number 4
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 3
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 2
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 1
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
portfolio
Portfolio item number 1
Short description of portfolio item number 1
Portfolio item number 2
Short description of portfolio item number 2
publications
On robust vs fast solving of qualitative constraints
Jan Wehner, Michael Sioutis, Diedrich Wolter in Journal of Heuristics, 2023
This paper introduces the notion of robustness to Qualitative Constraint Networks (QCNs) and finds a tradeoff between speed and robustness in heuristics for solving QCNs.
Explaining Learned Reward Functions with Counterfactual Trajectories
Jan Wehner, Frans Oliehoek, Luciano Cavalcante Siebert forthcoming in AIEB workshops at ECAI 2024, 2024
We propose a method for explaining reward functions by showing the rewards given to counterfactual trajectories.
Immunization against harmful fine-tuning attacks
Domenic Rosati, Jan Wehner, Kai Williams, Lukasz Bartoszcze, Hassan Sajjad, Frank Rudzicz in Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
LLMs can be fine-tuned with harmful data to remove their safeguards. We formalize the problem and set out conditions for a solution.
Representation Noising: A Defence Mechanism Against Harmful Finetuning
Domenic Rosati, Jan Wehner, Kai Williams, Łukasz Bartoszcze, David Atanasov, Robie Gonzales, Subhabrata Majumdar, Carsten Maple, Hassan Sajjad, Frank Rudzicz in NeurIPS, 2024
We propose Representation Noising, which prevents harmful fine-tuning by removing harmful representations.
Safety is Essential for Responsible Open-Ended Systems
Ivaxi Sheth, Jan Wehner, Sahar Abdelnabi, Ruta Binkyte, Mario Fritz forthcoming in SSI-FM ICLR 2025 Workshop, 2025
Open-ended AI is a growing paradigm where AI continuously explores novel and interesting artifacts. This position paper describes specific safety challenges in open-ended AI and how they can be mitigated.
Taxonomy, Opportunities, and Challenges of Representation Engineering for Large Language Models
Jan Wehner, Sahar Abdelnabi, Daniel Tan, David Krueger, Mario Fritz in arXiv preprint, 2025
This survey paper reviews the literature on Representation Engineering, a technique for controlling LLMs through their internal representations. We set out a unifying taxonomy, describe methods and applications, and showcase weaknesses and opportunities.
talks
Talk 1 on Relevant Topic in Your Field
This is a description of your talk, which is a markdown file that can be all markdown-ified like any other post. Yay markdown!
Conference Proceeding talk 3 on Relevant Topic in Your Field
This is a description of your conference proceedings talk, note the different field in type. You can put anything in this field.
teaching
Teaching experience 1
This is a description of a teaching experience. You can use markdown like any other post.
Teaching experience 2
This is a description of a teaching experience. You can use markdown like any other post.