I am a postdoctoral researcher at Mila - Quebec AI Institute and a postdoctoral fellow at McGill University in Montréal 🇨🇦. Before that, I did my PhD in Computer Science at Saarland University in Saarbrücken 🇩🇪.
The goal of my research is to enable reliable, controllable, and adaptable AI systems, particularly the Large Language Models (LLMs) that millions interact with daily. After pre-training, LLMs require significant adaptation (also known as fine-tuning or post-training) to become specialized, safe, and aligned with specific requirements. My research program centers on building a fundamental, scientific understanding of this crucial adaptation stage.
In addition to model adaptation, I am broadly interested in the interpretability of LLMs, and in particular in how to make interpretability research more actionable.
For our work, my collaborators and I have received a Best Paper Award at COLING 2022, the Best Theme Paper Award at ACL 2023, and the Most Interesting Paper Award at the BabyLM Challenge 2023.
Latest News
- Check out our new preprint on forecasting downstream performance of LLMs with proxy metrics.
- Three papers accepted to ICML 2026 🇰🇷: LatentLens, Operationalising the Superficial Alignment Hypothesis, and the position paper Interpretability Can Be Actionable.
- Two papers accepted to ACL 2026 🇺🇸: Do Generalisation Results Generalise? and CLaS-Bench: A Cross-Lingual Alignment and Steering Benchmark.
- Two new preprints: LLM2Vec-Gen proposes a self-supervised method that produces embeddings directly in an LLM's output space, and The Illusion of Superposition? investigates whether language models actually leverage superposition when reasoning with latent chain-of-thoughts.
