Group
Starting in October 2026, I will join the Department of Language Science and Technology at Saarland University on the Saarland Informatics Campus in Saarbrücken 🇩🇪 as a tenure-track professor, and I am building a new research group there. I will additionally serve as a scientific director at the German Research Center for Artificial Intelligence (DFKI).
My group studies large language models (LLMs), the technology behind tools like ChatGPT and Claude that millions of people now use every day. Today, almost every real-world LLM application relies on adaptation: taking a pre-trained model and modifying it for new domains, languages, modalities, or to make it safer and better aligned. Yet we still understand surprisingly little about when, why, and how adaptation actually works. Our goal is to turn adaptation into a principled science and use that understanding to build more reliable and adaptable AI systems. This is part of a longer-term vision for systems that adapt transparently, robustly, and continuously over time.
Research themes
Concretely, we work along three directions:
- Interpretability. How do LLMs actually work, and how does adaptation change them? I want to develop interpretability methods that reveal how models represent knowledge and reach their decisions, and use them to make models more transparent and easier to adapt.
- Generalization. Models excel on benchmarks but can still fail in the real world. I want to understand when and why models generalize, and build ones we can trust when they face data outside their training distribution, whether it comes from new tasks, different domains, or harder reasoning problems. That means taking evaluation seriously and designing controlled experiments that show what a model has really learned, so we can tell genuine understanding apart from shortcuts that only look like it.
- Continual learning. Most models are frozen the moment training ends. I am excited about building systems that keep learning: models that absorb new knowledge during deployment, update or remove outdated facts, and improve from interaction without expensive retraining.
For examples of my previous work on these themes, see my Papers.
Broader perspective
Our field is moving incredibly fast, and the hype around LLM-based technology reaches new heights almost daily. For academic researchers, this can be, and often is, a frustrating situation. How can we still have an impact when the most capable and widely used models are developed behind closed doors in industry anyway? Personally, I strongly believe that academic research plays a crucial role in advancing our understanding of AI technology. The research I care about most usually starts by asking a why question, challenges assumptions the community takes for granted, and involves running careful experiments to check whether things actually work the way we think. A lot of my favorite projects started exactly like that, by poking at a widely held belief and finding that the full picture was more complicated. I want to build a group that follows this philosophy and becomes known for rigorous science that truly broadens our understanding.
The other thing I care deeply about is having a social lab. For me, research is a team effort, and I have benefited greatly from outstanding collaborators throughout my career. The best ideas usually come from people thinking out loud together, sharing work early, and helping each other when things get stuck. I want to create an environment where people feel comfortable admitting when they don't know something yet and asking for help. So I'm looking for people who genuinely enjoy working in a team and want to collaborate openly, both inside and outside the group.
Joining the group & getting in touch
I am recruiting! I have funding for one PhD student (starting October 2026) and research interns, with more PhD funding expected next year. If you have a background in machine learning, NLP, or a related area and are excited about these topics, fill out this form to express your interest.
For collaborations or other questions, email me at marius.mosbach@mila.quebec.