Hi! I am a researcher working on training dynamics and model alignment at London AI Safety Research Labs with Dima Krasheninnikov (Anthropic / Cambridge PhD). Before that, I was at the Supervised Program for Alignment Research, working on Mixture of Experts Interpretability with the UK AI Security Institute.
My interests include the science of deep learning (pre-training dynamics, RL, and continual learning), how interpretability generalises to modern architectures (MoE, linear attention, distilled models), and the study of model personas and model organisms as tools for understanding how models learn behaviour.
Previously, I worked for 3 years as a Machine Learning Engineer at Fourth, where I improved time-series models for large restaurant chains, and at STRAT7, where I conducted NLP modelling on text reviews.
My Bachelor's thesis at Uni Bonn, supervised by Prof. Teodora Boneva, was at the intersection of Statistics and Economics.
I used to compete in science olympiads: