Hi, I'm Thomas! I'm a PhD student working on compositional interpretability. I'm interested in understanding how neural networks encode complex behaviours and knowledge.
Modern deep networks are astonishingly intricate and difficult to disentangle. I argue that we should embrace this complexity rather than shoehorn them into simplistic feature sets, flattening the very structures we hope to understand.
My research treats a model's weights as a compositional system whose parts interact and recombine, much like phrases in a language. By developing flexible, formally grounded tools for compositional analysis, I aim to reveal how rich mechanisms and representations arise and how we can reason about them.
Outside of work, I love going on long bike rides, gaming with friends, skiing, and wearing shorts. I used to play a lot of Diablo 3, reaching rank 1 on the Greater Rift leaderboards at one point.
New interactive visualization of manifolds
A glimpse into the geometry of large language model activations.
Spotlight at the Mechanistic Interpretability workshop
Our paper on bilinear autoencoders was selected for a spotlight at the workshop!
Bilinear MLPs paper spotlight at ICLR'25
Our paper on weight-based interpretability was accepted as a spotlight presentation at ICLR'25!