Hey, I'm Thomas. I work on the life sciences team at Goodfire. My research focuses on interpretability for scientific discovery: the science (read: dark art) of understanding what models have learned.
Deep networks encode rich knowledge, from semantic relationships in language models to hidden mechanisms in bio-models. I treat them as compositional systems whose parts interact and recombine. Foundation models should be structural priors for our goals, not shoehorned into simplistic features.
The goal of my research is to interface between how models think (numbers and geometries) and how humans think (concepts and compositions). My PhD built the mathematical foundations, and now I'm scaling these tools to open scientific problems.
When I'm not researching, you'll find me on a long bike ride, gaming with friends, skiing, wearing shorts, or some combination. I used to play a lot of Diablo 3, reaching top-1 (suck it, Elon) on the grift leaderboards.
Papers.
Bilinear Autoencoders
Autoencoder architecture that decomposes activations into polynomial manifolds
Tensor Similarity
A weight-based similarity metric invariant to weight-space symmetries
Compositional Interpretability
A category-theoretic framework for formalizing mechanistic explanations of neural networks
ClinVar variant prediction
Predicting and explaining pathogenicity on 4.2M genomic variants
Introducing SimpleStories
A synthetic dataset containing diverse but simple stories
Deep Interpretable Models
Scaling up weight-based interpretability to multi-layer models
Bilinear MLPs
Using bilinear MLPs to reverse-engineer image and language models from their weights
Tokenized SAEs
Using a per-token bias in SAEs to separate token reconstructions from interesting features
The Trifecta
Three techniques that improve the forward-forward algorithm, achieving 84% on CIFAR-10