Hey, I'm Thomas. I work on the life sciences team at Goodfire, doing interpretability: the science (read: dark art) of understanding what AI models have learned. These models outperform human experts in tasks ranging from prose to games without ever being told what or how. In short, I aim to answer:
What do AI models know that we don't, and how do we pry it out?
Models think in numbers and geometries; we think in concepts and compositions. My research interfaces the two by treating networks as compositional systems whose parts interact and recombine, reading their geometric structure directly instead of shoehorning it into simplistic features. My PhD built the mathematical foundations. Now I'm pointing them at open problems in the life sciences.
When I'm not researching, you'll find me on a long bike ride, gaming with friends, skiing, wearing shorts, or some combination. I used to play a lot of Diablo 3, reaching top-1 on the grift leaderboards.
Papers.
Bilinear Autoencoders
An autoencoder architecture that decomposes activations into polynomial manifolds.
Tensor Similarity
A weight-based similarity metric for neural networks, invariant to weight symmetries.
Compositional Interpretability
A category-theoretic framework for formalizing mechanistic explanations of neural networks.
ClinVar Variant Prediction
Predicting and explaining the pathogenicity of 4.2M genomic variants from ClinVar.
Introducing SimpleStories
A synthetic dataset of diverse yet simple stories for training small language models.
Deep Interpretable Models
Scaling up weight-based interpretability from single layers to deep multi-layer models.
Bilinear MLPs
Using bilinear MLPs to reverse-engineer image and language models from their weights.
Tokenized SAEs
Using a per-token bias in SAEs to separate token reconstructions from interesting features.
The Trifecta
Three techniques that improve the forward-forward algorithm, achieving 84% on CIFAR-10.