Papers

Interpretable variant effect prediction from genomic foundation models

Predicting and explaining pathogenicity on 4.2M genomic variants

bioRxiv
Finding manifolds with bilinear autoencoders

Autoencoder architecture that decomposes activations into polynomial manifolds

NeurIPS'25 workshop spotlight
Parameterized synthetic text generation with SimpleStories

A synthetic dataset containing diverse but simple stories

NeurIPS'25
Compositionality unlocks deep interpretable models

Scaling up weight-based interpretability to multi-layer models

AAAI'25 workshop
Bilinear MLPs enable weight-based mechanistic interpretability

Using bilinear MLPs to reverse-engineer image and language models from their weights

ICLR'25 spotlight
Tokenized SAEs: disentangling SAE reconstructions

Using a per-token bias in SAEs to separate token reconstructions from interesting features

ICML'24 workshop
The trifecta: three techniques for deeper forward-forward networks

Three techniques that significantly improve the forward-forward algorithm, achieving 84% on CIFAR-10

TMLR

Presentations