About

Publications

Reconstructing the Mind's Eye: fMRI-to-Image with Contrastive Learning and Diffusion Priors - NeurIPS 2023, Spotlight Poster
Challenging Error Correction in Recognized Byzantine Greek - ML4AL 2024, Best Paper
Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs - IEEE SaTML 2024
Scaling Sparse Feature Circuits For Studying In-Context Learning - ICML 2025, Poster
Interpreting Large Text-to-Image Diffusion Models with Dictionary Learning - CVPR 2025 Mechanistic Interpretability for Vision Workshop, Poster

Preprints

Transcoders Beat Sparse Autoencoders for Interpretability
- Work done at EleutherAI.

Blog posts