About
Publications
- Reconstructing the Mind's Eye: fMRI-to-Image with Contrastive Learning and Diffusion Priors - NeurIPS 2023, Spotlight Poster
- Challenging Error Correction in Recognized Byzantine Greek - ML4AL 2024, Best Paper
- Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs - IEEE SaTML 2024
- Scaling Sparse Feature Circuits for Studying In-Context Learning - ICML 2025, Poster
- Interpreting Large Text-to-Image Diffusion Models with Dictionary Learning - CVPR 2025 Mechanistic Interpretability for Vision Workshop, Poster
Preprints
- Transcoders Beat Sparse Autoencoders for Interpretability (work done at EleutherAI)
Blog posts
- EleutherAI
- LessWrong