Expand each heading to view my notes.

What is Mechanistic Interpretability?

An Introduction to Circuits

Toy Models of Superposition

Towards Monosemanticity: Decomposing Language Models With Dictionary Learning