How Sparse Autoencoders got me fascinated with LLMs

Like me, you've probably never thought about dissecting LLMs to figure out what really goes on inside that dark mess.

But for some reason this video popped up in my feed, and for some other reason I clicked on it. And thus started my fascination with LLM interpretability.

https://www.youtube.com/watch?v=UGO_Ehywuxc

In my opinion, the actual reason behind this piqued interest is that I can now actually get hands-on with the surgical tools provided by https://www.neuronpedia.org/

Let’s see where it takes me cuz LLMs are fucking interesting.