Papers - Interpretability
Prompt-to-Prompt Image Editing with Cross Attention Control
Paper
• 2208.01626
• Published
• 3
BERT Rediscovers the Classical NLP Pipeline
Paper
• 1905.05950
• Published
• 3
A Multiscale Visualization of Attention in the Transformer Model
Paper
• 1906.05714
• Published
• 2
Analyzing Transformers in Embedding Space
Paper
• 2209.02535
• Published
• 3
LVLM-Interpret: An Interpretability Tool for Large Vision-Language Models
Paper
• 2404.03118
• Published
• 25
The Geometry of Categorical and Hierarchical Concepts in Large Language Models
Paper
• 2406.01506
• Published
• 3
Confidence Regulation Neurons in Language Models
Paper
• 2406.16254
• Published
• 10
Decoding Dark Matter: Specialized Sparse Autoencoders for Interpreting Rare Concepts in Foundation Models
Paper
• 2411.00743
• Published
• 7
Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models
Paper
• 2411.14257
• Published
• 14
Paper
• 2412.09764
• Published
• 5