Bartosz Cywiński

bcywinski

https://cywinski.github.io/

AI & ML interests

Mechanistic Interpretability

Recent Activity

updated a model 2 days ago

bcywinski/Olmo-3-32B-Base-SAE

published a model 3 days ago

bcywinski/Olmo-3-32B-Base-SAE

updated a model 14 days ago

bcywinski/DeepSeek-R1-Distill-Llama-70B-saes

authored a paper 5 months ago

Eliciting Secret Knowledge from Language Models

Paper • 2510.01070 • Published Oct 1, 2025

authored a paper 9 months ago

Towards eliciting latent knowledge from LLMs with mechanistic interpretability

Paper • 2505.14352 • Published May 20, 2025

authored 2 papers about 1 year ago

Precise Parameter Localization for Textual Generation in Diffusion Models

Paper • 2502.09935 • Published Feb 14, 2025

SAeUron: Interpretable Concept Unlearning in Diffusion Models with Sparse Autoencoders

Paper • 2501.18052 • Published Jan 29, 2025