#logit-lens

Here are 12 public repositories matching this topic...

Open-source EU AI Act Annex IV documentation toolkit. Mechanistic interpretability + circuit discovery for transformers. One function call generates a structured, hash-chained evidence package.

  • Updated Jun 15, 2026
  • Python

We optimize a compact latent state (with frozen weights) to force failed multi-hop chains to output the missing answer D. Five pre-registered controls show that the optimized state simply injects D: it carries D without the code-fact, leaves the intermediates invisible, is inert to hop corruption, and doesn’t transfer. No latent composition at the 3B scale (Llama-3.2-3B, Qwen2.5-3B).

  • Updated Jun 4, 2026
  • Python
