Pinned
-
sycophancy-construct-validity (Public)
Python
-
safety-concept-vectors (Public)
Extracting and validating safety concept vectors (eval-awareness, deception, sycophancy, etc.) from open-weight LLMs, extending Anthropic's emotion vectors methodology to alignment-critical concepts
Python 1
-
eval-awareness-detection (Public)
Mechanistic detection of eval-awareness in language models via representation engineering
Python 1
-
does-quantization-kill-interpretability (Public)
Does Quantization Kill Interpretability? A scaling study across 5 models (124M-2.8B): RTN destroys induction heads in small models, while GPTQ preserves them at all scales.
Python 1
-
gptq-from-scratch (Public)
GPTQ post-training quantization from scratch, with GPT-2, OPT, and LLaMA support
Jupyter Notebook 1
-
pcm-bitslicing (Public)
Python 1