Releases: Seqev/dcr-attention_v3

v3.1-data — Scope memo + canonical measurement artifacts

28 May 16:19
@Seqev Seqev
4d24e34
This commit was created on GitHub.com and signed with GitHub’s verified signature.
GPG key ID: B5690EEEBB952194

Pre-release data drop for DCR-Attention v3.1. Scope memo + canonical
measurement artifacts. Full manuscript is a separate forthcoming release.

Hero result (N=32K, B=4, c=0.15):

  • M6 + M5-mixed: 187.29 ms, […]× over SDPA, […]× over M4 baseline
  • Parity crossing: M4 was […]× (sub-parity); v3.1 now above SDPA
  • Clean theoretical ceiling: […]×
  • Canonical protocol: 50 warmup, 30 timed, 3 randomized sessions (hero variance 0.098%)
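A minimal sketch of a timing harness in the shape of the protocol above (50 warmup calls, 30 timed calls, 3 sessions with randomized candidate order). The function names, the use of the median, and the definition of spread as standard deviation over mean are assumptions for illustration; the release does not state how the 0.098% figure is computed. Real GPU timing would also need device synchronization (for example `torch.cuda.synchronize()`) around each timed call, which this CPU-only sketch omits.

```python
import random
import statistics
import time


def time_ms(fn, warmup=50, timed=30):
    """Median wall-clock latency of fn() in milliseconds, after warmup calls."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(timed):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1e3)
    return statistics.median(samples)


def run_sessions(candidates, sessions=3, seed=0):
    """Time every candidate once per session, in a freshly shuffled order."""
    rng = random.Random(seed)
    results = {name: [] for name in candidates}
    for _ in range(sessions):
        order = list(candidates)
        rng.shuffle(order)  # randomize order so drift does not favor one candidate
        for name in order:
            results[name].append(time_ms(candidates[name]))
    return results


def relative_spread_pct(values):
    """Sample standard deviation as a percentage of the mean (assumed metric)."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)
```

Shuffling the order per session is one way to keep thermal or clock drift from systematically penalizing whichever kernel runs last.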

Contents:

  • docs/paper_rewrite_scope_memo.md — paper scope, incl. retraction ledger (§5)
  • results/ — canonical measurements + falsification artifacts
  • 8 characterized negative results (see README)

Note: this drop documents process, not a finished paper. Two intermediate
findings (Pass-2 inflation, Pass-3 deflation) were retracted before publication
after canonical re-measurement; they remain on record in §5 as a discipline ledger.

Env: Llama-3.2-1B · RTX 4060 Ti · torch 2.5.1+cu121 · triton 3.1.0 · seed 0


DCR-attention: Top-K Sparse Attention for Long-Context Decode on Llama-3.2-1B (v3 Release)

25 May 20:22
@Seqev Seqev
d70d0a4

DCR-attention v3.0.0 — Initial public release

Top-K sparse attention for long-context decode on Llama-3.2-1B. Multi-seed
validated, pre-registered DESCRIPTIVE causal verdict, honest sub-parity
systems characterization.
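For readers unfamiliar with the technique, here is a minimal pure-Python sketch of top-K sparse attention for a single decode query. It is not the M4 Triton kernel from this repository. It assumes that `c` is the fraction of the N cached keys that are kept (so K = ceil(c · N)) and that keys are ranked by scaled dot-product score; both are assumptions, since these notes do not define them.

```python
import math


def topk_sparse_attention(q, keys, values, c):
    """Attend from one query to the top ceil(c * N) keys by scaled dot product.

    q: list of d floats; keys: N lists of d floats; values: N lists of floats.
    Returns (output vector, sorted indices of the keys that were kept).
    """
    n, d = len(keys), len(q)
    k = max(1, math.ceil(c * n))  # assumed meaning of coverage c
    scores = [sum(qi * ki for qi, ki in zip(q, key)) / math.sqrt(d) for key in keys]
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]

    # Softmax restricted to the selected keys, shifted by the max for stability.
    m = max(scores[i] for i in top)
    weights = [math.exp(scores[i] - m) for i in top]
    z = sum(weights)

    out = [0.0] * len(values[0])
    for w, i in zip(weights, top):
        for j, vj in enumerate(values[i]):
            out[j] += (w / z) * vj
    return out, sorted(top)
```

With `c = 1.0` every key is kept and the result reduces to ordinary dense softmax attention, which makes a convenient correctness check for any sparse variant.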

Headline result

Multi-seed hero deployment point: N = 32,000 context, c = 0.15,
ΔPPL = +0.428% ± 0.096 pp (5 seeds; STRICT classification). Latency on
RTX 4060 Ti at hero point: […]× SDPA. Classification: HeroQualityOnly
(quality multi-seed validated, speedup partial).

What's included

  • paper/ — v3 paper (PDF + LaTeX source, 31 pages, compiled clean)
  • dcr_attention/ — M4 Triton kernel + reference + Llama integration
  • tests/ — acceptance tests, integration tests, 6 causal-pilot iteration scripts
  • scripts/ — analysis scripts (Wilcoxon, Mann-Whitney, dose-response), figure generators
  • data/raw/ — curated per-seed measurement JSONs cited by paper

Key claims

  • Empirical scaling law ΔPPL(N, c) = A(c) · N^(-(1-α_eff(c))), coverage-dependent
  • Pre-registered matched-magnitude causal test: DESCRIPTIVE (Wilcoxon p=0.76,
    Mann-Whitney U bias check p=0.27, generalizable within tested protocol)
  • Random-spectrum baseline locates α<1 as a trained property (α_random=1.0000
    vs α_trained=0.39 on 5 untrained-K seeds)
  • HeroQualityOnly mechanistically explained: Pass-3 dominance at large N×B
  • ABKV architectural response shown to be feasible on synthetic data;
    end-to-end speedup is explicit future work
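The scaling law in the first claim is linear in log-log space: log ΔPPL = log A(c) − (1 − α_eff(c)) · log N. At a fixed coverage c, a straight-line fit therefore gives α_eff = 1 + slope and A = exp(intercept). The sketch below shows that fit with ordinary least squares; it is an illustration under these assumptions, not the analysis script shipped in scripts/, and `fit_scaling_law` is a hypothetical name.

```python
import math


def fit_scaling_law(ns, deltas):
    """Fit delta = A * N**(-(1 - alpha)) at fixed c by least squares in log-log space.

    ns: context lengths; deltas: positive ΔPPL values measured at those lengths.
    Returns (A, alpha).
    """
    xs = [math.log(n) for n in ns]
    ys = [math.log(d) for d in deltas]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx              # slope = -(1 - alpha)
    intercept = my - slope * mx    # intercept = log A
    return math.exp(intercept), 1.0 + slope


if __name__ == "__main__":
    # Synthetic check only: data generated from the law itself, not measurements.
    ns = [2000, 4000, 8000, 16000, 32000]
    deltas = [2.0 * n ** (-(1 - 0.39)) for n in ns]
    print(fit_scaling_law(ns, deltas))
```

Under this form, α = 1 makes the exponent zero, so ΔPPL no longer depends on N, while α < 1 means the quality gap shrinks as context grows.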

Release context

The v1.0 and v2.0 Zenodo DOIs were deleted as a sober reset prior to v3. The
single-seed v2.0 numerical headline was reproduced exactly on seed 0 in
the multi-seed re-validation, but it sits at the high end of the seed
distribution; the 5-seed mean places it […]× lower, consistent with a
tier-boundary measurement that benefits from a multi-seed protocol.

License

Apache 2.0

Citation

See CITATION info on Zenodo (DOI badge will appear in README after this
release).
