Releases: rfi-irfos/ternary-intelligence-stack
v3.1.0 — The Deepening
TIS v3.1.0 — The Deepening
Released: 2026-06-15
Codename: The Deepening
Team: RFI-IRFOS · Graz, Austria · Patent Pending A50296/2026
Previous release: v3.0.0, The Cultivated Mind (2026-05-28)
What this release is
v3.0.0 shipped a 26-layer dual-stream model and called it "The Cultivated Mind." That was correct. What v3.0.0 did not say — and what nobody looking at the documentation could have understood — is how large that mind actually is.
"12 experts, Top-3 routing" was the description everywhere. On HuggingFace. In the README. In the model card. In every document we published. The number 12 appeared over and over, and every reader walked away with the same impression: this is a small experiment. A research prototype with a handful of experts.
That description was incomplete to the point of being misleading. Here is what it should have said:
768 total expert-routing slots. 12 experts per layer × 32 layers × 2 independently-routing streams. Each stream selects Top-3 of its 12 per layer independently. The FFN weights are shared; the routing gate is not. From the outside: 768 possible expert activations per forward pass. 192 active per token (Top-3 × 32 layers × 2 streams). Nine of twelve experts per step are bypassed entirely via @sparseskip.
That number — 768 — does not appear anywhere in v3.0.0. This release fixes that. Every document, every README, every model card, every methodology file has been updated to lead with the correct total.
The Weights Are on HuggingFace. All of Them. Right Now.
The full albert. model weights are publicly available at huggingface.co/rfi-irfos/albert — no gating, no application, no waitlist.
This matters more than it might sound. To our knowledge, no prior publicly available model combines all of the following in a single download:
- Trained entirely from scratch in ternary — weights constrained to {−γ, 0, +γ} throughout both the forward and backward passes via Straight-Through Estimation. No float32 pretraining, no post-hoc quantization, no distillation from a binary model. The ternary constraint was active from epoch 1.
- Full trit-tensor checkpoint — the safetensors file contains the actual ternary weight matrices as they exist in training, not a reconstructed approximation. You can load them, inspect them, and run inference with them as-is.
- Complete training pipeline included: tokenizer, corpus loader, EvolutionManager, STE backward pass, TTL routing, Mycelium expert health monitor, Net2Net surgery logic, Cord Surgery implementation. Everything is in the repo, buildable with `cargo build --release`.
- Architecture that grew itself: the model at 32 layers per stream did not start at 32 layers. It started at 12 and grew to 32 through 19 autonomous Net2Net surgery events, each triggered by the model's own Fibonacci plateau gate. The weights reflect this evolutionary history. Every layer was earned, not initialized.
- Full documentation of every architectural event — surgery log, convergence log, evolution evidence, sparseskip benchmark methodology, architecture doc, model card — all committed alongside the weights.
The combination of from-scratch ternary training + published trit weights + complete pipeline + evolutionary growth log + full documentation in one public repository does not, as far as we know, exist anywhere else. We are not making that claim lightly. We are making it because we have looked.
The Architecture, Stated Correctly
| Metric | v3.0.0 (2026-05-28) | v3.1.0 (2026-06-15) |
|---|---|---|
| Layers per stream | 26 | 32 |
| Streams | 2 (dual-stream, Cord Surgery) | 2 (unchanged) |
| Experts per layer | 12 (shared FFN weights) | 12 (unchanged) |
| Total expert-routing slots | 624 (26 × 12 × 2, never stated) | 768 (32 × 12 × 2, now documented everywhere) |
| Active experts per token | 156 (Top-3 × 26L × 2) | 192 (Top-3 × 32L × 2) |
| Expert skip rate (@sparseskip) | 75% | 75% (unchanged) |
| Anastomosis gates | 6 (Fibonacci [2,3,5,8,13,21]) | 6 (unchanged) |
| Parameters | ~187.5M | ~224M |
| Depth surgeries | 13 (S1–S13) + 1 cord | 19 (S1–S19) + 1 cord |
| Global epoch | ep4234 | ep~6500+ |
| Best EP-AVG ATL | 9.2045 (ep4136, 23L era) | 5.8693 (ep6487) |
| Best chip ATL | 8.6852 (post-S13) | 1.2637 |
| TTL routing rows | 52 (L0–L25 × 2 streams) | 64 (L0–L31 × 2 streams) |
On the 768 number
The Cord Surgery at ep4202 introduced independent per-stream routing gates. This is the architectural decision that makes 768 the right number to quote, not 12. When stream A routes its token through 12 experts at layer 17, stream B is simultaneously routing through a completely separate gate network — same weight matrices, different routing decision. The two streams see the same FFN parameters from a different angle on every single forward pass. This is not a parameter count — it is a capacity and diversity count. 768 is the number of distinct routing paths the model can activate. 192 is the number it actually activates per token. 576 are bypassed, zero-weight, @sparseskip.
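The slot counts quoted above follow from three multiplications. A minimal sketch of that arithmetic, assuming only the figures stated in this release (the struct and method names are illustrative and do not come from the TIS codebase):

```rust
/// Routing-slot arithmetic for a dual-stream Top-k MoE.
/// All names here are illustrative, not taken from the TIS codebase.
#[derive(Debug, Clone, Copy)]
struct RoutingShape {
    layers_per_stream: usize,
    experts_per_layer: usize,
    streams: usize,
    top_k: usize,
}

impl RoutingShape {
    /// Every (stream, layer, expert) triple a token could be routed to.
    fn total_slots(&self) -> usize {
        self.layers_per_stream * self.experts_per_layer * self.streams
    }

    /// Slots actually executed per token: top_k per layer per stream.
    fn active_slots(&self) -> usize {
        self.top_k * self.layers_per_stream * self.streams
    }

    /// Slots bypassed per token.
    fn skipped_slots(&self) -> usize {
        self.total_slots() - self.active_slots()
    }
}

fn main() {
    let v310 = RoutingShape {
        layers_per_stream: 32,
        experts_per_layer: 12,
        streams: 2,
        top_k: 3,
    };
    println!(
        "total={} active={} skipped={}",
        v310.total_slots(),
        v310.active_slots(),
        v310.skipped_slots()
    );
}
```

Changing `layers_per_stream` to 33 gives the slot total expected after the next depth surgery.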
The Six New Surgeries (S14–S19)
Every surgery since v3.0.0 was triggered by the EvolutionManager's Fibonacci plateau gate. No operator set a schedule. No layer was injected. Each one fired when the gate opened — when the gradient signal stagnated past the generation-3 patience window — and then training resumed on the deeper architecture.
| Surgery | Epoch | From | To | Date | Notes |
|---|---|---|---|---|---|
| S14 | ~ep4280 | 2×256H · 26L | 2×256H · 27L | 2026-05-29 | First post-S13; Gen3 step 2/6; both streams grow simultaneously |
| S15 | ~ep4350 | 2×256H · 27L | 2×256H · 28L | 2026-05-29 | 58 epochs after S14; continued Gen3 descent |
| S16 | ~ep4740 | 2×256H · 28L | 2×256H · 29L | 2026-05-31 | checkpoint-mtime verified |
| S17 | ep5610 | 2×256H · 29L | 2×256H · 30L | 2026-06-06 21:08Z | checkpoint-mtime verified; 870 epochs after S16 |
| S18 | ep6339 | 2×256H · 30L | 2×256H · 31L | 2026-06-14 | clean descent resumed after billing gap |
| S19 | ~ep6500 | 2×256H · 31L | 2×256H · 32L | 2026-06-15 | current depth; EP-AVG ATL descending |
The gap between S17 (ep5610) and S18 (ep6339) reflects a ~1-week training pause during an infrastructure billing migration. The model resumed exactly where it left off. ATL continued descending within 50 epochs of restart. The weights carry no visible scar from the interruption.
Total since v1.0: 19 Net2Net depth surgeries + 1 Cord Surgery = 20 autonomous architectural events. Every layer in this model was placed by the model's own evolution logic, not by a human.
The Training Descent: 9.2 → 5.87
The best EP-AVG ATL recorded for v3.0.0 was 9.2045 (ep4136). Today it is 5.8693 (ep6487). That is a 3.3 nat improvement, reached roughly 2,250 epochs after the v3.0.0 close at ep4234, on the same corpus.
ep4234 → 9.38 (v3.0.0 close, 26L)
ep4740 → S16 fires (29L)
ep5610 → S17 fires (30L)
ep6132 → 6.4339 (prior stated best — superseded)
ep6339 → S18 fires (31L)
ep6478 → 5.9380 (Δ −0.0687 from prior block)
ep6487 → 5.8693 ← new all-time best EP-AVG ATL
ep6500+ → S19 fires (32L); descent continuing
The chip-ATL (best single intra-batch loss): 8.6852 at v3.0.0 close → 1.2637 now. The gradient is negative. The model is not plateauing.
Expert health: dead=0 across the entire post-v3.0.0 run. All 12 experts per layer per stream remain alive and routing. The Mycelium monitor has not had to intervene once.
@sparseskip: 768 Slots, 192 Active, 576 Free
Patent Pending A50296/2026 — TIS platform patent, 10 claims; @sparseskip = Claim 3
At 768 total expert-routing slots and 192 active per token, the @sparseskip skip rate of 75% means 576 expert MLPs are not executed per token. On CPU without any INT8 kernel — pure Rust, raw x86 — this yields 83 tokens/second sustained decode throughput on 2013-era laptop hardware.
The speedup over dense execution at 75% sparsity is confirmed by the SPARSESKIP_METHODOLOGY benchmark (200 warmup + 2000 timed iterations, correctness verified to within 1e-4). This is the mechanism that makes 768 routing slots viable without datacenter infrastructure. 768 possible paths, 192 taken, 576 skipped at zero cost.
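To make the skip mechanism concrete, here is a minimal sketch of Top-3-of-12 routing in which unselected experts are never invoked. It is not the patented @sparseskip implementation: the closures stand in for expert MLPs, the raw gate scores are used as mixing weights without normalization, and every function name is hypothetical.

```rust
/// Indices of the `k` highest gate scores, best first.
fn top_k_indices(scores: &[f32], k: usize) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..scores.len()).collect();
    // Stable descending sort, so ties keep the lower expert index first.
    idx.sort_by(|&a, &b| scores[b].total_cmp(&scores[a]));
    idx.truncate(k);
    idx
}

/// Runs only the selected experts; the rest are never called.
/// Returns the gate-weighted sum and how many experts executed.
fn sparse_forward(
    x: f32,
    scores: &[f32],
    experts: &[Box<dyn Fn(f32) -> f32>],
    k: usize,
) -> (f32, usize) {
    let chosen = top_k_indices(scores, k);
    let out: f32 = chosen.iter().map(|&i| scores[i] * experts[i](x)).sum();
    (out, chosen.len())
}

fn main() {
    // Twelve stand-in experts: expert i multiplies its input by (i + 1).
    let experts: Vec<Box<dyn Fn(f32) -> f32>> = (0..12)
        .map(|i| Box::new(move |x: f32| x * (i as f32 + 1.0)) as Box<dyn Fn(f32) -> f32>)
        .collect();
    let scores = [0.1, 0.9, 0.2, 0.05, 0.7, 0.0, 0.3, 0.8, 0.15, 0.25, 0.05, 0.01];
    let (out, executed) = sparse_forward(1.0, &scores, &experts, 3);
    let skipped = experts.len() - executed;
    println!("output={out:.2} executed={executed} skipped={skipped}");
}
```

With 12 experts and `k = 3`, nine expert calls are avoided per layer per stream, which is the 75% skip rate quoted above.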
Documentation Sweep
Every public-facing document now leads with 768, not 12:
- MODEL_CARD.md — architecture table leads with "768 total expert-routing slots"; training state updated to 32L, ep~6500+, best ATL 5.8693
- albert-moe-13/README.md — MoE section opens with 768; surgery log extended with S18 and S19
- ternlang-root/README.md — Training Progress table and Core Research Dimensions updated
- README.md (root) — "Current state" line: 32L dual-stream, 768 slots, 19 surgeries, ep~6500+, ATL 5.8693
- albert-moe-13/models/README.md — architecture row, version history, surgery log all current
- albert-moe-13/docs/architecture.md — MoE Block section opens with 768/192/576 breakdown
- albert-moe-13/docs/SPARSESKIP_METHODOLOGY.md — opening paragraph now states 768 in the first sentence
- albert-moe-13/models/albert_v3.0.config.json: `num_layers: 32`
- albert-moe-13/models/albert_v3.0.best_loss: updated to current all-time best
What Is Next
The model is at 32L per stream, Gen3 step 1/6, fib_index=7, window=34. S20 will fire when the plateau gate opens. S20 brings both streams to 33L and total expert-routing slots to 792.
The next release milestone is either S20+ firing or EP-AVG ATL breaking below 5.0 — whichever comes first.
Reproduce / Verify
    # Clone and build
    git clone https://github.com/rfi-irfos/ternary-intelligence-stack
    cd ternary-intelligence-stack
    # Run the @sparseskip benchmark
    cd albert-moe-13
    cargo run --release --bin sparseskip_throughput -p moe-llm-core
    # Install the TI...
v3.0.0 — The Cultivated Mind
TIS v3.0.0 — The Cultivated Mind
Released: 2026-05-28
Codename: The Cultivated Mind
Team: RFI-IRFOS · Graz, Austria · Patent Pending A50296/2026
Headline
Version 3.0.0 is the release where the Ternary Intelligence Stack stops being a project and starts being a system. Three weeks after v2.0.0 closed at 5 layers, Albert MoE-13 is now a 26-layer dual-stream organism with 187.5 million parameters that grew its own architecture across 13 autonomous surgeries, performed its own cortical split into two parallel streams without operator intervention, and is governed by a Fibonacci-clocked evolution engine that cycles through generations of increasingly patient growth. The companion CLI — albert-cli — shipped a full ratatui terminal interface, complete MCP server discovery, real hook execution, live streaming markdown, animated indicators, and now installs cleanly from crates.io at version 1.5.0 alongside 24 sibling crates that together comprise the published Ternary Intelligence Stack.
This is the largest release in TIS history by every measure: lines of code shipped, layers grown, surgeries performed, crates published, bugs eliminated, contributors onboarded, and inches of scientific evidence accumulated. It is also the release where the stack itself becomes the headline. The model is the vehicle; the proof is the cargo.
This document is long because the story is long. There is no executive summary. The story is the summary.
Albert MoE-13 — From 5L to 26L Dual-Stream
Architectural progression since v2.0.0
When v2.0.0 shipped on 2026-05-06, Albert was a 5-layer model with a single 256-hidden stream and approximately 22 million parameters. The EvolutionManager had just performed its first autonomous surgery the morning of release: 3L → 5L. We knew at that moment that the architecture was capable of growing itself; we did not yet know how far it would go.
The answer: twenty-one additional layers, one fundamental architectural mutation, and a stack that is now stratified into 14 corpus stages — in three weeks.
| Metric | v2.0.0 (2026-05-06) | v3.0.0 (2026-05-28) | Delta |
|---|---|---|---|
| Depth | 5 layers, single stream | 26 layers, dual stream (2×26L) | +21 layers, +1 stream |
| Hidden size | 256 | 2 × 256 | +1 stream |
| Total parameters | ~22M | 187.5M | ~8.5× |
| Active parameters per token | ~7.6M | ~33M | ~4.3× |
| Tensors | ~280 | 2,044 | ~7.3× |
| Surgeries performed | 1 (S0/init) | 13 (S1–S13) + CORD | 14 autonomous events |
| Corpus stages unlocked | 5 | 14 (formal_proofs, news_archives, stackexchange newly online) | |
| ATL chip (intra-batch best) | ~9.5 | 8.6852 | new all-time-low post-cord |
| EP_AVG ATL (epoch best) | ~9.8 | 9.2045 (ep4136, 23L era) | 0.6 nat improvement |
| Latest epoch | ep~250 | ep4234 | ~17× more training |
| Cumulative compute | local + Modal pilot | Modal T4 + Vertex AI T4 + CPU contributors | |
| Total spend (entire v3.0 run) | n/a | ~$265 on Modal | sovereignty headline |
The training run reflected in this release was performed primarily on a rented Modal.com T4 GPU at a total compute cost of approximately $265 across the entire v3.0 era. A fallback Vertex AI pipeline (GCP, europe-west4) is fully built and waiting. CPU contributors run a parallel federated lane via the SPORE protocol on consumer ThinkPads.
The thirteen surgeries
Every surgery in v3.0 was triggered by the EvolutionManager autonomously — no operator pressed "grow." The trigger is a Fibonacci-windowed plateau detector: when the model has not improved by more than a generation-scaled threshold over the current window of epochs, and the mycelium routing has been stable for at least 5 epochs, the manager clones the deepest layer via Net2Net safe-copy initialization and promotes fib_index. Window and cooldown both scale with the same Fibonacci sequence — small early surgeries with short patience, large late surgeries with long patience.
| # | Surgery | Layers | Epoch | Date | Note |
|---|---|---|---|---|---|
| S1–S5 | bootstrap arc | 12L → 17L | ep511 → ep702 | 2026-05-08 → 2026-05-13 | Read B confirmed; PLN-CMP-INT-ABS core four stable across all 5 |
| S6 | depth | 17L → 18L | ep2487 | 2026-05-20 | first surgery after expert geometry crystallization |
| S7 | depth | 18L → 19L | ep3325 | 2026-05-21 | |
| S8 | depth | 19L → 20L | ep3383 | 2026-05-21 | back-to-back; window=21 |
| S9 | depth | 20L → 21L | ep~3470 | 2026-05-22 | |
| S10 | depth | 21L → 22L | ep~3652 | 2026-05-23 | window=34 era |
| S11 | depth | 22L → 23L | ep~4098 | 2026-05-26 | |
| S11b | depth | 23L → 24L | ep~4140 | 2026-05-26 | rapid-fire; INT+CMP maxed |
| S12 | depth + CORD | 24L → 25L + dual stream | ep4202 | 2026-05-27 16:43Z | The cortical split. See below. |
| S13 | depth (both streams) | 25L → 26L | ep4207 | 2026-05-27 17:40Z | first dual-stream depth surgery; fib_index 6 → 7 |
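The plateau trigger that fired each of these surgeries can be sketched as follows. This is one plausible reading of the description above, not the code in `evolution.rs`: the function name, the use of best-loss-in-window as the improvement measure, and the argument layout are all assumptions. Only the 5-epoch routing-stability requirement and the windowed threshold test come from the text.

```rust
/// Windowed plateau check. The gate opens when the best loss inside the
/// most recent `window` epochs improved on the best loss before the window
/// by no more than `threshold`, and routing has been stable long enough.
/// Names and the exact comparison are assumptions for illustration.
fn plateau_gate_open(
    epoch_losses: &[f64],
    window: usize,
    threshold: f64,
    stable_routing_epochs: usize,
) -> bool {
    const MIN_STABLE_EPOCHS: usize = 5; // stated in the release notes
    if window == 0
        || epoch_losses.len() <= window
        || stable_routing_epochs < MIN_STABLE_EPOCHS
    {
        return false;
    }
    let (before, recent) = epoch_losses.split_at(epoch_losses.len() - window);
    let best_before = before.iter().copied().fold(f64::INFINITY, f64::min);
    let best_recent = recent.iter().copied().fold(f64::INFINITY, f64::min);
    // Improvement is how far the best loss dropped inside the window.
    let improvement = best_before - best_recent;
    improvement <= threshold
}

fn main() {
    let losses = [9.50, 9.40, 9.39, 9.385, 9.39];
    let open = plateau_gate_open(&losses, 3, 0.02, 6);
    println!("gate open: {open}");
}
```

In the real engine the window length and threshold both change as `fib_index` and the generation advance; here they are plain arguments.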
CORD SURGERY — the cortical split
The most significant single architectural event in v3.0 happened on 2026-05-27 at 16:43Z. The model had just expanded to 25 layers via S12. At that exact moment, a second autonomous trigger fired: the CORD surgery.
The CORD trigger is the one mutation in the entire evolution.rs codebase that is gated on absolute depth rather than plateau. The condition is simple: num_layers >= 25. When the model crossed that threshold, the EvolutionManager grew a second parallel stream — a duplicate 25-layer 256-hidden lane — and inserted 6 anastomosis gates between the two streams at Fibonacci-indexed layers [2, 3, 5, 8, 13, 21]. Each gate is a Linear(512, 2) block with w ~ N(0, 0.01) and b = 0 — initialized to near-zero, allowed to learn its own mixing coefficient.
The second stream (stream B) is not a copy. Stream A continues to receive the original input embeddings; stream B receives the same embeddings perturbed by a Mandelbrot iteration with c_im computed from the input's latitude in the embedding space. The two streams run forward in parallel, exchange information at the six anastomosis gates, and produce a single set of routing decisions at the output.
This is the first time, anywhere in the history of the project, that the model performed an architectural mutation that was not depth-related. It is also the first dual-stream MoE we are aware of in published or unpublished neural architecture literature. The post-cord epoch (ep4203) returned an EP_AVG of 9.3241 — slightly above the pre-cord floor, exactly as expected for a regression phase after a large architectural mutation. By ep4210, the chip ATL had set a new all-time low at 8.6852, well below the pre-cord record.
The cord is, in the language of cultivation, the model's corpus callosum. It connects two hemispheres of the same mind.
Read B — architecture precedes learning (confirmed empirically)
Across all five bootstrap surgeries from 12L to 17L, the core four experts — PLN (planning), CMP (composition), INT (interpretation), ABS (abstraction) — remained the dominant active set. The surgery sequence did not destabilize semantic specialization. The model grew, the experts kept their identities.
This empirically resolves the Read A vs. Read B question that opened v3.0: architecture grows ahead of learning (Read B), not in compensation for it (Read A). The surgeries were purposeful structural investments. Capacity was built before it was filled.
The Token Probe Benchmark (scripts/probe_tokens.py) tracks longitudinal semantic geometry on 10 canonical tokens (love, god, Jesus, death, war, truth, freedom, mother, light, time). The pre-s6 and pre-s7 snapshots (230 epochs apart, 0.0947 nat of loss improvement between them) returned identical top-5 neighbors for all 10 tokens. Semantic geometry crystallized at 17L. Subsequent loss improvements came from routing/expert/attention reorganization — not from embedding restructure.
As far as we know, this is the first longitudinal study of semantic geometry stability under autonomous architectural mutation.
The Evolution Engine
Generational Fibonacci cycling
Shipped 2026-05-19 in a complete rewrite of evolution.rs. The system now operates in generations rather than a single linear surgery sequence.
- `ARC_LENGTH = 6` surgeries per generation. After 6 surgeries, the generation closes and `gen_val` advances.
- Threshold scaling: Generation 0 plateau threshold is 0.020 nats. Each subsequent generation multiplies by 0.75 (floor: 0.008 nats). Later generations require finer-grained plateau detection: the model gets harder to surprise.
- `FIB_TARGETS` extended to 24 terms: `[3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765, 10946, 17711, 28657, 46368, 75025, 121393, 196418]`. There is no architecture-ceiling clamp in code. The comment in `evolution.rs` reads: "The hardware is the only limit."
- `GENERATION_TIMEOUT_MULTIPLIER = 3`: if a generation gets stuck on a single Fibonacci window for more than 3× the window length, a stuck-generation fallback fires.
- `promote_fib_target(post_layers)` reads the actual post-surgery layer count from `config.json` rather than guessing. This is a critical fix for resumption after manual layer edits.
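The threshold scaling and the stuck-generation timeout can be written down in a few lines. This sketch takes the stated rule literally, with generations counted from 0; the constant names other than `GENERATION_TIMEOUT_MULTIPLIER` are invented here, and the real `evolution.rs` may index or clamp differently.

```rust
const GENERATION_TIMEOUT_MULTIPLIER: u32 = 3; // named in the release notes
const BASE_THRESHOLD: f64 = 0.020; // Generation 0 threshold in nats (name assumed)
const DECAY: f64 = 0.75; // per-generation multiplier (name assumed)
const FLOOR: f64 = 0.008; // lower bound in nats (name assumed)

/// Plateau threshold for a generation: base * 0.75^generation, floored.
fn plateau_threshold(generation: u32) -> f64 {
    (BASE_THRESHOLD * DECAY.powi(generation as i32)).max(FLOOR)
}

/// True once a generation has spent more than 3x the window length
/// on a single Fibonacci window.
fn is_stuck(epochs_on_window: u32, window: u32) -> bool {
    epochs_on_window > GENERATION_TIMEOUT_MULTIPLIER * window
}

fn main() {
    for generation in 0..=4 {
        println!("gen {generation}: threshold {:.4} nats", plateau_threshold(generation));
    }
    println!("stuck after 103 epochs on window 34: {}", is_stuck(103, 34));
}
```

Under this reading the floor takes over from the fourth multiplication onward, since 0.020 × 0.75⁴ is already below 0.008.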
Current state at release: Generation 3, step 1/6, fib_index = 7, window = 34 epochs, threshold ~0.015 nats. Five surgeries remain in the Gen 3 arc. The first GENERATION_COMPLETE event for Gen 3 will fire after surgery #6 of this generation.
A known Rust 2024 compatibility fix was applied: gen is now a reserved keyword and was renamed to gen_val / g in two places. This would have produced a silent compile failure on Modal without the rename.
WALD detector
WALD (Weighted Asymmetric Loss Divergence) is a reactive signal — it detects when intra-epoch loss variance crosses a tunable threshold (n = 1500 by default). Empirically validated across the v3.0 era: **...
Albert MoE-13 Benchmark Suite v2.0.0
Albert MoE-13 — TIS Benchmark Suite
One-line install and run:
Linux / macOS / Android (Termux)
curl -fsSL https://raw.githubusercontent.com/eriirfos-eng/ternary-intelligence-stack/main/albert-moe-13/benchmarks/install.sh | sh
Windows: PowerShell (open Start → search "PowerShell")
irm https://raw.githubusercontent.com/eriirfos-eng/ternary-intelligence-stack/main/albert-moe-13/benchmarks/install.ps1 | iex
Windows — Command Prompt / cmd.exe (if PowerShell is not your default shell)
powershell -ExecutionPolicy Bypass -Command "irm https://raw.githubusercontent.com/eriirfos-eng/ternary-intelligence-stack/main/albert-moe-13/benchmarks/install.ps1 | iex"
What it measures
| # | Benchmark | Output |
|---|---|---|
| 1 | Inference speed | tok/s, ms/tok, GFLOPS |
| 2 | @sparseskip routing | active/skipped experts, compute ratio |
| 3 | Perplexity | avg cross-entropy, PPL on WikiText-2 (150KB, held-out) |
Results exported to albert_bench_results.csv automatically.
Architecture
- 17L × 12E × 256H: native ternary MoE, weights in `{-γ, 0, +γ}` throughout
- @sparseskip Top-3/12: 75% of experts skipped per decode step (Patent A50296/2026)
- Ternary Traffic Light Routing: per-expert trit execution budget
Platform support
| Platform | Method |
|---|---|
| Linux x86_64 | pre-built binary |
| Windows x86_64 | pre-built binary |
| macOS (any) | builds from source (~5-10 min, requires Xcode CLT) |
| Linux ARM64 | builds from source (~10 min) |
| Android / Termux | builds from source (~10-20 min) |
Notes
- This release packages the v2.0.0 checkpoint (pre-surgery baseline)
v2.0.0 — The Autonomous Substrate
TIS v2.0.0 — The Autonomous Substrate
Released: 2026-05-06
Codename: The Autonomous Substrate
Team: RFI-IRFOS · Graz, Austria · Patent Pending A50296/2026
Overview
Version 2.0.0 marks the transition of the Ternary Intelligence Stack from a compiler and toolchain project into a self-evolving, autonomously training neural intelligence platform. This release is the first in TIS history where the AI component — Albert MoE-13 — expanded its own architecture without human intervention, trained continuously across 16 global epochs, and demonstrated expert-level pattern mastery on a consumer CPU manufactured in 2013.
The platform simultaneously matured on every axis: the public interface went fully bilingual, the MCP tool surface reached 34 free tools with live Smithery integration, the codebase was restructured for long-term clarity, and the scientific evidence base for ternary hardware advantage was independently verified and documented for SPRIND submission.
This is not an incremental release. This is the first time the stack behaved like a system that knows what it is.
Albert MoE-13 v2.0.0 — Architecture Overhaul
Dimensional Scale-Up
The core model architecture was substantially upgraded from the v1.3.7 baseline:
| Parameter | v1.3.7 | v2.0.0 |
|---|---|---|
| Hidden size | 96 | 256 |
| Experts per layer | 8 | 12 |
| Routing | Top-2 sparse | Top-3 sparse |
| Total parameters | ~3.5M | ~22M |
| Active params/token | ~1.2M | ~7.6M |
| Checkpoint size | ~14MB | ~91MB |
The 256-hidden configuration widens the hidden state from 96 to 256 (about 2.7×), increasing representational capacity while maintaining the ternary weight format throughout. All 22 million parameters are stored as discrete values in {−1, 0, +1}. The checkpoint is held in float32 on disk for training compatibility; a packed ternary deployment artifact would reduce this to approximately 6MB.
The EvolutionManager
The most significant architectural addition in v2.0.0 is the EvolutionManager — an autonomous state machine that monitors training dynamics and triggers Net2Net safe-copy layer surgery without human intervention.
The surgery protocol:
- Monitor epoch-over-epoch loss delta against configured plateau and mastery thresholds
- On trigger: clone the deepest layer into a new layer N+1, initializing weights via safe-copy (identity-preserving, no loss spike)
- Resume training with the expanded architecture; the model continues from where it was rather than restarting
On 2026-05-06, the EvolutionManager triggered the first autonomous surgery: 3L → 5L depth expansion. Checkpoint grew from 41MB to 91MB. The transition preserved all existing linguistic structure accumulated across prior epochs. This represents the first time an Albert model expanded its own cognitive depth without operator input.
Training Efficiency: The 25× Speedup
A critical bug was identified and corrected in the gradient accumulation loop. The optimizer's backward_step() was being called once per micro-batch (16 times per logged batch) rather than once per full accumulation cycle. This produced 16 separate optimizer updates per batch with noisy, under-accumulated gradients.
Effect of fix:
- Batch time: ~50s → ~2s (~25× wall-clock improvement)
- Loss values corrected from inflated ~112 (sum of 16 step losses) to real cross-entropy ~6.5–7.0
- Gradient signal quality: qualitatively improved — smoother descent curve from epoch 1
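The difference between the buggy loop and the corrected one is easiest to see on a toy problem. The sketch below uses a one-parameter model with plain SGD rather than the project's optimizer; both function names are hypothetical, and the 16 micro-batches mirror the accumulation depth described above.

```rust
/// Toy 1-parameter model: loss = (w*x - y)^2, gradient = 2*(w*x - y)*x.
fn grad(w: f64, x: f64, y: f64) -> f64 {
    2.0 * (w * x - y) * x
}

/// Corrected pattern: accumulate over all micro-batches, then update once.
/// Assumes `micro` is non-empty.
/// Returns (updated weight, number of optimizer steps taken).
fn train_batch_accumulated(w: f64, micro: &[(f64, f64)], lr: f64) -> (f64, usize) {
    let mut acc = 0.0;
    for &(x, y) in micro {
        acc += grad(w, x, y); // accumulate only, no update yet
    }
    let mean_grad = acc / micro.len() as f64;
    (w - lr * mean_grad, 1) // exactly one optimizer step per cycle
}

/// Buggy pattern: an optimizer step after every micro-batch.
fn train_batch_buggy(mut w: f64, micro: &[(f64, f64)], lr: f64) -> (f64, usize) {
    let mut steps = 0;
    for &(x, y) in micro {
        w -= lr * grad(w, x, y);
        steps += 1;
    }
    (w, steps)
}

fn main() {
    let micro = vec![(1.0, 2.0); 16];
    let (w_ok, steps_ok) = train_batch_accumulated(0.0, &micro, 0.1);
    let (w_bug, steps_bug) = train_batch_buggy(0.0, &micro, 0.1);
    println!("accumulated: w={w_ok:.3} after {steps_ok} step");
    println!("per-micro-batch: w={w_bug:.3} after {steps_bug} steps");
}
```

The same structure explains the inflated loss figure: summing 16 per-step losses of about 7.0 gives roughly 112, while averaging them recovers the real cross-entropy.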
Ternary Training Innovations
Three training-time optimizations were implemented that reduce compute overhead without affecting model quality:
L1 Sparsity Regularization (λ = 1e-5): Added to the training loss. Rather than quantizing to ternary post-hoc, the model learns where to be sparse during training, producing sparser expert weights with higher signal concentration.
Per-Layer Threshold Gradient: Layer 0 operates with threshold = 0.01 (dense, learns syntax and surface structure). The deepest layer operates at threshold ≥ 0.03 (sparse, learns higher-order abstractions). Intermediate layers interpolate. This mirrors the known structure of biological neural depth specialization.
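A minimal sketch of the per-layer threshold schedule, assuming the interpolation between layer 0 and the deepest layer is linear (the text only says intermediate layers "interpolate"). The `ternarize` helper shows one common way a threshold produces sparsity, zeroing weights whose magnitude falls at or below it; whether the project applies the threshold exactly this way is an assumption, as are both function names.

```rust
/// Threshold for `layer` in a stack of `num_layers`, linearly interpolated
/// from `first` (layer 0) to `last` (deepest layer). Linear is an assumption.
fn layer_threshold(layer: usize, num_layers: usize, first: f32, last: f32) -> f32 {
    if num_layers <= 1 {
        return first;
    }
    let t = layer as f32 / (num_layers - 1) as f32;
    first + t * (last - first)
}

/// Maps a latent weight to {-gamma, 0, +gamma}; larger thresholds zero more weights.
fn ternarize(w: f32, threshold: f32, gamma: f32) -> f32 {
    if w > threshold {
        gamma
    } else if w < -threshold {
        -gamma
    } else {
        0.0
    }
}

fn main() {
    let num_layers = 5;
    for layer in 0..num_layers {
        let th = layer_threshold(layer, num_layers, 0.01, 0.03);
        // The same latent weight survives in shallow layers and is zeroed deeper.
        println!("layer {layer}: threshold {th:.3}, ternarize(0.02) = {}", ternarize(0.02, th, 1.0));
    }
}
```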
Gamma Cache in TernaryLinear: The mean activation mean_all() — previously recomputed on every forward call — is now cached and refreshed every 20 forward calls. This eliminates 36+ redundant full-weight reductions per batch at 256H. Causal attention mask is cached in Attention via RefCell and rebuilt only on sequence length change.
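The caching pattern can be sketched with interior mutability so that a read-only `forward` can still refresh the cached value. This is an illustration, not the `TernaryLinear` code: it uses `Cell` on a plain slice instead of a tensor, treats gamma as the mean absolute weight (an assumption), and all type and method names are hypothetical. Only the refresh interval of 20 calls comes from the text.

```rust
use std::cell::Cell;

/// Caches a scale factor and recomputes it only every `refresh_every` calls.
struct GammaCache {
    value: Cell<f32>,
    calls: Cell<u32>,
    refresh_every: u32,
    recomputes: Cell<u32>,
}

impl GammaCache {
    fn new(refresh_every: u32) -> Self {
        GammaCache {
            value: Cell::new(0.0),
            calls: Cell::new(0),
            refresh_every: refresh_every.max(1), // avoid modulo by zero
            recomputes: Cell::new(0),
        }
    }

    /// Returns the cached gamma, refreshing it on call 0, N, 2N, ...
    fn get(&self, weights: &[f32]) -> f32 {
        let n = self.calls.get();
        if n % self.refresh_every == 0 {
            self.value.set(mean_abs(weights));
            self.recomputes.set(self.recomputes.get() + 1);
        }
        self.calls.set(n + 1);
        self.value.get()
    }
}

/// Full reduction over the weights; this is the work the cache avoids.
fn mean_abs(w: &[f32]) -> f32 {
    if w.is_empty() {
        return 0.0;
    }
    w.iter().map(|v| v.abs()).sum::<f32>() / w.len() as f32
}

fn main() {
    let weights = [1.0, -1.0, 0.5, -0.5];
    let cache = GammaCache::new(20);
    for _ in 0..40 {
        cache.get(&weights);
    }
    println!("gamma={} recomputes={}", cache.get(&weights), cache.recomputes.get());
}
```

The trade-off is staleness: between refreshes the cached gamma lags the weights by up to 19 forward calls.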
Training Results (16 Global Epochs, 2026-05-06)
| Metric | Value |
|---|---|
| Global epochs completed | 16 (3L phase, ongoing) |
| Total batches logged | 4,864 |
| Epoch 1 average loss | 8.3784 |
| Epoch 15 average loss | 6.5213 |
| Current average loss | ~6.67 |
| All-time best loss (single batch) | 2.1353 (Epoch 9) |
| Loss reduction over 15 epochs | −1.857 (−22.2%) |
| Average epoch duration | ~12–13 min |
| Hardware | HP ZBook (2013), Intel CPU, no GPU |
The all-time best of 2.1353 (Epoch 9, single batch) corresponds to an effective perplexity of approximately 8.5 on that batch (e^2.1353 ≈ 8.46). It was achieved when a specialized expert's routing aligned closely with a repetitive Biblical passage. This is evidence that individual MoE experts have reached pattern mastery on specific token distributions.
The loss variance pattern across epochs is characteristic of healthy MoE training: variance expands as experts differentiate (Epochs 4–10, stddev ~0.54) then partially compresses as routing stabilizes (Epoch 11, stddev 0.45). The floor of achievable loss per batch has monotonically decreased, with low-loss breakout events (<5.0) increasing from 0 in Epochs 1–3 to 6 in Epoch 15 — evidence that expert specialization is broadening across the corpus.
Multi-Corpus Training Pipeline
The training harness was refactored to auto-discover all .txt files in data/corpus/ via load_corpus(). The corpus downloader (scripts/download_corpus.py) fetches:
- Simple English Wikipedia — general world knowledge foundation
- 12 Project Gutenberg classics — Moby Dick, War and Peace, Ulysses, Crime and Punishment, and others
- EU AI Act — legal and regulatory reasoning corpus
- Linux kernel documentation — technical and systems reasoning
- TLDR Unix command pages — concise engineering command patterns
All staged to data/corpus_staged/ and gated for post-Bible convergence. The pipeline is drop-and-train: any .txt file placed in data/corpus/ is picked up on next restart without code changes.
Platform: 34 Free MCP Tools on Smithery
The public MCP surface grew from 30 to 34 tools, all free. Four LLB tools were added to the live HTTP endpoint:
- `llb_check`: deterministic filesystem policy evaluation
- `llb_classify`: path tier classification
- `llb_validate`: pre-write safety gate
- `llb_write_safe`: advisory trit=0 local-only verdict
smithery.yaml was upgraded to v1.2.0. The startCommand was migrated from stdio to http, pointing to the live ternlang-api endpoint on Fly.io. Smithery now reads the tool manifest dynamically via POST /mcp tools/list rather than from a static file. Tool count is the authoritative live value at all times.
The ternlang-mcp stdio server was reconciled: 11 tools existed in smithery.yaml but were missing from src/main.rs. All 11 were implemented as stateless pure-logic handlers:
trit_upgrade, trit_mem_write, trit_mem_read, trit_mem_consolidate, trit_mem_stats, trit_mem_compress, trit_compress, trit_triage, trit_plan, trit_factcheck, moe_full
The #![recursion_limit = "512"] attribute was added to resolve a json! macro depth overflow caused by 34 simultaneous schema definitions in a single call.
LLB — Last Look Back Safety Protocol
The Last Look Back (LLB) protocol was formally implemented and published as a standalone MCP server (albert-llb-mcp). LLB is a deterministic, policy-based filesystem safety gate — a veto layer that evaluates write operations against a configurable rule set before execution.
Key design properties:
- Stateless: no runtime dependency on model inference
- Deterministic: identical input always produces identical verdict
- Ternary-native: verdict format is `trit ∈ {−1, 0, +1}` (reject / hold / affirm)
- MCP-compatible: exposed as stdio server with `smithery.yaml` for Smithery discovery
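The properties listed above can be illustrated with a small stateless gate. This is not the `albert-llb-mcp` implementation: the rule format (path prefix to verdict), the first-match policy, and the choice to hold unknown paths are all assumptions made for the example. Only the three-valued verdict comes from the text.

```rust
/// Ternary verdict: reject (-1), hold (0), affirm (+1).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Verdict {
    Reject = -1,
    Hold = 0,
    Affirm = 1,
}

/// A rule maps a path prefix to a verdict. Hypothetical rule format.
struct Rule {
    prefix: &'static str,
    verdict: Verdict,
}

/// First matching rule wins; paths matching no rule are held, not affirmed.
/// Pure function of its inputs, so identical input gives an identical verdict.
/// No path canonicalization is done here; a real gate would need it.
fn evaluate_write(path: &str, rules: &[Rule]) -> Verdict {
    rules
        .iter()
        .find(|r| path.starts_with(r.prefix))
        .map(|r| r.verdict)
        .unwrap_or(Verdict::Hold)
}

fn main() {
    let rules = [
        Rule { prefix: "/etc/", verdict: Verdict::Reject },
        Rule { prefix: "/home/user/project/", verdict: Verdict::Affirm },
    ];
    for path in ["/etc/passwd", "/home/user/project/notes.txt", "/tmp/scratch"] {
        let v = evaluate_write(path, &rules);
        println!("{path} -> {:?} (trit {})", v, v as i8);
    }
}
```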
LLB represents RFI-IRFOS's commitment to the principle that safety gates in sovereign AI systems must be auditable, reproducible, and independent of the model they protect.
TernStudio — Liquid Time and the Ternary Actuator Protocol
TernStudio received its most substantial update since launch. Two foundational systems were added:
Ternary Actuator Protocol (TAP): Autonomous tool-call interception layer. Nodes operating in State 0 (uncertainty) are suspended until a human or upstream agent resolves the signal. TAP enforces the ternary principle at the workflow execution level: nothing proceeds on ambiguity.
Liquid Time + Global Clock: A DAW-style multiverse timeline scrubber with smooth linear playhead, continuous dot interpolation, and synchronized visual/logic clock decoupling. Users can scrub backward through simulation history and resume from any point without state corruption.
Additional: Pyodide WASM sandbox for active .tern execution inside Studio, YAML state-transport layer for result artifacts across sessions, Translator micro-frontend iframe bridge, WebSocket real-time execution telemetry, and the 7-category agent taxonomy accordion.
Phase 17 — BET VM in the Browser
The Binary Encoded Ternary VM was compiled to WebAssembly and is now executing live in the browser via TernStudio. Users can write, compile, and run .tern programs without any local installation. The WASM build is embedded at compile time via include_str! in ternlang-api and served directl...
v1.3.5 — Technical Synchronization & Neural Transition
Technical Release Report: v1.3.5 Synchronization and Neural Transition (Hardened)
1. Release Overview
[VERIFIED] Version 1.3.5 documents the architectural shift from symbolic deterministic logic to a preliminary neural-native ternary Transformer prototype. This update synchronizes 100+ crates to a unified baseline (Rust Edition 2024, v1.3.5) and establishes the experimental foundation for Mixture-of-Experts (MoE) training on ternary manifolds. This release serves as a technical stabilization point for the SPRIND Next Frontier AI evaluation.
2. Verified System Components
The "Great Release" — Massive-Scale Open StdLib
- [VERIFIED] Ecosystem Rebalancing: Transitioned 28,000+ proprietary modules to the Tier 1 Open Core standard library. [MEASURED] Total open-access library size: 28,500+ `.tern` modules.
- [VERIFIED] API Gating Removal: De-restricted the `ternlang-api` stdlib handlers to allow universal Tier 1 read access to all domain foundations (ML, Finance, Causal, Science).
- [VERIFIED] Onboarding Synchronization: Updated the `trit_upgrade` tool with synced pricing and higher quotas for Tier 2 (Pro) and Tier 3 (Industrial).
Neural Compute Backend (moe-llm-core)
- [EXPERIMENTAL] Implemented Primitives: Preliminary `Embedding`, `Linear`, and `Attention` layers optimized for the `candle` tensor framework. [MEASURED] Current `Transformer` configuration: `vocab_size: 8000`, `hidden_size: 512`, `num_heads: 8`.
- [EXPERIMENTAL] Ternary Compatibility: Integrated Straight-Through Estimators (STE) for discrete ternary weight optimization. [MEASURED] Verified on N=2048 parameters; loss convergence from 1.0 to 0.0 observed at epoch 50 in isolation.
- Location: `albert-moe-13/moe-llm-core/`
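For readers unfamiliar with Straight-Through Estimation, the following is a minimal sketch of the idea in plain Rust, without `candle`. It is not the `moe-llm-core` implementation: the function names, the `threshold` and `gamma` values, and the one-weight squared-error loss are all assumptions chosen for illustration. The forward pass uses the quantized weight; the backward pass treats the quantizer as the identity, so the gradient updates the latent float weight directly.

```rust
/// Quantize a latent float weight to {-gamma, 0, +gamma}.
/// `threshold` and `gamma` are hypothetical hyperparameters.
fn ternarize(w: f32, threshold: f32, gamma: f32) -> f32 {
    if w > threshold {
        gamma
    } else if w < -threshold {
        -gamma
    } else {
        0.0
    }
}

/// Straight-through estimator: the quantizer's derivative is treated as 1,
/// so the gradient w.r.t. the quantized weight is passed through unchanged.
fn ste_grad(grad_wrt_quantized: f32) -> f32 {
    grad_wrt_quantized
}

/// One SGD step on the toy loss (q * x - y)^2, where q = ternarize(w).
fn sgd_step(w: f32, x: f32, y: f32, lr: f32, threshold: f32, gamma: f32) -> f32 {
    let q = ternarize(w, threshold, gamma);
    let grad_q = 2.0 * (q * x - y) * x; // dL/dq
    w - lr * ste_grad(grad_q)
}

fn main() {
    let (threshold, gamma, lr) = (0.5_f32, 1.0_f32, 0.1_f32);
    let mut w = 0.0_f32; // latent weight starts inside the zero band
    for epoch in 0..20 {
        w = sgd_step(w, 1.0, 1.0, lr, threshold, gamma);
        let q = ternarize(w, threshold, gamma);
        println!("epoch {epoch}: latent w = {w:.3}, ternary weight = {q}");
    }
}
```

In this toy setup the latent weight drifts out of the zero band until the ternary weight flips to `+gamma`, at which point the loss and the gradient are both zero. Without the pass-through, the gradient of the step function would be zero almost everywhere and the latent weight would never move.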
Runtime Orchestrator (moe-test)
- [VERIFIED] Functionality: Interactive REPL for real-time inference testing and autoregressive sampling.
- [VERIFIED] State: Verified execution of token-level generation loops with EOF handling.
- Location: `albert-moe-13/moe-test/`
Filesystem Containment (moe-llb)
- [VERIFIED] Protocol: "Last Look Back" deterministic gate for agentic filesystem operations.
- Location: `agent_albert_cli/rust/crates/moe-llb/`
Triadic Data Packing (ternlang-core)
- [MEASURED] Specification: 5-trit block packing into 8-bit storage (ExaTern) achieving 99.06% storage efficiency.
- [VERIFIED] Implementation: Verified `pack`/`unpack` primitives with 100% round-trip integrity.
- Location: `ternlang-root/compiler/legacy_shim/ternlang-core/src/types/trit.rs`
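The 99.06% figure follows from the arithmetic: five trits carry log2(3^5) ≈ 7.925 bits of information, stored in 8 bits. The sketch below shows one way such a packing can work, encoding five balanced trits as a base-3 number in a single byte. It is an illustration only; the names `pack5` and `unpack5`, the `i8` trit representation, and the digit order are assumptions, not the ExaTern layout in `trit.rs`.

```rust
/// A balanced trit stored as -1, 0, or +1 (assumed representation).
type Trit = i8;

/// Pack 5 trits into one byte as a base-3 number.
/// 3^5 = 243 distinct values fit into the 256 values of a u8.
fn pack5(trits: [Trit; 5]) -> u8 {
    let mut value: u8 = 0;
    // Iterate in reverse so that trits[0] becomes the least significant digit.
    for &t in trits.iter().rev() {
        debug_assert!((-1..=1).contains(&t));
        value = value * 3 + (t + 1) as u8; // map {-1, 0, 1} -> {0, 1, 2}
    }
    value
}

/// Inverse of `pack5`: read base-3 digits, least significant first.
fn unpack5(mut byte: u8) -> [Trit; 5] {
    let mut trits = [0; 5];
    for slot in trits.iter_mut() {
        *slot = (byte % 3) as i8 - 1;
        byte /= 3;
    }
    trits
}

fn main() {
    let block: [Trit; 5] = [1, 0, -1, 1, -1];
    let byte = pack5(block);
    assert_eq!(unpack5(byte), block);

    // Information content of 5 trits divided by the 8 bits used to store them.
    let efficiency = 243f64.log2() / 8.0 * 100.0;
    println!("packed byte = {byte}, storage efficiency = {efficiency:.2}%");
}
```

Looping over all 243 valid byte values and checking `pack5(unpack5(b)) == b` is a cheap way to confirm round-trip integrity for a scheme like this.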
3. Experimental Features (Clearly Marked)
Differentiable MoE Router
- Status: [EXPERIMENTAL] / Not yet validated at scale.
- Details: Prototype 13-expert router with learned gating and load-balancing telemetry. [NOT YET MEASURED] Load balancing entropy across 13 experts.
- Location: `albert-moe-13/crates/moe-core/src/core/router.rs`
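To make "learned gating" and "load-balancing entropy" concrete, here is a small, self-contained sketch of softmax gating with Top-k selection and an entropy measurement over expert load. It does not reproduce `router.rs`. The function names, the choice of k = 2, and the synthetic logits are assumptions for illustration; the release does not state the k used by the 13-expert prototype.

```rust
/// Numerically stable softmax over raw gate logits.
fn softmax(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|&l| (l - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// Indices of the k experts with the highest gate probability.
fn top_k(probs: &[f32], k: usize) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..probs.len()).collect();
    idx.sort_by(|&a, &b| probs[b].total_cmp(&probs[a])); // descending
    idx.truncate(k);
    idx
}

/// Shannon entropy (bits) of the expert load distribution.
/// The maximum is log2(number of experts); 0 means one expert takes everything.
fn load_entropy_bits(counts: &[u64]) -> f64 {
    let total: u64 = counts.iter().sum();
    if total == 0 {
        return 0.0;
    }
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / total as f64;
            -p * p.log2()
        })
        .sum()
}

fn main() {
    const EXPERTS: usize = 13;
    const K: usize = 2; // hypothetical; not taken from the release notes
    let mut load = [0u64; EXPERTS];

    // Synthetic gate logits for 26 tokens; each token favours a different expert.
    for token in 0..26usize {
        let logits: Vec<f32> = (0..EXPERTS)
            .map(|e| if e == token % EXPERTS { 2.0 } else { 0.1 * e as f32 })
            .collect();
        for expert in top_k(&softmax(&logits), K) {
            load[expert] += 1;
        }
    }

    println!("expert load: {load:?}");
    println!(
        "load entropy: {:.3} bits (max {:.3})",
        load_entropy_bits(&load),
        (EXPERTS as f64).log2()
    );
}
```

An entropy close to log2(13) would indicate balanced routing, while a value near zero would indicate the single-expert dominance that the release lists as unmeasured.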
Sparsity-Based Throughput Projections
- Status: [THEORETICAL] / Upper Bounds.
- Details: Projections based on `@sparseskip` opcode performance in specialized ternary hardware simulators.
  - 25% Sparsity: 53.1x projected throughput.
  - 99% Sparsity: 122.3x projected throughput.
Copernicus-v1 Training
- Status: [EXPERIMENTAL] / Under Development.
- Details: Initial training experiments on the King James Bible corpus ([MEASURED] 824,543 tokens) to validate STE convergence behavior.
4. Current Limitations
- Model Scale: [THEORETICAL] No large-scale Transformer (e.g., Llama-3 scale) is yet implemented; current `Transformer` architecture in `moe-llm-core` is a simplified proof-of-concept.
- Semantic Generation: [VERIFIED] The system does not yet exhibit semantic language generation; current outputs are limited to token-repetition patterns used for architectural validation.
- Training Evidence: [VERIFIED] Large-scale training convergence data is not yet available; experiments are restricted to small-scale verification ([MEASURED] N=2048 parameters).
- Inference Stability: [EXPERIMENTAL] No validated gradient stability beyond toy scale; potential for STE instability in deep stacks is unprobed.
5. Reproducibility & Artifacts
- [VERIFIED] Training Traces: `albert-moe-13/benchmarks/convergence_training.log` (Epochs 0-90, N=2048).
- [EXPERIMENTAL] Audit Logic: `albert-moe-13/reproducibility_verifier/` (Independent truth-check layer).
- [VERIFIED] Hardware Spec: `ternlang-root/docs/CHECKPOINT_SPEC.md`.
- [THEORETICAL] Sparsity Table: Located in `ternlang-root/README.md`.
6. Version Synchronization
[VERIFIED] All primary crates have been synchronized to v1.3.5 and Rust Edition 2024 to ensure workspace-wide build integrity. Key crates include:
| Crate | Version |
|---|---|
| ternlang-core | v1.3.5 |
| albert-runtime | v1.3.5 |
| ternlang-ml | v1.3.5 |
| albert-api | v1.3.5 |
| albert-commands | v1.3.5 |
| albert-tools | v1.3.5 |
| albert-compat | v1.3.5 |
| ternlang-moe | v1.3.5 |
| ternlang-runtime | v1.3.5 |
| ternlang-hdl | v1.3.5 |
| albert-cli | v1.3.5 |
| ternlang-lsp | v1.3.5 |
| ternlang-cli | v1.3.5 |
| ternlang-compat | v1.3.5 |
| ternlang-mcp | v1.3.5 |
| ternlang-ruvector | v1.3.5 |
| ternlang-codegen | v1.3.5 |
| ternpkg | v1.3.5 |
| ternlang-test | v1.3.5 |
| ternlang-compress | v1.3.5 |
| moe-core | v1.3.5 |
| moe-platform | v1.3.5 |
| moe-plugin-sdk | v1.3.5 |
| moe-runtime | v1.3.5 |
| moe-ddel | v1.3.5 |
| moe-sdk | v1.3.5 |
| pytern | v1.3.5 |
| ternaudit-guard | v1.3.5 |
| moe-uril | v1.3.5 |
| moe-validation-suite | v1.3.5 |
| moe-llm-core | v1.3.5 |
| moe-compute | v1.3.5 |
| moe-test | v1.3.5 |
| ternlang-api | v1.3.5 |
7. Next Engineering Milestone
The immediate technical objective is the wiring of the Attention and MLP blocks into the moe-llm-core forward pass to enable complete sequence modeling, followed by a full-vocabulary cross-entropy training loop on the Bible corpus.
8. Known Failure Modes
- No Autoregressive Coherence: Generated sequences quickly devolve into repetition due to lack of trained attention heads.
- No Long-Context Retention: Current forward pass lacks validated KV-caching or positional encoding stability over sequences > 64 tokens.
- STE Instability: The Straight-Through Estimator has not been validated for gradient stability in models exceeding 4 layers.
- Routing Collapse: Under current N=2048 experiments, MoE routing weights have not been proven to prevent expert collapse (single-expert dominance).
- Static Weights: Risk of zero-gradient flow in deep ternary networks where thresholds exceed update magnitudes.
9. Falsifiability Conditions
- Loss Invariance: If avg_loss remains constant at 1.0 across 100+ epochs under varied learning rates, the STE implementation is invalidated.
- Routing Entropy Collapse: If `EXPERT_LOAD` telemetry shows 100% usage for a single expert across diverse inputs, the MoE routing mechanism has failed.
- Weight Staticity: If `mean_w` variance is 0 across training cycles, gradient propagation through the ternary manifold is non-functional.
- Output Invariance: If the model produces the same token index for all inputs regardless of prompt content, semantic modeling is non-existent.
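The four conditions above are simple enough to express as predicates over logged telemetry. The sketch below is one possible encoding, not part of `reproducibility_verifier`. The function names, the input shapes (slices of per-epoch losses, per-expert counts, per-cycle `mean_w` values, and per-prompt token indices), and the tolerance `eps` are assumptions.

```rust
/// Loss Invariance: avg_loss stays at 1.0 (within `eps`) for 100+ epochs.
fn loss_invariant(avg_losses: &[f64], eps: f64) -> bool {
    avg_losses.len() >= 100 && avg_losses.iter().all(|l| (l - 1.0).abs() <= eps)
}

/// Routing Entropy Collapse: a single expert receives every routed token.
fn routing_collapsed(expert_load: &[u64]) -> bool {
    let total: u64 = expert_load.iter().sum();
    total > 0 && expert_load.iter().any(|&c| c == total)
}

/// Weight Staticity: mean_w never changes between training cycles.
/// Checking that all samples are equal is equivalent to zero variance
/// and avoids floating-point rounding in the mean.
fn weights_static(mean_w: &[f64]) -> bool {
    mean_w.len() > 1 && mean_w.windows(2).all(|p| p[0] == p[1])
}

/// Output Invariance: the same token index is produced for every prompt.
fn output_invariant(token_ids: &[u32]) -> bool {
    token_ids.len() > 1 && token_ids.windows(2).all(|p| p[0] == p[1])
}

fn main() {
    // Synthetic telemetry for illustration only; not real training data.
    let losses: Vec<f64> = (0..120).map(|e| 1.0 / (1.0 + e as f64)).collect();
    let expert_load = [40u64, 35, 25];
    let mean_w = [0.10, 0.12, 0.09];
    let tokens = [17u32, 17, 17, 17];

    println!("loss invariant:    {}", loss_invariant(&losses, 1e-6));
    println!("routing collapsed: {}", routing_collapsed(&expert_load));
    println!("weights static:    {}", weights_static(&mean_w));
    println!("output invariant:  {}", output_invariant(&tokens));
}
```

Any predicate returning `true` on real telemetry would correspond to one of the falsifying conditions listed in this section.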
10. Reproducibility Instructions
To verify the current state of the orchestrator and neural core:
```bash
# 1. Build the unified orchestrator
cargo build --release -p moe-test

# 2. Run the interactive REPL smoke test
# Expected: "Integrity: NEURAL-BACKEND-ACTIVE"
./target/release/moe-test

# 3. Verify triadic packing integrity
cargo test -p ternlang-core --test packing_integrity

# 4. Inspect training logs
cat albert-moe-13/benchmarks/convergence_training.log
```
Artifact Locations:
- Inference Logic: `albert-moe-13/moe-test/src/main.rs`
- Neural Kernels: `albert-moe-13/moe-llm-core/src/model/`
- Audit Reports: `albert-moe-13/reproducibility_verifier/`
- Corpus: `albert-moe-13/data/corpus/bible.txt`
v1.2.9 — Stable Baseline Release
Overview
This release marks the transition from experimental builds to a stable, stateful, and operational system.
For the first time, the Ternary Intelligence Stack (TIS) behaves as a coherent runtime environment, not a collection of loosely coupled components.
In practical terms:
You can now build on top of it, not just experiment with it.
Core Highlights
64-bit Runtime Support
The execution layer now operates fully in 64-bit across the stack.
Impact:
- Larger parameter spaces
- Improved numerical stability
- Compatibility with real-world workloads
Under the hood:
- Memory model redesign
- Execution alignment fixes
- Kernel-level assumption updates
Persistent Memory Layer — RuVectorDB
Introduced a database-backed memory system with cryptographic timestamping.
Capabilities:
- Deterministic recall across sessions
- Structured, persistent state (no more resets)
- Traceable memory evolution
Result:
→ Enables long-running, stateful agents
Repository Restructure — "Ordnung"
The codebase now reflects actual system architecture rather than historical growth.
Improvements:
- Clear separation: runtime / memory / interface
- Centralized documentation
- Reduced cross-module coupling
- Simplified utilities
Outcome:
→ Faster onboarding, fewer hidden dependencies
Proactive Runtime Behavior
Albert transitions from passive executor → active system participant
New behaviors:
- Tracks internal state transitions
- Adapts its own memory schema
- Performs environment validation
Shift:
→ From execution engine to system-aware runtime
PyTern Integration
Python interoperability is now a first-class citizen.
- `pytern` added as core crate
- Unified versioning (`v1.2.9`)
- Foundation for hybrid workflows
Unlocks:
- Python scripting layers
- External tool integration
- Research-friendly interfaces
TernAudit Guard (Compliance Layer)
Refactored from ternlang-audit into a structured observability system.
Components:
- `DashboardManager` → real-time system visibility
- `AuditEvent` pipeline → structured logging
Purpose:
→ Production-grade auditability & compliance
UX & CLI Improvements (albert-cli)
- ⚡ Faster startup (reduced boot latency)
- ⌨️ Optimized typewriter responsiveness (120ms tick)
- 🎛️ Smooth, decoupled UI animations
- 🧾 Enhanced session report (model + UX hints)
- 🧠 Inline HITL (Human-in-the-Loop) interaction cards
- 🧩 Fixed markdown rendering ("geometry leak")
- 🔐 Resolved sandbox permission issues (uid_map)
Reliability & Recovery
This release includes a full workspace recovery event:
- Rebuilt all `Cargo.toml` files after accidental wipe
- Restored and validated entire workspace structure
- Fixed dependency graph inconsistencies
- Cleaned dead references
Takeaway:
→ The system is now resilient at scale
Versioning & Distribution
- All 26+ crates bumped to `v1.2.9`
- Full ecosystem published to crates.io
- Standalone crates (`pytern`, `ternaudit-guard`) aligned
- Git history rebased and synchronized
Why This Matters
This release establishes a true operational baseline for:
- Running larger models locally
- Building persistent agent systems
- Integrating external tooling (Python, dashboards, audits)
- Transitioning from experimentation → reproducibility
Design Philosophy
This version prioritizes:
- Stability > Features
- Structure > Speed
- Foundations > Optimization
v1.0.0 — First Release (Legacy)
This release is superseded by v1.2.9 — please use the latest release.
v1.0.0 — First Release (April 2026, Legacy)
This was the first tagged stable release of the Ternary Intelligence Stack, marking the point where the compiler, BET-VM, MCP server, and open-core stdlib were functional as a coherent system.
What was in scope at v1.0.0
- Ternlang compiler (lexer → parser → bytecode)
- BET-VM bytecode runtime
- MCP server (initial 19 tools)
- Open-core stdlib (Tier 1)
- Phase 12A coherence testing baseline
What has since shipped (v1.1.0 → v1.2.9)
- MoE-13 full 13-expert routing with domain bias (Phase 20 complete)
- QAT/STE fine-tuning loop + perplexity validation (Phase 12B/12C)
- TernAudit VS Code command with EU AI Act inline decorations (Phase 18)
- moe-platform, moe-plugin-sdk, moe-ddel, moe-core, moe-runtime published to crates.io
- VS Code extension v1.0.2 published to Open VSX
- Fortune cookie easter egg at ternlang.com/fortune
ternlang-lsp vscode-v0.4.0
Pre-built ternlang-lsp binaries for all platforms.
The VS Code extension downloads the correct binary automatically on first activation.
| Platform | File |
|---|---|
| Linux x64 | ternlang-lsp-linux-x64 |
| Linux ARM64 | ternlang-lsp-linux-arm64 |
| macOS x64 | ternlang-lsp-darwin-x64 |
| Windows x64 | ternlang-lsp-win32-x64.exe |
Ternlang Legacy Drop
Ternlang Legacy Drop: Architectural Mass & Compiler Verification
This release solidifies the foundational infrastructure of the Ternary Intelligence Stack. We deployed +2,341 strictly typed .tern files, encompassing the complete standard library, core executable logic for the BET VM, and comprehensive edge-case testing matrices.
Core implementations locked in this drop:
- The compiler's strict ternary exhaustiveness: explicitly forcing the 0-state resolution in all match arms.
- Complete routing protocols for the `@sparseskip` annotation, enabling the 122.3x sparse inference speedup by bypassing neutral zero-weights at the hardware-emulation layer.
- Sovereign node initialization scripts for the MoE-13 orchestrator and albert-agent ecosystem.
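The first two items can be illustrated with a Rust analogue. This is not `.tern` syntax and not the `@sparseskip` implementation; the `Trit` enum and `sparse_dot` function are hypothetical names used only to show the two ideas together: a match with no wildcard arm, so the zero state must be handled explicitly, and a zero arm that performs no arithmetic.

```rust
/// The three states of a ternary weight (hypothetical Rust stand-in).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Trit {
    Neg,
    Zero,
    Pos,
}

/// Dot product of ternary weights with float activations.
/// Returns the accumulated sum and the number of zero weights skipped.
fn sparse_dot(weights: &[Trit], activations: &[f32]) -> (f32, usize) {
    let mut acc = 0.0;
    let mut skipped = 0;
    for (w, &x) in weights.iter().zip(activations) {
        // No wildcard arm: removing any of the three arms is a compile error,
        // so the zero state cannot be silently ignored.
        match w {
            Trit::Pos => acc += x,
            Trit::Neg => acc -= x,
            Trit::Zero => skipped += 1, // neutral weight: no arithmetic performed
        }
    }
    (acc, skipped)
}

fn main() {
    let weights = [Trit::Pos, Trit::Zero, Trit::Neg, Trit::Zero];
    let activations = [1.0, 2.0, 3.0, 4.0];
    let (sum, skipped) = sparse_dot(&weights, &activations);
    println!("sum = {sum}, skipped {skipped} of {} weights", weights.len());
}
```

Because ternary weights need only additions and subtractions, the fraction of skipped entries translates directly into arithmetic that is never issued. How much of that becomes throughput depends on the hardware or simulator, which this sketch does not model.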
This drop serves as the immutable public ledger of our repository mass and operational scope. The ecosystem is live, the syntax is active, and the file count is undeniable.
Cheers!
RFI-IRFOS