Releases: DeepLink-org/DLSlime

v0.1.11

@JimyMa released this 27 May 19:00

What's Changed

TCP Transport (experimental)

  • Add TCP transfer engine: C++ TcpEndpoint, memory pool, connection pool, futures, and pybind11 bindings. Header-only asio (standalone). (#99)
  • Wire TCP into PeerAgent via connect_to(transport="tcp") — same control plane, same mailbox, same I/O facade. RDMA stays the default. (#105)
  • Move the identity interface into C++: rename TcpEndpoint::async_send/recv/read/write to send/recv/read/write so PeerAgent holds either RDMAEndpoint or TcpEndpoint without a Python adapter. (#105)
  • C++ dlslime::not_implemented exception, translated to Python NotImplementedError by a pybind11 translator, for write_with_imm / imm_recv on TCP. (#105)
  • PeerConnection.peer_endpoint_info exposes the peer's endpoint_info post-handshake for one-sided ops over TCP. Race fix in _ensure_local_tcp_endpoint: post-publish sweep guarantees regions registered concurrently with endpoint creation are not dropped. (#105)
  • New examples: p2p_tcp_send_recv_peer_agent.py, p2p_tcp_rc_write_peer_agent.py, p2p_tcp_rc_read_peer_agent.py, p2p_tcp_rc_send_recv.py (raw TcpEndpoint). New test: test_peer_agent_tcp.py. New guide: docs/src/guide/tcp-transport.md. (#105)
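
The shared-interface idea in the bullets above can be illustrated with a small, self-contained sketch. It does not use DLSlime's real classes: `Endpoint`, `FakeRdmaEndpoint`, `FakeTcpEndpoint`, `make_endpoint`, and `try_write_with_imm` are hypothetical names invented for illustration. The only behaviours taken from the notes are that both transports expose the same method names, that RDMA is the default, and that immediate-data writes surface as `NotImplementedError` on TCP.

```python
from typing import Protocol


class Endpoint(Protocol):
    """Hypothetical identity interface shared by every transport."""

    def send(self, data: bytes) -> None: ...
    def recv(self) -> bytes: ...
    def write_with_imm(self, data: bytes, imm: int) -> None: ...


class FakeRdmaEndpoint:
    """Stand-in for an RDMA endpoint; supports every operation."""

    def __init__(self) -> None:
        self._queue = []  # in-memory loopback queue, illustration only

    def send(self, data: bytes) -> None:
        self._queue.append(data)

    def recv(self) -> bytes:
        return self._queue.pop(0)

    def write_with_imm(self, data: bytes, imm: int) -> None:
        self._queue.append(imm.to_bytes(4, "big") + data)


class FakeTcpEndpoint(FakeRdmaEndpoint):
    """Stand-in for a TCP endpoint; immediate-data writes are unsupported."""

    def write_with_imm(self, data: bytes, imm: int) -> None:
        raise NotImplementedError("write_with_imm is not available over TCP")


def make_endpoint(transport: str = "rdma") -> Endpoint:
    """Pick an endpoint by transport name; RDMA is the default."""
    if transport == "rdma":
        return FakeRdmaEndpoint()
    if transport == "tcp":
        return FakeTcpEndpoint()
    raise ValueError(f"unknown transport: {transport!r}")


def try_write_with_imm(ep: Endpoint, data: bytes, imm: int) -> bool:
    """Return True if the write went through, False if the transport lacks it."""
    try:
        ep.write_with_imm(data, imm)
        return True
    except NotImplementedError:
        return False
```

Callers that need immediate data can probe with `try_write_with_imm` and fall back to a plain `send`, which keeps the calling code identical across transports.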

Reliability

  • Fix Redis stream name mismatch (inbox:stream:) so peer rendezvous reaches the right listener. (#104)
  • Fix a race in connection establishment. (#104)
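
A stream-name mismatch of the kind fixed in #104 is usually avoided by building the key in exactly one place. The sketch below is a generic illustration, not DLSlime's code: only the `inbox:stream:` prefix comes from the note above. The suffix layout (a bare peer id), the `InMemoryStreams` stand-in, and the function names are assumptions.

```python
INBOX_STREAM_PREFIX = "inbox:stream:"  # prefix named in the release note


def inbox_stream_key(peer_id: str) -> str:
    """Build a peer's inbox stream key (the suffix layout is an assumption)."""
    if not peer_id:
        raise ValueError("peer_id must be non-empty")
    return f"{INBOX_STREAM_PREFIX}{peer_id}"


class InMemoryStreams:
    """Tiny stand-in for a stream store, keyed by stream name."""

    def __init__(self):
        self._streams = {}

    def publish(self, key, message):
        self._streams.setdefault(key, []).append(message)

    def read_all(self, key):
        return list(self._streams.get(key, []))


def send_rendezvous(store, target_peer, message):
    """Sender side: derive the key with the shared helper."""
    store.publish(inbox_stream_key(target_peer), message)


def listen(store, own_peer):
    """Listener side: the same helper guarantees both sides agree on the name."""
    return store.read_all(inbox_stream_key(own_peer))
```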

Build / CI

  • PyPI build: vendor standalone asio headers in CIBW_BEFORE_ALL_LINUX (dodges EPEL availability variance and the broken Boost.Asio CMake fallback). Pass -DBUILD_TCP=ON explicitly. Add lib_slime_tcp.so to the auditwheel --exclude list.
  • README, mkdocs nav, and API reference updated: TCP transport guide, BUILD_TCP=ON build flag, new examples linked.

Bumps

  • v0.1.7 → v0.1.8 → v0.1.9 → v0.1.11. v0.1.10 was an aborted bump while iterating on the wheel-build pipeline.

Full Changelog: v0.1.7...v0.1.11

Contributors

SHshenhao

DLSlime v0.1.7

@JimyMa released this 26 May 05:33

This is the first stable release after the major monorepo restructuring. If you have been on 0.1.0.post1 or earlier, this is a breaking-but-worth-it upgrade.

💡 Versions 0.1.1 ~ 0.1.6 were broken or interim release-engineering attempts and have been yanked from PyPI. Pinning to 0.1.7 (or just pip install dlslime with no pin) is the right call.


🚨 Breaking Changes

1. nanoctrl is dead. Long live dlslime-ctrl.

The Rust control plane has been renamed:

Before                      After
nanoctrl (PyPI)             dlslime-ctrl (PyPI)
nanoctrl (binary)           dlslime-ctrl (binary)
from nanoctrl import ...    from dlslime.ctrl import ...
Default port 3000           Default port 4479

Migration:

pip uninstall -y nanoctrl
pip install dlslime-ctrl
# Update your code:
# - from nanoctrl import NanoCtrlClient → from dlslime import NanoCtrlClient
# - URLs hardcoded to :3000 → :4479

The nanoctrl PyPI distribution is yanked at 0.1.0.post1 and will not get further releases.
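
For configs that store full control-plane URLs, the port change can be applied mechanically. The helper below is a minimal sketch using only the standard library. `migrate_ctrl_url` is a hypothetical name, and it assumes the URL carries a scheme (for example `http://`), because `urlsplit` does not recognise a port in a bare `host:3000` string.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_PORT = 3000  # nanoctrl default, per the table above
NEW_PORT = 4479  # dlslime-ctrl default, per the table above


def migrate_ctrl_url(url: str) -> str:
    """Rewrite an explicit :3000 port to :4479; leave any other URL untouched."""
    parts = urlsplit(url)
    if parts.port != OLD_PORT:
        return url
    # The netloc ends with ":<port>", so split once from the right.
    # This also works for bracketed IPv6 hosts such as "[::1]:3000".
    head, _old = parts.netloc.rsplit(":", 1)
    netloc = f"{head}:{NEW_PORT}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

URLs that use any other port, or no explicit port, are returned unchanged, so the helper is safe to run over a mixed list of endpoints.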

2. Two packages now, no meta-package

The old dlslime-workspace meta-package (root pyproject.toml) has been removed. Install dlslime and dlslime-ctrl independently:

# Production
pip install dlslime dlslime-ctrl
# Editable / dev
pip install -e dlslime
pip install -e dlslime-ctrl

Why: the workspace's dlslime @ file:./dlslime PEP 508 reference forced pip to bake absolute local paths into wheel metadata, breaking installs on every machine that didn't share the build host's directory layout.
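
A quick way to catch this class of problem before publishing is to scan a dependency list for local direct references. The check below is a rough sketch with a hand-written regular expression rather than a full PEP 508 parser. `local_direct_references` is a hypothetical helper, not part of any DLSlime tooling.

```python
import re

# Matches requirement strings of the form "name[extras] @ file:...".
_LOCAL_REF = re.compile(
    r"^\s*[A-Za-z0-9._-]+(\[[^\]]*\])?\s*@\s*file:", re.IGNORECASE
)


def local_direct_references(requirements):
    """Return the requirement strings that point at a local file: URL."""
    return [req for req in requirements if _LOCAL_REF.match(req)]
```

For example, `local_direct_references(["dlslime @ file:./dlslime", "dlslime-ctrl>=0.1.7"])` returns only the first entry, which is the reference the old workspace package used.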

3. Monorepo layout

Every component is now self-contained inside its own subdir:

dlslime/            Core Python package + C++ bindings + runtime primitives
 ├── dlslime/       Python sources
 ├── examples/      Runnable examples (was repo-root /examples)
 ├── bench/         Benchmark scripts (was repo-root /bench)
 └── tests/
      ├── python/   (was repo-root /tests/python)
      └── cpp/      (was dlslime/tests-cpp)
dlslime-ctrl/       Rust control plane (renamed from /NanoCtrl)
docker/             docker-compose for dlslime-ctrl + Redis
docs/               Design notes, roadmap, platform guides
scripts/            Repo-wide release automation

Path updates required if you scripted against the old layout — sorry!
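
Scripts that reference the old layout can be updated with a small prefix map. The sketch below encodes only the moves listed in the tree above. `migrate_path` is a hypothetical helper, and paths that match none of the old prefixes are returned as-is.

```python
# (old prefix, new prefix) pairs taken from the layout listing above.
OLD_TO_NEW = [
    ("dlslime/tests-cpp", "dlslime/tests/cpp"),
    ("tests/python", "dlslime/tests/python"),
    ("examples", "dlslime/examples"),
    ("bench", "dlslime/bench"),
    ("NanoCtrl", "dlslime-ctrl"),
]


def migrate_path(path: str) -> str:
    """Map a repo-relative path from the old layout to the new one."""
    clean = path.lstrip("/")
    for old, new in OLD_TO_NEW:
        # Match whole path components only, so "benchmarks/" is not rewritten.
        if clean == old or clean.startswith(old + "/"):
            return new + clean[len(old):]
    return clean
```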


✨ New Features

Docker / Cloud-native deployment

A complete one-shot container deployment for dlslime-ctrl:

  • docker/ctrl.Dockerfile — multi-stage rust:1.95-slim-bookworm builder, ~30 MB final image.

  • docker/docker-compose.yml — bundles dlslime-ctrl + Redis with health checks. Override DLSLIME_CTRL_IMAGE to pull from GHCR instead of building locally:

    echo "DLSLIME_CTRL_IMAGE=ghcr.io/deeplink-org/dlslime-ctrl:0.1.7" >> docker/.env
    echo "DLSLIME_CTRL_PULL_POLICY=missing" >> docker/.env
    docker compose -f docker/docker-compose.yml up -d
  • docker/docker-compose.external-redis.yml — for users who already have Redis.

  • DLSLIME_CTRL_REDIS_ADVERTISE env var — externalizes the Redis URL so PeerAgents on other hosts can discover the right address.

Public Docker images via GHCR

Multi-arch (linux/amd64, linux/arm64) images at ghcr.io/deeplink-org/dlslime-ctrl:

Trigger                     Tags
Push to main / master       edge, sha-<short>
Push tag vX.Y.Z             X.Y.Z, X.Y, latest, sha-<short>
Manual workflow_dispatch    optional extra tag from input

No registry login required — they're public. Works for docker pull and docker compose pull out of the box.
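
The tagging rules in the table can be expressed as a small function, which is handy for predicting which image tags a given push will produce. This is a sketch of the table, not the workflow's actual implementation. `image_tags` is a hypothetical name, the 7-character short SHA is an assumption, and appending the manual extra tag to whatever the ref already produced is also an assumption.

```python
import re

_VERSION_TAG = re.compile(r"^refs/tags/v(\d+)\.(\d+)\.(\d+)$")


def image_tags(git_ref: str, sha: str, extra_tag: str = "") -> list:
    """Return the image tags implied by the trigger table for one git ref."""
    short = f"sha-{sha[:7]}"  # short-SHA length is an assumption
    match = _VERSION_TAG.match(git_ref)
    if match:
        major, minor, patch = match.groups()
        tags = [f"{major}.{minor}.{patch}", f"{major}.{minor}", "latest", short]
    elif git_ref in ("refs/heads/main", "refs/heads/master"):
        tags = ["edge", short]
    else:
        tags = []
    if extra_tag:  # manual workflow_dispatch input
        tags.append(extra_tag)
    return tags
```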

status command for externally-managed instances

dlslime-ctrl status previously assumed a local PID file, so it printed "not running" against any Docker / systemd / Kubernetes deployment even when the server was healthy. It now reports running (managed externally) when no PID file exists but the health endpoint answers.
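
The decision described above fits in a few lines. The sketch below is an illustration in Python, not the Rust implementation. `resolve_status` and `probe_health` are hypothetical names, the health URL is supplied by the caller because the endpoint path is not given here, and the handling of a stale PID file is an assumption.

```python
import urllib.request


def probe_health(url: str, timeout: float = 1.0) -> bool:
    """Return True if an HTTP GET on the given health URL answers with 2xx."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, ValueError):  # URLError and timeouts are OSError subclasses
        return False


def resolve_status(pid_file_exists: bool, process_alive: bool, health_ok: bool) -> str:
    """Combine local PID information with the health probe result."""
    if pid_file_exists:
        # Assumption: a stale PID file still reports "not running".
        return "running" if process_alive else "not running"
    if health_ok:
        return "running (managed externally)"
    return "not running"
```

Keeping the probe separate from the decision makes the logic easy to test without a live server.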

Pulsing RPC backend benchmark

dlslime/bench/python/rpc_bench_pulsing.py and a refreshed run_rpc_bench.sh provide a side-by-side comparison between SlimeRPC, Ray RPC, and the new pulsing backend.

StreamMailbox race fix

PR #97 fixed a long-standing race in the SlimeRPC stream mailbox where concurrent senders could occasionally drop the last frame of a stream.


🛠 Infrastructure & Release Engineering

This release establishes a fully automated CI/CD pipeline.

One-shot scripts/release.sh

scripts/release.sh 0.1.8

Bumps the version in 4 manifests (dlslime/pyproject.toml, dlslime-ctrl/pyproject.toml, dlslime-ctrl/Cargo.toml, docs/pyproject.toml), refreshes Cargo.lock, rewrites all docs that mention the old version, then commits + tags + pushes.
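
The manifest-bumping step can be pictured as a single text substitution per file. The sketch below is not the contents of scripts/release.sh. `bump_manifest` is a hypothetical helper that rewrites the first `version = "..."` line it finds, which assumes the package's own version appears before any dependency tables.

```python
import re

# The four manifests named in the release notes.
MANIFESTS = [
    "dlslime/pyproject.toml",
    "dlslime-ctrl/pyproject.toml",
    "dlslime-ctrl/Cargo.toml",
    "docs/pyproject.toml",
]

_VERSION_LINE = re.compile(r'^(version\s*=\s*")([^"]+)(")', re.MULTILINE)


def bump_manifest(text: str, new_version: str) -> str:
    """Replace the first line-leading version field in a TOML manifest."""
    updated, count = _VERSION_LINE.subn(
        lambda m: m.group(1) + new_version + m.group(3), text, count=1
    )
    if count != 1:
        raise ValueError("no version field found")
    return updated
```

A driver would read each path in `MANIFESTS`, apply `bump_manifest`, and write the result back before refreshing the lock file and tagging.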

GitHub Actions

Workflow              Trigger        Action
ci.yml                every PR       Lint, build wheel smoke test, optional self-hosted RDMA tests
docker-publish.yml    tag / main     Multi-arch GHCR publish via OIDC (no PAT needed)
pypi-publish.yml      tag            sdist + cibuildwheel cp310..cp313 + maturin → PyPI Trusted Publishing (OIDC)
docs.yml              docs change    Build & deploy docs

auditwheel handling for RDMA

pypi-publish.yml excludes both system RDMA libraries (libibverbs.so.1, libnuma.so.1, libmlx5.so.1, librdmacm.so.1, ...) and DLSlime's own sibling shared libs (lib_slime_rdma.so, lib_slime_engine.so, ...) from auditwheel repair. The system libs come from the user's rdma-core install; the sibling libs are resolved at runtime via the $ORIGIN rpath baked in by CMake.
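
For a local reproduction, the exclusion list translates into repeated `--exclude` flags on `auditwheel repair`. The sketch below only builds the argument list. `auditwheel_repair_args` is a hypothetical helper, and the two lists contain only the libraries named explicitly above, not the elided ones.

```python
# Libraries named explicitly in the release note; the elided entries are omitted.
SYSTEM_RDMA_LIBS = ["libibverbs.so.1", "libnuma.so.1", "libmlx5.so.1", "librdmacm.so.1"]
SIBLING_LIBS = ["lib_slime_rdma.so", "lib_slime_engine.so"]


def auditwheel_repair_args(wheel: str, dest_dir: str, excludes: list) -> list:
    """Build an `auditwheel repair` command line with one --exclude per library."""
    args = ["auditwheel", "repair", "-w", dest_dir]
    for lib in excludes:
        args += ["--exclude", lib]
    args.append(wheel)
    return args
```

The resulting list can be passed to `subprocess.run(args, check=True)` on a machine where auditwheel is installed.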

Self-hosted runner uses $GITHUB_WORKSPACE

The rdma-tests job no longer hard-codes a contributor's home directory — it mounts $GITHUB_WORKSPACE into the container so tests always run against the exact commit being CI'd.

Pre-commit pinned to Python 3.11

ufmt and black 25.x require Python ≥ 3.9; the hook now pins language_version: python3.11 so contributors don't trip over whichever system Python pre-commit picks up.

Rust toolchain pinned to 1.95

rust-toolchain.toml pins 1.95 to keep local builds, CI, and the Docker builder all on one version. Required for clap 4.6+ (edition2024) and comfy-table 7.2+ (let-chains).


📚 Documentation

  • New docs/src/guide/slimerpc.md (+ slimerpc.zh.md) — full SlimeRPC guide with FlatBuffers raw mode, async APIs, and benchmark methodology.
  • New docs/src/guide/benchmark-rpc.md — RPC benchmark walkthrough.
  • Refreshed docs/src/guide/peeragent-api.md, endpoint-api.md, deployment.md, dlslime-cache.md (English + Chinese versions).
  • New docker/README.md — full GHCR publishing guide and compose usage.

🧹 Removed

  • bench/results/*.csv — large committed CSVs from older runs; regenerate via dlslime/bench/python/run_rpc_bench.sh.
  • Root-level tests/, examples/, bench/ — moved under dlslime/.
  • NanoCtrl/ — replaced by dlslime-ctrl/.
  • Root pyproject.toml (the dlslime-workspace meta-package) — see Breaking Change #2.

📦 Install

# PyPI (default)
pip install dlslime dlslime-ctrl
# From source
git clone https://github.com/DeepLink-org/DLSlime.git
cd DLSlime
pip install -v --no-build-isolation -e dlslime
pip install -e dlslime-ctrl
# Docker (control plane only)
docker pull ghcr.io/deeplink-org/dlslime-ctrl:0.1.7

Prerequisites for pip install dlslime (RDMA host stack):

# Ubuntu / Debian
sudo apt install -y rdma-core libibverbs1 libnuma1
# RHEL / CentOS
sudo yum install -y rdma-core libibverbs numactl-libs

dlslime-v0.0.3

Pre-release

@JimyMa released this 10 May 11:39
90ddfff

dlslime 0.0.3

First GitHub Release. pip install dlslime==0.0.3 is live on PyPI. This release is the big post-0.0.2 consolidation: a control plane, a cache subsystem, an observability stack, and a full refactor of the RDMA completion ownership model.

🎉 New subsystems

  • NanoCtrl control plane (#68, #84, #90). Rust/Axum server with Redis-backed, scope-isolated peer registration and MR publication. Ships as a separate nanoctrl PyPI wheel (now at v0.0.8).
  • PeerAgent refactor (#80, #81, #88). PeerAgent becomes the runtime hub with auto NIC/topology discovery, directed-connection model, and a canonicalized Python API.
  • DLSlimeCache v0 (#82). Slab-based in-process KV cache with peer/version directory.
  • SlimeRPC (#69, #73). Datapath moved to C++, zero-copy inplace reply path, Python backend dropped.
  • Observability v0 (#91). C++ relaxed atomic counters, PeerAgent reporter writes Redis snapshots, nanoctrl obs {status,peers,nics,links} queries cluster state. DLSLIME_OBS=0 by default; single-branch overhead when disabled.
  • GitHub Actions CI (#78). Build + lint on every PR.
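
The "single-branch overhead when disabled" property of the observability counters can be illustrated in Python. This is a sketch of the gating pattern only: the real counters are C++ relaxed atomics, a lock stands in for them here, and `Counters` and `counters_from_env` are hypothetical names. Treating `DLSLIME_OBS=1` as the value that enables collection is an assumption. The notes only state that the default is 0.

```python
import os
import threading


class Counters:
    """Named counters that cost one branch per call when disabled."""

    def __init__(self, enabled: bool):
        self.enabled = enabled
        self._lock = threading.Lock()  # stands in for atomic increments
        self._values = {}

    def add(self, name: str, n: int = 1) -> None:
        if not self.enabled:  # the only work done when observability is off
            return
        with self._lock:
            self._values[name] = self._values.get(name, 0) + n

    def snapshot(self) -> dict:
        """Return a copy of the current values, e.g. for a periodic reporter."""
        with self._lock:
            return dict(self._values)


def counters_from_env(environ=None) -> Counters:
    """Enable counters only when DLSLIME_OBS is "1" (assumed enabling value)."""
    env = os.environ if environ is None else environ
    return Counters(enabled=env.get("DLSLIME_OBS", "0") == "1")
```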

🔧 RDMA engine

  • Refactor RDMAEndpoint completion ownership (#70). Slot-lease ring + per-QP callback masks eliminate modulo-reuse races. One EndpointOpState owns a user op end-to-end; callbacks update it instead of the pool slot.
  • RNR handling for imm-recv (#71). Pre-posted RECV window keeps the HW RQ primed so WRITE_WITH_IMM from peers always finds a posted receive.
  • CQ depth limit fix (#66).
  • unregister_mr on RDMAMemoryPool (#89).
  • Concurrent sendrecv stress test (#77).
  • P2P RDMA RC read control-plane fixes (#74, #75).
  • Aggregate transfer bench memory reduction (#76).
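
The modulo-reuse hazard that #70 addresses can be shown with a toy model. When a slot is chosen as `sequence % size`, a new operation can land on a slot whose previous operation has not completed yet. Leasing slots explicitly avoids that. The sketch below is a simplified illustration, not DLSlime's implementation: `OpState` and `SlotLeaseRing` are hypothetical stand-ins, and per-QP callback masks are left out.

```python
from collections import deque


class OpState:
    """State owned by one user operation from submit to completion."""

    def __init__(self, op_id: str):
        self.op_id = op_id
        self.done = False


class SlotLeaseRing:
    """Fixed pool of slots handed out as leases instead of by index % size."""

    def __init__(self, size: int):
        self._free = deque(range(size))
        self._leases = {}  # slot index -> OpState currently holding it

    def lease(self, op: OpState):
        """Return a free slot for `op`, or None if every slot is in flight."""
        if not self._free:
            return None  # caller must wait; a busy slot is never reused
        slot = self._free.popleft()
        self._leases[slot] = op
        return slot

    def complete(self, slot: int) -> OpState:
        """Completion callback: update the op's state, then release the slot."""
        op = self._leases.pop(slot)
        op.done = True
        self._free.append(slot)
        return op
```

Because completion updates the `OpState` rather than the slot, a slot can be handed to the next operation without the two ever sharing state.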

📚 Docs & housekeeping

  • Arch diagrams refreshed (#85, #86).
  • README / news / logo updates (#72, #79, #83, #87).
  • All-to-all offsets helper (#60).

Install

pip install dlslime==0.0.3 
pip install nanoctrl==0.0.8 # control-plane CLI + server
  • Build from source with NVLink support:
DLSLIME_BUILD_NVLINK=1 pip install dlslime==0.0.3 --no-binary dlslime

Compatibility

  • Python 3.8 – 3.13, Linux x86_64 (manylinux2014).
  • Requires libibverbs at runtime for RDMA transport.
  • Companion: nanoctrl>=0.0.8 for control-plane features.

Known limitations (v0)

  • Observability reports semantic submit/completion only for one-sided read / write / writeWithImm. Two-sided send/recv/immRecv are intentionally not accounted in v0.
  • No latency histograms, no Prometheus exporter by design — use nanoctrl obs ... --json for scripting.
  • nanoctrl obs links catalogs connections; per-link traffic counters (BW, BYTES, PENDING, ERRORS) render as "-" pending follow-up work.
  • Single NanoCtrl instance; no HA.

Full Changelog: https://github.com/DeepLink-org/DLSlime/commits/dlslime-v0.0.3

PR before 0.0.2

  • Bump to dlslime 002 by @JimyMa in #63
  • init_rc_to_release by @JimyMa in #64
  • update to 0.0.2.post1 by @JimyMa in #65
  • Fix/rdma cq limit by @JimyMa in #66
  • a2a offsets by @FirwoodLin in #60
  • 1. Init Control Plane 2. Arm support 3. code refine by @JimyMa in #68
  • Stabilize SlimeRPC benchmark transport and optional NanoDeploy backend by @JimyMa in #69
  • Fix RDMA imm recv RNR handling by @JimyMa in #71
  • update readme by @JimyMa in #72
  • SlimeRPC: move datapath to C++, add zero-copy inplace reply path, drop Python backend by @JimyMa in #73
  • P2p rdma rc read ctrl plane fix by @JimyMa in #74
  • P2p rdma rc read ctrl plane fix by @JimyMa in #75
  • Reduce aggregate transfer bench memory usage by @JimyMa in #76
  • Refactor RDMAEndpoint completion ownership by @JimyMa in #70
  • add sendrecv concurrenct stress test by @JimyMa in #77
  • Add GitHub Actions CI by @JimyMa in #78
  • update logo by @JimyMa in #79
  • modulize_peer_agent by @jimym...
Contributors

zhhsplendid, CokeDong, and 3 other contributors