Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

cairn-geocoder/cairn

Folders and files

NameName
Last commit message
Last commit date

Latest commit

History

304 Commits

Cairn

CI Security scan Code Scanning Alerts License: MIT License: Apache 2.0

Offline, airgap-ready geocoder written in Rust.

Cairn (n.) — a pile of stones marking a trail. Each tile is a stone. Drop the pile on disk. The geocoder reads it.

Status

Alpha. Forward search, autocomplete, fuzzy, layer filter, focus bias, structured search, and reverse geocoding all working end-to-end on a Liechtenstein dataset (OSM PBF + WhosOnFirst SQLite).

Try the live demo — 9 preset queries dispatched against a real cairn.kaldera.dev backend on Hetzner k3s with a Liechtenstein bundle. Free-form composer below the presets lets you craft any /v1/search, /v1/structured, or /v1/reverse call.

Goals

  • Forward + reverse geocoding, autocomplete, structured search
  • Single static binary + single bundle artifact (tar)
  • Zero network at runtime — full airgap deploy
  • Region extracts via tile-tree subset (Valhalla-style 3-level grid)
  • Single-machine commodity hardware, no cluster

Non-goals

  • Multi-tenant SaaS
  • Live OSM diff replication (planned post-MVP)
  • Cloud-native horizontal scaling

Architecture

Three layers:

  1. Builder (cairn-build) — ingests OSM PBF, WhosOnFirst SQLite, OpenAddresses CSV. Emits per-tile rkyv blobs, a tantivy text index, an admin polygon layer (bincode), and a centroid layer for nearest fallback. Writes a manifest.toml with blake3 hashes.
  2. Bundle — flat directory of immutable mmap-ready files.
  3. Server (cairn-serve) — axum HTTP API. Loads the bundle once at startup; no DB, no daemon dependencies.

Tile model

64-bit PlaceId: [level: 3 | tile_id: 22 | local_id: 39]

Level Cell size Contents
0 ×ばつ Countries, regions
1 ×ばつ Cities, counties, postcodes
2 0.25° ×ばつ 0.25° Streets, addresses, POIs, neighborhoods

Workspace

crates/
 cairn-geocoder/ umbrella, re-exports
 cairn-place/ Place, PlaceId, schema (rkyv-archived)
 cairn-tile/ tile coords, manifest, blob IO, blake3 verify
 cairn-text/ tantivy index + autocomplete + fuzzy + geo-bias
 cairn-spatial/ R*-tree PIP for admin polygons + nearest centroids
 cairn-parse/ address parsing (libpostal FFI deferred)
 cairn-import-osm/ OSM PBF: place / POI nodes + named highway ways
 cairn-import-wof/ WhosOnFirst SPR + multilingual names + polygons
 cairn-import-oa/ OpenAddresses CSV
 cairn-import-geonames/ Geonames TSV (stub)
 cairn-api/ axum handlers
bins/
 cairn-build/ CLI build / extract / verify / info
 cairn-serve/ HTTP runtime

Quick start

# 1. Fetch source data (one-time, can be mirrored offline after)
mkdir -p data
curl -fsSL -o data/liechtenstein.osm.pbf \
 https://download.geofabrik.de/europe/liechtenstein-latest.osm.pbf
curl -fsSL -o data/wof-li.db.bz2 \
 https://data.geocode.earth/wof/dist/sqlite/whosonfirst-data-admin-li-latest.db.bz2
bunzip2 data/wof-li.db.bz2
# 2. Build the workspace
cargo build --release -p cairn-build -p cairn-serve
# 3. Build a bundle
./target/release/cairn-build build \
 --osm data/liechtenstein.osm.pbf \
 --wof data/wof-li.db \
 --out bundle \
 --bundle-id liechtenstein
# 4. Verify integrity
./target/release/cairn-build verify --bundle bundle
# OK: 6 tiles verified, text=ok, admin=ok, points=ok
# 5. Inspect
./target/release/cairn-build info --bundle bundle
# 6. Serve
./target/release/cairn-serve --bundle bundle --bind 127.0.0.1:8080

Endpoints

GET /healthz
GET /readyz 200 ready / 503 if no text index
GET /v1/search forward + autocomplete
GET /v1/structured field-by-field search
GET /v1/reverse PIP + nearest fallback

/v1/search

Param Type Notes
q string (required) Free-text query.
mode search|autocomplete Default search.
limit int (1–100) Default 10.
fuzzy int 0–2 Edit distance. Forward mode only.
layer csv Restrict to kinds (e.g. country,city,street).
focus.lat, focus.lon float Focus point for distance-biased rerank.
focus.weight float Distance penalty weight (default 0.5).
curl 'http://localhost:8080/v1/search?q=Vaduz&layer=city&focus.lat=47.165&focus.lon=9.51'
curl 'http://localhost:8080/v1/search?q=Vad&mode=autocomplete'
curl 'http://localhost:8080/v1/search?q=vaaduz&fuzzy=2'

/v1/structured

Param Type Notes
house_number / road / unit string Address parts.
postcode / city / district / region / country string Admin parts.
limit, focus.* as above

Builds a concatenated query, picks a layer hint based on the finest non-empty field (address → street → city → region → country).

curl 'http://localhost:8080/v1/structured?road=Aeulestrasse&city=Vaduz'
curl 'http://localhost:8080/v1/structured?country=Liechtenstein'

/v1/reverse

Param Type Notes
lat, lon float (required)
limit int 1–50 Default 10.
nearest int 0–50 Fallback K-nearest centroids when PIP empty.

Response includes source: "pip" \| "nearest". PIP results are sorted finest containing polygon first; admin chain available via admin_path.

curl 'http://localhost:8080/v1/reverse?lat=47.141&lon=9.523'
curl 'http://localhost:8080/v1/reverse?lat=48.0&lon=10.5&nearest=5'

Bundle layout

bundle/
├── manifest.toml schema, source hashes, per-tile blake3
├── tiles/<level>/<row>/<col>/<id>.bin rkyv-archived Place blobs
├── index/text/ tantivy segments (mmap'd at runtime)
└── spatial/
 ├── admin.bin bincode AdminLayer (polygons + metadata)
 └── points.bin bincode PointLayer (centroids for nearest fallback)

Build sources

Source Format Coverage Loaded by
OpenStreetMap *.osm.pbf Global --osm
WhosOnFirst SQLite Per-country admin bundles --wof
OpenAddresses CSV Per-region authoritative addresses --oa
Geonames TSV Global populated places --geonames (stub)

Quality gates

cargo fmt --all -- --check
cargo clippy --workspace --all-targets -- -D warnings
cargo test --workspace

26 unit tests cover Place ID encoding, tile blob roundtrip, tile blake3 corruption detection, OSM tag classification, OA row validation, WoF parent-chain walking, tantivy search/autocomplete/fuzzy/layer/focus, admin PIP ordering, and nearest-K queries.

Security

  • Continuous Trivy scans of the published image and the source tree (every push, every PR, daily cron). Results land in the GitHub Code Scanning tab as the live, version-pinned CVE list.
  • Bundle integrity is anchored by per-tile blake3 hashes in manifest.toml; cairn-build verify recomputes every tile, every spatial blob, and every tantivy segment hash and bails on mismatch.
  • API key auth and per-IP rate limiting are opt-in via env vars; X-Forwarded-For is honored only when the per-connection peer is inside a configured CIDR allowlist.
  • Full threat model, triage policy, and reporting instructions: SECURITY.md.

Roadmap

See ROADMAP.md for deferred phases (libpostal FFI, address interpolation, OSM admin relations, per-tile spatial partitioning, distribution tooling).

License

Dual-licensed: MIT OR Apache-2.0. Pick whichever fits.

References

About

Offline, airgap-ready geocoder written in Rust.

Topics

Resources

License

Apache-2.0, MIT licenses found

Licenses found

Apache-2.0
LICENSE-APACHE
MIT
LICENSE-MIT

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Packages

Contributors

AltStyle によって変換されたページ (->オリジナル) /