Architecture

ABCrimson edited this page Mar 6, 2026 · 6 revisions

modern-xlsx is a hybrid Rust WASM + TypeScript library for reading and writing XLSX files.

Layer Diagram

 ┌──────────────────────────────────────────┐
 │ TypeScript API                           │
 │ Workbook · Worksheet · Cell              │
 │ StyleBuilder · RichTextBuilder           │
 │ Utilities (dates, cell refs, CSV, JSON)  │
 └────────────────┬─────────────────────────┘
                  │ JSON string
 ┌────────────────┴─────────────────────────┐
 │ WASM Bridge (wasm-bindgen)               │
 │ read() · write() · readJson()            │
 └────────────────┬─────────────────────────┘
                  │
 ┌────────────────┴─────────────────────────┐
 │ Rust Core (modern-xlsx-core)             │
 │ ZIP (deflate) · XML (quick-xml SAX)      │
 │ SharedStringTable · Style resolution     │
 │ Content types · Relationships            │
 └──────────────────────────────────────────┘

Data Flow

Reader Path

Uint8Array → WASM read()
 → ZIP decompress (zip crate)
 → Parse XML parts (quick-xml SAX)
 → Build WorkbookData struct
 → Serialize to JSON (serde_json)
 → JSON.parse() in JS
 → Workbook class wraps raw data

Writer Path

Workbook.toBuffer()
 → Serialize to JSON (JSON.stringify)
 → WASM write()
 → Build SST from shared string cells
 → Generate XML parts (quick-xml Writer)
 → ZIP compress (zip crate, deflate)
 → Return Uint8Array
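
The "Build SST from shared string cells" step can be pictured as string interning: each distinct string is stored once and cells refer to it by index, which is how xl/sharedStrings.xml is organized. The following is a minimal sketch using only the standard library. `SharedStrings` and its methods are hypothetical names chosen for illustration, not the actual SharedStringTable API of modern-xlsx-core.

```rust
use std::collections::HashMap;

/// Hypothetical string-interning table (illustration only; the real
/// SharedStringTable in modern-xlsx-core may be structured differently).
#[derive(Default)]
pub struct SharedStrings {
    /// Unique strings in first-seen order, the order they would be written.
    strings: Vec<String>,
    /// Reverse lookup from a string to its index in `strings`.
    index: HashMap<String, usize>,
    /// Number of string cells seen so far, repeats included.
    total_refs: usize,
}

impl SharedStrings {
    /// Returns the index for `s`, adding it to the table on first sight.
    pub fn intern(&mut self, s: &str) -> usize {
        self.total_refs += 1;
        if let Some(&i) = self.index.get(s) {
            return i; // already known: reuse the existing index
        }
        let i = self.strings.len();
        self.strings.push(s.to_owned());
        self.index.insert(s.to_owned(), i);
        i
    }

    /// Number of distinct strings (the `uniqueCount` attribute of the part).
    pub fn unique_count(&self) -> usize {
        self.strings.len()
    }

    /// Number of string cell references (the `count` attribute of the part).
    pub fn total_refs(&self) -> usize {
        self.total_refs
    }

    /// Looks up a string by its shared-string index.
    pub fn get(&self, i: usize) -> Option<&str> {
        self.strings.get(i).map(String::as_str)
    }
}
```

Interning "Name", "Total", "Name" in that order would yield indices 0, 1, 0, with two unique strings and three references in total.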

Why JSON Bridge?

Data crosses the WASM boundary as JSON strings rather than through serde_wasm_bindgen. Benchmarks show the JSON route is 8–13x faster for large workbooks, for three reasons:

  1. serde_json serialization in Rust is heavily optimized (itoa, ryu for numbers)
  2. JSON.parse() is one of the fastest built-in V8/SpiderMonkey operations
  3. A single WASM boundary crossing replaces thousands of individual wasm_bindgen calls
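
To make the third point concrete, here is a rough sketch of the Rust side producing one JSON string for a whole set of cells, so that the caller crosses the boundary once instead of once per cell. It is written by hand with the standard library so that it runs without dependencies; per the data flow above, the library itself uses serde_json. `NumCell`, `cells_to_json`, and the `ref`/`v` field names are assumptions made for this example and do not reflect the actual JSON schema.

```rust
use std::fmt::Write;

/// Hypothetical numeric cell (illustration only; the real CellData type
/// carries more fields).
pub struct NumCell {
    /// Cell reference such as "A1"; assumed to need no JSON escaping.
    pub reference: &'static str,
    pub value: f64,
}

/// Serializes every cell into one JSON array string, so the whole batch can
/// be handed across the WASM boundary in a single call.
pub fn cells_to_json(cells: &[NumCell]) -> String {
    // Rough size estimate to limit reallocation while appending.
    let mut out = String::with_capacity(2 + cells.len() * 32);
    out.push('[');
    for (i, cell) in cells.iter().enumerate() {
        if i > 0 {
            out.push(',');
        }
        // JSON has no NaN or Infinity, so non-finite values become null.
        if cell.value.is_finite() {
            let _ = write!(out, "{{\"ref\":\"{}\",\"v\":{}}}", cell.reference, cell.value);
        } else {
            let _ = write!(out, "{{\"ref\":\"{}\",\"v\":null}}", cell.reference);
        }
    }
    out.push(']');
    out
}
```

On the JavaScript side, the returned string would then go through a single JSON.parse() call.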

Rust Core Modules

| Module | Purpose |
|---|---|
| lib.rs | WorkbookData, SheetData, CellData (shared types) |
| reader.rs | Read orchestrator (ZIP → parse → WorkbookData) |
| writer.rs | Write orchestrator (WorkbookData → XML → ZIP) |
| streaming.rs | Streaming reader/writer for large files |
| ooxml/ | Individual OOXML part parsers |
| number_format.rs | Excel number format code parser |
| errors.rs | ModernXlsxError error type |
| validate.rs | OOXML validation and repair engine |
| ole2/ | OLE2 compound document read/write (feature-gated: encryption) |

OOXML Parsers

| Parser | OOXML Part |
|---|---|
| shared_strings.rs | xl/sharedStrings.xml |
| styles.rs | xl/styles.xml |
| worksheet.rs | xl/worksheets/sheet*.xml |
| workbook.rs | xl/workbook.xml |
| relationships.rs | *.rels files |
| content_types.rs | [Content_Types].xml |
| doc_props.rs | docProps/core.xml, docProps/app.xml |
| comments.rs | xl/comments*.xml |
| theme.rs | xl/theme/theme1.xml |
| calc_chain.rs | xl/calcChain.xml |
| validate.rs | Validation rules and auto-repair |
| pivot_table.rs | xl/pivotTables/, xl/pivotCache/ |
| threaded_comments.rs | xl/threadedComments/, xl/persons/ |
| slicers.rs | xl/slicers/, xl/slicerCaches/ |
| timelines.rs | xl/timelines/, xl/timelineCache/ |

Performance Patterns

  • SAX parsing — quick-xml streams events, never builds a DOM tree
  • Vec::with_capacity() — pre-allocated parse buffers on all XML parsers
  • push_entity() — zero-allocation XML entity resolution (writes into caller's buffer)
  • from_utf8().unwrap_or_default() — avoids lossy conversion + allocation
  • entries.remove() — moves data out of HashMap instead of cloning
  • drain() — moves preserved entries instead of cloning large blobs
  • Binary search insertion — rows inserted in sorted order
  • itoa::Buffer — zero-allocation integer formatting in XML
  • cold_path() — Rust 1.95 compiler hint on all error branches for optimal icache layout
  • #[inline] — on hot-path streaming JSON helpers and chart enum methods
  • .find() iterators — single-attribute XML parsing avoids full loop iteration
  • zip() iteration — parallel vector attachment without bounds checks
  • Buffer pre-allocation — XML writer String::with_capacity() based on data size estimates
  • Feature gating: #[cfg(feature = "encryption")] makes crypto deps optional for a smaller WASM binary
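
Three of the patterns above (binary search insertion, entries.remove(), and from_utf8().unwrap_or_default()) can be sketched with the standard library alone. The `Row` type and the function names are hypothetical and chosen for this example. Replacing an existing row that has the same index is also an assumption, not documented behavior.

```rust
use std::collections::HashMap;

/// Hypothetical row type keyed by its row number (illustration only).
#[derive(Debug, PartialEq)]
pub struct Row {
    pub index: u32,
    pub cells: Vec<String>,
}

/// Binary search insertion: keeps `rows` sorted by `index` without a
/// separate sort pass. A row with the same index is replaced (an assumption).
pub fn insert_row_sorted(rows: &mut Vec<Row>, row: Row) {
    match rows.binary_search_by_key(&row.index, |r| r.index) {
        Ok(pos) => rows[pos] = row,
        Err(pos) => rows.insert(pos, row),
    }
}

/// entries.remove(): takes ownership of an archive entry's bytes, so the
/// buffer is moved out of the map instead of being cloned.
pub fn take_entry(entries: &mut HashMap<String, Vec<u8>>, path: &str) -> Option<Vec<u8>> {
    entries.remove(path)
}

/// from_utf8().unwrap_or_default(): borrows valid UTF-8 in place and falls
/// back to "" on invalid bytes, without allocating in either case.
pub fn attr_value(bytes: &[u8]) -> &str {
    std::str::from_utf8(bytes).unwrap_or_default()
}
```

After `take_entry` returns, the map no longer holds the entry, so a large part such as a worksheet is never held in memory twice.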

Bundle Size

| Component | Size |
|---|---|
| WASM binary | ~939 KB |
| JS wrapper | ~55 KB |
| Total | ~994 KB |

Browser Build Pipeline

The TypeScript package produces three build outputs via tsdown (rolldown):

src/index.ts → dist/index.mjs (ESM, 55 KB)
src/browser-entry.ts → dist/modern-xlsx.min.js (IIFE, 29 KB, minified)
src/worker.ts → dist/modern-xlsx.worker.js (ESM, 6 KB, minified)

IIFE Bundle

The IIFE bundle exposes the full API on window.ModernXlsx. It inlines all TypeScript source but keeps the WASM binary external. The detectWasmUrl() function auto-resolves the WASM path from document.currentScript.src.

Web Worker

The worker entry point runs in a DedicatedWorkerGlobalScope. It:

  1. Receives messages with {type: 'read'|'write', data?, json?}
  2. Auto-initializes WASM on first use
  3. Returns results with transferable ArrayBuffers (zero-copy)

Source Maps

All outputs include source maps (.js.map) for debugging in browser DevTools.

Module Splits (v0.9.1)

Large source files were split into focused submodules for maintainability:

Rust: worksheet.rs → worksheet/

The 7,173-line worksheet.rs was split into four files:

| File | Responsibility |
|---|---|
| worksheet/mod.rs | Types, re-exports |
| worksheet/parser.rs | SAX XML parsing (xl/worksheets/sheet*.xml) |
| worksheet/writer.rs | XML generation |
| worksheet/json.rs | Streaming JSON serialization for WASM bridge |

Rust: charts.rs → charts/

The 4,862-line charts.rs was split into four files:

| File | Responsibility |
|---|---|
| charts/mod.rs | Re-exports, chart resolution logic |
| charts/types.rs | ChartData, enums, serde types |
| charts/parser.rs | SAX XML parsing (xl/charts/chart*.xml) |
| charts/writer.rs | XML generation (ChartData::to_xml()) |

TypeScript: barcode.ts → barcode/

The 1,828-line barcode.ts was split into 11 tree-shakeable modules — one per barcode codec (Code128, EAN-13, QR, etc.) plus a shared utilities module and barrel index.ts.

Performance Patterns (v0.9.1)

  • ryu crate — 2-6x faster f64-to-string formatting in hot paths (worksheet JSON, chart values)
  • itoa::Buffer: the make_rid() helper eliminates 21 format!("rId{}", n) heap allocations
  • Cow<'static, str> — Relationship fields use borrowed static strings for well-known constants (e.g., OOXML namespace URIs), avoiding allocation entirely
  • Byte-level JSON escaping — batch memcpy of safe spans instead of char-by-char iteration
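
The last two patterns can be sketched as follows, using only the standard library. `escape_json_into`, `Relationship`, and its constructors are hypothetical names chosen for this example. The real implementation may differ in which bytes it treats as safe and in how the relationship type is laid out.

```rust
use std::borrow::Cow;
use std::fmt::Write;

/// Appends `input` to `out` as JSON string content (without surrounding
/// quotes). Bytes are scanned one at a time, but each run of safe bytes is
/// copied with a single `push_str` rather than pushed char by char.
pub fn escape_json_into(out: &mut String, input: &str) {
    let mut start = 0; // start of the current run of bytes needing no escape
    for (i, &b) in input.as_bytes().iter().enumerate() {
        let escape: &str = match b {
            b'"' => "\\\"",
            b'\\' => "\\\\",
            b'\n' => "\\n",
            b'\r' => "\\r",
            b'\t' => "\\t",
            0x00..=0x1f => "", // other control bytes: written as \u00XX below
            _ => continue,     // safe byte, including all non-ASCII UTF-8 bytes
        };
        // Every escaped byte is ASCII, so `start..i` lies on char boundaries.
        out.push_str(&input[start..i]);
        if escape.is_empty() {
            let _ = write!(out, "\\u{:04x}", b);
        } else {
            out.push_str(escape);
        }
        start = i + 1;
    }
    out.push_str(&input[start..]); // trailing safe run
}

/// Relationship type URI for worksheets, as defined by OOXML.
pub const REL_TYPE_WORKSHEET: &str =
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet";

/// Hypothetical relationship record (illustration only).
pub struct Relationship {
    pub id: String,
    pub rel_type: Cow<'static, str>,
    pub target: String,
}

impl Relationship {
    /// Well-known type: `rel_type` borrows the constant, so nothing is allocated for it.
    pub fn worksheet(id: String, target: String) -> Self {
        Relationship { id, rel_type: Cow::Borrowed(REL_TYPE_WORKSHEET), target }
    }

    /// Type string read from a file: the value is owned.
    pub fn custom(id: String, rel_type: String, target: String) -> Self {
        Relationship { id, rel_type: Cow::Owned(rel_type), target }
    }
}
```

Because both variants dereference to `&str`, code that writes the relationship out does not need to know whether the type string was borrowed or owned.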
