command dump

zmworm edited this page Jun 13, 2026 · 6 revisions

dump

Serialize a document into a replayable batch script — the round-trip mechanism for editing a document by emit → modify → replay.

Synopsis

officecli dump <file> <path> [--format batch] [-o <out>] [--json]

Description

Walks the document and emits a JSON BatchItem[] array that, when replayed via officecli batch, reconstructs the source document. Supports .docx and, since v1.0.85, .pptx. For pptx, unsupported elements surface as warnings rather than aborting the dump.
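
A dump can also be driven from a script. The sketch below only assembles the argument list shown in the synopsis and parses stdout as the BatchItem[] array. The helper names are hypothetical, and it assumes that stdout carries nothing but the JSON payload when neither -o nor --json is given.

```python
import json
import subprocess


def build_dump_argv(file, path="/", out=None, as_json=False):
    """Build the argv for `officecli dump`, using only the documented flags."""
    argv = ["officecli", "dump", file, path]
    if out is not None:
        argv += ["-o", out]
    if as_json:
        argv.append("--json")
    return argv


def dump_items(file, path="/"):
    """Run the dump and parse stdout as a list of batch items.

    Assumption: stdout contains only the JSON array in this mode.
    """
    result = subprocess.run(
        build_dump_argv(file, path),
        capture_output=True,
        text=True,
        check=True,  # raise if officecli exits with a non-zero status
    )
    return json.loads(result.stdout)


print(build_dump_argv("report.docx", "/body/p[3]", out="p3.batch.json"))
```

Passing the arguments as a list means no shell is involved, so paths with brackets such as /body/p[3] need no quoting.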

The dump is portable: unstable IDs (paraId / rsidR / textId) and derived effective.* readbacks are filtered out. The OpenXML SDK regenerates these IDs on save, so the emitter omits them rather than trying to preserve them.
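
To make the filtering rule concrete, here is a minimal sketch of the idea, not OfficeCLI's actual implementation. The key names come from the paragraph above. The flat dict shape, the function name, and the sample values are assumptions made for illustration.

```python
# Keys named in the text as unstable; the SDK regenerates them on save.
UNSTABLE_KEYS = {"paraId", "rsidR", "textId"}


def portable_props(props):
    """Return a copy of props without unstable IDs or effective.* readbacks."""
    return {
        key: value
        for key, value in props.items()
        if key not in UNSTABLE_KEYS and not key.startswith("effective.")
    }


# Hypothetical property bag for one paragraph.
sample = {
    "text": "Hello",
    "bold": "true",
    "paraId": "3F2A1B0C",
    "rsidR": "00A1B2C3",
    "effective.font": "Calibri",
}
print(portable_props(sample))
```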

Arguments

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| file | path | Yes | - | Document path: .docx (full coverage including embedded OLE objects, floating/anchored charts, chart userShapes overlays, multi-section header/footer references, data-bound content controls, multi-paragraph SDT inlined-parts, legacy form fields, cross-paragraph field chains in table cells, picture margin-edge relative positions, text-wrapping break clear, footnote/endnote indent overrides) or .pptx (text + tables + pictures + charts + notes + theme/master/layout raw + OLE/3D/video/audio/SmartArt via add-part round-trip + morph/p14/p15 transitions + motion-path animations) |
| path | string | Yes | - | DOM path to dump. / emits the whole document; subtree paths emit just that subtree without bundling sibling resources. Supported: /, /body, /body/p[N], /body/tbl[N], and resource parts /theme, /settings, /numbering, /styles. Subtree emit uses XPath last() predicates so the script is safe to replay onto non-blank documents. |
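
The supported path forms listed for the path argument can be checked before invoking the command. This is a sketch under two assumptions: the list in the table is exhaustive, and N is a 1-based positive integer as in XPath. The function name is hypothetical.

```python
import re

# "/", "/body", "/body/p[N]", "/body/tbl[N]", or one of the resource parts.
SUPPORTED_PATH = re.compile(
    r"/(?:body(?:/(?:p|tbl)\[[1-9]\d*\])?|theme|settings|numbering|styles)?"
)


def is_supported_dump_path(path):
    """Return True if path matches one of the documented dump path forms."""
    return SUPPORTED_PATH.fullmatch(path) is not None


for candidate in ["/", "/body", "/body/p[3]", "/body/tbl[1]", "/numbering", "/header[1]"]:
    print(candidate, is_supported_dump_path(candidate))
```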

Options

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| --format | string | No | batch | Output format. Currently only batch is supported. |
| -o / --out | path | No | - | Write output to a file instead of stdout. On success, the output path is printed to stdout. |
| --json | bool | No | false | Wrap the result in the standard JSON envelope (the batch payload itself is always JSON). |

What's emitted

v1.0.73 hardened the round-trip extensively: bookmarks (cross-paragraph spans), TOC fields with \t/\b switches, page-background color, hyperlink tooltip/tgtFrame/history, eastAsianLayout, paragraph-mark-only run formatting (markRPr.*), tables in headers/footers, columns + vAlign on inline section breaks, fldSimple/oMath inside hyperlinks/ins/del/footnotes, ruby/smartTag/customXml wrappers, cantSplit rows, tcW percent semantics, asymmetric tcMar padding, w:sym runs, noBreakHyphen/softHyphen, soft <w:br/> line breaks, ListItem SDT, MERGEFIELD whitespace quoting, complex-field HYPERLINKs, comment dates, PAGE field, header/footer types from sections, lineRule (atLeast/exact/auto), char-based indents, w14 ligatures/numForm/numSpacing, ins/del track-change attribution.

| Layer | Mechanism |
| --- | --- |
| /styles | Emitted before body so paragraph styleId refs resolve on replay |
| /body paragraphs | Single-run paragraphs collapse into one add p row; multi-run paragraphs split into paragraph + run child rows |
| Tables and mixed body content | Typed add rows |
| Section page layout | set / on the root for page width/height/margins/columns/etc. |
| Inline section breaks | Section breaks inside the body emitted alongside their paragraph |
| docDefaults and document protection | Emitted alongside section layout |
| Headers and footers | Seed paragraph + appended content per-part |
| Comments / footnote refs / endnote refs | Anchored to the body paragraphs they reference |
| Numbering | Emitted wholesale via raw-set when document has list templates |
| Settings part | Emitted wholesale via raw-set |
| Theme part | Emitted wholesale via raw-set |
| Charts | Typed add (chartType + data string), not raw-set |
| Pictures | Inlined as data URIs through the src= prop |
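
The Pictures row means image bytes travel inside the batch JSON itself. As an illustration of what a data URI looks like, the sketch below builds one with base64 encoding in the RFC 2397 form. Whether OfficeCLI uses exactly this encoding or MIME handling is an assumption, and the function name is hypothetical.

```python
import base64


def to_data_uri(payload, mime_type="image/png"):
    """Encode raw bytes as a base64 data URI (RFC 2397 form)."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Two placeholder bytes stand in for real image content.
print(to_data_uri(b"hi"))  # data:image/png;base64,aGk=
```

Base64 grows the payload by roughly a third, so dumps of image-heavy documents can be much larger than the visible text suggests.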

Format keys are forwarded as-is. The add side falls back to OOXML schema reflection and accepts arbitrary props, so emit does not need a per-key allowlist.

Examples

# Whole document to stdout
officecli dump report.docx /
# Write to a batch file
officecli dump report.docx / -o report.batch.json
# Subtree: just one paragraph (quoted so the shell does not glob-expand the brackets)
officecli dump report.docx '/body/p[3]'
# Subtree: a single table or a resource part
officecli dump report.docx '/body/tbl[1]'
officecli dump report.docx /numbering
# Round-trip: dump → batch
officecli dump report.docx / -o /tmp/r.json
officecli create rebuilt.docx --type docx
officecli batch rebuilt.docx --input /tmp/r.json
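
The modify step of emit → modify → replay happens between the dump and batch commands. The sketch below is one possible edit: a plain string substitution across the dumped JSON. It makes no assumption about BatchItem field names and simply walks every string value. The function names and the sample item are hypothetical.

```python
import json


def replace_text(node, old, new):
    """Recursively replace old with new in every string value of a JSON tree."""
    if isinstance(node, str):
        return node.replace(old, new)
    if isinstance(node, list):
        return [replace_text(item, old, new) for item in node]
    if isinstance(node, dict):
        return {key: replace_text(value, old, new) for key, value in node.items()}
    return node  # numbers, booleans, and None pass through unchanged


def rewrite_batch(src_path, dst_path, old, new):
    """Load a dumped batch file, apply the substitution, and write the result."""
    with open(src_path, encoding="utf-8") as fh:
        items = json.load(fh)
    with open(dst_path, "w", encoding="utf-8") as fh:
        json.dump(replace_text(items, old, new), fh, ensure_ascii=False, indent=2)


# Hypothetical item shape, used only to show the substitution.
sample_items = [{"props": {"text": "Draft report for Q1"}, "order": 1}]
print(replace_text(sample_items, "Draft", "Final"))
```

After rewrite_batch("/tmp/r.json", "/tmp/r.edited.json", "Draft", "Final"), the edited file can be replayed with officecli batch rebuilt.docx --input /tmp/r.edited.json. Because the substitution touches every string value, pick search strings that cannot occur in paths, style IDs, or inlined picture data.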

Notes

  • --out - is treated as stdout (not a file literally named -).
  • With --json, the envelope's data carries outputFile + itemCount metadata, not a bare path.
  • TOC PAGEREF page numbers are preserved on round-trip but not recalculated — run refresh afterward to update them.
  • Envelope warnings: auxiliary parts not covered by the dump emitter (e.g. unsupported pptx custom parts, docx custom XML islands) surface as warnings in the JSON envelope. Replay still succeeds; the warning tells you what won't round-trip.
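
A script consuming --json output might pull out the metadata and warnings described in the notes above. In this sketch, data, outputFile, and itemCount are the names given in the notes; the top-level warnings key and the sample values are assumptions, so check them against real output before relying on them.

```python
import json


def summarize_envelope(raw):
    """Extract output file, item count, and warnings from a --json envelope."""
    envelope = json.loads(raw)
    data = envelope.get("data") or {}
    return {
        "output_file": data.get("outputFile"),
        "item_count": data.get("itemCount"),
        # Assumed key name for the envelope warnings.
        "warnings": envelope.get("warnings") or [],
    }


# Fabricated envelope for illustration only.
sample_envelope = json.dumps({
    "data": {"outputFile": "report.batch.json", "itemCount": 42},
    "warnings": ["custom XML island not covered by the dump emitter"],
})
summary = summarize_envelope(sample_envelope)
print(summary["output_file"], summary["item_count"], len(summary["warnings"]))
```

A non-empty warnings list does not mean replay will fail; per the note above, it identifies content that will not round-trip.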

See Also

  • batch — replay the emitted JSON (defaults to continue-on-error)
  • refresh — recalculate TOC / PAGE fields after replay
  • Word reference

Based on OfficeCLI v1.0.97
