Introduce Docxodus (a modernized .NET 8.0 fork of Open-XML-PowerTools with better move detection) as an alternative engine alongside XmlPowerToolsEngine.

- Extract BaseEngine class with shared binary extraction and subprocess logic
- XmlPowerToolsEngine and DocxodusEngine are thin subclasses setting 3 constants
- Add Docxodus as a git submodule at docxodus/
- Refactor build_differ.py into reusable build_engine() function (also fixes missing win-arm64 compression)
- Update CI workflow for submodules and .NET SDK
- Add integration tests and parametrized contract tests for both engines
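The base-class split described above can be pictured roughly as follows. This is a minimal sketch, not the code in this PR: the three constant names come from the PR summary, but their values and the `binary_filename()` and `binary_path()` helpers are hypothetical stand-ins for the shared extraction and subprocess logic.

```python
import platform
from pathlib import Path
from typing import Optional


class BaseEngine:
    """Holds the shared logic; subclasses only set the three constants."""

    DIST_DIR_NAME = ""
    BIN_DIR_NAME = ""
    BINARY_BASE_NAME = ""

    def binary_filename(self, system: Optional[str] = None) -> str:
        # Hypothetical helper: pick the executable name for the given OS.
        system = system or platform.system()
        suffix = ".exe" if system == "Windows" else ""
        return f"{self.BINARY_BASE_NAME}{suffix}"

    def binary_path(self, root: Path, system: Optional[str] = None) -> Path:
        # Hypothetical helper: where the extracted binary would live.
        return root / self.BIN_DIR_NAME / self.binary_filename(system)


class XmlPowerToolsEngine(BaseEngine):
    # All three values are assumed for illustration only.
    DIST_DIR_NAME = "dist"
    BIN_DIR_NAME = "bin"
    BINARY_BASE_NAME = "redlines"


class DocxodusEngine(BaseEngine):
    # All three values are assumed for illustration only.
    DIST_DIR_NAME = "dist_docxodus"
    BIN_DIR_NAME = "bin_docxodus"
    BINARY_BASE_NAME = "docxodus"
```

With this shape, adding a third engine would only require another three-line subclass, which is what makes parametrized contract tests over all engines straightforward.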
Rewrite README to prominently feature Docxodus as the recommended comparison engine, with a link back to the Docxodus repo. Reorganize sections around the dual-engine architecture and add a quick example.
Thread WmlComparerSettings options from Python kwargs through CLI flags to the Docxodus C# binary. Supports detail_threshold, case_insensitive, detect_moves, simplify_move_markup, move_similarity_threshold, move_minimum_word_count, detect_format_changes, conflate_spaces, and date_time.

- Extract _build_command() in BaseEngine, override in DocxodusEngine
- Add input validation for thresholds and word count
- Update Docxodus CLI to parse --flags (backward compat with legacy format)
- Rebuild all platform binaries with new flag support
- Add 13 new tests (integration, validation, unit)
- Update README with Comparison Settings section
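The kwargs-to-flags threading might look something like the sketch below. The setting names are taken from the commit message; everything else is an assumption made for illustration: the standalone `build_command()` function, the `--kebab-case=value` flag spelling, the positional argument order, and the validation ranges (thresholds in 0.0 to 1.0, word count at least 1).

```python
from typing import Any, List

# Setting names listed in the commit message.
KNOWN_SETTINGS = {
    "detail_threshold", "case_insensitive", "detect_moves",
    "simplify_move_markup", "move_similarity_threshold",
    "move_minimum_word_count", "detect_format_changes",
    "conflate_spaces", "date_time",
}
THRESHOLD_SETTINGS = {"detail_threshold", "move_similarity_threshold"}


def build_command(binary: str, original: str, modified: str,
                  output: str, **settings: Any) -> List[str]:
    """Translate Python kwargs into CLI flags (hypothetical flag format)."""
    cmd = [binary, original, modified, output]
    for name, value in settings.items():
        if name not in KNOWN_SETTINGS:
            raise TypeError(f"unknown setting: {name}")
        if value is None:
            continue  # unset options fall back to the binary's defaults
        if name in THRESHOLD_SETTINGS and not 0.0 <= value <= 1.0:
            # Assumed range; the real validation bounds are not shown here.
            raise ValueError(f"{name} must be between 0.0 and 1.0")
        if name == "move_minimum_word_count":
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int) or value < 1:
                raise ValueError(f"{name} must be a positive integer")
        flag = "--" + name.replace("_", "-")
        if isinstance(value, bool):
            value = "true" if value else "false"
        cmd.append(f"{flag}={value}")
    return cmd
```

Validating before the subprocess is spawned means a bad threshold surfaces as a Python exception rather than as an opaque failure from the C# binary.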
Run tests across 3 OSes x 3 Python versions on push to main and on pull requests. Includes package build verification.
Add global.json to pin the .NET SDK to 8.0.x, preventing CI runners with .NET 10 pre-installed from using the wrong compiler (which breaks Docxodus due to List<T>.Reverse() vs LINQ Reverse() resolution). Also fix build_differ.py run_command() to raise on non-zero exit codes instead of silently continuing past build failures.
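The run_command() fix amounts to making a failed build step stop the script. A minimal sketch of that behavior, assuming the helper is a thin wrapper over `subprocess.run` (the real function's signature and logging are not shown in this PR text):

```python
import subprocess
import sys
from typing import Sequence


def run_command(cmd: Sequence[str]) -> None:
    """Run a build step and raise if it exits with a non-zero code."""
    print("Running:", " ".join(cmd))
    # check=True raises subprocess.CalledProcessError on a non-zero exit,
    # so a failed build step can no longer be silently skipped.
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # A succeeding step returns normally; a failing one raises.
    run_command([sys.executable, "-c", "pass"])
    try:
        run_command([sys.executable, "-c", "import sys; sys.exit(3)"])
    except subprocess.CalledProcessError as exc:
        print("build step failed with exit code", exc.returncode)
```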
Summary
- Add `DocxodusEngine` alongside `XmlPowerToolsEngine`, wrapping Docxodus, a modernized .NET 8.0 fork of Open-XML-PowerTools with better move detection
- Extract `BaseEngine` class with all shared binary extraction and subprocess logic; both engines are thin 3-line subclasses setting `DIST_DIR_NAME`, `BIN_DIR_NAME`, and `BINARY_BASE_NAME`
- Refactor `build_differ.py` into a reusable `build_engine()` function called for both engines (also fixes a pre-existing bug where `win-arm64` was compiled but never compressed)

Test plan
- `test_openxml_differ.py` passes (backward compatibility)
- `test_docxodus_engine.py` passes (Docxodus integration)
- `test_engine_contract.py` passes (parametrized contract tests over both engines)
- `hatch build` produces a wheel with both sets of binaries