Speed up with parallelization #15

Open
opened 2025-08-27 18:02:16 +02:00 by BaumiCoder · 0 comments

Type of enhancement

runtime performance

Enhancement description

The files can be processed in parallel for both commands, check and fix. The processing of each file is independent of the other files. Furthermore, processing a single file is not too small a unit of work (i.e., the parallelization overhead should pay off). As there are typically very many files, each file should not get its own thread (at the operating system level). Instead, a smart runtime should decide how many threads to use (depending on the current machine).
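
A minimal sketch of such a bounded worker pool, using only the Rust standard library rather than tokio or rayon. The names `process_file`, `default_threads`, and `process_all` are hypothetical and not part of ecformat; `std::thread::available_parallelism` stands in for the "smart runtime" decision here.

```rust
use std::num::NonZeroUsize;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Hypothetical stand-in for checking or fixing a single file.
fn process_file(path: &str) -> String {
    format!("checked {path}")
}

/// Thread count suggested by the standard library for the current machine
/// (falls back to 1 if it cannot be determined).
fn default_threads() -> usize {
    thread::available_parallelism()
        .map(NonZeroUsize::get)
        .unwrap_or(1)
}

/// Processes `files` on at most `threads` OS threads.
/// The returned results are in the same order as `files`.
fn process_all(files: &[&str], threads: usize) -> Vec<String> {
    // Never spawn more workers than there are files, and always at least one.
    let workers = threads.clamp(1, files.len().max(1));
    // Shared cursor: each worker claims the next unprocessed index.
    let next = AtomicUsize::new(0);
    // One result slot per file, so results end up in input order.
    let slots: Vec<Mutex<Option<String>>> =
        files.iter().map(|_| Mutex::new(None)).collect();

    thread::scope(|scope| {
        for _ in 0..workers {
            scope.spawn(|| loop {
                let i = next.fetch_add(1, Ordering::Relaxed);
                if i >= files.len() {
                    break;
                }
                *slots[i].lock().unwrap() = Some(process_file(files[i]));
            });
        }
    });

    slots
        .into_iter()
        .map(|slot| slot.into_inner().unwrap().expect("every slot is filled"))
        .collect()
}
```

Calling `process_all(&files, default_threads())` would use the machine-dependent thread count, while passing `1` processes the files sequentially on a single worker. A library such as rayon would replace most of this with a parallel iterator.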

To keep the output of the commands deterministic, the order of the output written to stdout / stderr should match the order of the current sequential processing (without the parallelization). The "main thread" should wait for the incoming results of the file processing and print them in the correct order (caching results that arrive too early for later printing).
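
The caching step could look roughly like the following sketch. It assumes that each result is tagged with the index the file would have had in sequential processing; `emit_in_order` is a hypothetical helper name, not existing ecformat code.

```rust
use std::collections::BTreeMap;

/// Emits items in index order (0, 1, 2, ...), regardless of arrival order.
/// Items that arrive too early are cached until all earlier ones were emitted.
fn emit_in_order<T>(
    incoming: impl IntoIterator<Item = (usize, T)>,
    mut emit: impl FnMut(T),
) {
    // Results that arrived before their turn, keyed by their index.
    let mut pending = BTreeMap::new();
    // Index of the next result that is allowed to be emitted.
    let mut next = 0;

    for (index, item) in incoming {
        pending.insert(index, item);
        // Flush every cached result that is now in sequence.
        while let Some(ready) = pending.remove(&next) {
            emit(ready);
            next += 1;
        }
    }
}
```

Because `std::sync::mpsc::Receiver` implements `IntoIterator`, the main thread could pass the receiving end of a channel as `incoming` and print inside `emit`, while the workers send `(index, result)` pairs.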

For configuration, a command line argument may allow setting the number of threads used (at the operating system level). Setting this value to 1 would also allow disabling parallelization.
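
As an illustration only, such an argument could be handled as sketched below. The flag name `--threads` and the function `threads_from_args` are assumptions made for this example; the issue does not fix a name, and the real implementation would likely go through the argument parser ecformat already uses.

```rust
use std::num::NonZeroUsize;

/// Looks for a hypothetical `--threads <N>` argument.
/// Returns `Ok(None)` if the flag is absent (the runtime decides),
/// `Ok(Some(n))` for a valid value, and an error message otherwise.
fn threads_from_args<I, S>(args: I) -> Result<Option<usize>, String>
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    let mut args = args.into_iter();
    while let Some(arg) = args.next() {
        if arg.as_ref() == "--threads" {
            let value = args.next().ok_or("--threads needs a value")?;
            // NonZeroUsize rejects 0, so 1 is the smallest accepted value.
            let n: NonZeroUsize = value.as_ref().parse().map_err(|e| {
                format!("invalid --threads value {:?}: {}", value.as_ref(), e)
            })?;
            return Ok(Some(n.get()));
        }
    }
    Ok(None)
}
```

With this sketch, `--threads 1` yields `Some(1)` and thereby disables parallelization, while a missing flag leaves the choice to the runtime.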

Some runtime measurements before and after the changes for this issue would be interesting. Some (larger) open-source repositories can be used as test data.

Resources

  • tokio (https://crates.io/crates/tokio) - Widely used, powerful library providing an asynchronous Rust runtime
  • tokio task (https://docs.rs/tokio/latest/tokio/task/index.html) - processing one file = one task?
  • parallel directory walk (https://docs.rs/ignore/0.4.23/ignore/struct.WalkBuilder.html#method.build_parallel) - Maybe a more straightforward solution for this issue (found during work on #1)
  • rayon (https://crates.io/crates/rayon) - Process iterators in parallel
Reference
BaumiCoder/ecformat#15