### Type of enhancement
runtime performance
### Enhancement description
The files can be processed in parallel for both commands, `check` and `fix`, because the processing of each file is independent of the other files. Furthermore, processing a single file is not too small a unit of work, so the parallelization overhead should pay off. As there are typically very many files, each file should not get its own thread (at the operating system level). Instead, a smart runtime should decide how many threads to use, depending on the current machine.
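A minimal sketch of this idea using only the standard library, not a proposal for the final design: a fixed number of worker threads, sized by `std::thread::available_parallelism`, pull file indices from a shared counter. The names `process_file` and `process_in_parallel` are hypothetical placeholders for the real `check` / `fix` logic.

```rust
use std::num::NonZeroUsize;
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Hypothetical stand-in for the real per-file work of `check` / `fix`.
fn process_file(path: &Path) -> String {
    format!("checked {}", path.display())
}

/// Processes all files on a bounded number of OS threads and returns
/// `(input index, result)` pairs in no particular order.
fn process_in_parallel(files: &[PathBuf]) -> Vec<(usize, String)> {
    // Let the machine decide the worker count; never more workers than files.
    let workers = thread::available_parallelism()
        .map(NonZeroUsize::get)
        .unwrap_or(1)
        .min(files.len().max(1));
    // Shared counter: each worker claims the next unprocessed index.
    let next = AtomicUsize::new(0);

    thread::scope(|scope| {
        let handles: Vec<_> = (0..workers)
            .map(|_| {
                scope.spawn(|| {
                    let mut done = Vec::new();
                    loop {
                        let i = next.fetch_add(1, Ordering::Relaxed);
                        match files.get(i) {
                            Some(path) => done.push((i, process_file(path))),
                            None => break, // no files left
                        }
                    }
                    done
                })
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker panicked"))
            .collect()
    })
}
```

The returned index lets the caller restore the input order afterwards. A crate such as rayon or the parallel walker of `ignore` would replace the hand-written pool.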
To keep the output of the commands deterministic, the order of the output written to `stdout` / `stderr` should be the same as with the current sequential processing. The "main thread" should wait for the incoming results of the file processing and print them in the correct order, caching any result that arrives too early until it is its turn to be printed.
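The reordering on the main thread could look roughly like the sketch below. It assumes that every result is sent over a `std::sync::mpsc` channel together with the index of its file in the sequential order, and that each index `0, 1, 2, ...` is sent exactly once. The function name `write_in_order` is hypothetical.

```rust
use std::collections::BTreeMap;
use std::io::{self, Write};
use std::sync::mpsc::Receiver;

/// Writes results in input order, even if they arrive out of order.
/// Each message is `(input index, output text)`.
fn write_in_order<W: Write>(
    results: Receiver<(usize, String)>,
    out: &mut W,
) -> io::Result<()> {
    // Results that arrived before their turn, keyed by input index.
    let mut pending: BTreeMap<usize, String> = BTreeMap::new();
    // Index of the next result that is allowed to be printed.
    let mut next: usize = 0;

    // The loop ends once all senders have been dropped.
    for (index, text) in results {
        pending.insert(index, text);
        // Flush everything that is now contiguous with the printed prefix.
        while let Some(ready) = pending.remove(&next) {
            writeln!(out, "{}", ready)?;
            next += 1;
        }
    }
    Ok(())
}
```

Passing `&mut io::stdout().lock()` as `out` would print to `stdout`; the cache only grows while an early file is still being processed.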
For configuration, a command line argument may allow setting the number of threads used (at the operating system level). Setting this value to `1` would also disable parallelization.
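As a rough illustration, the thread count could be resolved as sketched below. The function `resolve_threads` and the idea that its input comes from an argument such as `--threads <N>` are assumptions for this example; no such argument exists yet.

```rust
use std::num::NonZeroUsize;
use std::thread;

/// Resolves the number of worker threads from an optional user-supplied
/// value (e.g. the raw string of a hypothetical `--threads <N>` argument).
fn resolve_threads(requested: Option<&str>) -> Result<NonZeroUsize, String> {
    match requested {
        // No value given: let the machine decide, falling back to one thread.
        None => Ok(thread::available_parallelism()
            .unwrap_or(NonZeroUsize::new(1).unwrap())),
        // Value given: must be a positive integer, so `0` is rejected.
        Some(raw) => raw
            .parse::<NonZeroUsize>()
            .map_err(|err| format!("invalid thread count `{}`: {}", raw, err)),
    }
}
```

With this shape, a value of `1` yields a single worker, which is effectively sequential processing.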
Runtime measurements before and after the changes for this issue would be interesting. Some (larger) open-source repositories can be used as test data.
### Resources
- [tokio](https://crates.io/crates/tokio) - Widely used, powerful library providing an asynchronous Rust runtime
- [tokio task](https://docs.rs/tokio/latest/tokio/task/index.html) - processing one file = one task?
- [parallel directory walk](https://docs.rs/ignore/0.4.23/ignore/struct.WalkBuilder.html#method.build_parallel) - Maybe a more straightforward solution for this issue (found during work on #1)
- [rayon](https://crates.io/crates/rayon) - Process iterators in parallel