Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Improved performance of ContainerFactory::clearOldContainers() #733

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
dktapps wants to merge 5 commits into phpstan:master from pmmp:clear-old-containers-perf
Closed

Improved performance of ContainerFactory::clearOldContainers() #733

dktapps wants to merge 5 commits into phpstan:master from pmmp:clear-old-containers-perf

Conversation

@dktapps
Copy link
Contributor

@dktapps dktapps commented Oct 24, 2021
edited
Loading

This PR makes three changes:

  1. Do not attempt cache cleanup on workers. This saves an enormous amount of time if lots of workers are used (on my machine, 16).
  2. Use FilesystemIterator directly instead of Symfony Finder to find dead cache files. This is because in profile I noticed that RecursiveDirectoryIterator::hasChildren() accounted for almost half of the time spent in this function, while the cache dir definitely shouldn't have children anyway.
  3. Move SplFileInfo->getRealPath() call to the end where it's actually used.

Together, these changes reduce pre-analysis time on my laptop from 5.6sec to 3.3sec with 30k matching files in the nette.configurator cache. Under xdebug, it drops from ~12.7sec to ~8.5sec, with 1.9sec now being spent in here (compared to 6.5sec previously).
Analysis time without a result cache drops from 33sec to 13sec on pmmp/PocketMine-MP thanks to 1).

...iners() on Windows
Use native FilesystemIterator instead of Symfony Finder. This is because FilesystemIterator does not recurse. I observed that over 50% of the time spent in this function was just calling RecursiveDirectoryIterator::hasChildren(). Eliminating this slashes 1000ms off pre-analysis time on my laptop when the tmpdir contains 30k files (5.6sec vs 4.6sec).
this shaves off an additional 1300ms in the same scenario as previous.
Copy link
Contributor Author

dktapps commented Oct 24, 2021
edited
Loading

I just noticed that this function also runs in every single subprocess. This would explain the 1min+ delays I saw with highly populated cache in phpstan/phpstan#5825 (my laptop has 16 logical cores, so this code took 16x the time).

...ulated cache
this change reduces full analysis time of pmmp/PocketMine-MP from 33sec to 13sec (no result cache)
Copy link
Contributor Author

dktapps commented Oct 24, 2021

Final commit produces a massive performance improvement for full analysis (33sec -> 13sec when analysing PocketMine with a 30k file cache on Windows).

staabm, Kocal, and voku reacted with heart emoji staabm and voku reacted with rocket emoji

Copy link
Member

Hi, if the directory is almost empty after 27df5cf then this isn't really needed, right?

Copy link
Contributor Author

dktapps commented Oct 25, 2021

@ondrejmirtes Not strictly true, there's still going to be immense drag for 2 days after updating, until PHPStan figures out that it can get rid of some files. In any case, if the cache gets overpopulated for some other reason, these changes will ensure that the problem doesn't come back again with such crippling impact (it made PHPStan basically unusable for me). While analysedPaths was the culprit for me, I've seen several other users with extremely bloated caches and I wouldn't bet that my issue is the same as theirs.

If you want to go minimal on this, please at the very least take 5b784ad.

Copy link
Contributor

staabm commented Oct 25, 2021
edited
Loading

shouldn't we push it event further, so the worker processes don't do additional IO, which the master process already took care of?

(e.g. all these autoloadFile, scanFiles, stubFiles parameter validations leads to IO which all worker do again and again, without further benefit)

see

thats a lot of file IO, which can be rather slow (especially when done in parallel across several workers concurrently onto the same files)

Copy link
Contributor Author

dktapps commented Oct 25, 2021

Personally I do plan to explore what other stuff can be done to optimise this, but for now I'm taking it one step at a time. Once this problem is addressed it'll be more obvious from profiling what needs to be optimised. So far it's looking like the bulk of the overhead is coming from autoloading once this problem is subtracted.

staabm reacted with thumbs up emoji

Copy link
Contributor Author

dktapps commented Oct 25, 2021

To be clear, I'm only favouring evidence-driven optimisations here (stuff that I can easily observe with a profiler or by eye). Efforts can be better directed towards things that have maximum impact that way.

The next biggest optimisation step I explored (in the context of PocketMine, anyway) was to avoid re-analysing stubs every time PHPStan is run (see #730). But over half of the overhead there comes from regenerating Nette DI containers (at least on Windows), so it's possible the recent analysedPaths change might have reduced the need for that by improving container reusability.

staabm reacted with thumbs up emoji

Copy link
Member

shouldn't we push it event further, so the worker processes don't do additional IO, which the master process already took care of?

@staabm Please make these kinds of decisions only with benchmark backing it up. I think that the impact of those operations is negligible.

Copy link
Member

If you want to go minimal on this, please at the very least take 5b784ad.

Alright, I did that one: 0f6245b

As for the rest - I don't like doing nonsystemic changes like that. If the Symfony Finder is bad from performance perspective, we should do our own implementation with nice API and use it everywhere.

Copy link
Member

Thank you!

Copy link
Contributor Author

dktapps commented Oct 25, 2021

If the Symfony Finder is bad from performance perspective

It's bad in this specific context because it uses RecursiveDirectoryIterator when we know the depth of the directory tree will be 0. I suppose this could be optimised in Symfony itself by special-casing ->depth(0) to use FilesystemIterator instead of RecursiveDirectoryIterator.

Copy link
Contributor

staabm commented Oct 25, 2021
edited
Loading

It's bad in this specific context because it uses RecursiveDirectoryIterator when we know the depth of the directory tree will be 0.

I think we could use a files() on the finder, so it should not recurse

see https://symfony.com/doc/current/components/finder.html#files-or-directories

Copy link
Contributor Author

dktapps commented Oct 25, 2021
edited
Loading

I think we could use a files() on the finder, so it should not recurse

Profile shows it still recurses in this case. (or at least, it's still calling RecursiveDirectoryIterator::hasChildren()).

staabm reacted with confused emoji

Copy link
Contributor Author

dktapps commented Oct 25, 2021

In any case, this looks like a bug/performance issue that should be addressed in Symfony itself. Looking at the code I think it's probably not too difficult (a depth tracker would probably be good enough).

Copy link
Contributor Author

dktapps commented Oct 25, 2021

It looks like this is a bug in PHP itself. Consider the following code, which should be silent:

<?php
require 'vendor/autoload.php';
$iterator = new class(sys_get_temp_dir() . '/phpstan/cache/nette.configurator') extends \RecursiveDirectoryIterator{
	public function hasChildren($allowLinks = false){
		var_dump("called hasChildren");
		return parent::hasChildren($allowLinks);
	}
};
$iterator2 = new RecursiveIteratorIterator($iterator);
$iterator2->setMaxDepth(0);
foreach($iterator2 as $item){
	//var_dump($item);
}

instead outputs string(18) "called hasChildren" many thousands of times.

Copy link
Contributor Author

dktapps commented Oct 25, 2021

Submitted a bug report https://bugs.php.net/bug.php?id=81554.

staabm reacted with heart emoji

Copy link
Contributor

staabm commented Oct 28, 2021
edited
Loading

@dktapps how did you measure/profile worker-prozess startup time?

Copy link
Contributor

staabm commented Oct 28, 2021

I found a way to blackfire profile the worker process by patching the ProcessHelper with

diff --git a/src/Process/ProcessHelper.php b/src/Process/ProcessHelper.php
index 0e16484dd..10b18b478 100644
--- a/src/Process/ProcessHelper.php
+++ b/src/Process/ProcessHelper.php
@@ -27,6 +27,8 @@ class ProcessHelper
 $phpIni = php_ini_loaded_file();
 $phpCmd = $phpIni === false ? escapeshellarg(PHP_BINARY) : sprintf('%s -c %s', escapeshellarg(PHP_BINARY), escapeshellarg($phpIni));
+ $phpCmd = 'blackfire run --ignore-exit-status ' . $phpCmd;
+
 $processCommandArray = [
 $phpCmd,
 ];

that way I get one profile per worker.

Copy link
Contributor Author

dktapps commented Oct 28, 2021

@staabm I used xdebug. It generates a file per PHP process anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Reviewers

No reviews

Assignees

No one assigned

Labels

None yet

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

AltStyle によって変換されたページ (->オリジナル) /