Hi folks,
In my case I implemented a view with some include and nested include options.
I measured the timings of the individual steps of the Django request lifecycle for a WebMapService endpoint that can include Layer and Layer.ReferenceSystem objects.
If I retrieve a single WebMapService with 145 included Layers, it takes:
Database lookup | 0.2719s
Serialization | 0.0548s
Django request/response | 0.0144s
API view | 0.0053s
Response rendering | 2.1091s
If I retrieve a single WebMapService with 145 included Layers and 12 Layer.ReferenceSystem objects, it takes:
Database lookup | 0.2872s
Serialization | 0.0620s
Django request/response | 0.0201s
API view | 0.0051s
Response rendering | 4.2001s
As far as I can see, rendering the response is the bottleneck, and it scales non-linearly with the number of included objects.
Digging deeper, it turns out that the nested call of extract_included inside the several for loops is the weak point.
Is there any tweak to optimize the for loops? Maybe a parallelized variant?
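For anyone who wants to reproduce per-stage timings like the ones listed above, here is a minimal sketch of one way to collect them. The `timed` helper and the two stand-in stages are assumptions for illustration only; in a real view the blocks would wrap the actual database lookup, serialization, and rendering calls.

```python
import json
import time
from contextlib import contextmanager

# Accumulated wall-clock seconds per stage label (hypothetical helper state).
timings = {}


@contextmanager
def timed(label):
    """Add the elapsed time of the wrapped block to timings[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = timings.get(label, 0.0) + time.perf_counter() - start


# Stand-in stages; replace with the real serializer and renderer calls.
with timed("Serialization"):
    data = [{"id": i, "type": "Layer"} for i in range(1000)]

with timed("Response rendering"):
    body = json.dumps({"data": data})

for label, seconds in timings.items():
    print(f"{label} | {seconds:.4f}s")
```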
-
Thanks for bringing this up. I took a look at the extract_included function and there are three obvious things I see that can be improved:
- Included resources should not be underscored over and over. This should be done once before extract_included gets called. Also see issue #203 (Reverse the relationship names in includes), where this underscoring does not seem to work properly anyway.
- Not really a big thing, but I do not see why there needs to be an iter in this for loop; looping over the items is enough.
- Any field which is not in the included resources (and there are many) will cause an exception, so there is a lot of exception handling happening. That is probably one reason why it is slow. It would certainly be better to turn this into a conditional.
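To make the first and third points concrete, here is a rough sketch. It is not the library's code: `underscore`, `select_with_exceptions`, and `select_with_condition` are hypothetical, simplified stand-ins that only illustrate the difference between normalizing inside the loop and relying on exceptions versus normalizing once and using a membership check.

```python
import re


def underscore(name):
    """Hypothetical stand-in for an inflection-style underscore helper."""
    with_separators = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name)
    return with_separators.replace("-", "_").lower()


def select_with_exceptions(field_names, included):
    """Re-normalizes on every iteration and skips fields via exceptions."""
    selected = []
    for field_name in field_names:
        normalized = [underscore(value) for value in included]
        try:
            normalized.remove(field_name)  # raises ValueError for most fields
        except ValueError:
            continue
        selected.append(field_name)
    return selected


def select_with_condition(field_names, included):
    """Normalizes once up front and uses a plain set membership check."""
    normalized = {underscore(value) for value in included}
    return [name for name in field_names if name in normalized]


fields = ["name", "title", "layers", "reference_systems", "abstract"]
requested = ["layers", "referenceSystems"]
print(select_with_condition(fields, requested))
```

Both variants select the same fields; whether the second one is measurably faster in the real function is exactly what a performance test would need to show.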
In any case, before we do performance improvements, we need a test which shows that extract_included is slow (it could be added to test_performance). The test should only call extract_included and should include a large number of nested includes etc. The test can be skipped by default but can then be used to verify whether changes have a positive effect or not.
So if anyone is interested to dive into it, a PR with a performance test would be a very good first step.
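As a starting point, such an opt-in test could look roughly like the sketch below. Everything in it is an assumption for illustration: `collect_included` is a hypothetical stand-in for the real call under test, `build_nested_payload` generates synthetic data, and the `RUN_PERF_TESTS` environment variable is just one possible way to keep the test skipped by default.

```python
import os
import time
import unittest


def build_nested_payload(parents, children_per_parent):
    """Build synthetic nested data as a stand-in for serialized resources."""
    return [
        {"id": p, "children": [{"id": f"{p}-{c}"} for c in range(children_per_parent)]}
        for p in range(parents)
    ]


def collect_included(payload):
    """Hypothetical stand-in for the function under test."""
    included = []
    for parent in payload:
        for child in parent["children"]:
            included.append({"type": "child", "id": child["id"]})
    return included


def best_of(func, *args, repeat=3):
    """Return the fastest wall-clock time of several runs, in seconds."""
    results = []
    for _ in range(repeat):
        start = time.perf_counter()
        func(*args)
        results.append(time.perf_counter() - start)
    return min(results)


@unittest.skipUnless(os.environ.get("RUN_PERF_TESTS"), "performance test, opt-in only")
class IncludedPerformanceTest(unittest.TestCase):
    def test_many_nested_includes(self):
        payload = build_nested_payload(parents=100, children_per_parent=100)
        elapsed = best_of(collect_included, payload)
        # Report the timing rather than asserting a hardware-dependent limit.
        print(f"collect_included: {elapsed:.4f}s")
        self.assertEqual(len(collect_included(payload)), 10_000)
```

Reporting the timing instead of asserting a threshold keeps the test stable across machines while still allowing a before/after comparison.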
-
I implemented a high-level test which calls a view with the include option; the view responds with 10000 included items.
I tested your three points, but none of them improves the performance noticeably.
I also tried to parallelize the nested for loop, but this increased the rendering time by a factor of 2.
It is interesting to see that, as you pointed out, the number of fields increases the runtime a lot.
Because there are no recursive calls of extract_included, this is not something to dig deeper into for now.
For now I don't know how to optimize the list serializer looping.
-
I think to dive deeper into this issue, it would be good to get some in-depth profiling. You could use the pytest-profiling plugin to create an SVG graph and post it here for discussion.
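For a first look without any plugin, the standard library's cProfile can also produce a text report. The sketch below is only an illustration: `render_included` is a hypothetical stand-in for the slow rendering step and `profile_call` is a helper name made up for this example.

```python
import cProfile
import io
import pstats


def render_included(items):
    """Hypothetical stand-in for the slow rendering step."""
    return [{"type": "layer", "id": str(item)} for item in items]


def profile_call(func, *args, limit=10):
    """Profile a single call and return the report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


report = profile_call(render_included, range(10_000))
print(report)
```

The cumulative-time column shows which calls the rendering time actually goes into, which should make it clearer where an optimization would pay off.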