|
|
|
Created:
11 years, 10 months ago by haypo_gmail Modified:
11 years, 10 months ago Visibility:
Public. |
Resolver API
Patch Set 1 #Patch Set 2 : Now with a cache #
Total comments: 18
Total messages: 7
|
haypo_gmail
https://codereview.appspot.com/72270043/diff/20001/asyncio/base_events.py File asyncio/base_events.py (right): https://codereview.appspot.com/72270043/diff/20001/asyncio/base_events.py#newcode210 asyncio/base_events.py:210: self.resolver_cache = ResolverCache(0, 60) Oops, it's (20, 60). I ...
|
11 years, 10 months ago (2014年03月07日 18:02:06 UTC) #1 | |||||||||||||||||||||||||||||||||||||
https://codereview.appspot.com/72270043/diff/20001/asyncio/base_events.py File asyncio/base_events.py (right): https://codereview.appspot.com/72270043/diff/20001/asyncio/base_events.py#new... asyncio/base_events.py:210: self.resolver_cache = ResolverCache(0, 60) Oops, it's (20, 60). I disabled the cache temporary for a test.
Happy to see work in this area, little time to review. One API design question. https://codereview.appspot.com/72270043/diff/20001/asyncio/base_events.py File asyncio/base_events.py (right): https://codereview.appspot.com/72270043/diff/20001/asyncio/base_events.py#new... asyncio/base_events.py:925: new_resolver = resolver_class(self) I don't see why it is so important to construct the instance in set_resolver(). Why not request that the caller pass in an instance?
Great work! Some thoughts: 1. I think we need to decide now if we prefer composition over inheritance. I.e. one way of adding a non-blocking getaddr is to create a new event loop class inherited from the loop implementation you like and overloading the methods. Since I don't think it's the last pluggable thing we want to add to the loop, it sets an important precedent. 2. I want to have an option to disable DNS caching completely. This should be somehow configurable. 3. I want to have tests for set_resolver(), and ResolverCache class. https://codereview.appspot.com/72270043/diff/20001/asyncio/base_events.py File asyncio/base_events.py (left): https://codereview.appspot.com/72270043/diff/20001/asyncio/base_events.py#old... asyncio/base_events.py:309: def getaddrinfo(self, host, port, *, I think we should either have an option to disable DNS cache via a loop method, or by an extra argument here. https://codereview.appspot.com/72270043/diff/20001/asyncio/base_events.py File asyncio/base_events.py (right): https://codereview.appspot.com/72270043/diff/20001/asyncio/base_events.py#new... asyncio/base_events.py:159: self.cache = {} I'd use an OrderedDict or a combination of heap & plain dict here to avoid traversing the whole "cache" dict in '_clear_old_entries'. I know the you're configuring it to have 20 entires at max right now, but other users may set a higher limit (say 10000), and then you're adding some detectable overhead on each getaddr call. https://codereview.appspot.com/72270043/diff/20001/asyncio/base_events.py#new... asyncio/base_events.py:169: remove.append(key) with an OrderedDict you would have here: else: break https://codereview.appspot.com/72270043/diff/20001/asyncio/base_events.py#new... asyncio/base_events.py:184: now = time.monotonic() How about using 'self.loop.time()' here? Should be more consistent (and I guess there would be no harm to have a reference to the active loop in this object) https://codereview.appspot.com/72270043/diff/20001/asyncio/base_events.py#new... asyncio/base_events.py:210: self.resolver_cache = ResolverCache(0, 60) On 2014年03月07日 18:02:07, haypo_gmail wrote: > Oops, it's (20, 60). I disabled the cache temporary for a test. Why just 20? Why not 1000? https://codereview.appspot.com/72270043/diff/20001/asyncio/base_events.py#new... asyncio/base_events.py:411: callback = functools.partial(self._fill_resolver_cache, key) How about using 'lambda' here? https://codereview.appspot.com/72270043/diff/20001/asyncio/base_events.py#new... asyncio/base_events.py:920: def get_resolver(self): Don't like the name. Can we make it a bit more self-descriptive? How about set_dns_resolver? https://codereview.appspot.com/72270043/diff/20001/asyncio/events.py File asyncio/events.py (right): https://codereview.appspot.com/72270043/diff/20001/asyncio/events.py#newcode14 asyncio/events.py:14: import time Why this import?
And some more: 4. I think if you allow users to set their own implementation of getaddr..(), it's also important to let them plug custom cache implementations. I.e. instead of using ResolverCache, I may want to use memcached, so that all my forked workers can benefit from it.
On 2014年03月11日 21:56:05, yselivanov wrote: > 1. I think we need to decide now if we prefer composition over inheritance. I.e. > one way of adding a non-blocking getaddr is to create a new event loop class > inherited from the loop implementation you like and overloading the methods. > Since I don't think it's the last pluggable thing we want to add to the loop, it > sets an important precedent. There's no doubt in my mind that we should *not* use inheritance for pluggable features. The actual class of the event loop may vary per platform, but we may still want to plug in the same functionality. I see a use for creating new event loop classes, but I think it would be when someone wants to use a completely different implementation, e.g. wrap the event loop a platform or framework already has (e.g. the OS X UI event loop). For a DNS plugin, using a subclass would just increases the temptation to use implementation details of the base class.
On 2014年03月11日 22:02:40, GvR wrote: > On 2014年03月11日 21:56:05, yselivanov wrote: > > 1. I think we need to decide now if we prefer composition over inheritance. > I.e. > > one way of adding a non-blocking getaddr is to create a new event loop class > > inherited from the loop implementation you like and overloading the methods. > > Since I don't think it's the last pluggable thing we want to add to the loop, > it > > sets an important precedent. > > There's no doubt in my mind that we should *not* use inheritance for pluggable > features. The actual class of the event loop may vary per platform, but we may > still want to plug in the same functionality. I agree.
https://codereview.appspot.com/72270043/diff/20001/asyncio/base_events.py File asyncio/base_events.py (left): https://codereview.appspot.com/72270043/diff/20001/asyncio/base_events.py#old... asyncio/base_events.py:309: def getaddrinfo(self, host, port, *, On 2014年03月11日 21:56:05, yselivanov wrote: > I think we should either have an option to disable DNS cache via a loop method, > or by an extra argument here. It is configurable. The API is maybe not obvious: loop.resolver_cache.configure(0, 0) Any suggestion of better API? https://codereview.appspot.com/72270043/diff/20001/asyncio/base_events.py File asyncio/base_events.py (right): https://codereview.appspot.com/72270043/diff/20001/asyncio/base_events.py#new... asyncio/base_events.py:159: self.cache = {} On 2014年03月11日 21:56:05, yselivanov wrote: > I'd use an OrderedDict or a combination of heap & plain dict here to avoid > traversing the whole "cache" dict in '_clear_old_entries'. > > I know the you're configuring it to have 20 entires at max right now, but other > users may set a higher limit (say 10000), and then you're adding some detectable > overhead on each getaddr call. Yeah, my implementation is naive. But I'm not sure that it's useful to cache 10,000 entries in a resolver cache. https://codereview.appspot.com/72270043/diff/20001/asyncio/base_events.py#new... asyncio/base_events.py:184: now = time.monotonic() On 2014年03月11日 21:56:05, yselivanov wrote: > How about using 'self.loop.time()' here? > Should be more consistent (and I guess there would be no harm to have a > reference to the active loop in this object) I hesitated to link the resolver cache to the loop. I don't care which exact clock the loop does use, but the cache must use a monotonic clock. I guess that it will be easy to decide when unit tests will be written :-) I prefer to discuss the API before starting to work on unit tests. https://codereview.appspot.com/72270043/diff/20001/asyncio/base_events.py#new... asyncio/base_events.py:210: self.resolver_cache = ResolverCache(0, 60) On 2014年03月11日 21:56:05, yselivanov wrote: > On 2014年03月07日 18:02:07, haypo_gmail wrote: > > Oops, it's (20, 60). I disabled the cache temporary for a test. > > Why just 20? Why not 1000? See the issue, these values come from Firefox. Firefox is portable and has a long experience on various platforms, so I trust its default values :-) Anyway, the cache is configurable. https://codereview.appspot.com/72270043/diff/20001/asyncio/base_events.py#new... asyncio/base_events.py:411: callback = functools.partial(self._fill_resolver_cache, key) On 2014年03月11日 21:56:05, yselivanov wrote: > How about using 'lambda' here? I like functools.partial! https://codereview.appspot.com/72270043/diff/20001/asyncio/base_events.py#new... asyncio/base_events.py:920: def get_resolver(self): On 2014年03月11日 21:56:05, yselivanov wrote: > Don't like the name. Can we make it a bit more self-descriptive? > How about set_dns_resolver? I'm not sure that the resolver is fully specific to DNS. getnameinfo() resolves also the name of a service. By the way, getnameinfo() is specific to TCP and UDP, or can it be used for other socket types like AF_UNIX? https://codereview.appspot.com/72270043/diff/20001/asyncio/base_events.py#new... asyncio/base_events.py:925: new_resolver = resolver_class(self) On 2014年03月07日 20:46:23, GvR wrote: > I don't see why it is so important to construct the instance in set_resolver(). > Why not request that the caller pass in an instance? Agreed. https://codereview.appspot.com/72270043/diff/20001/asyncio/events.py File asyncio/events.py (right): https://codereview.appspot.com/72270043/diff/20001/asyncio/events.py#newcode14 asyncio/events.py:14: import time On 2014年03月11日 21:56:05, yselivanov wrote: > Why this import? I don't know :-)