Issue 87940044: context_id for Task

GvR

I'm pretty conflicted. I see a lot of mechanism but I can't quite tell whether ...

11 years, 8 months ago (2014年04月15日 21:31:03 UTC) #2

I'm pretty conflicted. I see a lot of mechanism but I can't quite tell whether
this will actually address the use case.
Maybe one of us should try to add context_ids to a realistic example (e.g.
crawl.py) to see if it actually satisfies the use case.
https://codereview.appspot.com/87940044/diff/1/asyncio/base_events.py
File asyncio/base_events.py (right):
https://codereview.appspot.com/87940044/diff/1/asyncio/base_events.py#newcode242
asyncio/base_events.py:242: return self._context_id
I think should also have a set_context_id() method. Having it set directly from
outside the class feels odd.
https://codereview.appspot.com/87940044/diff/1/asyncio/base_events.py#newcode316
asyncio/base_events.py:316: context_id=None)
Did you mean context_id=context_id?
https://codereview.appspot.com/87940044/diff/1/asyncio/events.py
File asyncio/events.py (right):
https://codereview.appspot.com/87940044/diff/1/asyncio/events.py#newcode33
asyncio/events.py:33: return res
I wonder if the context_id should also be added (if not None). It might be
handy.
https://codereview.appspot.com/87940044/diff/1/asyncio/events.py#newcode40
asyncio/events.py:40: self._loop._context_id = self._context_id
Should be inside the try.
https://codereview.appspot.com/87940044/diff/1/asyncio/futures.py
File asyncio/futures.py (right):
https://codereview.appspot.com/87940044/diff/1/asyncio/futures.py#newcode154
asyncio/futures.py:154: context_id = self._loop.get_context_id()
Hm. Not sure if I'm contradicting myself, but here I'd worry about the extra
call overhead.
https://codereview.appspot.com/87940044/diff/1/asyncio/futures.py#newcode276
asyncio/futures.py:276: self._loop.call_soon(fn, self,
context_id=self._context_id)
My spider-sense is tingling here. I wonder if it isn't reasonable to have
different callbacks on the same Future associated with different context_ids.
https://codereview.appspot.com/87940044/diff/1/asyncio/selector_events.py
File asyncio/selector_events.py (right):
https://codereview.appspot.com/87940044/diff/1/asyncio/selector_events.py#new...
asyncio/selector_events.py:141: handle = events.Handle(callback, args, self,
self._context_id)
Again, my spider-sense is tingling.
Couldn't a reader be associated with multiple context_ids, e.g. when using HTTP
connection pooling and creating a new context_id for each successive request?
https://codereview.appspot.com/87940044/diff/1/tests/test_base_events.py
File tests/test_base_events.py (right):
https://codereview.appspot.com/87940044/diff/1/tests/test_base_events.py#newc...
tests/test_base_events.py:168: None, asyncio.Handle(cb, (), self.loop,
self.loop._context_id),
Add a line break after None for readability (each arg to run_in_executor on a
line by itself).
https://codereview.appspot.com/87940044/diff/1/tests/test_tasks.py
File tests/test_tasks.py (right):
https://codereview.appspot.com/87940044/diff/1/tests/test_tasks.py#newcode1456
tests/test_tasks.py:1456: self.assertEqual(task_context_id, 'spam')
Does this work? I can't prove that bar() will even be called.
In general I'd like to see a test for context_ids that shows their usefulness
(use case), not just tests an edge case of their implementation.

yselivanov

I replied to almost all of the comments and deleted them. I think I need ...

11 years, 8 months ago (2014年04月15日 22:41:57 UTC) #3

I replied to almost all of the comments and deleted them. I think I need more
time for myself to think about this in depth. Will post something soon.

haypo_gmail

https://codereview.appspot.com/87940044/diff/1/asyncio/base_events.py File asyncio/base_events.py (right): https://codereview.appspot.com/87940044/diff/1/asyncio/base_events.py#newcode320 asyncio/base_events.py:320: def run_in_executor(self, executor, callback, *args): No context_id for executors? ...

11 years, 8 months ago (2014年04月16日 04:53:27 UTC) #4

https://codereview.appspot.com/87940044/diff/1/asyncio/base_events.py
File asyncio/base_events.py (right):
https://codereview.appspot.com/87940044/diff/1/asyncio/base_events.py#newcode320
asyncio/base_events.py:320: def run_in_executor(self, executor, callback,
*args):
No context_id for executors? context_id is somehow local to a thread?
https://codereview.appspot.com/87940044/diff/1/asyncio/events.py
File asyncio/events.py (right):
https://codereview.appspot.com/87940044/diff/1/asyncio/events.py#newcode33
asyncio/events.py:33: return res
On 2014年04月15日 21:31:04, GvR wrote:
> I wonder if the context_id should also be added (if not None). It might be
> handy.
Yes, add the context id. If you want a shorter representation, add a __str__()
method.
https://codereview.appspot.com/87940044/diff/1/asyncio/events.py#newcode40
asyncio/events.py:40: self._loop._context_id = self._context_id
On 2014年04月15日 21:31:04, GvR wrote:
> Should be inside the try.
I disagree. Why would it fail?
https://codereview.appspot.com/87940044/diff/1/asyncio/futures.py
File asyncio/futures.py (right):
https://codereview.appspot.com/87940044/diff/1/asyncio/futures.py#newcode154
asyncio/futures.py:154: context_id = self._loop.get_context_id()
On 2014年04月15日 21:31:04, GvR wrote:
> Hm. Not sure if I'm contradicting myself, but here I'd worry about the extra
> call overhead.
Since Future is part of asyncio, for me it's fine to access private attributes
of BaseEventLoop outside BaseEventLoop definition.
https://codereview.appspot.com/87940044/diff/1/asyncio/futures.py#newcode276
asyncio/futures.py:276: self._loop.call_soon(fn, self,
context_id=self._context_id)
On 2014年04月15日 21:31:04, GvR wrote:
> My spider-sense is tingling here. I wonder if it isn't reasonable to have
> different callbacks on the same Future associated with different context_ids.
I see no reason to restrict the context id to the future context id. You can use
something like that:
_NOT_SET = object()
...
def add_done_callback(self, fn, *, context_id=_NOT_SET):
 if context_id is _NOT_SET:
 context_id = self._context_id
 ...
(_NOT_SET is needed to be able to use the None value.)
https://codereview.appspot.com/87940044/diff/1/asyncio/selector_events.py
File asyncio/selector_events.py (right):
https://codereview.appspot.com/87940044/diff/1/asyncio/selector_events.py#new...
asyncio/selector_events.py:141: handle = events.Handle(callback, args, self,
self._context_id)
On 2014年04月15日 21:31:04, GvR wrote:
> Again, my spider-sense is tingling.
> 
> Couldn't a reader be associated with multiple context_ids, e.g. when using
HTTP
> connection pooling and creating a new context_id for each successive request?
Agreed, it's important to be allowed to specify a different context id here.
https://codereview.appspot.com/87940044/diff/1/asyncio/selector_events.py#new...
asyncio/selector_events.py:180: handle = events.Handle(callback, args, self,
self._context_id)
Ditto.
https://codereview.appspot.com/87940044/diff/1/tests/test_base_events.py
File tests/test_base_events.py (right):
https://codereview.appspot.com/87940044/diff/1/tests/test_base_events.py#newc...
tests/test_base_events.py:56: h = asyncio.Handle(lambda: False, (), self.loop,
self.loop._context_id)
You may just pass None for the context id, since the test doesn't check the
context id and the loop context id is None.
https://codereview.appspot.com/87940044/diff/1/tests/test_tasks.py
File tests/test_tasks.py (right):
https://codereview.appspot.com/87940044/diff/1/tests/test_tasks.py#newcode1432
tests/test_tasks.py:1432: def test_task_context_id(self):
You need similar tests for all call*() functions.
https://codereview.appspot.com/87940044/diff/1/tests/test_tasks.py#newcode1443
tests/test_tasks.py:1443: def foo():
Usually, I don't like calling self.assert...() in a coroutine, but here I would
prefer the context id almost everywhere and specify a different context id for
each callback and each task.
You need probably two tests (two test functions): one test checking that it's
possible to specify to specify a different context id everywhere (Future, Task,
coroutine, calls, etc.), one test checking that the "current context id" is
correctly passed to child tasks/calls/etc.
https://codereview.appspot.com/87940044/diff/1/tests/test_tasks.py#newcode1446
tests/test_tasks.py:1446: yield from fut
I expect a test here to make sure that the context id is restored after a
yield-from to its previous value.
https://codereview.appspot.com/87940044/diff/1/tests/test_tasks.py#newcode1451
tests/test_tasks.py:1451: task_context_id = self.loop.get_context_id()
You need probably two tests (two test functions): one test checking that it's
possible to specify to specify a different context id everywhere (Future, Task,
coroutine, calls, etc.), one test checking that the "current context id" is
correctly passed to child tasks/calls/etc.

GvR

https://codereview.appspot.com/87940044/diff/1/asyncio/events.py File asyncio/events.py (right): https://codereview.appspot.com/87940044/diff/1/asyncio/events.py#newcode40 asyncio/events.py:40: self._loop._context_id = self._context_id On 2014年04月16日 04:53:27, haypo_gmail wrote: > ...

11 years, 8 months ago (2014年04月16日 14:47:11 UTC) #5

https://codereview.appspot.com/87940044/diff/1/asyncio/events.py
File asyncio/events.py (right):
https://codereview.appspot.com/87940044/diff/1/asyncio/events.py#newcode40
asyncio/events.py:40: self._loop._context_id = self._context_id
On 2014年04月16日 04:53:27, haypo_gmail wrote:
> On 2014年04月15日 21:31:04, GvR wrote:
> > Should be inside the try.
> 
> I disagree. Why would it fail?
That's not my point. The general pattern is
<save state>
try:
 <mutate state>
finally:
 <restore state>
It's generally important that all of <mutate state> is inside the try block,
both from a correctness POV and to remind the reader what is going on.
I've seen too many people put something that *can* fail just before the try
because they're unclear on the pattern.
https://codereview.appspot.com/87940044/diff/1/asyncio/futures.py
File asyncio/futures.py (right):
https://codereview.appspot.com/87940044/diff/1/asyncio/futures.py#newcode276
asyncio/futures.py:276: self._loop.call_soon(fn, self,
context_id=self._context_id)
FWIW, marker objects smell too. :-)

yselivanov

So here's the second patch. This new patch is radically different from what I proposed ...

11 years, 8 months ago (2014年04月16日 17:00:55 UTC) #6

So here's the second patch.
This new patch is radically different from what I proposed before (and I think
that Guido had something like this in mind, when we discussed this issue).
The idea is that any high-level application will, hopefully, be layered out as a
combination of coroutines/tasks. Low-level callback APIs are for protocols
implementation. So instead of patching the event loop, we only patch the Task
class.
The main motivation for me to work on this, is to allow developers to better
instrument their code. And this patch makes it possible to have threadlocal-like
objects for higher-level code. And that's usually the only place where you need
such objects. Both for performance monitoring and making some request-related
information always available.
Of course, having context_id preserved for all callbacks in the loop is better,
but it's a much more complex API to test & implement. So this new patch focuses
on one problem: maintaining context_id across the Task objects.
Speaking of realistic examples. I carefully reviewed how we analyze code
performance in our current projects. Turns out, we only profile high-level
operations, like how long was some database request, we don't care about
individual sockets operations (for that we'd need context_id maintained for
callbacks). And for that, making sure that context_id can be set per HTTP
request and is properly maintained for all subsequent Tasks is fine.

GvR

I like this much better. And perhaps it can be even smaller! https://codereview.appspot.com/87940044/diff/20001/asyncio/tasks.py File asyncio/tasks.py ...

11 years, 8 months ago (2014年04月16日 17:46:07 UTC) #7

I like this much better. And perhaps it can be even smaller!
https://codereview.appspot.com/87940044/diff/20001/asyncio/tasks.py
File asyncio/tasks.py (right):
https://codereview.appspot.com/87940044/diff/20001/asyncio/tasks.py#newcode176
asyncio/tasks.py:176: def set_context_id(self, context_id):
I know this was my idea, but now I'm wondering -- perhaps it would be cleaner if
the context ID could only be set when creating a task?
https://codereview.appspot.com/87940044/diff/20001/asyncio/tasks.py#newcode569
asyncio/tasks.py:569: def async(coro_or_future, *, loop=None, context_id=None):
Again, I know this was my suggestion, but I now think I was wrong. It would be
asking for trouble if calling async(X, context_id=Y) set the context ID in some
cases but not in others (and the latter is the case if X is already a Task).
"Solving" this by calling set_context_id() wouldn't be any better either. So
maybe Task() should be the only way to set the context ID.

haypo_gmail

11 years, 8 months ago (2014年04月16日 17:46:11 UTC) #8

https://codereview.appspot.com/87940044/diff/1/asyncio/events.py
File asyncio/events.py (right):
https://codereview.appspot.com/87940044/diff/1/asyncio/events.py#newcode40
asyncio/events.py:40: self._loop._context_id = self._context_id
Ah ok, you only want to move the line "self._loop._context_id =
self._context_id" inside the try block, not the two lines. I agree that it would
be better.
https://codereview.appspot.com/87940044/diff/1/asyncio/futures.py
File asyncio/futures.py (right):
https://codereview.appspot.com/87940044/diff/1/asyncio/futures.py#newcode276
asyncio/futures.py:276: self._loop.call_soon(fn, self,
context_id=self._context_id)
On 2014年04月16日 14:47:12, GvR wrote:
> FWIW, marker objects smell too. :-)
Do you suggest that None means "use the default"?
I have no use case which requires to "clear" a context id. I don't know if it's
useful.

GvR

(Pls ignore, Yuri has moved on. :-) https://codereview.appspot.com/87940044/diff/1/asyncio/futures.py File asyncio/futures.py (right): https://codereview.appspot.com/87940044/diff/1/asyncio/futures.py#newcode276 asyncio/futures.py:276: self._loop.call_soon(fn, self, ...

11 years, 8 months ago (2014年04月16日 17:50:16 UTC) #9

(Pls ignore, Yuri has moved on. :-)
https://codereview.appspot.com/87940044/diff/1/asyncio/futures.py
File asyncio/futures.py (right):
https://codereview.appspot.com/87940044/diff/1/asyncio/futures.py#newcode276
asyncio/futures.py:276: self._loop.call_soon(fn, self,
context_id=self._context_id)
On 2014年04月16日 17:46:11, haypo_gmail wrote:
> On 2014年04月16日 14:47:12, GvR wrote:
> > FWIW, marker objects smell too. :-)
> 
> Do you suggest that None means "use the default"?
Yes, that's what Yuri uses it for everywhere in the patch (and also in his new
patch).
> I have no use case which requires to "clear" a context id. I don't know if
it's
> useful.
Well, it might be useful to explicitly state that a particular task should not
be considered part of the current request (maybe if it's a cache operation for
example). But there are other ways to do that (e.g. generating an explicit new
ID, or using a predefined app-specific constant that means "do not log" or
whatever.

haypo_gmail

On 2014年04月16日 17:00:55, yselivanov wrote: > This new patch is radically different from what I ...

11 years, 8 months ago (2014年04月16日 17:55:04 UTC) #10

On 2014年04月16日 17:00:55, yselivanov wrote:
> This new patch is radically different from what I proposed before (and I think
> that Guido had something like this in mind, when we discussed this issue).
Oh, wow. Now I'm lost. IMO it would help to have an example using the API to see
if the proposed API fits the requirements.
It looks useful to me to pass the context identifier to callbacks.

yselivanov

https://codereview.appspot.com/87940044/diff/20001/asyncio/tasks.py File asyncio/tasks.py (right): https://codereview.appspot.com/87940044/diff/20001/asyncio/tasks.py#newcode176 asyncio/tasks.py:176: def set_context_id(self, context_id): On 2014年04月16日 17:46:07, GvR wrote: > ...

11 years, 8 months ago (2014年04月16日 19:01:06 UTC) #11

https://codereview.appspot.com/87940044/diff/20001/asyncio/tasks.py
File asyncio/tasks.py (right):
https://codereview.appspot.com/87940044/diff/20001/asyncio/tasks.py#newcode176
asyncio/tasks.py:176: def set_context_id(self, context_id):
On 2014年04月16日 17:46:07, GvR wrote:
> I know this was my idea, but now I'm wondering -- perhaps it would be cleaner
if
> the context ID could only be set when creating a task?
I think we don't need this, yes. Changing context_id of a task while it is being
executed can create hard to debug bugs.
https://codereview.appspot.com/87940044/diff/20001/asyncio/tasks.py#newcode569
asyncio/tasks.py:569: def async(coro_or_future, *, loop=None, context_id=None):
On 2014年04月16日 17:46:07, GvR wrote:
> Again, I know this was my suggestion, but I now think I was wrong. It would be
> asking for trouble if calling async(X, context_id=Y) set the context ID in
some
> cases but not in others 
Agree.

yselivanov

New patch, even shorted than before :) Victor, Guido, here's a very primitive example of ...

11 years, 8 months ago (2014年04月16日 19:31:43 UTC) #12

New patch, even shorted than before :)
Victor, Guido, here's a very primitive example of how this API can be used:
https://gist.github.com/1st1/10923339 -- some hypothetical web app that runs on
some hypothetical web framework. Two things I tried to show: tracing
performance & threadlocal-like-dict.
Victor, the above example shows, that if you use coroutines & tasks, you don't
really need to have context_id attached to every callback, as long as you use
your context_id in a higher-level code. Like when you want to profile database
queries, you can do that on the level of the database connection object, where
it's perfectly fine to use coroutines. With this approach, however, you won't be
able to get the current context_id in, say, socket reader/writer callbacks. But
that's something that is rarely needed (or am I wrong?)

yselivanov

On 2014年04月16日 19:31:43, yselivanov wrote: > https://gist.github.com/1st1/10923339 -- some hypothetical web app that runs on ...

11 years, 8 months ago (2014年04月16日 19:36:25 UTC) #13

On 2014年04月16日 19:31:43, yselivanov wrote:
> https://gist.github.com/1st1/10923339 -- some hypothetical web app that runs
on
I've just fixed a couple of inconsistencies in my example.

yselivanov

Guys, any feedback?

11 years, 8 months ago (2014年04月17日 19:58:19 UTC) #14

Guys, any feedback?

GvR

On 2014年04月17日 19:58:19, yselivanov wrote: > Guys, any feedback? This is a very nice small ...

11 years, 8 months ago (2014年04月18日 01:22:18 UTC) #15

On 2014年04月17日 19:58:19, yselivanov wrote:
> Guys, any feedback?
This is a very nice small patch, but it is still a big feature, and I think we
should ask for feedback on the tulip list. IIRC there was a previous proposal
where Glyph objected strenuously, but I suspect the Twisted feature he dislikes
is rather more complex than your nice minimalist design!
The gist example is missing many details, like the implementation of the Local
class -- there isn't a single line in the entire gist that actually calls
get_context_id(). I think it would have to be something like this?
context_id = asyncio.Task.current_task().get_context_id()
I wonder if it makes sense to add a shortcut API so you can write this?
context_id = asyncio.Task.current_context_id()
Then again it wouldn't be as minimalist. :-)
But my request for an example was to also show these idioms, to be sure that we
have thought through most consequences. (I'm not asking for the example to be
working code -- that would probably distract. But it should answer questions
about "how do I do X".)

yselivanov

Sorry guys, I had some urgent family related matters to resolve. And just when I ...

11 years, 8 months ago (2014年05月01日 21:34:40 UTC) #16

Sorry guys, I had some urgent family related matters to resolve. And just when I
thought it's all in the past, I had a bunch of new ones.
Will get back with more examples, better proposals etc in a few days.

RSS Feeds Recent Issues | This issue

This is Rietveld f62528b

Issue 87940044: context_id for Task

Patch Set 1 #

Patch Set 2 : 2nd attempt #

Patch Set 3 : third version #