I'm having some trouble understanding the behavior of select.select. Please consider the following Python program:

import select
import sys

def str_to_hex(s):
    # Convert a nibble (0-15) to its uppercase hex digit.
    def dig(n):
        if n > 9:
            return chr(65-10+n)
        else:
            return chr(48+n)
    r = ''
    while len(s) > 0:
        c = s[0]
        s = s[1:]
        a = ord(c) // 16  # high nibble
        b = ord(c) % 16   # low nibble
        r = r + dig(a) + dig(b)
    return r

while True:
    # Block until stdin is reported readable.
    ans,_,_ = select.select([sys.stdin],[],[])
    print ans
    s = ans[0].read(1)
    if len(s) == 0: break  # EOF
    print str_to_hex(s)

I have saved this to a file "test.py". If I invoke it as follows:

echo 'hello' | ./test.py

then I get the expected behavior: select never blocks and all of the data is printed; the program then terminates.

But if I run the program interactively, I get a most undesirable behavior. Please consider the following console session:

$ ./test.py
hello
[<open file '<stdin>', mode 'r' at 0xb742f020>]
68

The program then hangs there; select.select is now blocking again. It is not until I provide more input or close the input stream that the next character (and all of the rest of them) are printed, even though there are already characters waiting! Can anyone explain this behavior to me? I am seeing something similar in a stream tunneling program I have written and it's wrecking the entire affair.

Thanks for reading!

slowdog
asked May 15, 2011 at 21:45
3 Comments
  • Off-topic, but def str_to_hex(s): return ''.join(('%02x' % ord(c) for c in s)) ;-) Commented May 15, 2011 at 23:01
  • @slowdog: How about import binascii; binascii.hexlify(s) instead? Writing your own hex conversion function is silly when an extremely fast one already exists. Commented May 15, 2011 at 23:26
  • @Omnifarious: Oh, cool! I still underestimate the amount of "batteries included". Commented May 15, 2011 at 23:35

2 Answers

9

The read method of sys.stdin works at a higher level of abstraction than select. When you call ans[0].read(1), Python actually reads a larger number of bytes from the operating system and buffers them internally. select is not aware of this extra buffering; it only sees that everything has been read from the file descriptor, and so it blocks until either EOF or more input arrives. You can observe this behaviour by running something like strace -e read,select python yourprogram.py.
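To see the effect without a terminal, here is a minimal sketch that uses a pipe in place of stdin. It assumes a POSIX system (select on pipes) and is written to run on Python 3, unlike the Python 2 code in the question; the helper names readable and demo_buffered are made up for this illustration.

```python
import os
import select


def readable(fd):
    """Return True if select reports fd as readable right now (timeout 0)."""
    ready, _, _ = select.select([fd], [], [], 0)
    return bool(ready)


def demo_buffered():
    """Read one byte through a buffered file object, then ask select."""
    r, w = os.pipe()
    os.write(w, b"hello")            # five bytes are waiting in the pipe
    f = os.fdopen(r, "rb")           # buffered file object, like sys.stdin
    first = f.read(1)                # Python drains the pipe into its own buffer
    still_readable = readable(r)     # the OS-level pipe is now empty
    rest = f.read(4)                 # served from Python's buffer, no OS read needed
    os.close(w)
    f.close()
    return first, still_readable, rest


print(demo_buffered())
```

On a typical Linux system select should report the descriptor as not readable after the first read, even though four more bytes can still be read from the file object. That is the same situation as the hanging interactive session.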

One solution is to replace ans[0].read(1) with os.read(ans[0].fileno(), 1). os.read is a lower level interface without any buffering between it and the operating system, so it's a better match for select.
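As a rough sketch of that approach, again using a pipe as a stand-in for stdin and assuming POSIX and Python 3: the function name drain_available is hypothetical, and it polls with a zero timeout so the example terminates, whereas the question's loop blocks in select.

```python
import os
import select


def drain_available(fd):
    """Read one byte at a time with os.read for as long as select says fd is readable."""
    chunks = []
    while True:
        ready, _, _ = select.select([fd], [], [], 0)  # zero timeout: poll, do not block
        if not ready:
            break                  # nothing more waiting in the OS buffer
        data = os.read(fd, 1)      # unbuffered: takes exactly one byte from the OS
        if not data:
            break                  # empty result means EOF
        chunks.append(data)
    return b"".join(chunks)


r, w = os.pipe()
os.write(w, b"hello")
print(drain_available(r))
os.close(w)
os.close(r)
```

Because os.read only takes the one byte it was asked for, the remaining bytes stay where select can see them, so every byte is picked up without waiting for more input.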

Alternatively, running python with the -u commandline option also seems to disable the extra buffering.
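If you would rather keep a file object than change how the interpreter is started, buffering can also be turned off for a single descriptor. The sketch below is an assumption-laden illustration rather than a drop-in fix: it uses a pipe instead of stdin, targets POSIX and Python 3, and the name read_one_unbuffered is invented here. Passing 0 as the third argument to os.fdopen requests an unbuffered binary file.

```python
import os
import select


def read_one_unbuffered():
    """Read one byte through an unbuffered file object, then ask select."""
    r, w = os.pipe()
    os.write(w, b"hello")
    f = os.fdopen(r, "rb", 0)      # buffering=0: no Python-level read-ahead
    first = f.read(1)              # only one byte is taken from the pipe
    ready, _, _ = select.select([r], [], [], 0)
    os.close(w)
    f.close()
    return first, bool(ready)      # bool(ready): is the rest still visible to select?


print(read_one_unbuffered())
```

With buffering off, the four unread bytes remain in the pipe, so select keeps reporting the descriptor as readable.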

answered May 15, 2011 at 22:28

5 Comments

And this is the answer to the OP's problem.
Though, another way to handle the Python buffering issue is to simply do a no-parameter read, which should read everything currently available.
There are several ways to disable Python's buffering, outlined here: stackoverflow.com/questions/107705/python-output-buffering. It's generally not necessary to disable buffering, and I wouldn't expect it to be preferable in this case (I don't see how it'd be beneficial for simple stream processing). Great answer though.
Thanks! This explains a lot. The no-parameter reading wasn't an option; in the actual use-case, I'm launching a subprocess over SSH and the two scripts are using the subprocess's stdin and stdout to communicate.
Thanks, this answer is very helpful for a problem I found very perplexing! slowdog, unfortunately, unbuffered input doesn't work with "paste". @Omnifarious, a "no-parameter read" is not so simple. I managed to work out the details and posted them here: stackoverflow.com/questions/27750135/…
1

It's waiting for you to signal EOF (you can do this with Ctrl+D when running interactively). You can use sys.stdin.isatty() to check whether the script is being run interactively and handle it accordingly, say by using raw_input instead. I also doubt you need select.select at all; why not just use sys.stdin.read?

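import sys
if sys.stdin.isatty():
    while True:
        try:
            for s in raw_input():  # one line of input, processed per character
                print str_to_hex(s)
        except EOFError:  # raised when the user presses Ctrl+D
            break
else:
    while True:
        s = sys.stdin.read(1)
        if len(s) == 0:  # empty string means EOF
            break
        print str_to_hex(s)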

This makes it appropriate for both interactive use and stream processing.

answered May 15, 2011 at 22:10

4 Comments

This isn't the answer to the OP's problem.
Using sys.stdin.read seems preferable to using select.select to me. Sure you can work around select.select, but it seems like a pretty ugly approach to me.
sys.stdin.read blocks; the point of using select.select is to be able to wait for one of two different input sources.
Ah, I see. From your question that wasn't obvious to me (although I guess it should have been since you were using select.select in the first place).
