\$\begingroup\$

This class is part of a utility that reads lines from a set of nonblocking file descriptors, blocking only when there are no complete lines to emit.

import os

class NonblockingLineBuffer:
    def __init__(self, fd, encoding):
        self.fd = fd
        self.enc = encoding
        self.buf = bytearray()
        self.is_open = True  # cleared by absorb() at EOF
        self.emitted_this_cycle = False

    def absorb(self):
        while True:
            try:
                block = os.read(self.fd, 8192)
            except BlockingIOError:
                return
            if block:
                self.buf.extend(block)
            else:
                self.is_open = False
                # We don't close the file here because caller
                # needs to remove the fd from the poll set first.
                return

    def emit(self):
        def emit1(chunk):
            self.emitted_this_cycle = True
            return (self.fd, chunk.decode(self.enc).rstrip())

        buf = self.buf
        self.emitted_this_cycle = False
        while buf:
            r = buf.find(b'\r')
            n = buf.find(b'\n')
            if r == -1 and n == -1:
                if self.is_open:
                    break  # incomplete line: wait for more data
                yield emit1(buf)
                buf.clear()
            elif r == -1 or (n != -1 and n < r):
                # \n is the first terminator
                yield emit1(buf[:n])
                buf = buf[(n+1):]
            else:
                # \r is the first terminator; swallow a \n right after it
                yield emit1(buf[:r])
                if n == r+1:
                    buf = buf[(r+2):]
                else:
                    buf = buf[(r+1):]
        self.buf = buf
        if not self.is_open:
            self.emitted_this_cycle = True
            yield (self.fd, None)

This question is specifically about emit, which is complicated, confusing, and might not be as efficient as it could be. Please suggest ways to make it less complicated and/or confusing, and more efficient.

(I know it could be much simpler if it didn't need to reimplement universal newline handling, but that is unfortunately a requirement from the larger context.)

(If there's something in the standard library that does some or all of the larger task, that would also be a welcome answer.)
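For context on the larger task, the standard library's selectors module covers the waiting part, though not the line splitting. The following is only a minimal sketch of such a driver loop, under the assumption that the descriptors are pipes on a Unix-like system; drain and read_all are hypothetical names, not part of the utility.

```python
import os
import selectors

def drain(fd, blocksize=8192):
    """Read whatever is currently available from a nonblocking fd.

    Returns (data, at_eof). Hypothetical helper for illustration only.
    """
    chunks = []
    while True:
        try:
            block = os.read(fd, blocksize)
        except BlockingIOError:
            return b"".join(chunks), False  # nothing more right now
        if not block:
            return b"".join(chunks), True   # writer closed its end
        chunks.append(block)

def read_all(fds):
    """Collect the bytes from each fd, sleeping in select() only when idle."""
    sel = selectors.DefaultSelector()
    received = {fd: bytearray() for fd in fds}
    for fd in fds:
        os.set_blocking(fd, False)
        sel.register(fd, selectors.EVENT_READ)
    while sel.get_map():                    # loop until every fd hit EOF
        for key, _events in sel.select():
            data, at_eof = drain(key.fd)
            received[key.fd].extend(data)
            if at_eof:
                sel.unregister(key.fd)      # leave the poll set first...
                os.close(key.fd)            # ...then close the descriptor
    sel.close()
    return received
```

In the real utility, each ready descriptor would be handed to absorb() and then emit() instead of being accumulated into a single bytearray.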

200_success
asked Mar 16, 2015 at 21:27
\$\endgroup\$
  • \$\begingroup\$ Would you post the entire utility that works with multiple file descriptors? I suspect that there's a simpler way to achieve the same effect. \$\endgroup\$ Commented Mar 17, 2015 at 1:36
  • \$\begingroup\$ @200_success I made a bunch of changes since I posted the question, and I also got some good advice from Janne Karila about the narrowly-construed problem, so I've done as you suggest in a new question: codereview.stackexchange.com/questions/84299/… \$\endgroup\$ Commented Mar 17, 2015 at 18:30

1 Answer

\$\begingroup\$
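  • You have overlooked a corner case: you normally treat \r\n as a single separator, but this is not the case when the two bytes are split between blocks. The \r then ends one line, and the \n at the start of the next block is emitted as a spurious empty line.
  • splitlines could handle the universal newlines for you: on bytes and bytearray objects it splits on \n, \r, and \r\n.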
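Both points can be seen in a short experiment. This is only an illustrative sketch; split_chunks is a hypothetical helper that calls splitlines() on each chunk independently, which is what happens when no state is carried between blocks.

```python
def split_chunks(chunks):
    """Call splitlines() on each chunk separately and concatenate the results."""
    lines = []
    for chunk in chunks:
        lines.extend(bytes(line) for line in bytearray(chunk).splitlines())
    return lines

# All three terminators are recognised, and \r\n counts as one.
print(split_chunks([b"alpha\r\nbeta\rgamma\n"]))
# [b'alpha', b'beta', b'gamma']

# The same bytes, but with \r\n split across two blocks: a spurious empty line.
print(split_chunks([b"alpha\r", b"\nbeta\rgamma\n"]))
# [b'alpha', b'', b'beta', b'gamma']

# splitlines() does not mark an unterminated tail as incomplete.
print(split_chunks([b"alpha\nbe"]))
# [b'alpha', b'be']
```

The last case is why the buffer still has to check whether the data ends with a terminator before emitting the final element.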

Here's what I came up with; it's still not very pretty, I'm afraid. Initialize self.carry_cr = False in the constructor.

def emit(self):
    buf = self.buf
    if buf:
        # skip \n if previous buffer ended with \r
        if self.carry_cr and buf.startswith(b'\n'):
            del buf[0]
        self.carry_cr = False
    lines = buf.splitlines()
    if buf:
        if self.is_open and not buf.endswith((b'\r', b'\n')):
            buf = lines.pop()
        else:
            if buf.endswith(b'\r'):
                self.carry_cr = True
            del buf[:]
    self.buf = buf
    self.emitted_this_cycle = False
    if lines:
        self.emitted_this_cycle = True
        for line in lines:
            yield (self.fd, line.decode(self.enc).rstrip())
    if not self.is_open:
        self.emitted_this_cycle = True
        yield (self.fd, None)
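To check the carry logic without any file descriptors, the same idea can be exercised by feeding chunks by hand. This is a stripped-down sketch rather than the class above: CarrySplitter and its feed method are hypothetical names, at_eof stands in for not self.is_open, and decoding is left out.

```python
class CarrySplitter:
    """Hypothetical test harness for the carry_cr technique."""

    def __init__(self):
        self.buf = bytearray()
        self.carry_cr = False

    def feed(self, chunk, at_eof=False):
        buf = self.buf
        buf.extend(chunk)
        if buf:
            # Skip \n if the previous buffer ended with \r.
            if self.carry_cr and buf.startswith(b"\n"):
                del buf[0]
            self.carry_cr = False
        lines = buf.splitlines()
        if buf:
            if not at_eof and not buf.endswith((b"\r", b"\n")):
                buf = lines.pop()             # keep the incomplete tail
            else:
                self.carry_cr = buf.endswith(b"\r")
                buf = bytearray()
        self.buf = buf
        return [bytes(line) for line in lines]


splitter = CarrySplitter()
print(splitter.feed(b"alpha\r"))              # [b'alpha']
print(splitter.feed(b"\nbe"))                 # []
print(splitter.feed(b"ta\rgam"))              # [b'beta']
print(splitter.feed(b"ma", at_eof=True))      # [b'gamma']
```

The second call shows the fix at work: the leading \n is dropped instead of producing an empty line, and the unterminated "be" is held back until its terminator arrives.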
answered Mar 17, 2015 at 12:01
\$\endgroup\$
  • \$\begingroup\$ This is enough of an improvement over what I had that I'm going to accept this and ask a new question showing more of the context (as suggested by 200_success). Thank you. \$\endgroup\$ Commented Mar 17, 2015 at 18:15
