3
\$\begingroup\$

It's pretty typical for Linux command-line utilities that deal with files to accept either a file as input, or stdin. It's also pretty common for them to be able to output to a file or to stdout.

These should be supported:

  • python myprogram.py input output
  • cat input | python myprogram.py > output
  • python myprogram.py input > output

Additionally, I don't like editing files in-place. I'd rather make a tmpfile and then only copy over the tmpfile if the operation is a success, but I'd also rather not deal with tmpfiles every time I write something that has an output file. So both

  • python myprogram.py --in-place input
  • python myprogram.py filename filename (i.e., input is the same as output)

should use a tmpfile.
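
For reference, the invocation forms listed above could be parsed with something like the following. This is only a sketch: the helper names build_parser and parse_args are made up for illustration, and it assumes that None for input or output means "use stdin" or "use stdout".

```python
import argparse


def build_parser():
    """Hypothetical parser for the invocation forms listed above."""
    parser = argparse.ArgumentParser(prog="myprogram.py")
    # Both positionals are optional; None means "use stdin" / "use stdout".
    parser.add_argument("input", nargs="?", default=None)
    parser.add_argument("output", nargs="?", default=None)
    parser.add_argument("--in-place", action="store_true", dest="inplace")
    return parser


def parse_args(argv=None):
    """Parse argv (defaults to sys.argv[1:]) and reject contradictory options."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.inplace and args.input is None:
        parser.error("--in-place needs an input file")
    if args.inplace and args.output is not None:
        parser.error("--in-place cannot be combined with an output file")
    return args
```

The resulting args.input, args.output and args.inplace are the values that would be handed to the context managers.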

I wrote this pair of context managers to make this type of interface a little easier to write. They're meant to be used something like this:

with infile(infile_filename) as f:
    for line in f:
        pass  # do some stuff

# Get values for these arguments from argparse.
with outfile(outfile_filename, infile_name=infile_filename, inplace=inplace) as f:
    f.write('blah')

I would love to know what y'all think of it.

#!/usr/bin/python3
import os
import shutil
import sys
import tempfile


class infile(object):
    def __init__(self, file_name=None):
        self.file_name = file_name

    def __enter__(self):
        if self.file_name is None:
            self.f = sys.stdin
        else:
            self.f = open(self.file_name)
        return self.f

    def __exit__(self, etype, value, traceback):
        if self.f is not sys.stdin:
            self.f.close()

    def __getattr__(self, val):
        return getattr(self.f, val)  # pass on other attributes to the underlying filelike object


class outfile(object):
    def __init__(self, file_name=None, *, infile_name=None, inplace=False):
        self.file_name = file_name
        self.infile_name = infile_name
        self.inplace = inplace

    def __enter__(self):
        if self.inplace or (self.file_name and self.file_name == self.infile_name):
            self.f = tempfile.NamedTemporaryFile(mode='w+t', delete=False)
            self.tmppath = self.f.name
        elif self.file_name is None:
            self.f = sys.stdout
            self.tmppath = None
        else:
            self.f = open(self.file_name, 'w')
            self.tmppath = None
        return self.f

    def __exit__(self, etype, value, traceback):
        # If got no errors...
        if etype is None and self.tmppath:
            self.f.flush()
            shutil.copy(self.tmppath, self.infile_name)
        if self.f is not sys.stdout:
            self.f.close()
        if self.tmppath:
            os.remove(self.tmppath)

    def __getattr__(self, val):
        return getattr(self.f, val)
asked Jul 28, 2015 at 22:45
\$\endgroup\$

1 Answer

2
\$\begingroup\$

All the files you're dealing with are already context managers, including sys.stdin and sys.stdout. If you delegate to them instead of reinventing the wheel, your infile class becomes a trivial function:

def infile(filename=None):
    if filename is None:
        return sys.stdin
    return open(filename)

outfile isn't quite as easy as that, but we can still simplify it a little. First, the inplace flag completely changes the behaviour of the function (causes it to choose between mostly disjoint codepaths), and would usually be specified as a literal in the source. It makes more sense to have it as a separate function instead. If you do need to decide between them based on user input, you can always write a short wrapper function.

I'll reuse the name outfile for the clobbering version, and use atomic_update for the non-clobbering version. The clobbering version is then really just as simple as infile:

def outfile(filename=None):
    if filename is None:
        return sys.stdout
    return open(filename, 'w')
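
One caveat with these two functions: leaving a with block closes whatever file object it was given, so `with outfile() as f:` closes sys.stdout itself. If that matters, a possible variation (a sketch, not part of the code above) is to wrap the standard streams in contextlib.nullcontext, which needs Python 3.7 or later. The catch is that the return value is then no longer the file itself in the stdin/stdout case, so callers have to go through a with statement.

```python
import contextlib
import sys


def infile(filename=None):
    """Context manager for reading; leaves sys.stdin open on exit."""
    if filename is None:
        # nullcontext hands back its argument on entry and does nothing on exit.
        return contextlib.nullcontext(sys.stdin)
    return open(filename)


def outfile(filename=None):
    """Context manager for writing; leaves sys.stdout open on exit."""
    if filename is None:
        return contextlib.nullcontext(sys.stdout)
    return open(filename, "w")
```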

The updating version really does need to do some extra work after the files get closed, so it does need to be its own context manager. But instead of writing it as a class, it's a little easier to use the stdlib contextlib to write it as a generator function:

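import os
import shutil
import sys
import tempfile
from contextlib import contextmanager


@contextmanager
def atomic_update(filename):
    if filename is None:
        f = sys.stdout
    else:
        # delete=False so the temp file survives being closed by the with block
        f = tempfile.NamedTemporaryFile(mode='w+t', delete=False)
        tmppath = f.name

    with f:
        yield f

    # Only reached if the caller's with block finished without an exception
    if filename is not None:
        shutil.copy(tmppath, filename)
        os.remove(tmppath)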

NB: It would be nice if we could just return sys.stdout like in the other cases, but the contextmanager decorator considers it an error if the generator doesn't yield exactly once.

As it is, this is a little cumbersome to use effectively: you need to open the file for input separately (the tempfile is returned empty), and then it's probably a good idea to be careful to close them in the right order. It also loses, e.g., file permissions, which isn't ideal. You might want to copy the original file (and metadata) into the temp file, and then shutil.move it back when you're done.
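
One way to address the permissions point is sketched below. It is a variation, not the version above: it assumes the temp file can be created in the target's own directory, it copies only the permission bits (shutil.copymode) rather than all metadata, and it swaps the file in with os.replace instead of shutil.move. It also drops the stdout case to keep the example short.

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def atomic_update(filename):
    """Write to a temp file, then swap it over `filename` on success."""
    directory = os.path.dirname(os.path.abspath(filename))
    # Same directory as the target, so the final rename stays on one filesystem.
    tmp = tempfile.NamedTemporaryFile(mode="w", dir=directory, delete=False)
    try:
        with tmp:
            yield tmp
        if os.path.exists(filename):
            # Carry the permission bits over from the original file.
            shutil.copymode(filename, tmp.name)
        os.replace(tmp.name, filename)
    except BaseException:
        # Leave the original untouched and clean up the temp file.
        with contextlib.suppress(FileNotFoundError):
            os.remove(tmp.name)
        raise
```

If the body of the with block raises, the original file is left as it was and the temp file is removed.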

answered Jul 29, 2015 at 0:35
\$\endgroup\$
4
  • \$\begingroup\$ Well, that's a heck of a lot simpler than my code, much thanks :P \$\endgroup\$ Commented Jul 29, 2015 at 20:52
  • 1
    \$\begingroup\$ The thing that I worried about, when writing a patch for argparse.FileType, is how to keep the context from trying to close the stdin/out. One way was to wrap them in a dummy context manager, with a 'do nothing' close. \$\endgroup\$ Commented Jul 30, 2015 at 19:20
  • \$\begingroup\$ Your code does a different thing. The OP's context managers do not close sys.stdin or sys.stdout; your context managers close them. \$\endgroup\$ Commented Jan 26, 2021 at 15:43
  • \$\begingroup\$ @hpaulj Meanwhile I learned more about context managers. I think for an optional context manager it is best to use contextlib.ExitStack: for a file, you call .enter_context() to include the file in the stack of context managers; for stdin/stdout, you do not call it, and the stream will not be handled by the context manager. \$\endgroup\$ Commented Oct 21, 2021 at 12:54
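
The ExitStack idea from the last comment could look roughly like this. It is a minimal sketch; copy_lines is a hypothetical helper, not something from the question or the answer.

```python
import contextlib
import sys


def copy_lines(in_name=None, out_name=None):
    """Hypothetical helper: copy input to output, defaulting to stdin/stdout."""
    with contextlib.ExitStack() as stack:
        # Only real files are registered, so only they are closed on exit.
        src = stack.enter_context(open(in_name)) if in_name else sys.stdin
        dst = stack.enter_context(open(out_name, "w")) if out_name else sys.stdout
        for line in src:
            dst.write(line)
```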
