I am developing a Python script that will process large amounts of data from an experiment. Files are read in, calculations are made, images are plotted and saved as .pngs, and calculated parameters are outputted to a .txt file.
When I save a file, the user is prompted if they want to overwrite any existing files. If yes, a file overwrite is attempted. If the file is locked/open in another program, a number is appended to the filename and then saved. If they do not want to overwrite, a number is appended to the filename.
In order to not have redundant code, I wrote a generic "save" method:
def _handle_file_save(path, save, overwrite, *args):
"""
Handles file collisions when saving files. If "overwrite" is true,
overwrites file, if "overwrite" is false or the file to be overwritten
is locked from editing, appends a number on end of filename to avoid
file collision.
Args:
path (str): the path to the file to be saved
save(): The save function used to save file
overwrite (bool): true if user wants to overwrite file
args: generic tuple of arguments used for different save functions
"""
if os.path.isfile(path):
if overwrite:
try:
print "Saving \"" + path + "\""
save(path, args)
# Signifies problem writing file (lack of permissions, open
# in another program, etc.)
except EnvironmentError:
# save as incremented file number
print "Whoops! Looks like %s is locked!" % path
path = gen_alt_filename(path)
save(path, args)
else:
# save as incremented file number
path = gen_alt_filename(path)
print "Saving as \"" + path + "\" instead"
save(path, args)
else:
print "Saving \"" + path + "\""
save(path, args)
print "\n"
I pass in a generic save function and any arguments that save function needs if any. Some file outputs don't require any arguments.
Flaws:
The client could pass in any arbitrary function as "save" and execute it. Is this a security risk?
The variable
args
gets passed to the save function regardless of whether that method actually takes any args (i.e. args could be empty, but gets passed anyway.Doesn't check if appended filepath is also a locked/open file (could fix this with a while loop I suppose).
Possible improvements:
Use Python's functools.partial
to reduce the function call's signature to one without "args
" if args
is empty.
Alternative:
Instead of passing a generic save function, I could replace "
save(path, args)
" with "new_file = open(path, 'w')
" (i.e. test file write, if successful, return the path to that file and write there).Don't merge code, let every save method handle its own collisions separately.
Nothing about any of this seems elegant. I'd greatly appreciate some advice on this, and how I can improve my coding strategies.
-
3\$\begingroup\$ Welcome to Code Review! I hope you get some great answers! \$\endgroup\$Phrancis– Phrancis2015年07月05日 23:22:26 +00:00Commented Jul 5, 2015 at 23:22
2 Answers 2
Security risk? The client could pass in any arbitrary function as "save" and execute it.
A client can change your code, and it's there fault for not using the function as intended.
variable "args" gets passed to the save function regardless of whether that method actually takes any args
save(path, *args)
This way it will see there is no arguments and not error.
Instead of passing a generic save function, I could replace "save(path, args)" with "new_file = open(path, 'w')"
This is a better idea. This is as things like json.dump
use a file object.
You could make it so that it works in a with statement, and return the file object.
print "Saving \"" + path + "\""
You should use str.format
.
print "Saving \"{}\"".format(path)
You can add the ability of making it work with with
statements with the below. If you want to know about these special functions look here.
class _handle_file_save:
def __init__(self, path):
self.path = path
self.file_handler = open(self.path)
def __enter__(self):
return self.file_handler
def __exit__(self, exc_type, exc_value, traceback):
self.file_handler.close()
Now, we can enter your function without the save parameter. This will allow us to do what you want.
class _handle_file_save:
def __init__(self, path, overwrite, *args):
self.path = path
self.overwrite = overwrite
self.args = args
def _open(self):
if os.path.isfile:
# ... Just copy the rest.
However we now have the problem of
Doesn't check if appended filepath is also a locked/open file
And so we can make it a recursive function!
class _handle_file_save:
def _open_recursive(self, path):
try:
self.file_handler = open(path, *self.args)
print "Saving \"{}\"".format(path)
except EnvironmentError:
self._open_recursive(gen_alt_filename(path))
Lets now get the _open
function to work now.
class _handle_file_save:
def _open(self):
if os.path.isfile(self.path):
if overwrite:
self._open_recursive(self.path)
else:
self._open_recursive(
gen_alt_filename(self.path))
else:
self._open_recursive(self.path)
You may want to change it so there is only one if statement. However you may not want the first else to be recursive.
If we wrap it all up we get:
class _handle_file_save:
def __init__(self, path, overwrite, *args):
self.path = path
self.overwrite = overwrite
self.args = args
self.file_handler = None
def _open(self):
if os.path.isfile(self.path):
if overwrite:
self._open_recursive(self.path)
else:
self._open_recursive(
gen_alt_filename(self.path))
else:
self._open_recursive(self.path)
def _open_recursive(self, path):
try:
self.file_handler = open(path, *self.args)
print "Saving \"{}\"".format(path)
except EnvironmentError:
self._open_recursive(gen_alt_filename(path))
def __enter__(self):
self._open()
if self.file_handler is None:
raise TypeError('self.file_handler never set.')
return self.file_handler
def __exit__(self, exc_type, exc_value, traceback):
self.file_handler.close()
You may want to change the error in __enter__
.
This should work, however I don't know how to generate a locked file. But it seems like you know how to check.
Also it has the added support of allowing you to put all your file checking functions, I'm looking at you gen_alt_filename
, into it to if you wish.
It's more complex, however it would work better then passing a save function as usage is just like open
.
with _handle_file_save('file', False, 'rb') as file_:
# Do something with `file_`
-
1\$\begingroup\$ woah this is awesome! Thanks for the quick response. One of the main reasons I originally tried passing the save function is because I'm using the
plt.savefig(path)
function frommatplotlib
and therefore not explicitly handling any of the file saving myself. Can you imagine a way to incorporate that functionality with your method? Thanks! \$\endgroup\$Colin Summers– Colin Summers2015年07月06日 01:35:01 +00:00Commented Jul 6, 2015 at 1:35 -
\$\begingroup\$ awesome! I can't believe I missed that. I took another look at your code and modifed it some. Because
gen_alt_filename
only checks if a file exists when generating an alternate filename it won't help in theexcept
branch on the_open_recursive
call. I'll repost what I came up with now since it won't fit here.. \$\endgroup\$Colin Summers– Colin Summers2015年07月06日 02:52:19 +00:00Commented Jul 6, 2015 at 2:52
Building off of Joe Wallis' answer. Because gen_alt_filename only checks if a file exists when generating an alternate filename it won't help in the except branch on the _open_recursive call. This SHOULD work, So:
Case 1: Append number to filename BEFORE attempting file save
Case 2: Attempt filesave, if exception thrown, append number and try again
class _handle_file_save:
def __init__(self, path, overwrite, *args):
self.path = path
self.overwrite = overwrite
self.args = args
self.file_handler = None
def _open(self):
if os.path.isfile(self.path):
if self.overwrite:
self._open_recursive(self.path)
else:
n = 1
path = self.path
while os.path.isfile(self.path):
path = self._modify_filename(n)
n += 1
self._open_recursive(path)
else:
self._open_recursive(self.path)
def _open_recursive(self, path, n=1):
try:
self.file_handler = open(path)
print "Saving \"{}\"".format(path)
except EnvironmentError:
path = self._modify_filename(n)
n += 1
self._open_recursive(path, n)
def __enter__(self):
self._open()
if self.file_handler is None:
raise TypeError('self.file_handler never set.')
return self.file_handler
def __exit__(self, exc_type, exc_value, traceback):
self.file_handler.close()
def _modify_filename(self, n):
root, suffix = os.path.splitext(self.path)
path = "{}_({}).{}".format(
root,
n,
suffix
)
return path