I'm a bit new to Python and sort of learning on my own. I wrote a small function to help me find the latest file in a directory. Taking a step back, it reads a bit janky and I was curious what steps or what resources I could look into to help me make this more friendly. Should I be returning False? Or 0?
Inside my example/files directory are 3 files which were, for this example, created on the dates specified in the file name:
example/files/randomtext011.201602012.txt
example/files/randomtext011.201602011.txt
example/files/randomtext011.201602013.txt
import os.path
import glob
import datetime

dir = 'example/files'
file_pattern = 'randomtext011.*.txt'

def get_latest_file(file_pattern,path=None):
    if path is None:
        list_of_files = glob.glob('{0}'.format(file_pattern))
        if len(list_of_files)> 0:
            return os.path.split(max(list_of_files, key = os.path.getctime))[1]
    else:
        list_of_files = glob.glob('{0}/{1}'.format(path, file_pattern))
        if len(list_of_files) > 0:
            return os.path.split(max(list_of_files,key=os.path.getctime))[1]
    return False
2 Answers
First of all, I think that your variable names are quite good.
Should I be returning False? Or 0?
I would recommend None, which is the usual way to signal that there is no result.
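To illustrate the point, here is a minimal sketch of how a None return reads at the call site. The helper name latest_or_none and the pattern used are hypothetical, chosen only for this example:

```python
import glob
import os


def latest_or_none(pattern):
    """Return the newest path matching pattern, or None if nothing matches.

    Hypothetical helper, used only to illustrate the return value.
    """
    matches = glob.glob(pattern)
    if not matches:
        return None
    return max(matches, key=os.path.getctime)


# A pattern that is assumed not to match anything in the current directory.
result = latest_or_none('no-such-prefix-*.does-not-exist')
if result is None:
    print('no matching file')
else:
    print('latest:', result)
```

Checking with "is None" keeps the "nothing found" case distinct from any real value, whereas False or 0 could be mistaken for data.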
Don't repeat yourself
As you can see, the two branches of your if/else are very similar.
Instead, you could do:
if path is None:
    fullpath = file_pattern
else:
    fullpath = path + '/' + file_pattern
But joining paths like this is not very Pythonic (and might cause problems on Windows). Instead, fullpath = os.path.join(path, file_pattern) is what you are looking for.
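As a small sketch of what os.path.join does with the values from the question (the variable names are taken from the question; the trailing-slash case is an added illustration):

```python
import os.path

path = 'example/files'
file_pattern = 'randomtext011.*.txt'

# os.path.join inserts the platform's separator between the components.
fullpath = os.path.join(path, file_pattern)
print(fullpath)

# It does not double the separator when the directory already ends with one.
with_trailing_slash = os.path.join('example/files/', file_pattern)
print(with_trailing_slash)
```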
About the arguments
You can take the inspiration from os.path.join even further and change the order of your arguments (removing the branching completely):
def get_latest_file(path, *paths):
    fullpath = os.path.join(path, *paths)
    ...

get_latest_file('example', 'files', 'randomtext011.*.txt')
Use docstrings
And then you might think that the way to call it is not trivial and want to document it: let's use a docstring!
def get_latest_file(path, *paths):
    """Returns the name of the latest (most recent) file
    of the joined path(s)"""
    fullpath = os.path.join(path, *paths)
Miscellaneous
If you use Python 3, you can use glob.iglob instead, which returns an iterator rather than a list.
For os.path.split, I prefer using it like this (instead of the [1] index):
folder, filename = os.path.split(latest_file)
The import datetime is not used.
Instead of if len(list_of_files) > 0: you can simply write if list_of_files:
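One possible way to combine the iglob suggestion with the empty-result check is sketched below. It assumes Python 3.4 or later, because it relies on the default argument of max; the name get_latest_file_lazy and the temporary directory are hypothetical and serve only as a demonstration:

```python
import glob
import os
import tempfile


def get_latest_file_lazy(path, *paths):
    """Sketch: return the latest matching file name without building a list."""
    fullpath = os.path.join(path, *paths)
    # iglob yields matches lazily; default=None covers the "no match" case
    # (the default argument of max requires Python 3.4 or later).
    latest_file = max(glob.iglob(fullpath), key=os.path.getctime, default=None)
    if latest_file is None:
        return None
    folder, filename = os.path.split(latest_file)
    return filename


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    empty_result = get_latest_file_lazy(tmp, '*.txt')
    open(os.path.join(tmp, 'only.txt'), 'w').close()
    single_result = get_latest_file_lazy(tmp, '*.txt')

print(empty_result, single_result)
```

An iterator is always truthy, so the "if list_of_files:" test does not carry over to iglob; the default argument takes its place here.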
Revised code
def get_latest_file(path, *paths):
    """Returns the name of the latest (most recent) file
    of the joined path(s)"""
    fullpath = os.path.join(path, *paths)
    list_of_files = glob.glob(fullpath)  # You may use iglob in Python3
    if not list_of_files:                # I prefer using the negation
        return None                      # because it behaves like a shortcut
    latest_file = max(list_of_files, key=os.path.getctime)
    _, filename = os.path.split(latest_file)
    return filename
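One caveat, offered as a hedged aside rather than as part of the review above: on Unix, os.path.getctime reports the time of the last metadata change rather than the creation time, so the modification time may be closer to what "latest" is meant to express. The sketch below swaps in os.path.getmtime. The name get_latest_file_by_mtime, the temporary directory, and the timestamps set with os.utime are assumptions made only to keep the demonstration deterministic:

```python
import glob
import os
import tempfile


def get_latest_file_by_mtime(path, *paths):
    """Return the name of the most recently modified file
    matching the joined path(s), or None if nothing matches."""
    fullpath = os.path.join(path, *paths)
    list_of_files = glob.glob(fullpath)
    if not list_of_files:
        return None
    latest_file = max(list_of_files, key=os.path.getmtime)
    _, filename = os.path.split(latest_file)
    return filename


with tempfile.TemporaryDirectory() as tmp:
    # File names borrowed from the question, listed oldest first.
    names = ['randomtext011.201602011.txt',
             'randomtext011.201602012.txt',
             'randomtext011.201602013.txt']
    for offset, name in enumerate(names):
        file_path = os.path.join(tmp, name)
        open(file_path, 'w').close()
        # Set explicit (atime, mtime) values so the ordering is deterministic.
        os.utime(file_path, (1455000000 + offset, 1455000000 + offset))
    latest = get_latest_file_by_mtime(tmp, 'randomtext011.*.txt')
    missing = get_latest_file_by_mtime(tmp, 'no-match.*.txt')

print(latest)
print(missing)
```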
Thanks oliverpool! Your response is really helpful and definitely gives me a number of next steps to look into. – pyNovice89, Feb 19, 2016 at 15:52
Suppose the files are all in one particular directory; then simple code like the following is enough.
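import os

path = 'example/files'  # assumed directory; the original snippet left path undefined
files = os.listdir(path)
latest_file = files[0]
for key in files:
    # join directory and name portably, and update the same variable that is printed
    if os.path.getctime(os.path.join(path, key)) > os.path.getctime(os.path.join(path, latest_file)):
        latest_file = key
print(latest_file)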
You have presented an alternative solution, but haven't reviewed the code. Please explain your reasoning (how your solution works and why it is better than the original) so that the author and other readers can learn from your thought process. – Ludisposed, Mar 22, 2019 at 9:59
This code does not accomplish the same thing as the code in the question. The question only checks for the latest file that matches a glob, possibly traversing directory structures along the way. This code only checks the files within a single directory. – Vogel612, Mar 22, 2019 at 10:18