I have folder1
that contains other folders (which may contain other folders etc). This process stops at some point where a folder contains only files, and some of them end with the string "img"
.
I would like to write to a file for each line:
the complete path of the file (if ends with, say
"img"
)the immediate folder that contains it
something like
/path/start/folder1/folder2/.../folderN/file123-img folderN
My Python 3 code is
import os
file_list = []
with open("fileList.txt", "w") as fp:
for root, folders, files in os.walk('path/start/'):
if folders == []:
for fil in files:
if fil.endswith('img'):
final = (root.split("/"))[-1]
fp.write(root + "/"+fil + " " + final +"\n")
I wonder if i could use more pythonic tool (like comprehensions) instead for - if - for - if
2 Answers 2
- You do not use
file_list
- Emptiness of container (as in PEP8) is usually checked by
if not container
. - For path operations
os.path
is preferred. Let's change 'final' toos.path.dirname
and building path toos.path.join
. Avoid os-specific solutions. - Let's reduce indentation by using
continue
statement. - IMHO using comprehensions is not preferable here. Pythonic means 'obvious' and 'simple' way, and this is very simple way.
- I introduced explanatory variable and tweaked naming a little bit.
- Consider separating building file list with writing it to file. Separating responsibilities is usually great idea. If you do that maybe you'll be able to find good place to use comprehensions.
Improved code (without separating) below:
import os
with open("fileList.txt", "w") as fp:
for root, folders, files in os.walk('path/start/'):
if folders:
continue
for fil in files:
if not fil.endswith('img'):
continue
dir_name = os.path.basename(fil)
full_path = os.path.join(root, fil)
fp.write("{0} {1}\n".format(full_path, dir_name))
-
\$\begingroup\$ Thanks, very useful comments. The ununsed file_list is really a noob error. In my defense, I planned to put the information into a variable and then I decided to write them on disk \$\endgroup\$meto– meto2014年11月02日 17:28:34 +00:00Commented Nov 2, 2014 at 17:28
@Nsh raised some excellent points. Building on those, here's another alternative implementation, and then some additional remarks:
def find_img_files(basepath):
for root, folders, files in os.walk(basepath):
if folders:
continue
for filename in files:
if not filename.endswith('.img'):
continue
basename = os.path.basename(root)
filepath = os.path.join(root, filename)
yield filepath, basename
with open(output_path, 'w') as fh:
for filepath, basename in find_img_files(basepath):
fp.write("{0} {1}\n".format(filepath, basename))
Improvements:
- The file listing is now independent from the file writing. In fact
find_img_files
is now a reusable method, which gives you more flexibility:- The base directory path is not hardcoded, you can use the function with any path
- You can do whatever you want with the file list: save it to a file, print to the screen, or store in an array and process later
- The variable names are even more intuitive
As @Nsh pointed out, os.path.join
and os.path.basename
helps you perform joining and splitting path components in an OS-independent way. The original code won't work in Windows, this version works everywhere. For future reference, if you ever need, you can find the OS-independent path delimiter in os.path.sep
.