8
\$\begingroup\$

I have folder1 that contains other folders (which may contain other folders etc). This process stops at some point where a folder contains only files, and some of them end with the string "img".

I would like to write to a file for each line:

  • the complete path of the file (if ends with, say "img")

  • the immediate folder that contains it

something like

/path/start/folder1/folder2/.../folderN/file123-img folderN

My Python 3 code is

import os
file_list = []
with open("fileList.txt", "w") as fp:
 for root, folders, files in os.walk('path/start/'):
 if folders == []:
 for fil in files:
 if fil.endswith('img'):
 final = (root.split("/"))[-1]
 fp.write(root + "/"+fil + " " + final +"\n")

I wonder if i could use more pythonic tool (like comprehensions) instead for - if - for - if

asked Sep 29, 2014 at 22:18
\$\endgroup\$

2 Answers 2

10
\$\begingroup\$
  1. You do not use file_list
  2. Emptiness of container (as in PEP8) is usually checked by if not container.
  3. For path operations os.path is preferred. Let's change 'final' to os.path.dirname and building path to os.path.join. Avoid os-specific solutions.
  4. Let's reduce indentation by using continue statement.
  5. IMHO using comprehensions is not preferable here. Pythonic means 'obvious' and 'simple' way, and this is very simple way.
  6. I introduced explanatory variable and tweaked naming a little bit.
  7. Consider separating building file list with writing it to file. Separating responsibilities is usually great idea. If you do that maybe you'll be able to find good place to use comprehensions.

Improved code (without separating) below:

import os
with open("fileList.txt", "w") as fp:
 for root, folders, files in os.walk('path/start/'):
 if folders:
 continue
 for fil in files:
 if not fil.endswith('img'):
 continue
 dir_name = os.path.basename(fil)
 full_path = os.path.join(root, fil)
 fp.write("{0} {1}\n".format(full_path, dir_name))
answered Oct 5, 2014 at 10:49
\$\endgroup\$
1
  • \$\begingroup\$ Thanks, very useful comments. The ununsed file_list is really a noob error. In my defense, I planned to put the information into a variable and then I decided to write them on disk \$\endgroup\$ Commented Nov 2, 2014 at 17:28
6
\$\begingroup\$

@Nsh raised some excellent points. Building on those, here's another alternative implementation, and then some additional remarks:

def find_img_files(basepath):
 for root, folders, files in os.walk(basepath):
 if folders:
 continue
 for filename in files:
 if not filename.endswith('.img'):
 continue
 basename = os.path.basename(root)
 filepath = os.path.join(root, filename)
 yield filepath, basename
with open(output_path, 'w') as fh:
 for filepath, basename in find_img_files(basepath):
 fp.write("{0} {1}\n".format(filepath, basename))

Improvements:

  • The file listing is now independent from the file writing. In fact find_img_files is now a reusable method, which gives you more flexibility:
    • The base directory path is not hardcoded, you can use the function with any path
    • You can do whatever you want with the file list: save it to a file, print to the screen, or store in an array and process later
  • The variable names are even more intuitive

As @Nsh pointed out, os.path.join and os.path.basename helps you perform joining and splitting path components in an OS-independent way. The original code won't work in Windows, this version works everywhere. For future reference, if you ever need, you can find the OS-independent path delimiter in os.path.sep.

answered Oct 7, 2014 at 18:23
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.