1
\$\begingroup\$

I am using a recursive algorithm to find all of the file paths in a given directory. It returns a dictionary like this: {'Tkinter.py': 'C:\Python27\Lib\lib-tk\Tkinter.py', ...}.

I am using this in a script to open modules given only their name. Currently, the whole process (for everything in sys.path) takes about 9 seconds. To avoid doing this every time, I save the result to a .pkl file and then just load that in my module-opener program.

The original recursive function took too long and sometimes gave me a MemoryError, so I created a helper function that iterates through the subfolders (using os.listdir) and then calls the recursive function on each one.

Here is my code:

import os, os.path
def getDirs(path):
    sub = os.listdir(path)
    paths = {}
    for p in sub:
        print p
        pDir = '{}\{}'.format(path, p)
        if os.path.isdir(pDir):
            paths.update(getAllDirs(pDir, paths))
        else:
            paths[p] = pDir
    return paths
def getAllDirs(mainPath, paths = {}):
    subPaths = os.listdir(mainPath)
    for path in subPaths:
        pathDir = '{}\{}'.format(mainPath, path)
        if os.path.isdir(pathDir):
            paths.update(getAllDirs(pathDir, paths))
        else:
            paths[path] = pathDir
    return paths

Is there any way to make this faster? Thanks!

asked Apr 19, 2013 at 20:52
\$\endgroup\$

1 Answer

1
\$\begingroup\$
import os, os.path
def getDirs(path):

Python convention is to use lowercase_with_underscores for function names

    sub = os.listdir(path)

Don't abbreviate needlessly, and at least make the name plural, since it holds a list of entries.

    paths = {}
    for p in sub:

You don't need to store things in a temporary variable to loop over them

        print p

Do you really want this function printing?

        pDir = '{}\{}'.format(path, p)

Use os.path.join to join paths. That'll make sure it works regardless of your platform.
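As a minimal sketch of that suggestion: the directory and file names below are only illustrative (borrowed from the example in the question), not paths that need to exist.

```python
import os

# Illustrative names only; os.path.join does not touch the filesystem.
directory = os.path.join("Lib", "lib-tk")
full_path = os.path.join(directory, "Tkinter.py")

# The separator inserted is os.sep for the current platform
# ('\\' on Windows, '/' on POSIX systems).
print(full_path)
```

Because the separator comes from the platform, the same code works on Windows and elsewhere without a hard-coded backslash in the format string.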

        if os.path.isdir(pDir):
            paths.update(getAllDirs(pDir, paths))

You shouldn't both pass paths in and update it with the return value afterwards. The callee already fills in the same dictionary, so the update is redundant.

        else:
            paths[p] = pDir
    return paths
def getAllDirs(mainPath, paths = {}):

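Don't use mutable objects as default values. The default is created once, when the function is defined, and is then shared by every call that omits the argument, which leads to unexpected behavior.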
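A small illustration of the problem and the usual workaround. The function names collect_shared and collect_fresh are made up for this example and are not part of the code under review.

```python
def collect_shared(name, paths={}):
    # The default dict is created once, at definition time, and reused
    # by every call that does not pass `paths`.
    paths[name] = True
    return paths


def collect_fresh(name, paths=None):
    # Using None as the default gives each such call its own new dict.
    if paths is None:
        paths = {}
    paths[name] = True
    return paths


first = collect_shared("a.py")
second = collect_shared("b.py")
print(first is second)  # the two calls returned the very same dict

third = collect_fresh("a.py")
fourth = collect_fresh("b.py")
print(third is fourth)  # independent dicts
```

In the question's code this means results from one top-level call to getAllDirs can leak into a later, unrelated call.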

    subPaths = os.listdir(mainPath)
    for path in subPaths:
        pathDir = '{}\{}'.format(mainPath, path)
        if os.path.isdir(pathDir):
            paths.update(getAllDirs(pathDir, paths))
        else:
            paths[path] = pathDir
    return paths

This whole section is repeated from the previous function. You should combine them.

Take a look at the os.walk function. It does most of the work you're doing here, and you could use it to simplify your code.
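One possible shape for an os.walk version, as a sketch rather than a drop-in replacement. The name get_file_paths is hypothetical, and whether it is actually faster on your sys.path is something you would need to measure.

```python
import os


def get_file_paths(root):
    """Map each file name found under `root` to its full path."""
    paths = {}
    # os.walk yields (dirpath, dirnames, filenames) for root and every
    # directory below it, so no explicit recursion is needed.
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in filenames:
            # As in the original code, a file seen later overwrites an
            # earlier file that has the same name.
            paths[filename] = os.path.join(dirpath, filename)
    return paths
```

This replaces both getDirs and getAllDirs with a single function, drops the mutable default argument, and builds paths with os.path.join.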

answered Apr 19, 2013 at 21:06
\$\endgroup\$
