\$\begingroup\$

I have a directory with a lot of files and folders in it. The function I wrote works, but I want to make it faster and more efficient. I was thinking of using multiprocessing or threading.

The code I wrote is as follows:

LIST.delete(0, END)  # this is a list view
usePATH = '/nfzcae/nvh_mdp/Messdatenbank_Powertrain'
fileLevels = []  # code of interest is below
for path, dirs, f in os.walk(usePATH):
    for d in dirs:
        for f in glob.iglob(os.path.join(path, d, '*COMPARE.mat')):
            # alternative test: if 'COMPARE.mat' in f and not 'MIN' in f and not 'MAX' in f
            if 'COMPARE.mat' in f:
                fileLevels.append(f.split('/'))  # Split path string at all '/'
                LIST.insert(END, f)  # Insert path with name string into the listbox
                LIST.update_idletasks()  # Update listbox after each element added
                nr = LIST.size()  # Get current number of files in listbox
                VAR.set(nr)  # Set number of files as a variable for label
                LIST.see(END)  # See last element added
                FILE_LIST.append(f)
            else:
                pass  # Do nothing
LIST.insert(END, 'Search finished')

It is actually written for a GUI button. I want to make this code faster. As far as I know, threading does not work directly for nested for loops, so I am stuck figuring out how to use multithreading for this problem. Any ideas? I have an idea along these lines:

  1. Get the names of all the subdirectories in a list.
  2. Run a parallel pool over that list, with a function that checks each subfolder for file names containing the keyword.

Will this work?

P.S.: The folder has a lot of subdirectories (more than 1000).
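
The two steps listed above can be sketched roughly as follows. This is only an illustration under assumptions: the helper names (`list_subdirectories`, `search_directory`, `parallel_search`) and the `max_workers` value are made up here, a thread pool from `concurrent.futures` stands in for the "parallel pool", and whether it is actually faster than a plain `os.walk` depends on the storage and has to be measured. Files lying directly in the root folder are not searched, same as in the loop above. The workers only return paths; nothing here touches the listbox.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def list_subdirectories(root):
    """Step 1: collect the immediate sub-directories of root."""
    with os.scandir(root) as entries:
        return [entry.path for entry in entries if entry.is_dir()]


def search_directory(directory, suffix='COMPARE.mat'):
    """Step 2 worker: walk one sub-directory and return the matching paths."""
    matches = []
    for path, _dirs, filenames in os.walk(directory):
        for name in filenames:
            if name.endswith(suffix):
                matches.append(os.path.join(path, name))
    return matches


def parallel_search(root, max_workers=8):
    """Search every sub-directory of root in a thread pool and merge the results."""
    subdirectories = list_subdirectories(root)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields one list of matches per sub-directory, in input order
        results = pool.map(search_directory, subdirectories)
        return [match for matches in results for match in matches]
```

The GUI would then be filled from the returned list in the main thread, e.g. `FILE_LIST = parallel_search(usePATH)`.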

Edward
asked Apr 12, 2017 at 9:10
\$\endgroup\$
  • What kind of LIST allows you to delete, see and update_idletasks? This doesn't come with Python. Also, what are VAR and FILE_LIST? Commented Apr 12, 2017 at 9:26
  • LIST is the list view object in the GUI, and FILE_LIST is a list of all the files; I just need FILE_LIST in some other functions, @MathiasEttinger. Getting all the files with the keyword takes about 5 minutes; I checked it with the timeit function. Commented Apr 12, 2017 at 10:02

2 Answers

\$\begingroup\$

Since you’re using glob already, why not use its full potential and ask it to do the folder traversal for you?

def files_with_compare(root_folder):
    for filename in glob.iglob(os.path.join(root_folder, '**', '*COMPARE.mat'), recursive=True):
        pass  # Do something with filename

Now you’re updating some components over and over again while creating the list of relevant file names. You should build the list first and then do whatever you need with it, including updating the GUI:

import glob
import os

def files_with_compare(root_folder):
    pattern = os.path.join(root_folder, '**', '*COMPARE.mat')
    return glob.glob(pattern, recursive=True)

LIST.delete(0, END)
FILE_LIST = files_with_compare('/nfzcae/nvh_mdp/Messdatenbank_Powertrain')
LIST.insert(END, *FILE_LIST)  # assumes LIST is a tkinter Listbox, which has no extend()
VAR.set(LIST.size())
LIST.see(END)
LIST.insert(END, 'Search finished')
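
To see what the `**` pattern actually matches, here is a small self-contained check on a throwaway directory tree. The file and folder names in `build_demo_tree` are invented for the demonstration; only `files_with_compare` comes from the code above. Note that `recursive=True` needs Python 3.5 or newer, and that `**` also matches zero directories, so files directly in the root folder are found too (unlike in the original loop).

```python
import glob
import os
import tempfile


def files_with_compare(root_folder):
    """Return every path under root_folder whose name ends in COMPARE.mat."""
    pattern = os.path.join(root_folder, '**', '*COMPARE.mat')
    return glob.glob(pattern, recursive=True)


def build_demo_tree(root):
    """Create a few empty files (hypothetical names) at different depths."""
    relative_paths = [
        'top_COMPARE.mat',
        os.path.join('a', 'run1_COMPARE.mat'),
        os.path.join('a', 'b', 'run2_COMPARE.mat'),
        os.path.join('a', 'b', 'run2_MIN.mat'),  # must not be matched
    ]
    for relative in relative_paths:
        full = os.path.join(root, relative)
        os.makedirs(os.path.dirname(full), exist_ok=True)
        open(full, 'w').close()


with tempfile.TemporaryDirectory() as demo_root:
    build_demo_tree(demo_root)
    # Paths relative to the demo root, sorted for stable display
    found = sorted(os.path.relpath(p, demo_root)
                   for p in files_with_compare(demo_root))
    print(found)
```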
answered Apr 12, 2017 at 13:10
\$\endgroup\$
\$\begingroup\$
  • os.walk gives you the filenames already. There is no need to use glob to get them from the OS again. The looping could be done like this:

    for path, dirs, filenames in os.walk(usePATH):
        for f in filenames:
            if f.endswith('COMPARE.mat'):
                ...
    
  • Updating a GUI can be slow. While I'm not familiar with the toolkit you are using, I would check whether removing update_idletasks from the inner loop speeds it up.
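
Putting both points together, a minimal sketch could look like the following. It assumes the list view is a tkinter `Listbox` (the question does not say so explicitly); the function names `find_compare_files` and `show_results` are invented for the example. The search collects full paths without touching the GUI, and the listbox is filled once at the end.

```python
import os


def find_compare_files(root, suffix='COMPARE.mat'):
    """Collect full paths of matching files using only os.walk."""
    matches = []
    for path, _dirs, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                # os.walk yields bare file names, so join them with the folder
                matches.append(os.path.join(path, name))
    return matches


def show_results(listbox, paths):
    """Fill a tkinter-style listbox in one go instead of once per file."""
    listbox.delete(0, 'end')           # 'end' is the value of tkinter.END
    if paths:
        listbox.insert('end', *paths)  # Listbox.insert accepts several items
    listbox.insert('end', 'Search finished')
    listbox.see('end')
```

Usage would be along the lines of `FILE_LIST = find_compare_files(usePATH)` followed by `show_results(LIST, FILE_LIST)` and `VAR.set(len(FILE_LIST))`.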

answered Apr 12, 2017 at 11:53
\$\endgroup\$
  • The code has sped up, thanks for the suggestion, @Janne Karila. Commented Apr 12, 2017 at 12:34
