I have a directory with a lot of files and folders in it. The function I wrote works, but I want to make it faster and more efficient. I was thinking of using multiprocessing or threading.
The code I wrote is as follows:
LIST.delete(0, END)  # this is a list view
usePATH = '/nfzcae/nvh_mdp/Messdatenbank_Powertrain'
fileLevels = []  # code of interest is below
for path, dirs, f in os.walk(usePATH):
    for d in dirs:
        for f in glob.iglob(os.path.join(path, d, '*COMPARE.mat')):
            if 'COMPARE.mat' in f:  # or: if 'COMPARE.mat' in f and not 'MIN' in f and not 'MAX' in f
                fileLevels.append(f.split('/'))  # Split path string at all '/'
                LIST.insert(END, f)  # Insert path with name string into the listbox
                LIST.update_idletasks()  # Update listbox after each element added
                nr = LIST.size()  # Get current number of files in listbox
                VAR.set(nr)  # Set number of files as a variable for label
                LIST.see(END)  # See last element added
                FILE_LIST.append(f)
            else:
                pass  # Do nothing
LIST.insert(END, 'Search finished')
It is actually made for a GUI button. I want to make this code faster. As far as I know, threading does not work well for nested for loops, so I am stuck figuring out how to use the threading or multiprocessing modules for this problem. Any ideas? My idea is along these lines:
- get the names of all subdirectories in a list
- create a parallel pool over that list and write a function that checks each subfolder for file names containing the keyword
Will this work?
P.S.: The folder has a lot of subdirectories (more than 1000).
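The two steps listed above could be sketched roughly as follows. This is only a minimal sketch under assumptions: the names find_in_tree and find_parallel, the suffix parameter and the max_workers value are hypothetical and not part of the original code, and a thread pool is used instead of processes. Whether it is actually faster depends on where the time goes; threads mainly help when the search spends most of its time waiting on the file system.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def find_in_tree(top, suffix='COMPARE.mat'):
    """Walk one directory tree and return the paths of matching files."""
    matches = []
    for path, _dirs, filenames in os.walk(top):
        for name in filenames:
            if name.endswith(suffix):
                matches.append(os.path.join(path, name))
    return matches


def find_parallel(root, suffix='COMPARE.mat', max_workers=8):
    """Search each immediate subdirectory of root in a pool of threads.

    Files lying directly in root are not searched, which mirrors the
    original loop (it only globs inside subdirectories).
    """
    # Step 1: collect the subdirectories directly below root.
    subdirs = [entry.path for entry in os.scandir(root) if entry.is_dir()]
    # Step 2: let the pool search the subdirectories concurrently.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(find_in_tree, subdirs, [suffix] * len(subdirs))
        # Flatten the per-subdirectory lists into one list of paths.
        return [path for matches in results for path in matches]
```

The worker function touches no GUI objects, so the listbox would still be filled from the main thread once find_parallel returns.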
2 Answers
Since you’re using glob already, why not use its full potential and ask it to do the folder traversal for you? (The recursive flag requires Python 3.5 or later.)
import glob
import os

def files_with_compare(root_folder):
    for filename in glob.iglob(os.path.join(root_folder, '**', '*COMPARE.mat'), recursive=True):
        # Do something with filename; here it is simply handed to the caller
        yield filename
Now you’re updating some components over and over again while creating the list of relevant file names. You should build the list first and then do whatever you need with it, including updating the GUI:
def files_with_compare(root_folder):
    pattern = os.path.join(root_folder, '**', '*COMPARE.mat')
    return glob.glob(pattern, recursive=True)

LIST.delete(0, END)
FILE_LIST = files_with_compare('/nfzcae/nvh_mdp/Messdatenbank_Powertrain')
LIST.insert(END, *FILE_LIST)  # assuming LIST is a tkinter.Listbox, which has no extend()
VAR.set(LIST.size())
LIST.see(END)
LIST.insert(END, 'Search finished')
os.walk gives you the filenames already. There is no need to use glob to get them from the OS again. The looping could be done like this:
for path, dirs, filenames in os.walk(usePATH):
    for f in filenames:
        if f.endswith('COMPARE.mat'):
            ...
Updating a GUI can be slow. While I'm not familiar with the toolkit you are using, I would check whether removing update_idletasks from the inner loop speeds it up.
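One way to keep some visible progress while updating the GUI far less often is to hand results over in batches. The sketch below is only an illustration under assumptions: the function name compare_files_in_batches and the batch_size parameter are hypothetical, and unlike the original loop it also matches files lying directly in the root folder.

```python
import os


def compare_files_in_batches(root, suffix='COMPARE.mat', batch_size=100):
    """Yield lists of up to batch_size matching paths found below root."""
    batch = []
    for path, _dirs, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                batch.append(os.path.join(path, name))
                if len(batch) >= batch_size:
                    yield batch
                    batch = []  # start a fresh list for the next batch
    if batch:  # hand over whatever is left at the end
        yield batch
```

Assuming LIST is a tkinter.Listbox, each batch could then be added with a single LIST.insert(END, *batch) followed by one LIST.update_idletasks(), so the widget is refreshed once per batch rather than once per file.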
The code has sped up, thanks for the suggestion @Janne Karila. – ayaan, Apr 12, 2017 at 12:34
LIST allow you to delete, see and update_idletasks? This doesn't come with Python. Also, what are VAR and FILE_LIST?