
I have the following code:

import glob

import numpy as np
from readers import readerXYZ

folder = r'N:\FolderXYZ/*'
D_list = []   # dimensions of each Zdata matrix
ctr = 0       # running total of NaNs across all files
for name in glob.glob(folder):
    Zdata = readerXYZ(name, output_matrix=True)
    # I need this count of the NaNs for future computations
    ctr += np.count_nonzero(np.isnan(Zdata))
    D_list.append(list(Zdata.shape))

The program reads the files stored in the folder called "FolderXYZ"; the reading is done with an external function called readerXYZ. What I want to store in the list D_list is the dimensions of each Zdata matrix, while at the same time counting how many NaNs there are in total. This code works fine but takes very long. How can I improve it? Thank you.

asked Apr 23, 2019 at 8:31
  • Optimization questions should be posted on Code Review. You should start by profiling the code to identify which steps are costly; I recommend cProfile (see the profiling sketch below). I believe the slow part is probably the readerXYZ function... Commented Apr 23, 2019 at 8:52
  • It might be possible to get the dimensions and count the NaN elements without loading the entire matrix into memory, but that would depend on the format of the files. Commented Apr 23, 2019 at 9:34
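A minimal profiling sketch along the lines of the first comment, assuming the loop from the question is wrapped in a hypothetical process_folder function so that cProfile can attribute the time to readerXYZ versus the NumPy calls:

import cProfile
import glob
import pstats

import numpy as np
from readers import readerXYZ  # external reader from the question


def process_folder(folder):
    # hypothetical wrapper around the loop from the question
    D_list, ctr = [], 0
    for name in glob.glob(folder):
        Zdata = readerXYZ(name, output_matrix=True)
        ctr += np.count_nonzero(np.isnan(Zdata))
        D_list.append(list(Zdata.shape))
    return D_list, ctr


folder = r'N:\FolderXYZ/*'
cProfile.run('process_folder(folder)', 'reader_stats.prof')                  # write profile data to a file
pstats.Stats('reader_stats.prof').sort_stats('cumulative').print_stats(10)   # show the 10 costliest calls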

1 Answer


If time is your concern, you can consider processing the files in parallel. Here is a reference for multiprocessing in Python: Python how to parallelize loops. A sketch of the idea is below.
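A minimal sketch of that idea with multiprocessing.Pool, assuming readerXYZ can safely be called from worker processes; process_file is a hypothetical helper that returns the per-file shape and NaN count, which are then combined in the parent process:

import glob
from multiprocessing import Pool

import numpy as np
from readers import readerXYZ  # external reader from the question


def process_file(name):
    # read one file and return (shape, NaN count) for it
    Zdata = readerXYZ(name, output_matrix=True)
    return list(Zdata.shape), int(np.count_nonzero(np.isnan(Zdata)))


if __name__ == '__main__':
    names = glob.glob(r'N:\FolderXYZ/*')
    with Pool() as pool:                      # one worker per CPU core by default
        results = pool.map(process_file, names)
    D_list = [shape for shape, _ in results]  # dimensions of each matrix
    ctr = sum(count for _, count in results)  # total number of NaNs

If readerXYZ spends most of its time waiting on disk rather than computing, a thread pool (concurrent.futures.ThreadPoolExecutor) may give similar gains with less overhead; profiling first, as suggested in the comments, will tell you which case you are in.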

answered Apr 23, 2019 at 8:54