Process CSV files that follow the first nonempty one

\$\begingroup\$

I have a list of CSV files (all have the same fields), some of which are empty. My goal is to find the first nonempty file and use that as the "base" file for an aggregation. After finding the base file, the way I will consume the rest of the files changes, so I need to maintain the index of the file. Here is what I currently have:

def f(list_of_files):
    aggregated_file = ""
    nonempty_index = -1
    for index, file in enumerate(list_of_files):
        aggregated_file = extract_header_and_body(file)
        if aggregated_file:
            nonempty_index = index
            break
    if nonempty_index != -1:
        # Start after the base file so its body is not appended twice.
        rest_of_files = list_of_files[nonempty_index + 1:]
        for file in rest_of_files:
            aggregated_file += extract_body(file)
    return aggregated_file

Both extract_header_and_body and extract_body return string representations of a CSV file (with and without column names, respectively); if the file is empty, they return the empty string. This seems like a clunky solution, especially for Python. Is there a more concise/readable way to accomplish this?
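The two helpers are not shown in the question, so the following is only a minimal sketch of the contract described above, not the asker's actual code. It assumes each "file" is already a string of CSV text and that the header occupies exactly the first line (no quoted newlines in the header); both assumptions are illustrative.

```python
def extract_header_and_body(text):
    """Return the CSV text unchanged, or "" if it contains no data."""
    # Assumption: a "file" is an in-memory string, not a path or file object.
    return text if text.strip() else ""


def extract_body(text):
    """Return the CSV text without its header line, or "" if it is empty."""
    if not text.strip():
        return ""
    # Assumption: the header is exactly the first line of the text.
    lines = text.splitlines(keepends=True)
    return "".join(lines[1:])
```

With these stand-ins, an empty file yields a falsy value from either helper, which is the property the code above relies on.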

asked Sep 20, 2018 at 4:08 by kanderson8 (edited Sep 20, 2018 at 4:59)

\$\endgroup\$

Can you be specific about exactly what is considered to be the output? This function doesn't return anything. (In addition, I suggest that you include the code for extract_header_and_body() and extract_body() as well: we may be able to give you better advice in that case.)
– 200_success, commented Sep 20, 2018 at 4:36

1 Answer

\$\begingroup\$

If we reformulate this as extracting the header and body of the first non-empty file, and only the body of every subsequent file, this comes to mind:

aggregated_file = ''
for file in list_of_files:
    if aggregated_file:
        aggregated_file += extract_body(file)
    else:
        aggregated_file += extract_header_and_body(file)
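As a rough, runnable sketch of the loop above, it can be wrapped in a function that receives the two helpers as arguments. The names `aggregate`, `with_header`, and `without_header` are hypothetical, and the toy helpers assume each "file" is a string of CSV text whose header is its first line.

```python
def aggregate(list_of_files, extract_header_and_body, extract_body):
    """Concatenate the first non-empty file (with header) and later bodies."""
    aggregated_file = ""
    for file in list_of_files:
        if aggregated_file:
            # A base file has been found: append bodies only.
            aggregated_file += extract_body(file)
        else:
            # Still looking for the first non-empty file.
            aggregated_file += extract_header_and_body(file)
    return aggregated_file


# Toy stand-ins for the real helpers (assumptions, for illustration only).
def with_header(text):
    return text


def without_header(text):
    return "".join(text.splitlines(keepends=True)[1:])


files = ["", "a,b\n1,2\n", "", "a,b\n3,4\n"]
result = aggregate(files, with_header, without_header)
print(result)
```

Because the emptiness of `aggregated_file` itself records whether a base file has been found, no index needs to be tracked, and each file is passed to exactly one helper.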

answered Sep 20, 2018 at 5:14 by l0b0

\$\endgroup\$
