I have a very basic Python script that reads a CSV file. I import the csv module, set up some empty list variables, and extract the data I want. I know there must be a more efficient way to get the data, but I don't know how to do it, and creating a reader object multiple times feels redundant.
The goal of the script is to read the CSV and extract the fields (the header) and the column cells under each one.
To accomplish this, I create multiple reader objects and loop over each one to extract the data.
The CSV file is simple and will have more columns in the future:
routers,servers
192.168.1.1,10.0.1.1
192.168.1.2,10.0.1.2
The code is simple:
import csv

filename='Book2.csv'
fields=[]
rows=[]

with open(filename, 'r') as csvfile_field:
    csvreader_group = csv.reader(csvfile_field)
    fields=next(csvreader_group)
    group1=fields[0]
    group2=fields[1]

with open(filename, newline='') as csvfile_row1:
    csvreader_server = csv.DictReader(csvfile_row1)
    #print(str(group))
    print(group1)
    for row1 in csvreader_server:
        server1 = row1[group1]
        print(server1)
    print('\n')

with open(filename, newline='') as csvfile_row2:
    csvreader_server = csv.DictReader(csvfile_row2)
    print(group2)
    for row2 in csvreader_server:
        server2 = row2[group2]
        print(server2)
The results are:
routers
192.168.1.1
192.168.1.2
servers
10.0.1.1
10.0.1.2
Can someone review this and suggest a more efficient way to extract the data, producing the same results without opening the same file multiple times?
2 Answers
Here's how I would do it, without Pandas:
import csv

filename = 'Book2.csv'

with open(filename, 'r') as csvfile:
    reader = csv.reader(csvfile)
    fields = next(reader)  # Reads header row as a list
    rows = list(reader)    # Reads all subsequent rows as a list of lists

for column_number, field in enumerate(fields):  # (0, routers), (1, servers)
    print(field)
    for row in rows:
        print(row[column_number])
    print('\n')
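A variation on the same single-pass idea, as a rough sketch rather than a definitive version: transposing the rows with zip gives one tuple per column, so the column index is no longer needed. The helper name read_columns is hypothetical, and an in-memory io.StringIO stands in for Book2.csv so the example runs on its own. One caveat: zip stops at the shortest row, so this assumes every row has the same number of cells.

```python
import csv
import io


def read_columns(csvfile):
    """Return {header: (cell, ...)} for an open CSV file, reading it once.

    Hypothetical helper for illustration; assumes all rows have equal length.
    """
    reader = csv.reader(csvfile)
    fields = next(reader)      # header row as a list
    columns = zip(*reader)     # transpose the remaining rows into columns
    return dict(zip(fields, columns))


# In-memory stand-in for Book2.csv (same content as in the question)
sample = io.StringIO(
    "routers,servers\n"
    "192.168.1.1,10.0.1.1\n"
    "192.168.1.2,10.0.1.2\n"
)

columns = read_columns(sample)
for field, cells in columns.items():
    print(field)
    print('\n'.join(cells))
    print()
```

With the real file, the same function could be called inside with open(filename, newline='') as csvfile, passing csvfile in place of sample. New columns added to the CSV later would be picked up without code changes.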
IIUC:
import pandas as pd

df = pd.read_csv(filename)
for col in df:  # iterating a DataFrame yields its column names
    print(col, '\n'.join(df[col].astype(str)), sep='\n')