5
\$\begingroup\$

I needed to make a small script to read a csv file and store 2 columns in Redis. Can you guys take a look over the code and let me know how I can improve my code style?

#!/usr/bin/env/ python
"""Proces csv file and store key-value pair of columns in redis"""
import csv, redis, json
from sys import argv, exit
REDIS_HOST = 'localhost'
conn = redis.Redis(REDIS_HOST)
def get_csv_data(csv_file, **columns):
 col1, col2 = columns['columns']
 with open(csv_file, encoding='utf-8') as csvf:
 csv_data = csv.reader(csvf)
 return [(i[col1], i[col2]) for i in csv_data]
def store_data(data):
 for i in data:
 if i[1] not in conn:
 conn.set(i[0], i[1])
 return data
def main():
 if len(argv) < 2:
 exit("usage: python {file} file.csv [column1, column2]".format(file=__file__))
 csv_file = argv[1] 
 columns = list(map(int,argv[2:4])) or [1, 3]
 data = get_csv_data(csv_file, columns=columns)
 print (json.dumps(store_data(data)))
if __name__ == '__main__':
 main()
bsa
7934 silver badges14 bronze badges
asked Nov 1, 2014 at 18:41
\$\endgroup\$

1 Answer 1

8
\$\begingroup\$

Here are my suggestions:

import csv, redis, json
import sys
REDIS_HOST = 'localhost'
def read_csv_data(csv_file, ik, iv):
 with open(csv_file, encoding='utf-8') as csvf:
 csv_data = csv.reader(csvf)
 return [(r[ik], r[iv]) for r in csv_data]
def store_data(conn, data):
 for i in data:
 conn.setnx(i[0], i[1])
 return data
def main():
 if len(sys.argv) < 2:
 sys.exit(
 "Usage: %s file.csv [key_column_index, value_column_index]"
 % __file__
 )
 columns = (0, 1) if len(sys.argv) < 4 else (int(x) for x in sys.argv[2:4])
 data = read_csv_data(sys.argv[1], *columns)
 conn = redis.Redis(REDIS_HOST)
 print (json.dumps(store_data(conn, data)))
if '__main__' == __name__:
 main()

Feel free to ask questions on any of the changes. Some notes:

  1. In general, I don't like from x import y. It clutters your global namespace and conflicts can become an issue.
  2. In your original code, I guess you meant if i[0] not in conn, to test whether the key already exists, not the value right? That being the case, it's better to use the built in SETNX feature in Redis.
  3. I changed your **kwargs to *args. Passing a dictionary containing a single tuple was a bit confusing.
  4. Your list(map(...)) or was pretty clever, but I find this way more readable. Just personal preference. Tuple is preferable to list for immutable data.
  5. I only initialize the connection after parsing the command line arguments and loading the CSV file. In general, only constants should be module scope.
  6. I swapped __name__ and '__main__'. I prefer putting the constant term in a comparison first. That way the interpreter will catch the sometimes hard to spot typo == -> =
answered Jan 23, 2015 at 2:39
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.