Read file into list when class is instantiated in Python 2.7

Question 1

I would like to read a file into a list when I create the class below. Once the list is ready, I would like to access it outside the class in order to process more code based on the list which now contains my file. Can the code be improved upon?

class ReadFile(object):
 def __init__(self, filename):
 self._data = []
 self.localdir = 'C:\\Users\\User\\Documents'
 self.filename = filename
 self.filepath = os.path.join(self.localdir, self.filename)
 self.read_data()
 def read_data(self):
 try:
 with open(self.filepath, 'rU') as c:
 rows = csv.reader(c)
 for row in rows:
 self._data.append(row)
 except Exception as ex:
 print(ex)
 def get_data(self):
 return self._data
def main():
 new = ReadFile('myfile.csv')
 new_data = new.get_data()
 for i in new_data:
 print(i) # etc... more code will follow
if __name__ == '__main__':
 main()

Question 2

I agree with Caridorc about not needing a class, in addition ReadFile isn't a very good class name, it sounds like a function. Unless, you meant read as an adjective (this file has been read), but you can see how that could be confusing.

Keeping filename and your hardcoded path seems redundant at the moment

 self.localdir = 'C:\\Users\\User\\Documents'
 self.filename = filename
 self.filepath = os.path.join(self.localdir, self.filename)

Also you could use r''to write your filepath without the double backslash. r before a string will tell Python to interpret it literally, ie. Don't consider anything an escape character. That means that a backslash will always just be a backslash.

You store all 3 of these but only every use filepath, why not go directly to creating the filepath? The point of init should be taking the passed parameters and turning them into the form needed for your class, and your class only needs filepath. Particularly if you do convert this to a function, it'll become less useful to keep the redundant values.

Also, since init is about getting the data you need, why even have getData as a separate function? Collapse that back into init since it's being called there anyway and will never be called again. In fact if you call get_data() twice you've accidentally doubled up on the info, which you surely never want.

Since you're not changing the data at all, you can pass the csv reader directly to list() to create a list. In fact, you could just return list(rows) if you turned this all into one function.

You should add a docstring to your class/function. ReadFile seems pretty clear until you find out that you've passed a filepath and gotten an error that it doesn't exist (since you read from a specific directory) or you might be puzzled when you pass a text file and get back a csv formatted file.

Here's how I might refactor your code:

def read_csv_file(filename):
 """Returns a list of data from a csv file passed to it.
 Only reads files from 'C:\Users\User\Documents\' + filename
 Prints any exceptions from reading the file."""
 filepath = os.path.join(r'C:\Users\User\Documents', filename)
 try:
 with open(filepath, 'rU') as c:
 rows = csv.reader(c)
 return list(rows)
 except Exception as ex:
 print(ex)
def main():
 new_data = read_csv_file('myfile.csv')
 for i in new_data:
 print(i) # etc... more code will follow
if __name__ == '__main__':
 main()

If you were sticking with keeping it a class then I'd still keep the above in one function and make minor other changes. I'd add a docstring about the class's general use.

Also I would no longer catch and print the error. Errors shouldn't be ignored, and failing to read your file will actually mean that the class hasn't been correctly created as it's missing crucial data and likely has an invalid path. You can either let it be normally raised without intervention, or add a custom error message with raise ErrorType("Message about the error"). In your case IOError is likely, so you might raise IOError with a message about how the passed filename isn't in your hardcoded directory. But exactly how you'll handle that is up to you.

Like this:

class ReadFile(object):
 """Docstring"""
 def __init__(self, filename):
 self.localdir = r'C:\Users\User\Documents'
 self.filename = filename
 self.filepath = os.path.join(self.localdir, self.filename)
 try:
 with open(self.filepath, 'rU') as c:
 rows = csv.reader(c)
 self._data = list(rows)
 except IOError:
 raise IOError("{} not found in {}".format(
 self.filename, self.directory)
 def get_data(self):
 return self._data

Question 3

self.localdir and self.filename are used further in the code. i am using a class to get more practice but originally it was all functions. if you had to use a class, how would you do it? (maybe thats a better question since the code can probably be turned into two functions (including main).

Question 4

@jes516 I updated my answer with a bit more about the class approach. Note that I don't use classes that often so someone more experienced with Python classes might have more to say than me.

Question 5

I like the raise IOError. I was actually doing if new_data to check if there was data in the list before running off with more code but raising IOError (most likely this will be the problem if there is one) prevents the class from being created. thank you sir

Question 6

You are using a class with two methods, (not counting get), you should get rid of it and use a simple function.

Question 7

Something SuperBiasedMan didn't mention is small try statements. jes516 found Except block for with-statement and made an answer with that a few minuets ago, but not as a review.

You don't want large trys as that leads to bugs, what if rows = csv.reader(c) raises Exception or any child of it? You will think that it's because there was a problem with the file, not csv.

This also has the benefit of always closing the connection to the file too. If we were to get an error from csv, then the file will be closed safely, while any and all exceptions are still raised.

It's recommended to keep trys as small as possible for this reason.

# Small try, no masked bugs.
try:
 fopen = open(self.filepath, 'rU')
except IOError:
 raise IOError("{} not found in {}".format(
 self.filename, self.directory)
else:
 with fopen:
 rows = csv.reader(fopen)
 self._data = list(rows)

Question 8

I tried to edit it before it got deleted but i was late. I agree on the smaller try statements. Thats why i wanted to link the mailing list where I found that code block. The with will handle any errors after the file has been open and close it there after. I like it.

Question 9

I believe so. I mean with will raise any errors within the with block and like you mentioned, close the file. For example if I do rows = csv.reader(mycsv) i get a NameError: NameError: global name 'mycsv' is not defined

score 2 · Accepted Answer · 2015-09-17 09:04:06Z

I agree with Caridorc about not needing a class, in addition ReadFile isn't a very good class name, it sounds like a function. Unless, you meant read as an adjective (this file has been read), but you can see how that could be confusing.

Keeping filename and your hardcoded path seems redundant at the moment

 self.localdir = 'C:\\Users\\User\\Documents'
 self.filename = filename
 self.filepath = os.path.join(self.localdir, self.filename)

Also you could use r''to write your filepath without the double backslash. r before a string will tell Python to interpret it literally, ie. Don't consider anything an escape character. That means that a backslash will always just be a backslash.

You store all 3 of these but only every use filepath, why not go directly to creating the filepath? The point of init should be taking the passed parameters and turning them into the form needed for your class, and your class only needs filepath. Particularly if you do convert this to a function, it'll become less useful to keep the redundant values.

Also, since init is about getting the data you need, why even have getData as a separate function? Collapse that back into init since it's being called there anyway and will never be called again. In fact if you call get_data() twice you've accidentally doubled up on the info, which you surely never want.

Since you're not changing the data at all, you can pass the csv reader directly to list() to create a list. In fact, you could just return list(rows) if you turned this all into one function.

You should add a docstring to your class/function. ReadFile seems pretty clear until you find out that you've passed a filepath and gotten an error that it doesn't exist (since you read from a specific directory) or you might be puzzled when you pass a text file and get back a csv formatted file.

Here's how I might refactor your code:

def read_csv_file(filename):
 """Returns a list of data from a csv file passed to it.
 Only reads files from 'C:\Users\User\Documents\' + filename
 Prints any exceptions from reading the file."""
 filepath = os.path.join(r'C:\Users\User\Documents', filename)
 try:
 with open(filepath, 'rU') as c:
 rows = csv.reader(c)
 return list(rows)
 except Exception as ex:
 print(ex)
def main():
 new_data = read_csv_file('myfile.csv')
 for i in new_data:
 print(i) # etc... more code will follow
if __name__ == '__main__':
 main()

If you were sticking with keeping it a class then I'd still keep the above in one function and make minor other changes. I'd add a docstring about the class's general use.

Also I would no longer catch and print the error. Errors shouldn't be ignored, and failing to read your file will actually mean that the class hasn't been correctly created as it's missing crucial data and likely has an invalid path. You can either let it be normally raised without intervention, or add a custom error message with raise ErrorType("Message about the error"). In your case IOError is likely, so you might raise IOError with a message about how the passed filename isn't in your hardcoded directory. But exactly how you'll handle that is up to you.

Like this:

class ReadFile(object):
 """Docstring"""
 def __init__(self, filename):
 self.localdir = r'C:\Users\User\Documents'
 self.filename = filename
 self.filepath = os.path.join(self.localdir, self.filename)
 try:
 with open(self.filepath, 'rU') as c:
 rows = csv.reader(c)
 self._data = list(rows)
 except IOError:
 raise IOError("{} not found in {}".format(
 self.filename, self.directory)
 def get_data(self):
 return self._data

self.localdir and self.filename are used further in the code. i am using a class to get more practice but originally it was all functions. if you had to use a class, how would you do it? (maybe thats a better question since the code can probably be turned into two functions (including main).
@jes516 I updated my answer with a bit more about the class approach. Note that I don't use classes that often so someone more experienced with Python classes might have more to say than me.
I like the raise IOError. I was actually doing if new_data to check if there was data in the list before running off with more code but raising IOError (most likely this will be the problem if there is one) prevents the class from being created. thank you sir

Stack Exchange Network

Read file into list when class is instantiated in Python 2.7

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Read file into list when class is instantiated in Python 2.7

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions