I would like to read a file into a list when I create the class below. Once the list is ready, I would like to access it outside the class in order to process more code based on the list which now contains my file. Can the code be improved upon?
class ReadFile(object):
def __init__(self, filename):
self._data = []
self.localdir = 'C:\\Users\\User\\Documents'
self.filename = filename
self.filepath = os.path.join(self.localdir, self.filename)
self.read_data()
def read_data(self):
try:
with open(self.filepath, 'rU') as c:
rows = csv.reader(c)
for row in rows:
self._data.append(row)
except Exception as ex:
print(ex)
def get_data(self):
return self._data
def main():
new = ReadFile('myfile.csv')
new_data = new.get_data()
for i in new_data:
print(i) # etc... more code will follow
if __name__ == '__main__':
main()
3 Answers 3
I agree with Caridorc about not needing a class, in addition ReadFile
isn't a very good class name, it sounds like a function. Unless, you meant read as an adjective (this file has been read), but you can see how that could be confusing.
Keeping filename
and your hardcoded path seems redundant at the moment
self.localdir = 'C:\\Users\\User\\Documents'
self.filename = filename
self.filepath = os.path.join(self.localdir, self.filename)
Also you could use r''
to write your filepath without the double backslash. r
before a string will tell Python to interpret it literally, ie. Don't consider anything an escape character. That means that a backslash will always just be a backslash.
You store all 3 of these but only every use filepath, why not go directly to creating the filepath? The point of init
should be taking the passed parameters and turning them into the form needed for your class, and your class only needs filepath
. Particularly if you do convert this to a function, it'll become less useful to keep the redundant values.
Also, since init
is about getting the data you need, why even have getData
as a separate function? Collapse that back into init
since it's being called there anyway and will never be called again. In fact if you call get_data()
twice you've accidentally doubled up on the info, which you surely never want.
Since you're not changing the data at all, you can pass the csv reader directly to list()
to create a list. In fact, you could just return list(rows)
if you turned this all into one function.
You should add a docstring to your class/function. ReadFile
seems pretty clear until you find out that you've passed a filepath and gotten an error that it doesn't exist (since you read from a specific directory) or you might be puzzled when you pass a text file and get back a csv formatted file.
Here's how I might refactor your code:
def read_csv_file(filename):
"""Returns a list of data from a csv file passed to it.
Only reads files from 'C:\Users\User\Documents\' + filename
Prints any exceptions from reading the file."""
filepath = os.path.join(r'C:\Users\User\Documents', filename)
try:
with open(filepath, 'rU') as c:
rows = csv.reader(c)
return list(rows)
except Exception as ex:
print(ex)
def main():
new_data = read_csv_file('myfile.csv')
for i in new_data:
print(i) # etc... more code will follow
if __name__ == '__main__':
main()
If you were sticking with keeping it a class then I'd still keep the above in one function and make minor other changes. I'd add a docstring about the class's general use.
Also I would no longer catch and print the error. Errors shouldn't be ignored, and failing to read your file will actually mean that the class hasn't been correctly created as it's missing crucial data and likely has an invalid path. You can either let it be normally raised without intervention, or add a custom error message with raise ErrorType("Message about the error")
. In your case IOError
is likely, so you might raise IOError
with a message about how the passed filename isn't in your hardcoded directory. But exactly how you'll handle that is up to you.
Like this:
class ReadFile(object):
"""Docstring"""
def __init__(self, filename):
self.localdir = r'C:\Users\User\Documents'
self.filename = filename
self.filepath = os.path.join(self.localdir, self.filename)
try:
with open(self.filepath, 'rU') as c:
rows = csv.reader(c)
self._data = list(rows)
except IOError:
raise IOError("{} not found in {}".format(
self.filename, self.directory)
def get_data(self):
return self._data
-
\$\begingroup\$ self.localdir and self.filename are used further in the code. i am using a class to get more practice but originally it was all functions. if you had to use a class, how would you do it? (maybe thats a better question since the code can probably be turned into two functions (including main). \$\endgroup\$jes516– jes5162015年09月17日 09:44:53 +00:00Commented Sep 17, 2015 at 9:44
-
\$\begingroup\$ @jes516 I updated my answer with a bit more about the class approach. Note that I don't use classes that often so someone more experienced with Python classes might have more to say than me. \$\endgroup\$SuperBiasedMan– SuperBiasedMan2015年09月17日 09:55:12 +00:00Commented Sep 17, 2015 at 9:55
-
\$\begingroup\$ I like the raise IOError. I was actually doing
if new_data
to check if there was data in the list before running off with more code but raising IOError (most likely this will be the problem if there is one) prevents the class from being created. thank you sir \$\endgroup\$jes516– jes5162015年09月17日 16:32:06 +00:00Commented Sep 17, 2015 at 16:32
You are using a class
with two methods, (not counting get
), you should get rid of it and use a simple function.
Something SuperBiasedMan didn't mention is small try
statements.
jes516 found Except block for with-statement and made an answer with that a few minuets ago, but not as a review.
You don't want large try
s as that leads to bugs,
what if rows = csv.reader(c)
raises Exception
or any child of it?
You will think that it's because there was a problem with the file, not csv
.
This also has the benefit of always closing the connection to the file too.
If we were to get an error from csv
, then the file will be closed safely, while any and all exceptions are still raised.
It's recommended to keep try
s as small as possible for this reason.
# Small try, no masked bugs.
try:
fopen = open(self.filepath, 'rU')
except IOError:
raise IOError("{} not found in {}".format(
self.filename, self.directory)
else:
with fopen:
rows = csv.reader(fopen)
self._data = list(rows)
-
\$\begingroup\$ I tried to edit it before it got deleted but i was late. I agree on the smaller
try
statements. Thats why i wanted to link the mailing list where I found that code block. Thewith
will handle any errors after the file has been open and close it there after. I like it. \$\endgroup\$jes516– jes5162015年09月24日 19:49:13 +00:00Commented Sep 24, 2015 at 19:49 -
\$\begingroup\$ I believe so. I mean
with
will raise any errors within thewith
block and like you mentioned, close the file. For example if I dorows = csv.reader(mycsv)
i get aNameError
:NameError: global name 'mycsv' is not defined
\$\endgroup\$jes516– jes5162015年09月24日 20:02:47 +00:00Commented Sep 24, 2015 at 20:02