Python Method for Text Input from Project Euler

Question 1

Method

Input for the problems is frequently given as a 1D or 2D array in a text file.

I have knocked out a method to read such files, but I believe something shorter, clearer and cheaper can be written:

def read_array_from_txt(path, dim, typ, sep):
    '''
    @time: O(n)
    @space: O(n)
    '''
    txt_file_object = open(path,"r")
    text = txt_file_object.readlines()
    text = text[0]
    if dim == 1:
        if typ == "int":
            text = [int(num) for num in text.split(sep)]
        elif typ == "str":
            text = [let for let in text.split(sep)]
        else:
            raise ValueError("Unknown type.")
    elif dim == 2:
        if typ == "int":
            text = [[int(num) for num in line.split(sep)] for line in text]
        if typ == "str":
            text = [[let for let in line.split(sep)] for line in text]
        else:
            raise ValueError("Unknown type.")
    else:
        raise ValueError("Unknown dimension.")
    txt_file_object.close()
    return text

An Example:

Input: encrypted_message.txt

36,22,80,0,0,4,23,25,19,17,88,4,4,19

Output:

>>> read_array_from_txt("./encrypted_message.txt", 1, "int", ",")
[36, 22, 80, 0, 0, 4, 23, 25, 19, 17, 88, 4, 4, 19]

Some Other Possible Inputs:

1 37 79 164 155 32 87 39 113 15 18 78 175 140 200 4 160 97 191 100 91 20 69 198 196 
2 123 134 10 141 13 12 43 47 3 177 101 179 77 182 117 116 36 103 51 154 162 128 30 
3 48 123 134 109 41 17 159 49 136 16 130 141 29 176 2 190 66 153 157 70 114 65 173 104 194 54

are,in,hello,hi,ok,yes,no,is

Question 2

Could you include an example how you'd use this?

Question 3

Have you tested this? It seems to me to have at least one obvious bug.

Question 4

Tested and have used this on several problems, but could have missed something.

Question 5

First of all, let's make use of the with statement so that the file is closed automatically:

with open(path, 'r') as txt_file_object:
    text = txt_file_object.readlines()

With this, you don't have to call close() anymore, as the file will close automatically when you exit the with scope.
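If you want to convince yourself of this, you can check the file object's closed attribute inside and after the block. This is only a small sketch: the temporary file and its contents are made up so that it runs anywhere.

```python
import os
import tempfile

# Create a throwaway input file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "encrypted_message.txt")
with open(path, "w") as out:
    out.write("36,22,80\n")

with open(path, "r") as txt_file_object:
    inside = txt_file_object.closed    # still open inside the block
    text = txt_file_object.readlines()

outside = txt_file_object.closed       # closed once the block is left

print(inside, outside, text)
# False True ['36,22,80\n']
```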

text = text[0]

You are only reading the first line of text. Is this really what you want to do?
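It matters for the dim == 2 branch: after text = text[0], text is a single string, so "for line in text" walks over characters rather than lines. A small sketch of what then happens, using a made-up line instead of a real file:

```python
# After `text = text[0]`, `text` is one string, not a list of lines.
text = "1 37 79\n"

error = None
try:
    # Same comprehension as the dim == 2 / "int" branch of the question.
    rows = [[int(num) for num in line.split(" ")] for line in text]
except ValueError as exc:
    error = exc

print(repr(error))
```

The first character, "1", parses fine, but the space that follows splits into two empty strings, and int("") raises a ValueError.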

You are using the variable text for two different things: for the input lines and for the output values. This is not very intuitive; in fact, the result can be a list of integers, so why would it be called text? Maybe result would be a better name for it.

But since you no longer have to close the file at the end of the function, you can return the result directly instead of saving it in a variable:

return [int(num) for num in text.split(sep)]

Returning will exit the with scope, so again the file will be closed automatically.

[let for let in text.split(sep)]

This selects every object inside text.split(sep), so we can return the split list directly:

text.split(sep)

Similarly, there's a different way of applying a function to every item in a list, which is using map. Maybe the list comprehension feels more natural, so you can keep that if you want; still I'll show you just so you know:

# Creates a list by calling 'int' on every item in the list
list(map(int, text.split(sep)))
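To make the last two points concrete, here is a short check (the sample string is just the start of the example input from the question). In Python 3, map returns a lazy iterator, which is why it is wrapped in list(); str.split needs no such wrapping.

```python
text = "36,22,80,0,0,4"
sep = ","

parts = text.split(sep)                      # already a list of str
copied = [let for let in text.split(sep)]    # the comprehension from the question
via_map = list(map(int, text.split(sep)))
via_comprehension = [int(num) for num in text.split(sep)]

print(type(parts).__name__, parts == copied, via_map == via_comprehension)
# list True True
```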

You are repeating a lot of code; you have four different results, but they are very similar, so let's try to provide a more generic way.

My concern here is handling the two possible dimensions. You are parsing the text in the same way (depending on whether it's int or str), but when there are two dimensions you do it for every line. So we could use Python's lambdas to decide first what type of parsing (int or str) we are doing, and then just apply it once or multiple times.

The lambda can take parameters; the only parameter we need in our case is the input text. We can't just use text directly because sometimes we want to parse the full text, but sometimes only the line:

if typ == "int":
    parse_function = lambda t: list(map(int, t.split(sep)))
elif typ == "str":
    parse_function = lambda t: list(t.split(sep))
else:
    raise ValueError("Unknown type.")

Now parse_function can be used like any other function, taking the text as input. So we can use it when deciding the dimension:

if dim == 1:
    return parse_function(text)
elif dim == 2:
    return [parse_function(line) for line in text]
else:
    raise ValueError("Unknown dimension.")

You do well in throwing exceptions for invalid input, but how is the user meant to know what possible values can be used for typ and dim? You could add that to the docstring. You should also say what the function does in the docstring.

Updated code

def read_array_from_txt(path, dim, typ, sep):
    '''
    Processes a text file as a 1D or 2D array.
    :param path: Path to the input file.
    :param dim: How many dimensions (1 or 2) the array has.
    :param typ: Whether the elements are read as 'int' or 'str'
    :param sep: The text that is used to separate between elements.
    @time: O(n)
    @space: O(n)
    '''
    with open(path,"r") as txt_file_object:
        text = txt_file_object.readlines()
        if typ == "int":
            parse_function = lambda t: list(map(int, t.split(sep)))
        elif typ == "str":
            parse_function = lambda t: list(t.split(sep))
        else:
            raise ValueError("Unknown type.")
        if dim == 1:
            return parse_function(text)
        elif dim == 2:
            return [parse_function(line) for line in text]
        else:
            raise ValueError("Unknown dimension.")

Question 6

I like the comment on explaining inputs in the docstring, and the removal of redundancy via lambdas. I'm disinclined to use with; explicitly closing seems clearer.

Question 7

str.split already returns a list, so list(text.split(sep)) is redundant. @A.L.Verminburger: The with scope gives you more than just calling close. It guarantees that the file will be closed (barring the computer instantaneously dying), even in the event of an exception (which would prevent your close from running). It is a well-known and often used Python idiom, which I would recommend getting used to.
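A small sketch of that difference, under made-up assumptions: parse_with and parse_explicit are hypothetical helper names, and the temporary file deliberately contains a value that int cannot parse. Both helpers fail on it, but only the with version leaves the file closed.

```python
import os
import tempfile

# Throwaway file with a deliberately bad value so that parsing raises.
path = os.path.join(tempfile.mkdtemp(), "numbers.txt")
with open(path, "w") as out:
    out.write("1,2,oops\n")

opened = []  # keeps references so .closed can be inspected afterwards

def parse_with(path):
    with open(path, "r") as f:
        opened.append(f)
        return [int(num) for num in f.readline().split(",")]

def parse_explicit(path):
    f = open(path, "r")
    opened.append(f)
    result = [int(num) for num in f.readline().split(",")]  # raises on "oops"
    f.close()  # never reached when the line above raises
    return result

for parse in (parse_with, parse_explicit):
    try:
        parse(path)
    except ValueError:
        pass

closed_flags = [f.closed for f in opened]
print(closed_flags)
# [True, False]

opened[1].close()  # tidy up the handle that parse_explicit leaked
```

To get the same guarantee with an explicit close(), the call would have to sit in the finally clause of a try/finally, which is more code than the with statement.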

Question 8

True, I assumed it would return a generator. I will edit the answer now, thanks.

Question 9

You didn't remove list(...) from all the cases of list(text.split(sep)). You missed this one: parse_function = lambda t: list(t.split(sep))

Question 10

You'll want return parse_function(text[0]) on the dim == 1 case, because text is a list, not a string.
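Putting the comments above together, a consolidated version might look like the sketch below. It is not the answer's exact code: the redundant list(...) is dropped, dim == 1 parses text[0], and, as an extra assumption not discussed in the thread, line endings are stripped with splitlines() so that 'str' items do not keep a trailing newline. The write_tmp helper and the sample contents are made up to keep the example runnable.

```python
import os
import tempfile

def read_array_from_txt(path, dim, typ, sep):
    '''
    Processes a text file as a 1D or 2D array.
    :param path: Path to the input file.
    :param dim: How many dimensions (1 or 2) the array has.
    :param typ: Whether the elements are read as 'int' or 'str'
    :param sep: The text that is used to separate between elements.
    '''
    if typ == "int":
        parse_function = lambda t: list(map(int, t.split(sep)))
    elif typ == "str":
        parse_function = lambda t: t.split(sep)  # split already returns a list
    else:
        raise ValueError("Unknown type.")

    with open(path, "r") as txt_file_object:
        # Assumption: drop line endings so 'str' items carry no "\n".
        text = txt_file_object.read().splitlines()

    if dim == 1:
        return parse_function(text[0])  # text is a list of lines
    elif dim == 2:
        return [parse_function(line) for line in text]
    else:
        raise ValueError("Unknown dimension.")

def write_tmp(contents):
    # Hypothetical helper: writes contents to a temporary file, returns its path.
    fd, path = tempfile.mkstemp(suffix=".txt", text=True)
    with os.fdopen(fd, "w") as out:
        out.write(contents)
    return path

numbers = read_array_from_txt(write_tmp("36,22,80\n"), 1, "int", ",")
words = read_array_from_txt(write_tmp("are,in,hello\n"), 1, "str", ",")
grid = read_array_from_txt(write_tmp("1 2 3\n4 5 6\n"), 2, "int", " ")
print(numbers, words, grid)
# [36, 22, 80] ['are', 'in', 'hello'] [[1, 2, 3], [4, 5, 6]]
```

Note that rows ending in a separator, such as a trailing space, would still produce an empty last item and make int fail, so inputs like that need stripping first.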

eric.m · Accepted Answer · 2019-08-29 12:09:58Z

