Method
Input for problems is frequently given as a 1D or 2D array in a text file.
I have knocked out a method to read such input, but I believe something shorter, clearer and cheaper can be written:
def read_array_from_txt(path, dim, typ, sep):
    '''
    @time: O(n)
    @space: O(n)
    '''
    txt_file_object = open(path,"r")
    text = txt_file_object.readlines()
    text = text[0]
    if dim == 1:
        if typ == "int":
            text = [int(num) for num in text.split(sep)]
        elif typ == "str":
            text = [let for let in text.split(sep)]
        else:
            raise ValueError("Unknown type.")
    elif dim == 2:
        if typ == "int":
            text = [[int(num) for num in line.split(sep)] for line in text]
        if typ == "str":
            text = [[let for let in line.split(sep)] for line in text]
        else:
            raise ValueError("Unknown type.")
    else:
        raise ValueError("Unknown dimension.")
    txt_file_object.close()
    return text
An Example:
Input: encrypted_message.txt
36,22,80,0,0,4,23,25,19,17,88,4,4,19
Output:
>>> read_array_from_txt("./encrypted_message.txt", 1, "int", ",")
[36, 22, 80, 0, 0, 4, 23, 25, 19, 17, 88, 4, 4, 19]
Some Other Possible Inputs:
1 37 79 164 155 32 87 39 113 15 18 78 175 140 200 4 160 97 191 100 91 20 69 198 196
2 123 134 10 141 13 12 43 47 3 177 101 179 77 182 117 116 36 103 51 154 162 128 30
3 48 123 134 109 41 17 159 49 136 16 130 141 29 176 2 190 66 153 157 70 114 65 173 104 194 54
are,in,hello,hi,ok,yes,no,is
- Could you include an example how you'd use this? – dfhwze, Aug 29, 2019 at 10:47
- Have you tested this? It seems to me to have at least one obvious bug. – Peter Taylor, Aug 29, 2019 at 11:00
- Tested and have used this on several problems, but could have missed something. – A.L. Verminburger, Aug 29, 2019 at 11:05
1 Answer
First of all, let's make use of the with statement so that the file is closed automatically:
with open(path, 'r') as txt_file_object:
    text = txt_file_object.readlines()
With this, you don't have to call close() anymore, as the file will close automatically when you exit the with scope.
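As a minimal sketch of that behaviour, the snippet below checks the file object's closed attribute after leaving a with block, both on a normal exit and when an exception is raised inside the block. The temporary file and the variable names are made up for the illustration.

```python
import os
import tempfile

# Create a small throwaway file to read from (hypothetical sample data).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("36,22,80\n")
    sample_path = tmp.name

# Normal exit: the file is closed as soon as the block ends.
with open(sample_path, "r") as txt_file_object:
    lines = txt_file_object.readlines()
closed_after_normal_exit = txt_file_object.closed

# Exit through an exception: the file is closed as well.
try:
    with open(sample_path, "r") as failing_file_object:
        int("not a number")  # raises ValueError inside the block
except ValueError:
    pass
closed_after_exception = failing_file_object.closed

os.remove(sample_path)  # clean up the throwaway file
print(closed_after_normal_exit, closed_after_exception)
```

With an explicit close() at the end of the function, the second case would leave the file open until the object is garbage collected.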
text = text[0]
You are only reading the first line of text. Is this really what you want to do?
You are using the variable text for two different things: for the input lines and for the output values. This is not very intuitive; in fact, the result can be a list of integers, so why would it be called text? Maybe result would be a better name for it.
However, since you no longer have to close the file at the end of the function, you can return the result directly instead of saving it in a variable:
return [int(num) for num in text.split(sep)]
Returning will exit the with scope, so again the file will be closed automatically.
[let for let in text.split(sep)]
This comprehension only copies every item of text.split(sep) into a new list, so we can return the result of the split directly:
text.split(sep)
Similarly, there's a different way of applying a function to every item in a list, which is using map. Maybe the list comprehension feels more natural, so you can keep that if you want; still, I'll show you just so you know:
# Creates a list by calling 'int' on every item in the list
list(map(int, text.split(sep)))
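A quick check that the two spellings give the same result; the sample string is taken from the example input in the question. In Python 3, map returns a lazy iterator, which is why it is wrapped in list().

```python
text = "36,22,80,0,0,4"
sep = ","

# List comprehension, as in the original function.
with_comprehension = [int(num) for num in text.split(sep)]

# map applies int to each item lazily; list() materialises the result.
with_map = list(map(int, text.split(sep)))

print(with_comprehension == with_map)
```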
You are repeating a lot of code; you have four different results, but they are very similar, so let's try to provide a more generic way.
My concern here is handling the two possible dimensions. You are parsing the text in the same way (depending on whether it's int or str), but when there are two dimensions you do it for every line. So we could use Python's lambdas to decide first what type of parsing (int or str) we are doing, and then just apply it once or multiple times.
The lambda can take parameters; the only parameter we need in our case is the input text. We can't just use text directly because sometimes we want to parse the full text, but sometimes only one line:
if typ == "int":
    parse_function = lambda t: list(map(int, t.split(sep)))
elif typ == "str":
    # str.split already returns a list, so no list() call is needed
    parse_function = lambda t: t.split(sep)
else:
    raise ValueError("Unknown type.")
Now parse_function can be used like any other function, taking the text as input. So we can use it when deciding the dimension:
if dim == 1:
    # text is a list of lines, so a 1D array is parsed from the first line
    return parse_function(text[0])
elif dim == 2:
    return [parse_function(line) for line in text]
else:
    raise ValueError("Unknown dimension.")
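To see the two steps working together without touching a file, here is a small sketch on an in-memory list of lines shaped like the output of readlines(). The sample lines are invented for the illustration.

```python
# Lines as readlines() would return them, trailing newlines included.
lines = ["1,2,3\n", "4,5,6\n"]
sep = ","

# Step 1: pick the parser once (the 'int' case here).
parse_function = lambda t: list(map(int, t.split(sep)))

# Step 2: apply it once for 1D, or once per line for 2D.
one_d = parse_function(lines[0])
two_d = [parse_function(line) for line in lines]

print(one_d)
print(two_d)
```

int() ignores surrounding whitespace, so the trailing newline is harmless for integers. For the 'str' case the last item of each line would keep its "\n" unless the line is stripped first.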
You do well in raising exceptions for invalid input, but how is the user meant to know what possible values can be used for typ and dim? You could add that to the docstring. You should also say what the function does in the docstring.
Updated code
def read_array_from_txt(path, dim, typ, sep):
    '''
    Processes a text file as a 1D or 2D array.
    :param path: Path to the input file.
    :param dim: How many dimensions (1 or 2) the array has.
    :param typ: Whether the elements are read as 'int' or 'str'.
    :param sep: The text that is used to separate between elements.
    @time: O(n)
    @space: O(n)
    '''
    with open(path, "r") as txt_file_object:
        text = txt_file_object.readlines()
        if typ == "int":
            parse_function = lambda t: list(map(int, t.split(sep)))
        elif typ == "str":
            # str.split already returns a list
            parse_function = lambda t: t.split(sep)
        else:
            raise ValueError("Unknown type.")
        if dim == 1:
            # text is a list of lines; a 1D array is the first line
            return parse_function(text[0])
        elif dim == 2:
            return [parse_function(line) for line in text]
        else:
            raise ValueError("Unknown dimension.")
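As one possible variation, not part of the reviewed code, the type string can be replaced by a callable such as int or str, which removes the if/elif on typ altogether. The name read_array, its default arguments and the temporary sample files are all hypothetical. The sketch uses read().splitlines() rather than readlines(), so no item carries a trailing newline.

```python
import os
import tempfile


def read_array(path, dim, convert=int, sep=","):
    """Read a 1D or 2D array from a text file (illustrative sketch).

    convert is any one-argument callable, for example int, float or str.
    """
    if dim not in (1, 2):
        raise ValueError("Unknown dimension.")
    with open(path, "r") as txt_file_object:
        # splitlines() drops the newline at the end of every line
        lines = txt_file_object.read().splitlines()

    def parse(line):
        return [convert(item) for item in line.split(sep)]

    if dim == 1:
        return parse(lines[0])  # assumes the file has at least one line
    return [parse(line) for line in lines]


def write_sample(content):
    """Write content to a throwaway file and return its path."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write(content)
        return tmp.name


words_path = write_sample("are,in,hello\n")
grid_path = write_sample("1 37 79\n2 123 134\n")

words = read_array(words_path, 1, convert=str)
grid = read_array(grid_path, 2, sep=" ")

os.remove(words_path)
os.remove(grid_path)
print(words)
print(grid)
```

Passing a callable also means that float or any custom converter works without changing the function.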
- Like the comment on explaining inputs in the docstring, and the removal of redundancy via lambdas. Disinclined to use with; explicitly closing seems clearer. – A.L. Verminburger, Aug 29, 2019 at 16:47
- str.split already returns a list, so list(text.split(sep)) is redundant. @A.L.Verminburger: The with scope gives you more than just calling close. It guarantees that the file will be closed (barring the computer instantaneously dying), even in the event of an exception (which would prevent your close from running). It is a well-known and often used Python idiom, which I would recommend getting used to. – Graipher, Aug 29, 2019 at 16:50
- True, I assumed it would return a generator. I will edit the answer now, thanks. – eric.m, Aug 30, 2019 at 6:34
- You didn't remove list(...) from all the cases of list(text.split(sep)). You missed this one: parse_function = lambda t: list(t.split(sep)) – AJNeufeld, Sep 4, 2019 at 21:05
- You'll want return parse_function(text[0]) on the dim == 1 case, because text is a list, not a string. – AJNeufeld, Sep 4, 2019 at 21:07