How can I make the following checksum algorithm more pythonic and readable?
items = {1:'one', 2:'two', 3:'three'}
text = "foo" # This variable may change on runtime. Added to allow the execution
number = 1 # This variable may change on runtime. Added to allow the execution
string = text + items[number] + " " + "trailing"
if items[number] == "two": string=text
string = [ord(x) for x in string]
iterator = zip(string[1:], string)
checksum = string[0]
for counter, (x, y) in enumerate(iterator):
if counter % 2:
checksum += x
checksum += 2 * y
else:
checksum += x * y
checksum %= 0x10000
print("Checksum:", checksum)
The thing called salt
by everyone here is fixed and won't change ever. Also, the code is into a bigger function.
This code as-is must return this:
Checksum: 9167
4 Answers 4
First of all, I would apply the "Extract Method" refactoring method and put the logic behind generating a checksum into a get_checksum()
function.
Also, I would avoid hardcoding the trailing
"salt" and the 0x10000
"normalizer" and put them into the default keyword argument value and to a "constant" respectively.
You can use sum()
instead of having loop with if and else, which increase the overall nestedness. But, it might become less explicit and potentially a bit more difficult to understand.
Comments and docstrings would definitely be needed to describe the logic behind the checksum calculations.
Variable naming can be improved - str_a
, str_b
and string
are not the best choices (even though the part1
and part2
in the code below are not either - think about what these strings represent - may be you'll come up with better variable names). Also, note that you are reusing the string
variable to keep the "ords" of each character, which, strictly speaking, is not good - the variable name does not correspond to what it is used for.
At the end you would have something like:
DEFAULT_SALT = "trailing"
NORMALIZER = 0x10000
def get_checksum(part1, part2, salt=DEFAULT_SALT):
"""Returns a checksum of two strings."""
combined_string = part1 + part2 + " " + salt if part2 != "***" else part1
ords = [ord(x) for x in combined_string]
checksum = ords[0] # initial value
# TODO: document the logic behind the checksum calculations
iterator = zip(ords[1:], ords)
checksum += sum(x + 2 * y if counter % 2 else x * y
for counter, (x, y) in enumerate(iterator))
checksum %= NORMALIZER
return checksum
We can see that the test you've mentioned in the question passes:
$ ipython3 -i test.py
In [1]: get_checksum("foo", "one")
Out[1]: 9167
-
1\$\begingroup\$ Can you update the answer with the new code? I've made a edit after the Dex' ter comment. \$\endgroup\$0x2b3bfa0– 0x2b3bfa02017年02月07日 16:38:06 +00:00Commented Feb 7, 2017 at 16:38
-
3\$\begingroup\$ @Helio You can use his (and also my) function directly, even with your new code, by just passing it the correct arguments. \$\endgroup\$Graipher– Graipher2017年02月07日 16:39:08 +00:00Commented Feb 7, 2017 at 16:39
-
1\$\begingroup\$ @Graipher: Yes. P. S: You made a typo on my nickname. ;-) \$\endgroup\$0x2b3bfa0– 0x2b3bfa02017年02月07日 16:41:04 +00:00Commented Feb 7, 2017 at 16:41
-
2\$\begingroup\$ @Helio Hah, it was not yet too late to fix it :) \$\endgroup\$Graipher– Graipher2017年02月07日 16:41:33 +00:00Commented Feb 7, 2017 at 16:41
-
1\$\begingroup\$
part1
becomesname
, part2 will take one of these values{1:'ver_one', 2:'', 3:'ver_three'}
andsalt
is ` version`. The "salt" can be constant. ;-) \$\endgroup\$0x2b3bfa0– 0x2b3bfa02017年02月07日 16:46:15 +00:00Commented Feb 7, 2017 at 16:46
Similarly to alecxe's answer, i would wrap the code in a function and make the salt a parameter.
I would use str.format
for building the string
(and still use a ternary).
I see no need for the explicit iterator
variable, just inline it.
def checksum(str_a, str_b, salt):
string = str_a if str_b == "***" else "{}{} {}".format(str_a, str_b, salt)
string = [ord(x) for x in string]
checksum = string[0]
for counter, (x, y) in enumerate(zip(string[1:], string)):
if counter % 2:
checksum += x + 2 * y
else:
checksum += x * y
return checksum % 0x10000
Okay so i did read this question few hours ago on phone and started working out an answer on paper, having the performance tag in mind. So here it is:
untouched lines:
items = {1:'one', 2:'two', 3:'three'}
text = "foo" # This variable may change on runtime. Added to allow the execution
number = 1 # This variable may change on runtime. Added to allow the execution
first changes made here. Your Code does set sring and overwrtes it if items[numer]
equals string "two". I dont like those inline if elses in long lines so i will take more space then other answers but think this is more readable.
if items[number] == "two":
string = text
else:
string = "{}{} trailing".format(text, items[number])
next line is untouched again with:
string = [ord(x) for x in string]
and now my performance-part comes into play. To understand i will explain your algorith and how i rearanged it. You zip the String starting with the second element and itself. This resolves to [[string[1], string[0]], [string[2], string[1]], ... , [string[last element], string[last element -1]]]
.
This is a list with len(string) -1 elements.
So we will memorize strings length
string_len = len(string)
We will also memorize if the number of zip-lists elements would be even or odd. In case string has an odd number of elements zip-list would have an even number and if strings number of elements would be even zip-lists had been odd.
odd = (string_len -1) % 2
We will also do the initial setup for checksum as is:
checksum = string[0]
Iterating over iterator there are 2 cases: the iteration is odd
if counter % 2: checksum += x checksum += 2 * y
or its even
else: checksum += x * y
There are two things unnecessary in my opinion. First is creating the zip-list since you can access every element by index. Element x is string[iteration+1]
, y is string[iteration]
. Since counter is successively even and odd we can also say that for two iterations we will once do
checksum += x * y in case our iteration-counter is even and checksum += x checksum += 2 * y in case its not even.
Or we will just allways do 2 iterations in one step and do both operations like:
for i in range(0, string_len -1 -odd, 2):
checksum += string[i] * string[i + 1] # counter % 2 := false checksum += x * y
checksum += string[i + 2] # counter % 2 := true checksum += x
checksum += 2 * string[i + 1] # counter % 2 := true checksum += 2 * y
Now why we did memorize if zip-lists number of elements would had been odd or even: We allways made 2 operations but this wont work for an odd number of operations. So we will check once with an if what to do:
if odd:
checksum += string[i] * string[i + 1]
Anything else as is:
checksum %= 0x10000
print("Checksum:", checksum)
it might be recommendable to save the modulo in another place like other answers mention.
Entire Code looks like this:
items = {1:'one', 2:'two', 3:'three'}
text = "foo" # This variable may change on runtime. Added to allow the execution
number = 1 # This variable may change on runtime. Added to allow the execution
if items[number] == "two":
string = text
else:
string = str.format("{:s}{:s} trailing", text, items[number])
string = [ord(x) for x in string]
string_len = len(string)
odd = (string_len -1) % 2
checksum = string[0]
for i in range(0, string_len -1 -odd, 2):
checksum += string[i] * string[i + 1] # counter % 2 := false checksum += x * y
checksum += string[i + 2] # counter % 2 := true checksum += x
checksum += 2 * string[i + 1] # counter % 2 := true checksum += 2 * y
if odd:
checksum += string[i] * string[i + 1]
checksum %= 0x10000
print("Checksum:", checksum)
-
1\$\begingroup\$ Agree on the ternary being borderline too long to understand. Would write
string = "{}{} trailing".format(text, items[number])
, though as being shorter and just as easy to read IMO. Also,:s
is implied. \$\endgroup\$Graipher– Graipher2017年02月07日 20:04:24 +00:00Commented Feb 7, 2017 at 20:04 -
1\$\begingroup\$ You are absolutely right, somehow i didnt come up with the right method call. Especially this is more OOP which i like, will change it :) \$\endgroup\$SchreiberLex– SchreiberLex2017年02月07日 21:11:06 +00:00Commented Feb 7, 2017 at 21:11
My review (Python 3).
items = {1: 'one', 2: 'two', 3: 'three'}
text = "foo"
number = 1
string_original = "{}{} trailing".format(text, items[number])
if items[number] == "two":
string_original = text
ord_list = [ord(x) for x in string_original]
checksum = ord_list[0]
for i in range(1, len(ord_list)):
x, y = ord_list[i], ord_list[i-1]
# The `-1` is necessary since parity is changed by starting the loop on 1
if (i-1) % 2:
checksum = checksum + x + 2*y
else:
checksum = checksum + x*y
checksum %= 0x10000
print("Checksum:", checksum)
Some notes about it:
- I had recommend putting the code after an
if
statement on its own line, even if it is only a single statement. - Always leave one space when declaring variables.
string = text
and notstr=text
(unless specifying default parameters inside a function). - I had also recommend not use the
+=
kind of operators when there a is a non trivial expression on the right side, even though this might come down to personal preference. - Also, you should try to avoid using generic names such as string for naming your variables.
- Regardless of the language used, always try to see if you can simplify the code somehow, in order to make it more maintainable.
str_a
andstr_b
to some strings ... \$\endgroup\$