I am working on a project which requires comparing the performance of SHA3 on different values of rate and capacity. I have implemented the same in Python and my program works fine, however, the performance of my program is slow in comparison to the standard SHA3 implementation.
The standard implementation of SHA3-512 (using hashlib.sha3_512()
) takes ~0.016 seconds for hashing a 1.5 MB file whereas my implementation of SHA3-512 for the same file takes ~20 seconds. I want the performance of my code to be similar to the standard implementation done using hashlib.sha3_512()
.
I am unable to figure out a way to optimise my code. My implementation is as follows:
import binascii
import time
class Keccak:
__STATE_LENGTH = 1600
__DELIMITED_SUFFIX = 0x06
__SHA3_RATE_LENGTH = 256
def __init__(self,variant=2):
self.__state_bytes_length = self.__STATE_LENGTH // 8
self.__delimited_suffix = self.__DELIMITED_SUFFIX
self.__SHA3_RATE_LENGTH = variant
self.__rate_bytes_length = self.__SHA3_RATE_LENGTH // 8
self.__state_in_bytes = bytearray([0 for i in range(self.__state_bytes_length)])
self.__capacity_bytes_length = self.__state_bytes_length - self.__rate_bytes_length
self.__hash_bytes_length = self.__capacity_bytes_length//2
self.__hash_bytes = bytearray()
@staticmethod
def __rotate_word(word, n):
return ((word >> (64 - (n % 64))) + (word << (n % 64))) % (1 << 64)
@staticmethod
def __load_64_bytes(byte_array):
return sum((byte_array[i] << (8 * i)) for i in range(8))
@staticmethod
def __store_64_bytes(integer):
return list((integer >> (8 * i)) % 256 for i in range(8))
def __run_inner_hash_functions(self, lanes):
R = 1
for round in range(24):
# θ
C = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4] for x in range(5)]
D = [C[(x + 4) % 5] ^ self.__rotate_word(C[(x + 1) % 5], 1) for x in range(5)]
lanes = [[lanes[x][y] ^ D[x] for y in range(5)] for x in range(5)]
# ρ and π
(x, y) = (1, 0)
current = lanes[x][y]
for t in range(24):
(x, y) = (y, (2 * x + 3 * y) % 5)
(current, lanes[x][y]) = (lanes[x][y], self.__rotate_word(current, (t + 1) * (t + 2) // 2))
# χ
for y in range(5):
T = [lanes[x][y] for x in range(5)]
for x in range(5):
lanes[x][y] = T[x] ^ ((~T[(x + 1) % 5]) & T[(x + 2) % 5])
# ι
for j in range(7):
R = ((R << 1) ^ ((R >> 7) * 0x71)) % 256
if R & 2:
lanes[0][0] = lanes[0][0] ^ (1 << ((1 << j) - 1))
return lanes
def __run_hash_function(self):
# In column first order
lanes = [[self.__load_64_bytes(self.__state_in_bytes[8 * (x + 5 * y):
8 * (x + 5 * y) + 8])
for y in range(5)]
for x in range(5)]
lanes = self.__run_inner_hash_functions(lanes)
state_in_bytes = bytearray(200)
for x in range(5):
for y in range(5):
state_in_bytes[8 * (x + 5 * y):
8 * (x + 5 * y) + 8] = self.__store_64_bytes(lanes[x][y])
self.__state_in_bytes = state_in_bytes
def get_hash_of(self, input_bytes):
block_size = 0
message_offset = 0
# === Absorb all the input blocks ===
while message_offset < len(input_bytes):
block_size = min(len(input_bytes) - message_offset, self.__rate_bytes_length)
for i in range(block_size):
self.__state_in_bytes[i] ^= input_bytes[message_offset + i]
message_offset += block_size
if block_size == self.__rate_bytes_length:
self.__run_hash_function()
block_size = 0
# === Do the padding and switch to the squeezing phase ===
self.__state_in_bytes[block_size] ^= self.__delimited_suffix
if ((self.__delimited_suffix & 0x80) != 0) and (block_size == (self.__rate_bytes_length - 1)):
self.__run_hash_function()
self.__state_in_bytes[self.__rate_bytes_length - 1] ^= 0x80
self.__run_hash_function()
# === Squeeze out all the output blocks ===
while self.__hash_bytes_length > 0:
block_size = min(self.__hash_bytes_length, self.__rate_bytes_length)
self.__hash_bytes += self.__state_in_bytes[0: block_size]
self.__hash_bytes_length -= block_size
if self.__hash_bytes_length > 0:
self.__run_hash_function()
return binascii.hexlify(self.__hash_bytes)
variant = int(input("Enter the rate value of SHA3: \n"))
keccak = None
try:
keccak = Keccak(variant)
except ValueError as v:
print(v)
exit(0)
file_name = input("Enter a file name: ")
file = open(file_name, "rb")
contents = file.read()
# for i in range(10):
start = time.time()
original_hash = keccak.get_hash_of(contents)
end = time.time()
execution_time = end-start
#execution_times.append(execution_time)
print("Hash:" + str(original_hash)[2:-1] + "\nLength of Hash: " + str(len(original_hash)))
print("Time taken for hashing: " + str(float(execution_time)) + " Seconds")
file.close()
1 Answer 1
Let's start with some conventions. Class definitions typically have two leading whitespaces before them and two after. Uppercase variables generally means it's a global variable. Single letter variable names can occasionally be fine, but I'd only use it if it relates to a known mathematical formula, and I'd want that formula kept with the code somewhere; for some of your variables it would be better to just type out pi
or whatever instead of renaming this to y
and only mentioning that in a comment. Private variables (two leading underscore to name-mangle) is sometimes used but quite rarely. We generally use _
for toss-away temporary variables to signify as much (e.g. [0 for i in range(self.__state_bytes_length)]
would be [0 for _ in range(self.__state_bytes_length)]
).
You have quite the amount of magic numbers in your code, and you have a lot of bit manipulation that isn't well documented. I'd die a little inside if I had to review this code for work.
Doing a sys.exit(0)
upon catching an exception is really strange - an exit code of 0
typically means success.
Instead of manually closing the file you've opened it would be better to use a context manager (with open(path, mode) as f:
).
Also, if you want to do profiling with time prints, it would be better to use perf_counter
, and even better yet to use a module like timeit
or to use the profile
module.
With that out of the way, let's move on to what you actually requested feedback on: the performance (as in timewise). I don't think you're ever going to get the time down to what hashlib
's function does without using pypy or leaving python entirely - i.e. you'd need to extend with C or C++ or some other compiled/lower level language. There are some minor improvements that you can do (replace calling list
with list comprehension, combine your nested loops using itertools
), but python is just not built for these kind of speeds.
-
\$\begingroup\$
file
is a built-in? That's news to me.. \$\endgroup\$Reinderien– Reinderien2022年01月07日 16:22:15 +00:00Commented Jan 7, 2022 at 16:22 -
\$\begingroup\$ My bad - it was in python2 but has since been removed. \$\endgroup\$ades– ades2022年01月07日 16:36:14 +00:00Commented Jan 7, 2022 at 16:36
-
\$\begingroup\$ I was wondering if there's an issue while reading the file..as in instead of reading the complete file at once, how about reading and hashing it in parts? Would that make my program faster? @ades \$\endgroup\$Ankush Soni– Ankush Soni2022年01月10日 07:26:55 +00:00Commented Jan 10, 2022 at 7:26
-
\$\begingroup\$ @ades Or how about using multiprocessing for this to make the code parallelized? \$\endgroup\$Ankush Soni– Ankush Soni2022年01月11日 06:58:02 +00:00Commented Jan 11, 2022 at 6:58
-
\$\begingroup\$ I think better ideas would be to try out using pypy or cython @AnkushSoni. You should otherwise try to profile your script to find out where you spend time; you wouldn't start optimising any part without first quantifying how long you're currently spending. \$\endgroup\$ades– ades2022年01月11日 08:48:05 +00:00Commented Jan 11, 2022 at 8:48
Explore related questions
See similar questions with these tags.
hashlib
are written in C to make use of possible hardware optimizations. So they are obviously much faster than your Python-interpreted implementation. Thus, you will not achieve your goal in making a pure Python implementation equally performant. \$\endgroup\$hashlib
library specifically made to do this? \$\endgroup\$hashlib
library won't let me do that until I tweak the source code. :( \$\endgroup\$