I made a brute force password cracker with Python, but it's extremely slow. How can I make it faster?
import itertools
import string
import time

def guess_password(real):
    chars = string.ascii_uppercase + string.digits
    password_length = 23
    start = time.perf_counter()
    for guess in itertools.product(chars, repeat=password_length):
        guess = ''.join(guess)
        t = list(guess)
        t[5] = '-'
        t[11] = '-'
        t[17] = '-'
        tg = ''.join(t)
        if tg == real:
            return 'Scan complete. Code: \'{}\'. Time elapsed: {}'.format(tg, (time.perf_counter() - start))

print(guess_password('E45E7-BYXJM-7STEY-K5H7L'))
As I said, it's extremely slow. It takes at least 9 days to find a single password.
3 Answers
You should exploit the structure of the password, if it has any. Here you have a 20 character password separated into four blocks of five characters each, joined with a hyphen (-). So don't go on generating all combinations of length 23, only to throw most of them away.
You also str.join the guess, then convert it to a list, then replace the values and str.join it again. You could have saved yourself the first str.join entirely by directly converting to list.
You know the length of the password, so no need to hardcode it. Just get it from the real password (or, in a more realistic cracker, pass the length as a parameter).
With these small changes your code would become:
def guess_password(real):
    chars = string.ascii_uppercase + string.digits
    password_format = "-".join(["{}"*5] * 4)
    password_length = len(real) - 3
    for guess in itertools.product(chars, repeat=password_length):
        guess = password_format.format(*guess)
        if guess == real:
            return guess
Here I used some string formatting to get the right format.
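To make the formatting step concrete, here is a minimal sketch of what password_format looks like and how unpacking a 20-character guess fills it. The sample string is simply the password from the question with its hyphens removed; it is used only for illustration.

```python
# Build four groups of five "{}" placeholders, separated by hyphens.
password_format = "-".join(["{}" * 5] * 4)
print(password_format)

# Unpacking a 20-character sequence fills the placeholders in order.
# itertools.product yields tuples of characters; a plain string unpacks the same way.
raw_guess = "E45E7BYXJM7STEYK5H7L"
formatted = password_format.format(*raw_guess)
print(formatted)
```

Because the hyphens are part of the template, only the 20 variable characters ever have to be generated.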
Note also that the timing and output string are not in there. Instead make the former a decorator and the latter part of the calling code, which should be protected by an if __name__ == "__main__": guard to allow you to import from this script without running the brute force cracker:
from time import perf_counter
from functools import wraps

def timeit(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = perf_counter()
        ret = func(*args, **kwargs)
        print(f"Time elapsed: {perf_counter() - start}")
        return ret
    return wrapper

@timeit
def guess_password(real):
    ...

if __name__ == "__main__":
    real_password = 'E45E7-BYXJM-7STEY-K5H7L'
    if guess_password(real_password):
        print(f"Scan completed: {real_password}")
On my machine this takes 9.96 s ± 250 ms, whereas your code takes 12.3 s ± 2.87 s for the input string "AAAAA-AAAAA-AAAAA-FORTN".
But in the end you will always be limited by the fact that there are a lot of twenty character strings consisting of upper case letters and digits. Namely, there are \$36^{20} = 13,367,494,538,843,734,067,838,845,976,576\$ different passwords that need to be checked (well, statistically you only need to check half of them, on average, until you find your real password, but you might get unlucky). Even with the loop written in assembler, this is not going to run in less than days.
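As a rough back-of-the-envelope check of that limit, the sketch below works out how many candidates itertools.product yields before reaching the benchmark string, turns the 9.96 s timing quoted above into a guesses-per-second rate, and extrapolates to half of the full search space. It assumes the rate stays constant and that the timing applies to the improved function; the helper name guesses_needed is made up for this example.

```python
import string

chars = string.ascii_uppercase + string.digits
total = len(chars) ** 20  # size of the full search space

def guesses_needed(password):
    """Candidates itertools.product yields up to and including this password."""
    index = 0
    for c in password.replace("-", ""):
        # product iterates in lexicographic order, so this is a base-36 number
        index = index * len(chars) + chars.index(c)
    return index + 1

n = guesses_needed("AAAAA-AAAAA-AAAAA-FORTN")
rate = n / 9.96  # guesses per second, from the timing quoted above (assumed constant)
years = (total / 2) / rate / (3600 * 24 * 365)

print(f"{total:,} candidates in total")
print(f"{n:,} candidates tried for the benchmark string")
print(f"about {rate:,.0f} guesses/s, so on the order of {years:.1e} years on average")
```

Even if a compiled loop were several orders of magnitude faster than this rate, the estimate would remain far beyond any practical running time.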
- I got this output: guess = password_format.format(*guess) IndexError: tuple index out of range (Sena, Feb 12, 2019 at 17:23)
- @AkınOktayATALAY: In that case you gave it a password of a different format (or used a previous revision, I had a typo in the password length). It works with the two given strings. (Graipher, Feb 12, 2019 at 17:26)
- I think I should change password_length = len(real) - 5 to password_length = len(real) - 3 (Sena, Feb 12, 2019 at 17:27)
- @AkınOktayATALAY: Yes, I already did that (about a minute after posting the answer for the first time), just update the page. (Graipher, Feb 12, 2019 at 17:31)
- @AkınOktayATALAY: Take your time. It is usually not a bad idea to wait at least 24 hours, so everybody on the globe had a chance to see the question and think about answering. Maybe I missed something. (Graipher, Feb 12, 2019 at 17:32)
There are other ways beyond improving the code itself.
- Changes that greatly reduce allocations, such as using:
    t = list(guess)
  instead of:
    guess = ''.join(guess)
    t = list(guess)
  This reduces the runtime from 11 s to 6.7 s.
- You can use a different runtime which will speed up almost any code:
➜ /tmp python3 foo.py
Scan complete. Code: 'AAAAA-AAAAA-AAAAA-FORTN'. Time elapsed: 6.716003532
➜ /tmp pypy3 foo.py
Scan complete. Code: 'AAAAA-AAAAA-AAAAA-FORTN'. Time elapsed: 3.135087580012623
- Or precompile the existing code into a module which you can load again in your standard Python code:
# cythonize -3 -i foo.py
Compiling /private/tmp/foo.py because it changed.
[1/1] Cythonizing /private/tmp/foo.py
running build_ext
building 'foo' extension
...
# ipython3
In [1]: import foo
Scan complete. Code: 'AAAAA-AAAAA-AAAAA-FORTN'. Time elapsed: 3.846977077
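The effect of the extra join can be checked in isolation with a small micro-benchmark. This is only a sketch: it times both variants on a sample of the first 20,000 raw guesses rather than the real search, and the function names with_extra_join and direct_list are invented for the comparison. Absolute numbers will differ between machines and interpreters.

```python
import itertools
import string
import timeit

chars = string.ascii_uppercase + string.digits
# A small sample of raw guesses; the real search space is far too large to time.
sample = list(itertools.islice(itertools.product(chars, repeat=23), 20_000))

def with_extra_join(guesses):
    """Original approach: join to a string, then split it into a list again."""
    out = []
    for guess in guesses:
        guess = ''.join(guess)
        t = list(guess)
        t[5] = t[11] = t[17] = '-'
        out.append(''.join(t))
    return out

def direct_list(guesses):
    """Suggested approach: convert the tuple to a list directly."""
    out = []
    for guess in guesses:
        t = list(guess)
        t[5] = t[11] = t[17] = '-'
        out.append(''.join(t))
    return out

for func in (with_extra_join, direct_list):
    seconds = timeit.timeit(lambda: func(sample), number=5)
    print(f"{func.__name__}: {seconds:.3f} s")
```

Both functions produce identical candidates, so any difference in the printed times comes from the intermediate string alone.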
- Thanks, cythonize worked, but PyPy is 3x slower for me. Do you know why? (Sena, Feb 13, 2019 at 14:43)
- ¯\_(ツ)_/¯ sorry (viraptor, Feb 13, 2019 at 22:56)
import hashlib
from urllib.request import urlopen

############# append the below code ################

def readwordlist(url):
    try:
        wordlistfile = urlopen(url).read()
    except Exception as e:
        print("Hey there was some error while reading the wordlist, error:", e)
        exit()
    return wordlistfile

def hash(password):
    result = hashlib.sha1(password.encode())
    return result.hexdigest()

def bruteforce(guesspasswordlist, actual_password_hash):
    for guess_password in guesspasswordlist:
        if hash(guess_password) == actual_password_hash:
            print("Hey! your password is:", guess_password,
                  "\n please change this, it was really easy to guess it (:")
            # If the password is found then it will terminate the script here
            exit()
- This is just an alternative rewrite, not a review of the original code. Also, the formatting is bonked, please check the help pages for correct syntax. (TomG, Nov 4, 2023 at 16:11)
- [...] password_length to 20, you are needlessly duplicating your searches (each of your proposed passwords is constructed 36^3 different times)
- [...] AAAAA-AAAAA-AAAAA-FORTN but my old code is 5 seconds faster than yours. My old code took 25 seconds but yours took 30 seconds. Why? I thought it was going to make it faster.
- [...] AAAAA-AAAAB-AAAAA.
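The duplication mentioned in the comment about password_length can be verified on a scaled-down version of the problem. The sketch below assumes a made-up three-character alphabet and a five-character pattern with a single hyphen, so the whole space can be enumerated; the function name overwrite_style is hypothetical. With 36 characters and three overwritten positions, the same reasoning gives the factor 36^3.

```python
import itertools
from collections import Counter

chars = "AB1"  # tiny stand-in alphabet so the whole space can be enumerated

def overwrite_style(length=5, dash_at=2):
    """Mimic the original code: generate every position, then overwrite one with '-'."""
    for guess in itertools.product(chars, repeat=length):
        t = list(guess)
        t[dash_at] = '-'
        yield ''.join(t)

counts = Counter(overwrite_style())
print(len(counts), "distinct candidates out of", sum(counts.values()), "generated")
print("times each candidate was generated:", set(counts.values()))
```

Every candidate appears once per possible value of the overwritten position, which is exactly the work the shorter product avoids.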