This is the first real Python tool I've written, and I'm happy with the way it works, but I'm sure there are better ways to do things in the code. Any ideas on how to speed it up would be nice too!
Code:
#!/usr/bin/python
import StringIO
import getopt
import hashlib
import sys
import os
print " "
print "Python Hash-Cracker"
print "Version 3.0-2 Stable"
def info():
    print " "
    print "Information:"
    print "[*]Options:"
    print "[*](-h) Hash"
    print "[*](-t) Type [See supported hashes]"
    print "[*](-w) Wordlist"
    print "[*](-n) Numbers bruteforce"
    print "[*](-v) Verbose [{WARNING}Slows cracking down!]"
    print "[*]Examples:"
    print "[>]./Hash-Cracker.py -h <hash> -t md5 -w DICT.txt"
    print "[>]./Hash-Cracker.py -h <hash> -t sha384 -n -v"
    print "[*]Supported Hashes:"
    print "[>]md5, sha1, sha224, sha256, sha384, sha512"
    print "[*]Thats all folks!\n"
def check_os():
    if os.name == "nt":
        operating_system = "windows"
    if os.name == "posix":
        operating_system = "posix"
    return operating_system
class hash:
    def hashcrack(self, hash, type):
        self.num = 0
        if (type == "md5"):
            h = hashlib.md5
        elif (type == "sha1"):
            h = hashlib.sha1
        elif (type == "sha224"):
            h = hashlib.sha224
        elif (type == "sha256"):
            h = hashlib.sha256
        elif (type == "sha384"):
            h = hashlib.sha384
        elif (type == "sha512"):
            h = hashlib.sha512
        else:
            print "[-]Is %s a supported hash type?" % type
            exit()
        wordlist1 = open(wordlist, "r")
        wordlist2 = wordlist1.read()
        buf = StringIO.StringIO(wordlist2)
        while True:
            line = buf.readline().strip()
            if (line == ""):
                print "\n[-]Hash not cracked:"
                print "[*]Reached end of wordlist"
                print "[*]Try another wordlist"
                print "[*]Words tryed: %s" % self.num
                break
            hash2 = h(line).hexdigest()
            if (ver == "yes"):
                sys.stdout.write('\r' + str(line) + ' ' * 20)
                sys.stdout.flush()
            if (hash2 == hash.lower()):
                print "[+]Hash is: %s" % line
                print "[*]Words tryed: %s" % self.num
                break
            else:
                self.num = self.num + 1
    def hashcracknum(self, hash, type):
        self.num = 0
        if (type == "md5"):
            h = hashlib.md5
        elif (type == "sha1"):
            h = hashlib.sha1
        elif (type == "sha224"):
            h = hashlib.sha224
        elif (type == "sha256"):
            h = hashlib.sha256
        elif (type == "sha384"):
            h = hashlib.sha384
        elif (type == "sha512"):
            h = hashlib.sha512
        else:
            print "[-]Is %s a supported hash type?" % type
            exit()
        while True:
            line = "%s" % self.num
            line.strip()
            hash2 = h(line).hexdigest().strip()
            if (ver == "yes"):
                sys.stdout.write('\r' + str(line) + ' ' * 20)
                sys.stdout.flush()
            if (hash2.strip() == hash.strip().lower()):
                print "[+]Hash is: %s" % line
                break
            else:
                self.num = self.num + 1
def main(argv):
    what = check_os()
    print "[Running on %s]\n" % what
    global hash1, type, wordlist, line, ver, numbrute
    hash1 = None
    type = None
    wordlist = None
    line = None
    ver = None
    numbrute = None
    try:
        opts, args = getopt.getopt(argv,"ih:t:w:nv",["ifile=","ofile="])
    except getopt.GetoptError:
        print '[*]./Hash-Cracker.py -t <type> -h <hash> -w <wordlist>'
        print '[*]Type ./Hash-Cracker.py -i for information'
        sys.exit(1)
    for opt, arg in opts:
        if opt == '-i':
            info()
            sys.exit()
        elif opt in ("-t", "--type"):
            type = arg
        elif opt in ("-h", "--hash"):
            hash1 = arg
        elif opt in ("-w", "--wordlist"):
            wordlist = arg
        elif opt in ("-v", "--verbose"):
            ver = "yes"
        elif opt in ("-n", "--numbers"):
            numbrute = "yes"
    if not (type and hash1):
        print '[*]./Hash-Cracker.py -t <type> -h <hash> -w <wordlist>'
        sys.exit()
    if (type == "hashbrowns"):
        if (hash1 == "hashbrowns"):
            if (wordlist == "hashbrowns"):
                print " ______"
                print "^. .^ \~"
                print " (oo)______/"
                print " WW WW"
                print " What a pig!!! "
                exit()
    print "[*]Hash: %s" % hash1
    print "[*]Hash type: %s" % type
    print "[*]Wordlist: %s" % wordlist
    print "[+]Cracking..."
    try:
        if (numbrute == "yes"):
            h = hash()
            h.hashcracknum(hash1, type)
        else:
            h = hash()
            h.hashcrack(hash1, type)
    except IndexError:
        print "\n[-]Hash not cracked:"
        print "[*]Reached end of wordlist"
        print "[*]Try another wordlist"
        print "[*]Words tryed: %s" % h.num
    except KeyboardInterrupt:
        print "\n[Exiting...]"
        print "Words tryed: %s" % h.num
    except IOError:
        print "\n[-]Couldn't find wordlist"
        print "[*]Is this right?"
        print "[>]%s" % wordlist
if __name__ == "__main__":
    main(sys.argv[1:])
Note: Ignore the hashbrowns and pig; that was an Easter egg for my brother. Every time he tested my error handling he would put 'hashbrowns' for every arg.
2 Answers
Summary:
- Better comments and docstrings.
- Separate code that prints messages and computes values.
- Try not to do too much in a single function.
Big things:
First time I tried to run the program, it immediately hit an error:
$ tail -n 1 /usr/share/dict/words
Zyzzogeton
$ md5 -s "Zyzzogeton"
MD5 ("Zyzzogeton") = 4f9c55496b14676f23f40117cc89e641
$ python hashcrack.py -h "4f9c55496b14676f23f40117cc89e641" -t md5 /usr/share/dict/words
Python Hash-Cracker
Version 3.0-2 Stable
[Running on posix]
[*]Hash: 4f9c55496b14676f23f40117cc89e641
[*]Hash type: md5
[*]Wordlist: None
[+]Cracking...
Traceback (most recent call last):
  File "hashcrack.py", line 172, in <module>
    main(sys.argv[1:])
  File "hashcrack.py", line 157, in main
    h.hashcrack(hash1, type)
  File "hashcrack.py", line 52, in hashcrack
    wordlist1 = open(wordlist, "r")
TypeError: coercing to Unicode: need string or buffer, NoneType found
Turns out I’d forgotten the -w flag; your program should be better about handling malformed user input.
You should read PEP 8, the Python style guide. Among other things: indentation is inconsistent (Python standard is 4 spaces, but this file uses 2, 3 and 4 interchangeably); class names should be CamelCase; and there should be two blank lines between functions.
I would consider using something like docopt to do the command-line argument parsing. Even if you don’t use docopt, it’s worth looking at the style of usage message it requires:
- There’s a standard format for usage messages on command-line tools. Sticking to this format will make it easier for other people to quickly pick up using your tool.
- There’s a long-standing convention of using -h or --help to print help information. Using -i to get info is fairly unusual.
- And using docopt would allow you to simplify a big chunk of your argument-parsing code.
There are no comments or docstrings in this code, which makes it hard to work out what something is doing (and by extension, whether it’s doing it correctly).
A function should either do something (e.g. print to the screen) or return a value. Your functions tend to intersperse printing with computation. If I want to get the result of a function but without the printing, that's quite hard to do. Separating the two will make it easier to reuse your code.
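A minimal sketch of that separation, assuming Python 3 (the code under review is Python 2) and using hypothetical names (find_match, report) that are not in the original script. The search function only returns a result, and the caller decides what, if anything, to print:

```python
import hashlib


def find_match(candidates, target_hash, hash_func=hashlib.md5):
    """Return (word, attempts) for the first candidate whose digest matches.

    Returns (None, attempts) when nothing matches. Prints nothing.
    """
    target_hash = target_hash.strip().lower()
    attempts = 0
    for attempts, word in enumerate(candidates, start=1):
        if hash_func(word.encode("utf-8")).hexdigest() == target_hash:
            return word, attempts
    return None, attempts


def report(word, attempts):
    """Build the user-facing message; the caller chooses where it goes."""
    if word is None:
        return "[-]Hash not cracked after %d words" % attempts
    return "[+]Hash is: %s (after %d words)" % (word, attempts)


target = hashlib.md5(b"dog").hexdigest()
print(report(*find_match(["cat", "dog", "emu"], target)))
```

Because find_match has no side effects, it can be called from a test or another script without any output appearing.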
(You could use something like the logging module for messages you want to print mid-computation, as logging is easier to turn off than printing to stdout.)
More generally, I’d consider breaking your code down into smaller functions. Each function can be written and tested individually, and it tends to make it easier to see what’s going on.
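To make the logging suggestion concrete, here is a rough sketch, assuming Python 3. The logger name "hashcrack" and the function crack are made up for illustration. The idea is that verbose mode becomes a log level rather than a global flag checked inside the loop:

```python
import hashlib
import logging

log = logging.getLogger("hashcrack")  # hypothetical logger name


def crack(candidates, target_hash):
    """Return the matching word or None; progress goes to the logger."""
    for word in candidates:
        log.debug("trying %s", word)
        if hashlib.md5(word.encode("utf-8")).hexdigest() == target_hash:
            return word
    return None


# In the real tool this would come from the -v flag.
verbose = False
logging.basicConfig(level=logging.DEBUG if verbose else logging.WARNING)

target = hashlib.md5(b"dog").hexdigest()
print(crack(["cat", "dog"], target))
```

With verbose left as False the debug messages are dropped by the logging module, so the cracking function itself never needs to know about the flag.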
And smaller comments:
- Common convention is that if a program runs successfully, it exits with code 0; any other outcome is a non-zero exit code. If I ask for a non-existent hash type, the exit() on line 91 will print an error, then exit with code 0. Also, errors should be printed to stderr, not stdout.
- Rather than a big branch of if ... elif statements in hashcracknum(), use a dict for the lookup of hash name to hash function. Python hash tables are very efficient.
- Your code could be better about handling user input. For example, this command:
  python hashcrack.py -h notahash -t MD5 -w /usr/share/dict/words
  I claim that the hash function I want to use is obvious to a human reader, but this causes the script to error.
- Don’t use type as a variable name; overriding built-in functions is a recipe for weirdness.
- You definitely shouldn’t name a class hash. This overrides a builtin Python function (that’s used in quite a few places), classes should have CamelCase names, and it’s not particularly descriptive.
- Your check_os() function will throw an UnboundLocalError (a subclass of NameError) if called on a platform where os.name isn't nt or posix. If you want a human-readable platform name, you might want to look at the platform module, which tends to have more granular names.
- Don’t print anything at the module level outside the main() function; if anybody tries to import this file, those prints will be executed no matter what. That makes it very annoying to reuse your code.
- It’s common practice for Python modules/libraries to present a __version__ attribute; it would be good to do that for your version string.
Long conditional chains
Use a dictionary instead of long conditional chains
hash_functors = {
    'md5': hashlib.md5,
    'sha1': hashlib.sha1,
    ...
}
try:
    hash_functor = hash_functors[hash_type]
except KeyError:
    raise YourSpecialCustomErrorOrWhatever("Unknown hash type {}".format(hash_type))
Even better, however, is to just use the dictionary implicitly contained by the hashlib module. All of your hash types are just functions of the module, so you should do something like
def get_hash_functor(hash_type):
    try:
        return getattr(hashlib, hash_type)
    except AttributeError:
        raise YourSpecialCustomErrorOrWhatever("Unknown hash type {}".format(hash_type))
By wrapping it in a function you also make it much easier to change the backend implementation of your hash functions and swap them out for others as needed.
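One caveat with getattr is that it also returns module attributes that are not hash constructors (hashlib.new, for example). A possible variation, sketched here for Python 3, validates the name with hashlib.new, which raises ValueError for names it does not know. UnknownHashType is a hypothetical class standing in for the placeholder exception above:

```python
import functools
import hashlib


class UnknownHashType(ValueError):
    """Hypothetical custom error, standing in for the placeholder above."""


def get_hash_functor(hash_type):
    """Return a callable that builds a hash object of the requested type."""
    name = hash_type.strip().lower()
    try:
        hashlib.new(name)  # raises ValueError for names hashlib does not know
    except ValueError:
        raise UnknownHashType("Unknown hash type {}".format(hash_type))
    return functools.partial(hashlib.new, name)


md5 = get_hash_functor("MD5")
print(md5(b"dog").hexdigest() == hashlib.md5(b"dog").hexdigest())  # True
```

The caller still gets something it can use like hashlib.md5, and lower-casing the name means that "MD5" on the command line is accepted.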
Context managers
Always use context managers whenever possible.
with open(wordlist, 'r') as wordlist1:
    do_stuff(wordlist1)
This ensures that any necessary cleanup occurs after the indented block, even if there is an unhandled exception.
Iterating over a file
It's easier to iterate over a file in Python like this
for line in open_file:
    do_something_to_line(line)
You can also clearly handle the case where no match was found. With for-loops in Python, an else block is entered if the loop isn't broken out of or returned from, i.e. if you loop over an empty iterable, or the condition is never met within the loop, etc.
# Fragment from inside a function; wordlist1 is the open wordlist file.
index = 0
hashed_word = None
for index, word in enumerate(wordlist1, start=1):
    hashed_word = hash_functor(word.strip())
    if hashed_word.hexdigest() == correct_hash:
        print "[+]Hash is: %s" % word.strip()
        break
else:
    # Only reached when the loop finishes without hitting break.
    print "[-]Hash not cracked:"
    print "[*]Reached end of wordlist"
    print "[*]Try another wordlist"
print "[*]Words tried: %s" % index
return hashed_word
I also pulled some common code out of there to make it clearer, returned the value so you can continue to process it if necessary, and used enumerate to more cleanly handle counting loop iterations.
hashcracknum vs hashcrack
These are basically identical; the only difference is what they're trying to hash. Factor the similarities out into shared functions, and then only have the differences and minor setup code in each. You'll end up with something that looks sort of like this
import itertools

def is_matching_hash(value_to_hash, correct_hash, hash_functor):
    # Lowercase the target hash, not the candidate word, as the original code did.
    return hash_functor(value_to_hash).hexdigest() == correct_hash.lower()

def _do_hash_crack(iterable, hash_functor, correct_hash, verbose):
    # Returns (number of attempts, matching value), or (attempts, None) if nothing matched.
    attempt_number = 0
    for attempt_number, value in enumerate(iterable, start=1):
        if verbose:
            print "{:<20}".format(value)
        if is_matching_hash(value, correct_hash, hash_functor):
            return attempt_number, value
    return attempt_number, None

def hash_crack(word_file_name, correct_hash, hash_functor, verbose=False):
    with open(word_file_name, 'r') as wordfile:
        iterable = (word.strip() for word in wordfile)
        tries, word = _do_hash_crack(iterable, hash_functor, correct_hash, verbose)
    if word is None:
        print "\n[-]Hash not cracked:"
        print "[*]Reached end of wordlist"
        print "[*]Try another wordlist"
    else:
        print "[*]Hash is: %s" % word
    print "[*]Words tried: %s" % tries
    return word

def hash_crack_num(correct_hash, hash_functor, verbose=False):
    # Hash the decimal string of each number, as the original code did.
    numbers = (str(number) for number in itertools.count())
    tries, number = _do_hash_crack(numbers, hash_functor, correct_hash, verbose)
    print "[*]Words tried: %s" % tries
    return number
As you can see, the main difference is in what the iterable is, and what message we print out when no match is found. That never happens with the hash_crack_num function, as it does not terminate until the correct value is found. This is your code's original behavior, but it smells bad to me - is it what you intended?
StringIO
In order to maximize compatibility and performance while using StringIO you generally want something like this
try:
    import cStringIO as io
except ImportError:
    try:
        import StringIO as io
    except ImportError:
        import io
The order you try them in will generally depend on what you think is most likely to be available on your system. I keep a little function around that I use whenever I have chains like these
def import_fallback(names, name_as, final_exception=None):
    for name in names:
        try:
            globals()[name_as] = __import__(name)
            break
        except ImportError as e:
            if name == names[-1]:
                if final_exception is not None:
                    final_exception(e)
                else:
                    raise
Then you'd just do
import_fallback(['io', 'cStringIO', 'StringIO'], 'io')
An easy modification if you don't like messing with globals() inside of the function would be to explicitly return the module, and then put it into whatever scope you want, a la
globals()['io'] = import_fallback(['io', 'cStringIO', 'StringIO'])
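The returning variant is not spelled out above, so here is one way it might look. This is a sketch assuming Python 3 and importlib.import_module, not the answer's own helper; it drops the name_as and final_exception parameters for brevity:

```python
import importlib


def import_fallback(names):
    """Return the first module in names that imports successfully."""
    last_error = None
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError as error:
            last_error = error
    if last_error is None:
        raise ImportError("no module names given")
    raise last_error


# On Python 3 only the last name exists, so this ends up being the io module.
io = import_fallback(["cStringIO", "StringIO", "io"])
buf = io.StringIO(u"cat\ndog\n")
print(buf.readline().strip())  # cat
```

Assigning the result to an ordinary name keeps the import visible at the call site, which is easier to follow than a write to globals().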
CLI and main
Use argparse, or a third-party module like docopt or click, instead of getopt - much cleaner and more familiar for Python developers.
You should also probably have a dedicated function (or 2, or 3, or however many it takes to do it cleanly and modularly) to handle the command line.
You also don't need any global state - instead take the values that functions will need and pass them as parameters. The functions I rewrote up above assume that the caller function gives it a functor to hash a value, as well as the correct hash, the name of the file (if applicable) and whether or not the function is 'verbose'.
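As a rough illustration of both points, here is an argparse sketch, assuming Python 3. The layout is my own guess at a reasonable interface, not the original tool's: the hash becomes a positional argument because argparse reserves -h for help, and -w and -n are placed in a required mutually exclusive group so that forgetting both is reported as a usage error.

```python
import argparse

SUPPORTED_TYPES = ["md5", "sha1", "sha224", "sha256", "sha384", "sha512"]


def build_parser():
    """Describe the command line; nothing is stored in globals."""
    parser = argparse.ArgumentParser(
        description="Dictionary and numeric hash cracker")
    parser.add_argument("hash", help="hex digest to crack")
    parser.add_argument("-t", "--type", dest="hash_type", type=str.lower,
                        choices=SUPPORTED_TYPES, required=True,
                        help="hash algorithm")
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("-w", "--wordlist", help="path to a wordlist file")
    source.add_argument("-n", "--numbers", action="store_true",
                        help="brute-force decimal numbers")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="show each candidate (slows cracking down)")
    return parser


def parse_args(argv):
    """Return parsed options; argparse exits with status 2 on bad input."""
    return build_parser().parse_args(argv)


args = parse_args(["4f9c55496b14676f23f40117cc89e641", "-t", "MD5",
                   "-w", "/usr/share/dict/words"])
print(args.hash_type, args.wordlist, args.verbose)
```

The parsed values can then be passed straight to the cracking functions as parameters, for example hash_crack(args.wordlist, args.hash, hash_functor, args.verbose).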
OOP
Just because you can make something an object doesn't mean you should. I don't see any advantage to making this a class.
You're also abusing OOP - you have self.num (which is also horribly named and doesn't at all indicate what purpose it serves) and use it for different purposes in two places.
sys.stdout.write vs print
There is almost never a reason to use sys.stdout.write instead of just print. So just use print.
python test.py -t md5 -h md5 -w "cat, dog", but it didn't work. (Also the tabbing seems fine for anyone wondering)