4chan Tripcode Explorer in Python

Question 1

Background information for those of you who don't know what 4chan tripcodes are:
Via Wikipedia:

A tripcode is the hashed result of a password that allows one's identity to be recognized without storing any data about users. Entering a particular password will let one "sign" one's posts with the tripcode generated from that password.

Not displayed on Wikipedia:

Many people like to create special tripcodes for themselves that contain a certain string. For example, the password "LC,T{af" generates the tripcode "QeMbDfeels", which contains the word "feels", a reference to Wojak, also known as Feels Guy.

On my Intel Core 2 Duo T7400 @ 2.16GHz processor, I can generate about 20000 tripcodes per second. I don't know whether I can optimize this program any further or whether my processor is just slow.

#!/usr/bin/python/ 
# -*- coding: utf-8 -*
from __future__ import division
import sys,re,string,crypt,random,time
Password,UpdateCount,ElapsedUpdates,Total,Matches,Rate,ElapsedCount,Filetext="0",0,0,0,0,0,0,"" #Saving myself a few lines by declaring all of these onto one line
Characters="!#$%&\"\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~" #Use all possible unicode characters accepted by 4chan
random.seed(time.time()) #Seed the random module with the current time
absolutestartTime=(time.time()) #Set an absolute start time for statistics 
isCaseSensitive=Respond.query("Search for exact case / case-sensitive?") #Query for case-sensitive search
def GenerateTripcode(Password):
    ShiftJIS=Password.encode('shift_jis', 'ignore')#Convert to SHIFT-JIS.
    Tripcode=crypt.crypt( #Generate the tripcode
        ShiftJIS, re.compile('[^\.-ElapsedUpdates]')\
        .sub('.', (ShiftJIS+'...')[1:3])\
        .translate(string.maketrans(':;<=>?@[\\]^_`','ABCDEFGabcdef'))\
        )[-10:]
    return Tripcode
def GenerateRandomString(Length): return ''.join(random.choice(Characters) for i in range(7)) #Decided to throw this into its own function. If it's more efficient to just use the verbatim command instead of calling the function, please tell me
startTime = time.time() #Grab the current time for the performance checker
def getCheck(chk, tf): #I hated having this clause in the while loop, so I just passed it into a function
    if tf: return chk
    if not tf: return string.lower(chk)
if not isCaseSensitive: Find=string.lower(sys.argv[1]) #If non-case sensitive, lowercase the search string
if isCaseSensitive: Find=sys.argv[1]
try: #try clause in order to catch an interrupt
    while 1==1: #Infinite loop
        UpdateCount+=1;ElapsedCount+=1;Total+=1 #Increase counts by one.
        Password=GenerateRandomString(7) #Generate random string (question from line 18)
        Tripcode=GenerateTripcode(Password)
        if re.search(Find, getCheck(Tripcode, isCaseSensitive))>-1: #Check if string contains match using regex
            Out=Password+" >>> "+Tripcode+"\n" #Generate output string showing a code has been found
            Filetext+=Out #Add this to a filetext variable instead of directly writing it to file during the loop
            print "\033[K\r", #Flush out any text from the current line
            print Password+" >>> "+Tripcode #Print out the password and tripcode.
            Matches+=1 #Add 1 to the matchcount for summary later.
        if UpdateCount==100: #Update generated count and rate every 100 loops
            UpdateCount=0;ElapsedUpdates+=1 #Reset counter, +1 to the rate checker counter
            status=str(Total)+" tripcodes"+" "+str(Rate)+" tripcodes/sec"+'\r' #Set status to a variable
            sys.stdout.write(status) #Print out status
            sys.stdout.flush()
        if ElapsedUpdates==10: #Every 1000 codes, check rates
            ElapsedRange=time.time() - startTime #See how many seconds have elapsed since last check
            Rate=int(round(ElapsedCount/ElapsedRange)) #Get rate (Tripcodes/sec)
            startTime=time.time() #Reset startTime for next check
            ElapsedCount=0 #Reset elapsed tripcode count
            ElapsedUpdates=0 #Reset elapsed update count
except KeyboardInterrupt: #Catch keyboard interrupt
    ElapsedSecs = time.time() - absolutestartTime #Use absolute time set at line 8 to see how many seconds this program ran
    Elapsed = time.strftime("%M minutes, %S seconds", time.gmtime(ElapsedSecs)) #Use another variable to format the time for printing
    #Print statistics.
    print "\nCaught interrupt."
    print str(Matches)+" matches found"
    print str(Total)+" codes generated in "+Elapsed
    print "Average of ~"+str(int(round(Total/ElapsedSecs)))+" tripcodes generated per second"
    if Matches>=1:
        print "1 match found every ~"+str(round(ElapsedSecs/Matches,2))+" seconds"
        print "1 match found in every ~"+str(int(round(Total/Matches)))+" tripcodes"
    print "Writing matches to file...",
    open("t.txt", "a").write(Filetext)
    print "done."
    exit()

The module Respond is the code from Recipe 577058.

Question 2

Given that you're only using ASCII characters for the password, you should be able to skip the Shift JIS encoding step entirely, because plain ASCII is encoded identically in Shift JIS.
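A quick way to check this claim, as a sketch: the snippet below encodes every character of the password alphabet with Python's shift_jis codec and compares the bytes with the plain ASCII encoding. It is written for Python 3, where encode returns bytes. The name PASSWORD_CHARACTERS is made up for this example and is assumed to cover the same characters as the Characters string in the question.

```python
import string

# Hypothetical name; assumed to match the question's Characters string:
# digits, ASCII letters and ASCII punctuation (no whitespace).
PASSWORD_CHARACTERS = string.digits + string.ascii_letters + string.punctuation


def differs_in_shift_jis(characters):
    """Return the characters whose Shift JIS bytes differ from their ASCII bytes."""
    return [
        ch for ch in characters
        if ch.encode("shift_jis") != ch.encode("ascii")
    ]


mismatches = differs_in_shift_jis(PASSWORD_CHARACTERS)
print("characters encoded differently:", mismatches)
```

If the list comes back empty, the encode call does no useful work for passwords drawn from this alphabet and could be dropped from the hot loop.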

Question 3

I know this isn't what you're interested in, but I'm going to talk about style first - as is, I can't read this at all.

The biggest issue is that you seem to have some sort of dislike for whitespace - why? Whitespace is your friend, and is the difference between legible and illegible code. If you really want some obfuscated code, then write it normally and then send it through an obfuscator. You need more newlines, spaces around operators, etc.

You could also follow the Python naming conventions better - variables and functions should be named with lower_snake_case.

I've also cleaned some things up in terms of variables (i.e. the characters are in the string module), the ternary expression, when you set the start_time (you want that to be as tight as possible), etc.

At this point, your code looks like this:

#!/usr/bin/python
# -*- coding: utf-8 -*-
from __future__ import division
import sys
import re
import string
import crypt
import random
import time
import Respond  # the recipe code mentioned in the question, saved as a module

def generate_tripcode(password):
    shift_jis = password.encode("shift_jis", "ignore")
    tripcode = crypt.crypt(
        shift_jis, re.compile(r'[^\.-ElapsedUpdates]')\
        .sub('.', (shift_jis + '...')[1:3])\
        .translate(string.maketrans(':;<=>?@[\\]^_`', 'ABCDEFGabcdef'))
    )[-10:]
    return tripcode

def generate_random_string(length):
    return ''.join(random.choice(characters) for _ in xrange(length))

def get_check(chk, tf):
    return chk if tf else string.lower(chk)

def display_statistics(matches, total, elapsed, rate, elapsed_seconds):
    print """
Caught interrupt.
{matches} matches found
{total} codes generated in {elapsed}
Average of ~{rate} tripcodes generated per second
""".format(**locals())
    if matches > 0:
        print "1 match found every ~{} seconds".format(round(elapsed_seconds / matches, 2))
        print "1 match found in every ~{} tripcodes".format(int(round(total / matches)))

try:
    update_count = 0
    elapsed_updates = 0
    total = 0
    matches = 0
    rate = 0
    elapsed_count = 0
    filetext = ""
    characters = string.printable.split()[0] # get all non-whitespace, non-weird characters
    random.seed(time.time()) #Seed the random module with the current time
    is_case_sensitive = Respond.query("Search for exact case / case-sensitive?")
    find = sys.argv[1]
    if not is_case_sensitive:
        find = find.lower()
    absolute_start_time = time.time()
    start_time = time.time()
    while True:
        update_count += 1
        elapsed_count += 1
        total += 1
        password = generate_random_string(7)
        tripcode = generate_tripcode(password)
        # re.search returns None when there is no match, so test truthiness
        if re.search(find, get_check(tripcode, is_case_sensitive)):
            output_string = "{} >>> {}\n".format(password, tripcode)
            filetext += output_string
            print "\033[K\r", #Flush out any text from the current line
            print output_string
            matches += 1
        if update_count == 100:
            update_count = 0
            elapsed_updates += 1
            status = "{} tripcodes {} tripcodes/sec\r".format(total, rate)
            # write without a newline so '\r' redraws the status line in place
            sys.stdout.write(status)
            sys.stdout.flush()
        if elapsed_updates == 10:
            elapsed_range = time.time() - start_time
            rate = int(round(elapsed_count / elapsed_range))
            elapsed_count = 0
            elapsed_updates = 0
            start_time = time.time()
except KeyboardInterrupt:
    elapsed_seconds = time.time() - absolute_start_time
    elapsed = time.strftime("%M minutes, %S seconds", time.gmtime(elapsed_seconds))
    rate = int(round(total/elapsed_seconds))
    # Print statistics.
    display_statistics(matches, total, elapsed, rate, elapsed_seconds)
    print "Writing matches to file...",
    with open("t.txt", "a") as file_:
        file_.write(filetext)
    print "done."

Now as for performance, the big choke is probably generate_tripcode - the most obvious way to make it faster is to move every computation whose result doesn't change from call to call out of the function, so it runs once instead of once per tripcode.

Your regular expression has a minor bug - you indicated that you want the class to contain the characters ., - and z; however, as written, the - between . and z makes it a range, so [^\.-z] means "anything outside the range of characters from . to z". To have - treated as a literal character, make it the last character in the class.

regex = re.compile(r'[^\.z-]')
translator = string.maketrans(':;<=>?@[\\]^_`', 'ABCDEFGabcdef')
def generate_tripcode(password):
    shift_jis = password.encode("shift_jis", "ignore")
    tripcode = crypt.crypt(
        shift_jis, regex.sub('.', (shift_jis + '...')[1:3]).translate(translator)
    )[-10:]
    return tripcode
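To see what the two spellings of the character class actually do, here is a small check (a sketch for Python 3; the variable names and the sample string are made up for illustration). It runs both patterns over the same input and then tests which characters from the translation table survive each substitution.

```python
import re

# '-' between '.' and 'z' forms a range: anything outside '.'..'z' is matched.
range_class = re.compile(r'[^\.-z]')
# '-' at the end is a literal: anything except '.', 'z' and '-' is matched.
literal_class = re.compile(r'[^\.z-]')

sample = "aZ:-{"
range_result = range_class.sub('.', sample)
literal_result = literal_class.sub('.', sample)
print(range_result, literal_result)

# Which source characters of the translation table are left untouched?
translated_chars = ':;<=>?@[\\]^_`'
kept_by_range = [ch for ch in translated_chars if not range_class.search(ch)]
kept_by_literal = [ch for ch in translated_chars if not literal_class.search(ch)]
print(len(kept_by_range), len(kept_by_literal))
```

With the literal spelling, none of the characters that the translation table maps are left after the substitution, so the translate step can never change anything. That may be related to the comment below reporting that only the original regex produces working tripcodes.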

Otherwise, if you can use a different cryptographic hash that is faster you'll probably see a much bigger performance increase, however that isn't an option for this situation. That is also part of why they use cryptographic hashes - to prevent you from doing this :)

In general though, if you want to push your computer to the limits, don't write the performance intensive code in Python - either write it in C/C++ entirely, or write an extension to handle the part that needs to be fast.

Everything below this was invalidated by a comment - left here for posterity

Your regular expression doesn't make sense to me either - you're saying you want to replace everything but the characters .-ElapsedUpt with a period? If so, you have a bug - putting the - where you did means the range from . to E - if you want to exclude -, put it at the end. It still doesn't really make sense to me though - if those are the only characters you want, why don't you make that your characters string? Also, you won't ever have the characters in your translation after that regex - all of them would have been replaced with a period.

That being said, while preserving the intent of your code, I'd rewrite the function to look like this, which is probably going to give you a bit of a performance boost

regex = re.compile(r'[^\.ElapsedUpdates-]')
translator = string.maketrans(':;<=>?@[\\]^_`', 'ABCDEFGabcdef')
def generate_tripcode(password):
    shift_jis = (password.encode("shift_jis", "ignore") + '...')[1:3]
    tripcode = crypt.crypt(
        shift_jis, regex.sub('.', shift_jis).translate(translator)
    )[-10:]
    return tripcode

Question 4

Oh my, I am so sorry about the regex. I hadn't noticed; it must have been a rogue find & replace. The original regex was [^\.-z]. So sorry.

Question 5

@空间. Just to be clear: is the intent to replace the literal characters ., - and z, or the range of characters from . to z?

Question 6

Replace characters .-z

Question 7

Did this code run for you? Not working for me. The only one that works is with the original regex in the function, not the one outside of the function.

Question 8

"or just use a non-cryptographic hash" – well, the reason it's a cryptographic hash was to discourage people from doing this ;) He can't change the hash because it needs to be the exact same as what's done on the server.

Question 9

Or you know, it's Python. Not particularly known for high performance. If you want more speed you should look at parallelisation, modules written in more performant compiled languages, or possibly PyPy.

Now, first of all, please look at PEP8 to make the reader's job a lot easier. In particular some whitespace would be nice. It's also not necessary to fit everything onto a single line. Also a lot of comments just restate the code in the line - don't add comments if they don't add any new information.

Next, the first line is wrong, there should be no slash at the end. Ideally, since you're restricted to Python 2.7(?) it should rather be #!/usr/bin/env python2.7 instead.

Even though it's Python 2.7, it's still advisable to use the function form of print, i.e. print(...), since it removes one special syntax case without much ill effect. Similarly, using xrange makes sense in many situations.
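As an illustration of the function form, the status line could be written like this. This is only a sketch: show_status is a hypothetical helper, the code is written for Python 3, and under Python 2.7 it would additionally need from __future__ import print_function as the first statement of the file.

```python
import io
import sys


def show_status(total, rate, stream=sys.stdout):
    """Redraw a one-line status in place by ending it with a carriage return."""
    print("{} tripcodes {} tripcodes/sec".format(total, rate), end="\r", file=stream)
    stream.flush()


# Demonstration against an in-memory stream instead of the terminal.
buffer = io.StringIO()
show_status(1000, 20000, stream=buffer)
captured = buffer.getvalue()
```

The end and file keyword arguments replace both the trailing-comma trick and the separate sys.stdout.write call from the original loop.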

An if __name__ == "__main__": block with a separate main function would be nice too; that way the number of globals could also be reduced a bit.

while 1==1: is less clear than while True:.

A plain open without with doesn't close the file afterwards. It's good practice to use with all the time anyway.

An exit call at the end of the program is unnecessary, the program will exit immediately afterwards anyway and the exit code is well defined too.

The Respond module should probably be imported as well.

re.compile is supposed to be run once if the regular expression doesn't change. The whole point is to compile it only once.
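A sketch of what compiling once could look like for the salt computation, assuming the original regex [^\.-z] that the asker quotes in the comments. It is written for Python 3 (str.maketrans instead of string.maketrans) and leaves out the crypt call itself; SALT_INVALID, SALT_TRANSLATION and derive_salt are names invented for this example.

```python
import re

# Built once at import time instead of on every call.
# Assumption: the asker's original pattern, read as the range '.' to 'z'.
SALT_INVALID = re.compile(r'[^\.-z]')
SALT_TRANSLATION = str.maketrans(':;<=>?@[\\]^_`', 'ABCDEFGabcdef')


def derive_salt(password):
    """Return the two-character salt the question's code hands to crypt."""
    raw = (password + '...')[1:3]
    return SALT_INVALID.sub('.', raw).translate(SALT_TRANSLATION)


print(derive_salt("a:b"))
```

Only the slicing, one substitution and one translation remain inside the function; the pattern and the table are shared by every call.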

Inlining functions will likely give you better performance, but clarity comes first. It's a good idea to put chunks of logic into functions instead of having an incomprehensible blob of code.

For the command line arguments using a module like argparse would be a splendid idea. At the moment you have to know that there's a single argument and what semantics it has. The lower-/uppercase switch could also be a command line flag instead.
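A minimal sketch of such a command line, using the standard argparse module. The argument name pattern and the -c/--case-sensitive flag are choices made for this example, not something the original script defines.

```python
import argparse


def parse_arguments(argv=None):
    """Parse the search pattern and the case-sensitivity switch."""
    parser = argparse.ArgumentParser(
        description="Search randomly generated tripcodes for a pattern.")
    parser.add_argument(
        "pattern", help="regular expression to look for in each tripcode")
    parser.add_argument(
        "-c", "--case-sensitive", action="store_true",
        help="match the pattern with exact case")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_arguments(argv)
    # Lowercase the pattern up front when case does not matter.
    find = args.pattern if args.case_sensitive else args.pattern.lower()
    return find, args.case_sensitive


# Passing a list stands in for the real command line here.
print(main(["Feels"]))
```

In a real script, main() would be called without arguments from an if __name__ == "__main__": block so that it reads sys.argv, and running the script with --help then documents the interface for free.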

The two cases in getCheck should either be rewritten with if/else, or with a single return statement:

def getCheck(chk, tf):
    if tf:
        return chk
    else:
        return string.lower(chk)
# or
def getCheck(chk, tf):
    return chk if tf else string.lower(chk)

Also, if there's already a function for it, use it when possible, e.g. for the isCaseSensitive check.

Find = getCheck(sys.argv[1], isCaseSensitive)

The Length argument for GenerateRandomString isn't used.
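One possible fix, sketched for Python 3 (under Python 2, xrange could be used in place of range). CHARACTERS is a made-up constant assumed to hold the same alphabet as the question's Characters string.

```python
import random
import string

# Hypothetical constant: digits, ASCII letters and ASCII punctuation.
CHARACTERS = string.digits + string.ascii_letters + string.punctuation


def generate_random_string(length):
    """Return a random string of exactly `length` characters from CHARACTERS."""
    return ''.join(random.choice(CHARACTERS) for _ in range(length))


password = generate_random_string(7)
print(len(password))
```

With the parameter honoured, trying other password lengths only requires changing the call site.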

I'll stop here, you already got an answer with some cleanup, so I'll not paste my current state (which looks otherwise pretty similar).

Dan Oberlam · Accepted Answer · 2016-02-16 19:51:02Z

