This bot uses the PRAW package and the jisho.org API. While running, it scans recent comments for summons. A user either requests a random Chinese character/kanji with its definitions or specifies a query to look up; the query can be anything, but definitions are not guaranteed to come up. The bot replies with the requested information. Definitions are supplied by the jisho.org API, and I purposely included only the first definition for each word found. I'm concerned about style, maintainability, and whether there's a better way to handle multiple types of summons. I'm not very familiar with Python. You can look at the jisho output using the URL in the code.
import json
import os
import random
import time
import praw
import requests
import config
# lower and upper bounds for unicode block containing common CJK characters
UNICODE_LOWER_BOUND = 0x4E00
UNICODE_UPPER_BOUND = 0x9FFF
RANDOM_SUMMONS = ["random chinese character", "random kanji", "random hanzi", "random hanja", 'random 汉字', 'random 漢字']
LOOKUP = '!lookup'
# authenticate bot using praw api
def authenticate():
    r = praw.Reddit(username = config.username,
            password = config.password,
            client_id = config.client_id,
            client_secret = config.client_secret,
            user_agent = "kanjibot")
    return r
# main loop
def run_bot(r, comments_replied_to):
    print('runbotstarted')
    for comment in r.subreddit('test').comments(limit=40):
        summon = find_summon(comment.body)
        if summon != None and comment.id not in comments_replied_to and comment.author != r.user.me():
            print("summon detected")
            comment.reply(generate_reply(summon))
            comments_replied_to.append(comment.id)
            with open("comments_replied_to.txt", "a") as f:
                f.write(comment.id + "\n")
    time.sleep(5)
def get_saved_comments():
    if not os.path.isfile("comments_replied_to.txt"):
        comments_replied_to = []
    else:
        with open("comments_replied_to.txt", "r") as f:
            comments_replied_to = f.read()
            comments_replied_to = comments_replied_to.split("\n")
    return comments_replied_to
# nothing -> str
# return str containing random chinese character in CJK Unified Ideographs Unicode block.
def generate_random_kanji():
    codepoint = random.randint(UNICODE_LOWER_BOUND, UNICODE_UPPER_BOUND)
    return chr(codepoint)
# str -> boolean
# return true if body contains a trigger string
def summoned(body):
    return any(summon in body for summon in RANDOM_SUMMONS)
# str -> str or None
# if summon is !lookup, return query within !lookup flags. Otherwise, if summon is random, return the summon.
def find_summon(body):
    for summon in RANDOM_SUMMONS:
        if summon in body:
            return summon
        elif LOOKUP in body:
            return body.split('!lookup')[1]
    return None
# str -> str
# build and return the reply string based on the summon string
def generate_reply(summon):
    reply = ''
    query = ''
    if summon in RANDOM_SUMMONS:
        query = generate_random_kanji()
        reply = '#**You asked for a random Chinese character. Here it is: ' + query + '**'
    else:
        query = summon.replace('!lookup', '')
        reply += '#**You asked to define ' + query + '**'
    reply += '\n# Japanese Definitions:'
    # type of definitions_data: list of dict, each dict is a definition
    definitions_data = requests.get('https://jisho.org/api/v1/search/words?keyword=' + query).json()['data']
    if definitions_data == []:
        reply += ' no Japanese definitions found\n'
    else:
        for defin in definitions_data:
            try:
                reply += '\n\nWord: ' + defin['slug']
                reply += '\n\nReading: ' + defin['japanese'][0]['reading']
                reply += '\n\nEnglish Definition: ' + defin['senses'][0]['english_definitions'][0]
            except:
                reply += '\n\nError: Missing information for this definition'
    reply += '\n\nimprovements to come'
    print(reply)
    return reply
# main function: so this module can be imported without executing main functionality.
def main():
    reddit = authenticate()
    comments_replied_to = get_saved_comments()
    while True:
        run_bot(reddit, comments_replied_to)
## end definitions
## begin executions
if __name__ == '__main__':
    main()
1 Answer
Indentation
The indentation within authenticate is non-standard. Here are two standard alternatives:
r = praw.Reddit(username = config.username,
                password = config.password,
                client_id = config.client_id,
                client_secret = config.client_secret,
                user_agent = "kanjibot")

r = praw.Reddit(
    username = config.username,
    password = config.password,
    client_id = config.client_id,
    client_secret = config.client_secret,
    user_agent = "kanjibot",
)
Comparison to None
if summon != None
should be
if summon is not None
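The reason is not spelled out above, so here is a small illustration of one: `==` and `!=` dispatch to an object's `__eq__`, which any class can override, while `is` compares identity and cannot be fooled. The `AlwaysEqual` class is hypothetical and exists only to show the difference; PEP 8 recommends `is` / `is not` for comparisons to singletons like `None`.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


value = AlwaysEqual()

# Equality asks the object, which can answer misleadingly.
equality_says_none = (value == None)  # noqa: E711

# Identity checks whether value is the None singleton itself.
identity_says_none = (value is None)
```

For the strings and `None` that `find_summon` returns, both spellings behave the same; `is not None` is simply the safer habit.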
Sets
comments_replied_to within run_bot would be better-represented as a set. You haven't used type hints, so I'm guessing here, but since you use .append it's probably a list. A set is better for your membership comparison operations (not in).
To load it as a set directly, rather than:
with open("comments_replied_to.txt", "r") as f:
    comments_replied_to = f.read()
    comments_replied_to = comments_replied_to.split("\n")
return comments_replied_to
use
with open("comments_replied_to.txt") as f:
    return {line.rstrip() for line in f}
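Putting the set suggestion together, a minimal sketch might look like the following. The `path` parameter, the blank-line filter, and the `remember_comment` helper are assumptions added for illustration; they are not in the original code. Note that a set uses `.add` where the list used `.append`.

```python
import os


def get_saved_comments(path="comments_replied_to.txt"):
    """Return the IDs of already-answered comments as a set."""
    if not os.path.isfile(path):
        return set()
    with open(path) as f:
        # Skipping blank lines is an addition, so a trailing newline
        # does not put an empty string into the set.
        return {line.rstrip() for line in f if line.strip()}


def remember_comment(comment_id, seen, path="comments_replied_to.txt"):
    """Hypothetical helper: record an ID in memory and on disk."""
    seen.add(comment_id)  # sets use .add, not .append
    with open(path, "a") as f:
        f.write(comment_id + "\n")
```

With this in place, `comment.id not in comments_replied_to` becomes a hash lookup rather than a scan of the whole list.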
Sleeping
time.sleep(5)
Why? This should not be needed.
More sets
For this:
return any(summon in body for summon in RANDOM_SUMMONS)
If body and RANDOM_SUMMONS are both made sets of phrases in the same format, then this can be
return not RANDOM_SUMMONS.isdisjoint(body)
which replaces the per-trigger substring scan with hash lookups.
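Since `body` starts out as a plain string, it has to be reduced to a set of phrases before `isdisjoint` means anything. One possible way to do that, sketched here under assumptions, is to split the comment into words and collect every run of two or three consecutive words. The `phrases` helper and `PHRASE_LENGTHS` are hypothetical names, and whitespace splitting is a design choice that is stricter than the original substring test: a trigger glued to punctuation (for example "random kanji!") would no longer match.

```python
RANDOM_SUMMONS = {
    "random chinese character", "random kanji", "random hanzi",
    "random hanja", "random 汉字", "random 漢字",
}

# Word counts that occur among the triggers; for the set above, {2, 3}.
PHRASE_LENGTHS = {len(summon.split()) for summon in RANDOM_SUMMONS}


def phrases(body: str) -> set[str]:
    """Hypothetical helper: every run of consecutive words in body
    whose length matches the length of some trigger phrase."""
    words = body.split()
    return {
        " ".join(words[i:i + n])
        for n in PHRASE_LENGTHS
        for i in range(len(words) - n + 1)
    }


def summoned(body: str) -> bool:
    """True if any trigger phrase appears as whole words in body."""
    return not RANDOM_SUMMONS.isdisjoint(phrases(body))
```

Whether this is worth it depends on comment length; for a handful of short triggers, the original `any(...)` substring test is also perfectly readable.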
Loop efficiency
for summon in RANDOM_SUMMONS:
    if summon in body:
        return summon
    elif LOOKUP in body:
        return body.split('!lookup')[1]
Why are those last two lines in your loop? The result will not change no matter how many iterations you execute. You should move those last two lines out before your loop, then replace the loop with
intersect = RANDOM_SUMMONS & body
if len(intersect) > 0:
    return next(iter(intersect))
return None
This assumes that it is non-fatal for there to be more than one overlap.
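As a complement, here is a sketch of the hoisting step alone, keeping `body` as a string and the original substring test instead of the set intersection. One behavioural difference is an assumption worth checking: with the `!lookup` test placed first, a lookup takes priority over every random trigger, whereas the original loop tested the first trigger before `!lookup`.

```python
RANDOM_SUMMONS = ["random chinese character", "random kanji", "random hanzi",
                  "random hanja", "random 汉字", "random 漢字"]
LOOKUP = "!lookup"


def find_summon(body):
    """Return the lookup query, the matched random trigger, or None."""
    # This check does not depend on the loop variable, so it runs once.
    if LOOKUP in body:
        return body.split(LOOKUP)[1]
    # First trigger found as a substring of body, or None if there is none.
    return next((summon for summon in RANDOM_SUMMONS if summon in body), None)
```

Because the return value still includes the text after `!lookup` unchanged, `generate_reply` can consume it exactly as before.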
Requests
requests.get('https://jisho.org/api/v1/search/words?keyword=' + query).json()['data']
First of all, when you get the response back, call raise_for_status - this call might not have succeeded. Also, do not pass query params in the URL string; pass them in a dictionary to the params kwarg.
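Both points can be sketched in one small function. The name `fetch_definitions`, the optional `session` argument, and the 10-second `timeout` are assumptions added for illustration; only `params=` and `raise_for_status()` come from the advice above. Passing the query through `params` lets requests URL-encode it, which matters for non-ASCII input such as kanji.

```python
import requests

JISHO_SEARCH_URL = "https://jisho.org/api/v1/search/words"


def fetch_definitions(query, session=None, timeout=10):
    """Return the 'data' list from a jisho.org word search.

    session is an optional object with a requests-compatible .get()
    (for example a requests.Session); by default the module-level
    requests.get is used.
    """
    http = session if session is not None else requests
    response = http.get(
        JISHO_SEARCH_URL,
        params={"keyword": query},  # encoded by requests, not by hand
        timeout=timeout,
    )
    # Raises requests.HTTPError on a 4xx/5xx status instead of
    # letting .json() fail on an error page.
    response.raise_for_status()
    return response.json()["data"]
```

The caller can then decide how to handle `requests.HTTPError`, for instance by replying that the dictionary is unavailable.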
Successive concatenation
reply += is not advisable; it presents efficiency problems. There are a few ways around this - using a StringIO is one solution.
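A minimal sketch of the StringIO approach for the definitions part of the reply follows. The function name `format_definitions` is hypothetical, and two details are deliberate changes from the original rather than part of the advice: the bare `except` is narrowed to `KeyError` and `IndexError`, and all three fields are read before anything is written, so an incomplete entry produces only the error line.

```python
from io import StringIO


def format_definitions(definitions_data):
    """Build the definitions section of the reply in a text buffer."""
    out = StringIO()
    out.write("\n# Japanese Definitions:")
    if not definitions_data:
        out.write(" no Japanese definitions found\n")
    for defin in definitions_data:
        try:
            word = defin["slug"]
            reading = defin["japanese"][0]["reading"]
            meaning = defin["senses"][0]["english_definitions"][0]
        except (KeyError, IndexError):
            out.write("\n\nError: Missing information for this definition")
        else:
            out.write(f"\n\nWord: {word}")
            out.write(f"\n\nReading: {reading}")
            out.write(f"\n\nEnglish Definition: {meaning}")
    return out.getvalue()
```

Collecting the pieces in a list and calling `"".join(parts)` at the end is another common way to get the same effect.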
Caller choice
generate_reply should not print the reply; it should only return it. It should be up to the caller whether they want to print it or not.
- Thanks for the review! One thing, under your 'more sets' header, you mention turning comment.body into a set (it's originally a string). What kind of set exactly? I guess if I reduced the random summon strings to one word rather than two, I could turn body into a set of words. – ShokoN, Jul 30, 2020 at 1:34
- What kind of set exactly? A (built-in) Python set of phrases; dependent on a few things. In the expression summon in body for summon in RANDOM_SUMMONS, if body is reducible to a set of phrase strings that match the format of RANDOM_SUMMONS, then this is possible. – Reinderien, Jul 30, 2020 at 1:39