Reddit Dictionary Bot Python 3

Question 1

This bot uses the PRAW package and the jisho.org API. While running, it scans recent comments for summons. A user either requests a random Chinese character/Kanji with its definitions, or specifies a query to look up; the query can be anything, but definitions are not guaranteed to come up. The bot replies with the requested information. Definitions are supplied by the jisho.org API, and I purposely included only the first definition for each word found. I'm concerned about style, maintainability, and whether there's a better way to handle multiple types of summons. I'm not very familiar with Python. You can look at the jisho output using the URL in the code.

import json
import os
import random
import time
import praw
import requests
import config
# lower and upper bounds for unicode block containing common CJK characters
UNICODE_LOWER_BOUND = 0x4E00
UNICODE_UPPER_BOUND = 0x9FFF
RANDOM_SUMMONS = ["random chinese character", "random kanji", "random hanzi", "random hanja", 'random 汉字', 'random 漢字']
LOOKUP = '!lookup'
# authenticate bot using praw api
def authenticate():
    r = praw.Reddit(username = config.username,
            password = config.password,
            client_id = config.client_id,
            client_secret = config.client_secret,
            user_agent = "kanjibot")
    return r
# main loop
def run_bot(r, comments_replied_to):
    print('runbotstarted')
    for comment in r.subreddit('test').comments(limit=40):
        summon = find_summon(comment.body)
        if summon != None and comment.id not in comments_replied_to and comment.author != r.user.me():
            print("summon detected")
            comment.reply(generate_reply(summon))
            comments_replied_to.append(comment.id)
            with open("comments_replied_to.txt", "a") as f:
                f.write(comment.id + "\n")
    time.sleep(5)
def get_saved_comments():
    if not os.path.isfile("comments_replied_to.txt"):
        comments_replied_to = []
    else:
        with open("comments_replied_to.txt", "r") as f:
            comments_replied_to = f.read()
            comments_replied_to = comments_replied_to.split("\n")
    return comments_replied_to
# nothing -> str
# return str containing random chinese character in CJK Unified Ideographs Unicode block.
def generate_random_kanji():
    codepoint = random.randint(UNICODE_LOWER_BOUND, UNICODE_UPPER_BOUND)
    return chr(codepoint)
# str -> boolean
# return true if body contains a trigger string
def summoned(body):
    return any(summon in body for summon in RANDOM_SUMMONS)
# str -> str or None
# if summon is !lookup, return query within !lookup flags. Otherwise, if summon is random, return the summon.
def find_summon(body):
    for summon in RANDOM_SUMMONS:
        if summon in body:
            return summon
        elif LOOKUP in body:
            return body.split('!lookup')[1]
    return None
# str -> str
# build and return the reply string based on the summon string
def generate_reply(summon):
    reply = ''
    query = ''
    if summon in RANDOM_SUMMONS:
        query = generate_random_kanji()
        reply = '#**You asked for a random Chinese character. Here it is: ' + query + '**'
    else:
        query = summon.replace('!lookup', '')
        reply += '#**You asked to define ' + query + '**'
    reply += '\n# Japanese Definitions:'
    # type of definitions_data: list of dict, each dict is a definition
    definitions_data = requests.get('https://jisho.org/api/v1/search/words?keyword=' + query).json()['data']
    if definitions_data == []:
        reply += ' no Japanese definitions found\n'
    else:
        for defin in definitions_data:
            try:
                reply += '\n\nWord: ' + defin['slug']
                reply += '\n\nReading: ' + defin['japanese'][0]['reading']
                reply += '\n\nEnglish Definition: ' + defin['senses'][0]['english_definitions'][0]
            except:
                reply += '\n\nError: Missing information for this definition'
    reply += '\n\nimprovements to come'
    print(reply)
    return reply

# main function: so this module can be imported without executing main functionality.
def main():
    reddit = authenticate()
    comments_replied_to = get_saved_comments()
    while True:
        run_bot(reddit, comments_replied_to)
## end definitions
## begin executions
if __name__ == '__main__':
    main()


Answer

Indentation

The indentation within authenticate is non-standard. Here are two standard alternatives:

r = praw.Reddit(username = config.username,
                password = config.password,
                client_id = config.client_id,
                client_secret = config.client_secret,
                user_agent = "kanjibot")

r = praw.Reddit(
    username = config.username,
    password = config.password,
    client_id = config.client_id,
    client_secret = config.client_secret,
    user_agent = "kanjibot",
)

Comparison to `None`

if summon != None

should be

if summon is not None
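
One reason to prefer the identity check can be sketched with a hypothetical class (AlwaysEqual is an illustration, not part of the bot): != goes through the operand's equality methods, which a class may override, while is not compares identity and cannot be overridden.

```python
class AlwaysEqual:
    """Hypothetical class whose equality check claims to match anything."""

    def __eq__(self, other):
        return True


value = AlwaysEqual()

# With only __eq__ defined, Python 3 derives != by inverting __eq__,
# so this comparison wrongly suggests that value is None.
print(value != None)      # False
# The identity check ignores __eq__ and gives the expected answer.
print(value is not None)  # True
```

find_summon only ever returns a str or None, so both spellings behave the same in this bot; the identity check is simply the conventional form.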

Sets

comments_replied_to within run_bot would be better-represented as a set. You haven't used type hints, so I'm guessing here, but since you use .append it's probably a list. A set is better for your membership comparison operations (not in).

To load it as a set directly, rather than:

with open("comments_replied_to.txt", "r") as f:
    comments_replied_to = f.read()
    comments_replied_to = comments_replied_to.split("\n")
return comments_replied_to

use

with open("comments_replied_to.txt") as f:
    return {line.rstrip() for line in f}
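
A fuller sketch of the set-based bookkeeping might look like the following. It assumes the same file name as the question; remember_comment is a hypothetical helper introduced here to show that .append becomes .add once the container is a set.

```python
import os

REPLIED_PATH = "comments_replied_to.txt"  # same file name as in the question


def get_saved_comments(path=REPLIED_PATH):
    """Return the IDs of already-answered comments as a set."""
    if not os.path.isfile(path):
        return set()
    with open(path) as f:
        # Skip blank lines so an empty string never ends up in the set.
        return {line.strip() for line in f if line.strip()}


def remember_comment(comment_id, replied_to, path=REPLIED_PATH):
    """Record a comment ID in memory and on disk (hypothetical helper)."""
    replied_to.add(comment_id)  # set.add replaces list.append
    with open(path, "a") as f:
        f.write(comment_id + "\n")
```

In run_bot, the two lines that append to the list and write to the file would then collapse into one call such as remember_comment(comment.id, comments_replied_to).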

Sleeping

time.sleep(5)

Why? PRAW already throttles its own requests to stay within Reddit's rate limits, so this should not be needed.

More sets

For this:

return any(summon in body for summon in RANDOM_SUMMONS)

If body and RANDOM_SUMMONS are both made sets, then this can be

return not RANDOM_SUMMONS.isdisjoint(body)

which will be much more efficient.
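
Since body starts out as a string, it has to be reduced to a set of phrases before isdisjoint can be applied. One possible reduction, sketched here under the assumption that summons are matched on whitespace-separated words (phrases and MAX_WORDS are hypothetical names, not part of the original code):

```python
RANDOM_SUMMONS = {"random chinese character", "random kanji", "random hanzi",
                  "random hanja", "random 汉字", "random 漢字"}

# Longest summon, measured in words (3 for "random chinese character").
MAX_WORDS = max(len(summon.split()) for summon in RANDOM_SUMMONS)


def phrases(body):
    """Return every run of 1..MAX_WORDS consecutive words in body as a set."""
    words = body.split()
    return {
        " ".join(words[start:start + size])
        for size in range(1, MAX_WORDS + 1)
        for start in range(len(words) - size + 1)
    }


def summoned(body):
    """Return True if body contains at least one summon phrase."""
    return not RANDOM_SUMMONS.isdisjoint(phrases(body))
```

This is not a drop-in replacement for the substring test: a summon directly followed by punctuation, such as "random kanji!", matches with summon in body but not with this word-based version.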

Loop efficiency

for summon in RANDOM_SUMMONS:
    if summon in body:
        return summon
    elif LOOKUP in body:
        return body.split('!lookup')[1]

Why are those last two lines in your loop? The result will not change no matter how many iterations you execute. You should move those last two lines out before your loop, then replace the loop with

intersect = RANDOM_SUMMONS & body
if len(intersect) > 0:
    return next(iter(intersect))
return None

This assumes that it is non-fatal for there to be more than one overlap.
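
If body stays a plain string, the same reordering can be sketched with substring checks instead of a set intersection. This is one possible shape for find_summon under that assumption, not the only one:

```python
RANDOM_SUMMONS = ["random chinese character", "random kanji", "random hanzi",
                  "random hanja", "random 汉字", "random 漢字"]
LOOKUP = "!lookup"


def find_summon(body):
    """Return the lookup query, the matched random summon, or None."""
    # The lookup test does not depend on the loop variable, so it runs once.
    if LOOKUP in body:
        return body.split(LOOKUP)[1]
    # next() yields the first matching summon, or None when nothing matches.
    return next((summon for summon in RANDOM_SUMMONS if summon in body), None)
```

Checking LOOKUP first means a comment containing both !lookup and a random summon is now always treated as a lookup, which differs slightly from the original loop.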

Requests

requests.get('https://jisho.org/api/v1/search/words?keyword=' + query).json()['data']

First of all, when you get the response back, call raise_for_status - this call might not have succeeded. Also, do not pass query params in the URL string; pass them in a dictionary to the params kwarg.
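
Both points can be sketched as follows. fetch_definitions is a hypothetical helper name, the get parameter is an assumption added so the function can be exercised without network access, and the 10-second timeout is likewise an assumption rather than anything the question specifies.

```python
import requests

JISHO_SEARCH_URL = "https://jisho.org/api/v1/search/words"  # URL from the question


def fetch_definitions(query, get=requests.get):
    """Return the 'data' list from the jisho.org search response.

    `get` is a hypothetical injection point for testing; it defaults
    to requests.get.
    """
    # requests URL-encodes the params dict, so non-ASCII queries are safe.
    response = get(JISHO_SEARCH_URL, params={"keyword": query}, timeout=10)
    # Raise requests.HTTPError on a 4xx/5xx status instead of parsing an error body.
    response.raise_for_status()
    return response.json()["data"]
```

The caller can then decide how to react to requests.HTTPError, for example by replying with an apology or by skipping the comment.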

Successive concatenation

reply += is not advisable; it presents efficiency problems. There are a few ways around this - using a StringIO is one solution, and collecting the pieces in a list and joining them once at the end is another.
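
A minimal sketch of the StringIO approach for the definitions part of the reply, assuming the same response shape the question's code relies on (format_definitions is a hypothetical name):

```python
from io import StringIO


def format_definitions(definitions_data):
    """Build the definitions part of the reply in a StringIO buffer."""
    buffer = StringIO()
    buffer.write("\n# Japanese Definitions:")
    if not definitions_data:
        buffer.write(" no Japanese definitions found\n")
    for defin in definitions_data:
        try:
            # Look everything up first so a failure writes nothing partial.
            word = defin["slug"]
            reading = defin["japanese"][0]["reading"]
            meaning = defin["senses"][0]["english_definitions"][0]
        except (KeyError, IndexError):
            buffer.write("\n\nError: Missing information for this definition")
        else:
            buffer.write(f"\n\nWord: {word}")
            buffer.write(f"\n\nReading: {reading}")
            buffer.write(f"\n\nEnglish Definition: {meaning}")
    return buffer.getvalue()
```

Unlike the original, this version catches only KeyError and IndexError and writes the error line in place of a partial entry; that is a design choice of the sketch, not something the review requires.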

Caller choice

generate_reply should not print the reply; it should only return it. It should be up to the caller whether they want to print it or not.

Comment

Thanks for the review! One thing, under your 'more sets' header, you mention turning comment.body into a set (it's originally a string). What kind of set exactly? I guess if I reduced the random summon strings to one word rather than two, I could turn body into a set of words.

Comment (reply)

What kind of set exactly? A (built-in) Python set of phrases; dependent on a few things. In the expression summon in body for summon in RANDOM_SUMMONS, if body is reducible to a set of phrase strings that match the format of RANDOM_SUMMONS, then this is possible.

score 6 · Accepted Answer · 2020-07-27 00:48:30Z

