I'm writing some code that parses strings, often using simple keywords. On parsing, the code performs various actions, such a printing a response, running functions, etc., and it keeps track of whether it was able to respond.
I am actually using multiple parsers and have illustrated this in the code shown below.
What would be better ways to structure this code, particularly with a mind to scalability and code compactness? For example, imagine many more parsers being added that operate on principles more complex than simple keyword-spotting.
Minimal working example (MWE):
#!/usr/bin/python
import json
import os
import requests
import subprocess
import sys
def main():
message = "how are you"
#message = "ip address"
#message = "restart"
triggered = [
parse_1(message = message),
parse_2(message = message)
]
if not any(triggered):
report_help()
def parse_1(
message = None
):
def keyphrases_text_response(
message = None,
keyphrases = None,
response = None
):
if any(pattern in message for pattern in keyphrases):
print(response)
return True
else:
return False
triggered = [
keyphrases_text_response(
message = message,
keyphrases = [
"image"
],
response = "http://i.imgur.com/MiqrlTh.jpg"
),
keyphrases_text_response(
message = message,
keyphrases = [
"sup",
"hi"
],
response = "sup home bean"
),
keyphrases_text_response(
message = message,
keyphrases = [
"how are you",
"are you well",
"status"
],
response = "nae bad fam"
),
keyphrases_text_response(
message = message,
keyphrases = [
"help",
"what can you do"
],
response = "I can report my IP address and I can restart my script."
)
]
if any(triggered):
return True
else:
return False
def parse_2(
message = None
):
triggered = []
if any(pattern in message for pattern in\
[
"IP",
"I.P.",
"IP address",
"I.P. address",
"ip address"
]
):
triggered.append(True)
report_IP()
if any(pattern in message for pattern in\
[
"restart"
]
):
triggered.append(True)
restart()
if any(pattern in message for pattern in\
[
"SSH",
"reverse"
]
):
triggered.append(True)
engage_command(
command = "ssh -R 10000:localhost:22 www.sern.ch",
background = True
)
if any(triggered):
return True
else:
return False
def report_IP(
contact = None,
country = True
):
IP = "unknown"
try:
data_IP_website = requests.get("http://ipinfo.io/json")
data_IP = data_IP_website.json()
IP = data_IP["ip"]
country = data_IP["country"]
except:
pass
text = "IP address: " + IP
if country:
text = text + " (" + country + ")"
print(text)
def restart():
print("restart! (and I'm in a crazy loop!)")
import __main__
os.execv(__main__.__file__, sys.argv)
def report_help():
print("I can report my IP address, I can restart my script and I can run commands.")
def engage_command(
command = None,
background = False
):
print("engage command: {command}".format(command = command))
if not background:
process = subprocess.Popen(
[command],
shell = True,
executable = "/bin/bash"
)
process.wait()
output, errors = process.communicate()
return output
else:
subprocess.Popen(
[command],
shell = True,
executable = "/bin/bash"
)
return None
if __name__ == "__main__":
main()
-
1\$\begingroup\$ I'd recommend checking out pep8 and flake8 to help improve readability of your code. \$\endgroup\$Jared Mackey– Jared Mackey2017年08月23日 04:19:41 +00:00Commented Aug 23, 2017 at 4:19
2 Answers 2
Repetition
You have a huge number of clauses of the form if any(pattern in message for pattern in somelist):
keyphrases_text_response
does a good job of reducing that within parse_1
, but you can adapt it to improve parse_2
as well.
def keyphrases_response(
message = None,
keyphrases = None,
response = None
):
if any(pattern in message for pattern in keyphrases):
response()
return True
else:
return False
def keyphrases_text_response(
message = None,
keyphrases = None,
response = None
):
return keyphrases_response(message, keyphrases, lambda: print(response))
This allows parse_2
to have a similar structure to parse_1
.
def parse_2(
message = None
):
triggered = [
keyphrases_response(
message = message,
keyphrases = [
"IP",
"I.P.",
"IP address",
"I.P. address",
"ip address"
],
response = reportIP
),
keyphrases_response(
message = message
keyphrases = [
"restart"
],
response = restart
),
keyphrases_response(
message = message
keyphrases = [
"SSH",
"reverse"
],
response = lambda: engage_command(
command = "ssh -R 10000:localhost:22 www.sern.ch",
background = True
)
)
]
return any(triggered)
Efficiency
These separate if
statements in the parse_2
function:
if any(pattern in message for pattern in\
if any(pattern in message for pattern in\
if any(pattern in message for pattern in\
should be combined into a single if/else
statement:
if any(pattern in message for pattern in\
elif any(pattern in message for pattern in\
elif any(pattern in message for pattern in\
The checks are mutually exclusive. This makes the code more efficient since you don't have to perform the 2nd check if the first is true, etc. Also, this more clearly shows the intent of the code.
Layout
In some areas, there is too much wasted vertical whitespace. For example, the blank lines in this code are not needed:
if any(triggered):
return True
else:
return False
There is value in having more code on the screen at once to understand its overall structure. This is much more typical:
if any(triggered):
return True
else:
return False
In this particular case, these 5 lines can be simplified and replaced with the following one line:
return any(triggered)
Naming
The functions named parse_1
and parse_2
need more descriptive
names. The "1" and "2" suffixes should be replaced with something
more meaningful.
Documentation
The PEP 8 style guide recommends
adding docstrings for functions. This would aid in understanding the
parse_
functions.
Tools
You could run code development tools to automatically find some style issues with your code.
ruff
Remove unused import: `json`