I put together this code as part of a practice problem. The instructions for the program are to accept JSON messages, transform the messages and then dispatch them to the right queue according to a series of rules.
The message service must accept messages when the following method is called: enqueue(msg). It must provide a single method that returns the next message for the queue with the number queue_number.
Transformation Rules
You must implement the following transformations on input messages. These rules must be applied in order, using the transformed output in later steps. Multiple rules may apply to a single tuple.
- You must string-reverse any string value in the message that contains the exact string Mootium. For instance, {"company": "Mootium, Inc.", "agent": "007"} changes to {"company": ".cnI ,muitooM", "agent": "007"}.
- You must replace any integer values with the value produced by computing the bitwise negation of that integer's value. For instance, {"value": 512} changes to {"value": -513}.
- You must add a field hash to any message that has a field _hash. The value of _hash may be the name of another field. The value of your new field hash must contain the base64-encoded SHA-256 digest of the UTF-8-encoded value of that field. You may assume that the value you're given to hash is a string. If a hash field already exists in the message and the value is different to the computed hash value, then an exception should be thrown.
Transformation rules, except the hash rule, must ignore the values of "private" fields whose names begin with an underscore (_).
Dispatch Rules
There are five output queues, numbered 0 through 4.
You must implement the following "dispatch" rules to decide which queue gets a message. These rules must be applied in order; the first rule that matches is the one you should use.
- If a message contains the key _special, send it to queue 0.
- If a message contains a hash field, send it to queue 1.
- If a message has a value that includes muidaQ (Qadium in reverse), send it to queue 2.
- If a message has an integer value, send it to queue 3.
- Otherwise, send the message to queue 4.
Dispatch rules must ignore the values of "private" fields whose names begin with an underscore (_). (Of course, rules that test the presence of keys that begin with _ still apply.)
Sequences
Certain messages may be parts of a sequence. Such messages include some special fields:
- _sequence: an opaque string identifier for the sequence this message is part of
- _part: an integer indicating which message this is in the sequence, starting at 0
Sequences must be outputted in order. Dispatch rules are to be applied based on the first message in a sequence (message 0) only, while transformation rules must be applied to all messages.
The output queue must enqueue messages from a sequence as soon as it can; don't try to wait to output all messages of a sequence at a time. The output queue must return messages within a sequence in the correct order by part number (message 0 before message 1, before message 2 ...).
import queue
import json
import re
import base64
import hashlib


class AutoVivification(dict):
    """Implementation of perl's autovivification feature."""

    def __getitem__(self, item):
        try:
            return dict.__getitem__(self, item)
        except KeyError:
            value = self[item] = type(self)()
            return value


class MootiumError(Exception):
    """Raised in the case of requesting a message from an empty queue """
    pass
class Queue:
    """Simple delivery message service. Transforms incoming messages
    and sends each message to the appropriate queue, keeping sequences in order.
    """

    def __init__(self):
        #list of output queues
        self.queue_dict = {
            '0': queue.Queue(),
            '1': queue.Queue(),
            '2': queue.Queue(),
            '3': queue.Queue(),
            '4': queue.Queue()
        }
        #store sequences in a dictionary-like structure
        self.sequenceDict = AutoVivification()
    def transform(self, msg):
        """Transforms incoming messages to match the transformation rules.
        Args:
            msg: A string with the message.
        Raises:
            ValueError: If the message has a key for "hash" and the value
                differs from the calculated hash.
        Returns:
            The transformed data as a dictionary.
        """
        messageDict = json.loads(msg)
        keys = [i for i in messageDict.keys()]
        for i in keys:
            # skip private fields
            if re.match("_", i):
                continue
            # reverse strings that include Mootium
            if re.search("Mootium", str(messageDict[i])):
                messageDict[i] = messageDict[i][::-1]
            # replace integer values with its bitwise negation
            if isinstance(messageDict[i], int):
                messageDict[i] = ~messageDict[i]
        if "_hash" in keys:
            # if _hash references another field, encode the value from that field.
            # otherwise, encode the value associated with _hash
            if messageDict["_hash"] in keys:
                toEncode = messageDict[messageDict["_hash"]].encode()
            else:
                toEncode = messageDict["_hash"].encode()
            # base64-encoded SHA-256 digest of the UTF-8 encoded value
            encodedSHA256 = base64.b64encode(
                (hashlib.sha256(toEncode)).digest())
            if "hash" in keys:
                # make sure the values are the same, if a hash field already exists
                if encodedSHA256 != messageDict["hash"]:
                    raise ValueError(
                        'The computed hash has a different value from the existing hash field'
                    )
            messageDict["hash"] = encodedSHA256
        return messageDict
    def dispatch(self, msg, output=None):
        """Delivers a message to the right queue to match the dispatch rules.
        Args:
            msg: A dictionary with the message.
            output: The queue for the message, if it belongs to a sequence
                and is not the first part.
        Returns:
            The number for the queue where the message was delivered.
        """
        # set keys and separate public and private message contents
        keys = [i for i in msg.keys()]
        publicKeys = [i for i in keys if not i.startswith('_')]
        publicContents = [msg[i] for i in publicKeys]
        #turn the output into string
        msgDump = json.dumps(msg)
        #for sequenced messages with part > 0
        if output:
            self.queue_dict[str(output)].put(msgDump)
        else:
            if "_special" in keys:
                output = 0
            elif "hash" in publicKeys:
                output = 1
            elif re.search("muidaQ", str(publicContents)):
                output = 2
            elif sum([isinstance(msg[i], int) for i in publicKeys]) > 0:
                output = 3
            else:
                output = 4
            self.queue_dict[str(output)].put(msgDump)
        return output
    def enqueue(self, msg):
        """Adds a standard JSON message to the right queue, after applying
        the rules for transformation. It will apply the transformations to
        messages that belong to a sequence, determine the queue from
        the first message in the sequence and add them in proper order.
        Args:
            msg: A standard JSON message with either strings or numerics.
        """
        cleanmsg = self.transform(msg)
        if "_sequence" in [i for i in cleanmsg.keys()]:
            sequence = cleanmsg["_sequence"]
            part = cleanmsg["_part"]
            self.sequenceDict[sequence][part] = cleanmsg
            if part == "0":
                #dispatch the first message and get the queue number
                self.sequenceDict[sequence]["output"] = self.dispatch(cleanmsg)
                self.sequenceDict[sequence]["current"] = 0
            # send the next message, if it is available and the output is set
            if self.sequenceDict[sequence]["output"]:
                output = self.sequenceDict[sequence]["output"]
                while self.sequenceDict[sequence][str(
                        self.sequenceDict[sequence]["current"] + 1)]:
                    self.sequenceDict[sequence][
                        "current"] = self.sequenceDict[sequence]["current"] + 1
                    self.dispatch(
                        self.sequenceDict[sequence][str(
                            self.sequenceDict[sequence]["current"])], output)
        else:
            self.dispatch(cleanmsg)
    def next(self, queue_number):
        """Pulls the next value from the specified queue.
        Args:
            msg: A standard JSON message with either strings or numerics.
        Raises:
            ValueError: If the queue number is outside the range of options.
            QadiumError: If the requested queue is empty.
        Returns:
            The next value in the selected queue.
        """
        valueError = 'Check your queue_number. Valid options include: 0, 1, 2, 3, 4.'
        if queue_number not in [0, 1, 2, 3, 4]:
            raise ValueError(valueError)
        try:
            return self.queue_dict[str(queue_number)].get(block=False)
        except queue.Empty:
            raise MootiumError("Nothing is available on the queue")


def get_message_service():
    """Returns a new, "clean" Q service."""
    return Queue()
I'm curious for feedback. In particular, what would you have used instead of the auto vivification and are there ways you would make this more efficient or pieces of code you would exclude?
Welcome to Code Review! Please put the description of the task you want to accomplish (possibly abbreviated) directly in the question. Otherwise your question will lose in quality or even become useless as reference for others once the pastebin link goes down. – AlexV, Apr 26, 2019 at 6:11
1 Answer
Welcome to Code Review, and welcome to Python!
Your code looks good -- indentation is good, names are mostly good (but see below), docblock comments are mostly good (but see below). It seems like you need to "soak in" Python a bit, and you'll be up and running.
Names
Python's coding standard is PEP8, which for the purposes of naming can be simplified to:
- snake_case except for classes
- CAPS for constants
- unless you have to do something else
Since your method names were mostly determined by the problem spec, you didn't have a lot of wiggle room. I still fault you for being inconsistent, though:
self.queue_dict = { ... }
self.sequenceDict = AutoVivification()
That last attribute should be sequence_dict. Except putting type in names is so Windows 3.1! So maybe pending_sequences.
Comments
If your comment says in English what the code says in Python, delete it. Comments should explain parts of the code that aren't clear or that have possibly-surprising effects.
#list of output queues
self.queue_dict = {
This comment is already a lie, since queue_dict isn't a list at all!
Also, there's a copy/paste error in the docblock for next: the Args are wrong.
Types
The name for AutoVivification is collections.defaultdict.
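As a minimal sketch of that replacement (the names nested_dict, pending, and parts_by_sequence are illustrative and not from the original code): a defaultdict whose factory is itself reproduces the fully recursive behaviour of AutoVivification, while defaultdict(dict) covers the single level of nesting that sequence bookkeeping probably needs.

```python
from collections import defaultdict


def nested_dict():
    """Return a dict whose missing keys become further nested dicts."""
    return defaultdict(nested_dict)


# Fully recursive: every missing key creates another nested_dict.
pending = nested_dict()
pending["seq-a"]["0"]["payload"] = "first"

# One level only: each new sequence id gets a plain dict of its parts.
parts_by_sequence = defaultdict(dict)
parts_by_sequence["seq-a"][0] = {"_sequence": "seq-a", "_part": 0}
```

Note that merely reading a missing key from a defaultdict creates it, so test for presence with the in operator rather than by indexing.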
class MootiumError should have a different name, since it has a fairly specific purpose. Considering that you later catch queue.Empty, I'm surprised at your choice. Perhaps MootiumQueueEmpty? Or even MootiumQueueError?
You construct a dictionary of numbered queues, but index the dictionary with strings. Then in dispatch you have to convert your number to a string to index the queue. Why not just use integer keys for the dictionary? Better still, why not use a list, which takes integer keys always? (And it would make your comment valid again!)
import collections

def __init__(self):
    self.queue_dict = [queue.Queue() for _ in range(5)]
    # Store sequence-parts in a separate dict for each sequence
    self.pending_sequences = collections.defaultdict(dict)
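To show how the list plays out on the reading side, here is a rough sketch written as module-level code rather than as a method. The names NUM_QUEUES, output_queues, and next_message are assumptions made for the example; MootiumQueueEmpty is one of the exception names suggested above.

```python
import queue

NUM_QUEUES = 5  # the task describes queues numbered 0 through 4


class MootiumQueueEmpty(Exception):
    """Raised when a message is requested from an empty queue."""


# A list is indexed by integer directly, so no str() conversion is needed.
output_queues = [queue.Queue() for _ in range(NUM_QUEUES)]


def next_message(queue_number):
    """Return the next message from the given queue without blocking."""
    if queue_number not in range(NUM_QUEUES):
        raise ValueError(f"queue_number must be one of 0..{NUM_QUEUES - 1}")
    try:
        return output_queues[queue_number].get(block=False)
    except queue.Empty:
        # "from None" hides the internal queue.Empty from the traceback.
        raise MootiumQueueEmpty(f"queue {queue_number} is empty") from None


output_queues[3].put('{"value": -513}')
first = next_message(3)
```

Translating queue.Empty into a domain-specific exception keeps callers from depending on the standard-library queue module.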
There are three iterator functions for dictionaries: keys(), values(), and items(). The items iterator yields (key, value) tuples.
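A small illustration of those iterators, using a made-up message (the variable names are only for the example):

```python
message = {"company": "Mootium, Inc.", "value": 512, "_part": 0}

# items() yields (key, value) pairs, so no second lookup by key is needed.
public_fields = {k: v for k, v in message.items() if not k.startswith("_")}

# values() is enough when only the values matter.
has_integer = any(isinstance(v, int) for v in public_fields.values())
```

One caveat worth keeping in mind: bool is a subclass of int in Python, so isinstance(True, int) is also true.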
Dictionaries can be checked for the presence of keys using the in operator. Strings can be checked for substrings using the in operator. Sequences can be linearly scanned for items using the in operator. That last form is the most expensive of the three. Naturally, that's what you're doing: you build a list of the keys and then scan it. Don't do that!
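The three uses of in can be put side by side in a short sketch (the message contents are invented for the example):

```python
message = {"_hash": "company", "company": "Mootium, Inc."}

# Dictionary membership: a hash lookup on the keys.
has_hash_rule = "_hash" in message

# Substring test on a string.
mentions_mootium = "Mootium" in message["company"]

# Building a list of keys first gives the same answer, but it copies
# every key and then scans the copy linearly.
has_hash_rule_slow = "_hash" in [k for k in message.keys()]
```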
Python is not as regex-first as Perl. So there are non-re string functions, like startswith. They're faster and more expressive.
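For the private-field check, the two approaches select the same names, as this sketch with an invented list of field names shows:

```python
import re

field_names = ["_hash", "hash", "_part", "company"]

# re.match anchors at the start of the string, so this finds a leading "_".
private_via_regex = [n for n in field_names if re.match("_", n)]

# The string method states the same intent without involving the regex engine.
private_via_method = [n for n in field_names if n.startswith("_")]
```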
def transform(self, msg):
    """ ... """
    message = json.loads(msg)

    for k, v in message.items():
        if k.startswith('_'):
            continue
        # reverse strings that include Mootium
        if isinstance(v, str) and 'Mootium' in v:
            message[k] = v[::-1]
        # replace integer values with its bitwise negation
        elif isinstance(v, int):
            message[k] = ~v

    if '_hash' in message:
        # if _hash references another field, encode the value from that field.
        # otherwise, encode the value associated with _hash
        if message["_hash"] in message:
            toEncode = message[message["_hash"]].encode()
        else:
            toEncode = message["_hash"].encode()
        # b64encode returns bytes; decode so the digest compares equal to a
        # str hash loaded from JSON and stays JSON-serializable
        digest = base64.b64encode(hashlib.sha256(toEncode).digest()).decode('ascii')
        # setdefault stores the digest if no hash exists, else returns the old one
        if message.setdefault('hash', digest) != digest:
            raise ValueError(
                'The computed hash has a different value from the existing hash field'
            )
    return message
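One detail worth verifying in the hash step: base64.b64encode returns bytes, whereas a hash field parsed by json.loads is a str, and json.dumps cannot serialize bytes. The sketch below isolates that step in a hypothetical helper named compute_hash (not part of the original code) and decodes the digest to text so that both the comparison and the later serialization work.

```python
import base64
import hashlib
import json


def compute_hash(value):
    """Return the base64-encoded SHA-256 digest of a string, as a str."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


message = json.loads('{"_hash": "company", "company": "Mootium, Inc."}')

# Hash the referenced field if it exists, else the value of _hash itself.
source = message.get(message["_hash"], message["_hash"])
message["hash"] = compute_hash(source)

# Succeeds because the hash is text rather than bytes.
serialized = json.dumps(message)
```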