A message can contain an user name or group name, the user name has to be 2
character long and should contain uppercase ASCII characters only, similarly group name can be 3-10
characters long. There can be any number of these entities with the optional comment at the end.
The message would have the following format:
amount|<user|group> <,user|group>* opt_comment
Example:
Suppose there is a group COOKIEMONSTERS
with users CD, RM
:
Note: There can be multiple users in a group.
Valid messages
AB+2, BC, COOKIEMONSTERS "Dinner"
AB, BC
AB, BC, FOODIE
AB, BC, FOODIE+4
Invalid messages
A, BC, COOKIEMONSTERS "Dinner"
A, BC, "Dinner" COOKIEMONSTERS
A, BC, SOMEVERYVERYLONGNAME
1+A, BC
I have to parse the message and if there is group name present then I have to expand it into its constituent user names (assume that I have some service for it) hence, given the group constraint I chose not to follow the regex path.
So, given the above message I need the following information:
['AB+2', 'BC', 'CD', 'RM'] and "Dinner" if comment present
where CD
and RM
are members of the group COOKIEMONSTERS
.
I have written some very ugly code and want to know if some cleaner approach is possible.
Code:
const USER_LENGTH = 2;
function _parseExpenseMessage(message, dependencies) {
const parseInfo = parse(message);
const mul = [];
const add = [];
const handles = [];
let idx = 0;
for (let handle of parseInfo.handles) {
handles.push(handle.substring(0, USERNAME_LENGTH));
idx = handle.indexOf('*');
if (idx !== -1) {
mul.push(roundToTwo(handle[idx + 1]));
} else {
mul.push(1);
}
idx = handle.indexOf('+');
if (idx !== -1) {
add.push(roundToTwo(handle[idx + 1]));
} else {
add.push(0);
}
}
let amount = +parseInfo.amount;
let each = (amount - add.reduce((x, sum) => x + sum, 0)) / mul.reduce((x, sum) => x + sum, 0);
const res = [];
for (let i = 0; i < handles.length; i++) {
res.push({
target: handles[i],
amount: each * mul[i] + add[i]
});
}
return {expenses: res, comment: parseInfo.comment};
};
function parse(text) {
const duplicates = new Set();
const handles = [];
const amount = text.split('|')[0];
const message = text.split('|')[1];
let name = '';
let expr = '';
let hasComment = false;
let i = 0;
let len = message.length;
for (; i < len; i++) {
var char = message.charAt(i);
if (isValidCharacter(char)) {
name += char;
} else if (isExpression(char)) {
expr += char;
}
if (isSeparator(char) || i === message.length - 1) {
if (char === '"') {
if (message.lastIndexOf('"') === message.length - 1) {
hasComment = true;
break;
} else {
throw new Error('Invalid comment format');
}
}
if (name.length === USERNAME_LENGTH) {
handles.push(name + expr);
duplicates.add(name);
} else if (name.length > USERNAME_LENGTH) {
let group = someService(name);
for (let user of group.users()) {
let name = user.getName();
duplicates.add(user);
handles.push(user + expr);
}
}
name = '';
expr = '';
}
}
let comment = '';
if (hasComment) {
comment = message.substr(i, len).replace(/"/g, '');
}
// Duplicate entry not allowed
if (duplicates.size < handles.length) {
throw new Error('Duplicate entry');
}
return {amount: amount, handles: handles, comment: comment};
}
function isValidCharacter(char) {
return (char >= 'A' && char <= 'Z');
}
function isExpression (char) {
return (char === '*' || char === '+' || (char >= '0' && char <= '9'));
}
function isSeparator (char) {
return (char === ' ' || char === ',' || char === '"');
}
function roundToTwo(num) {
return Math.round(+(num) * 100) / 100;
}
Note:
I also need to check that there are not duplicate user in the message above.
Update 1:
The user name can contain expression with the limited semantics, eg: AB+1
or AB*2
or AB+1*2
that's it. There can be only two operators +
and *
.
1 Answer 1
Much like the other commenters on your question, I don't quite understand your specs. So this will be kind of a "blind" review of the code, focusing on technique, and largely disregarding the overall approach and algorithm.
let
and var
You use let
but not enough. For example, in this code var char
would be better as let char
:
for (; i < len; i++) { var char = message.charAt(i); if (isValidCharacter(char)) { name += char; } else if (isExpression(char)) { expr += char; }
Reserved words
char
is a reserved word in JavaScript.
Notice that its syntax highlighting in the code in your question is a bit strange. That was a clue.
Repeated splits
Here the same split is performed twice:
const amount = text.split('|')[0]; const message = text.split('|')[1];
Handling duplicate users
The parse
function parses user
and expr
strings,
collecting user + expr
in a list of handles
,
and checks for duplicate users by putting them in a duplicates
set.
First of all, the duplicates
is unfortunate for a set
,
which by definition will only contain distinct values.
But my main objection here is that you check for duplicates near the end of the function, after the parsing has completed. It would be better to check at the point when you insert a user into a set, if it already exists in the set then raise the error immediately, there's no need to continue with parsing.
Setting the comment
In the parse
function, instead of declaring and setting comment
at the end, it would be better to set it at the point where you set hasComment = true
.
The benefit will be that you can remove the hasComment
flag variable,
and declare the i
loop variable within the scope of the loop.
What is _parseExpenseMessage
for?
Since the function name starts with an _
character,
I assumed it might be an indication of some sort of "private" function,
used by other functions,
so I started reviewing the parse
function first,
in an attempt to explore the code through its public interface.
But then, it seems the _parseExpenseMessage
function is not referenced anywhere.
And the code in it doesn't resemble anything you've described in the description, so I'm wondering if it's used at all.
Term ordering
I liked very much a tip from the book Code Complete to write the terms of a range condition in increasing order of the values, when applicable. So instead of this:
return (char >= 'A' && char <= 'Z');
I find it easier to read when written like this:
return ('A' <= char && char <= 'Z');
This appeals to me also because it's one step away from Python's cool ... <= ... <= ...
operator, and I have secret hopes that other languages may implement this feature too.
When the terms are in sorted order,
the switch to a more modern (future) writing style will be just one step away. (Yeah right, I'll dream on.)
I would apply the same logic to the other similar functions too.
-
\$\begingroup\$ Actually the problem is from a interview coding problem hence I couldn't reveal much details. Having said that I agree that I failed to explain the problem properly because parsing the message is part of the big problem. So, let me give you quick hints, the application is a command line tool to manage expenses, you can register expense with the command
AB: amount|BC,CD,PIZZABUDDYS opt_comment
here AB is the spender andBC
andCD
are users names andPIZZABUDDY
can be a alias for multiple users and application requires that there cannot be duplicate user names on the right side. \$\endgroup\$CodeYogi– CodeYogi2016年12月27日 04:19:49 +00:00Commented Dec 27, 2016 at 4:19 -
\$\begingroup\$ I have designed the application is a decoupled manner and my core just accepts the user names so I have to parse the message upfront before calling the core logic. Now, one important variation is that the message can be something like
AB: 40.00|BC+1,CD*2,PIZZABUDDYS+1 opt_comment
which means that money would be distributed equally butBC
would pay 1 dollar extraCD
would pay double and users inPIZZABUDDY
would each pay 1 dollar extra. So, my main pain point was to somehow parse the message and extract these meaningful information in the most efficient way, makes sense now? \$\endgroup\$CodeYogi– CodeYogi2016年12月27日 04:24:02 +00:00Commented Dec 27, 2016 at 4:24 -
1\$\begingroup\$ @CodeYogi: It would (have) helped to have that motivation/description of "expressions" in the question. Note that Vogel612 is right in rolling back even a single underscore: edits that improve the question are welcome. They never may invalidate answers - in contrast, comments may have been given hoping them to become dispensable. (Feeling with anyone confronted with a haphazard requirements specification (interview or not) - I consider smooth handling a hard professional skill.) \$\endgroup\$greybeard– greybeard2016年12月27日 11:34:18 +00:00Commented Dec 27, 2016 at 11:34
Explore related questions
See similar questions with these tags.
CD
andRM
are coming from in your example? Also what is the+2
for? \$\endgroup\$+n
suffix mean? Are there always exactly two user names? If not, thenAB, CD
is ambiguous - CD could be a group. So I assume it's always two usernames. But then what's the purpose of the 2nd valid example, if the idea is to associate users with a group and no group is specified? \$\endgroup\$