In our product we are trying to parse the following different formats from a given piece of text -
${{node::123456}}
${{node:123456}}
$fn{{#functionName('abcd',',',' somethingWithASpace')}}
$fn{{#functionName('abcd','#','${{node::123456}}')}}
${{rmtrqst:someText[]->abcd}}
A sample of the text looks like this -
Hi, how are you ${{node::123456}}? Your order id is ${{node::636636}}.
or
Your order was placed on $fn{{#dateConverterFunction('abcd','#','${{node::123456}}')}}
I tried the regex /\$((fn)\{{2}(\#|)(\w*)((\(.*\))|([^\$]*))\}{2})/gi, but it is not helping much. Can anyone suggest how to write a parser for this?
A grammar could be like this -
- Every expression starts with $ followed by either fn{{ or {{.
- After that comes a string such as node or #functionName, or something else.
- That may be followed by a parenthesis-enclosed string (which may itself contain a whole expression such as ${{node::1234}}; whatever is inside the parentheses should be ignored).
- Finally, it is closed by }}.
1 Answer
Use a tokenizer and let it break the strings down into a meaningful structure.
The nearley.js library is a popular choice for parsing non-linear structures like yours. You can keep your expressions simple or, if you choose otherwise, the library can build an abstract syntax tree for a more complicated grammar.
To write a parser using the library, define your grammar in a separate file and use it for parsing.
Or you can use the tokenizer directly to tokenize your string.
@{%
const moo = require("moo");
const lexer = moo.compile({
  ws: /[ \t]+/,
  number: /[0-9]+/,
  word: /[a-z]+/,
  times: /\*|x/
});
%}
# Pass your lexer object using the @lexer option:
@lexer lexer
# Use %token to match any token of that type instead of "token":
multiplication -> %number %ws %times %ws %number {% ([first, , , , second]) => first * second %}
# Literal strings now match tokens with that text:
trig -> "sin" %number
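To make the four rules from the question concrete, here is a rough sketch of a hand-written scanner in plain JavaScript. It does not use nearley or moo; it only illustrates how tracking parenthesis depth and quotes lets you skip a nested ${{...}} inside the arguments, which a single regex struggles with. The function name findExpressions and the fields type, name, args and raw are made up for this example, and it assumes arguments are single-quoted with no escaped quotes inside them.

```javascript
// Hypothetical helper (not part of nearley or moo): returns every
// top-level expression found in `text`, following the question's rules.
function findExpressions(text) {
  const found = [];
  let i = 0;
  while (i < text.length) {
    // Rule 1: an expression opens with "$fn{{" or "${{".
    let opener = null;
    if (text.startsWith("$fn{{", i)) opener = "$fn{{";
    else if (text.startsWith("${{", i)) opener = "${{";
    if (opener === null) {
      i += 1;
      continue;
    }

    const bodyStart = i + opener.length;
    let depth = 0;       // current parenthesis nesting depth
    let inQuote = false; // true while inside a '...' argument
    let end = -1;        // index of the closing "}}", if any

    for (let j = bodyStart; j < text.length; j++) {
      const ch = text[j];
      if (inQuote) {
        // Assumption: no escaped quotes inside an argument.
        if (ch === "'") inQuote = false;
      } else if (ch === "'" && depth > 0) {
        inQuote = true;
      } else if (ch === "(") {
        depth += 1;
      } else if (ch === ")" && depth > 0) {
        depth -= 1;
      } else if (depth === 0 && text.startsWith("}}", j)) {
        // Rule 4: "}}" only closes the expression outside parentheses.
        end = j;
        break;
      }
    }

    if (end === -1) {
      // Unterminated expression: skip this "$" and keep scanning.
      i += 1;
      continue;
    }

    // Rules 2 and 3: a name, optionally followed by "(...)".
    const body = text.slice(bodyStart, end);
    const paren = body.indexOf("(");
    found.push({
      type: opener === "$fn{{" ? "function" : "reference",
      name: paren === -1 ? body : body.slice(0, paren),
      args: paren === -1 ? null : body.slice(paren + 1, body.lastIndexOf(")")),
      raw: text.slice(i, end + 2),
    });
    i = end + 2;
  }
  return found;
}

// Usage: list the raw expressions found in one of the sample texts.
const sample =
  "Your order was placed on $fn{{#dateConverterFunction('abcd','#','${{node::123456}}')}}";
console.log(findExpressions(sample).map((e) => e.raw));
```

The nested ${{node::123456}} is left untouched inside args, so you can run the same function on the arguments later if you do need to resolve it. Once the grammar grows beyond these four rules, a real lexer plus a nearley grammar as above is probably easier to maintain than extending this loop.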