Background
A request (req in the code) is made to my server, and a response (res) returned. By the time the response has filtered through to the code below, most of the heavy lifting - looking up entries in a database, etc - has already been done. This code just puts some final tweaks on each response before it's pushed out to the user's browser.
What This Code Does
More specifically, the code achieves three things:
- A time stamp is added to the footer of the page.
- The crude kind of apostrophes are transmuted into the prettier, curlier kind. (This is more important in some fonts than in others.)
- -- and --- are transmuted into en and em dashes respectively.
Entry Point
This code is fired only when code in another file calls finaliser.protoRender(req, res, view, properties). So the protoRender method is the only entry point; it's also the only exit.
The Code
/*
This code contains a class which handles any final, universal touches to the
page before it's passed to the browser.
*/

// The class in question.
class Finaliser
{
    constructor()
    {

    }

    // Ronseal.
    static fixApostrophes(input)
    {
        while(input.indexOf("``") >= 0)
        {
            input = input.replace("``", "“");
        }
        while(input.indexOf("''") >= 0)
        {
            input = input.replace("''", "”");
        }
        while(input.indexOf("`") >= 0)
        {
            input = input.replace("`", "‘");
        }
        while(input.indexOf("'") >= 0)
        {
            input = input.replace("'", "’");
        }

        return input;
    }

    // Ronseal.
    static fixDashes(input)
    {
        while(input.indexOf("---") >= 0)
        {
            input = input.replace("---", "—");
        }
        while(input.indexOf("--") >= 0)
        {
            input = input.replace("--", "–");
        }

        return input;
    }

    // Render, and deliver the page to the browser.
    protoRender(req, res, view, properties)
    {
        var date = new Date();

        properties.footstamp = date.toISOString();

        res.render(view, properties, function(err, html){
            if(html === undefined)
            {
                res.render(view, properties);
            }
            else
            {
                html = Finaliser.fixApostrophes(html);
                html = Finaliser.fixDashes(html);
                res.send(html);
            }
        });
    }
}

// Exports.
module.exports = Finaliser;
2 Answers
Ways of improving/optimizations:
What the fixApostrophes and fixDashes functions actually do is replace specific punctuation characters with their respective HTML entities.
Instead of those numerous horrifying while loops, a more optimized, concise and extendable approach would be to:
- compose a predefined replacement map (where the keys are search patterns and the values are the respective entity values):
const replaceMap = {"``": "“", "''": "”", "`": "‘", "'": "’", "---": "—", "--": "–"};
- perform all the replacements at once with the String.replace function, based on a combined regex pattern:
input = input.replace(new RegExp(Object.keys(replaceMap).join('|'), 'g'), function(m){ return replaceMap[m] || m; });
where Object.keys(replaceMap).join('|') is used to compose a regex alternation group from the replaceMap keys, like ``|''|`|'|---|--
- conceptually combine the former 2 functions into a single function called, say, punctToEntities ("punctuations to entities")
Eventually, the Finaliser class would look like this:
const replaceMap = {"``": "“", "''": "”", "`": "‘",
                    "'": "’", "---": "—", "--": "–"};

class Finaliser
{
    constructor()
    {
    }

    static punctToEntities(input) {
        /** Converts punctuation chars to respective HTML entities **/
        input = input.replace(new RegExp(Object.keys(replaceMap).join('|'), 'g'), function(m){
            return replaceMap[m] || m;
        });
        return input;
    }

    // Render, and deliver the page to the browser.
    protoRender(req, res, view, properties) {
        var date = new Date();
        properties.footstamp = date.toISOString();

        res.render(view, properties, function(err, html){
            if (html === undefined) {
                res.render(view, properties);
            } else {
                html = Finaliser.punctToEntities(html);
                res.send(html);
            }
        });
    }
}
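The combined pattern relies on two things: each longer key must come before its shorter prefix in the alternation (--- before --), and none of the keys may contain regex metacharacters. Below is a minimal sketch that makes both assumptions explicit. The names buildReplacer, escapeRegExp and convertPunctuation are hypothetical, and the replacement values are written as Unicode escapes purely for readability, so substitute whatever values the map really holds.

```javascript
// Sketch only: buildReplacer, escapeRegExp and convertPunctuation are
// illustrative names, not part of the code under review.
const replaceMap = {
    "``": "\u201C",  // left double quotation mark
    "''": "\u201D",  // right double quotation mark
    "`": "\u2018",   // left single quotation mark
    "'": "\u2019",   // right single quotation mark
    "---": "\u2014", // em dash
    "--": "\u2013"   // en dash
};

// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function buildReplacer(map) {
    // Longest keys first, so "---" is tried before "--" whatever the map's order.
    const keys = Object.keys(map).sort((a, b) => b.length - a.length);
    const pattern = new RegExp(keys.map(escapeRegExp).join("|"), "g");
    return (input) => input.replace(pattern, (match) => map[match]);
}

const convertPunctuation = buildReplacer(replaceMap);

// Prints the sentence with curly quotes, an em dash and an en dash.
console.log(convertPunctuation("``Wait --- it's 1914--1918''"));
```

Building the pattern once, outside the function that is called per request, also avoids recompiling the regex for every response.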
- Why use a template literal? Just use new RegExp(Object.keys(replaceMap).join('|'), 'g'). – RoToRa, Nov 14, 2019 at 9:44
- @RoToRa, correct. That's because I started out reasoning about that approach as (${Object.keys(replaceMap).join('|')}) (enclosed with braces) but forgot to simplify it at the end. Thanks, see my update. – RomanPerekhrest, Nov 14, 2019 at 9:56
- Thank you! I always felt, intuitively, that the while loops were somehow unholy, but I couldn't think of a better way of achieving the same result. Now I know. I'll be implementing your suggestions. – Tom Hosker, Nov 14, 2019 at 11:33
- @TomHosker, you're welcome! – RomanPerekhrest, Nov 14, 2019 at 11:40
- @TomHosker, it's great that you sense that while (or in fact, for) loops are very often not the best approach (especially for readability). I think you'll find the following very helpful. The learnings actually have little to do with RxJs, and more to do with learning a more functional, declarative approach: reactivex.io/learnrx – Duncan Awerbuck, Nov 19, 2019 at 13:12
(Extension of my comment above)
It would be a better idea to use literal Unicode rather than HTML entities. The advantage of Unicode characters is that they are usable universally and not only when outputting HTML. If you don't like the literal characters in the source code, or find them hard to read, then you can use JavaScript escape sequences with the hex Unicode. For example: "“" === "\u201C". Additionally you can define a constant with a readable name: const LEFT_DOUBLE_QUOTES = "\u201C"; and/or do what one always should do if the code is difficult to read and there is no better option: use comments.
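As a rough illustration of the named-constant idea, the replacement map could be written with escape sequences and readable names. Only LEFT_DOUBLE_QUOTES comes from the text above; the other constant names are made up for this sketch.

```javascript
// Typographic characters as Unicode escapes behind readable names.
// Only LEFT_DOUBLE_QUOTES is named in the answer; the rest are illustrative.
const LEFT_DOUBLE_QUOTES = "\u201C";
const RIGHT_DOUBLE_QUOTES = "\u201D";
const LEFT_SINGLE_QUOTE = "\u2018";
const RIGHT_SINGLE_QUOTE = "\u2019";
const EM_DASH = "\u2014";
const EN_DASH = "\u2013";

// The same placeholder-to-character map, now readable in any font or editor.
const replaceMap = {
    "``": LEFT_DOUBLE_QUOTES,
    "''": RIGHT_DOUBLE_QUOTES,
    "`": LEFT_SINGLE_QUOTE,
    "'": RIGHT_SINGLE_QUOTE,
    "---": EM_DASH,
    "--": EN_DASH
};

console.log(LEFT_DOUBLE_QUOTES === "“"); // true
```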
There is nothing to say against using placeholders/markup like this for input, however conversion should happen earlier and not as the last thing.
In case of data stored in the database, if you have a front-end for editing the data in the database, then have that front-end convert the text and store the converted text in the database.
Or this could be done in the HTML templates, or even by the template engine itself, through helper functions, plugins or extensions. That way you can avoid what I see as the biggest danger of doing this globally as the last step: converting things that shouldn't be converted, such as empty HTML attributes or comments.
<input value=''> → <input value=”>
<!-- comment --> → <!– comment –>
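If the conversion does stay as a final global step, one way to reduce the damage is to leave tags and comments alone and convert only the text between them. The sketch below assumes a naive regex split rather than a real HTML parser; convertTextOnly is a hypothetical helper, and it still mishandles script, style and pre content, as well as attribute values containing >.

```javascript
// Sketch: convert punctuation only in text that lies outside tags and comments.
// convertTextOnly is a hypothetical helper; the regex split is a naive
// stand-in for a real HTML parser.
const replaceMap = {
    "``": "\u201C", "''": "\u201D", "`": "\u2018",
    "'": "\u2019", "---": "\u2014", "--": "\u2013"
};
const punctuation = /``|''|`|'|---|--/g;

function convertTextOnly(html) {
    // The capturing group keeps comments and tags in the split result:
    // they land at the odd indexes, the text between them at the even ones.
    return html
        .split(/(<!--[\s\S]*?-->|<[^>]*>)/)
        .map((part, index) =>
            index % 2 === 1
                ? part
                : part.replace(punctuation, (match) => replaceMap[match]))
        .join("");
}

// The empty attribute and the comment survive; only the paragraph text changes.
console.log(convertTextOnly("<input value=''><!-- comment --><p>it's fine -- really</p>"));
```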
BTW, there is another problem with the placeholders you have chosen: ambiguity. ''' could mean either ”’ or ’”.
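A quick check of how a left-to-right regex replacement resolves that ambiguity, using the same placeholder map (convert is just a throwaway name for this sketch). The pair '' always wins at the first position, so the nested-quote reading cannot be produced:

```javascript
// Sketch: three apostrophes are always read as '' followed by ',
// never as ' followed by ''.
const replaceMap = {
    "``": "\u201C", "''": "\u201D", "`": "\u2018",
    "'": "\u2019", "---": "\u2014", "--": "\u2013"
};
const pattern = /``|''|`|'|---|--/g;
const convert = (text) => text.replace(pattern, (match) => replaceMap[match]);

// Right double quote first, then right single quote.
console.log(convert("'''") === "\u201D\u2019"); // true

// A quotation nested at the end of another comes out with its closing
// marks in the wrong order: the single quote should close first.
console.log(convert("``She said `no'''"));
```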
Finally: there are keyboard layouts that have typographical quotes. There are also macro programs that, for example, allow you to define abbreviations or key combinations that output quotes or other characters, and text editors/IDEs often have such a mechanism built in.
- You make some fascinating points - particularly regarding the handling of HTML comments. If I could accept two different answers, I would! – Tom Hosker, Nov 14, 2019 at 17:35
- […] ' and ’; ’ is unambiguous (at least to me!). Both the HTML templates and the database entries lack the literal characters - and will continue to lack them unless you can point me to a convenient way of touch-typing them using a standard keyboard.