A JavaScript class, which finalises the HTML of each page of my NodeJS (Express) website

Question 1

Background

A request (req in the code) is made to my server, and a response (res) returned. By the time the response has filtered through to the code below, most of the heavy lifting - looking up entries in a database, etc - has already been done. This code just puts some final tweaks on each response before it's pushed out to the user's browser.

What This Code Does

More specifically, the code achieves three things:

A time stamp is added to the footer of the page.
The crude kind of apostrophes are transmuted into the prettier, curlier kind. (This is more important in some fonts than in others.)
-- and --- are transmuted into en and em dashes respectively.

Entry Point

This code is fired only when code in another file calls finaliser.protoRender(req, res, view, properties);. So the protoRender method is the only entry point; it's also the only exit.

The Code

/*
This code contains a class which handles any final, universal touches to the
page before it's passed to the browser.
*/
// The class in question.
class Finaliser
{
    constructor()
    {
    }
    // Ronseal.
    static fixApostrophes(input)
    {
        while(input.indexOf("``") >= 0)
        {
            input = input.replace("``", "&ldquo;");
        }
        while(input.indexOf("''") >= 0)
        {
            input = input.replace("''", "&rdquo;");
        }
        while(input.indexOf("`") >= 0)
        {
            input = input.replace("`", "&lsquo;");
        }
        while(input.indexOf("'") >= 0)
        {
            input = input.replace("'", "&rsquo;");
        }
        return input;
    }
    // Ronseal.
    static fixDashes(input)
    {
        while(input.indexOf("---") >= 0)
        {
            input = input.replace("---", "&mdash;");
        }
        while(input.indexOf("--") >= 0)
        {
            input = input.replace("--", "&ndash;");
        }
        return input;
    }
    // Render, and deliver the page to the browser.
    protoRender(req, res, view, properties)
    {
        var date = new Date();
        properties.footstamp = date.toISOString();
        res.render(view, properties, function(err, html){
            if(html === undefined)
            {
                res.render(view, properties);
            }
            else
            {
                html = Finaliser.fixApostrophes(html);
                html = Finaliser.fixDashes(html);
                res.send(html);
            }
        });
    }
}
// Exports.
module.exports = Finaliser;

Question 2

Is there a specific reason you are using HTML entities? Unless you are not delivering UTF-8 (which you should be) you can just use the literal Unicode characters. Why don't your original texts use the proper characters directly?

Question 3

It's mostly a matter of personal taste. I've heard enough horror stories about programmers getting muddled up between ' and ’; &rsquo; is unambiguous (at least to me!). Both the HTML templates and the database entries lack the literal characters - and will continue to lack them unless you can point me to a convenient way of touch-typing them using a standard keyboard.

Question 4

Ways of improving/optimizations:

What the fixApostrophes and fixDashes functions actually do is replace specific punctuation characters with their respective HTML entities.
Instead of those numerous horrifying while loops, a more efficient, concise and extensible approach would be to:

compose a predefined replacement map (where the keys are search patterns and the values are the respective entities):

const replaceMap = {"``": "&ldquo;", "''": "&rdquo;", "`": "&lsquo;",
 "'": "&rsquo;", "---": "&mdash;", "--": "&ndash;"};

perform all the replacements at once with String.prototype.replace and a single combined regex pattern:
```
input = input.replace(new RegExp(Object.keys(replaceMap).join('|'), 'g'), function(m){
 return replaceMap[m] || m;
}); 
```
where Object.keys(replaceMap).join('|') composes a regex alternation from the replaceMap keys: ``|''|`|'|---|--. Alternatives are tried left to right, so each longer pattern must be listed before its shorter prefix (--- before --).
the former two functions can then be combined into a single function called, say, punctToEntities ("punctuation to entities")
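
To make the pattern composition concrete, here is a minimal sketch. The helper names escapeRegExp and buildPattern are my own, not part of the answer. It sorts the keys longest-first and escapes regex metacharacters, on the assumption that the map may later gain keys in a different order or keys containing special characters.

```javascript
const replaceMap = {
  "``": "&ldquo;", "''": "&rdquo;", "`": "&lsquo;",
  "'": "&rsquo;", "---": "&mdash;", "--": "&ndash;"
};

// Hypothetical helper: escape characters that are special inside a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build one global alternation, longest keys first,
// so that "---" is tried before "--" whatever the map's key order is.
function buildPattern(map) {
  const keys = Object.keys(map).sort((a, b) => b.length - a.length);
  return new RegExp(keys.map(escapeRegExp).join("|"), "g");
}

const pattern = buildPattern(replaceMap);
const sample = "``Hello'' -- it's here --- at last";
const converted = sample.replace(pattern, (m) => replaceMap[m]);

console.log(pattern.source);
console.log(converted);
```

With the map exactly as written in the answer, the sorting changes nothing about the result; it only removes the dependency on key order.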

Eventually, the Finaliser class would look as:

const replaceMap = {"``": "&ldquo;", "''": "&rdquo;", "`": "&lsquo;",
                    "'": "&rsquo;", "---": "&mdash;", "--": "&ndash;"};
class Finaliser
{
    constructor()
    {
    }
    static punctToEntities(input) {
        /** Converts punctuation chars to respective HTML entities **/
        input = input.replace(new RegExp(Object.keys(replaceMap).join('|'), 'g'), function(m){
            return replaceMap[m] || m;
        });
        return input;
    }
    // Render, and deliver the page to the browser.
    protoRender(req, res, view, properties) {
        var date = new Date();
        properties.footstamp = date.toISOString();
        res.render(view, properties, function(err, html){
            if (html === undefined) {
                res.render(view, properties);
            } else {
                html = Finaliser.punctToEntities(html);
                res.send(html);
            }
        });
    }
}
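
To try the class without starting a server, a stub response object is enough. The sketch below repeats a condensed copy of the class and calls protoRender with a hypothetical fakeRes object that mimics only the two Express methods the class uses, res.render(view, locals, callback) and res.send(body). The stub and its canned HTML are assumptions for illustration, not how Express renders a real view.

```javascript
const replaceMap = {
  "``": "&ldquo;", "''": "&rdquo;", "`": "&lsquo;",
  "'": "&rsquo;", "---": "&mdash;", "--": "&ndash;"
};

class Finaliser {
  // Converts punctuation placeholders to their HTML entities.
  static punctToEntities(input) {
    const pattern = new RegExp(Object.keys(replaceMap).join("|"), "g");
    return input.replace(pattern, (m) => replaceMap[m] || m);
  }

  // Render, and deliver the page to the browser.
  protoRender(req, res, view, properties) {
    properties.footstamp = new Date().toISOString();
    res.render(view, properties, function (err, html) {
      if (html === undefined) {
        // Rendering failed: render again without a callback,
        // which leaves the error handling to Express.
        res.render(view, properties);
      } else {
        res.send(Finaliser.punctToEntities(html));
      }
    });
  }
}

// Hypothetical stand-in for the Express response object.
const sent = [];
const fakeRes = {
  render(view, properties, callback) {
    // Pretend the view rendered to this fixed HTML.
    if (callback) callback(null, "<p>It's the " + view + " page -- rendered</p>");
  },
  send(html) {
    sent.push(html);
  }
};

const properties = {};
new Finaliser().protoRender({}, fakeRes, "home", properties);
console.log(sent[0]);
console.log(properties.footstamp);
```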

Question 5

Why use a template literal? Just use new RegExp(Object.keys(replaceMap).join('|'), 'g').

Question 6

@RoToRa, correct. That's because I had started out writing the pattern as `(${Object.keys(replaceMap).join('|')})` (enclosed in parentheses) but forgot to simplify it at the end. Thanks, see my update.

Question 7

Thank you! I always felt, intuitively, that the while loops were somehow unholy, but I couldn't think of a better way of achieving the same result. Now I know. I'll be implementing your suggestions.

Question 8

@TomHosker, you're welcome!

Question 9

@TomHosker, it's great that you sense that while loops (or, in fact, for loops) are very often not the best approach, especially for readability. I think you'll find the following very helpful. The lessons have little to do with RxJs itself, and more to do with learning a more functional, declarative approach: reactivex.io/learnrx

Question 10

(Extension of my comment above)

It would be better to use literal Unicode characters rather than HTML entities. The advantage of Unicode characters is that they are usable universally, not only when outputting HTML. If you don't like the literal characters in the source code, or find them hard to read, you can use JavaScript escape sequences with the hex code point. For example: "“" === "\u201C". Additionally, you can define a constant with a readable name: const LEFT_DOUBLE_QUOTES = "\u201C"; and/or do what one always should do when code is difficult to read and there is no better option: use comments.
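
As a minimal sketch of that suggestion, the replacement map can hold Unicode characters written as escape sequences behind readable constant names. The constant names and the function name punctToUnicode are my own choices for illustration.

```javascript
// Readable names for the typographic characters, written as escapes.
const LEFT_DOUBLE_QUOTE = "\u201C";
const RIGHT_DOUBLE_QUOTE = "\u201D";
const LEFT_SINGLE_QUOTE = "\u2018";
const RIGHT_SINGLE_QUOTE = "\u2019";
const EN_DASH = "\u2013";
const EM_DASH = "\u2014";

// Same placeholders as before, but mapped to characters instead of entities.
const unicodeMap = {
  "``": LEFT_DOUBLE_QUOTE, "''": RIGHT_DOUBLE_QUOTE,
  "`": LEFT_SINGLE_QUOTE, "'": RIGHT_SINGLE_QUOTE,
  "---": EM_DASH, "--": EN_DASH
};

// Hypothetical function name; the output is usable outside HTML as well.
function punctToUnicode(input) {
  const pattern = new RegExp(Object.keys(unicodeMap).join("|"), "g");
  return input.replace(pattern, (m) => unicodeMap[m]);
}

const result = punctToUnicode("``Don't panic'' --- 1939--1945");
console.log(result);
```

The result only displays correctly if the page is actually served as UTF-8, which is the assumption behind this suggestion.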

There is nothing to say against using placeholders/markup like this for input, however conversion should happen earlier and not as the last thing.

In case of data stored in the database, if you have a front-end for editing the data in the database, then have that front-end convert the text and store the converted text in the database.

Or this could be done in the HTML templates, or even by the template engine itself, through helper functions, plugins or extensions. That way you avoid what I see as the biggest danger of doing this globally as the last step: converting things that shouldn't be converted, such as empty HTML attributes or comments.

<input value=''> → <input value=”>

<!-- comment --> → <!– comment –>
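
If the conversion does have to stay as a last step, one way to reduce this risk is to convert only the text between tags and leave tags and comments alone. The following is a rough sketch under that assumption, not a full HTML parser. The name convertTextOnly is hypothetical, and the simple tag pattern does not protect the contents of script, style or pre elements, nor attribute values that contain a > character.

```javascript
const replaceMap = {
  "``": "&ldquo;", "''": "&rdquo;", "`": "&lsquo;",
  "'": "&rsquo;", "---": "&mdash;", "--": "&ndash;"
};
const punctPattern = new RegExp(Object.keys(replaceMap).join("|"), "g");

// Matches an HTML comment or a tag. Deliberately simple.
const markupPattern = /(<!--[\s\S]*?-->|<[^>]*>)/;

// Hypothetical helper: convert placeholders in text, skip tags and comments.
function convertTextOnly(html) {
  return html
    // Because the pattern has a capturing group, split() keeps the matched
    // markup in the result, at the odd indices.
    .split(markupPattern)
    .map((part, index) =>
      index % 2 === 1
        ? part
        : part.replace(punctPattern, (m) => replaceMap[m]))
    .join("");
}

const page = "<input value=''><!-- note --><p>It's -- fine</p>";
const safe = convertTextOnly(page);
console.log(safe);
```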

BTW, there is another problem with the placeholders you have chosen: ambiguity. ''' could mean either ”’ or ’”.
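
A short sketch shows how this ambiguity is resolved in practice. With an ordered alternation such as the one built from the replacement map, the scan runs left to right, so ''' always becomes a right double quote followed by a right single quote, whether or not that was the intent. The names quoteMap and quotePattern are mine.

```javascript
const quoteMap = {
  "``": "&ldquo;", "''": "&rdquo;", "`": "&lsquo;", "'": "&rsquo;"
};
const quotePattern = new RegExp(Object.keys(quoteMap).join("|"), "g");

// The first two apostrophes match '' and the remaining one matches '.
const ambiguous = "'''".replace(quotePattern, (m) => quoteMap[m]);
console.log(ambiguous);
```

There is no way to ask for the other reading with these placeholders, short of writing the entity or character directly.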

Finally: there are keyboard layouts that include typographical quotes. There are also macro programs that, for example, let you define abbreviations or key combinations that output quotes or other characters, and text editors and IDEs often have such a mechanism built in.

Question 11

You make some fascinating points - particularly regarding the handling of HTML comments. If I could accept two different answers, I would!

score 4 · Accepted Answer · 2019-11-13 17:53:44Z

