Find and replace text in a webpage with JavaScript

Question 1

I am new to web development and would like to get feedback on a JavaScript library that automatically finds and replaces text on a webpage using the MutationObserver interface.

EDIT: Seeing that this has gone a week unanswered I've cut out non-essential code to reduce the line count from 300 to more manageable 150.

The README may also be helpful in understanding how it is meant to be used. My main questions are:

As you might have inferred from the heavy use of private variables, I have a Java background. Is such usage of OOP/encapsulation idiomatic in JS?
To prevent replacements made by one observer's callback from alerting other observers and creating an infinite "ping-pong" effect, I store every created TextObserver in a static class variable and deactivate them before running the callback. Is this good practice? It feels like a God object.
Is there anything else I should do to make this more fit for use as a library? (UMD?) And is the README understandable? (If it is appropriate to ask for reviews on documentation here.)

class TextObserver {
 #target;
 #callback;
 #observer;
 // Sometimes, MutationRecords of type 'childList' have added nodes with overlapping subtrees
 // Nodes can also be removed and reattached from one parent to another
 // This can cause resource usage to explode with "recursive" replacements, e.g. expands -> physically expands
 // The processed set ensures that each added node is only processed once so the above doesn't happen
 #processed = new Set();
 // Keep track of all created observers to prevent infinite callbacks
 static #observers = new Set();
 // Use static read-only properties as class constants
 static get #IGNORED_NODES() {
 // Node types that implement the CharacterData interface but are not relevant or visible to the user
 return [Node.CDATA_SECTION_NODE, Node.PROCESSING_INSTRUCTION_NODE, Node.COMMENT_NODE];
 }
 static get #IGNORED_TAGS() {
 // Text nodes that are not front-facing content
 return ['SCRIPT', 'STYLE', 'NOSCRIPT'];
 }
 static get #CONFIG() {
 return {
 subtree: true,
 childList: true,
 characterData: true,
 characterDataOldValue: true,
 };
 }
 constructor(callback, target = document, processExisting = true) {
 this.#callback = callback;
 // If target is entire document, manually process <title> and skip the rest of the <head>
 // Processing the <head> can increase runtime by a factor of two
 if (target === document) {
 document.title = callback(document.title);
 // Sometimes <body> may be missing, like when viewing an .SVG file in the browser
 if (document.body !== null) {
 target = document.body;
 } else {
 console.warn('Document body does not exist, exiting...');
 return;
 }
 }
 this.#target = target;
 if (processExisting) {
 TextObserver.#flushAndSleepDuring(() => this.#processNodes(target));
 }
 const observer = new MutationObserver(mutations => {
 const records = [];
 for (const textObserver of TextObserver.#observers) {
 // This ternary is why this section does not use flushAndSleepDuring
 // It allows the nice-to-have property of callbacks being processed in the order they were declared
 records.push(textObserver === this ? mutations : textObserver.#observer.takeRecords());
 textObserver.#observer.disconnect();
 }
 let i = 0;
 for (const textObserver of TextObserver.#observers) {
 textObserver.#observerCallback(records[i]);
 i++;
 }
 TextObserver.#observers.forEach(textObserver => textObserver.#observer.observe(textObserver.#target, TextObserver.#CONFIG));
 });
 observer.observe(target, TextObserver.#CONFIG);
 this.#observer = observer;
 TextObserver.#observers.add(this);
 }
 #observerCallback(mutations) {
 for (const mutation of mutations) {
 const target = mutation.target;
 switch (mutation.type) {
 case 'childList':
 for (const node of mutation.addedNodes) {
 if (node.nodeType === Node.TEXT_NODE) {
 if (this.#valid(node) && !this.#processed.has(node)) {
 node.nodeValue = this.#callback(node.nodeValue);
 this.#processed.add(node);
 }
 } else if (!TextObserver.#IGNORED_NODES.includes(node.nodeType)) {
 // If added node is not text, process subtree
 this.#processNodes(node);
 }
 }
 break;
 case 'characterData':
 if (this.#valid(target) && target.nodeValue !== mutation.oldValue) {
 target.nodeValue = this.#callback(target.nodeValue);
 this.#processed.add(target);
 }
 break;
 }
 }
 }
 static #flushAndSleepDuring(callback) {
 // Disconnect all other observers to prevent infinite callbacks
 const records = [];
 for (const textObserver of TextObserver.#observers) {
 // Collect pending mutation notifications
 records.push(textObserver.#observer.takeRecords());
 textObserver.#observer.disconnect();
 }
 // This is in its own separate loop from the above because we want to disconnect everything before proceeding
 // Otherwise, the mutations in the callback may trigger the other observers
 let i = 0;
 for (const textObserver of TextObserver.#observers) {
 textObserver.#observerCallback(records[i]);
 i++;
 }
 callback();
 TextObserver.#observers.forEach(textObserver => textObserver.#observer.observe(textObserver.#target, TextObserver.#CONFIG));
 }
 #valid(node) {
 return (
 // Sometimes the node is removed from the document before we can process it, so check for valid parent
 node.parentNode !== null
 && !TextObserver.#IGNORED_NODES.includes(node.nodeType)
 && !TextObserver.#IGNORED_TAGS.includes(node.parentNode.tagName)
 // Ignore contentEditable elements as touching them messes up the cursor position
 && !node.parentNode.isContentEditable
 // HACK: workaround to avoid breaking icon fonts
 && !window.getComputedStyle(node.parentNode).getPropertyValue('font-family').toUpperCase().includes('ICON')
 );
 }
 #processNodes(root) {
 // Process valid Text nodes
 const nodes = document.createTreeWalker(root, NodeFilter.SHOW_TEXT, { acceptNode: node => (
 this.#valid(node) && !this.#processed.has(node)) ? NodeFilter.FILTER_ACCEPT : NodeFilter.FILTER_REJECT
 });
 while (nodes.nextNode()) {
 nodes.currentNode.nodeValue = this.#callback(nodes.currentNode.nodeValue);
 this.#processed.add(nodes.currentNode);
 }
 }
}
// A good page to try this out on would be https://en.wikipedia.org/wiki/Heck
const badWordFilter = new TextObserver(text => text.replaceAll(/heck/gi, 'h*ck'));

Question 2

From a short review;

Classes are idiomatic (and becoming more so over time), but the heavy usage of private variables makes the code harder to read for me.
For production code, I would drop calls to console.warn() it does not seem warranted here
Unless you are worried about observers not managed in TextObserver, I would have a static in your class called sleepMode, and in your mutation observer you leave immediately if sleepMode is on. To me that would be KISS and remove the Gode Mode problem.
The name of TextObserver is not great, it is meant to replace text which is not clear from the name
In the README, I would follow the Mozilla format/approach for the replacement function you pass instead of just a function that takes a string as its only argument and returns a string to replace it with

I would add a feature where an observer can have a list/array of functions to apply so that the event needs to be processed only once and then have something like

 const grammarPolice = new TextObserver(text => text.replaceAll(/would of/gi, 'would have'));
 grammarPolice.add(text => text.replaceAll(/should of/gi, 'should have'));

Comments are good, but some of them are too much like // Use static read-only properties as class constants
I would have called valid -> isValid

Overall, my first impression was that this looked indeed like a Java coder writing, but really after a closer look, this is just battled tested code.

konijn konijn 34.2k5 gold badges70 silver badges267 bronze badges · Accepted Answer · 2022-08-09 09:14:45Z

From a short review;

Classes are idiomatic (and becoming more so over time), but the heavy usage of private variables makes the code harder to read for me.
For production code, I would drop calls to console.warn() it does not seem warranted here
Unless you are worried about observers not managed in TextObserver, I would have a static in your class called sleepMode, and in your mutation observer you leave immediately if sleepMode is on. To me that would be KISS and remove the Gode Mode problem.
The name of TextObserver is not great, it is meant to replace text which is not clear from the name
In the README, I would follow the Mozilla format/approach for the replacement function you pass instead of just a function that takes a string as its only argument and returns a string to replace it with

I would add a feature where an observer can have a list/array of functions to apply so that the event needs to be processed only once and then have something like

 const grammarPolice = new TextObserver(text => text.replaceAll(/would of/gi, 'would have'));
 grammarPolice.add(text => text.replaceAll(/should of/gi, 'should have'));

Comments are good, but some of them are too much like // Use static read-only properties as class constants
I would have called valid -> isValid

Overall, my first impression was that this looked indeed like a Java coder writing, but really after a closer look, this is just battled tested code.

Stack Exchange Network

Find and replace text in a webpage with JavaScript

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Find and replace text in a webpage with JavaScript

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions