I am attempting to build find-and-remove functionality with regex. I got to the point where I could remove plain strings, but discovered that strings containing special characters could not be removed. When using a normal string such as var word = "is", it seems to work fine until it encounters a ., and then I get strange, unwanted output.
Some other unwanted results also appeared when I included special characters in the strings I wanted to remove. For example (note that var word = "is." and not "is" in the code below):
var myarray = ["Dr. this is", "this is. iss", "Is this IS"]
var my2array = []
var word = "is."
//var regex = new RegExp(`\\b${word}\\b`, 'gi');
var regex = new RegExp('\\b' + word + '\\b', 'gi');
for (const i of myarray) {
var x = i.replace(regex, "")
my2array.push(x)
}
myarray = my2array
console.log(myarray)
["Dr. this is", "this is. ", "this IS"]
This ^ is wrong in several ways: for some reason iss is gone; is. remains, although it was the main string I was trying to remove; and the first Is in the last element is gone.
I.e., my desired output in this case would be ["Dr. this is", "this iss", "Is this IS"]
I also tried using a template literal, as can be seen in my commented-out code.
The goal is simply to remove whatever the value of var word might be from the strings in my array, whether that value is a regular string, a string with special characters, or just special characters. (Within the word boundaries I have, of course.)
1 Answer
There are a few issues in the regex approach:
- . is a special regex character that needs to be escaped in your word
- Word boundary \b will not be matched after . when the . is followed by a space or by the end of the string
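The two issues listed above can be reproduced in isolation. The following is only a small sketch using the question's sample array; the variable names (input, unescapedResult, boundaryResult, glued) are made up for illustration and are not part of the original code.

```javascript
const input = ["Dr. this is", "this is. iss", "Is this IS"];

// Issue 1: the "." in "is." is left unescaped, so it matches ANY character.
// "iss" matches (the dot eats the second "s") and so does "Is " (the dot eats the space).
const unescaped = new RegExp("\\b" + "is." + "\\b", "gi");
const unescapedResult = input.map((s) => s.replace(unescaped, ""));
console.log(unescapedResult); // ["Dr. this is", "this is. ", "this IS"]

// Issue 2: the dot is escaped, but \b only matches between a word character
// and a non-word character. "." followed by " " are both non-word characters,
// so there is no boundary after "is." and nothing is removed.
const escapedWithBoundary = /\bis\.\b/gi;
const boundaryResult = input.map((s) => s.replace(escapedWithBoundary, ""));
console.log(boundaryResult); // ["Dr. this is", "this is. iss", "Is this IS"]

// \b after "." only succeeds when a word character follows directly.
const glued = "this is.x".replace(/\bis\.\b/gi, "");
console.log(glued); // "this x"
```

The first result is exactly the unwanted output shown in the question, which suggests the unescaped dot alone accounts for it.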
You may use this regex based solution:
var myarray = ["Dr. this is", "this is. iss", "Is this IS"]
var my2array = []
var word = "is."
// using lookbehind and lookahead instead of word boundary;
// the 'gi' flags (as in the question) remove every occurrence, ignoring case
var regex = new RegExp('\\s*(?<!\\S)' +
word.replace(/\W/g, "\\$&") + '(?!\\S)\\s*', 'gi')
for (const i of myarray) {
var x = i.replace(regex, " ")
my2array.push(x)
}
myarray = my2array
console.log(myarray)
- .replace(/\W/g, "\\$&") will escape all non-word characters in the given word.
- (?<!\S) is a negative lookbehind to assert that the previous character is not a non-space character
- (?!\S) is a negative lookahead to assert that the next character is not a non-space character
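The pieces explained above can be wrapped into one reusable function. This is a minimal sketch, not the answer's exact code: the name removeWord is hypothetical, the 'gi' flags are carried over from the question, and the final trim() is an added assumption so that a match at the start or end of a string does not leave a stray space.

```javascript
// Hypothetical helper: removes every whitespace-delimited occurrence of `word`.
function removeWord(strings, word) {
  // Escape every non-word character so it is matched literally.
  const escaped = word.replace(/\W/g, "\\$&");
  // (?<!\S): no non-space character directly before the word.
  // (?!\S):  no non-space character directly after the word.
  const regex = new RegExp("\\s*(?<!\\S)" + escaped + "(?!\\S)\\s*", "gi");
  // Replace the match and its surrounding whitespace with one space, then trim.
  return strings.map((s) => s.replace(regex, " ").trim());
}

const sample = ["Dr. this is", "this is. iss", "Is this IS"];

console.log(removeWord(sample, "is."));
// ["Dr. this is", "this iss", "Is this IS"]

console.log(removeWord(sample, "is"));
// ["Dr. this", "this is. iss", "this"]
```

One caveat of this sketch: two adjacent matches (for example "a is is b") each get replaced by a space, so a double space can remain in the middle of the string.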
5 Comments
[...] var word = "is", it gets rid of the is in this.
[...] \\. is not escaped in var word = "is\\." For example var word = "is.". I am planning on having user input for var word, so it would save me trouble if I didn't need to alter the var word string in order to make it work. Will this present a problem for me later or is the escaping in var word redundant?
[...] is( from user which if left unescaped will cause regex engine error
. matches any character in a regex, and it’s also not a word character. You can escape it as \. (i.e. "is\\.") to make it match a literal period instead, but the word boundary will still be before it, not after.
[...] .map rather than a for+push. myarray = myarray.map(str => str.replace(regex, ''));
[...] word. See: MDN Regular Expressions - Escaping and Escape string for use in Javascript regex
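The comments above touch on two points: unescaped user input such as is( can make the regex engine throw, and .map is a shorter alternative to for+push. A rough sketch of both follows. The escapeRegExp helper uses the character-class pattern commonly shown in the MDN guide referenced above; the other names (unescapedError, safe, cleaned) are made up for illustration.

```javascript
// Escapes the characters that have special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Unescaped input with an unbalanced "(" is not a valid pattern.
let unescapedError = null;
try {
  new RegExp("is(");
} catch (err) {
  unescapedError = err; // a SyntaxError
}
console.log(unescapedError instanceof SyntaxError); // true

// Escaped input builds fine and matches the literal text "is(".
const safe = new RegExp(escapeRegExp("is("), "g");

// .map returns the new array directly, so no second array or push is needed.
const cleaned = ["what is( this", "this is fine"].map((str) => str.replace(safe, ""));
console.log(cleaned); // ["what  this", "this is fine"]
```

Because the input is escaped inside the function, the user-supplied value of word can be passed in as typed, without manually adding backslashes to it.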