I am facing a small problem with regex. I want to find the start and end index of each complete matched string in the input string.
E.g., I have an array of strings like
["a", "aa"]
and I have a text like "I like a problem aa".
I am iterating over the array of strings:
let arr = ["a", "aa"];
let str = "I like a problem aa";
let indicesArr = [];
arr.forEach(a => {
  const regexObj = new RegExp(a, "gi");
  let match;
  while ((match = regexObj.exec(str))) {
    let obj = { start: match.index, end: regexObj.lastIndex };
    indicesArr.push(obj);
    if (!match.index || !regexObj.lastIndex) break;
  }
});
The above code gives me this result:
[
{start: 7, end: 8},
{start: 17, end: 18},
{start: 18, end: 19},
{start: 17, end: 19}
]
But I want the result to be:
[
{start: 7, end: 8},
{start: 17, end: 19}
]
Any suggestion would be very helpful, thanks:)
1 Answer
The problem here is that a finds two matches in aa. You need a single regex that matches all occurrences of either aa or a, in this order. That means the regex must be /aa|a/g and not /a|aa/g, because the order of alternatives matters in regex: the engine tries them from left to right and stops at the first one that matches.
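To see the effect of the order, here is a minimal sketch that runs both alternations over the same text. The helper name findRanges is made up for this illustration and is not part of the original code.

```javascript
// Hypothetical helper (illustration only): collect {start, end} for every match.
function findRanges(regex, text) {
  const ranges = [];
  for (const m of text.matchAll(regex)) {
    ranges.push({ start: m.index, end: m.index + m[0].length });
  }
  return ranges;
}

const text = "I like a problem aa";

// Shorter alternative first: "a" wins at index 17, so "aa" is split into two matches.
const shortFirst = findRanges(/a|aa/g, text);

// Longer alternative first: "aa" is tried before "a" and is matched as a whole.
const longFirst = findRanges(/aa|a/g, text);

console.log(shortFirst); // three ranges: 7-8, 17-18, 18-19
console.log(longFirst);  // two ranges: 7-8, 17-19
```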
Here, you can use
let arr = ["a", "aa"];
let str = "I like a problem aa";
let indicesArr = [];
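// Sort by length, longest first, so that "aa" is tried before "a"
arr.sort((a, b) => b.length - a.length);
// Escape regex special characters in each item, then join the items with |
const regexObj = new RegExp(arr.map(x => x.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')).join('|'), "gi");
let match;
while ((match = regexObj.exec(str))) {
  let obj = { start: match.index, end: regexObj.lastIndex };
  indicesArr.push(obj);
}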
console.log(indicesArr);
Note these two lines:
arr.sort((a, b) => b.length - a.length); - sorts the arr items by length in descending order (to put aa before a)
const regexObj = new RegExp(arr.map(x=> x.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')).join('|'), "gi"); - escapes all items in the arr array for use inside a regex, and joins the items with the | alternation operator into a single regex pattern string.
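The two steps can also be wrapped in a small reusable function. The sketch below is one possible way to do it, not the only one. The names escapeRegExp and findAllRanges are invented here, and the sample words containing "." and "+" are only there to show why escaping is needed. As extra assumptions, it copies the array before sorting (sort reorders in place) and skips empty strings, which would otherwise produce zero-length matches.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a regex.
function escapeRegExp(s) {
  return s.replace(/[-\/\\^$*+?.()|[\]{}]/g, "\\$&");
}

// Hypothetical helper: return {start, end} for every occurrence of any word in `words`.
function findAllRanges(words, text) {
  const pattern = [...words]               // copy so the caller's array is not reordered
    .filter((w) => w.length > 0)           // an empty alternative would match everywhere
    .sort((a, b) => b.length - a.length)   // longest first, so "aa" is tried before "a"
    .map(escapeRegExp)
    .join("|");
  if (pattern === "") return [];

  const regex = new RegExp(pattern, "gi");
  const ranges = [];
  let match;
  while ((match = regex.exec(text)) !== null) {
    ranges.push({ start: match.index, end: regex.lastIndex });
  }
  return ranges;
}

console.log(findAllRanges(["a", "aa"], "I like a problem aa"));
// Without escaping, "c++" would be an invalid pattern and "a.b" would also match "axb".
console.log(findAllRanges(["a.b", "c++"], "axb a.b c++"));
```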
I like a problem aa and aaaa, actually it's not matching the last aaaa part. aa|a regex.
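For what it's worth, with the aa|a pattern the run aaaa is still covered, but as two back-to-back aa matches rather than one. If a single range per run is what is wanted (an assumption about the intent of this comment), one option is to merge ranges that touch. The sketch below uses a made-up helper name, mergeTouching, and is only an illustration.

```javascript
const text = "I like a problem aa and aaaa";
const regex = /aa|a/gi;

const ranges = [];
for (const m of text.matchAll(regex)) {
  ranges.push({ start: m.index, end: m.index + m[0].length });
}
// "aaaa" shows up as two adjacent matches: {24, 26} and {26, 28}.
console.log(ranges);

// Hypothetical helper: merge ranges where one ends exactly where the next begins.
// Assumes the input is ordered by start, which is how regex matches arrive.
function mergeTouching(sorted) {
  const merged = [];
  for (const r of sorted) {
    const last = merged[merged.length - 1];
    if (last && last.end === r.start) {
      last.end = r.end;          // extend the previous range
    } else {
      merged.push({ ...r });     // start a new range
    }
  }
  return merged;
}

console.log(mergeTouching(ranges));
```

Note that this merges any touching matches, so whether it fits depends on what a "complete match" should mean for the input at hand.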