String concatenation is slow: new memory has to be allocated every time you concatenate two strings. A smart compiler may optimise some of that away, but not much. However, what you are trying to do with that `reduce` already exists, and it's called `join`:

    pattern = pattern.split("").join(".*")
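As a minimal sketch of how the joined pattern behaves (the `fuzzy_match` wrapper is only illustrative, not part of your code):

    // "ace" becomes /a.*c.*e/: the pattern characters must appear in the
    // string in order, with anything allowed in between.
    function fuzzy_match(str, pattern) {
        return new RegExp(pattern.split("").join(".*")).test(str);
    }

    fuzzy_match("abcde", "ace"); // true
    fuzzy_match("abcde", "aec"); // false (wrong order)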
The regex itself can be optimised: `.*` initially grabs as much as possible, then backs off only if the rest doesn't match, which takes more backtracking than necessary to find out. Instead, you could use a reluctant quantifier, `.*?`, which attempts to match as little as possible; if a match is found, it is found without backtracking. However, it still does a lot of backtracking when there is no match. We can do better: `a[^b]*b` (at this point, it doesn't really matter whether the quantifier is greedy). A possessive quantifier (`a[^b]*+b`) would be even better, but JavaScript doesn't support these. Using a character class deprives us of a `join`, but see the next point.

    pattern = pattern.split("").reduce(function(a, b){
        return a + '[^' + b + ']*' + b;
    });
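For example, the pattern "ace" now compiles to `a[^c]*c[^e]*e`: each gap may only consume characters that are not the next pattern character, so failures are detected quickly. A quick sketch under the same assumptions as above:

    // Build the stricter pattern for "ace".
    var source = "ace".split("").reduce(function(a, b){
        return a + '[^' + b + ']*' + b;
    });
    console.log(source);                           // "a[^c]*c[^e]*e"
    console.log(new RegExp(source).test("abcde")); // true
    console.log(new RegExp(source).test("aXcYb")); // false (no trailing "e")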
Since you are complaining about an operation that takes about 3000 ns (as noted in the comments), it's safe to assume you are running a lot of queries. If there are only a few distinct patterns, you can cache the compiled regexes. Underscore.js has a handy function for this, which I'll demonstrate, but you could easily build the cache yourself.

    var cache = _.memoize(function(pattern){
        return new RegExp(pattern.split("").reduce(function(a, b){
            return a + '[^' + b + ']*' + b;
        }));
    });

    function fuzzy_match(str, pattern){
        return cache(pattern).test(str);
    }
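Usage stays the same; repeated calls with the same pattern reuse the compiled regex, since `_.memoize` keys its cache on the first argument by default:

    fuzzy_match("abcde", "ace"); // compiles /a[^c]*c[^e]*e/ once
    fuzzy_match("axcye", "ace"); // cache hit, no recompilation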
If there is a lot of repetition among the tested strings instead, you should use those as the pattern, and the pattern as the tested string. This is even easier. I will also demonstrate how to scope your variables using the export pattern (an immediately invoked function expression that exposes only `fuzzy_match`). Also, a bugfix must be inserted here: we can't assume all characters in the string are alphanumeric, so regex special characters have to be escaped properly.

    var fuzzy_match = (function(){
        var cache = _.memoize(function(str){
            return new RegExp("^" + str.replace(/./g, function(x){
                return /[\-\[\]\/\{\}\(\)\*\+\?\.\\\^\$\|]/.test(x) ? "\\" + x + "?" : x + "?";
            }) + "$");
        });
        return function(str, pattern){
            return cache(str).test(pattern);
        };
    })();
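With the roles reversed, the cache key is the tested string, and every query pattern is tested against its precompiled regex (illustrative values only):

    // For "abcde" the cached regex is /^a?b?c?d?e?$/.
    fuzzy_match("abcde", "ace"); // true  -- "ace" uses characters of "abcde" in order
    fuzzy_match("abcde", "aec"); // false -- wrong order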
Concerning the last regex: given some pattern `"ace"`, the regex you build (`/a.*c.*e/`) tests whether the string contains the characters of the pattern, in the correct order. If you instead want to test whether a given string `"abcde"` is matched by some pattern, the pattern must contain only characters of the string, in the correct order: `/^a?b?c?d?e?$/`. To build this, we regex-escape every regex special character (pattern source: CoolAj86), make every character optional (`?`), and flank the regex with string boundary anchors.
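For instance, a string containing a regex metacharacter, such as "a.b", should produce `^a?\.?b?$` rather than letting the dot act as a wildcard. A small sketch of just the escaping step:

    // Escape metacharacters before appending the optional '?'.
    var body = "a.b".replace(/./g, function(x){
        return /[\-\[\]\/\{\}\(\)\*\+\?\.\\\^\$\|]/.test(x) ? "\\" + x + "?" : x + "?";
    });
    console.log("^" + body + "$"); // "^a?\.?b?$"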