I've got the following string: |Africa||Africans||African Society||Go Africa Go||Mafricano||Go Mafricano Go||West Africa|.
I am trying to write a regular expression that only matches terms that include the word Africa or any derivative of it (meaning yes to all terms above except for |Mafricano| and |Go Mafricano Go|). Each term is enclosed between two | characters.
Right now I've come up with: /\|[^\|]*africa[^\|]*\|/gi, which says:
\| Match |
[^\|]* Match zero to unlimited instances of any character except |
africa Match africa literally
[^\|]* Match zero to unlimited instances of any character except |
\| Match |
I've tried inserting ((?:\s)|(?!\w)) to make it /\|[^\|]*((?:\s)|(?!\w))africa[^\|]*\|/gi. Although it succeeds in excluding |Mafricano| and |Go Mafricano Go|, it also excludes all other entries except for |West Africa| and |Go Africa Go|. That is a good start, but it also needs to include the single-word term |Africa| and its derived forms.
Can anybody help me?
2 Answers
You can use this regex
[^|]*\bAfrica[a-z]*\b[^|]*
var str = "|Africa||Africans||African Society||Go Africa Go||Mafricano||Go Mafricano Go||West Africa|";
var arr = str.match(/[^|]*\bAfrica[a-z]*\b[^|]*/g);
console.log(arr); // ["Africa", "Africans", "African Society", "Go Africa Go", "West Africa"]
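As a rough sketch of why the word boundary works where the attempt in the question did not: \b matches only between a word character and a non-word character, so it accepts the position between | (or a space) and A but rejects the position between M and a. The comparison below reuses the sample string from the question; adding the i flag to the \b pattern is an assumption on my part, so that a lowercase "africa" would be accepted too.

```javascript
const str = "|Africa||Africans||African Society||Go Africa Go||Mafricano||Go Mafricano Go||West Africa|";

// Attempt from the question. The (?!\w) branch is followed directly by
// "africa", whose first character is a word character, so that branch can
// never succeed. Only the \s branch is left, which requires whitespace
// immediately before "africa".
const attempt = str.match(/\|[^\|]*((?:\s)|(?!\w))africa[^\|]*\|/gi);
console.log(attempt); // ["|Go Africa Go|", "|West Africa|"]

// Word-boundary version, with the i flag added (an assumption, see above).
// \b before "africa" rejects "Mafricano" because "M" and "a" are both
// word characters, so there is no boundary between them.
const terms = str.match(/[^|]*\bafrica[a-z]*\b[^|]*/gi);
console.log(terms); // ["Africa", "Africans", "African Society", "Go Africa Go", "West Africa"]
```

Note that this pattern returns the terms without the surrounding | characters, because the pipes are never part of the match.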
I think you want something like this,
\|(?:(?!Mafrica|\|).)*?africa(?:(?!Mafrica|\|).)*?\|
> "|Africa||Africans||African Society||Go Africa Go||Mafricano||Go Mafricano Go||West Africa|".match(/\|(?:(?!Mafrica|\|).)*?africa(?:(?!Mafrica|\|).)*?\|/gi);
[ '|Africa|',
'|Africans|',
'|African Society|',
'|Go Africa Go|',
'|West Africa|' ]
Don't forget to turn on the i modifier to do a case insensitive match.
Explanation:
\|                 '|'
(?:                group, but do not capture (0 or more times):
  (?!              look ahead to see if there is not:
    Mafrica        'Mafrica'
   |               OR
    \|             '|'
  )                end of look-ahead
  .                any character except \n
)*?                end of grouping
africa             'africa'
(?:                group, but do not capture (0 or more times):
  (?!              look ahead to see if there is not:
    Mafrica        'Mafrica'
   |               OR
    \|             '|'
  )                end of look-ahead
  .                any character except \n
)*?                end of grouping
\|                 '|'
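To illustrate how narrowly the lookahead is scoped, here is a small sketch. The sample string and the term "Safrican" are made up for the example and do not come from the question. The pattern above only rules out the literal text Mafrica, so any other prefix still gets through, whereas a per-term word-boundary check (one possible alternative, not the only one) rejects both.

```javascript
// "Safrican" is a made-up term, used only to illustrate the point.
const sample = "|Africa||Safrican||Mafricano||South Africa|";

// Pattern from this answer: only the literal "Mafrica" is excluded.
const hardCoded = sample.match(/\|(?:(?!Mafrica|\|).)*?africa(?:(?!Mafrica|\|).)*?\|/gi);
console.log(hardCoded); // ["|Africa|", "|Safrican|", "|South Africa|"]

// Alternative sketch: split on "|" and keep the terms in which "africa"
// starts at a word boundary. Empty strings between adjacent pipes fail
// the test and are dropped.
const byBoundary = sample
  .split("|")
  .filter((term) => /\bafrica/i.test(term));
console.log(byBoundary); // ["Africa", "South Africa"]
```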
3 Comments
This hard-codes Mafricano, so it would not scale to other entries. Thanks though.
replace? It should be much easier with the match() function.