Addons

You can download XRegExp bundled with all addons as xregexp-all.js. Alternatively, you can download the individual addon scripts from GitHub. XRegExp's npm package uses xregexp-all.js.

Unicode

The Unicode Base script adds base support for Unicode matching via the \p{…} syntax. À la carte token addon packages add support for Unicode categories, scripts, and other properties. All Unicode tokens can be inverted using \P{…} or \p{^…}. Token names are case insensitive, and any spaces, hyphens, and underscores are ignored. You can omit the braces for token names that are a single letter.

Example

// Categories
XRegExp('\\p{Sc}\\pN+'); // Sc = currency symbol, N = number
// Can also use the full names \p{Currency_Symbol} and \p{Number}
// Scripts
XRegExp('\\p{Cyrillic}');
XRegExp('[\\p{Latin}\\p{Common}]');
// Can also use the Script= prefix to match ES2018: \p{Script=Cyrillic}
// Properties
XRegExp('\\p{ASCII}');
XRegExp('\\p{Assigned}');
// In action...
const unicodeWord = XRegExp("^\\pL+$"); // L = letter
unicodeWord.test("Русский"); // true
unicodeWord.test("日本語"); // true
unicodeWord.test("العربية"); // true
XRegExp("^\\p{Katakana}+$").test("カタカナ"); // true
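
The name-matching rule described above (case insensitive, with spaces, hyphens, and underscores ignored) can be pictured with a small sketch. The helper normalizeTokenName is hypothetical and is not part of XRegExp's API; it only illustrates how different spellings of a token name could collapse to one lookup key.

```javascript
// Hypothetical helper (an assumption for illustration, not XRegExp's API):
// strips spaces, hyphens, and underscores, then lowercases the name.
function normalizeTokenName(name) {
  return name.replace(/[- _]+/g, '').toLowerCase();
}

const spellings = [
  'Currency_Symbol',
  'currency symbol',
  'CURRENCY-SYMBOL',
  'currencysymbol',
];

// Every spelling collapses to the same key
const normalized = spellings.map(normalizeTokenName);
const allSame = normalized.every((n) => n === 'currencysymbol');
console.log(allSame); // true
```

Under this rule, \p{Currency_Symbol} and \p{currency symbol} would refer to the same token.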

By default, \p{…} and \P{…} support the Basic Multilingual Plane (i.e. code points up to U+FFFF). You can opt in to full 21-bit Unicode support (with code points up to U+10FFFF) on a per-regex basis by using flag A. In XRegExp, this is called astral mode. You can automatically add flag A for all new regexes by running XRegExp.install('astral'). When in astral mode, \p{…} and \P{…} always match a full code point rather than a code unit, using surrogate pairs for code points above U+FFFF.

// Using flag A to match astral code points
XRegExp('^\\pS$').test('💩'); // -> false
XRegExp('^\\pS$', 'A').test('💩'); // -> true
// Using surrogate pair U+D83D U+DCA9 to represent U+1F4A9 (pile of poo)
XRegExp('^\\pS$', 'A').test('\uD83D\uDCA9'); // -> true
// Implicit flag A
XRegExp.install('astral');
XRegExp('^\\pS$').test('💩'); // -> true
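
The difference between a code unit and a code point can be checked with plain JavaScript, without XRegExp. The sketch below uses only built-in string methods; the native u flag appears purely as an analogy for matching by code point and is not the same thing as XRegExp's flag A.

```javascript
const poo = '💩'; // U+1F4A9, outside the Basic Multilingual Plane

// One code point, but two UTF-16 code units
const codeUnits = poo.length; // 2
const codePoints = Array.from(poo).length; // 1

// The two code units form the surrogate pair U+D83D U+DCA9
const high = poo.charCodeAt(0).toString(16); // 'd83d'
const low = poo.charCodeAt(1).toString(16); // 'dca9'
const full = poo.codePointAt(0).toString(16); // '1f4a9'

// Native analogy: without the u flag, . consumes a single code unit,
// so a pattern anchored around one token cannot span the whole character
const perCodeUnit = /^.$/.test(poo); // false
const perCodePoint = /^.$/u.test(poo); // true

console.log(codeUnits, codePoints, high, low, full, perCodeUnit, perCodePoint);
```

This is why a pattern such as ^\pS$ needs to match by code point before it can accept a character above U+FFFF.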

Opting in to astral mode disables the use of \p{…} and \P{…} within character classes. In astral mode, use e.g. (\pL|[0-9_])+ instead of [\pL0-9_]+.
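
One way to see why a character class is a poor fit for surrogate pairs is to try it with a native regex and no u flag. This is an illustration of the general mechanism, assumed here to be the motivation for the restriction; it does not reproduce XRegExp's internals.

```javascript
const poo = '\uD83D\uDCA9'; // U+1F4A9 written as a surrogate pair

// Inside a class, the pair is read as two independent code units,
// so the class matches either half but never the whole character
const asClass = /^[\uD83D\uDCA9]$/;
const classMatchesWhole = asClass.test(poo); // false
const classMatchesHalf = asClass.test('\uD83D'); // true (lone high surrogate)

// An alternation keeps the pair together as a two-unit sequence
const asAlternation = /^(?:\uD83D\uDCA9|[0-9_])+$/;
const alternationMatches = asAlternation.test(poo + '42_'); // true

console.log(classMatchesWhole, classMatchesHalf, alternationMatches);
```

The alternation has the same shape as the (\pL|[0-9_])+ rewrite suggested above.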

XRegExp.matchRecursive

See API: XRegExp.matchRecursive.

XRegExp.build

See API: XRegExp.build.

XRegExp

The one of a kind JavaScript regular expression library