Thursday, June 18, 2020
Unicode Regular Expressions v21 Released
Regular expressions are a powerful tool for using patterns to search and modify text, and are vital in many programs, programming languages, databases, and spreadsheets.
Starting in 1999, UTS #18: Unicode Regular Expressions has supplied guidelines and conformance levels for supporting Unicode in regular expressions. The new version 21 broadens the scope of properties for regular expressions (regex) to allow for properties of strings (such as for emoji sequences). For example, the following matches all emoji flags except the French flag:
/[\p{RGI_Emoji_Flag_Sequence}--\q{🇫🇷}]/
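To illustrate what the set difference above means, here is a minimal Python sketch. Python's built-in `re` module does not support `\p{RGI_Emoji_Flag_Sequence}`, `--`, or `\q{...}`, so the sketch emulates the idea with ordinary sets of strings. The names `flag`, `SAMPLE_FLAGS`, and `build_pattern` are hypothetical helpers, and the four-flag sample stands in for the real RGI set, which is defined in the Unicode emoji data files.

```python
import re


def flag(region_code: str) -> str:
    """Build an emoji flag sequence from a two-letter region code."""
    # Regional indicator symbols start at U+1F1E6, which corresponds to 'A'.
    return "".join(chr(0x1F1E6 + ord(c) - ord("A")) for c in region_code.upper())


# Illustrative subset only (an assumption for this sketch), not the full RGI set.
SAMPLE_FLAGS = {flag(code) for code in ("FR", "DE", "JP", "US")}


def build_pattern(strings):
    """Compile an alternation that matches any of the given strings."""
    # Longest strings first, so a longer string wins over one of its prefixes.
    alternatives = sorted(strings, key=lambda s: (-len(s), s))
    return re.compile("|".join(re.escape(s) for s in alternatives))


# Emulates [\p{RGI_Emoji_Flag_Sequence}--\q{🇫🇷}] over the sample set:
# a set of strings minus the single string for the French flag.
flags_except_french = build_pattern(SAMPLE_FLAGS - {flag("FR")})

text = f"{flag('FR')} {flag('DE')} {flag('JP')}"
matches = flags_except_french.findall(text)
print(matches)
```

Because each flag is a sequence of two code points, the difference has to be taken over whole strings rather than over individual code points, which is the case the new properties of strings are meant to cover.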
Among the improvements are:
- Provides a new Annex D: Resolving Character Classes with Strings for handling negations of sets of strings.
- Updates the full property list to include the latest UCD properties, plus Emoji properties and UTS #39 properties.
- Removes obsolete text passages, and makes editorial changes for clarity.
Thursday, November 21, 2019
Call for feedback on UTS #18: Unicode Regular Expressions
Regular expressions are a powerful tool for using patterns to search and modify text. They are a key component of many programming languages, databases, and spreadsheets.
Starting in 1999, UTS #18: Unicode Regular Expressions has supplied guidelines and conformance levels for supporting Unicode in regular expressions. A proposed update of that specification is now available for public review and comment. The following are the main modifications in this draft:
- Broadened the scope of properties to allow for properties of strings (as well as properties of code points).
- Added 11 Emoji properties including RGI sets as Full Properties in Level 2.
- Added other new properties as Full Properties in Level 2: Equivalent_Unified_Ideograph, Vertical_Orientation, Regional_Indicator, Indic_Positional_Category, Indic_Syllabic_Category.
- Provided a draft data file with property metadata for matching and validating non-UCD properties and their values for syntax such as \p{pname=pvalue}, so that such properties can be used in the same way as UCD properties. See Annex D.
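As a rough illustration of the last item, the following Python sketch checks a `\p{pname=pvalue}` expression against a small table of property metadata. Everything here is an assumption made for the example: the table `PROPERTY_METADATA`, its contents, and the function `is_valid_property_expression` are hypothetical and do not reproduce the draft data file or its format. Names and values are compared exactly, which is a simplification.

```python
import re

# Hypothetical metadata: property name -> set of accepted values.
# Binary properties are modeled here with the values Yes and No.
PROPERTY_METADATA = {
    "Identifier_Status": {"Allowed", "Restricted"},
    "RGI_Emoji": {"Yes", "No"},
}

# Matches \p{name} or \p{name=value}.
PROPERTY_EXPRESSION = re.compile(r"\\p\{(\w+)(?:=(\w+))?\}")


def is_valid_property_expression(expression: str) -> bool:
    """Return True if the expression names a known property and value."""
    match = PROPERTY_EXPRESSION.fullmatch(expression)
    if match is None:
        return False
    name, value = match.groups()
    allowed = PROPERTY_METADATA.get(name)
    if allowed is None:
        return False  # unknown property name
    if value is None:
        # In this sketch, a bare \p{name} is accepted only for binary properties.
        return allowed == {"Yes", "No"}
    return value in allowed


print(is_valid_property_expression(r"\p{Identifier_Status=Allowed}"))
print(is_valid_property_expression(r"\p{Identifier_Status=Maybe}"))
```

With a table like this, a regex engine could reject unknown names or values for non-UCD properties before compiling the pattern, in the same way it would for UCD properties.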
The review period closes on January 6, 2020. For more information on reviewing and supplying feedback, see Proposed Update UTS #18, Unicode Regular Expressions.