Showing posts with label UTS #10. Show all posts
Friday, April 13, 2018
Last Call on Unicode 11.0 Review
The beta review period for Unicode 11.0 and related technical standards will close
on April 23, 2018. This is the last opportunity for technical comments before
version 11.0 is released in Q2 2018. Implementers and interested parties are
encouraged to download data files, review proposed updates, and submit comments.
Unicode 11.0 adds seven new scripts, including Hanifi Rohingya, and 66 additional emoji characters, including four new components for hair color (for a total of 157 emoji sequences). The set of Georgian Mtavruli capital letters has been added to support modern casing practices.
- For more information about testing the 11.0 beta, see unicode.org/versions/beta-11.0.0.html
- For the current draft summary of Unicode 11.0, see unicode.org/versions/Unicode11.0.0
UAX #14, Unicode Line Breaking Algorithm
- Uses Extended_Pictographic property for future-proofing
- New support for Indic virama handling
- Uses Extended_Pictographic property for future-proofing
- A new table of formal regex definitions
- Refines the use of ZWJ in identifiers
- Broadens the definition of hashtag identifiers
- Five new fields and improved regular expressions
- Document extension of Unihan properties to non-Unihan
- New property Equivalent_Unified_Ideograph
- New regular expressions for Bidi_Paired_Bracket and Equivalent_Unified_Ideograph
- More discussion of emoji variation sequences
- Clarification of values allowed for the Age property
- Updates data to Unicode 11.0
- Clarification of search tailoring in visual-order scripts
- Updates data to Unicode 11.0
- Enhances discussions of joining controls & combining sequences
- Updates data to Unicode 11.0
- Changes the format of the test file for arbitrary input settings
- Updates input setting for Transitional_Processing
- Supplies Extended_Pictographic property for future-proofing
- Simplifies emoji sequence definitions
- EBNF and regular expressions for loose matches
- More proposed guidelines: gender-neutral emoji, skin-tone modifiers, ZWJ visible fallbacks, hair-style components
- Mechanism for changing the “facing” direction for emoji
Tuesday, June 21, 2016
Announcing The Unicode® Standard, Version 9.0
🥂Version 9.0 of the Unicode Standard is now available. Version 9.0 adds exactly 7,500 characters, for a total of 128,172 characters. These additions include six new scripts and 72 new emoji characters.
The new scripts and characters in Version 9.0 add support for lesser-used languages worldwide, including:
- Osage, a Native American language
- Nepal Bhasa, a language of Nepal
- Fulani and other African languages
- The Bravanese dialect of Swahili, used in Somalia
- The Warsh orthography for Arabic, used in North and West Africa
- Tangut, a major historic script of China
- 19 symbols for the new 4K TV standard
- 72 emoji characters such as the following
Smileys & people
🤣 ROLLING ON THE FLOOR LAUGHING, 🤦 FACE PALM
Hand gestures
🤞 HAND WITH INDEX AND MIDDLE FINGERS CROSSED
Animals
🦋 BUTTERFLY
Food
🥑 AVOCADO, 🥘 SHALLOW PAN OF FOOD
Drink
🥂 CLINKING GLASSES
Travel
🛵 MOTOR SCOOTER
Sports
🤸 PERSON DOING CARTWHEEL
For the full list, see emoji additions for Unicode 9.0. For a detailed description of support for emoji characters by the Unicode Standard, see UTR #51, Unicode Emoji.
Three other important Unicode specifications have been updated for Version 9.0:
- UTS #10, Unicode Collation Algorithm — sorting Unicode text
- UTS #39, Unicode Security Mechanisms — reducing Unicode spoofing
- UTS #46, Unicode IDNA Compatibility Processing — compatible processing of non-ASCII URLs
Wednesday, June 17, 2015
Announcing The Unicode® Standard, Version 8.0
Version 8.0 of the Unicode Standard is now available. It includes
41 new emoji characters (including five modifiers for diversity), 5,771 new
ideographs for Chinese, Japanese, and Korean, the new Georgian lari currency
symbol, and 86 lowercase Cherokee syllables. It also adds letters to existing
scripts to support Arwi (the Tamil language written in the Arabic script), the
Ik language in Uganda, Kulango in Côte d’Ivoire, and other languages of
Africa. In total, this version adds 7,716 new characters and six new scripts.
The first version of Unicode Technical Report #51, Unicode Emoji is being released at the same time. That document describes the new emoji characters. It provides design guidelines and data for improving emoji interoperability across platforms, gives background information about emoji symbols, and describes how they are selected for inclusion in the Unicode Standard. The data is used to support emoji characters in implementations, specifying which symbols are commonly displayed as emoji, how the new skin-tone modifiers work, and how composite emoji can be formed with joiners. The Unicode website now supplies charts of emoji characters, showing vendor variations and providing other useful information.
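The modifier and joiner mechanisms mentioned above can be illustrated with a minimal Python sketch. It uses only the standard library. The helper names with_skin_tone and zwj_sequence are hypothetical, chosen for illustration, and the code points used (WAVING HAND SIGN, one Fitzpatrick modifier, and a man/woman/girl sequence) are examples rather than anything prescribed by the announcement.

```python
import unicodedata

ZWJ = "\u200d"                    # ZERO WIDTH JOINER
WAVING_HAND = "\U0001F44B"        # WAVING HAND SIGN
MEDIUM_SKIN_TONE = "\U0001F3FD"   # EMOJI MODIFIER FITZPATRICK TYPE-4


def with_skin_tone(base: str, modifier: str) -> str:
    """Hypothetical helper: a modifier simply follows its base character."""
    return base + modifier


def zwj_sequence(*parts: str) -> str:
    """Hypothetical helper: join emoji with ZWJ to request a composite glyph."""
    return ZWJ.join(parts)


waving = with_skin_tone(WAVING_HAND, MEDIUM_SKIN_TONE)

# MAN, WOMAN, GIRL joined into a single "family" sequence.
family = zwj_sequence("\U0001F468", "\U0001F469", "\U0001F467")

# List the code points that make up each string.
for label, text in (("waving", waving), ("family", family)):
    print(label, [f"U+{ord(ch):04X}" for ch in text])
```

Whether either string is drawn as a single image depends on the font and platform. Where the composite is not supported, the individual characters are typically shown in sequence instead.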
The 41 new emoji in Unicode 8.0 include the following:
Diversity
five emoji modifiers
Faces and Hands
NERD FACE, FACE WITH ROLLING EYES, ROBOT FACE
Food-Related
HOT DOG, TACO, CHEESE WEDGE, POPCORN
Sports
CRICKET BAT AND BALL, VOLLEYBALL, BOW AND ARROW
Animals
UNICORN FACE, LION FACE, CRAB, SCORPION
Religious
MOSQUE, SYNAGOGUE, PRAYER BEADS
(For the full list, including images, see emoji additions for Unicode 8.0.)
Phones and computers often need operating system updates to support new emoji, which may take some time. It is also now clear which existing characters, such as the often requested SHOPPING BAGS, can be used as emoji. Once phones and computers support these characters, people will be able to see colorful images such as the BOTTLE WITH POPPING CORK above.
Three other important Unicode specifications are updated for Version 8.0:
- UTS #10, Unicode Collation Algorithm — for sorting Unicode text
- UTS #39, Unicode Security Mechanisms — for reducing Unicode spoofing
- UTS #46, Unicode IDNA Compatibility Processing — for compatible processing of non-ASCII URLs
Wednesday, September 24, 2014
Proposed Update UAXes for Unicode 8.0
Proposed updates for several of the Unicode Standard Annexes for Version 8.0 of
the Unicode Standard have been posted for public review. See
http://www.unicode.org/review/ for
details and links to the various documents.
UTS #10, Unicode Collation Algorithm has also been posted for public review. In this update, Cyrillic contractions have been removed. See the Modifications section of the draft document for further information.
The review period for feedback on these proposed updates closes on October 20, 2014, ahead of the November UTC meeting. There will be further opportunities for feedback on the annexes after that meeting.
To supply feedback on these issues, please see http://www.unicode.org/review/#feedback
Monday, June 16, 2014
Announcing The Unicode Standard, Version 7.0
Version 7.0 of the Unicode Standard is now available, adding 2,834 new characters. This latest version adds the new currency symbols for the Russian ruble and Azerbaijani manat, approximately 250 emoji (pictographic symbols), many other symbols, and 23 new lesser-used and historic scripts, as well as character additions to many existing scripts. These additions extend support for written languages of North America, China, India, other Asian countries, and Africa. For full details, see http://www.unicode.org/versions/Unicode7.0.0/.
Most of the new emoji characters derive from characters in long-standing and widespread use in Wingdings and Webdings fonts. Additions to emoji characters include, for example:
Major enhancements were made to the Indic script properties. New property values were added to enable a more algorithmic approach to rendering Indic scripts. These include properties for joining behavior, new classes for numbers, and a further division of the syllabic categories of viramas and rephas. With these enhancements, the default rendering for newly added Indic scripts can be significantly improved.
Unicode character properties were extended to the new characters. The old characters have enhancements to Script and Alphabetic properties, and casing and line-breaking behavior. There were also nearly 3,000 new Cantonese pronunciation entries, as well as new or clarified stability policies for promoting interoperable implementations.
Two other important Unicode specifications are maintained in synchrony with the Unicode Standard, and have updates for Version 7.0. These will be released at the same time:
- UTS #10, Unicode Collation Algorithm — the standard for sorting Unicode text
- UTS #46, Unicode IDNA Compatibility Processing — for processing of non-ASCII URLs (IDNs)
Friday, December 13, 2013
Unicode 7.0 Annexes Available for Early Review
As technical work gets underway to prepare the publication of Unicode 7.0 (tentatively scheduled for June 2014), the Unicode Technical Committee has posted proposed updates for several important specifications:
PRI #260, Proposed Update UTS #10, Unicode Collation Algorithm
PRI #261, Proposed Update UAX #15, Unicode Normalization Forms
PRI #262, Proposed Update UAX #44, Unicode Character Database
In UTS #10, collation weights are discussed more generically, with fewer references to the 16-bit weights used in the DUCET. Section 6.3.2, Large Values for Secondary or Tertiary Weights was merged into Section 6.2, Large Weight Values. In UAX #44, the derivation of the Alphabetic property has been updated and the discussion of @missing in Section 4.2.10 @missing Conventions has been simplified to reflect the revised conventions in the UCD data files, which eliminated special edge cases.
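As a rough illustration of how multi-level collation weights turn into a sort key, here is a minimal Python sketch. The weight table is hypothetical and tiny. It is not the DUCET, and sort_key is an illustrative name, not an API from any library. The sketch follows the general shape described in UTS #10: all primary weights first, then a zero separator, then the secondary weights, then the tertiary weights.

```python
# Hypothetical collation elements: (primary, secondary, tertiary).
# These toy weights are assumptions for illustration, not DUCET values.
TOY_TABLE = {
    "a": (1, 1, 1),
    "A": (1, 1, 2),        # same letter, different case: tertiary difference
    "\u00e1": (1, 2, 1),   # a with acute: secondary (accent) difference
    "b": (2, 1, 1),        # different letter: primary difference
    "B": (2, 1, 2),
}


def sort_key(text: str) -> tuple:
    """Build a three-level sort key from the toy table."""
    elements = [TOY_TABLE[ch] for ch in text]
    key = []
    for level in range(3):
        if level > 0:
            key.append(0)  # level separator, lower than any real weight
        # Zero weights are ignorable at that level and are skipped.
        key.extend(ce[level] for ce in elements if ce[level] != 0)
    return tuple(key)


words = ["b", "A", "\u00e1", "a", "ab", "Ab"]
sorted_words = sorted(words, key=sort_key)
print(sorted_words)
```

Because the key lists every primary weight before any secondary or tertiary weight, a base-letter difference anywhere in the string outranks an accent or case difference. Nothing in this scheme depends on the weights fitting in 16 bits.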
Review periods for these new public review issues close January 27, 2014. For details about reviewing and commenting, please see the Public Review Issues page.
http://unicode-inc.blogspot.com/2013/12/unicode-70-annexes-available-for-early.html