The Unicode Blog: 2011

Tuesday, December 13, 2011

Two New Public Review Issues: UTR #36, UTS #39

The Unicode Technical Committee has posted two new issues for public
review and comment. Details are on the following web page:

http://www.unicode.org/review/

Review periods for the new items close on January 30, 2012.

Please see the page for links to discussion and relevant documents.
Briefly, the new issues are:

Issue #208 Proposed Update UTR #36: Unicode Security Considerations
http://www.unicode.org/review/pri208/

This UTR is being prepared for an update to bring the IDNA 2008
references up to date. Public review and comment is invited on this draft.

Issue #209 Proposed Update Unicode Technical Standard #39 Unicode
Security Mechanisms
http://www.unicode.org/review/pri209/

This UTS is being prepared for an update to align with Unicode 6.1.
Public review and comment is invited on this draft.

To supply feedback on these issues, see
http://www.unicode.org/review/#feedback .


Posted by Unicode, Inc. at 12:20 PM


CLDR v21 Milestone 2 available for testing

Milestone releases of CLDR provide an opportunity to test a snapshot of the next version of CLDR; they are not intended for use in production. CLDR v21 is not a data submission release; instead, the CLDR group is engaged in improving tools, and making specific changes to data.

Note that the CLDR v21 release is intended to support Unicode 6.1, and depends on some new Unicode 6.1 property values for grapheme break and line break. This Milestone 2 release depends on values from the beta versions of Unicode 6.1 data files.

New additions in this Milestone 2 release include:

Changes to the segmentation data to match Unicode 6.1. The behaviors associated with the former "th" grapheme break tailoring and "he" line break tailoring have been moved into the root behavior, so those tailorings are no longer necessary and have been deleted.
Two new calendar element structures needed for support of the Chinese lunar calendar (and other calendars such as the Hindu lunar calendars); for more information see http://cldr.unicode.org/development/development-process/design-proposals/chinese-calendar-support:

Addition of the <monthPatterns> element structure to indicate how to modify standard month names to mark intercalary leap months, as well as (for some calendars) months adjacent to leap months and combined months. This is supported via the standard month pattern characters 'M' and 'L', so the pattern character 'l' (SMALL LETTER L) formerly provided as a way to mark leap months has been deprecated (it was never supported by underlying data).
Addition of the <cyclicNameSets> element structure to support cyclic names for years (and other calendar entities in some calendars).

A new "ar_001" locale for Modern Standard Arabic as the default content for "ar". This will permit the "ar_EG" locale (formerly the default content for "ar") to use some Egypt-specific names.
Addition of codes for South Sudan
Other specific data fixes such as for Ukrainian collation, Ewe day periods, various metazones, and some specific translation errors.
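As a rough illustration of the month-pattern idea described above, the sketch below applies a leap-month pattern to a standard month name. The pattern string "{0}bis", the placeholder convention, and the function name are assumptions made for this example; real patterns would come from the locale data.

```python
# Hypothetical leap-month pattern; "{0}" stands for the standard month name.
LEAP_PATTERN = "{0}bis"


def format_month(name: str, is_leap: bool, leap_pattern: str = LEAP_PATTERN) -> str:
    """Return the month name, wrapped in the leap pattern when is_leap is set."""
    if is_leap:
        return leap_pattern.replace("{0}", name)
    return name


print(format_month("4", False))
print(format_month("4", True))
```

The point of the design is that the month names themselves stay unchanged; only the pattern differs between ordinary and intercalary months.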

Highlights in the Milestone 1 release (Sept. 29) included:

Work in support of pending -t- extension in BCP47
Deprecation of 'commonlyUsed' element in timezone names
Removal of "whole-locale" aliases (data for constructing is in supplementaldata.xml)
First cut at incorporating European Ordering Rules (EOR)

The data is available from SVN under "tag/release-21-d02" as described in

http://cldr.unicode.org/index/downloads#latest_draft_version

The full list of changes in this milestone is

http://unicode.org/cldr/trac/query?milestone=21m2

The current draft LDML specification is

http://unicode.org/repos/cldr/trunk/docs/web/tr35.html

Posted by Unicode, Inc. at 12:19 PM


Thursday, October 6, 2011

UAX #15 and Chapter 3 of Unicode Standard Updated Again for Unicode 6.1 Beta

The text of Chapter 3, Conformance of the Unicode Standard core specification, and the Version 6.1.0 Proposed Update for UAX #15, Unicode Normalization Forms have been updated for the ongoing beta review for Version 6.1.0. These latest changes move the example code for various normative Hangul-related algorithms into the immediate vicinity of the definition of those algorithms in the core specification. Please see PRI #191 and PRI #206 for details. http://www.unicode.org/review/pri191/ http://www.unicode.org/review/pri206/
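For readers unfamiliar with the Hangul algorithms mentioned above, here is a minimal Python sketch of arithmetic syllable decomposition. The constants follow the values given in Chapter 3 of the core specification; the function name is my own, and this is an illustration rather than the specification's sample code.

```python
import unicodedata

# Constants for Hangul syllable arithmetic (from Chapter 3 of the standard).
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
N_COUNT = V_COUNT * T_COUNT  # 588
S_COUNT = L_COUNT * N_COUNT  # 11172


def decompose_hangul(syllable: str) -> str:
    """Fully decompose one precomposed Hangul syllable into its jamo."""
    s_index = ord(syllable) - S_BASE
    if not 0 <= s_index < S_COUNT:
        return syllable  # not a precomposed Hangul syllable
    lead = chr(L_BASE + s_index // N_COUNT)
    vowel = chr(V_BASE + (s_index % N_COUNT) // T_COUNT)
    t_index = s_index % T_COUNT
    trail = chr(T_BASE + t_index) if t_index else ""
    return lead + vowel + trail


# Cross-check against the standard library's NFD normalization.
sample = "\uD4DB"
print(decompose_hangul(sample) == unicodedata.normalize("NFD", sample))
```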

Please also note that the public review periods for many open issues, especially the 6.1 beta review, will be closing on October 24 ahead of the November UTC meeting.

Posted by Unicode, Inc. at 2:39 PM


New version of UTR #45 published

A new version of UTR #45, U-Source Ideographs has been published. http://www.unicode.org/reports/tr45/tr45-5.html This version corrects the syntax of U-source identifiers to match ISO/IEC 10646 data files. It also introduces a new syntax with "UCI" labels to identify U-sources for unified ideographs which have been "orphaned" of their original sources.

Posted by Unicode, Inc. at 2:37 PM


Wednesday, October 5, 2011

Proposed Draft UTR #50, Unicode Properties for Vertical Text Layout now available

The Unicode Technical Committee has posted a new issue for public
review and comment. Details are on the following web page:

http://www.unicode.org/review/pri207/

Review period for the new item closes on October 24, 2011.

Please see the page for links to discussion and relevant documents. Briefly, the new issue is:

PRI #207, Proposed Draft UTR #50, Unicode Properties for Vertical Text Layout

The layout of Japanese text follows different conventions than the layout of Western texts. Many of the requirements are described in the W3C Working Group Note "Requirements for Japanese Text Layout". This new proposed draft technical report describes two Unicode character properties which can be used to implement those requirements.

This is a moderated Public Review Issue. An unmoderated discussion takes place on the designated forum:
http://www.unicode.org/forum/viewforum.php?f=35
The moderator of this PRI will summarize the discussion and provide it as formal feedback to the Unicode Technical Committee. Towards the end of the PRI, formal feedback will also be accepted from other parties, if they feel that the summary does not accurately reflect their concerns.

Posted by Unicode, Inc. at 2:14 PM


Friday, September 23, 2011

PRI #185 has been modified

PRI 185 Extension of UBA for improved display of URL/IRIs has been modified as per discussion in the Unicode Technical Committee, based on feedback received. It is also now a moderated public review issue, to allow for extended informal discussion of the issues. For details, see http://www.unicode.org/review/pri185/. A forum has been created for this PRI, and further discussion should take place there, rather than via the email lists. See http://www.unicode.org/forum/viewforum.php?f=34

Posted by Unicode, Inc. at 5:39 PM


Monday, September 19, 2011

Unicode 6.1 Beta Review

Mountain View, CA, USA – September 19, 2011 – The Unicode® Consortium today announced the start of beta review for the forthcoming Unicode 6.1.0. All beta feedback must be submitted by October 24, 2011.

Unicode is the foundation for all modern software and communications around the world, including all modern operating systems, browsers, and smart phones; modern web protocols (HTML, XML,...); and internationalized domain names. Thus it is important to ensure a smooth transition to each new version of the Unicode Standard. Software developers and other experts are strongly encouraged to review the beta data files and documentation for Unicode 6.1.0 carefully, and to provide any feedback regarding errors or other issues to the Unicode Consortium. Software developers can also get an early start in testing their programs with the beta data files so that they will be ready for the release of Unicode 6.1.0 in February, 2012.

See http://www.unicode.org/versions/beta-6.1.0.html, for information about testing the 6.1.0 beta.
See http://www.unicode.org/versions/Unicode6.1.0/ for the current draft summary of Unicode 6.1.0.

About the Unicode Consortium

The Unicode Consortium is a non-profit organization founded to develop, extend and promote use of the Unicode Standard and related globalization standards.

The membership of the consortium represents a broad spectrum of corporations and organizations in the computer and information processing industry. Members are: Adobe Systems, Apple, Google, Government of Andhra Pradesh, Government of Bangladesh, Government of India, IBM, Microsoft, Monotype Imaging, Oracle, Rearden Commerce, SAP, Tamil Virtual University, The Society for Natural Language Technology Research, The University of California (Berkeley), Yahoo!, plus well over a hundred Associate, Liaison, and Individual members.

For more information, please contact the Unicode Consortium http://www.unicode.org/contacts.html.

Posted by Unicode, Inc. at 5:56 PM


Friday, September 2, 2011

Proposed Update UAX #14 now available for Unicode 6.1

The Proposed Update for UAX #14 has been substantially updated with changes intended for Unicode 6.1. These changes include:

Add rule 21a, don't break after Hebrew + hyphen.
Introduce character class HL (Hebrew Letter).
Include the proposal to move small kana from class NS to class ID, to align UAX 14 more closely with CSS "normal" behavior.

Details are on the PRI page for this UAX: http://www.unicode.org/review/pri190/

Posted by Unicode, Inc. at 3:03 PM


PRI #205: Proposed addition of AL MARK and LEVEL DIRECTION MARK

The Unicode Technical Committee has posted a new issue for public review and comment. Details are on the following web page:

http://www.unicode.org/review/pri205/
Review period for the new item closes October 24, 2011. Please see the page for links to discussion and relevant documents. Briefly, the new issue is:

PRI #205 Proposed addition of AL MARK and LEVEL DIRECTION MARK

The UTC is considering proposals for two characters to help address various difficult issues in bidirectional text layout. These two characters are similar to the already-encoded LRM and RLM.

1. The proposed AL MARK (ALM) provides a direction mark with Bidi_Class AL in order to complete the set of direction marks with strong direction classes (L, R, AL). It requires no additional Bidi_Class value or change to the Unicode Bidirectional Algorithm (UBA).

2. The proposed LEVEL DIRECTION MARK (LDM) behaves like a direction mark which dynamically takes on the resolved direction associated with the current embedding level. While the optimum implementation would use a new Bidi_Class value, this is prohibited by the Unicode Character Encoding Stability Policy. Several alternatives are described. Note: In earlier discussions the LDM was referred to as EMBEDDING LEVEL MARK (ELM).

Please see the background document for details, including alternative implementation options.
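To make the "strong direction classes" point concrete, the following Python sketch looks up the Bidi_Class of the two already-encoded marks and of an Arabic letter using the standard library. It does not model the proposed characters; the choice of ARABIC LETTER BEH as the AL example is mine.

```python
import unicodedata

LRM = "\u200E"         # LEFT-TO-RIGHT MARK
RLM = "\u200F"         # RIGHT-TO-LEFT MARK
ARABIC_BEH = "\u0628"  # an ordinary Arabic letter, chosen here as an AL example

# Map each character's name to its Bidi_Class as reported by unicodedata.
classes = {
    unicodedata.name(ch): unicodedata.bidirectional(ch)
    for ch in (LRM, RLM, ARABIC_BEH)
}

for name, bidi_class in classes.items():
    print(f"{name}: {bidi_class}")
```

The existing marks cover L and R only, which is the gap in the set of strong classes that the AL MARK proposal describes.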

Posted by Unicode, Inc. at 3:02 PM


PRI #204: Proposed Update for UTS #46: Unicode IDNA Compatibility Processing

The Unicode Technical Committee has posted a new issue for public review and comment. Details are on the following web page: http://www.unicode.org/review/

Review period for the new item closes on October 24, 2011.

Please see the page for links to discussion and relevant documents. Briefly, the new issue is:

PRI #204 Proposed Update for UTS #46: Unicode IDNA Compatibility
Processing
http://www.unicode.org/review/pri204/

This proposed update aligns with Unicode 6.1, and includes the following changes:

Changed text to be more version-independent.
Updated figures in Table 4.
Added NV8 values to Table 2b, Data File Fields and to Section 8, Conformance Testing.

To supply feedback on this issue, see
http://www.unicode.org/review/#feedback .

Posted by Unicode, Inc. at 3:00 PM


Thursday, September 1, 2011

Proposed Update UAX #42 for Unicode 6.1 now available for public review

Unicode Standard Annex #42, Unicode Character Database in XML, will be updated for Unicode 6.1. The proposed update is now available for general public review and comment.
Review period for this issue closes October 24, 2011. Details are on the following web page: http://www.unicode.org/review/pri198/
Changes include:

Updated the patterns for kIRG_USource and kMandarin.
New values for the jg attribute: Rohingya_Yeh.
New values for the script attribute: Cakm, Merc, Mero, Plrd, Shrd, Sora, Takr.
The values of the ccc are now restricted to 0..254, instead of 0..255.
New value for the age attribute: 6.1.
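The restriction of ccc values to 0..254 can be checked informally against whatever Unicode version ships with a Python installation. The sketch below is only a sanity check under that assumption, not a validation of the XML schema itself; the function name is hypothetical.

```python
import sys
import unicodedata


def max_ccc() -> int:
    """Largest Canonical_Combining_Class value over all code points."""
    return max(unicodedata.combining(chr(cp)) for cp in range(sys.maxunicode + 1))


# A byte-per-code-point table for the Combining Diacritical Marks block.
# Because every real value is at most 254, 255 remains free as a sentinel.
table = bytes(unicodedata.combining(chr(cp)) for cp in range(0x0300, 0x0370))

print(max_ccc() <= 254)
print(255 in table)
```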

To supply feedback on this issue, see http://www.unicode.org/review/#feedback .

Posted by Unicode, Inc. at 5:17 PM


Tuesday, August 30, 2011

PRI #203: Proposed Update UTS #10: Unicode Collation Algorithm

The Unicode Technical Committee has posted a new issue for public review and comment. Details are on the following web page:

http://www.unicode.org/review/

Review period for the new item closes on October 24, 2011.

Please see the page for links to discussion and relevant documents. Briefly, the new issue is:

PRI #203 Proposed Update for UTS #10: Unicode Collation Algorithm
http://www.unicode.org/review/pri203/

This proposed update aligns with Unicode 6.1, and includes the following changes:

A major revision to the ordering of variable characters into groups, separating punctuation and symbols. Some other characters have changed ordering as well.
A new option for sorting, IgnoreSP, that ignores (shifts) only whitespace and punctuation (and not general symbols)
Clarifications or fixes to text on soft-hyphen, contiguous weights, and collation grapheme clusters.
A new section on asymmetric search.
Review notes for new UCA verification tables, and several other areas where the text will be changed.
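The distinction drawn by the IgnoreSP option (ignore whitespace and punctuation, but not general symbols) can be approximated with a toy sort key in Python. This is emphatically not the Unicode Collation Algorithm: real implementations shift collation weights rather than deleting characters, and the use of General_Category prefixes here is a simplifying assumption.

```python
import unicodedata


def ignore_sp_key(text: str) -> str:
    """Toy key: drop whitespace and punctuation (Z*, P*), keep symbols (S*)."""
    return "".join(
        ch
        for ch in text
        if not (ch.isspace() or unicodedata.category(ch)[0] in "ZP")
    )


words = ["co-op", "coop", "co op", "co$op"]
print(sorted(words, key=ignore_sp_key))
```

In this sketch "co-op", "coop", and "co op" all produce the same key, while "co$op" keeps its currency symbol and therefore sorts differently.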

To supply feedback on this issue, see
http://www.unicode.org/review/#feedback .

Posted by Unicode, Inc. at 5:55 PM


Monday, August 29, 2011

New Unicode Acknowledgements Page

The Consortium is pleased to announce a new section on our website devoted to acknowledging the many contributors to the Unicode Standard and its related standards: http://www.unicode.org/acknowledgements/

Expect to see additional changes to our font acknowledgments section, as well as the addition of new names as the information becomes available.

Posted by Unicode, Inc. at 5:27 PM


Friday, August 26, 2011

PRI #202: Extensions to NameAliases.txt for Unicode 6.1.0

The Unicode Technical Committee has posted a new issue for public review and comment. Details are on the following web page:

http://www.unicode.org/review/pri202/

Review period for the new item closes on October 24, 2011.

Please see the page for links to discussion and relevant documents. Briefly, the new issue is:

PRI #202: Extensions to NameAliases.txt for Unicode 6.1.0

The UTC is planning to extend the format and content of the Unicode Character Database file NameAliases.txt for Unicode 6.1.0. In addition to the current scope of NameAliases.txt, which covers the definition of formal name aliases for characters whose names have serious mistakes in them, the intent is to add various standard and de facto aliases for control characters, which have no names defined for them in the Unicode Standard, as well as various character abbreviations which are in widespread use.
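To give a feel for how such an extended file might be consumed, here is a small Python parser. The three-field layout (code point; alias; type) and the sample lines are assumptions made for illustration; consult the draft data file on the PRI page for the actual format.

```python
# Illustrative sample only; the field layout is assumed, not taken from the draft.
SAMPLE = """\
# code point;alias;type
0000;NULL;control
0000;NUL;abbreviation
000A;LINE FEED;control
01A2;LATIN CAPITAL LETTER GHA;correction
"""


def parse_name_aliases(text: str) -> dict:
    """Map each code point to a list of (alias, type) pairs."""
    aliases = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # strip comments and whitespace
        if not line:
            continue
        code, alias, kind = (field.strip() for field in line.split(";"))
        aliases.setdefault(int(code, 16), []).append((alias, kind))
    return aliases


for code_point, entries in parse_name_aliases(SAMPLE).items():
    print(f"U+{code_point:04X}: {entries}")
```

Keeping a list per code point matters because a single control character may carry both a full alias and an abbreviation.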

Details of the proposal and draft data file are available on the public review issue page: http://www.unicode.org/review/pri202/

To supply feedback on this issue, see http://www.unicode.org/review/#feedback .

Posted by Unicode, Inc. at 2:04 PM


Thursday, August 18, 2011

UTS #37 Updated to Version 3

A new version of UTS #37, Unicode Ideographic Variation Selectors is now available. Please see: http://www.unicode.org/reports/tr37/tr37-7.html In this version, there are new requirements for registrations as well as several editorial updates and clarifications.

Posted by Unicode, Inc. at 2:50 PM


Thursday, August 11, 2011

Proposed new characters updated in Pipeline Table

The Pipeline Table for proposed new characters has been updated to reflect recent decisions by the UTC. Changes include the approval of characters for Grantha and several other historic scripts, and many new symbols to cover the popular sets of Wingdings. Please see http://www.unicode.org/alloc/Pipeline.html .

The information in the Pipeline Table is intended only to summarize the interim status of the committee progress on processing new character proposals. The Pipeline Table is a good place to find out what's on the horizon for future versions of the Unicode Standard, before the encoding is finalized. However, because the committee review continues, the information is necessarily subject to change. Products should be created based only on final, public release of a version of the standard.

Posted by Unicode, Inc. at 8:16 PM


Wednesday, August 10, 2011

Public Review Issues Extended

The closing dates of the Public Review Issues for all proposed updates to Unicode Standard Annexes for Unicode Version 6.1 have been extended to October 24, 2011. The text of UAX #44 has also been updated to take into account feedback reviewed at the UTC meeting last week.

http://www.unicode.org/review/

Review periods for the following issues have also been extended to October 24, 2011. The text of PRI #182 and PRI #185 will be changed, and a separate announcement will be made about the changes.

177 Proposed Update UTS #46: Unicode IDNA Compatibility Processing
182 Proposed Update UTS #18: Unicode Regular Expressions
185 Revision of UBA for improved display of URL/IRIs

To supply feedback on any of these issues, see http://www.unicode.org/review/#feedback .

Posted by Unicode, Inc. at 7:11 PM


Wednesday, July 27, 2011

PRI #201 - Proposed Update UTR #45 now available

The Unicode Technical Committee has posted a new issue for public review and comment. Details are on the following web page:

http://www.unicode.org/review/pri201/

A proposed update for UTR #45 has been posted. This update corrects the syntax of U-source identifiers to match ISO/IEC 10646 data files. It also introduces a new syntax with "UCI" labels to identify U-sources for unified ideographs which have been "orphaned" of their original sources.

Note that the review period for the proposed update has been compressed, so that the updated U-source listing and syntax will be available for use in the 10646 3rd Edition. Review feedback should be posted in time for consideration at the UTC meeting scheduled for August 1 - 5, 2011.

To supply feedback on this issue, see http://www.unicode.org/review/#feedback

Posted by Unicode, Inc. at 10:55 AM


Monday, July 25, 2011

Proposed Updates to Unicode Standard Annexes for Unicode 6.1

The proposed update documents for some Unicode Standard Annexes have been updated. These updates include:

UAX #15: Added an implementation note about using 255 as an implementation specific value for optimization of tables containing Canonical_Combining_Class values.

UAX #29: Updated the discussion of legacy grapheme clusters for Thai. Moved the section on Hangul syllable boundary determination to a new section in this UAX, from Chapter 3 of the Core Specification. Made other small editorial fixes.

UAX #31: Supplied the recommended usage for identifiers for new scripts encoded in Unicode 6.1. Clarified the status of lists of Other_ID_Start and Other_ID_Continue characters.

Posted by Unicode, Inc. at 7:18 PM


Call for Bulldog Award Nominations

Dear Unicode community,

As many of you already know, the Unicode Consortium sponsors an annual award for outstanding personal contributions to the philosophy and dissemination of the Unicode Standard. Known as the "Bulldog Award," it is presented at the Unicode conference to recognize "those tenacious champions of Unicode who have produced solid achievements in promoting its use around the globe".

The Consortium invites the Unicode community to nominate up to two people they believe are most deserving of this award. Nominations should include a brief rationale why the candidate would be a good choice. Please check http://unicode.org/conference/bulldog.html to see a list of past winners and send nominations to magda@unicode.org with "Bulldog Award nomination" in the subject line by August 29, 2011.

Executive officers and staff of the Unicode Consortium are not eligible for the award.

Thank you for your time.

Magda Danish

Senior Administrative Director

The Unicode Consortium

Posted by Unicode, Inc. at 7:07 PM


Monday, July 18, 2011

Unicode Releases Common Locale Data Repository, Version 2.0.1

Mountain View, CA, July 18, 2011 - The Unicode® Consortium announced today the release of a new version of the Unicode Common Locale Data Repository (Unicode CLDR 2.0.1), providing key building blocks for software to support the world's languages.

CLDR 2.0.1 is a minor release, with no new translations. It includes about 80 changes that were not ready for CLDR 2.0, including fixes for collation, number spellout, format consistency, and metazone (timezone) data. It also now provides descriptions for all of the bcp47 data items in CLDR. For more information on what else has changed since the 2.0 release, see the CLDR 2.0.1 Release Note.

Unicode CLDR is by far the largest and most extensive standard repository of locale data. This data is used by a wide spectrum of companies for their software internationalization and localization: adapting software to the conventions of different languages for such common software tasks as formatting of dates, times, time zones, numbers, and currency values; sorting text; choosing languages or countries by name; transliterating different alphabets; and many others. Unicode CLDR 2.0.1 is part of the Unicode locale data project, together with the Unicode Locale Data Markup Language (LDML: http://unicode.org/reports/tr35/). LDML is an XML format used for general interchange of locale data, such as in Microsoft's .NET.

For web pages with different views of CLDR data, see http://cldr.unicode.org/index/charts. For more information about the Unicode CLDR project (including charts) see http://cldr.unicode.org/.

Posted by Unicode, Inc. at 7:46 PM


Wednesday, July 13, 2011

PRI #200: Draft UTR #49, Unicode Character Categories

The Unicode Technical Committee has posted a new issue for public review and comment. Details are on the following web page:

http://www.unicode.org/review/

Review period for the new item closes on July 27, 2011.

Please see the page for links to discussion and relevant documents. Briefly, the new issue is:

PRI #200 Draft UTR #49: Unicode Character Categories

This document presents an approach to the categorization of Unicode characters, and documents data files that implementers can use for defining and labeling Unicode character categories.

To supply feedback on this issue, please see http://www.unicode.org/review/#feedback

Posted by Unicode, Inc. at 4:48 PM


Thursday, July 7, 2011

Proposed Update UAXes for Unicode 6.1

Proposed updates for most Unicode Standard Annexes for Version 6.1 of the Unicode Standard have been posted for public review. See http://www.unicode.org/review/ for details and links to the various documents.

Review periods for provision of feedback on these proposed updates close on July 25, 2011 for the August UTC meeting, but there will be further opportunities for feedback on the annexes after that August meeting.

To supply feedback on these issues, please see http://www.unicode.org/review/#feedback

Posted by Unicode, Inc. at 1:58 PM


Tuesday, July 5, 2011

PRI #187: Second registration of sequences for the Hanyo-Denshi collection

The Unicode Consortium has posted a new issue for public review and comment.

Public Review Issue #187: A submission for the "Second registration of sequences for the Hanyo-Denshi collection" has been received by the IVD registrar. This submission is currently under review according to the procedures of UTS#37, Ideographic Variation Database, with an expected close date of October 5, 2011.

Please see the submission page for details and instructions on how to review this issue and provide comments: http://www.unicode.org/ivd/pri/pri187/

For further information on Public Review Issues, please see: http://www.unicode.org/review/

If you wish to discuss issues on the Unicode forum or the Unicode mail list, then please use the following links to subscribe (if necessary). Please be aware that discussion comments on the Unicode mail list are not automatically recorded as input. You must use the reporting link above to generate comments for consideration.

http://www.unicode.org/forum/
http://www.unicode.org/consortium/distlist.html

Posted by Unicode, Inc. at 4:42 PM


Friday, July 1, 2011

PRI #186: Word-Joining Hyphen

The Unicode Technical Committee has posted a new issue for public review and comment. Details are on the following web page:

http://www.unicode.org/review/

Review period for the new item closes on July 27, 2011.

Please see the page for links to discussion and relevant documents. Briefly, the new issue is:

PRI #186 Word-Joining Hyphen

The Unicode Standard has different character properties for line-break and word-break behavior that reflect differences in behavior. The standard specifies that U+2011 NON-BREAKING HYPHEN disallows line breaks. The Unicode Technical Committee is currently considering whether this non-breaking behavior should be broadened to also affect word breaking behavior.

Details concerning this proposal are in the PRI itself:
http://www.unicode.org/review/pri186/

If you have comments for official UTC consideration, please post them by submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

If you wish to discuss issues on the Unicode forum or the Unicode mail list, then please use the following links to subscribe (if necessary). Please be aware that discussion comments on the Unicode mail list are not automatically recorded as input to the UTC. You must use the reporting link above to generate comments for UTC consideration.

http://www.unicode.org/forum/
http://www.unicode.org/consortium/distlist.html

Posted by Unicode, Inc. at 3:13 PM


PRI #185: Revision of UBA for improved display of URL/IRIs

The Unicode Technical Committee has posted a new issue for public review and comment. Details are on the following web page:

http://www.unicode.org/review/

Review period for the new item closes on July 27, 2011.

Please see the page for links to discussion and relevant documents. Briefly, the new issue is:

PRI #185 Revision of UBA for improved display of URL/IRIs

The Unicode Bidirectional Algorithm (UBA), specified in UAX #9, was designed for handling ordinary text, and predated the rise of the web. Unfortunately, IRI/URLs are not ordinary text; they are syntactically complex in ways that don't work well with the UBA. That causes IRIs that contain right-to-left text (such as Arabic or Hebrew) to appear jumbled, to the point where the IRIs are either uninterpretable, misleading, or ambiguous. In particular the ambiguous displays could cause security problems.
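One way to see why IRIs interact poorly with the UBA is to list the Bidi_Class of each character in a mixed-direction IRI. In the Python sketch below, the example IRI and the helper name are hypothetical; the property lookups use the standard library's unicodedata module.

```python
import unicodedata


def bidi_classes(text: str) -> list:
    """Pair every character with its Bidi_Class."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


# Hypothetical IRI: Latin scheme and host, Hebrew letters in the path.
iri = "http://example.com/\u05D0\u05D1/\u05D2"

for ch, bidi_class in bidi_classes(iri):
    print(f"U+{ord(ch):04X} {bidi_class}")
```

The structural delimiters such as "/" and "." come out as weak types rather than strong ones, so their visual position is decided by the surrounding letters instead of by the IRI syntax.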

The background document for this PRI provides a detailed description of the problem, and proposes a solution. The Unicode Technical Committee would like feedback on the feasibility of the proposal, and in particular, on the open issues listed in the background document.

If you have comments for official UTC consideration, please post them by submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

If you wish to discuss issues on the Unicode forum or the Unicode mail list, then please use the following links to subscribe (if necessary). Please be aware that discussion comments on the Unicode mail list are not automatically recorded as input to the UTC. You must use the reporting link above to generate comments for UTC consideration.

http://www.unicode.org/forum/
http://www.unicode.org/consortium/distlist.html

Posted by Unicode, Inc. at 3:10 PM


PRI #184: Proposed update UTS 37 has been updated

The proposed update for UTS #37 has been updated with a minor change in section 3. Please see: http://www.unicode.org/review/pri184/ and the draft for details.
http://www.unicode.org/reports/tr37/tr37-6.html

Posted by Unicode, Inc. at 3:10 PM


Friday, June 24, 2011

Update to PRI #183: Supplementary registration of the AdobeJapan1 collection and of sequences in that collection

The Unicode Consortium has posted a minor update of an issue for public review and comment.

Public Review Issue #183: A minor update of the submission for the "Supplementary registration of the AdobeJapan1 collection and of sequences in that collection" has been received by the IVD registrar. The "complete charts" file (PDF) has been revised with better organization and presentation of supplementary information. The submitted glyphs, sequence identifiers, and base characters remain unchanged. This submission is still under review according to the procedures of UTS#37, Ideographic Variation Database, with an expected close date of September 20, 2011.

Please see the submission page for details and instructions on how to review this issue and provide comments: http://www.unicode.org/ivd/pri/pri183/

For further information on Public Review Issues, please see:
http://www.unicode.org/review/

http://www.unicode.org/forum/
http://www.unicode.org/consortium/distlist.html

Posted by Unicode, Inc. at 1:05 PM


Monday, June 20, 2011

PRI #183: Supplementary registration of the AdobeJapan1 collection and of sequences in that collection

The Unicode Consortium has posted a new issue for public review and comment.

Public Review Issue #183: A submission for the "Supplementary registration of the AdobeJapan1 collection and of sequences in that collection" has been received by the IVD registrar. This submission is currently under review according to the procedures of UTS #37, Ideographic Variation Database, with an expected close date of September 20, 2011.

Please see the submission page for details and instructions on how to review this issue and provide comments: http://www.unicode.org/ivd/pri/pri183/

For further information on Public Review Issues, please see:
http://www.unicode.org/review/

http://www.unicode.org/forum/
http://www.unicode.org/consortium/distlist.html

Posted by Unicode, Inc. at 5:28 PM

Monday, June 6, 2011

Program Announced for 35th Internationalization and Unicode Conference (IUC 35)

Mountain View, CA, USA -- June 1, 2011 -- The Unicode® Consortium today announced the program for the Thirty-fifth Internationalization & Unicode® Conference (IUC 35), taking place in Santa Clara, Calif., USA; October 17-19, 2011, sponsored by Adobe. The conference is produced by OMG®. http://www.unicodeconference.org/conference-at-a-glance.htm
This is the premier conference on technologies and practices for the creation and management of global and multilingual software solutions. This annual event is praised for its excellent technical content, industry-tested recommendations and updates on the latest standards.
The program committee has created an updated program full of new and cutting-edge topics that are relevant and engaging for the internationalization community. The three-day conference will feature a full day of tutorials followed by two days of presentations and discussions. The program features case studies, best practices, effective software design, innovative technology, and information about Unicode 6.0.
Highlights of the Conference:
Tutorials in Three Tracks:

An Introduction to Writing Systems & Unicode
Unicode - a Grand Tour
Internationalization: An Introduction
Web Internationalization - Standards and Best Practices
Internationalization Testing Best Practices
Comprehensive Arabic Script
Smart Code Set Conversions for Unicode Support in Heterogeneous Environments
Building Multilingual Websites in Drupal7 and Joomla1.6
Using ICU Workshop

Tutorial presenters come from such organizations as Amazon, IBM, Intel, DecoType, Jim DeLaHunt & Associates, W3C, XenCraft, and Yahoo! Inc.
Sessions in Three Tracks include I18N, L10N, Platforms, Arabic/Bidi, Windows, Fonts, Languages, Scripts, ICU and East Asia. Some topics that will be covered include:

Social networking is important to your business. Learn more about Twitter and other networks' global support.
Find out about the latest in HTML5, CSS3, Java 7, ICU, and the latest font developments.
Learn about speech internationalization at Google.
Are your localization practices in need of modernization? Learn how to reduce costs and time to market with crowd sourcing, machine translation and more.

Session Presenters come from such organizations as Adobe, Aoyama Gakuin University, Beijing UniHan Digital Tech Co. Ltd., Brill, DecoType, Diwan Software, Google, Inc., Government of India, HighTech Passport, IBM, Intel, Lab126, Microsoft, SIL International, Teradata, Tiro Typeworks, Twitter, Inc., University of Michigan, and W3C.
The Internationalization & Unicode Conference is the premier technical conference for both software and Web internationalization. Unicode and internationalization experts, implementers, clients and vendors are invited to attend this unique conference. The interactive format makes the Internationalization & Unicode Conference a great place to meet and exchange ideas with leading experts, find out about the needs of potential clients, or get information about new and existing Unicode and internationalization-enabled products.
IUC 35 is sponsored by Gold Sponsor Adobe and Media Sponsor Multilingual magazine. Additional sponsorships and exhibit space are available; for more information on sponsoring contact Ken Berk at ken.berk@omg.org, +1-781-444 0404. For exhibiting questions email event_marketing@omg.org. For all other questions email info@unicodeconference.org.
###
About the Unicode Consortium The Unicode Consortium is a non-profit organization founded to develop, extend and promote use of the Unicode Standard and related globalization standards.
The membership of the consortium represents a broad spectrum of corporations and organizations in the computer and information processing industry. Members are: Adobe Systems, Apple, Google, Government of Bangladesh, Government of India, IBM, Microsoft, Monotype Imaging, Oracle, Rearden Commerce, SAP, The Society for Natural Language Technology Research, The University of California (Berkeley), Yahoo!, plus well over a hundred Associate, Liaison, and Individual members.
For more information, please contact the Unicode Consortium http://www.unicode.org/contacts.html.
About the Event Producer OMG® is the Event Producer for the Internationalization & Unicode Conferences. OMG is an open membership, not-for-profit consortium that produces and maintains computer industry specifications for interoperable enterprise applications. Our specifications include MDA®, UML®, CORBA®, MOF™, XMI® and CWM™. OMG's specifications are all available for download by everyone without charge.
For more information about OMG, visit us online at http://www.omg.org.

Posted by Unicode, Inc. at 9:20 AM

Thursday, May 26, 2011

PRI #184: Proposed Update UTS #37, Unicode Ideographic Variation Database

The Unicode Technical Committee has posted a new issue for public review and comment. Details are on the following web page:

http://www.unicode.org/review/

Review periods for the new items close on July 25, 2011.

Please see the page for links to discussion and relevant documents. Briefly, the new issue is:

184 Proposed Update UTS #37, Unicode Ideographic Variation Database (IVD)

This update to UTS #37 will clarify how Ideographic Variation Sequences (IVSes) can be shared across IVD collections, that no IVD collection has special status, and that implementers can support any subset of the registered IVSes.

It will further amend the specification to allow duplicate sequence identifiers within an IVD collection under particular circumstances, and to allow registrants to supply multiple representative glyphs for IVSes in their IVD collections. Finally, the registration procedures will be strengthened to require registrants to supply representative glyphs for registered IVSes and to supply a data file as part of a submission.
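
For readers unfamiliar with the mechanics, an IVS is a base ideograph followed by a variation selector from the range U+E0100..U+E01EF. The following is a minimal sketch, not part of UTS #37: the function names are hypothetical, the sample pair is only an illustration, and the code does not check that the base is a unified ideograph or that the sequence is registered in the IVD.

```python
# Variation selectors used in IVSes: VS17..VS256 (U+E0100..U+E01EF).
IVS_SELECTOR_FIRST = 0xE0100
IVS_SELECTOR_LAST = 0xE01EF


def split_ivs(text):
    """Return (base, selector) if text is one character followed by one
    variation selector in U+E0100..U+E01EF, otherwise None.

    Hypothetical helper: it does not verify that the base is an ideograph
    or that the pair is a registered sequence.
    """
    if len(text) != 2:
        return None
    base, selector = text[0], text[1]
    if IVS_SELECTOR_FIRST <= ord(selector) <= IVS_SELECTOR_LAST:
        return base, selector
    return None


def format_ivs(text):
    """Format a base + selector pair as space-separated hex code points."""
    parts = split_ivs(text)
    if parts is None:
        raise ValueError("not a base + variation selector pair")
    return " ".join("%04X" % ord(ch) for ch in parts)


# Illustrative pair: U+845B followed by VS17 (U+E0100).
sample = "\u845B\U000E0100"
print(format_ivs(sample))
```

An implementation that supports only a subset of registered IVSes would typically look such pairs up in its own table and fall back to the base character's default glyph when the pair is not found.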

If you have comments for official UTC consideration, please post them by submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

If you wish to discuss issues on the Unicode forum or the Unicode mail list, then please use the following links to subscribe (if necessary). Please be aware that discussion comments on the Unicode mail list are not automatically recorded as input to the UTC. You must use the reporting link above to generate comments for UTC consideration.

http://www.unicode.org/forum/
http://www.unicode.org/consortium/distlist.html

Posted by Unicode, Inc. at 5:05 PM

Wednesday, May 25, 2011

Unicode Releases Common Locale Data Repository, Version 2.0

Mountain View, CA, May 25, 2011 - The Unicode® Consortium announced today the release of a new version of the Unicode Common Locale Data Repository (Unicode CLDR 2.0), providing key building blocks for software to support the world's languages. The main feature of CLDR 2.0 is improved data for the top 55 languages, with an increase of over 45% in data fields. The details are found in the CLDR 2.0 Release Note (http://cldr.unicode.org/index/downloads/cldr-2-0).

Unicode CLDR is by far the largest and most extensive standard repository of locale data. This data is used by a wide spectrum of companies for their software internationalization and localization: adapting software to the conventions of different languages for such common software tasks as formatting of dates, times, time zones, numbers, and currency values; sorting text; choosing languages or countries by name; transliterating different alphabets; and many others. Unicode CLDR 2.0 is part of the Unicode locale data project, together with the Unicode Locale Data Markup Language (LDML: http://unicode.org/reports/tr35/). LDML is an XML format used for general interchange of locale data, such as in Microsoft's .NET.

For web pages with different views of CLDR data, see http://cldr.unicode.org/index/charts. For more information about the Unicode CLDR project (including charts) see http://cldr.unicode.org/.

Posted by Unicode, Inc. at 3:20 PM

Extended Public Review Issues, 177, 179, 182

The review periods for three public review issues have been extended to July 25, 2011, for review at the August UTC meeting.

The affected PRIs are:

177 Proposed Update UTS #46: Unicode IDNA Compatibility Processing

179 Changes to Unicode Regular Expression Guidelines

182 Proposed Update UTS #18: Unicode Regular Expressions

There is an updated version of the proposed update for UTS #18, Unicode Regular Expressions available at http://unicode.org/reports/tr18/proposed.html . The update only has editorial changes; the UTC did not have enough time to review all of the feedback on this document at its last meeting, and will be reviewing it at the August meeting.

If you have comments for official UTC consideration, please post them by submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

If you wish to discuss issues on the Unicode forum or the Unicode mail list, then please use the following links to subscribe (if necessary). Please be aware that discussion comments on the Unicode mail list are not automatically recorded as input to the UTC. You must use the reporting link above to generate comments for UTC consideration.

http://www.unicode.org/forum/
http://www.unicode.org/consortium/distlist.html

Posted by Unicode, Inc. at 8:06 AM

Friday, April 29, 2011

Unicode Consortium announces Localization Interoperability Technical Committee

The Unicode Consortium announces a new technical committee, the Unicode Localization Interoperability Technical Committee. Localization of software information is a key part of the adoption of most software offerings in many countries. The purpose of the new committee is to ensure interoperable data interchange for critical localization-related assets, such as language segmentation, translation source strings, translated strings, and translation memories.

The initial focus of the Unicode Localization Interoperability (ULI) Technical Committee is on the improved interoperability of translation memories in the TMX format, segmentation rules that use the SRX format, and translation source strings and resulting translated strings that use the XLIFF format.

The ULI Technical Committee will establish profiles of use for TMX, SRX, and XLIFF. The committee will develop and publish specifications that document specific usage conventions that can be shared for interoperability. This will improve data interchange through more consistent implementations and will enhance the usefulness of these three standards.

For information on how to join the ULI effort and get involved in its work, contact the Unicode Consortium with the contact form (see http://www.unicode.org/reporting.html) and ask about the ULI.

To become a voting participant in the work of the ULI committee, join Unicode in one of the three voting categories of membership: Full, Institutional, or Supporting. See http://www.unicode.org/consortium/join.html .

For more details about the ULI, see: http://unicode.org/uli/

Posted by Unicode, Inc. at 3:23 PM

Tuesday, April 12, 2011

PRI #182: Proposed Update UTS #18, Unicode Regular Expressions

The Unicode Technical Committee has posted a new issue for public review and comment. Details are on the following web page:

http://www.unicode.org/review/pri182/

Review period for this item closes on May 6, 2011.

Please see the page for links to discussion and relevant documents. Briefly, the new issue is:

PRI #182: Proposed Update UTS #18

A proposed update for UTS #18, Unicode Regular Expressions has been posted by the Unicode Consortium at http://unicode.org/reports/tr18/proposed.html . Feedback on this draft is welcome, especially from regular expression engine developers and users. The main changes in this update include:

* Changes in conformance clauses for PRI #179, Changes to Unicode Regular Expression Guidelines
http://www.unicode.org/review/pri179/
* A new conformance clause for Level 2: RL2.7 Full Properties.
* Simplification of the definition of \p{blank} in Annex C Compatibility Properties.
* Clarifications of various conformance issues.
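
As an illustration of the \p{blank} item above, here is a minimal sketch that assumes the simplified definition amounts to "Space_Separator characters plus CHARACTER TABULATION"; that reading is an assumption to check against the draft, and `is_blank` is a hypothetical helper name.

```python
import unicodedata


def is_blank(ch):
    """Assumed reading of \\p{blank}: general category Zs, or U+0009 TAB."""
    return ch == "\t" or unicodedata.category(ch) == "Zs"


for ch in (" ", "\t", "\u3000", "\n", "a"):
    print("U+%04X" % ord(ch), is_blank(ch))
```

Line terminators such as U+000A are deliberately outside the set under this reading.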

The review period for feedback closes on May 6, 2011; feedback on the document will be reviewed at the May Unicode Technical Committee meeting.

If you have comments for official UTC consideration, please post them by submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

If you wish to discuss issues on the Unicode forum or the Unicode mail list, then please use the following links to subscribe (if necessary). Please be aware that discussion comments on the Unicode mail list are not automatically recorded as input to the UTC. You must use the reporting link above to generate comments for UTC consideration.

http://www.unicode.org/forum/
http://www.unicode.org/consortium/distlist.html

Posted by Unicode, Inc. at 5:45 PM

Friday, April 1, 2011

PRI 181: Changing General Category of Twelve Characters

The Unicode Technical Committee has posted a new issue for public review and comment. Details are on the following web page:

http://www.unicode.org/review/pri181/

Review periods for the new items close on May 2, 2011.

Please see the page for links to discussion and relevant documents. Briefly, the new issue is:

The UTC has decided to change the general category of twelve characters, and is requesting public feedback on the proposed changes. Details of the proposal and the list of affected characters are in the background document: http://www.unicode.org/review/pri181/

If you have comments for official UTC consideration, please post them by submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

If you wish to discuss issues on the Unicode forum or the Unicode mail list, then please use the following links to subscribe (if necessary). Please be aware that discussion comments on the Unicode mail list are not automatically recorded as input to the UTC. You must use the reporting link above to generate comments for UTC consideration.

http://www.unicode.org/forum/
http://www.unicode.org/consortium/distlist.html

Posted by Unicode, Inc. at 3:20 PM

Friday, March 25, 2011

PRI #180: Proposed addition of address form metadata to Unicode CLDR

The Unicode Consortium has posted a new issue for public review and comment. Details are on the following web page:

http://www.unicode.org/review/pri180/

Review period for the new item closes on April 11, 2011.

Please see the page for links to discussion and relevant documents. Briefly, the new issue is:

Issue #180 Proposed addition of address form metadata to Unicode CLDR

The Unicode Consortium is considering the addition to CLDR of address form metadata. This metadata is intended for presenting a form for users to fill in with address data. The format and data is being donated by Google. The consortium is soliciting feedback on these changes. Feedback should be submitted as comments to http://unicode.org/cldr/trac/ticket/3572.

The detailed proposal and background information can be found at this link: http://www.unicode.org/review/pri180/

If you have comments for official consideration, please follow the instructions and the link in the text of the PRI.

Posted by Unicode, Inc. at 5:23 PM

Monday, March 14, 2011

Unicode Releases Common Locale Data Repository, Version 1.9.1

The Unicode CLDR 1.9.1 maintenance release is now available. See http://cldr.unicode.org/index/downloads/cldr-1-9-1 for details.

The next major release is CLDR 2.0, scheduled for late May. The CLDR 2.0 release does involve general data submission, which is currently in progress. For the latest schedule, see http://cldr.unicode.org .

Posted by Unicode, Inc. at 12:54 PM

Wednesday, March 2, 2011

Public Review Issue #179: Change to Unicode Regular Expression Guidelines

The Unicode Technical Committee has posted a new issue for public review and comment. Details are on the following web page:

http://www.unicode.org/review/pri179/

Review periods for the new items close on May 2, 2011.

Please see the page for links to discussion and relevant documents. Briefly, the new issue is:

PRI #179: Changes to Unicode Regular Expression Guidelines

The Unicode Consortium is considering changes to UTS #18 Unicode Regular Expressions (http://www.unicode.org/reports/tr18/). These proposed changes have arisen in connection with questions about case-insensitive and canonical-equivalent matching.

The proposed changes eliminate some requirements on implementations of Unicode Regular Expressions which have proven to be problematic in implementations, and add clarifications. The consortium is soliciting feedback on these changes.
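
To make the two notions concrete, the sketch below compares strings while ignoring both case and canonical-equivalence differences, using the pattern NFD(casefold(NFD(x))). It illustrates the concepts only; it is not the text of UTS #18 and not how a regular expression engine would necessarily implement them. The helper names are hypothetical.

```python
import unicodedata


def loose_key(s):
    """Comparison key that ignores case and canonical-equivalence differences."""
    decomposed = unicodedata.normalize("NFD", s)
    # Case folding can disturb normalization, so normalize once more afterwards.
    return unicodedata.normalize("NFD", decomposed.casefold())


def loosely_equal(a, b):
    return loose_key(a) == loose_key(b)


composed = "\u00C5"      # LATIN CAPITAL LETTER A WITH RING ABOVE
decomposed = "a\u030A"   # LATIN SMALL LETTER A + COMBINING RING ABOVE

print(composed == decomposed)               # plain code point comparison
print(loosely_equal(composed, decomposed))  # case- and equivalence-insensitive
```

A pattern matcher faces the harder version of this problem, because it must apply the same equivalences to partial matches and character classes rather than to whole strings.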

If you have comments for official UTC consideration, please post them by submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

If you wish to discuss issues on the Unicode mail list, then please use the following link to subscribe (if necessary). Please be aware that discussion comments on the Unicode mail list are not automatically recorded as input to the UTC. You must use the reporting link above to generate comments for UTC consideration.

http://www.unicode.org/consortium/distlist.html

Posted by Unicode, Inc. at 3:07 PM

Friday, February 18, 2011

Unicode Version 6.0 - Complete Text of Core Specification Published

Mountain View, CA, February 17, 2011 - The Unicode® Consortium is pleased to announce the publication of the final text of the core specification for Unicode 6.0. The Unicode 6.0 core specification includes information on scripts newly encoded in Unicode 6.0, as well as many updates and clarifications to other sections of the text. The release of the core specification completes the definitive documentation of the Unicode Standard, Version 6.0.

In Version 6.0, the standard grew by 2,088 characters. Over 1,000 of these characters are symbols used for text exchange on mobile phones. The Unicode Standard now also includes the recently created official symbol for the Indian rupee. After computers and mobile phones update to Version 6.0, the rupee sign will be available for use like the $ or € now.

In addition, this version adds many CJK Unified Ideographs in common use in China, Taiwan, and Japan, as well as characters for African language support, including extensions to the Tifinagh, Ethiopic, and Bamum scripts. Three scripts are supported for the first time: Mandaic, Batak, and Brahmi.

In October of 2010, the other portions of Unicode 6.0 were released: the Unicode Standard Annexes, code charts, and the Unicode Character Database. This allowed vendors to update their implementations of Unicode 6.0 as quickly as possible.

For more information on all of The Unicode Standard, Version 6.0, see http://www.unicode.org/versions/Unicode6.0.0/

Posted by Unicode, Inc. at 11:34 AM

Tuesday, February 8, 2011

Call for Participation: 35th Internationalization and Unicode Conference

Mountain View, CA, USA – February 8, 2011 – The Unicode® Consortium today announced a call for participation in the Thirty-fifth Internationalization & Unicode® Conference (IUC 35), taking place in Santa Clara, Calif., USA; October 17-19, 2011, sponsored by Adobe. The conference is produced by OMG®.

This is the premier conference on technologies and practices for the creation and management of global and multilingual software solutions. This annual event is praised for its excellent technical content, industry-tested recommendations and updates on the latest standards.

The Program Committee is soliciting proposals for presentations that describe case studies, best practices, effective software design, innovative technology, or important standards. Tutorial presentations are also welcome. Suitable topics include, but are not limited to:

Application Areas

• Designing software platforms, operating systems, software as a service (SaaS), or programming environments
• Social networks
• Search engines, SEO, discovery and navigation best practices
• Websites and web services
• Libraries and education
• Mobile applications, including iPhone, Android, iPad, Kindle, Windows Mobile, etc.
• Publishing and broadcasting for a global audience
• Internationalized Domain Names and other identifiers
• Security concerns and practices
• Semantic Web
• Voice to text, text to voice
• Machine translation
• Unicode, encodings, scripts, character properties, and algorithms

General Techniques

• Advances in technologies, algorithms or methodologies
• Using internationalization libraries and programming environments
• Handling bidirectional or other complex scripts
• Dealing with data formats: XML, JSON, HTML5, DITA, and upcoming standards
• Project management and methodologies for global development teams e.g. Agile
• Best practices in localization process and technology
• Best practices in world-ready development, test, and deployment
• Improving globalization capabilities within organizations
• Approaches for migrating legacy applications to global markets
• Font development and Typography

Culture and Technology

• Endangered Languages
• Unencoded Languages
• Case studies and research on cross-culture communication
• Digital Divide

Regional Considerations

• Languages of Africa, Asia, and the Middle East
• Locales and the Unicode Common Locale Data Repository (CLDR)
• Emoji support

Details of the call for participation are available at: http://www.unicodeconference.org/iuc35call . Interested individuals or organizations are invited to submit a brief (up to 600 word) abstract of their proposed conference presentation by Friday, March 25 using this web form: http://www.unicodeconference.org/abstracts .

The Program Committee will notify authors by Wednesday, April 20. Final presentation materials will be required from selected presenters by Wednesday, August 3. The conference agenda will be available by Wednesday, May 4 at: http://www.unicodeconference.org/ .

Sponsorships and exhibit space are available; for more information on sponsoring contact Ken Berk at ken.berk@omg.org, +1-781-444 0404. For exhibiting questions email event_marketing@omg.org. For all other questions email info@unicodeconference.org .

For more information, please contact the Unicode Consortium
http://www.unicode.org/contacts.html .

About the Event Producer

OMG® is the Event Producer for the Internationalization & Unicode Conferences. OMG is an open membership, not-for-profit consortium that produces and maintains computer industry specifications for interoperable enterprise applications. Our specifications include MDA®, UML®, CORBA®, MOF™, XMI® and CWM™. OMG’s specifications are all available for download by everyone without charge.

For more information about OMG, visit us online at http://www.omg.org .

Note to editors: Unicode Standard, Unicode and the Unicode Logo are trademarks of Unicode, Inc. Unicode Consortium is a registered trademark of Unicode, Inc. OMG and Object Management Group are trademarks of Object Management Group. All other trademarks are the property of their respective owners.

Posted by Unicode, Inc. at 2:37 PM

Opening of Data Submission for CLDR 2.0

The Unicode Consortium is announcing the opening of data submission for the 2.0 release of Unicode CLDR, the Common Locale Data Repository. Data for the CLDR project is collected via the CLDR survey tool.

The CLDR survey tool is at http://unicode.org/cldr/apps/survey

The data submission is currently scheduled to run from now until March 9, 2011, after which the data will go into vetting mode in order to resolve any conflicts that may occur in the data.

If you plan to submit data for the 2.0 release of CLDR you will need a login ID and password associated with your e-mail address.

1. If you have used the Survey Tool (ST) in a previous release of CLDR, your login ID and password are still active.

2. If you need to set up a new account, please see the instructions at http://cldr.unicode.org/index/survey-tool/accounts

A summary of the new features in 2.0 release of CLDR survey tool is at http://cldr.unicode.org/index/survey-tool/survey-tool-2-0-new-features

Posted by Unicode, Inc. at 1:43 PM

Thursday, February 3, 2011

Unicode CLDR announces Survey Tool 2.0 BETA

The Unicode Consortium is announcing the beta test release of the survey tool for data submission for the 2.0 release of Unicode CLDR, the Common Locale Data Repository.

The CLDR survey tool is at http://unicode.org/cldr/apps/survey

The beta test period allows users to get comfortable with and provide feedback on the user interface before attempting to input real data. Any data entered during the BETA period will be discarded. The CLDR committee plans to switch over to production mode on Monday February 7, 2011, unless we find some critical error during beta period that would prevent that.

If you plan to submit data for the 2.0 release of CLDR you will need a login ID and password associated with your e-mail address.

1. If you have used the Survey Tool (ST) in a previous release of CLDR, your login ID and password are still active.

2. If you need to set up a new account, please see the instructions at http://cldr.unicode.org/index/survey-tool/accounts

A summary of the new features in 2.0 release of CLDR survey tool is at http://cldr.unicode.org/index/survey-tool/survey-tool-2-0-new-features

Posted by Unicode, Inc. at 3:01 PM

Public Review Issue #178: Collation Rules for Non-Latin Scripts in Unicode CLDR

The Unicode CLDR Technical Committee has posted a new issue for public review and comment. Details are on the following web page:

http://www.unicode.org/review/pri178/

Review period for the new item closes on February 28, 2011.

Please see the page for links to discussion and relevant documents. Briefly, the new issue is:

PRI #178 Collation Rules for Non-Latin Scripts in Unicode CLDR

In Unicode CLDR 1.9 and earlier versions, the collation order for given languages changes the order of characters within a script, but doesn't change the order of scripts. In Unicode CLDR 2.0 and later versions, there is the capability to customize the collation order by re-ordering one or more scripts with respect to other scripts. The proposal addressed by this Public Review Issue is to change the customized collation order for certain languages in the Unicode CLDR data tables.
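
The difference between reordering characters within a script and reordering whole scripts can be sketched with a toy sort key. Everything here is an assumption made for illustration: the script detection uses three hard-coded code point ranges, and code point order stands in for real collation weights, which CLDR derives from the Unicode Collation Algorithm rather than from code points.

```python
def script_of(ch):
    """Toy script classifier based on a few code point ranges (illustrative only)."""
    cp = ord(ch)
    if 0x0370 <= cp <= 0x03FF:
        return "Grek"
    if 0x0400 <= cp <= 0x04FF:
        return "Cyrl"
    return "Latn"  # everything else is lumped together in this sketch


def make_sort_key(script_order):
    """Build a key function that ranks scripts in the given order."""
    rank = {script: i for i, script in enumerate(script_order)}

    def key(word):
        # Primary: rank of the character's script; secondary: code point.
        return [(rank[script_of(ch)], ord(ch)) for ch in word]

    return key


words = ["zebra", "яблоко", "alpha", "ωμέγα"]

latin_first = make_sort_key(["Latn", "Grek", "Cyrl"])
cyrillic_first = make_sort_key(["Cyrl", "Latn", "Grek"])

print(sorted(words, key=latin_first))
print(sorted(words, key=cyrillic_first))
```

Only the relative position of the script blocks changes between the two orderings; the order of words within each script stays the same.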

If you have comments for official consideration, please post them by submitting your comments through the Unicode CLDR reporting form:

http://unicode.org/cldr/trac/newticket

If you wish to discuss issues on the Unicode CLDR mail list, then please use the following link to subscribe (if necessary). Please be aware that discussion comments on the mail list are not automatically recorded as input to the technical committee. You must use the reporting link above to generate comments for consideration.

http://www.unicode.org/consortium/distlist.html

Posted by Unicode, Inc. at 2:59 PM

Tuesday, January 25, 2011

Save the Date: 35th Internationalization and Unicode Conference: Santa Clara, CA, USA - 10/17-19, 2011

Mountain View, CA, USA - January 25, 2011 - The Unicode® Consortium today announced that the Thirty-fifth Internationalization and Unicode Conference (IUC) will take place in Santa Clara, Calif., USA at the Hyatt Regency Hotel on October 17-19, 2011, sponsored by Adobe. This is the premier conference on technologies and practices for the creation and management of global and multilingual software applications.

This annual event focuses on software and Web globalization. It brings together internationalization experts, tools vendors, software implementers, and business and program managers from around the world. Expert practitioners and industry leaders present detailed recommendations for businesses looking to expand to new international markets and those seeking to improve time to market and cost-efficiency of supporting existing markets. Recent conferences have provided specific advice on designing software for European countries, Latin America, China, India, Japan, Korea, the Middle East, and emerging markets.

This highly rated conference features excellent technical content, industry-tested recommendations and updates on the latest standards and technology. Subject areas include cloud, upgrading to HTML5, integrating with social networking software, and implementing mobile apps. This year's conference will also highlight new features in Unicode Version 6 and other relevant standards published this year.

The Call for Participation will be issued shortly. The abstract submission deadline will be March 25, 2011. For more information about IUC 35, please visit http://www.unicodeconference.org/save-the-date.

About the Event Producer

OMG® is the Event Producer for the Internationalization & Unicode Conferences. OMG is an open membership, not-for-profit consortium that produces and maintains computer industry specifications for interoperable enterprise applications. Our specifications include MDA®, UML®, CORBA®, MOF(tm), XMI® and CWM(tm). OMG's specifications are all available for download by everyone without charge.

For more information about OMG, visit us online at http://www.omg.org.

Posted by Unicode, Inc. at 5:22 PM

Tuesday, January 18, 2011

Corrected mapping data for Unihan_Variants.txt now available

The Unicode Consortium announces the availability of corrected mapping data for Unihan_Variants.txt. Due to a production problem, many of the traditional and simplified CJK mappings posted in Unihan_Variants.txt as part of the Unihan Database (Unihan.zip) for Unicode 6.0 have corrupted values in them. A corrected version of this mapping data has been posted at: http://www.unicode.org/Public/6.1.0/ucd/Unihan_Variants-6.1.0d1.txt

It is anticipated these corrected mappings will be incorporated in the next version of the Unicode Standard. In the interim, applications which make use of the traditional and simplified CJK mappings in the Unihan Database may wish to correct their mappings based on the revised data file. The traditional and simplified CJK mappings are classified as provisional properties in the Unicode Character Database. Users of provisional properties are cautioned that their use is at the implementer's risk.
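
Applications that want to rebuild their tables from the revised file could do so along the following lines. This is a minimal sketch that assumes the usual Unihan layout of tab-separated lines (code point, field name, space-separated values) with `#` comment lines. The function name and the inline sample are hypothetical, and a real application would read the downloaded file instead.

```python
def parse_variants(lines):
    """Collect kSimplifiedVariant / kTraditionalVariant mappings.

    Returns {field_name: {source_char: [target_chars]}}.
    """
    wanted = ("kSimplifiedVariant", "kTraditionalVariant")
    table = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        source, field, values = line.split("\t", 2)
        if field not in wanted:
            continue
        src = chr(int(source[2:], 16))  # drop the "U+" prefix
        targets = [chr(int(v[2:], 16)) for v in values.split()]
        table.setdefault(field, {})[src] = targets
    return table


# Inline sample in the assumed format (not the full data file).
SAMPLE = [
    "# sample lines",
    "U+4E07\tkTraditionalVariant\tU+842C",
    "U+842C\tkSimplifiedVariant\tU+4E07",
]

mappings = parse_variants(SAMPLE)
print(mappings["kSimplifiedVariant"]["\u842C"])
```

Because these mappings are provisional properties, it is prudent to keep the parsed data separate from any normative data an application relies on.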

Posted by Unicode, Inc. at 5:32 PM

Wednesday, January 12, 2011

Unicode Discussion Forum

The Unicode Forum provides a new, more open means for the community of Unicode users and experts to ask questions and discuss topics. For Unicode users, the forum provides an indexed, categorized and easily searchable means of accessing information about the Unicode Standard, related specifications and their use. For experts, the forum provides a place to discuss the desirability and ramifications of proposed future extensions to the standard, as well as a place to discuss the state of the art in implementing various features.

To view the forum and participate in discussions, please direct your browser to this URL: http://www.unicode.org/forum/ . Registration is free. As a registered user, you can set up RSS feeds from the forum or subscribe to e-mail notification for topics of interest.

Posted by Unicode, Inc. at 8:04 AM
