Monday, August 21, 2023
Volunteer Spotlight!
Lorna Evans, SIL International
Lorna first became involved with Unicode in 2000 as a conference participant. Her enthusiasm led her to volunteer as a lecturer at a Unicode conference, and for the past several years as an active proposal contributor and committee member. Lorna’s heart and passion are to assist digitally disadvantaged communities by bringing their language fonts, characters, and sounds to the Unicode standard.
Lorna began her language journey typesetting Bibles in Ethiopia in 1990 and was fascinated by the multiple fonts and characters needing representation. When she heard that Unicode was moving to support Ethiopic characters, she had to get involved.
When asked what she is most proud of, Lorna said, “Anytime I do a proposal to Unicode, it feels like the most important thing (for that language community).” She thrives on research and feels that with each proposal, she is bringing digital access to people who need it most. Lorna is completely self-taught and currently focused on documenting Arabic script. She describes SIL International, an Associate member of the Unicode Consortium and her current employer, as the former kings of creating custom encoded fonts, and she is working diligently to help SIL transition to Unicode.
As for her time involved with other Unicode staff and volunteers, she has enjoyed the camaraderie and attending technical committee and editorial meetings. Lorna is an active member of the Script Ad Hoc Subcommittee, as well as the primary representative to Unicode for SIL International.
Lorna shared that she grew up in Bolivia and says that salteñas, a savory pastry filled with beef stew, is still her favorite food.
Editor’s Note: We appreciate and thank Lorna for taking time to tell us a little about herself as well as her years of contributions.
🌻🌻🌻🌻🌻 SUPPORT UNICODE 🌻🌻🌻🌻🌻
Finally, if you are already a contributor or member of Unicode (or your company or organization is), thank you, Danke, děkujeme, धन्यवाद, merci, 谢谢你, grazie, நன்றி, and gracias! What we accomplish is only possible because of supporters like you. To support Unicode’s mission to ensure everyone can communicate in their languages across all devices, please consider adopting a character, making a gift of stock, or making a donation.
As Unicode, Inc. is a US-based open source, open standards, non-profit, 501(c)3 organization, your contribution may be eligible for a tax deduction.
Please consult with a tax advisor for details.
Make your adoption today!
Tuesday, August 15, 2023
Unicode Consortium Board Votes to Elevate ICU4X to Technical Committee
Across the globe, people are using alternative ways to get online, such as smartphones, smart watches, and other compact devices. Formed as a Subcommittee of the ICU Technical Committee in 2020, ICU4X is a modular, lightweight, and secure library that brings internationalization to client-side and resource-constrained environments, written in Rust with bindings into many programming languages.
“We are all very excited about the ICU4X project. It dramatically expands the number of apps and systems that can easily deploy internationalization: with a smaller, modular footprint and the advantages of security and performance from Rust.” — Mark Davis, Co-founder and Board Chair
At its July meeting, the Board agreed that the ICU4X Subcommittee needed to have the authority to make technical decisions relating to the ICU4X architecture, structure, and coding, and voted that the ICU4X Subcommittee be elevated to the level of Technical Committee with the authority to make such decisions, effective immediately.
The ICU4X Technical Committee will be responsible for the design and implementation of the library, with the goal to ensure that mobile devices and other low-resource devices can access scalable internationalization services. It is particularly applicable for devices that cannot run full ICU (C/C++ and Java) — and for those in emerging markets with more limited resources and digitally disadvantaged languages.
Shane Carr (Google) is the ICU4X Technical Committee Chair, with Zibi Braniecki (Amazon) and Nebojša Ćirić (Google) serving as Vice Chairs.
Congratulations to the ICU4X team!
Tuesday, August 8, 2023
Unicode Technology Workshop — Call for Submissions and Registration Open!
Save the Date! November 7-8, 2023. Bay Area (Hosted at Google)
About the Workshop
Join us in person for two days of community building around the Unicode technology that makes software work for billions of people. Expect two days of workshops, seminars, free-form discussions, and lightning talks centered around i18n libraries, locale data frameworks, globalization tooling, localization pipelines, input methods, and text rendering. Network with the developers and users to help shape the future of Unicode technology.
This is a new type of event for Unicode, with a focus on building more connections within the internationalization community. Expect to come away with deeper knowledge on how to solve tough problems in the i18n and l10n space and how to engineer products that work better for global users. GILT professionals, especially those who build or use Unicode technologies, are encouraged to attend and to host sessions. To encourage maximum collaboration amongst the attendees, this is an in-person-only event.
Call for Submissions
For those interested in participating in and contributing to the event, the call for submissions is now open. If you work on Unicode internationalization technologies or use Unicode internationalization technologies in your work, we want to hear from you. You can register your interest in contributing using the following link.
About the Unicode Consortium
The Unicode Consortium is the premier non-profit open source, open standards body for the internationalization of all software and services.
For more than 30 years, the Unicode Consortium has coordinated the efforts of a world-wide team of volunteer programmers and linguists to standardize, evolve, and maintain a global software foundation that allows virtually every computer system and service to help people connect using their native language.
For additional information about Unicode, visit home.unicode.org.
Wednesday, August 2, 2023
622 New CJK Ideographs to be Available in Unicode Version 15.1
The Unicode Standard will include 622 new CJK characters in Version 15.1, which will be released on September 12, 2023. The characters are in a new block, CJK Unified Ideographs Extension I, with code point assignments as reflected in the proposal document.
The characters in the Extension I block have been deemed to be very urgently needed for use in China. The Extension I proposal was based on characters that appeared in a draft amendment of China’s mandatory GB 18030 standard. For this reason, the Unicode Technical Committee (UTC) considered it imperative to arrive at a stable encoding for these characters as quickly as possible.
With the Unicode 15.1 beta review period completed, and with endorsements from liaison partners, China Electronics Standardization Institute (CESI) and ISO/IEC JTC 1/SC 2, UTC at its recent meeting was able to commit to including these characters for this next release of the Unicode Standard. The code point assignments are now stable, and vendors can begin working on implementations with confidence.
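For implementers who want a quick way to spot the new characters in their data, a minimal sketch follows. It assumes the assigned range for CJK Unified Ideographs Extension I is U+2EBF0 through U+2EE5D; the post itself does not list the range, so verify it against the published code charts. The helper names are hypothetical.

```python
# Assumed range for CJK Unified Ideographs Extension I (not stated in the
# post; check it against the Unicode 15.1 code charts before relying on it).
EXT_I_FIRST = 0x2EBF0
EXT_I_LAST = 0x2EE5D

# Number of code points in the assumed range.
EXT_I_SIZE = EXT_I_LAST - EXT_I_FIRST + 1


def is_cjk_ext_i(ch: str) -> bool:
    """Return True if the single character ch lies in the assumed range."""
    return EXT_I_FIRST <= ord(ch) <= EXT_I_LAST


def count_ext_i(text: str) -> int:
    """Count the characters of text that lie in the assumed range."""
    return sum(1 for ch in text if is_cjk_ext_i(ch))
```

Under this assumption the range spans 622 code points, which matches the character count given in the announcement. A check like this only tells you the characters are present; rendering them still depends on font support.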
The Unicode Consortium would like to thank experts in UTC’s CJK & Unihan Group and ISO’s Ideographic Research Group (IRG) for their expedited work in preparing a proposal for encoding CJK Extension I, and would also like to thank Mr. Chen Zhuang and partners in CESI for their cooperation in this process.
Thursday, June 15, 2023
ICU 73.2 & CLDR 43.1 released: GB18030 compliance updates & compatibility fixes
- ICU is the premier library for software internationalization, used by a wide array of companies and organizations to support the world's languages, implementing both the latest version of the Unicode Standard and of the Unicode locale data (CLDR).
- CLDR provides key building blocks for software to support the world's languages (dates, times, numbers, sort-order, etc.). All major browsers and all modern mobile phones use CLDR for language support. (See Who uses CLDR?)
- CLDR extends the support for “short” Chinese sort orders to cover some additional, required characters for Level 2. This is carried over into ICU collation.
- ICU has a modified character conversion table, mapping some GB18030 characters to Unicode characters that were encoded after GB18030-2005.
- There are optional variants of time formats with AM/PM (only for English) using ASCII spaces in CLDR that can also be used in ICU via custom data generation. This is intended to help certain implementers transition to the improved patterns, which have used a narrow no-break space between the time and AM/PM since CLDR 42.
  - For how to generate ICU data with this option, look for alt="ascii" in tools/cldr/cldr-to-icu/README.md
- The changes to the word segmentation behavior of the @ sign that were in CLDR 42 (ICU 72) have been reverted. These caused problems for certain parsers that did not expect @ to join to letters.
For details, please see:
- ICU 73.2 Release Note: ICU 73.2 maintenance release
- CLDR 43.1 Release Note: Version 43.1 Changes
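Code that consumes formatted times and expects an ASCII space before AM/PM can also be made tolerant of the narrow no-break space (U+202F) on the consuming side. The following is a minimal sketch of that idea, not part of ICU or CLDR; the function name and the sample string are hypothetical and not actual ICU output.

```python
NNBSP = "\u202f"  # NARROW NO-BREAK SPACE, used between the time and AM/PM


def to_ascii_spaces(formatted_time: str) -> str:
    """Replace narrow no-break spaces with ASCII spaces.

    Intended for legacy parsers only; text shown to users should keep
    the original separator.
    """
    return formatted_time.replace(NNBSP, " ")


# Illustrative input, written by hand rather than produced by ICU.
sample = "3:45" + NNBSP + "PM"
normalized = to_ascii_spaces(sample)
```

Normalizing at the parsing boundary keeps the improved patterns for display while older code continues to work.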
Thursday, June 1, 2023
Unlocking the Power of CLDR Person Name Formatting: A Solution for Formatting Names in a Globalized World
CLDR Person Names has moved from “tech preview” to “draft” status and is available for initial testing by implementors through ICU4J.
How a person’s name is displayed and used can convey respect, familiarity, or even be interpreted as rude if used improperly. That’s why it’s important to format names correctly, especially because naming practices vary across the globe. In many cultures, names can indicate gender, status, birthplace, nationality, ethnicity, religion, and more.
Until now, there have been no good standards for how to format people’s names in various contexts. A number of Unicode members wanted to address this problem and provide a mechanism that anyone could use to format people’s names in a wide variety of applications, such as contact lists, air travel, billing applications, CRMs, social media, and any other application that asks for user information and presents it back to the user or others.
The Unicode® Person Name Formats defines patterns used to take a person’s name and format it correctly in a given language or locale depending on a chosen context. With the Unicode Common Locale Data Repository (CLDR), locale codes and name sequences can be selected to create a specific pattern for formatting a person’s name — including preferences for formal, informal, or abbreviated versions. As a result, designers and developers can correctly display names according to the user’s native locale and culture, especially important when integrating names in different character scripts, such as Japanese, Chinese, or Russian.
The Unicode Consortium added Person Name formatting to CLDR in v42, and it has been refined and enhanced for v43, which was released in April. In CLDR v43, with the help of linguists from around the world, we completed data for formatting people’s names for CLDR locales at modern coverage. Its formal name is "Unicode Technical Standard #35 Unicode Locale Data Markup Language (LDML); Part 8: Person Names". ICU has added the PersonNameFormatter class, which is available in ICU 73.
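To make the idea of locale-dependent name patterns concrete, here is a toy sketch. It is not the ICU PersonNameFormatter API and it does not use real CLDR data: the pattern table, the field names, and the `format_name` helper are all assumptions made up for illustration.

```python
# Hypothetical patterns keyed by (locale, formality). These are invented
# for illustration and are NOT taken from CLDR.
PATTERNS = {
    ("en", "formal"): "{given} {surname}",
    ("en", "informal"): "{given}",
    ("ja", "formal"): "{surname}{given}",  # surname first, no separator
}


def format_name(name: dict, locale: str, formality: str) -> str:
    """Fill the pattern chosen by locale and formality with name fields.

    `name` maps field names ("given", "surname") to strings.
    Raises KeyError if no pattern exists for the combination.
    """
    pattern = PATTERNS[(locale, formality)]
    return pattern.format(**name)


english = format_name({"given": "Mary", "surname": "Smith"}, "en", "formal")
japanese = format_name({"given": "太郎", "surname": "山田"}, "ja", "formal")
```

The point of the sketch is only that the same name fields produce different output once the pattern is selected by locale and context. The real specification covers many more fields and options than this toy table.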
To learn more, and get an idea of the implications for user experience and application design, see the following paper, which provides an illustration of the many contexts in which names can be formatted through CLDR Person Names.
LDML (UTS#35) Part 8: Person Names - a story teller’s case study
Tuesday, May 23, 2023
Unicode 15.1 Beta Review Open
Normally at this phase of a release, the character repertoire is considered stable and very unlikely to change. Also, the plan for Unicode 15.1 had been for a minor release with only a very limited set of new characters.
Recent developments have led to a tentative change in those plans, however.
China has a very urgent need for encoding of certain CJK ideographs used in public services databases. To accommodate this urgent need, the Unicode Technical Committee (UTC) decided at its April 2023 meeting to encode 603 new characters in Unicode 15.1 as CJK Unified Ideographs Extension I. This new block is included in the delta charts for the Unicode 15.1 beta. However, inclusion of these characters in Unicode 15.1 is contingent on support for this addition from China, and on support for this addition in the corresponding ISO/IEC 10646 standard from ISO/IEC JTC 1/SC 2 at their upcoming meeting in June. While support for the new block is anticipated, there is a small chance that minor changes to this repertoire will be made after the beta, or that UTC will pull this block entirely from the 15.1 release.
Several of the Unicode Standard Annexes have significant modifications and associated data changes for version 15.1. For example, UAX #14, Unicode Line Breaking Algorithm has significant enhancements to support line breaking at orthographic syllable boundaries in several South and Southeast Asian scripts. Also, in conjunction with the parallel development of a new standard, UTS #55, Unicode Source Code Handling (see Public Review Issue #474), there are significant revisions to UAX #31, Unicode Identifiers and Syntax that will provide better specifications and guidance related to security, and also improved guidance for applications that define identifier systems using Unicode.
While draft content for the beta has been published as of May 23rd, the work groups preparing updates to the content could continue to make changes to data or specs during the Beta review period. Any substantive changes for the beta will be frozen by June 5th.
Please review the documentation, adjust your code, test the data files, and report errors and other issues to the Unicode Consortium by July 4, 2023. The review period will only be for six weeks, so prompt feedback is appreciated. Feedback instructions are on the beta page.
See https://www.unicode.org/versions/beta-15.1.0.html for more information about testing and providing feedback on the 15.1.0 beta.
See https://www.unicode.org/versions/Unicode15.1.0/ for the current draft summary of Unicode Version 15.1.0.
Tuesday, May 16, 2023
LDML (UTS#35) Part 7: Keyboards
Today, every platform must independently evaluate, prioritize, and implement all new or updated keyboard layouts, leading to major inconsistencies and delays especially where digitally disadvantaged languages are concerned. Consequently, language communities and other keyboard authors must see their designs developed independently for every platform/operating system, resulting in unnecessary duplication of technical and organizational effort.
“Keyboard 3.0” is designed from the ground up to be usable as a solution to support both hardware and on-screen (touch) layouts for all platforms in a single source file for each language.
With Keyboard 3.0, leading members of the language communities will be able to submit their layout once to CLDR, and it will be available to all platforms as part of the latest version of CLDR, making adoption much easier for platforms. Platform vendors will not need to develop and maintain their own keyboard layout data, especially for languages that they don’t yet support.
This work contributes to the goals of the United Nations International Decade of Indigenous Languages by improving the path for Digitally Disadvantaged Language communities to develop platform support for their languages. Users should see improvements in consistency between platforms, as layouts can be shared.