Friday, October 13, 2017

New Gold Sponsor comprigo

The Unicode Consortium is pleased to announce that comprigo is now a gold sponsor for:
comprigo's  sponsorship directly funds the work of the Unicode Consortium in enabling modern software and computing systems to support the widest range of human languages. There are approximately 7,000 living human languages. Fewer than 100 of these languages are well-supported on computers, mobile phones, and other devices. AAC donations are used to improve support for digitally disadvantaged languages, and to help preserve the world’s linguistic heritage.
As a product- and price comparison website, comprigo redefines the assets of online shopping by helping our customers make the best possible purchase decision. With our permanent adoption we want to make a statement and support the unique visual semantics of Unicode Consortium in a world where visual communication becomes more and more important. As a globally active software company, we want to support Unicode by not just preserving linguistic heritage, but also by enabling intercultural communication. The comprigo Moneybag is a perfect representation of what users gain from using our services: purchase the best product for the best price.  — comprigo
The Unicode Consortium thanks comprigo for their support!

All sponsors are listed on Sponsors of Adopted Characters. More than 128,000 other characters are available for adoption — see Adopt a Character

Monday, September 25, 2017

Proposed Draft UTR #53, Unicode Arabic Mark Ordering Algorithm Now Available for Public Review

The Unicode Consortium has released Proposed Draft Unicode Technical Report #53, Unicode Arabic Mark Ordering Algorithm. This UTR describes an algorithm for determining correct rendering of Arabic combining mark sequences.

The combining classes of Arabic combining characters in Unicode are a mixture of special classes for specific marks plus two more generalized classes for all the other marks. For many years this has resulted in inconsistent rendering for sequences with multiple combining marks such as:


The algorithm described in this UTR provides a method to reorder Arabic combining marks in order to accomplish the following goals:
  • Display combining marks in the expected visual order when the inside-out rendering rule is applied.
  • Ensure identical display of canonically equivalent sequences.
  • Provide a mechanism for overriding the display order in exceptional cases.
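As a rough illustration of the combining classes and canonical equivalence mentioned above (not of the UTR #53 algorithm itself), the following Python sketch uses only the standard unicodedata module. The sample sequence (beh with shadda and fatha) and the helper names are chosen here for illustration and do not come from the report.

```python
import unicodedata

# Sample characters picked for illustration (an assumption, not taken from the UTR).
BEH = "\u0628"           # ARABIC LETTER BEH
FATHA = "\u064E"         # ARABIC FATHA, a mark with its own special class
SHADDA = "\u0651"        # ARABIC SHADDA, a mark with its own special class
MADDAH_ABOVE = "\u0653"  # falls in the generalized "above" class
HAMZA_BELOW = "\u0655"   # falls in the generalized "below" class


def combining_classes(text):
    """Return (code point, canonical combining class) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.combining(ch)) for ch in text]


def canonically_equivalent(a, b):
    """Two strings are canonically equivalent if their NFD forms are identical."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# The same marks typed in two different orders.
typed_one = BEH + SHADDA + FATHA
typed_two = BEH + FATHA + SHADDA

print(combining_classes(typed_one))
print(canonically_equivalent(typed_one, typed_two))
```

Because the two strings are canonically equivalent, a renderer that follows the goals listed above would be expected to display them identically, whichever order the marks were typed in.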
The document is in “Proposed Draft” state, and made available for public review and comment. Information about this type of document can be found on the About Unicode Technical Reports page.

For information about how to discuss this Public Review Issue and how to supply formal feedback, please see the PRI #359 page.

Monday, September 18, 2017

Unicode CLDR 32α available for testing

The alpha version of Unicode CLDR 32 is available for testing. The alpha period lasts until the beta release on September 27, which will include updates to the LDML spec. The final release is expected on October 19.

CLDR 32 provides an update to the key building blocks for software supporting the world's languages. This data is used by all major software systems for their software internationalization and localization, adapting software to the conventions of different languages for common software tasks.

CLDR 32 included a Survey Tool data collection phase, with a resulting significant increase in data size, especially for emoji names/annotations and geographic subdivision names. Other enhancements include rule-based number formats for additional languages, a new “disjunctive” list style (a, b, or c), and fixes for Chinese collation and transliteration. The draft release page at http://cldr.unicode.org/index/downloads/cldr-32 lists the major features, and has pointers to the newest data and charts. It will be fleshed out over the coming weeks with more details, migration issues, known problems, and so on. Particularly useful for review are:
Please report any problems that you find using a CLDR ticket. We'd also appreciate it if programmatic users of CLDR data download the xml files and do a trial integration to see if any problems arise.
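To make the new “disjunctive” list style (a, b, or c) concrete, here is a minimal Python sketch. The four patterns are hard-coded assumptions modeled on English; they are not read from the CLDR xml files, and the function name is hypothetical rather than part of any CLDR or ICU API.

```python
# Hypothetical English "or" list patterns, hard-coded for illustration only.
OR_PATTERNS = {
    "2": "{0} or {1}",       # exactly two items
    "start": "{0}, {1}",     # first item joined to the rest
    "middle": "{0}, {1}",    # items in the middle
    "end": "{0}, or {1}",    # last two items
}


def format_disjunction(items, patterns=OR_PATTERNS):
    """Join items into an 'a, b, or c' style list using start/middle/end patterns."""
    items = list(items)
    if not items:
        return ""
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        return patterns["2"].format(items[0], items[1])
    # Build from the end of the list toward the front.
    result = patterns["end"].format(items[-2], items[-1])
    for i in range(len(items) - 3, 0, -1):
        result = patterns["middle"].format(items[i], result)
    return patterns["start"].format(items[0], result)


print(format_disjunction(["a", "b", "c"]))
```

A trial integration against the real data would load the patterns for each locale from the downloaded files instead of hard-coding them.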

Wednesday, August 30, 2017

New Emoji Subcommittee Vice-Chairs

The Emoji Subcommittee (ESC) is on the front lines of Unicode emoji. It is responsible for accepting requests for new emoji and emoji sequences, helping requesters to fill out missing areas in their proposals, and providing prioritized recommendations to the Unicode Technical Committee.

Peter Edberg is stepping down as the co-chair of ESC, a role he has filled since its inception. He is one of the key people involved in Unicode emoji since the very beginning, so we are very lucky that he will continue as one of the technical leaders of ESC, and remain the co-author of “Unicode Emoji” (UTS #51). To ensure the smooth operation of the ESC, we have three eminently-qualified new vice-chairs: Jeremy Burge, who has been responsible for crafting and refining proposals from the most popular requests received at Emojipedia; Jennifer 8 Lee, who has played a pivotal role in developing, inspiring, and mentoring emoji requests through Emojination; and Tayfun Karadeniz, who has researched, organized, and shepherded emoji in the very popular areas of smileys and human-form emoji. All three have already made lasting contributions to the work of the ESC, and we welcome them in their new roles.

— Mark Davis, ESC chair

Over 100,000 characters are available for adoption, to help the Unicode Consortium’s work on digitally disadvantaged languages.

[fortune cookie badge]

Friday, August 25, 2017

Unicode 10.0 Paperback Available

[Unicode 10.0 Cover Art] The Unicode 10.0 core specification is now available in paperback book form with a new, original cover design. This edition consists of a pair of modestly priced print-on-demand volumes containing the complete text of the core specification of Version 10.0 of the Unicode Standard.

Each of the two volumes is a compact 6×9 inch US trade paperback size. The two volumes may be purchased separately or together, although they are intended as a set. The cost for the pair is US $16.85, plus postage and applicable taxes. Please visit the description page to order.

Note that these volumes do not include the Version 10.0 code charts, nor do they include the Version 10.0 Standard Annexes and Unicode Character Database, which are freely available on the Unicode website.

Purchase The Unicode Standard, Version 10.0 - Core Specification

Thursday, August 24, 2017

Gold Sponsor SMU Guildhall



The Unicode Consortium is pleased to announce that SMU Guildhall is now a gold sponsor for:


SMU Guildhall's  sponsorship directly funds the work of the Unicode Consortium in enabling modern software and computing systems to support the widest range of human languages. There are approximately 7,000 living human languages. Fewer than 100 of these languages are well-supported on computers, mobile phones, and other devices. AAC donations are used to improve support for digitally disadvantaged languages, and to help preserve the world’s linguistic heritage.
SMU Guildhall is the #1 graduate school for video game design, the first in the world to offer a master's degree in interactive technology, and the only program with specializations in all four cornerstones of game development: Art, Design, Programming, and Production. Their game industry faculty turn a passion for gaming into a viable and fulfilling career by mentoring students through the two year program and 3+ team game projects. Over 700 Guildhall alumni have worked at over 250 game studios globally.  — SMU Guildhall
The Unicode Consortium thanks SMU Guildhall for their support!

All sponsors are listed on Sponsors of Adopted Characters. More than 128,000 other characters are available for adoption — see Adopt a Character

Tuesday, August 22, 2017

Keynote Speaker Announced for IUC 41


Can We Escape Alphabetic Order? Chinese I.T. Before and After Unicode

Thomas S. Mullaney
Associate Professor of Chinese History, Stanford University

Drawing upon more than a decade of research in the fields of Chinese and Non-Western information technology, Stanford historian Tom Mullaney maps out the parallel and still-uncharted worlds of Chinese IT. Worlds where text encoding systems have never been hidden from the average user’s view, but instead where they are directly manipulated, command-line-style, by hundreds of millions of code-conscious users. Worlds where the mythology of “plaintext” was never allowed to take hold, and where everyone knows that ‘WYS’ is not ‘WYG’. With vivid examples pulled from the archives of telegraphy, computing, machine translation, digital typography, and more, he will give a guided tour of China’s 200-year-old quest for a stable information order, posing a fundamental question along the way: Why has the Chinese script proven so difficult to encode, and what does this tell us about the fundamental inequalities that are still baked into our modern-day information order?

About IUC 41, October 16-18, 2017: For twenty-six years the Internationalization & Unicode® Conference (IUC) has been the preeminent event highlighting the latest innovations and best practices of global and multilingual software providers. Please join us for our 41st conference! This year's event is being held on October 16-18, 2017 in Santa Clara, California. Read more.

Monday, August 21, 2017

Unicode Consortium Announces Cover Design

The Unicode Consortium is pleased to announce the new design selected for the cover of the forthcoming print-on-demand publication of The Unicode Standard, Version 10.0. The Unicode Consortium issued an open call for artists and designers to submit cover design proposals. All submitted designs were reviewed by an independent panel.

[cover art by Kosala Senevirathne]
The selected cover artwork is an original design by Kosala Senevirathne, art director and graphic designer at Mooniak, a design and art direction studio in Colombo, Sri Lanka, that focuses on multilingual design. The design is about the spirit of the Unicode Standard: a universal standard that enables equal opportunity for discussion and discourse in the writing systems of the world.

Two runner-up designs by Diana Gomez and Maitray Shah were also selected. Diana Gomez is currently a senior in Mechanical Engineering at the University of California, Berkeley. Maitray Shah is a graduate student at San Jose State University pursuing a master's degree in Software Engineering.

Diana Gomez:
[Diana Gomez]
Maitray Shah:
[Maitray Shah]

Wednesday, August 16, 2017

Unicode Emoji 6.0 initial drafts / Draft Candidate chart updated

Emoji 6.0 is starting development, and initial drafts of the specification and data files are available. In the specification and data, a new property is added that helps to “future-proof” segmentation for emoji. The specification also contains more proposed guidelines: for gender-neutral emoji, the application of skin-tone modifiers, and others.

There are two types of emoji: characters and sequences. While these appear and behave similarly for users, they are released on different time schedules.
  • Emoji characters at Draft Candidate status are targeted at Unicode 11.0 (due in June 2018). These characters are “short-listed”. The Emoji Candidates chart has been updated with these characters, and feedback is solicited on names, keywords, and ordering. They will be reviewed at the October UTC meeting and are on track for Final Candidate status.
  • Emoji sequences may be released as a part of Emoji 6.0. The exact content and release schedule of Emoji 6.0 have yet to be determined: it could appear earlier than Unicode 11.0. Proposals for new sequences for Emoji 6.0 were presented in L2/17-287 and will be reviewed at the October UTC meeting. Other proposals may be considered at that meeting.
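The difference between an emoji character and an emoji sequence can be seen by counting code points. The short Python sketch below is only an illustration; the two sample emoji and the helper names are chosen here and are not taken from the Emoji 6.0 drafts.

```python
ZWJ = "\u200D"  # ZERO WIDTH JOINER, which glues emoji into a single sequence

BROCCOLI = "\U0001F966"                                  # one emoji character
WOMAN_SCIENTIST = "\U0001F469" + ZWJ + "\U0001F52C"      # woman + ZWJ + microscope


def code_points(text):
    """List the code points that make up a string, in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


def is_sequence(text):
    """Treat anything longer than one code point as a sequence (a simplification)."""
    return len(text) > 1


for sample in (BROCCOLI, WOMAN_SCIENTIST):
    print(code_points(sample), "sequence" if is_sequence(sample) else "character")
```

Both samples are meant to appear as a single image to the user, but only the first needs a new code point, which is why new characters are tied to Unicode versions while new sequences can follow their own schedule.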
Over 100,000 characters are available for adoption, to help the Unicode Consortium’s work on digitally disadvantaged languages.

[salad badge]

Tuesday, August 15, 2017

PRI 354: Registration of additional sequences in the Moji_Joho collection

The Unicode Consortium has posted a new issue for public review and comment.

Public Review Issue #354: A submission for the "Registration of additional sequences in the Moji_Joho collection" has been received by the IVD registrar.

This submission is currently under review according to the procedures of UTS #37, Unicode Ideographic Variation Database, with an expected close date of 2017-11-17. Please see the submission page for details and instructions on how to review this issue and provide comments:

http://www.unicode.org/ivd/pri/pri354/

The IVD (Ideographic Variation Database) establishes a registry for collections of unique, and sometimes shared, variation sequences for Ideographs, which enables standardized interchange in plain text, in accordance with UTS #37, Unicode Ideographic Variation Database.

Monday, July 31, 2017

Gold Sponsor Oakland Athletics Baseball

The Unicode Consortium is pleased to announce that Oakland Athletics Baseball is now a gold sponsor for:





Oakland Athletics Baseball's  sponsorship directly funds the work of the Unicode Consortium in enabling modern software and computing systems to support the widest range of human languages. There are approximately 7,000 living human languages. Fewer than 100 of these languages are well-supported on computers, mobile phones, and other devices. AAC donations are used to improve support for digitally disadvantaged languages, and to help preserve the world’s linguistic heritage. 
The team adopted the three characters as visual representation for Oakland Athletics Baseball. The organization's nine World Series titles and 15 American League Pennants make the Athletics one of the most storied clubs in Major League Baseball. The Athletics take great pride in the achievements of the past, and view them as a challenge to push further. The Oakland Athletics are committed to creating winning experiences that encompass the many aspects of the game and the Oakland community. — Oakland Athletics Baseball
The Unicode Consortium thanks Oakland Athletics Baseball for their support!

All sponsors are listed on Sponsors of Adopted Characters. More than 128,000 other characters are available for adoption — see Adopt a Character

Monday, July 24, 2017

Gold Sponsor White Unicorn Agency

The Unicode Consortium is pleased to announce that White Unicorn Agency is now a gold sponsor for:


White Unicorn Agency's  sponsorship directly funds the work of the Unicode Consortium in enabling modern software and computing systems to support the widest range of human languages. There are approximately 7,000 living human languages. Fewer than 100 of these languages are well-supported on computers, mobile phones, and other devices. AAC donations are used to improve support for digitally disadvantaged languages, and to help preserve the world’s linguistic heritage. 
White Unicorn Agency is a dynamic, full-service creative agency based in the Design District of Dallas, Texas. We’re a diverse team of dreamers who build remarkable brands with design, technology, and a touch of magic. We value process and function as much as beauty and creativity, and summon them all to help bring your ideas to life. For us, the creative process seamlessly unites strategy and imagination – producing insightful, innovative business solutions delivered in new and interesting ways. We work across a variety of mediums to create captivating experiences that convey our client’s message effectively. We're proud to be the official sponsor of the Unicorn Emoji and support the Unicode Consortium. White Unicorn Agency

The Unicode Consortium thanks White Unicorn Agency for their support!

All sponsors are listed on Sponsors of Adopted Characters. More than 128,000 other characters are available for adoption — see Adopt a Character

Thursday, July 20, 2017

Gold Sponsor JMP Software

The Unicode Consortium is pleased to announce that JMP is now a gold sponsor for:


JMP's  sponsorship directly funds the work of the Unicode Consortium in enabling modern software and computing systems to support the widest range of human languages. There are approximately 7,000 living human languages. Fewer than 100 of these languages are well-supported on computers, mobile phones, and other devices. AAC donations are used to improve support for digitally disadvantaged languages, and to help preserve the world’s linguistic heritage. 

Scientific advances, engineering breakthroughs and statistical discoveries: Clarity comes to those who explore data visually and interactively with statistical software from JMP. That’s why we adopted the lightbulb — to represent the aha moments scientists, engineers and other data explorers experience. From its beginning, JMP has connected statistics with data visualization for data analysis, later adding design of experiments, predictive modeling, and quality, reliability, and consumer research analysis. Unicode allows us to focus on statistical discovery rather than on character set differences among the seven languages our software supports or on multiple operating systems. That’s another reason we decided to support Unicode’s work, as explained in this post. — JMP
The Unicode Consortium thanks JMP for their support!

All sponsors are listed on Sponsors of Adopted Characters. More than 128,000 other characters are available for adoption — see Adopt a Character

Tuesday, July 18, 2017

Gold Sponsor discourse.org

The Unicode Consortium is pleased to announce that discourse.org is now a gold sponsor for:


discourse.org's  sponsorship directly funds the work of the Unicode Consortium in enabling modern software and computing systems to support the widest range of human languages. There are approximately 7,000 living human languages. Fewer than 100 of these languages are well-supported on computers, mobile phones, and other devices. AAC donations are used to improve support for digitally disadvantaged languages, and to help preserve the world’s linguistic heritage. 

Discourse is the 100% open source discussion platform built for the next decade of the Internet. It works as a mailing list, discussion forum, long-form chat room, and more! Install it yourself, or try our managed hosting service. As a team discussion platform, emoji (and Unicode) are essential to the Discourse mission. Thanks to the efforts of the greater community, Discourse has already been translated into 87 languages and counting. We’re thrilled to support the Unicode Consortium’s mission of making all software available in every language. — discourse.org
The Unicode Consortium thanks discourse.org for their support!

All sponsors are listed on Sponsors of Adopted Characters. More than 128,000 other characters are available for adoption — see Adopt a Character

Gold Sponsor dtSearch Corp.

The Unicode Consortium is pleased to announce that dtSearch Corp. is now a gold sponsor for:


dtSearch Corp.'s  sponsorship directly funds the work of the Unicode Consortium in enabling modern software and computing systems to support the widest range of human languages. There are approximately 7,000 living human languages. Fewer than 100 of these languages are well-supported on computers, mobile phones, and other devices. AAC donations are used to improve support for digitally disadvantaged languages, and to help preserve the world’s linguistic heritage. 
dtSearch Corp. appreciates the critical role that the Unicode Standard has played in making search possible across so many of the world's languages. The recent Adopt-a-Character grants to support encoding Mayan Script and Egyptian Hieroglyphs demonstrate how the Unicode Consortium's continuing efforts further the preservation and sharing of human knowledge. — dtSearch Corp.
The Unicode Consortium thanks dtSearch Corp. for their support!

All sponsors are listed on Sponsors of Adopted Characters. More than 128,000 other characters are available for adoption — see Adopt a Character

Adopt-A-Character Grant to Support Three Historic Scripts

The Adopt-a-Character Program has awarded a grant to support further development of the following three historic scripts in the Unicode Standard:
  • Dhives Akuru, a Brahmi-based script formerly used to write the Maldivian language in the Maldive islands
  • Elymaic, an Aramaic-based script formerly used in the region southeast of the Tigris river in Iran
  • Khwarezmian, a script formerly used in the northern part of Uzbekistan and the adjacent areas of Turkmenistan and Kazakhstan
This grant will fund the development of proposals for encoding scripts that can be included in the Unicode Standard. The work will be done by Anshuman Pandey under the direction of Deborah Anderson (SEI, UC Berkeley) and Rick McGowan (Unicode Consortium).

Over 100,000 characters are available for adoption, to help the Unicode Consortium’s work on digitally disadvantaged languages.

[infinity badge]

Wednesday, July 5, 2017

Gold Sponsor MediaLab, Inc.

The Unicode Consortium is pleased to announce that MediaLab, Inc. is now a gold sponsor for:


MediaLab, Inc.'s  sponsorship directly funds the work of the Unicode Consortium in enabling modern software and computing systems to support the widest range of human languages. There are approximately 7,000 living human languages. Fewer than 100 of these languages are well-supported on computers, mobile phones, and other devices. AAC donations are used to improve support for digitally disadvantaged languages, and to help preserve the world’s linguistic heritage. 

The Unicode Consortium thanks MediaLab, Inc. for their support!

All sponsors are listed on Sponsors of Adopted Characters. More than 128,000 other characters are available for adoption — see Adopt a Character.   

Tuesday, June 27, 2017

Gold Sponsor Elastic

The Unicode Consortium is pleased to announce that Elastic is now a gold sponsor for:

Elastic's  sponsorship directly funds the work of the Unicode Consortium in enabling modern software and computing systems to support the widest range of human languages. There are approximately 7,000 living human languages. Fewer than 100 of these languages are well-supported on computers, mobile phones, and other devices. AAC donations are used to improve support for digitally disadvantaged languages, and to help preserve the world’s linguistic heritage. 

Elastic builds software to make data usable in real time and at scale for search, logging, security, and analytics use cases. Founded in 2012, Elastic develops the open source Elastic Stack (Elasticsearch, Kibana, Beats, and Logstash), X-Pack (commercial features), and Elastic Cloud (a SaaS offering). When Elastic Founder and CEO Shay Banon found out about the Unicode adoption program, he had a cool idea: why not allow every engineer at Elastic (as well as other teammates within the company) to choose and adopt a character? Check out this blog— Elastic


The Unicode Consortium thanks Elastic for their support!

All sponsors are listed on Sponsors of Adopted Characters. More than 128,000 other characters are available for adoption — see Adopt a Character.   

Monday, June 26, 2017

Gold Sponsor Avocados from Mexico

The Unicode Consortium is pleased to announce that Avocados from Mexico is now a gold sponsor for:

Avocados from Mexico’s sponsorship directly funds the work of the Unicode Consortium in enabling modern software and computing systems to support the widest range of human languages. There are approximately 7,000 living human languages. Fewer than 100 of these languages are well-supported on computers, mobile phones, and other devices. AAC donations are used to improve support for digitally disadvantaged languages, and to help preserve the world’s linguistic heritage.

Avocados From Mexico are healthy, always in season and a delicious way to elevate go-to dishes into a nutritious meal. They provide naturally good fats, nearly 20 vitamins and minerals,  are cholesterol- and sodium-free, making this fresh fruit a heart-healthy fruit. You can find more information and recipe ideas at AvocadosFromMexico.com
Avocados from Mexico

The Unicode Consortium thanks Avocados from Mexico for their support!

All sponsors are listed on Sponsors of Adopted Characters. More than 128,000 other characters are available for adoption — see Adopt a Character.   

Tuesday, June 20, 2017

Announcing The Unicode® Standard, Version 10.0

Version 10.0 of the Unicode Standard is now available. For the first time, both the core specification and the data files are available on the same date. Version 10.0 adds 8,518 characters, for a total of 136,690 characters. These additions include four new scripts, for a total of 139 scripts, as well as 56 new emoji characters.

The new scripts and characters in Version 10.0 add support for lesser-used languages and unique written requirements worldwide, including:
  • Masaram Gondi, used to write Gondi in Central and Southeast India
  • Nüshu, used by women in China to write poetry and other discourses until the late twentieth century
  • Soyombo and Zanabazar Square, used in historic Buddhist texts to write Sanskrit, Tibetan, and Mongolian
  • Syriac letters used for writing Suriyani Malayalam, also known as Garshuni and as Syriac Malayalam
  • Gujarati signs used for the transliteration of the Arabic script into Gujarati by Ismaili Khoja communities
  • A set of 285 Hentaigana characters used in Japan (historic variants of Hiragana characters)
  • CJK Extension F (7,473 Han characters)
Among important symbol additions are:
  • Bitcoin sign
  • A set of Typicon marks and symbols
  • 56 emoji characters including:
🧙  mage 🥥  coconut
🧚  fairy 🥦  broccoli
🧛  vampire 🥪  sandwich

For the full list of emoji characters, see emoji additions for Unicode 10.0, and Emoji Counts. For a detailed description of support for emoji characters by the Unicode Standard, see UTS #51, Unicode Emoji.

Three other important Unicode specifications have been updated for Version 10.0:

Unicode 10.0 includes a number of changes. Some of the Unicode Standard Annexes have modifications for Unicode 10.0, often in coordination with changes to character properties. In particular, there are changes to UAX #14, Unicode Line Breaking Algorithm, UAX #29, Unicode Text Segmentation, and UAX #31, Unicode Identifier and Pattern Syntax. In addition, UAX #50, Unicode Vertical Text Layout, has been newly incorporated as a part of the standard.

The Unicode Standard is the foundation for all modern software and communications around the world, including all modern operating systems, browsers, laptops, and smart phones—plus the Internet and Web (URLs, HTML, XML, CSS, JSON, etc.). The Unicode Standard, its associated standards, and data form the foundation for CLDR and ICU releases.

Adopt-a-Character

All the additional 8,518 characters including 239 new emoji are now available for adoption to help the Unicode Consortium’s work on digitally disadvantaged languages.

[emoji image]

About the Unicode Consortium

The Unicode Consortium is a non-profit organization founded to develop, extend and promote use of the Unicode Standard and related globalization standards.

The membership of the consortium represents a broad spectrum of corporations and organizations, many in the computer and information processing industry. Members include: Adobe, Apple, EmojiXpress, Facebook, Google, Government of Bangladesh, Government of India, Huawei, IBM, Microsoft, Monotype Imaging, Netflix, Sultanate of Oman MARA, Oracle, Rajya Marathi Vikas Sanstha, SAP, Symantec, Tamil Virtual University, The University of California (Berkeley), plus well over a hundred Associate, Liaison, and Individual members. For a complete member list go to http://www.unicode.org/consortium/members.html.

Monday, June 19, 2017

Gold Sponsor CSRA

The Unicode Consortium is pleased to announce that CSRA is now a gold sponsor for:


CSRA’s sponsorship directly funds the work of the Unicode Consortium in enabling modern software and computing systems to support the widest range of human languages. There are approximately 7,000 living human languages. Fewer than 100 of these languages are well-supported on computers, mobile phones, and other devices. AAC donations are used to improve support for digitally disadvantaged languages, and to help preserve the world’s linguistic heritage.

CSRA is a leading provider of next-generation technology to its public-sector customers. The company’s sponsorship of the U.S. flag emoji is symbolic of the nexus between its IT services and its customers, as featured in an article by NextGov.

The Unicode Consortium thanks CSRA for their support!

All sponsors are listed on Sponsors of Adopted Characters. More than 128,000 other characters are available for adoption — see Adopt a Character.

Wednesday, June 14, 2017

Gold Sponsor ☮.com

The Unicode Consortium is pleased to announce that ☮.com is now a gold sponsor for:

☮

☮.com’s sponsorship directly funds the work of the Unicode Consortium in enabling modern software and computing systems to support the widest range of human languages. There are approximately 7,000 living human languages. Fewer than 100 of these languages are well-supported on computers, mobile phones, and other devices. AAC donations are used to improve support for digitally disadvantaged languages, and to help preserve the world’s linguistic heritage.

☮.com proudly supports Unicode's efforts because those efforts promote wider and clearer communication to prevent misunderstandings that can cause conflict, violence, and suffering anywhere around the world.

The Unicode Consortium thanks ☮.com for their support!

All sponsors are listed on Sponsors of Adopted Characters. More than 128,000 other characters are available for adoption — see Adopt a Character.

Tuesday, June 13, 2017

Feedback on Draft Additional Repertoire for Amendments to ISO/IEC 10646:2017 (5th edition)

The Unicode Technical Committee is soliciting feedback on pending additions to the draft repertoire of characters, to help discover any errors in character names, incorrect glyphs, or other problems. There is a short window of opportunity to review and comment on the repertoire additions noted below.

Additional repertoire for two amendments to ISO/IEC 10646:2017 (5th Edition) is under review. See the associated repertoire in: Feedback on draft additional repertoire for Amendment 1.3 (PDAM) to ISO/IEC 10646:2017 (5th edition) and Feedback on draft additional repertoire for Amendment 2 (PDAM) to ISO/IEC 10646:2017 (5th edition).

Review of the Amendment 1.3 draft repertoire is especially urgent, as that content will be finalized by SC2 in September, and is scheduled for eventual publication in next year's Unicode 11.0. Note that the hentaigana and emoji portions of the amendment have already been accelerated for imminent publication in Unicode 10.0, so further comments on character names for those portions of the repertoire are no longer actionable.

There is more time to provide feedback on the Amendment 2 draft repertoire, but note that the addition of Mtavruli Georgian as part of that repertoire is also rather urgent.

The Unicode Standard is developed in synchrony with ISO/IEC 10646. After ISO balloting is completed on any repertoire additions, no further changes or corrections will be possible. (See the FAQ Standards Developing Organizations for additional information on the stages in ISO standards development.) Advance feedback on these repertoire additions will help inform the UTC discussions about its own contribution to the ISO balloting process.

Documents referenced in the draft repertoire with numbers such as L2/15-088 are available in the UTC Document Registry.

For information about how to discuss this Public Review Issue and how to supply formal feedback, please see the feedback and discussion instructions.

Monday, May 22, 2017

Unicode Emoji submission deadline now July 1

The next emoji character submission deadline has been moved up to July 1, 2017 to accommodate upcoming changes in the release schedule for Unicode versions. Emoji character proposals submitted before July 1 are eligible to be considered for the 2018 version of Unicode; those submitted after that date will be considered for the 2019 version at the earliest.

The change in deadline only affects proposals for new emoji characters; proposals that don’t involve new characters — such as for new ZWJ sequences or subdivision flags — are unaffected by the change in deadline.

The annual Unicode Standard release is being shifted from June to early March to better align with product development schedules across the industry, especially for mobile products. This shift will not fully take effect until 2019, but in preparation for this change the submission date for emoji character proposals is being adjusted now.

The 239 new emoji are also now available for adoption to help the Unicode Consortium’s work on digitally disadvantaged languages.


Thursday, May 18, 2017

Unicode Emoji 5.0 specification now final

The new Emoji 5.0 set was finalized in March 2017, making it available for vendors to begin working on their emoji fonts and code ahead of the release of Unicode 10.0, scheduled for June 2017.

The Emoji 5.0 specification is now final as well. The specification has become a technical standard, adding conformance clauses and enhanced syntax definitions. A general mechanism for emoji tag sequences has been added, initially used for country subdivisions such as Scotland. The Emoji_Component property has been added, for filtering out characters from keyboard palettes. The design and usage guidelines have also been enhanced.
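
As a rough illustration of the tag sequence mechanism mentioned above, the following Python sketch builds the subdivision flag for Scotland. It assumes the structure used for emoji tag sequences: the base character U+1F3F4 WAVING BLACK FLAG, followed by tag characters that mirror the lowercase ASCII subdivision code ("gbsct" for Scotland) at an offset of U+E0000, and terminated by U+E007F CANCEL TAG. The helper name `subdivision_flag` is hypothetical and exists only for this example.

```python
BLACK_FLAG = "\U0001F3F4"  # U+1F3F4 WAVING BLACK FLAG, the tag base
CANCEL_TAG = "\U000E007F"  # U+E007F CANCEL TAG, terminates the sequence
TAG_OFFSET = 0xE0000       # tag characters mirror ASCII at U+E0000 + code point


def subdivision_flag(code: str) -> str:
    """Build an emoji tag sequence for a subdivision code such as 'gbsct'.

    Hypothetical helper for illustration; it does not check whether the
    code names a real subdivision or whether a font supports the flag.
    """
    code = code.lower()
    if not (code.isascii() and code.isalnum()):
        raise ValueError("subdivision code must be ASCII letters or digits")
    # Map each ASCII character to its tag-character counterpart.
    tags = "".join(chr(TAG_OFFSET + ord(ch)) for ch in code)
    return BLACK_FLAG + tags + CANCEL_TAG


scotland = subdivision_flag("gbsct")
# Show the code points that make up the sequence.
print(" ".join(f"U+{ord(ch):04X}" for ch in scotland))
```

Whether the result displays as a flag depends on font and platform support; where the sequence is unsupported, a plain black flag is the likely fallback.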

The 239 new emoji are also now available for adoption to help the Unicode Consortium’s work on digitally disadvantaged languages.


Wednesday, April 26, 2017

Last Call on Unicode 10.0 Beta Review

The beta review period for Unicode 10.0 and related technical standards will close on May 1, 2017. This is the last opportunity for technical comments before version 10.0 is released in Q2 2017. Implementers and interested parties are encouraged to download data files, review proposed updates, and submit comments soon.

In addition to the Unicode Standard proper, three other Unicode Technical Standards have significant text and data file updates that are correlated with the new additions for Unicode 10.0.0. Review of that text and data is also encouraged during the beta review period.

UTS #10, Unicode Collation Algorithm Data files
UTS #39, Unicode Security Mechanisms Data files
UTS #46, Unicode IDNA Compatibility Processing Data files

Additional documents are available for public review and will be discussed at the May UTC meeting, such as the final Emoji 5.0 text, and a proposed Unicode character property. For more information, see the open public review issues and the UTC document registry.

The Unicode Standard is the foundation for all modern software and communications around the world, including all modern operating systems, browsers, laptops, and smart phones—plus the Internet and Web (URLs, HTML, XML, CSS, JSON, etc.). The Unicode Standard, its associated standards, and data form the foundation for CLDR and ICU releases. Thus it is important to ensure a smooth transition to each new version of the standard.

Unicode 10.0 includes a number of changes. Some of the Unicode Standard Annexes have modifications for Unicode 10.0, often in coordination with changes to character properties. In particular, there are changes to UAX #14, Unicode Line Breaking Algorithm, UAX #29, Unicode Text Segmentation, and UAX #31, Unicode Identifier and Pattern Syntax. In addition, UAX #50, Unicode Vertical Text Layout, has been newly incorporated as a part of the standard. Four new scripts have been added in Unicode 10.0, including Nüshu. There are also 56 additional emoji characters, a major new extension of CJK ideographs, and 285 hentaigana, important historic variants for Hiragana syllables.

Please review the documentation, adjust your code, test the data files, and report errors and other issues to the Unicode Consortium by May 1, 2017. Feedback instructions are on the beta page.

See http://unicode.org/versions/beta-10.0.0.html for more information about testing the 10.0.0 beta.

See http://unicode.org/versions/Unicode10.0.0/ for the current draft summary of Unicode 10.0.0.

About the Unicode Consortium

The Unicode Consortium is a non-profit organization founded to develop, extend and promote use of the Unicode Standard and related globalization standards.

The membership of the consortium represents a broad spectrum of corporations and organizations, many in the computer and information processing industry. Members include: Adobe, Apple, EmojiXpress, Facebook, Google, Government of Bangladesh, Government of India, Huawei, IBM, Microsoft, Monotype Imaging, Netflix, Sultanate of Oman MARA, Oracle, Rajya Marathi Vikas Sanstha, SAP, Symantec, Tamil Virtual University, The University of California (Berkeley), plus well over a hundred Associate, Liaison, and Individual members. For a complete member list go to http://www.unicode.org/consortium/members.html.

Monday, April 17, 2017

ICU 59 Released

Unicode® ICU 59 has just been released! ICU is the main avenue for many software products and libraries to support the world's languages, implementing the latest versions of both the Unicode encoding standard and the Unicode locale data (CLDR).

ICU 59 upgrades to CLDR 31 and to emoji 5.0 data, together with segmentation and bidi updates from Unicode 10 beta. The Java code for number formatting has been completely rewritten for reliability and performance. There is also a new case mapping API for styled text, and a technology preview of enhanced language matching.

ICU4C has major changes that will make ICU easier to use but require changes in projects using it: C++11, char16_t, and UTF-8 source files.

For details please see http://site.icu-project.org/download/59

Thursday, April 13, 2017

Call for Unicode 10.0 Cover Design Art

The Unicode Consortium is inviting artists and designers to submit cover design proposals for Version 10.0 of The Unicode Standard.

The cover design will appear on the Unicode Standard 10.0 web page, in the print-on-demand publication, and in associated promotional literature on the Unicode website. The chosen artist will receive $700 and full credit in the colophon of the publication and wherever else the design appears. The two runner-up artists will receive $150 apiece.

Please see the announcement web page for requirements and more details.

Friday, April 7, 2017

PRI #351: Combined registration of the KRName collection and of sequences in that collection

The Unicode Consortium has posted a new issue for public review and comment.

Public Review Issue #351: A submission for the “Combined registration of the KRName collection and of sequences in that collection” has been received by the IVD Registrar.

This submission is currently under review according to the procedures of UTS #37, Unicode Ideographic Variation Database, with an expected close date of 2017-07-07. Please see the submission page for details and instructions on how to review this issue and provide comments:

http://www.unicode.org/ivd/pri/pri351/

The IVD (Ideographic Variation Database) establishes a registry for collections of unique, and sometimes shared, variation sequences for Ideographs, which enables standardized interchange in plain text, in accordance with UTS #37, Unicode Ideographic Variation Database.

Monday, March 27, 2017

Unicode Emoji 5.0 characters now final


Fifty-six new emoji characters are in the just-released Emoji 5.0 data, including such characters as:

shushing face, mage
flying saucer, pie
T-Rex, broccoli*
* for healthy eaters!

The new Emoji 5.0 set is fixed, and available for vendors to begin working on their emoji fonts and code ahead of the release of Unicode 10.0, scheduled for June 2017.

Most of these new emoji characters fall under Smileys & People (34), followed by Food & Drink (13), Animals & Nature (6), and a few others.

There are an additional 180 emoji sequences for gender and skin-tone in Smileys & People — such as woman in lotus position: medium skin tone — and new regional flags for England, Scotland, and Wales. This makes a total of 239 new emoji (characters and sequences). For a full list, see Emoji Recently Added.
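
The total and the example sequence above can be spelled out in a short Python sketch. The counts are taken from this announcement; the code points for "woman in lotus position: medium skin tone" follow the usual pattern for gendered emoji with a skin tone (base person, skin-tone modifier, zero width joiner, gender sign, emoji presentation selector). The variable names are illustrative only.

```python
# Counts stated in the announcement.
new_characters = 56
gender_skin_tone_sequences = 180
regional_flags = 3  # England, Scotland, Wales

total_new_emoji = new_characters + gender_skin_tone_sequences + regional_flags

# woman in lotus position: medium skin tone
PERSON_IN_LOTUS_POSITION = "\U0001F9D8"  # U+1F9D8
MEDIUM_SKIN_TONE = "\U0001F3FD"          # U+1F3FD EMOJI MODIFIER FITZPATRICK TYPE-4
ZWJ = "\u200D"                           # U+200D ZERO WIDTH JOINER
FEMALE_SIGN = "\u2640"                   # U+2640 FEMALE SIGN
VS16 = "\uFE0F"                          # U+FE0F VARIATION SELECTOR-16

woman_lotus_medium = (
    PERSON_IN_LOTUS_POSITION + MEDIUM_SKIN_TONE + ZWJ + FEMALE_SIGN + VS16
)

print(total_new_emoji)
print(" ".join(f"{ord(ch):04X}" for ch in woman_lotus_medium))
```

The sum matches the 239 new emoji quoted above, and the sequence is five code points long even though it is meant to display as a single emoji.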

The emoji charts have been updated to show the new characters and sequences. The draft Emoji 5.0 specification will be finalized at the May UTC meeting, and is still available for comment.

The 239 new emoji are also now available for adoption to help the Unicode Consortium’s work on digitally disadvantaged languages.

Adopt a Character