Corrigenda and errata may be issued after the release of this version of Unicode. There were no significant changes to the Stability Policy of the core specification between the previous version of Unicode and this one. Seven new scripts were added, with accompanying new block descriptions.
Most character additions are in new blocks, but there are also character additions to a number of existing blocks. For details, see the delta code charts. There are no significant new conformance requirements in this version of Unicode. A detailed listing of all changes to the contributory data files of the Unicode Character Database is published for this version. The changes listed there include character additions and property revisions to existing characters that will affect implementations.
Some of the important impacts on implementations migrating from earlier versions of the standard are highlighted in Section M. The most important of these changes are listed below. There are also significant revisions in the Unicode Technical Standards whose versions are synchronized with the Unicode Standard. This version of Unicode contains a significant number of changes. The most important of these are listed and explained here, to help focus on the issues most likely to cause unexpected trouble during upgrades.
Some of these scripts have particular attributes which may cause issues for implementations. There are two new sets of vigesimal (base 20) numerals, one for the Medefaidrin script and another for Mayan. The Mayan numerals are added for specialty use, such as page numbers, in advance of the encoding of the full Mayan script. Indic Siyaq numerals have complex formatting requirements when combined to represent large numbers.
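As a rough illustration of what vigesimal numerals imply for number formatting, the following sketch converts an integer to base-20 digits and maps each digit to a Mayan numeral character. It assumes a purely positional base-20 system and the contiguous code point range starting at U+1D2E0 MAYAN NUMERAL ZERO. The function names are hypothetical, and the left-to-right string layout is a simplification that does not reflect traditional layout or calendrical conventions.

```python
from typing import List

# Assumption: the twenty Mayan numerals are contiguous, starting at
# U+1D2E0 MAYAN NUMERAL ZERO and ending at U+1D2F3 MAYAN NUMERAL NINETEEN.
MAYAN_ZERO = 0x1D2E0


def to_vigesimal_digits(n: int) -> List[int]:
    """Return the base-20 digits of a non-negative integer, most significant first."""
    if n < 0:
        raise ValueError("n must be non-negative")
    digits = []
    while True:
        n, d = divmod(n, 20)
        digits.append(d)
        if n == 0:
            break
    return digits[::-1]


def to_mayan_numerals(n: int) -> str:
    """Map each base-20 digit to the corresponding Mayan numeral character."""
    return "".join(chr(MAYAN_ZERO + d) for d in to_vigesimal_digits(n))
```

For example, `to_vigesimal_digits(400)` yields three digits because 400 is 20 squared, which is the kind of boundary a decimal-only formatter would get wrong.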
Casing behavior for the Georgian script has changed significantly. Starting with this version, the newly encoded Mtavruli letters serve as uppercase counterparts of the Mkhedruli letters. This change will have major implications for Georgian implementations, including changes for input methods, fonts, casing, and string matching. Existing implementations have treated Mtavruli (used for headlines and other kinds of textual emphasis) as a text style, so there will also be significant issues for document conversion and upgrade.
Another complication for Georgian is that the primary orthography does not use titlecasing, and the Mkhedruli Georgian letters do not have titlecase mappings to Mtavruli letters. This is unique among bicameral systems in the Unicode Standard, so casing implementations should be prepared for this exception. As a result, there is now a further deviation between the mappings defined in BidiMirroring.
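The Georgian casing exception described above can be sketched as follows. This is a minimal illustration and not a replacement for the casing data in the Unicode Character Database. It assumes the Mkhedruli letters at U+10D0..U+10FA and U+10FD..U+10FF map to Mtavruli letters at a fixed offset (U+1C90 and onward), and it treats titlecasing of a Mkhedruli-initial word as a no-op. The function names are hypothetical.

```python
# Assumption: each cased Mkhedruli letter maps to a Mtavruli letter at a fixed offset.
MTAVRULI_OFFSET = 0x1C90 - 0x10D0


def is_cased_mkhedruli(cp: int) -> bool:
    """True for Mkhedruli code points assumed to have an uppercase (Mtavruli) mapping."""
    return 0x10D0 <= cp <= 0x10FA or 0x10FD <= cp <= 0x10FF


def georgian_upper(text: str) -> str:
    """Uppercase Mkhedruli letters to Mtavruli; leave every other character unchanged."""
    return "".join(
        chr(ord(ch) + MTAVRULI_OFFSET) if is_cased_mkhedruli(ord(ch)) else ch
        for ch in text
    )


def title_word(word: str) -> str:
    """Titlecase a single word, leaving Mkhedruli-initial words unchanged."""
    if not word:
        return word
    if is_cased_mkhedruli(ord(word[0])):
        # Mkhedruli letters have no titlecase mapping to Mtavruli.
        return word
    return word[0].title() + word[1:].lower()
```

Built-in casing functions such as Python's `str.upper()` depend on the Unicode version bundled with the interpreter, so results for Georgian text can differ between runtimes. That is one reason an upgrade may change string matching behavior.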
Starting with this version of Unicode, a new convention applies to the two newly encoded cursive joining scripts: Hanifi Rohingya and Sogdian.
Those values are still part of the enumeration of the property values, because stability constraints prevent removal of enumerated property values, even if obsolete; however, these are no longer assigned to any characters, and are no longer referred to explicitly by any rules in the algorithms.
That is a separate property relevant to emoji, rather than a particular class of the GCB or WB properties. A new rule in the WB algorithm makes use of that new property value to prevent word breaks within runs of whitespace characters.
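The effect of the whitespace rule can be sketched in isolation. The snippet below lists the boundaries inside a string that such a rule would suppress, that is, boundaries between two adjacent whitespace characters of the relevant class. The set of code points is an assumption based on the WSegSpace value of the Word_Break property as I recall it. The authoritative list is in the WordBreakProperty.txt data file, and the function name is hypothetical.

```python
# Assumed members of the whitespace class used by the word break rule.
# No-break spaces (U+00A0, U+2007, U+202F) are assumed to be excluded.
WSEGSPACE = frozenset(
    [0x0020, 0x1680, 0x205F, 0x3000]
    + list(range(0x2000, 0x2007))  # U+2000..U+2006
    + list(range(0x2008, 0x200B))  # U+2008..U+200A
)


def suppressed_whitespace_boundaries(text: str) -> list:
    """Return indices i where a word break between text[i-1] and text[i]
    is suppressed because both characters are in the whitespace class."""
    return [
        i
        for i in range(1, len(text))
        if ord(text[i - 1]) in WSEGSPACE and ord(text[i]) in WSEGSPACE
    ]
```

This covers only the single rule discussed here. A full word segmenter applies many other rules before and after it.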
Now there is simply no break opportunity following a ZWJ. This improves line breaking behavior for emoji sequences in particular. Implementations which use hard-coded ranges for ideographs will need updates for those values.
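A minimal sketch of the simplified ZWJ behavior: given candidate line break positions produced by some other process, drop every position that immediately follows U+200D ZERO WIDTH JOINER. The function name and the representation of break positions as code point indices are assumptions made for illustration. The sketch implements only this one rule and not the line breaking algorithm as a whole.

```python
ZWJ = "\u200D"  # ZERO WIDTH JOINER


def drop_breaks_after_zwj(text: str, candidates: list) -> list:
    """Filter candidate break positions, where position i means
    'break before text[i]', removing any that directly follow a ZWJ."""
    return [i for i in candidates if i == 0 or text[i - 1] != ZWJ]


# Usage: MAN, ZWJ, WOMAN forms an emoji ZWJ sequence of three code points.
sequence = "\U0001F468\u200D\U0001F469"
allowed = drop_breaks_after_zwj(sequence, [1, 2, 3])
```

In the usage example, position 2 (between the ZWJ and the following emoji) is removed, so the sequence cannot be split at that point.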
The short diagonal stroke form is included in the Adobe-Japan glyph set, which is used as the basis for numerous OpenType Japanese fonts.
That property, which provides equivalent ideograph mappings where possible for CJK radicals and CJK stroke characters, is intended to support tailorings of sorting and searching, which may need to include radicals and strokes in their scope, for completeness. There are numerous changes in the representative glyphs, some backed by explicit errata.
There are also glyph changes in the text presentation of a number of emoji and emoticons. Some of those changes reflect an attempt to make the text presentation glyphs for emoji converge on common practice among vendors for the emoji presentation glyphs. Such glyph changes are highlighted in violet in the delta code charts for this version. The use of characters beyond the range of Latin-1 is now allowed in annotations in the names list.
See NamesList. Some other adaptations have been made in the use of fonts in the names list part of the code charts. Added a table of formal regex definitions to rationalize the definition of the classes used for grapheme cluster boundaries.
Documented extension of Unihan properties to non-Unihan characters. Updated the discussion of emoji variation sequences. Provided further clarification about the range of numeric values allowed for the Age property. Also refined the suggestions about checking certain kinds of combining sequences in spoof detection. An emoji ZWJ sequence mechanism was added for hinting at glyph facing direction for some emoji. Documentation was added regarding the use of the four new hair emoji components.
A discussion was added regarding the use of gender neutral emoji. There are also 66 additional emoji characters.
Please review the documentation, adjust your code, test the data files, and report errors and other issues to the Unicode Consortium by April 23. Feedback instructions are on the beta page.
The Unicode Consortium is a non-profit organization founded to develop, extend and promote use of the Unicode Standard and related globalization standards. The membership of the consortium represents a broad spectrum of corporations and organizations, many in the computer and information processing industry.
By Diana T, March 14th. As always, careful review of the updated code charts for this version is advised. Particular issues to take note of include the following. For the most part, the additions for new scripts and other characters are unremarkable, but implementations should be checked to ensure the new additions do not cause problems. Several blocks are new in this version of Unicode. Check implementations carefully for any range or property value assumptions regarding these new blocks.
See also the single-block delta charts. Some blocks have also had font updates; see the single-block delta charts for details. In such cases, careful review of the blocks in question is advised, to ensure that there have not been any regressions in representative glyph display. For current proposed updates to the particular UAXes, see Proposed Updates for Standard Annexes or use the links in the navigation bar on this page.
Each proposed textual change in a UAX is highlighted, so that you can focus your review on those sections if you have limited time. The changes are also listed in detail in the Modifications sections (linked from the table of contents of each document), and are summarized in UAX changes, so you can check on those areas that might be of most interest. Some links between beta documents and the proposed updates for UAXes will not work correctly during the beta review period.
This is a known problem which does not need to be reported, as such links point to the eventual final names or revision numbers for the released versions. Certain character properties for newly assigned characters cannot be changed after the formal release of each version of the standard, because of the Character Encoding Stability Policy. Such character property values need special attention during the beta review process, as they cannot be corrected after publication.
These include:
Note: The beta review period for this version of Unicode has closed. Feedback received during the public review can be referred to from the associated Public Review Issue (PRI). This beta review page is left active, however, for convenience of access to the prepublication versions of the following: the summary description, the Unicode Character Database (UCD), the summary of beta charts, the single-block delta charts (with yellow highlighting for new characters), and the single-block charts for all of Unicode.