author | Fredrik Roubert <roubert@google.com> | 2018-05-11 16:34:25 +0200 |
---|---|---|
committer | Fredrik Roubert <roubert@google.com> | 2018-05-11 16:34:25 +0200 |
commit | 69a166403d566ea2635b08e8a408c5d40e31992f (patch) | |
tree | e05fbb88db606587506f1b9d56f7c71c81513e04 /specs | |
parent | 333a3d868716b770fa7dbada1f70e6cc2edaaa2c (diff) | |
Copy CLDR 33 from unicode.org to aosp/cldr-release-33.
These files were exported from the CLDR Subversion repository by running
the following commands:
    rm -rf *
    svn export --force https://unicode.org/repos/cldr/tags/release-33 .
    git clean -dfX
    git add -A .
Bug: 79564536
Change-Id: I27dfd029083807659193cff211f35167a3a0922e
Diffstat (limited to 'specs')
-rw-r--r-- | specs/ldml/images/keycapHint.png | bin | 0 -> 2002 bytes |
-rw-r--r-- | specs/ldml/tr35-collation.html | 8 |
-rw-r--r-- | specs/ldml/tr35-dates.html | 8 |
-rw-r--r-- | specs/ldml/tr35-general.html | 75 |
-rw-r--r-- | specs/ldml/tr35-info.html | 8 |
-rw-r--r-- | specs/ldml/tr35-keyboards.html | 1600 |
-rw-r--r-- | specs/ldml/tr35-numbers.html | 10 |
-rw-r--r-- | specs/ldml/tr35.html | 350 |
8 files changed, 1705 insertions, 354 deletions
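Reader's note on the tr35-general.html hunk: the new section 14.3 (Typographic Names) describes a fallback that generates a localized style name from CLDR data when the font itself has no name in the result language. The following is only a minimal sketch of that one fallback step (look up type, subtype and alt via the English name, then fall back to the default alternate). The function name, the table shape, and the sample strings are all hypothetical and for illustration; they are not copied from CLDR data or from any library API.

```python
from typing import Dict, Optional, Tuple

# Hypothetical table shape: (type, subtype, alt) -> display string.
# An empty alt string stands for the default alternate (no "alt" attribute).
StyleTable = Dict[Tuple[str, str, str], str]

# Illustrative sample data only; NOT taken from the real CLDR files.
EN_STYLES: StyleTable = {
    ("wght", "200", ""): "Extra Light",
    ("wght", "200", "variant"): "Extralight",
    ("wdth", "75", ""): "Condensed",
}
DE_STYLES: StyleTable = {
    ("wght", "200", ""): "Extraleicht",
    ("wdth", "75", ""): "Schmal",
}


def fallback_style_name(
    axis_tag: str,
    axis_value: float,
    english_name: str,
    english_styles: StyleTable,
    localized_styles: StyleTable,
) -> Optional[str]:
    """Sketch of the CLDR fallback for one design axis."""
    # Default keys if no English entry matches: the axis value and an empty alt.
    subtype, alt = format(axis_value, "g"), ""
    # Take subtype and alt from the English entry whose string equals
    # the English style name used by the font.
    for (type_, sub, alt_key), text in english_styles.items():
        if type_ == axis_tag and text == english_name:
            subtype, alt = sub, alt_key
            break
    # Prefer the alternate whose alt key matches; otherwise the default one.
    exact = localized_styles.get((axis_tag, subtype, alt))
    if exact is not None:
        return exact
    return localized_styles.get((axis_tag, subtype, ""))


# Usage: the caller supplies the axis ordering and joins with a separator.
parts = [
    fallback_style_name("wght", 200, "Extralight", EN_STYLES, DE_STYLES),
    fallback_style_name("wdth", 75, "Condensed", EN_STYLES, DE_STYLES),
]
subfamily = " ".join(p for p in parts if p is not None)
```

The sketch returns None when CLDR has no entry at all, leaving it to the caller to decide what to show; the spec text does not say what to do in that case.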
diff --git a/specs/ldml/images/keycapHint.png b/specs/ldml/images/keycapHint.png Binary files differ new file mode 100644 index 0000000..bd64f84 --- /dev/null +++ b/specs/ldml/images/keycapHint.png diff --git a/specs/ldml/tr35-collation.html b/specs/ldml/tr35-collation.html index 7ae48e7..169b265 100644 --- a/specs/ldml/tr35-collation.html +++ b/specs/ldml/tr35-collation.html @@ -90,11 +90,11 @@ h2, h3, h4, table { Unicode Locale Data Markup Language (LDML)<br>Part 5: Collation </h1> - <!-- This header table should be identical across the parts of this UTS. --> + <!-- At least the first row of this header table should be identical across the parts of this UTS. --> <table border="1" cellpadding="2" cellspacing="0" class="wide"> <tr> <td>Version</td> - <td>32</td> + <td>33</td> </tr> <tr> <td>Editors</td> @@ -132,7 +132,7 @@ h2, h3, h4, table { <i>Status</i> </h3> - <!-- NOT YET APPROVED + <!-- NOT YET APPROVED <p> <i class="changed">This is a<b><font color="#ff3333"> draft </font></b>document which may be updated, replaced, or superseded by @@ -3294,7 +3294,7 @@ FDD0 0065; [, B1, 09] # [1644.0020.0004] [0000.0061.0004] * U+A7A1 LAT <hr> <p class="copyright"> - Copyright © 2001–2017 Unicode, Inc. All + Copyright © 2001–2018 Unicode, Inc. All Rights Reserved. The Unicode Consortium makes no expressed or implied warranty of any kind, and assumes no liability for errors or omissions. No liability is assumed for incidental and consequential diff --git a/specs/ldml/tr35-dates.html b/specs/ldml/tr35-dates.html index a82bc49..744a973 100644 --- a/specs/ldml/tr35-dates.html +++ b/specs/ldml/tr35-dates.html @@ -90,11 +90,11 @@ h2, h3, h4, table { Unicode Locale Data Markup Language (LDML)<br> Part 4: Dates </h1> - <!-- This header table should be identical across the parts of this UTS. --> + <!-- At least the first row of this header table should be identical across the parts of this UTS. 
--> <table border="1" cellpadding="2" cellspacing="0" class="wide"> <tr> <td>Version</td> - <td>32</td> + <td>33</td> </tr> <tr> <td>Editors</td> @@ -129,7 +129,7 @@ h2, h3, h4, table { <i>Status</i> </h3> - <!-- NOT YET APPROVED + <!-- NOT YET APPROVED <p> <i class="changed">This is a<b><font color="#ff3333"> draft </font></b>document which may be updated, replaced, or superseded by @@ -4750,7 +4750,7 @@ h2, h3, h4, table { <hr> <p class="copyright"> - Copyright © 2001–2017 Unicode, Inc. All + Copyright © 2001–2018 Unicode, Inc. All Rights Reserved. The Unicode Consortium makes no expressed or implied warranty of any kind, and assumes no liability for errors or omissions. No liability is assumed for incidental and consequential diff --git a/specs/ldml/tr35-general.html b/specs/ldml/tr35-general.html index 1883496..954b3a6 100644 --- a/specs/ldml/tr35-general.html +++ b/specs/ldml/tr35-general.html @@ -90,11 +90,11 @@ h2, h3, h4, table { Unicode Locale Data Markup Language (LDML)<br> Part 2: General </h1> - <!-- This header table should be identical across the parts of this UTS. --> + <!-- At least the first row of this header table should be identical across the parts of this UTS. 
--> <table border="1" cellpadding="2" cellspacing="0" class="wide"> <tr> <td>Version</td> - <td>32</td> + <td>33</td> </tr> <tr> <td>Editors</td> @@ -130,7 +130,7 @@ h2, h3, h4, table { <i>Status</i> </h3> - <!-- NOT YET APPROVED + <!-- NOT YET APPROVED <p> <i class="changed">This is a<b><font color="#ff3333"> draft </font></b>document which may be updated, replaced, or superseded by @@ -295,6 +295,7 @@ h2, h3, h4, table { <ul class="toc"> <li>14.1 <a href="#SynthesizingNames">Synthesizing Sequence Names</a></li> <li>14.2 <a href="#Character_Labels">Annotations Character Labels</a></li> + <li>14.3 <a href="#Typographic_Names">Typographic Names</a></li> </ul> </li> </ul> @@ -1237,10 +1238,14 @@ h2, h3, h4, table { <li>The "US" value indicates the customary system of measurement as used in the United States: feet, inches, pints, quarts, degrees Fahrenheit, and so on.</li> - <li>The "UK" value indicates the customary system of measurement - as was used in the United Kingdom: feet, inches, pints, quarts, and - so on. It is also called the <em>Imperial system</em>: the pint, - quart, and so on are different sizes than in "US". + <li>The "UK" value indicates the mix of metric units and + Imperial units (feet, inches, pints, quarts, and so on) used in the + United Kingdom, in which Imperial volume units such + as pint, quart, and gallon are different sizes than in the "US" + customary system. For more detail about specific units + for various usages, see <strong>Part 6: Supplemental:</strong> <em>Section 2.4.1 + <a href="tr35-info.html#Preferred_Units_For_Usage">Preferred Units for + Specific Usages</a></em>. 
</li> </ul> <p>In some cases, it may be common to use different measurement @@ -1264,7 +1269,11 @@ h2, h3, h4, table { <p>The measurement information was formerly in the main LDML file, and had a somewhat different format.</p> - + + <p>Again, for finer-grained detail about specific units + for various usages, see <strong>Part 6: Supplemental:</strong> <em>Section 2.4.1 + <a href="tr35-info.html#Preferred_Units_For_Usage">Preferred Units for + Specific Usages</a></em>.</p> <h3> 5.1 <a name="Measurement_Elements" href="#Measurement_Elements">Measurement @@ -4463,10 +4472,58 @@ h2, h3, h4, table { <tr><th>nonspacing</th><td>nonspacing</td> <td>Uses for characters that occupy no width by themselves, such as the ¨ over the a in ä.</td></tr> </table> + <h3> + 14.3 <a name="Typographic_Names" href="#Typographic_Names">Typographic Names</a> + </h3> + + <p class='dtd'><!ELEMENT typographicNames ( alias | ( axisName*, styleName*, featureName*, special* ) ) ></p> + <p class='dtd'><!ELEMENT axisName ( #PCDATA ) ><br> + <!ATTLIST axisName type (ital | opsz | slnt | wdth | wght) #REQUIRED ><br> + <!ATTLIST axisName alt NMTOKENS #IMPLIED ></p> + <p class='dtd'><!ELEMENT styleName ( #PCDATA ) ><br> + <!ATTLIST styleName type (ital | opsz | slnt | wdth | wght) #REQUIRED ><br> + <!ATTLIST styleName subtype NMTOKEN #REQUIRED ><br> + <!ATTLIST styleName alt NMTOKENS #IMPLIED ></p> + <p class='dtd'><!ELEMENT featureName ( #PCDATA ) ><br> + <!ATTLIST featureName type (afrc | cpsp | dlig | frac | lnum | onum | ordn | pnum | smcp | tnum | zero) #REQUIRED ><br> + <!ATTLIST featureName alt NMTOKENS #IMPLIED ></p> + <p>The typographic names provide for names of font features for use in a UI. This is useful for apps that show the name of font styles and design axes according to the user’s languages. It would also be useful for system-level libraries.</p> + <p>The identifers (types) use the tags from the OpenType Feature Tag Registry. 
Given their large number, only the names of frequently-used OpenType feature names are available CLDR. (Many features are not user-visible settings, but instead serve as a data channel for sofware to pass information to the font). + The example below shows an approach for using the CLDR data. Of course, applications are free to implement their own algorithms depending on their specific needs.</p> +<p>To find a localized subfamily name such as “Extraleicht Schmal” for a font called “Extralight Condensed”, a system or application library might do the following: </p> + <ol> + <li> + <p>Determine the set of languages in which the subfamily name can potentially be returned.This is the union of the languages for which the font contains ‘name’ table entries with ID 2 or 17, plus the languages for which CLDR supplies typographic names. </p> + </li> + <li> + <p>Use a language matching algorithm such as in ICU to find the best available language given the user preferences. The resulting subfamily name will be localized to this language. </p> + </li> + <li> + <p>If the font’s ‘name’ table contains a typographic subfamily name (ID17) in this language and all font variation axes are set to their defaults, return this name. </p> + </li> + <li> + <p>If the font’s ‘name’ table contains a font subfamilyname (‘name’ID2) in this language and all font variation axes are set to their defaults, return this name. </p> + </li> + <li> + <p>If the font has a style attributes (STAT) table, lookup the design axis tags and their ordering. If the font has no STAT table, assume [Width, Weight, Slant] as axis ordering, and infer the font’s style atributes from other available data in the font (eg. the OS/2 table). </p> + </li> + <li>For each design axis, find a localized style name for its value. + <ol> + <li>If the font’s style attributes point to a ‘name’ table entry that is available the result language, use this name.</li> + <li>Otherwise, generate a fallback name from CLDR style Name data. 
+ <ol> + <li>The type key is the OpenType axis tag ( ‘wght’). The subtype and alt keys are taken from the entry in English CLDR where the string is equal to the English name in the font. For example, when the font uses a weight whose English style name is “Extralight”, this will lead to subtype = “200” and alt = “variant”. If there is no match, take the axis value (“200”) for subtype and the empty string for alt. </li> + <li>Look up (type, subtype) in a data table derived from CLDR’s style names. If CLDR supplies multiple alternate names for this (type, subtype), use the one whose “alt” key is matching; otherwise, use the default alternate (which has no “alt” atribute in CLDR).</li> + </ol> + </li> + </ol> + </li> + <li>Concatenate the strings, with a separator between them.</li> + </ol> <hr> <p class="copyright"> - Copyright © 2001–2017 Unicode, Inc. All + Copyright © 2001–2018 Unicode, Inc. All Rights Reserved. The Unicode Consortium makes no expressed or implied warranty of any kind, and assumes no liability for errors or omissions. No liability is assumed for incidental and consequential diff --git a/specs/ldml/tr35-info.html b/specs/ldml/tr35-info.html index 29a656f..4c8513f 100644 --- a/specs/ldml/tr35-info.html +++ b/specs/ldml/tr35-info.html @@ -91,11 +91,11 @@ h2, h3, h4, table { Supplemental </h1> - <!-- This header table should be identical across the parts of this UTS. --> + <!-- At least the first row of this header table should be identical across the parts of this UTS. 
--> <table border="1" cellpadding="2" cellspacing="0" class="wide"> <tr> <td>Version</td> - <td>32</td> + <td>33</td> </tr> <tr> <td>Editors</td> @@ -131,7 +131,7 @@ h2, h3, h4, table { <i>Status</i> </h3> - <!-- NOT YET APPROVED + <!-- NOT YET APPROVED <p> <i class="changed">This is a<b><font color="#ff3333"> draft </font></b>document which may be updated, replaced, or superseded by @@ -1788,7 +1788,7 @@ LY MA OM PS QA SA SD SY TN YE"/> <hr> <p class="copyright"> - Copyright © 2001–2017 Unicode, Inc. All + Copyright © 2001–2018 Unicode, Inc. All Rights Reserved. The Unicode Consortium makes no expressed or implied warranty of any kind, and assumes no liability for errors or omissions. No liability is assumed for incidental and consequential diff --git a/specs/ldml/tr35-keyboards.html b/specs/ldml/tr35-keyboards.html index 6f88e9a..9480324 100644 --- a/specs/ldml/tr35-keyboards.html +++ b/specs/ldml/tr35-keyboards.html @@ -90,11 +90,11 @@ h2, h3, h4, table { Unicode Locale Data Markup Language (LDML)<br>Part 7: Keyboards </h1> - <!-- This header table should be identical across the parts of this UTS. --> + <!-- At least the first row of this header table should be identical across the parts of this UTS. 
--> <table border="1" cellpadding="2" cellspacing="0" class="wide"> <tr> <td>Version</td> - <td>32</td> + <td>33</td> </tr> <tr> <td>Editors</td> @@ -129,7 +129,7 @@ h2, h3, h4, table { <i>Status</i> </h3> - <!-- NOT YET APPROVED + <!-- NOT YET APPROVED <p> <i class="changed">This is a<b><font color="#ff3333"> draft </font></b>document which may be updated, replaced, or superseded by @@ -224,8 +224,34 @@ h2, h3, h4, table { </ul> </li> <li>5.8 <a href="#Element_map">Element: map</a></li> - <li>5.9 <a href="#Element_transforms">Element: transforms</a></li> - <li>5.10 <a href="#Element_transform">Element: transform</a></li> + <li>5.9 <a href="#Element_import">Element: + import</a></li> + <li>5.10 <a href="#Element_displayMap">Element: + displayMap</a></li> + <li>5.11 <a href="#Element_display">Element: + display</a></li> + <li>5.12 <a href="#Element_layer">Element: + layer</a></li> + <li>5.13 <a href="#Element_row">Element: + row</a></li> + <li>5.14 <a href="#Element_switch">Element: + switch</a></li> + <li>5.15 <a href="#Element_vkeys">Element: + vkeys</a></li> + <li>5.16 <a href="#Element_vkey">Element: + vkey</a></li> + <li>5.17 <a href="#Element_transforms">Element: + transforms</a></li> + <li>5.18 <a href="#Element_transform">Element: + transform</a></li> + <li>5.19 <a href="#Element_reorder">Element: + reorder</a></li> + <li>5.20 <a href="#Element_final">Element: + final</a></li> + <li>5.21 <a href="#Element_backspaces">Element: + backspaces</a></li> + <li>5.22 <a href="#Element_backspace">Element: + backspace</a></li> </ul> </li> <li>6 <a href="#Element_Heirarchy_Platform_File">Element @@ -318,29 +344,60 @@ h2, h3, h4, table { whether the modifier RAlt (=AltGr) should be merged with Option. In the end, they were kept separate, but for comparison across platforms implementers may choose to unify them.</p> + <p> + Note that in parts of this document, the format <strong>@x</strong> + is used to indicate the <em>attribute</em> <strong>x</strong>. 
+ </p> <h2> 3 <a name="Definitions" href="#Definitions">Definitions</a> </h2> - <p>Keyboard: The physical keyboard.</p> - <p>Key: A key on a physical keyboard.</p> - <p>Modifier: A key that is held to change the behavior of a - keyboard. For example, the "Shift" key allows access to upper-case - characters on a US keyboard. Other modifier keys include but is not - limited to: Ctrl, Alt, Option, Command and Caps Lock.</p> - <p>Key code: The integer code sent to the application on pressing - a key.</p> <p> - ISO position: The corresponding position of a key using the ISO - layout convention where rows are identified by letters and columns - are identified by numbers. For example, "D01" corresponds to the "Q" - key on a US keyboard. For the purposes of this document, an ISO - layout position is depicted by a one-letter row identifier followed - by a two digit column number (like "B03", "E12" or "C00"). The - following diagram depicts a typical US keyboard layout superimposed - with the ISO layout indicators (it is important to note that the - number of keys and their physical placement relative to each-other in - this diagram is irrelevant, rather what is important is their logical - placement using the ISO convention):<img + <b>Arrangement</b> is the term used to describe the relative position + of the rectangles that represent keys, either physically or + virtually. A physical keyboard has a static arrangement while a + virtual keyboard may have a dynamic arrangement that changes per + language and/or layer. While the arrangement of keys on a keyboard + may be fixed, the mapping of those keys may vary. + </p> + <p> + <b>Base character:</b> The character emitted by a particular key when + no modifiers are active. In ISO terms, this is group 1, level 1. + </p> + <p> + <b>Base map:</b> A mapping from the ISO positions to the base + characters. There is only one base map per layout. The characters on + this map can be output by not using any modifier keys. 
+ </p> + <p> + <b>Core keyboard layout:</b> also known as “alpha” block. The primary + set of key values on a keyboard that are used for typing the target + language of the keyboard. For example, the three rows of letters on a + standard US QWERTY keyboard (QWERTYUIOP, ASDFGHJKL, ZXCVBNM) together + with the most significant punctuation keys. Usually this equates to + the minimal keyset for a language as seen on mobile phone keyboards. + </p> + <p> + <b>Hardware map:</b> A mapping between key codes and ISO layout + positions. + </p> + <p> + <b>Input Method Editor (IME):</b> a component or program that + supports input of large character sets. Typically, IMEs employ + contextual logic and candidate UI to identify the Unicode characters + intended by the user. + </p> + <p> + <b>ISO position:</b> The corresponding position of a key using the + ISO layout convention where rows are identified by letters and + columns are identified by numbers. For example, "D01" corresponds to + the "Q" key on a US keyboard. For the purposes of this document, an + ISO layout position is depicted by a one-letter row identifier + followed by a two digit column number (like "B03", "E12" or "C00"). 
+ The following diagram depicts a typical US keyboard layout + superimposed with the ISO layout indicators (it is important to note + that the number of keys and their physical placement relative to + each-other in this diagram is irrelevant, rather what is important is + their logical placement using the ISO convention):<img src="images/keyPositions.png" alt="keyboard layout example showing ISO key numbering"> </p> @@ -358,23 +415,70 @@ h2, h3, h4, table { the ISO layout to support keys that are located to the left of the "00" column by using negative column numbers "-01", "-02" and so on, or 100's complement "99", "98",...</p> - <p>Hardware map: A mapping between key codes and ISO layout - positions.</p> - <p>Base character: The character emitted by a particular key when - no modifiers are active. In ISO terms, this is group 1, level 1.</p> - <p>Base map: A mapping from the ISO positions to the base - characters. There is only one base map per layout. The characters on - this map can be output by not using any modifier keys.</p> - <p>Key map: The basic mapping between ISO positions and the output - characters for each set of modifier combinations associated with a - particular layout. There may be multiple key maps for each layout.</p> - <p>Transform: A transform is simply a combination of key presses - that gets transformed into one (or more) final characters. For - example, in most latin keyboards hitting the "^" dead-key followed by - the "e" key produces "ê".</p> - <p>Layout: A layout is the overall keyboard configuration for a - particular locale. Within a keyboard layout, there is a single base - map, one or more key maps and one or more transforms.</p> + <p> + <b>Key:</b> A key on a physical keyboard. + </p> + <p> + <b>Key code:</b> The integer code sent to the application on pressing + a key. 
+ </p> + <p> + <b>Key map:</b> The basic mapping between ISO positions and the + output characters for each set of modifier combinations associated + with a particular layout. There may be multiple key maps for each + layout. + </p> + <p> + <b>Keyboard:</b> The physical keyboard. + </p> + <p> + <b>Keyboard layout:</b> A layout is the + overall keyboard configuration for a particular locale. Within a + keyboard layout, there is a single base map, one or more key maps and + zero or + more transforms. + </p> + <p> + <b>Layer</b> is an arrangement of keys on a virtual keyboard. Since + it is often not intended to use two hands on a visual keyboard to + allow the pressing of modifier keys. Modifier keys are made sticky in + that one presses one, the visual representation, and even + arrangement, of the keys change, and you press the key. This visual + representation is a layer. Thus a virtual keyboard is made up of a + set of layers. + </p> + <p> + <b>Long-press key:</b> also known as a “child key”. A secondary key + that is invoked from a top level key on a software keyboard. + Secondary keys typically provide access to variants of the top level + key, such as accented variants (a => á, à, ä, ã) + </p> + <p> + <b>Modifier:</b> A key that is held to change the behavior of a + keyboard. For example, the "Shift" key allows access to upper-case + characters on a US keyboard. Other modifier keys include but is not + limited to: Ctrl, Alt, Option, Command and Caps Lock. + </p> + <p> + <b>Physical keyboard</b> is a keyboard that has individual keys that + are pressed. Each key has a unique identifier and the arrangement + doesn't change, even if the mapping of those keys does. + </p> + <p> + <b>Transform:</b>A transform is an + element that specifies a set of conversions from sequences of code + points into one (or more) other code points. For example, in most + latin keyboards hitting the "^" dead-key followed by the "e" key + produces "ê". 
+ </p> + <p> + <b>Virtual keyboard</b> is a keyboard that is rendered on a, + typically, touch surface. It has a dynamic arrangement and contrasts + with a physical keyboard. This term has many synonyms: touch + keyboard, software keyboard, SIP (Software Input Panel). This + contrasts with other uses of the term virtual keyboard as an + on-screen keyboard for reference or accessibility data entry. + </p> <h2> 4 <a name="File_and_Dir_Structure" href="#File_and_Dir_Structure">File and Directory Structure</a> @@ -416,15 +520,17 @@ h2, h3, h4, table { <p>{definition of the layout as described by the elements defined below}</p> <p></keyboard></p> - <p>Attribute: locale (required)</p> - <p> - This mandatory attribute represents the locale of the keyboard using - Unicode locale identifiers (see <a href="tr35.html">LDML</a>) - for - example 'el' for Greek. Sometimes, the locale may not specify - the base language. For example, a Devanagari keyboard for many - languages could be specified by BCP-47 code: 'und-Deva'. For details, - see <a href="#Keyboard_IDs">Keyboard IDs</a> . - </p> + <dl> + <dt>Attribute: locale (required)</dt> + <dd> + This mandatory attribute represents the locale of the keyboard using + Unicode locale identifiers (see <a href="tr35.html">LDML</a>) - for + example 'el' for Greek. Sometimes, the locale may not + specify the base language. For example, a Devanagari keyboard for + many languages could be specified by BCP-47 code: 'und-Deva'. For + details, see <a href="#Keyboard_IDs">Keyboard IDs</a> . + </dd> + </dl> <p>Examples (for illustrative purposes only, not indicative of the real data)</p> <pre><keyboard locale="ka-t-k0-qwerty-windows"> @@ -445,20 +551,26 @@ h2, h3, h4, table { <p> <version platform=".." revision=".."><br> </p> - <p>Attribute: platform (required)</p> - <p>The platform source version. Specifies what version of the - platform the data is from. For example, data from Mac OSX 10.4 would - be specified as platform="10.4". 
For platforms that have - unstable version numbers which change frequently (like Linux), this - field is set to an integer representing the iteration of the data - starting with "1". This number would only increase if there were any - significant changes in the keyboard data.</p> - <p>Attribute: number (required)</p> - <p>The data revision version.</p> - <p>Attribute: cldrVersion (fixed by DTD)</p> - <p>The CLDR specification version that is associated with this - data file. This value is fixed and is inherited from the DTD file and - therefore does not show up directly in the XML file.</p> + <dl> + <dt>Attribute: platform (required)</dt> + <dd>The platform source version. Specifies what version of the + platform the data is from. For example, data from Mac OSX 10.4 would + be specified as platform="10.4". For platforms that have + unstable version numbers which change frequently (like Linux), this + field is set to an integer representing the iteration of the data + starting with "1". This number would only increase if there were any + significant changes in the keyboard data.</dd> + </dl> + <dl> + <dt>Attribute: number (required)</dt> + <dd>The data revision version.</dd> + </dl> + <dl> + <dt>Attribute: cldrVersion (fixed by DTD)</dt> + <dd>The CLDR specification version that is associated with this + data file. This value is fixed and is inherited from the DTD file + and therefore does not show up directly in the XML file.</dd> + </dl> <p>Example</p> <p><keyboard locale="..-osx"></p> <p>…</p> @@ -472,21 +584,9 @@ h2, h3, h4, table { </h3> <p> The generation element is now deprecated. It was used to keep track - of the generation date of the data.<br> <br> Syntax + of the generation date of the data. 
</p> - <p> - <generation date=".."><br> - </p> - <p>Attribute: date (required)</p> - <p>The date the data was generated.</p> - <p>Example</p> - <p><keyboard locale=".."></p> - <p>…</p> - <!-- This appears to be intentionally left in, as a removed item for demonstration. --> - <p class="removed"><generation date="$Date: 2013-03-09 - 01:08:49 -0800 (Sat, 09 Mar 2013) $"/></p> - <p>…</p> - <p></keyboard></p> + <hr> <h3> 5.4 <a name="Element_names" href="#Element_names">Element: names</a> @@ -510,8 +610,10 @@ h2, h3, h4, table { <p> <name value=".."><br> </p> - <p>Attribute: value (required)</p> - <p>The name of the layout.</p> + <dl> + <dt>Attribute: value (required)</dt> + <dd>The name of the layout.</dd> + </dl> <p>Example</p> <p><keyboard locale="bg-t-k0-windows-phonetic-trad"></p> @@ -541,18 +643,22 @@ h2, h3, h4, table { [transformFailure="omit"] [transformPartial="hide"]><br> </p> - <p>Attribute: fallback="omit" (optional)</p> - <p>The presence of this attribute means that when a modifier key - combination goes unmatched, no output is produced. The default - behavior (when this attribute is not present) is to fallback to the - base map when the modifier key combination goes unmatched.</p> + <dl> + <dt>Attribute: fallback="omit" (optional)</dt> + <dd>The presence of this attribute means that when a modifier + key combination goes unmatched, no output is produced. The default + behavior (when this attribute is not present) is to fallback to the + base map when the modifier key combination goes unmatched.</dd> + </dl> <p>If this attribute is present, it must have a value of omit.</p> - <p>Attribute: transformFailure="omit" (optional)</p> - <p>This attribute describes the behavior of a transform when it is - escaped (see the transform element in the Layout file for more - information). A transform is escaped when it can no longer continue - due to the entry of an invalid key. 
For example, suppose the - following set of transforms are valid:</p> + <dl> + <dt>Attribute: transformFailure="omit" (optional)</dt> + <dd>This attribute describes the behavior of a transform when it + is escaped (see the transform element in the Layout file for more + information). A transform is escaped when it can no longer continue + due to the entry of an invalid key. For example, suppose the + following set of transforms are valid:</dd> + </dl> <blockquote> <p>^e → ê</p> <p>^a → â</p> @@ -567,11 +673,13 @@ h2, h3, h4, table { <p>The default behavior (when this attribute is not present) is to emit the contents of the buffer upon failure of a transform.</p> <p>If this attribute is present, it must have a value of omit.</p> - <p>Attribute: transformPartial="hide" (optional)</p> - <p>This attribute describes the behavior the system while in a - transform. When this attribute is present then don't show the values - of the buffer as the user is typing a transform (this behavior can be - seen on Windows or Linux platforms).</p> + <dl> + <dt>Attribute: transformPartial="hide" (optional)</dt> + <dd>This attribute describes the behavior the system while in a + transform. When this attribute is present then don't show the values + of the buffer as the user is typing a transform (this behavior can + be seen on Windows or Linux platforms).</dd> + </dl> <p>By default (when this attribute is not present), show the values of the buffer as the user is typing a transform (this behavior can be seen on the Mac OSX platform).</p> @@ -606,26 +714,28 @@ h2, h3, h4, table { Combinations}"]></p> <p>{a set of map elements}</p> <p></keyMap></p> - <p>Attribute: modifiers (optional)</p> - <p> - A set of modifier combinations that cause this key map to be - "active". Each combination is separated by a space. The - interpretation is that there is a match if any of the combinations - match, that is, they are ORed. 
Therefore, the order of the - combinations within this attribute does not matter.<br> <br> - A combination is simply a concatenation of words to represent the - simultaneous activation of one or more modifier keys. The order of - the modifier keys within a combination does not matter, although - don't care cases are generally added to the end of the string for - readability (see next paragraph). For example: "cmd+caps" represents - the Caps Lock and Command modifier key combination. Some keys have - right or left variant keys, specified by a 'R' or 'L' suffix. For - example: "ctrlR+caps" would represent the Right-Control and Caps Lock - combination. For simplicity, the presence of a modifier without a 'R' - or 'L' suffix means that either its left or right variants are valid. - So "ctrl+caps" represents the same as "ctrlL+ctrlR?+caps - ctrlL?+ctrlR+caps" - </p> + <dl> + <dt>Attribute: modifiers (optional)</dt> + <dd> + A set of modifier combinations that cause this key map to be + "active". Each combination is separated by a space. The + interpretation is that there is a match if any of the combinations + match, that is, they are ORed. Therefore, the order of the + combinations within this attribute does not matter.<br> <br> + A combination is simply a concatenation of words to represent the + simultaneous activation of one or more modifier keys. The order of + the modifier keys within a combination does not matter, although + don't care cases are generally added to the end of the string for + readability (see next paragraph). For example: "cmd+caps" represents + the Caps Lock and Command modifier key combination. Some keys have + right or left variant keys, specified by a 'R' or 'L' suffix. For + example: "ctrlR+caps" would represent the Right-Control and Caps + Lock combination. For simplicity, the presence of a modifier without + a 'R' or 'L' suffix means that either its left or right variants are + valid. 
So "ctrl+caps" represents the same as "ctrlL+ctrlR?+caps + ctrlL?+ctrlR+caps" + </dd> + </dl> <p>A modifier key may be further specified to be in a "don't care" state using the '?' suffix. The "don't care" state simply means that the preceding modifier key may be either ON or OFF. For example @@ -636,7 +746,7 @@ h2, h3, h4, table { combination to be active.</p> <p>Here is an exhaustive list of all possible modifier keys:</p> <p>Possible Modifier Keys</p> - <table cellpadding="0" cellspacing="0" border='1'> + <table> <caption> <a name="Possible_Modifier_Keys" href="#Possible_Modifier_Keys">Possible Modifier Keys</a> @@ -705,6 +815,7 @@ h2, h3, h4, table { key map.</p> <p>If the modifiers attribute is not present on a keyMap then that particular key map is the base map.</p> + <hr> <h3> 5.8 <a name="Element_map" href="#Element_map">Element: map</a> </h3> @@ -720,31 +831,79 @@ h2, h3, h4, table { [longPress="{long press keys}"] [transform="no"] /><!-- {Comment to improve readability (if needed)} --></pre> - <p>Attribute: iso (exactly one of base and iso is required)</p> - <p>The iso attribute represents the ISO layout position of the key - (see the definition at the beginning of the document for more - information).</p> - <p>Attribute: to (required)</p> - <p>The to attribute contains the output sequence of characters - that is emitted when pressing this particular key. Control - characters, whitespace (other than the regular space character) and - combining marks in this attribute are escaped using the \u{...} - notation.</p> - <p>Attribute: longPress (optional)</p> - <p>The longPress attribute contains any characters that can be - emitted by "long-pressing" a key, this feature is prominent in mobile - devices. The possible sequences of characters that can be emitted are - whitespace delimited. 
Control characters, combining marks and - whitespace (which is intended to be a long-press option) in this - attribute are escaped using the \u{...} notation.</p> - <p>Attribute: transform="no" (optional)</p> - <p>The transform attribute is used to define a key that never - participates in a transform but its output shows up as part of a - transform. This attribute is necessary because two different keys - could output the same characters (with different keys or modifier - combinations) but only one of them is intended to be a dead-key and - participate in a transform. This attribute value must be no if it is - present.</p> + <dl> + <dt>Attribute: iso (exactly one of base and iso is required)</dt> + <dd>The iso attribute represents the ISO layout position of the + key (see the definition at the beginning of the document for more + information).</dd> + </dl> + <dl> + <dt>Attribute: to (required)</dt> + <dd>The to attribute contains the output sequence of characters + that is emitted when pressing this particular key. Control + characters, whitespace (other than the regular space character) and + combining marks in this attribute are escaped using the \u{...} + notation.</dd> + </dl> + <dl> + <dt>Attribute: longPress (optional)</dt> + <dd>The longPress attribute contains any characters that can be + emitted by "long-pressing" a key, this feature is prominent in + mobile devices. The possible sequences of characters that can be + emitted are whitespace delimited. Control characters, combining + marks and whitespace (which is intended to be a long-press option) + in this attribute are escaped using the \u{...} notation.</dd> + </dl> + <dl> + <dt>Attribute: transform="no" (optional)</dt> + <dd>The transform attribute is used to define a key that never + participates in a transform but its output shows up as part of a + transform. 
This attribute is necessary because two different keys + could output the same characters (with different keys or modifier + combinations) but only one of them is intended to be a dead-key and + participate in a transform. This attribute value must be no if it is + present.</dd> + </dl> + <dl> + <dt>Attribute: multitap (optional)</dt> + <dd> + A space-delimited list of strings, where each successive element of the list is produced by the corresponding number of quick taps. For example, two taps on the key C01 will produce a “c” in the following example. <br> + <br> <em>Example:</em><br> <br> + <map iso="C01" to="a" multitap="bb c d"></dd> + </dl> + <dl> + <dt>Attribute: longPress-status (optional)</dt> + <dd> + Indicates optional longPress values. Must only occur with a + longPress value. May be suppressed or shown, depending on user + settings. There can be two map elements that differ only by + long-press-status, allowing two different sets of longpress values.<br> + <br> <em>Example:</em><br> <br> <map + iso="D01" to="a" longPress="à â % æ á ä ã å + ā ª"/><br> <map iso="D01" to="a" + longPress="à â á ä ã å ā" + longPress-status="optional"/> + + </dd> + </dl> + <dl> + + <dt>Attribute: optional (optional)</dt> + <dd>Indicates optional mappings. May be suppressed or shown, + depending on user settings.</dd> + </dl> + <dl> + <dt>Attribute: hint (optional)</dt> + <dd> + Indicates a hint as to long-press contents, such as the first + character of the longPress value, that can be displayed on the key. 
+ May be suppressed or shown, depending on user Settings.<br> <br> + <i>Example:</i> where the hint is "{":<br> + <div style='text-align: center'> + <img alt="keycap hint" src='images/keycapHint.png'> + </div> + </dd> + </dl> <p>For example, suppose there are the following keys, their output and one transform:</p> <blockquote> @@ -766,48 +925,466 @@ h2, h3, h4, table { some of them may be obvious.</p> <p>Examples</p> <pre><keyboard locale="fr-BE-t-k0-windows"><br> …<br> <keyMap modifiers="shift"><br> <map iso="D01" to="A" /> <!-- key=Q --><br> <map iso="D02" to="Z" /> <!-- key=W --><br> <map iso="D03" to="E" /><br> <map iso="D04" to="R" /><br> <map iso="D05" to="T" /><br> <map iso="D06" to="Y" /><br> …<br> </keyMap><br> …<br></keyboard><br><keyboard locale="ps-t-k0-windows"><br> …<br> <keyMap modifiers='altR+caps? ctrl+alt+caps?'><br> <map iso="D04" to="\u{200e}" /> <!-- key=R base=ق --><br> <map iso="D05" to="\u{200f}" /> <!-- key=T base=ف --><br> <map iso="D08" to="\u{670}" /> <!-- key=I base=ه to= ٰ --><br> …<br> </keyMap><br> …<br></keyboard></pre> + <h4> + 5.8.1 <a name="Element_flicks" href="#Element_flicks">Elements: + flicks, flick</a></h4> + <p class='dtd'><!ELEMENT keyMap ( map | flicks )+ ><br> + <!ELEMENT flick EMPTY><br> + <!ATTLIST flick directions NMTOKENS><br> + <!ATTLIST flick to CDATA><br> + <!--@VALUE--></p> + <p>The flicks element is used to generate results from a "flick" of the finger on a mobile device. The <strong>directions</strong> attribute value is a space-delimited list of keywords, that describe a path, currently restricted to the cardinal and intercardinal directions {n e s w ne nw se sw}. 
The <strong>to</strong> attribute value is the result of (one or more) flicks.</p> + <p>Example: where a flick to the Northeast then South produces two code points.</p> + <pre><flicks iso="C01"> + <flick directions="ne s" to="\uABCD\uDCBA"/> +</flicks></pre> + <hr> + <h3> + 5.9 <a name="Element_import" href="#Element_import">Element: + import</a> + </h3> + <p>The import element references another file of + the same type and includes all the subelements of the top level + element as though the import element were being replaced by those + elements, in the appropriate section of the XML file. For example:</p> + <pre> <import path="standard_transforms.xml"/></pre> + <dl> + <dt>Attribute: path (required)</dt> + <dd>The value contains a relative path to the included ldml + file. There is a standard set of directories to be searched that an + application may provide. This set is always prepended with the + directory in which the file currently being read is stored.</dd> + </dl> + <p>If two identical elements, as described below, + are defined, the later element will take precedence. Thus if a + hardwareMap/map for the same keycode on the same page is defined + twice (for example once in an included file), the later one will be + the resulting mapping.</p> + <p>Elements are considered to have three + attributes that make them unique: the tag of the element, the parent + and the identifying attribute. The parent in its turn is a unique + element and so on up the chain. If the distinguishing attribute is + optional, its non-existence is represented with an empty value. Here + is a list of elements and their defining attributes. If an element is + not listed then if it is a leaf element, only one occurs and it is + merely replaced. 
If it has children, then the sub elements are + considered, in effect merging the element in question.</p> + <table> + <!-- nocaption --> + <tbody> + <tr> + <td><p>Element</p></td> + <td><p>Parent</p></td> + <td><p>Distinguishing attribute</p></td> + </tr> + <tr> + <td><p>keyMap</p></td> + <td><p>keyboard</p></td> + <td><p>@modifiers</p></td> + </tr> + <tr> + <td><p>map</p></td> + <td><p>keyMap</p></td> + <td><p>@iso</p></td> + </tr> + <tr> + <td><p>display</p></td> + <td><p>displayMap</p></td> + <td><p>@char (new)</p></td> + </tr> + <tr> + <td><p>layout</p></td> + <td><p>layouts</p></td> + <td><p>@modifier</p></td> + </tr> + </tbody> + </table> + <p>In order to help identify mistakes, it is an + error if a file contains two elements that override each other. All + element overrides must come as a result of an <import> element + either for the element overridden or the element overriding.</p> + <p>The following elements are not imported from + the source file:</p> + <ul> + <li>version</li> + <li>generation</li> + <li>names</li> + <li>settings</li> + </ul> + <hr> + <h3> + 5.10 <a name="Element_displayMap" href="#Element_displayMap">Element: + displayMap</a> + </h3> + <p>The displayMap can be used to describe what is + to be displayed on the keytops for various keys. For the most part, + such explicit information is unnecessary since the @char element from + the keyMap/map element can be used. But there are some characters, + such as diacritics, that do not display well on their own and so + explicit overrides for such characters can help. 
The displayMap + consists of a list of display sub elements.</p> + <p>DisplayMaps are designed to be shared across + many different keyboard layout descriptions, and included in where + needed.</p> + <hr> + <h3> + 5.11 <a name="Element_display" href="#Element_display">Element: + display</a> + </h3> + <p>The display element describes how a character, + that has come from a keyMap/map element, should be displayed on a + keyboard layout where such display is possible.</p> + <dl> + <dt>Attribute: mapOutput (required)</dt> + <dd>Specifies the character or character sequence from the + keyMap/map element that is to have a special display.</dd> + </dl> + <dl> + <dt>Attribute: display (required)</dt> + <dd>Required and specifies the character sequence that should be + displayed on the keytop for any key that generates the @mapOutput + sequence. (It is an error if the value of the display attribute is + the same as the value of the char attribute.)</dd> + </dl> + <pre> <keyboard > + <keyboardMap> + <map iso="C01" to="a" longpress="\u0301 \u0300"/> + </keyboardMap> + <displayMap> + <display mapOutput="\u0300" display="u\u02CB"/> + <display mapOutput="\u0301" display="u\u02CA"/> + </displayMap><br> </keyboard ></pre> + <p>To allow displayMaps to be shared across + descriptions, there is no requirement that @mapOutput matches any @to + in any keyMap/map element in the keyboard description.</p> + <hr> + <h3> + 5.12 <a name="Element_layer" href="#Element_layer">Element: layer</a> + </h3> + <p>A layer element describes the configuration of + keys on a particular layer of a keyboard. It contains row elements to + describe which keys exist in each row and also switch elements that + describe how keys in the layer switch the layer to another. 
In + addition, for platforms that require a mapping from a key to a + virtual key (for example Windows or Mac) there is also a vkeys + element to describe the mapping.</p> + <dl> + <dt>Attribute: modifier (required)</dt> + <dd>This has two roles. It acts as an identifier for the layer + element and also provides the linkage into a keyMap. A modifier is a + single modifier combination such that it is matched by one of the + modifier combinations in one of the keyMap/@modifiers attributes. To + indicate that no modifiers apply, the reserved name "none" is + used. For the purposes of fallback vkey mapping, the following + modifier components are reserved: "shift", "ctrl", "alt", "caps", + "cmd", "opt" along with the "L" and "R" optional single suffixes for + the first 3 in that list. There must be a keyMap whose @modifiers + attribute matches the @modifier attribute of the layer element. It + is an error if there is no such keyMap.</dd> + </dl> + <p>The keyMap/@modifiers attribute often includes multiple + combinations that match. It is not necessary (or preferred) to include + all of these. 
Instead a minimal matching element should be used, such + that exactly one keymap is matched.</p> + <p>The following are examples of situations where + the @modifiers and @modifier do not match, with a different keymap + definition than above.</p> + <table> + <!-- nocaption --> + <tbody> + <tr> + <th><p>keyMap/@modifiers</p></th> + <th><p>layer/@modifier</p></th> + </tr> + <tr> + <td><p>shiftL</p></td> + <td><p>shift (ambiguous)</p></td> + </tr> + <tr> + <td><p>altR</p></td> + <td><p>alt</p></td> + </tr> + <tr> + <td><p>shiftL?+shiftR</p></td> + <td><p>shift</p></td> + </tr> + </tbody> + </table> + <p>And these do match:</p> + <table> + <!-- nocaption --> + <tbody> + <tr> + <th><p>keyMap/@modifiers</p></th> + <th><p>layer/@modifier</p></th> + </tr> + <tr> + <td><p>shiftL shiftR</p></td> + <td><p>shift</p></td> + </tr> + </tbody> + </table> + <p>The use of @modifier as an identifier for a + layer, is sufficient since it is always unique among the set of layer + elements in a keyboard.</p> <hr> <h3> - 5.9 <a name="Element_transforms" href="#Element_transforms">Element: + 5.13 <a name="Element_row" href="#Element_row">Element: row</a> + </h3> + <p>A row element describes the keys that are + present in the row of a keyboard. Row elements are ordered within a + layout element with the top visual row being stored first. The row + element introduces the keyId which may be an ISOKey or a specialKey. + More formally:</p> + <pre> keyId = ISOKey | specialKey<br> ISOKey = [A-Z][0-9][0-9]<br> specialKey = [a-z][a-zA-Z0-9]{2,7}</pre> + <p> + ISOKey denotes a key having an <a href="#Definitions">ISO + Position</a>. SpecialKey is used to identify functional keys occurring + on a virtual keyboard layout. + </p> + <dl> + <dt>Attribute: keys (required)</dt> + <dd>This is a string that lists the keyId for each of the keys + in a row. Key ranges may be contracted to firstkey-lastkey but only + for ISOKey type keyIds. 
The interpolation between the first and last + keys names is entirely numeric. Thus D00-D03 is equivalent to D00 + D01 D02 D03. It is an error if the first and last keys do not have + the same alphabetic prefix or the last key numeric component is less + than or equal to the first key numeric component.</dd> + </dl> + <p>specialKey type keyIds may take any value + within their syntactic constraint. But the following specialKeys are + reserved to allow applications to identify them and give them special + handling:</p> + <ul> + <li>"bksp", "enter", "space", "tab", "esc", "sym", "num"</li> + <li>all the reserved modifier names</li> + <li>specialKeys starting with the letter "x" for future reserved + names.</li> + </ul> + <p>Here is an example of a row element:</p> + <pre> <layer modifier="none"> + <row keys="D01-D10"/> + <row keys="C01-C09"/> + <row keys="shift B01-B07 bksp"/> + <row keys="sym A01 smilies A02-A03 enter"/> + </layer> + </pre> + <hr> + <h3> + 5.14 <a name="Element_switch" href="#Element_switch">Element: + switch</a> + </h3> + <p>The switch element describes a function key + that has been included in the layout. It specifies which layer + pressing the key switches you to and also what the key looks like.</p> + <dl> + <dt>Attribute: iso (required)</dt> + <dd>The keyId as specified in one of the row elements. 
This must + be a specialKey and not an ISOKey.</dd> + </dl> + <dl> + <dt>Attribute: layout (required)</dt> + <dd>The modifier attribute of the resulting layout element that + describes the layer the user gets switched to.</dd> + </dl> + <dl> + <dt>Attribute: display (required)</dt> + <dd>A string to be displayed on the key.</dd> + </dl> + <p>Here is an example of a switch element for a + shift key:</p> + <pre> <layer modifier="none"> + <row keys="D01-D10"/> + <row keys="C01-C09"/> + <row keys="shift B01-B07 bksp"/> + <row keys="sym A01 smilies A02-A03 enter"/> + <switch iso="shift" layout="shift" display="&#x21EA;"/> + </layer> + <layer modifier="shift"> + <row keys="D01-D10"/> + <row keys="C01-C09"/> + <row keys="shift B01-B07 bksp"/> + <row keys="sym A01 smilies A02-A03 enter"/> + <switch iso="shift" layout="none" display="&#x21EA;"/> + </layer></pre> + <hr> + <h3> + 5.15 <a name="Element_vkeys" href="#Element_vkeys">Element: vkeys</a> + </h3> + <p>On some architectures, applications may + directly interact with keys before they are converted to characters. + The keys are identified using a virtual key identifier or vkey. The + mapping between a physical keyboard key and a vkey is keyboard-layout + dependent. For example, a French keyboard would identify the D01 key + as being an 'a' with a vkey of 'a' as opposed to 'q' on a US English + keyboard. While vkeys are layout dependent, they are not modifier + dependent. A shifted key always has the same vkey as its unshifted + counterpart. In effect, a key is identified by its vkey and the + modifiers active at the time the key was pressed.</p> + <p>For a physical keyboard there is a layout + specific default mapping of keys to vkeys. These are listed in a + vkeys element which takes a list of vkey element mappings and is + identified by a type. There are different vkey mappings required for + different platforms. 
While type="windows" vkeys are very similar to + type="osx" vkeys, they are not identical and require their own + mapping.</p> + <p>The most common model for specifying vkeys is + to import a standard mapping, say to the US layout, and then to add a + vkeys element to change the mapping appropriately for the specific + layout.</p> + <p>In addition to describing physical keyboards, + vkeys also get used in virtual keyboards. Here the vkey mapping is + local to a layer and therefore a vkeys element may occur within a + layout element. In the case where a layout element has no vkeys + element then the resulting mapping may either be empty (none of the + keys represent keys that have vkey identifiers) or may fallback to + the layout wide vkeys mapping. Fallback only occurs if the layout's + modifier attribute consists only of standard modifiers as listed as + being reserved in the description of the layout/@modifier attribute, + and if the modifiers are standard for the platform involved. So for + Windows, 'cmd' is a reserved modifier but it is not standard for + Windows. Therefore on Windows the vkey mapping for a layout with + @modifier="cmd" would be empty.</p> + <p>A vkeys element consists of a list of vkey + elements.</p> + <hr> + <h3> + 5.16 <a name="Element_vkey" href="#Element_vkey">Element: vkey</a> + </h3> + <p>A vkey element describes a mapping between a + key and a vkey for a particular platform.</p> + <dl> + <dt>Attribute: iso (required)</dt> + <dd>The ISOkey being mapped.</dd> + </dl> + <dl> + <dt>Attribute: type</dt> + <dd>Current values: android, chromeos, osx, und, windows.</dd> + </dl> + <dl> + <dt>Attribute: vkey (required)</dt> + <dd>The resultant vkey identifier.</dd> + </dl> + <dl> + <dt>Attribute: modifier</dt> + <dd>This attribute may only be used if the parent vkeys element + is a child of a layout element. 
If present it allows an unmodified + key from a layer to represent a modified virtual key.</dd> + </dl> + <p>This example shows some of the mappings for a + French keyboard layout:</p> + <pre> <i>shared/win-vkey.xml</i> + <keyboard> + <vkeys type="windows"> + <vkey iso="D01" vkey="VK_Q"/> + <vkey iso="D02" vkey="VK_W"/> + <vkey iso="C01" vkey="VK_A"/> + <vkey iso="B01" vkey="VK_Z"/> + </vkeys> + </keyboard><br> + <i>shared/win-fr.xml</i> + <keyboard> + <import path="shared/win-vkey.xml"> + <keyMap> + <map iso="D01" to="a"/> + <map iso="D02" to="z"/> + <map iso="C01" to="q"/> + <map iso="B01" to="w"/> + </keyMap><br> + <keyMap modifiers="shift"> + <map iso="D01" to="A"/> + <map iso="D02" to="Z"/> + <map iso="C01" to="Q"/> + <map iso="B01" to="W"/> + </keyMap><br> + <vkeys type="windows"> + <vkey iso="D01" vkey="VK_A"/> + <vkey iso="D02" vkey="VK_Z"/> + <vkey iso="C01" vkey="VK_Q"/> + <vkey iso="B01" vkey="VK_W"/> + </vkeys> + </keyboard></pre> + <p>In the context of a virtual keyboard there + might be a symbol layer with the following layout:</p> + <pre> <keyboard> + <keyMap> + <map iso="D01" to="1"/> + <map iso="D02" to="2"/> + ... + <map iso="D09" to="9"/> + <map iso="D10" to="0"/> + <map iso="C01" to="!"/> + <map iso="C02" to="@"/> + ... + <map iso="C09" to="("/> + <map iso="C10" to=")"/> + </keyMap><br> + <layer modifier="sym"> + <row keys="D01-D10"/> + <row keys="C01-C09"/> + <row keys="shift B01-B07 bksp"/> + <row keys="sym A00-A03 enter"/> + <switch iso="sym" layout="none" display="ABC"/> + <switch iso="shift" layout="sym+shift" display="&=/<"/> + <vkeys type="windows"> + <vkey iso="D01" vkey="VK_1"/> + ... + <vkey iso="D10" vkey="VK_0"/> + <vkey iso="C01" vkey="VK_1" modifier="shift"/> + ... 
+ <vkey iso="C10" vkey="VK_0" modifier="shift"/> + </vkeys> + </layer> + </keyboard></pre> + <hr> + <h3> + 5.17 <a + name="Element_transforms" href="#Element_transforms">Element: transforms</a> </h3> <p>This element defines a group of one or more transform elements associated with this keyboard layout. This is used to support - dead-keys using a straightforward structure that works for all the + such as dead-keys using a straightforward structure that works for all the keyboards tested, and that results in readable source data.</p> - <p>There can be multiple <transforms> elements; at this - point the "simple" one is defined.</p> + <p> + There can be multiple <transforms> elements</p> <p>Syntax</p> <p><transforms type="..."></p> <p>{a set of transform elements}</p> <p></transforms></p> - <p>Attribute: type (required)</p> - <p>The value is "simple" for the transforms listed below. People - have legitimate needs for more complex transforms, and more - sophisticated types of transforms may be added over time. (Doing the - more sophisticated transforms would take much more time, since it - would require a thorough survey of the major keyboard mechanisms that - use them, development of a unified mechanism that handles all the - requirements, and coding to ensure sure programmatically mapping - those mechanisms into the standard is possible, and so on.)</p> + <dl> + <dt>Attribute: type (required)</dt> + <dd>Current values: simple, final.</dd> + </dl> + <hr> <h3> - 5.10 <a name="Element_transform" href="#Element_transform">Element: + 5.18 <a + name="Element_transform" href="#Element_transform">Element: transform</a> </h3> - <p>This element must have the transforms element as its parent. - This element represents a single transform that may be performed - using the keyboard layout. A transform is simply a combination of key - presses that gets transformed into one (or more) final characters. 
- For example, in most French keyboards hitting the "^" dead-key - followed by the "e" key produces "ê".</p> + <p> + This element must have the transforms element as its parent. This + element represents a single transform that may be performed using the + keyboard layout. A transform is an + element that specifies a set of conversions from sequences of code + points into one (or more) other code points. For example, in most + French keyboards hitting the "^" dead-key followed by the "e" key + produces "ê". + </p> <p>Syntax</p> <p><transform from="{combination of characters}" to="{output}"></p> - <p>Attribute: from (required)</p> - <p>This is the combination of keys that must be pressed in order - to activate this transform. Each character in this series of - characters must match a character that is located in some chars - attribute in the document.</p> + <dl> + <dt>Attribute: from (required)</dt> + <dd> + The from attribute consists of a sequence of elements. Each element + matches one character and may consist of a codepoint or a UnicodeSet + (both as defined in <a + href="http://www.unicode.org/reports/tr35/#Unicode_Sets">UTS#35 + section 5.3.3</a>). + </dd> + </dl> <p>For example, suppose there are the following transforms:</p> <blockquote> <p>^e → ê</p> @@ -847,7 +1424,7 @@ h2, h3, h4, table { </blockquote> <p>Here's what happens when the user types various sequence characters:</p> - <table cellpadding="0" cellspacing="0" border='1'> + <table> <!-- nocaption --> <tbody> <tr> @@ -888,15 +1465,646 @@ h2, h3, h4, table { </table> <p>Control characters, combining marks and whitespace in this attribute are escaped using the \u{...} notation.</p> - <p>Attribute: to (required)</p> - <p>This attribute represents the characters that are output from - the transform. 
This may be more than one, so you could have - <transform from="´A" to="Fred"/></p> + <dl> + <dt>Attribute: to (required)</dt> + <dd> + This attribute represents the characters that are output from the + transform. The output can contain more than one + character, so you could have <transform from="´A" + to="Fred"/> + </dd> + </dl> <p>Control characters, whitespace (other than the regular space character) and combining marks in this attribute are escaped using the \u{...} notation.</p> <p>Examples</p> <pre><keyboard locale="fr-CA-t-k0-CSA-osx"><br> <transforms type="simple"><br> <transform from="´a" to="á" /><br> <transform from="´A" to="Á" /><br> <transform from="´e" to="é" /><br> <transform from="´E" to="É" /><br> <transform from="´i" to="í" /><br> <transform from="´I" to="Í" /><br> <transform from="´o" to="ó" /><br> <transform from="´O" to="Ó" /><br> <transform from="´u" to="ú" /><br> <transform from="´U" to="Ú" /><br> </transforms><br> ...<br></keyboard><br><keyboard locale="nl-BE-t-k0-chromeos"><br> <transforms type="simple"><br> <transform from="\u{30c}a" to="ǎ" /> <!-- ̌a → ǎ --><br> <transform from="\u{30c}A" to="Ǎ" /> <!-- ̌A → Ǎ --><br> <transform from="\u{30a}a" to="å" /> <!-- ̊a → å --><br> <transform from="\u{30a}A" to="Å" /> <!-- ̊A → Å --><br> </transforms><br> ...<br></keyboard></pre> + <dl> + <dt>Attribute: before (optional)</dt> + <dd>This attribute consists of a sequence of elements (codepoint + or UnicodeSet) to match the text up to the current position in the + text (this is similar to a regex "look behind" assertion: + (?<=a)b matches a "b" that is preceded by an + "a"). The attribute must match for the transform to apply. + If missing, no before constraint is applied. The attribute value + must not be empty.</dd> + </dl> + <dl> + <dt>Attribute: after (optional)</dt> + <dd>This attribute consists of a sequence of elements (codepoint + or UnicodeSet) and matches as a zero-width assertion after the @from + sequence. 
The attribute must match for the transform to apply. If + missing, no after constraint is applied. The attribute value must + not be empty. When the transform is applied, the string matched by + the @from attribute is replaced by the string in the @to attribute, + with the text matched by the @after attribute left unchanged. After + the change, the current position is reset to just after the text + output from the @to attribute and just before the text matched by + the @after attribute. Warning: some legacy implementations may not + be able to make such an adjustment and will place the current + position after the @after matched string.</dd> + </dl> + <dl> + <dt>Attribute: error (optional)</dt> + <dd>If set this attribute indicates that the keyboarding + application may indicate an error to the user in some way. + Processing may stop and rewind to any state before the key was + pressed. If processing does stop, no further transforms on the same + input are applied. The @error attribute takes the value "fail", or + must be absent. If processing continues, the @to is used for output + as normal. 
It thus should contain a reasonable value.</dd> + </dl> + <p>For example:</p> + <blockquote><transform + from="\u037A\u037A" to="\u037A" + error="fail" /></blockquote> + <p>This indicates that it is an error to type two + iota subscripts immediately after each other.</p> + <p>In terms of how these different attributes work + in processing a sequence of transforms, consider the transform:</p> + <blockquote><transform + before="X" from="Y" after="Z" + to="B"/></blockquote> + <p>This would transform the string:</p> + <blockquote>XYZ → XBZ</blockquote> + <p>If we mark where the current match position is + before and after the transform we see:</p> + <blockquote>X | Y Z → X B | Z</blockquote> + <p>And a subsequent transform could transform the + Z string, looking back (using @before) to match the B.</p> + <p>There are other keying behaviors that are + needed, particularly in handling languages and scripts from various + parts of the world. The behaviors intended to be covered by the + transforms are:</p> + <ul> + <li>Reordering combining marks. The order required for + underlying storage may differ considerably from the desired typing + order. In addition, a keyboard may want to allow for different + typing orders.</li> + <li>Error indication. Sometimes a keyboard layout will want to + specify to the application that a particular keying sequence in a + context is in error and that the application should indicate that + that particular keypress is erroneous.</li> + <li>Backspace handling. There are various approaches to handling + the backspace key. 
An application may treat it as an undo of the + last key input, or it may simply delete the last character in the + currently output text, or it may use transform rules to tell it how + much to delete.</li> + </ul> + <p>We consider each transform type in turn and + consider attributes to the <transforms> element pertinent to + that type.</p> + <hr> + <h3> + 5.19 <a name="Element_reorder" href="#Element_reorder">Element: + reorder</a> + </h3> + <p>The reorder transform is applied after all + transforms except for those with type=“final”.</p> + <p>This transform has the job of reordering + sequences of characters that have been typed, from their typed order + to the desired output order. The primary concern in this transform is + to sort combining marks into their correct relative order after a + base, as described in this section. The reorder transforms can be + quite complex, so keyboard layouts will almost always import them.</p> + <p>The reordering algorithm consists of four + parts:</p> + <ol> + <li>Create a sort key for each character in the input string. A + sort key has 4 parts: (primary, index, tertiary, quaternary). + <ul> + <li>The <b>primary weight</b> is the primary order value. + </li> + <li>The <b>secondary weight</b> is the index, a position in + the input string, usually of the character itself, but it may be + of a character earlier in the string. + </li> + <li>The <b>tertiary weight</b> is a tertiary order value + (defaulting to 0). + </li> + <li>The <b>quaternary weight</b> is the index of the character + in the string. This ensures a stable sort for sequences of + characters with the same tertiary weight. + </li> + </ul> + </li> + <li>Mark each character as to whether it is a prebase character, + one that is typed before the base and logically stored after. Thus + it will have a primary order > 0.</li> + <li>Use the sort key and the prebase mark to identify runs. 
A + run starts with a prefix that contains any prebase characters and a + single base character whose primary and tertiary key is 0. The run + extends until, but not including, the start of the prefix of the + next run or end of the string. + <ul> + <li>run := prebase* (primary=0 && tertiary=0) ((primary≠0 || + tertiary≠0) && !prebase)*</li> + </ul> + </li> + <li>Sort the character order of each character in the run based + on its sort key.</li> + </ol> + <p>The primary order of a character with the + Unicode property Combining_Character_Class (ccc) of 0 may well not be + 0. In addition, a character may receive a different primary order + dependent on context. For example, in the Devanagari sequence ka + halant ka, the first ka would have a primary order 0 while the halant + ka sequence would give both halant and the second ka a primary order + > 0, for example 2. Note that “base” character in this discussion + is not a Unicode base character. It is instead a character with + primary=0.</p> + <p>In order to get the characters into the correct + relative order, it is necessary not only to order combining marks + relative to the base character, but also to order some combining + marks in a subsequence following another combining mark. For example + in Devanagari, a nukta may follow consonant character, but it may + also follow a conjunct consisting of a consonant, halant, consonant. + Notice that the second consonant is not, in this model, the start of + a new run because some characters may need to be reordered to before + the first base, for example repha. The repha would get primary < + 0, and be sorted before the character with order = 0, which is, in + the case of Devanagari, the initial consonant of the orthographic + syllable.</p> + <p>The reorder transform consists of a single + element type: <reorder> encapsulated in a <reorders> + element. 
Each is a rule that matches against a string of characters + with the action of setting the various ordering attributes (primary, + tertiary, tertiary_base, prebase) for the matched characters in the + string.</p> + <blockquote> + <p> + <strong>from</strong> This attribute follows the transform/@from + attribute and contains a string of elements. Each element matches + one character and may consist of a codepoint or a UnicodeSet (both + as defined in UTS#35 section 5.3.3). This attribute is required. + </p> + <p> + <strong>before</strong> This attribute follows the transform/@before + attribute and contains the element string that must match the string + immediately preceding the start of the string that the @from + matches. + </p> + <p> + <strong>after</strong> This attribute follows the transform/@after + attribute and contains the element string that must match the string + immediately following the end of the string that the @from matches. + </p> + <p> + <strong>order</strong> This attribute gives the primary order for + the elements in the matched string in the @from attribute. The value + is a simple integer between -128 and +127 inclusive, or a space + separated list of such integers. For a single integer, it is applied + to all the elements in the matched string. Details of such list type + attributes are given after all the attributes are described. If + missing, the order value of all the matched characters is 0. We + consider the order value for a matched character in the string. 
+ </p> + <ul> + <li>If the value is 0 and its tertiary value is + 0, then the character is the base of a new run.</li> + <li>If the value is 0 and its tertiary value is + non-zero, then it is a normal character in a run, with ordering + semantics as described in the @tertiary attribute.</li> + <li>If the value is negative, then the + character is a primary character and will reorder to be before the + base of the run.</li> + <li>If the value is positive, then the + character is a primary character and is sorted based on the order + value as the primary key following a previous base character.</li> + </ul> + <p>A character with a zero tertiary value is a + primary character and receives a sort key consisting of:</p> + <ul> + <li>Primary weight is the order value</li> + <li>Secondary weight is the index of the + character. This may be any value (character index, codepoint index) + such that its value is greater than the character before it and + less than the character after it.</li> + <li>Tertiary weight is 0.</li> + <li>Quaternary weight is the same as the + secondary weight.</li> + </ul> + <p> + <strong>tertiary</strong> This attribute gives the tertiary order + value to the characters matched. The value is a simple integer + between -128 and +127 inclusive, or a space separated list of such + integers. If missing, the value for all the characters matched is 0. + We consider the tertiary value for a matched character in the + string. + </p> + <ul> + <li>If the value is 0 then the character is + considered to have a primary order as specified in its order value + and is a primary character.</li> + <li>If the value is non zero, then the order + value must be zero otherwise it is an error. The character is + considered as a tertiary character for the purposes of ordering.</li> + </ul> + <p>A tertiary character receives its primary + order and index from a previous character, which it is intended to + sort closely after. 
The sort key for a tertiary character consists + of:</p> + <ul> + <li>Primary weight is the primary weight of the + primary character</li> + <li>Secondary weight is the index of the + primary character, not the tertiary character</li> + <li>Tertiary weight is the tertiary value for + the character.</li> + <li>Quaternary weight is the index of the + tertiary character.</li> + </ul> + <p> + <strong>tertiary_base</strong> This attribute is a space separated + list of "true" or "false" values corresponding + to each character matched. It is illegal for a tertiary character to + have a true tertiary_base value. For a primary character it marks + that this character may have tertiary characters moved after it. + When calculating the secondary weight for a tertiary character, the + most recently encountered primary character with a true + tertiary_base attribute is used. Primary characters with an @order + value of 0 automatically are treated as having tertiary_base true + regardless of what is specified for them. + </p> + <p> + <strong>prebase</strong> This attribute gives the prebase attribute + for each character matched. The value may be "true" or + "false" or a space separated list of such values. If + missing the value for all the characters matched is false. It is + illegal for a tertiary character to have a true prebase value. + </p> + <p>If a primary character has a true prebase + value then the character is marked as being typed before the base + character of a run, even though it is intended to be stored after + it. The primary order gives the position after the base character + at which the prebase character will end up. Thus + @order may not be 0. These characters are part of the run prefix.
+ If such characters are typed then, in order to give the run a base + character after which characters can be sorted, an appropriate base + character, such as a dotted circle, is inserted into the output run, + until a real base character has been typed. A value of + "false" indicates that the character is not a prebase.</p> + </blockquote> + <p>There is no @error attribute.</p> + <p>For @from attributes with a match string length + greater than 1, the sort key information (@order, @tertiary, + @tertiary_base, @prebase) may consist of a space separated list of + values, one for each element matched. The last value is repeated to + fill out any missing values. Such a list may not contain more values + than there are elements in the @from attribute:</p> + <pre> if len(@from) < len(@list) then error<br> else + while len(@from) > len(@list)<br> append lastitem(@list) to @list<br> endwhile + endif</pre> + <p>For example, consider the Northern Thai + (nod-Lana) word: ᨡ᩠ᩅᩫ᩶ 'roasted'. This is ideally encoded as the + following:</p> + <table class='simple'> + <tr> + <th>name</th> + <td><em>ka</em></td> + <td><em>asat</em></td> + <td><em>wa</em></td> + <td><em>o</em><em></em></td> + <td><em>t2</em></td> + </tr> + <tr> + <th>code</th> + <td>1A21</td> + <td>1A60</td> + <td>1A45</td> + <td>1A6B<em></em></td> + <td>1A76</td> + </tr> + <tr> + <th>ccc</th> + <td>0</td> + <td>9</td> + <td>0</td> + <td>0<em></em></td> + <td>230</td> + </tr> + + </table> + <p>(That sequence is already in NFC format.)</p> + <p>Some users may type the upper component of the + vowel first, and the tone before or after the lower component.
Thus + someone might type it as:</p> + <table class='simple'> + <tr> + <th>name</th> + <td><em>ka</em></td> + <td><em>o</em><em></em></td> + <td><em>t2</em></td> + <td><em>asat</em></td> + <td><em>wa</em></td> + </tr> + <tr> + <th>code</th> + <td>1A21</td> + <td>1A6B<em></em></td> + <td>1A76</td> + <td>1A60</td> + <td>1A45</td> + </tr> + <tr> + <th>ccc</th> + <td>0</td> + <td>0<em></em></td> + <td>230</td> + <td>9</td> + <td>0</td> + </tr> + </table> + <p>The Unicode NFC format of that typed value + reorders to:</p> + <table class='simple'> + <tr> + <th>name</th> + <td><em>ka</em></td> + <td><em>o</em><em></em></td> + <td><em>asat</em></td> + <td><em>t2</em></td> + <td><em>wa</em></td> + </tr> + <tr> + <th>code</th> + <td>1A21</td> + <td>1A6B<em></em></td> + <td>1A60</td> + <td>1A76</td> + <td>1A45</td> + </tr> + <tr> + <th>ccc</th> + <td>0</td> + <td>0<em></em></td> + <td>9</td> + <td>230</td> + <td>0</td> + </tr> + </table> + <p> + Finally, the user might also type in the sequence with the tone <em>after</em> + the lower component. + </p> + <table class='simple'> + <tr> + <th>name</th> + <td><em>ka</em></td> + <td><em>o</em><em></em></td> + <td><em>asat</em></td> + <td><em>wa</em></td> + <td><em>t2</em></td> + </tr> + <tr> + <th>code</th> + <td>1A21</td> + <td>1A6B<em></em></td> + <td>1A60</td> + <td>1A45</td> + <td>1A76</td> + </tr> + <tr> + <th>ccc</th> + <td>0</td> + <td>0<em></em></td> + <td>9</td> + <td>0</td> + <td>230</td> + </tr> + </table> + <p>(That sequence is already in NFC format.)</p> + <p>We want all of these sequences to end up + ordered as the first. 
To do this, we use the following rules:</p> + <pre> <reorder from="\u1A60" order="127"/> <!-- max possible order --> + <reorder from="\u1A6B" order="42"/> + <reorder from="[\u1A75-\u1A7C]" order="55"/><br> <reorder before="\u1A6B" from="\u1A60\u1A45" order="10"/><br> <reorder before="\u1A6B[\u1A75-\u1A7C]" from="\u1A60\u1A45" order="10"/><br> <reorder before="\u1A6B" from="\u1A60[\u1A75-\u1A7C]\u1A45" order="10 55 10"/></pre> + <p> + The first reorder is the default ordering for the <i>asat</i> which + allows for it to be placed anywhere in a sequence, but moves any + non-consonants that may immediately follow it, back before it in the + sequence. The next two rules give the orders for the top vowel + component and tone marks respectively. The next three rules give the + <i>asat</i> and <i>wa</i> characters a primary order that places them + before the <em>o</em>. Notice particularly the final reorder rule + where the <i>asat</i>+<i>wa</i> is split by the tone mark. This rule + is necessary in case someone types into the middle of previously + normalized text. + </p> + <p><reorder> elements are priority ordered + based first on the length of the string their @from attribute matches and + then the sum of the lengths of the strings their @before and @after + attributes match.</p> + <p>If a layout has two <transforms> elements + of type reorder, e.g. from importing one and specifying the second, + then their <reorder> elements are merged. The @from string in a + <reorder> element describes a set of strings that it matches. + This also holds for the @before and @after attributes. The + intersection of two <reorder> elements consists of the + intersections of their @from, @before and @after string sets.
It is + illegal for the intersection between any two <reorder> elements + in the same <transforms> element to be non empty, although + implementors are encouraged to have pity on layout authors when + reporting such errors, since they can be hard to track down.</p> + <p>If two <reorder> elements in two + different <transforms> elements have a non empty intersection, + then they are split and merged. They are split such that where there + were two <reorder> elements, there are, in effect (but not + actuality), three elements consisting of:</p> + <ul> + <li>@from, @before, @after that match the + intersection of the two rules. The other attributes are merged, as + described below.</li> + <li>@from, @before, @after that match the set of + strings in the first rule not in the intersection with the other + attributes from the first rule.</li> + <li>@from, @before, @after that match the set of + strings in the second rule not in the intersection, with the other + attributes from the second rule.</li> + </ul> + <p>When merging the other attributes, the second + rule is taken to have priority (occurring later in the layout + description file). Where the second rule does not define the value + for a character but the first does, it is taken from the first rule, + otherwise it is taken from the second rule.</p> + <p>Notice that it is possible for two rules to + match the same string, but for them not to merge because the + distribution of the string across @before, @from, and @after is + different. 
For example:</p> + <pre> <reorder before="ab" from="cd" after="e"/></pre> + <p>would not merge with:</p> + <pre> <reorder before="a" from="bcd" after="e"/></pre> + <p>When two <reorders> elements merge as the + result of an import, the resulting reorder elements are sorted into + priority order for matching.</p> + <p>Consider this fragment from a shared reordering + for the Myanmar script:</p> + <pre><!-- medial-r --> + <reorder from="\u103C" order="20"/> + +<!-- [medial-wa or shan-medial-wa] --> + <reorder from="[\u103D\u1082]" order="25"/> + +<!-- [medial-ha or shan-medial-wa]+asat = Mon <i>asat</i> --><br> <reorder from="[\u103E\u1082]\u103A" order="27"/> + +<!-- [medial-ha or mon-medial-wa] --><br> <reorder from="[\u103E\u1060]" order="27"/> + +<!-- [e-vowel or shan-e-vowel] --><br> <reorder from="[\u1031\u1084]" order="30"/> +<br> <reorder from="[\u102D\u102E\u1033-\u1035\u1071-\u1074\u1085\u109D\uA9E5]" order="35"/></pre> + <p>A particular Myanmar keyboard layout can have + this reorders element:</p> + <pre><reorders type="reorder"><br><!-- Kinzi --> + <reorder from="\u1004\u103A\u1039" order="-1"/> + +<!-- e-vowel --> + <reorder from="\u1031" prebase="1"/> + +<!-- medial-r --> + <reorder from="\u103C" prebase="1"/><br></reorders></pre> + <p>The effect of this is that the <em>e-vowel</em> will be identified as a prebase and will have an order of 30. + Likewise, a <em>medial-r</em> will be identified as a prebase and will have an + order of 20. Notice that a <em>shan-e-vowel</em> will not be identified as a prebase + (even if it should be!). The <em>kinzi</em> is described in the layout since + it moves something across a run boundary. By separating such + movements (prebase or moving to in front of a base) from the shared + ordering rules, the shared ordering rules become a self-contained + combining order description that can be used in other keyboards or + even in contexts other than keyboarding.
 </p> + <hr> + <h3> + 5.20 <a name="Element_final" href="#Element_final">Element: final</a> + </h3> + <p>The final transform is applied after the + reorder transform. It executes in a similar way to the simple + transform with the settings ignored, as if there were no settings in + the <settings> element.</p> + <p>This is an example from Khmer where split + vowels are combined after reordering.</p> + <pre> + <transforms type="final"> + <transform from="\u17C1\u17B8" to="\u17BE"/> + <transform from="\u17C1\u17B6" to="\u17C4"/> + </transforms></pre> + <p>Another example allows a keyboard + implementation to alert or stop people typing two lower vowels in a + Burmese cluster:</p> + <pre> <transform from="[\u102F\u1030\u1048\u1059][\u102F\u1030\u1048\u1059]" error="fail"/></pre> + <hr> + <h3> + 5.21 <a name="Element_backspaces" href="#Element_backspaces">Element: + backspaces</a> + </h3> + <p>The backspace transform is an optional + transform that is not applied on input of normal characters, but is + only used to perform extra backspace modifications to previously + committed text.</p> + <p>Keyboarding applications typically, but are not + required to, work in one of two modes:</p> + <dl> + <dt> + <b>text entry</b> + </dt> + <dd>text entry happens while a user is typing new text. A user + typically wants the backspace key to undo whatever they last typed, + whether or not they typed things in the 'right' order.</dd> + </dl> + <dl> + <dt> + <b>text editing</b> + </dt> + <dd>text editing happens when a user moves the cursor into some + previously entered text which may have been entered by someone else. + As such, there is no way to know in which order things were typed, + but a user will still want appropriate behaviour when they press + backspace.
This may involve deleting more than one character or + replacing a sequence of characters with a different sequence.</dd> + </dl> + <p>In the text entry mode, there is no need for + any special description of backspace behaviour. A keyboarding + application will typically keep a history of previous output states + and just revert to the previous state when backspace is hit.</p> + <p>In text editing mode, different keyboard + layouts may behave differently in the same textual context. The + backspace transform allows the keyboard layout to specify the effect + of pressing backspace in a particular textual context. This is done + by specifying a set of backspace rules that match a string before the + cursor and replace it with another string. The rules are expressed as + backspace elements encapsulated in a backspaces element.</p> + <hr> + <h3> + 5.22 <a name="Element_backspace" href="#Element_backspace">Element: + backspace</a> + </h3> + <p>The backspace element has the same @before, + @from, @after, @to, and @error attributes as the transform element. The @to is + optional with backspace.</p> + <p>For example, consider deleting a Devanagari + ksha:</p> + <pre> + <backspaces> + <backspace from="\u0915\u094D\u0936"/> + </backspaces></pre> + <p>Here there is no @to attribute since the whole + string is being deleted.
This is not uncommon in the backspace + transforms.</p> + <p>A more complex example comes from a Burmese + visually ordered keyboard:</p> + <pre> <backspaces> +<!-- Kinzi --><br> <backspace from="[\u1004\u101B\u105A]\u103A\u1039"/> + +<!-- subjoined consonant --><br> <backspace from="\u1039[\u1000-\u101C\u101E\u1020\u1021\u1050\u1051\u105A-\u105D]"/> +<br><!-- tone mark --> + <backspace from="\u102B\u103A"/> +<br><!-- Handle prebases --> +<!-- diacritics stored before e-vowel --><br> <backspace from="[\u103A-\u103F\u105E-\u1060\u1082]\u1031" to="\u1031"/> + +<!-- diacritics stored before medial r --><br> <backspace from="[\u103A-\u103B\u105E-\u105F]\u103C" to="\u103C"/> +<br><!-- subjoined consonant before e-vowel --> + <backspace from="\u1039[\u1000-\u101C\u101E\u1020\u1021]\u1031" to="\u1031"/> +<br><!-- base consonant before e-vowel --> + <backspace from="[\u1000-\u102A\u103F-\u1049\u104E]\u1031" to="\uFDDF\u1031"/> +<br><!-- subjoined consonant before medial r --> + <backspace from="\u1039[\u1000-\u101C\u101E\u1020\u1021]\u103C" to="\u103C"/> +<br><!-- base consonant before medial r --> + <backspace from="[\u1000-\u102A\u103F-\u1049\u104E]\u103C" to="\uFDDF\u103C"/> +<br><!-- delete lone medial r or e-vowel --> + <backspace from="\uFDDF[\u1031\u103C]"/><br></backspaces></pre> + <p>The above example is simplified, and doesn't fully handle the interaction between medial-r and e-vowel.</p> + <p>The character \uFDDF does not represent a + literal character, but is instead a special placeholder, a + "filler string". When a keyboard implementation handles a + user pressing a key that inserts a prebase character, it also has to + insert a special filler string before the prebase to ensure that the + prebase character does not combine with the previous cluster. See the + reorder transform for details. The precise filler string is + implementation dependent. 
Rather than requiring keyboard layout + designers to know what the filler string is, we reserve a special + character that the keyboard layout designer may use to reference this + filler string. It is up to the keyboard implementation to, in effect, + replace that character with the filler string.</p> + <p>The first three transforms above delete various + ligatures with a single keypress. The other transforms handle prebase + characters. There are two in this Burmese keyboard. The transforms + delete the characters preceding the prebase character up to the base, + which gets replaced with the prebase filler string, which represents + a null base. Finally, the prebase filler string + prebase is deleted + as a unit.</p> + <p>The backspace transform is much like other + transforms except in its processing model. If we consider the same + transform as in the simple transform example, but as a backspace:</p> + <blockquote><backspace + before="X" from="Y" after="Z" + to="B"/></blockquote> + <p>This would transform the string:</p> + <blockquote>XYZ → XBZ</blockquote> + <p>If we mark where the current match position is + before and after the transform we see:</p> + <blockquote>X Y | Z → X B | Z</blockquote> + <p>Whereas a simple or final transform would then + run other transforms in the transform list, advancing the processing + position until it gets to the end of the string, the backspace + transform only matches a single backspace rule and then finishes.</p> + <hr> <h2> 6 <a name="Element_Heirarchy_Platform_File" href="#Element_Heirarchy_Platform_File">Element Hierarchy - @@ -940,15 +2148,19 @@ h2, h3, h4, table { <p>Syntax</p> <p><map keycode="{hardware keycode}" iso="{ISO layout position}"/></p> - <p>Attribute: keycode (required)</p> - <p>The hardware key code value of the key.
This value is an - integer which is provided by the keyboard driver.</p> - <p>Attribute: iso (required)</p> - <p>The corresponding position of a key using the ISO layout - convention where rows are identified by letters and columns are - identified by numbers. For example, "D01" corresponds to the "Q" key - on a US keyboard. (See the definition at the beginning of the - document for a diagram).</p> + <dl> + <dt>Attribute: keycode (required)</dt> + <dd>The hardware key code value of the key. This value is an + integer which is provided by the keyboard driver.</dd> + </dl> + <dl> + <dt>Attribute: iso (required)</dt> + <dd>The corresponding position of a key using the ISO layout + convention where rows are identified by letters and columns are + identified by numbers. For example, "D01" corresponds to the "Q" key + on a US keyboard. (See the definition at the beginning of the + document for a diagram).</dd> + </dl> <p>Examples</p> <pre><platform><br> <hardwareMap><br> <map keycode="2" iso="E01" /><br> <map keycode="3" iso="E02" /><br> <map keycode="4" iso="E03" /><br> <map keycode="5" iso="E04" /><br> <map keycode="6" iso="E05" /><br> <map keycode="7" iso="E06" /><br> <map keycode="41" iso="E00" /><br> </hardwareMap><br></platform></pre> <h2> @@ -991,7 +2203,7 @@ h2, h3, h4, table { the following simplification rules. 
<br> </li> </ol> - <table cellpadding="0" cellspacing="0" border='1'> + <table> <!-- nocaption --> <tbody> <tr> @@ -1027,7 +2239,7 @@ h2, h3, h4, table { </tr> </tbody> </table> - <table cellpadding="0" cellspacing="0" border='1'> + <table> <!-- nocaption --> <tbody> <tr> @@ -1089,7 +2301,7 @@ h2, h3, h4, table { </h2> <p>Here is a list of the data sources used to generate the initial key map layouts:</p> - <table cellpadding="0" cellspacing="0" border='1'> + <table> <caption> <a name="Key_Map_Data_Sources" href="#Key_Map_Data_Sources">Key Map Data Sources</a> @@ -1199,7 +2411,7 @@ h2, h3, h4, table { href="#Platform_Behaviors_in_Edge_Cases">Platform Behaviors in Edge Cases</a> </h2> - <table cellpadding="0" cellspacing="0" border="1"> + <table> <!-- nocaption --> <tbody> <tr> @@ -1239,19 +2451,17 @@ h2, h3, h4, table { <hr> <p class="copyright"> - Copyright © 2001–2017 Unicode, Inc. All - Rights Reserved. The Unicode Consortium makes no expressed or implied - warranty of any kind, and assumes no liability for errors or - omissions. No liability is assumed for incidental and consequential - damages in connection with or arising out of the use of the - information or programs contained or accompanying this technical - report. The Unicode <a href="http://unicode.org/copyright.html">Terms - of Use</a> apply. + Copyright © 2001–2018 Unicode, Inc. All Rights Reserved. The Unicode + Consortium makes no expressed or implied warranty of any kind, and + assumes no liability for errors or omissions. No liability is assumed + for incidental and consequential damages in connection with or + arising out of the use of the information or programs contained or + accompanying this technical report. The Unicode <a + href="http://unicode.org/copyright.html">Terms of Use</a> apply. </p> <p class="copyright">Unicode and the Unicode logo are trademarks of Unicode, Inc., and are registered in some jurisdictions.</p> </div> </body> - -</html> +</html>
\ No newline at end of file diff --git a/specs/ldml/tr35-numbers.html b/specs/ldml/tr35-numbers.html index da13168..6f58a6b 100644 --- a/specs/ldml/tr35-numbers.html +++ b/specs/ldml/tr35-numbers.html @@ -90,11 +90,11 @@ h2, h3, h4, table { Unicode Locale Data Markup Language (LDML)<br> Part 3: Numbers </h1> - <!-- This header table should be identical across the parts of this UTS. --> + <!-- At least the first row of this header table should be identical across the parts of this UTS. --> <table border="1" cellpadding="2" cellspacing="0" class="wide"> <tr> <td>Version</td> - <td>32</td> + <td>33</td> </tr> <tr> <td>Editors</td> @@ -130,7 +130,7 @@ h2, h3, h4, table { <i>Status</i> </h3> - <!-- NOT YET APPROVED + <!-- NOT YET APPROVED <p> <i class="changed">This is a<b><font color="#ff3333"> draft </font></b>document which may be updated, replaced, or superseded by @@ -392,7 +392,7 @@ h2, h3, h4, table { </h3> <p> <span class="dtd"><!ELEMENT defaultNumberingSystem ( - #PCDATA )></span> + #PCDATA )></span> </p> <p>This element indicates which numbering system should be used for presentation of numeric quantities in the given locale.</p> @@ -2846,7 +2846,7 @@ decimalValue = value ('.' value)?<br>digit = 0|1|2|3|4|5|6|7|8| <hr> <p class="copyright"> - Copyright © 2001–2017 Unicode, Inc. All + Copyright © 2001–2018 Unicode, Inc. All Rights Reserved. The Unicode Consortium makes no expressed or implied warranty of any kind, and assumes no liability for errors or omissions. 
No liability is assumed for incidental and consequential diff --git a/specs/ldml/tr35.html b/specs/ldml/tr35.html index 8627e1c..bb5bc7c 100644 --- a/specs/ldml/tr35.html +++ b/specs/ldml/tr35.html @@ -88,17 +88,15 @@ h5 { </table> <div class="body"> <h2 style="text-align: center"> - Unicode Technical - Standard #35 + Unicode Technical Standard #35 </h2> - <h1 style="text-align: center">Unicode Locale Data Markup - Language (LDML)</h1> + <h1 style="text-align: center">Unicode Locale Data Markup Language (LDML)</h1> - <!-- This header table should be identical across the parts of this UTS. --> + <!-- At least the first row of this header table should be identical across the parts of this UTS. --> <table border="1" cellpadding="2" cellspacing="0" class="wide"> <tr> <td>Version</td> - <td>32</td> + <td>33</td> </tr> <tr> <td>Editors</td> @@ -110,18 +108,18 @@ h5 { </tr> <tr> <td>Date</td> - <td>2017-11-01</td> + <td>2018-03-25</td> </tr> <tr> <!-- This link must be made live when posting the final version but is disabled during proposed update stage. 
--> <td>This Version</td> <td> - <a href="http://www.unicode.org/reports/tr35/tr35-49/tr35.html">http://www.unicode.org/reports/tr35/tr35-49/tr35.html</a></td> + <a href="http://www.unicode.org/reports/tr35/tr35-51/tr35.html">http://www.unicode.org/reports/tr35/tr35-51/tr35.html</a></td> </tr> <tr> <td>Previous Version</td> <td> - <a href="http://www.unicode.org/reports/tr35/tr35-47/tr35.html">http://www.unicode.org/reports/tr35/tr35-47/tr35.html</a></td> + <a href="http://www.unicode.org/reports/tr35/tr35-49/tr35.html">http://www.unicode.org/reports/tr35/tr35-49/tr35.html</a></td> </tr> <tr> <td>Latest Version</td> @@ -141,12 +139,12 @@ h5 { </tr> <tr> <td>DTDs</td> - <td><a href="http://unicode.org/cldr/dtd/32/"> - http://unicode.org/cldr/dtd/32/</a></td> + <td><a href="http://unicode.org/cldr/dtd/33/"> + http://unicode.org/cldr/dtd/33/</a></td> </tr> <tr> <td>Revision</td> - <td><a href="#Modifications">49</a></td> + <td><a href="#Modifications">51</a></td> </tr> </table> <h3> @@ -163,7 +161,7 @@ h5 { <i>Status</i> </h3> - <!-- NOT YET APPROVED + <!-- NOT YET APPROVED <p> <i class="changed">This is a<b><font color="#ff3333"> draft </font></b>document which may be updated, replaced, or superseded by @@ -465,6 +463,7 @@ h5 { <ul class="toc"> <li>6.1 <a href="#Script_Metadata">Script Metadata</a></li> <li>6.2 <a href="#Extended_Pictographic">Extended Pictographic</a></li> + <li>6.3 <a href="#Labels.txt">Labels.txt</a></li> </ul> </li> <li>7 <a href="#Format_Parse_Issues">Issues in Formatting @@ -5963,6 +5962,79 @@ root</pre> href="http://www.unicode.org/reports/tr41/#UTS18">UTS18</a>]. A UnicodeSet may be used in specifications outside of the domain of LDML. 
In such a case, the specification may support a subset of the syntax provided here.</p> + <p>The following provides EBNF syntax for a UnicodeSet:</p> + <div align='center'> + <table class='simple'> +<tr> + <th>Symbol</th> + <th>Expression</th> + <th>Examples</th> +</tr> +<tr><th>root</th> + <td><code>= prop <br>| '[-]' <br>| '[' [\-\^]? s seq+ ']'</code></td> + <td>\p{x=y},<br> + [abc]</td> +</tr> +<tr><th>seq</th> + <td><code>= root (s [\&\-] s root)* s <br>| range s</code></td> + <td>[abc]-[cde], a <br></td> +</tr> +<tr><th>range</th> + <td><code>= char ('-' char)? <br>| '{' (s char)+ s '}'</code></td> + <td>a, a-c, {abc}</td> +</tr> +<tr><th>prop</th> + <td><code>= '\\' [pP] '{' propName ([≠=] s value1+)? '}' <br>| '[:' '^'? propName ([≠=] s value2+)? ':]'</code></td> + <td>\p{x=y}, [:x=y:]<br></td> +</tr> +<tr><th>propName</th> + <td><code>= s [A-Za-z0-9] [A-Za-z0-9_\x20]* s</code></td> + <td>General_Category,<br> + General Category</td> +</tr> +<tr><th>value1</th> + <td><code>= [^\}] <br> + | '\\' quoted </code></td> + <td>Lm,<br> + \n,<br> + \}</td> +</tr> +<tr><th>value2</th> + <td><code>= [^:] <br> + | '\\' quoted</code></td> + <td>Lm,<br> + \n,<br> + \:</td> +</tr> +<tr><th>char</th> + <td><code>= [^\& \- \[ \[ \] \\ \} \{ [:Pat_WS:]] <br> + | '\\' quoted</code></td> + <td>a, b, c, \n</td> +</tr> +<tr><th>quoted</th> +<td><code>= 'u' (hex{4} | bracketedHex) <br> + | 'x' (hex{2} | bracketedHex) <br> | 'U00' ('0' hex{5} | '10' hex{4}) <br>| 'N{' propName '}' <br>| [\u0000-\U0010FFFF]</code></td> +<td> </td> +</tr> +<tr><th>bracketedHex</th> + <td><code>= '{' s hexCodePoint (s hexCodePoint)* s '}'</code></td> + <td>{61 2019 62}</td> +</tr> +<tr><th>hexCodePoint</th> + <td><code>= hex{1,5} | '10' hex{4}</code></td> + <td> </td> +</tr> +<tr><th>hex</th> + <td><code>= [0-9A-Fa-f]</code></td> + <td> </td> +</tr> +<tr><th>s</th> + <td><code>= [:Pattern_White_Space:]*</code></td> + <td>optional whitespace</td> +</tr> + </table> +</div> + <p>Some constraints on
UnicodeSet syntax are not captured by this EBNF. Notably, property names and values are restricted to those supported by the implementation.</p> <p>The syntax characters are listed in the table below:</p> <table> <tbody> @@ -6076,9 +6148,14 @@ root</pre> </p> <table class='simple'> <tr> - <td>\x{h...h}</td> - <td>1-6 hex digits ([0-9A-Fa-f])</td> - </tr> + <td>\x{h...h}<br> + \u{h...h}</td> + <td>list of code points, each 1-6 hex digits ([0-9A-Fa-f]), separated by spaces</td> + </tr> + <tr> + <td>\xhh</td> + <td>1-2 hex digits</td> + </tr> <tr> <td>\uhhhh</td> <td>Exactly 4 hex digits</td> @@ -6088,10 +6165,6 @@ root</pre> <td>Exactly 8 hex digits</td> </tr> <tr> - <td>\xhh</td> - <td>1-2 hex digits</td> - </tr> - <tr> <td>\a</td> <td>U+0007 (BEL / ALERT)</td> </tr> @@ -6296,6 +6369,10 @@ root</pre> "ab" and "ac"</td> </tr> <tr> + <td nowrap>[x\u{61 2019 62}y]</td> + <td>Equivalent to [x\u0061\u2019\u0062y] (= [xa’by])</td> + </tr> + <tr> <td nowrap>[{ax}-{bz}]</td> <td>The set containing [{ax} {ay} {az} {bx} {by} {bz}], using the range syntax to get all the strings from {ax} to {bz} as @@ -6544,21 +6621,23 @@ japanese arabic civil-arabic thai-buddhist persian roc</variable></pre> blockingItems elements NMTOKENS #IMPLIED > </p> <p> - The blockingItems indicate which elements (and their child elements) + The blockingItems were used to indicate which elements (and their child elements) do not inherit. For example, because supplementalData is a blocking item, all paths containing the element <span class="element">supplementalData</span> - do not inherit. + do not inherit. However, <strong>the <blockingItems> element is now deprecated,</strong> + having been replaced by the annotations in the DTD and the DTDData classes in CLDR tooling.
</p> <pre class="dtd"><!ELEMENT distinguishingItems EMPTY > <!ATTLIST distinguishingItems exclude ( true | false ) #IMPLIED > <!ATTLIST distinguishingItems elements NMTOKENS #IMPLIED > <!ATTLIST distinguishingItems attributes NMTOKENS #IMPLIED ></pre> <p> - The distinguishing items indicate which combinations of elements and + The distinguishing items were used to indicate which combinations of elements and attributes (in unblocked environments) are <i>distinguishing</i> in performing inheritance. For example, the attribute type is distinguishing <i>except</i> in combination with certain elements, - such as in: + such as in the following. However, <strong>the <distinguishingItems> element is now deprecated,</strong> + having been replaced by the annotations in the DTD and the DTDData classes in CLDR tooling. </p> <pre><distinguishingItems exclude="true" @@ -6779,20 +6858,33 @@ decimal?, group?, special*)) ></pre> <p>Each file has a header that explains the format and usage of the data.</p> <h3><a name="Script_Metadata" href="#Script_Metadata">6.1 Script Metadata</a></h3> - <p><code>scriptMetadata.txt</code>: This file provides general information about scripts that may be useful to implementations processing text. The information is the best currently available, and may change between versions of CLDR.</p> - <h3><a name="Extended_Pictographic" href="#Extended_Pictographic">6.2 Extended Pictographic</a></h3> - <p><code>ExtendedPictographic.txt</code>: This file defines the Extended_Pictographic (EP) property, a binary code point property for characters that are pictographic (or otherwise similar in kind to characters with the Emoji property) and used to customize segmentation so that possible future emoji zwj sequences will not break grapheme clusters, words, or lines. 
It also includes unassigned codepoints that are in blocks intended for use for emoji characters, added to the Unicode 9.0 Linebreak property.</p> - <p>It is used in the following customized rules for UAX #29 and UAX #14: </p> - <blockquote> - <p>Let Extended_Pictographic be defined as in <strong>ExtendedPictographic.txt</strong> </p> - <p>Let EmojiRK = [\p{GCB=Regional_Indicator}[*#0-9©®™〰〽]]</p> - <p>Let EmojiNRK = [\p{Emoji=Yes}-EmojiRK]</p> - </blockquote> -<p>The customized rules replacing GB11, WB3c, and LB8a are:</p> -<p> </p> -<p>GB11′ (Extended_Pictographic | EmojiNRK) Extend* ZWJ × (Extended_Pictographic |EmojiNRK)</p> - <p>WB3c′ ZWJ × (Extended_Pictographic | EmojiNRK)</p> - <p>LB8a′ ZWJ × (ID | Extended_Pictographic | EmojiNRK) </p> + <p><code>scriptMetadata.txt</code>: </p> + <p>This file provides general information about scripts that may be useful to implementations processing text. The information is the best currently available, and may change between versions of CLDR. The format is similar to Unicode Character Database property file, and is documented in the header of the data file.</p> + <h3><a name="Extended_Pictographic" href="#Extended_Pictographic">6.2 Extended Pictographic</a> </h3> + <p><code>ExtendedPictographic.txt</code></p> + <p>This file was used to define the ExtendedPictographic data used for “future-proofing” emoji behavior, especially in segmentation. As of Emoji version 11.0, the set of Extended_Pictographic is incorporated into the emoji data files found at <a href="https://unicode.org/Public/emoji/">unicode.org/Public/emoji/</a>.</p> + + + + + + + + + + + + + + + + + + <h3><a name="Labels.txt" href="#Labels.txt">6.3 Labels.txt</a> </h3> + <p><code>labels.txt</code>: </p> + <p>This file provides general information about associations of labels to characters that may be useful to implementations of character-picking applications. The information is the best currently available, and may change between versions of CLDR. 
The format is similar to Unicode Character Database property file, and is documented in the header of the data file.</p> + <p>Initially, the contents are focused on emoji, but may be expanded in the future to other types of characters. Note that a character may have multiple labels.</p> + <h2> <a name="Format_Parse_Issues" href="#Format_Parse_Issues">7 Issues in Formatting and Parsing</a> @@ -8534,106 +8626,98 @@ decimal?, group?, special*)) ></pre> <a name="Modifications" href="#Modifications">Modifications</a> </h2> -<p><b>Revision 49</b></p> -<ul><li>Revised for CLDR Version 32.</li></ul> - <p><strong>Part 1: <a href="tr35.html#Contents">Core</a></strong></p> - <ul> - <li><strong>Section 5.3.3 <a href="#Unicode_Sets">Unicode Sets</a></strong> - <ul> - <li>Added table of syntax characters; removed statement about escaping with '.</li> - </ul> - </li> - </ul> - <p><strong>Part 2: <a href="tr35-general.html#Contents">General</a></strong></p> - <ul> - <li><strong>Section - 3.1 <a href="tr35-general.html#Exemplars">Exemplars</a></strong> - <ul> - <li>Added numbers exemplars. [<a - href="http://unicode.org/cldr/trac/ticket/6783">#6783</a>]</li> - </ul> - </li> - <li><strong>Section 11 <a href="tr35-general.html#ListPatterns">List Patterns</a></strong> - <ul> - <li>Described the listPattern <strong>type</strong> values. 
[<a - href="http://unicode.org/cldr/trac/ticket/8107">#8107</a>]</li> - </ul> - </li> - </ul> - <p><strong>Part 3: <a href="tr35-numbers.html#Contents">Numbers</a></strong></p> - <ul> - <li><strong>5.1.2 <a href="#Relations">Relations</a></strong><a href="#Relations"></a> - <ul> - <li>Fixed typo in example: 3 = 2..4,15</li> - </ul> - </li> - </ul> - <p><strong>Part 4: <a href="tr35-dates.html#Contents">Dates</a></strong></p> - <ul> - <li><strong>Section 2.6.2 <a - href="tr35-dates.html#availableFormats_appendItems">Elements availableFormats, appendItems</a></strong> - <ul> - <li>Added information on the use of day period pattern characters a, b, B - in availableFormats skeletons and patterns. [<a - href="http://unicode.org/cldr/trac/ticket/10233">#10233</a>] - </li> - </ul> - </li> - - <li><strong>Section 2.6.2.1 <a - href="tr35-dates.html#Matching_Skeletons">Matching Skeletons</a></strong> - <ul> - <li>Clarified that automatic expansion of an alphabetic skeleton field does - not expand corresponding pattern fields that are numeric. [<a - href="http://unicode.org/cldr/trac/ticket/10540">#10540</a>] - </li> - </ul> - </li> - - <li><strong>Section 7.1 <a - href="tr35-dates.html#Time_Zone_Format_Terminology">Time Zone Format Terminology</a></strong> - <ul> - <li>In the Note under <strong>Generic location format</strong>, correct - the pattern to VVVV, adjust wording for clarity. [<a - href="http://unicode.org/cldr/trac/ticket/10173">#10173</a>]</li> - <li>Delete the misleading claim that the Generic formats sort well.</li> - </ul> - </li> - - <li><strong>Date Field Symbol Table <a - href="tr35-dates.html#dfst-period">period</a> section</strong> - <ul> - <li>Separated examples of dayperiods themselves from usage examples. 
[<a - href="http://unicode.org/cldr/trac/ticket/10170">#10170</a>] - </li> - </ul> - </li> - </ul> - <p><strong>Part 5: <a href="tr35-collation.html#Contents">Collation</a></strong></p> - <ul> - <li><strong>Section 1.1.5 <a href="#Combining_Rules">Combining Rules</a></strong> - <ul> - <li>Added information on combining rules, showing how to get proper emoji ordering. [<a - href="http://unicode.org/cldr/trac/ticket/10097">#10097</a>] </li> - </ul> - </li> - </ul> - <p><strong>Part 6: <a href="tr35-info.html#Contents">Supplemental</a></strong><a href="tr35-info.html#Contents"></a></p> - <ul> - <li><strong>Section 3.1 <a - href="#Supplemental_Language_Grouping">Supplemental Language Grouping</a></strong><a - href="#Supplemental_Language_Grouping"></a> - <ul> - <li>Added section describing new structure for language groupings</li> - </ul> - </li> - </ul> + +<p><b>Revision 51</b></p> +<p><strong>Part 1: <a href="tr35.html#Contents">Core</a></strong></p> +<ul> + <li>Section 5.3.3 <a href="#Unicode_Sets">Unicode Sets</a> + <ul> + <li>Added EBNF notation, as in <a href="http://www.unicode.org/reports/tr18/#Hex_notation">Hex notation</a>. [<a + href="http://unicode.org/cldr/trac/ticket/10670">#10670</a>]</li> + <li>Added \u{xxx} notation. [<a + href="http://unicode.org/cldr/trac/ticket/10669">#10669</a>]</li> + </ul> + </li> + + <li>Section 5.5 <a href="#Valid_Attribute_Values">Valid Attribute Values</a> + <ul> + <li>Noted that the <blockingItems> and <distinguishingItems> elements are now + deprecated. [<a + href="http://unicode.org/cldr/trac/ticket/10194">#10194</a>]</li> + </ul> + </li> + + <li>Section 6.2 <a href="#Extended_Pictographic">Extended Pictographic</a> + <ul> + <li>Retracted ExtendedPictographic.txt, since it is now part of the emoji data. [<a + href="http://unicode.org/cldr/trac/ticket/10931">#10931</a>]</li> + </ul> + </li> + + <li>Section 6.3 <a href="#Labels.txt">Labels.txt</a> + <ul> + <li>Added new section on character label data. 
[<a + href="http://unicode.org/cldr/trac/ticket/10931">#10931</a>]</li> + </ul> + </li> +</ul> +<p><strong>Part 2: <a href="tr35-general.html#Contents">General</a></strong></p> +<ul> + <li>Section 5 <a href="tr35-general.html#Measurement_System_Data">Measurement System Data</a> + <ul> + <li>Improved the description of the "UK" measurement system, and referred for + finer-grained detail on measurement unit usage to <strong>Part 6: Supplemental:</strong> + <em>Section 2.4.1 <a href="tr35-info.html#Preferred_Units_For_Usage">Preferred Units + for Specific Usages</a></em>. [<a + href="http://unicode.org/cldr/trac/ticket/10850">#10850</a>]</li> + </ul> + </li> + <li>Section 14.3 <a href="tr35-general.html#Typographic_Names">Typographic Names</a> + <ul> + <li>Added description of new elements for names of typographic features. [<a + href="http://unicode.org/cldr/trac/ticket/10948">#10948</a>] </li> + </ul> + </li> +</ul> +<p><strong>Part 7</strong>: <a href="tr35-keyboards.html#Contents">Keyboards</a> (keyboard mappings) </p> + <p><em>Major extensions to handle more complex scripts, including the following:</em> [<a + href="http://unicode.org/cldr/trac/ticket/10926">#10926</a>]</p> +<ul> + <li>Section 5.8 <a href="tr35-keyboards.html#Element_map">Element: map</a> + <ul> + <li>Added 3 attributes to control appearance and behavior.</li> + </ul> + </li> + <li>The sections on <transforms> and <transform> were also renumbered to match the ordering of the elements in the DTD, and significantly extended. 
+ <ul> + <li>5.17 <a href="tr35-keyboards.html#Element_transforms">Element: transforms</a></li> + <li>5.18 <a href="tr35-keyboards.html#Element_transform">Element: transform</a></li> + </ul> + </li> + <li>The following new sections were added: + <ul> + <li>5.9 <a href="tr35-keyboards.html#Element_import">Element: import</a></li> + <li>5.10 <a href="tr35-keyboards.html#Element_displayMap">Element: displayMap</a></li> + <li>5.11 <a href="tr35-keyboards.html#Element_display">Element: display</a></li> + <li>5.12 <a href="tr35-keyboards.html#Element_layer">Element: layer</a></li> + <li>5.13 <a href="tr35-keyboards.html#Element_row">Element: row</a></li> + <li>5.14 <a href="tr35-keyboards.html#Element_switch">Element: switch</a></li> + <li>5.15 <a href="tr35-keyboards.html#Element_vkeys">Element: vkeys</a></li> + <li>5.16 <a href="tr35-keyboards.html#Element_vkey">Element: vkey</a></li> + <li>5.19 <a href="tr35-keyboards.html#Element_reorder">Element: reorder</a></li> + <li>5.20 <a href="tr35-keyboards.html#Element_final">Element: final</a></li> + <li>5.21 <a href="tr35-keyboards.html#Element_backspaces">Element: backspaces</a></li> + <li>5.22 <a href="tr35-keyboards.html#Element_backspace">Element: backspace</a></li> + </ul> + </ul> + + <p>Modifications in previous versions are listed in those respective versions. Click on <strong>Previous Version</strong> in the header until you get to the desired version.</p> <hr> <p class="copyright"> - Copyright © 2001–2017 Unicode, Inc. All + Copyright © 2001–2018 Unicode, Inc. All Rights Reserved. The Unicode Consortium makes no expressed or implied warranty of any kind, and assumes no liability for errors or omissions. No liability is assumed for incidental and consequential |