Commit 66ba7c4

Do not allow duplicate variants within the tlang component of a transformed content extension.
1 parent 97c16d6 commit 66ba7c4

1 file changed: +26 -6 lines changed

spec/locales-currencies-tz.html

@@ -40,25 +40,45 @@ <h1>Unicode Locale Extension Sequences</h1>
 <h1>IsStructurallyValidLanguageTag ( _locale_ )</h1>
 
 <p>
-The IsStructurallyValidLanguageTag abstract operation verifies that the _locale_ argument (which must be a String value)
+The IsStructurallyValidLanguageTag abstract operation determines whether the _locale_ argument (which must be a String value) is a language tag recognized by this specification. (It does not consider whether the language tag conveys any meaningful semantics, differentiate between aliased subtags and their preferred replacement subtags, or require canonical casing or subtag ordering.)
+</p>
+
+<p>
+IsStructurallyValidLanguageTag returns *true* if all of the following conditions hold, *false* otherwise:
 </p>
 
 <ul>
-<li>represents a well-formed "Unicode BCP 47 locale identifier" as specified in <a href="https://unicode.org/reports/tr35/#Unicode_locale_identifier">Unicode Technical Standard 35 section 3.2</a>,</li>
-<li>does not include duplicate variant subtags, and</li>
-<li>does not include duplicate singleton subtags.</li>
+<li>_locale_ can be generated from the EBNF grammar for `unicode_locale_id` in <a href="https://unicode.org/reports/tr35/#Unicode_locale_identifier">Unicode Technical Standard #35 LDML § 3.2 Unicode Locale Identifier</a>;</li>
+<li>_locale_ does not use any of the backwards compatibility syntax described in <a href="https://unicode.org/reports/tr35/#BCP_47_Conformance">Unicode Technical Standard #35 LDML § 3.3 BCP 47 Conformance</a>;</li>
+<li>the `unicode_language_id` within _locale_ contains no duplicate `unicode_variant_subtag` subtags; and</li>
+<li>if _locale_ contains an `extensions*` component, that component
+<ul>
+<li>does not contain any `other_extensions` components with duplicate `[alphanum-[tTuUxX]]` subtags,</li>
+<li>contains at most one `unicode_locale_extensions` component,</li>
+<li>contains at most one `transformed_extensions` component, and</li>
+<li>if a `transformed_extensions` component that contains a `tlang` component is present, then
+<ul>
+<li>the `tlang` component contains no duplicate `unicode_variant_subtag` subtags.</li>
+</ul>
+</li>
+</ul>
+</li>
 </ul>
 
 <p>
-The abstract operation returns true if _locale_ can be generated from the EBNF grammar in section 3.2 of the Unicode Technical Standard 35, starting with `unicode_locale_id`, and does not contain duplicate variant or singleton subtags (other than as a private use subtag). It returns false otherwise. Terminal value characters in the grammar are interpreted as the Unicode equivalents of the ASCII octet values given.
+When evaluating each condition, terminal value characters in the grammar are interpreted as the corresponding ASCII code points. Subtags are duplicates if they are ASCII case-insensitively equivalent.
 </p>
+
+<emu-note>
+Every string for which this function returns *true* is both a "Unicode BCP 47 locale identifier", consistent with <a href="https://unicode.org/reports/tr35/#Unicode_locale_identifier">Unicode Technical Standard #35 LDML § 3.2 Unicode Locale Identifier</a> and <a href="https://unicode.org/reports/tr35/#BCP_47_Conformance">Unicode Technical Standard #35 LDML § 3.3 BCP 47 Conformance</a>, and a valid <a href="http://www.rfc-editor.org/rfc/bcp/bcp47.txt">BCP 47</a> language tag.
+</emu-note>
 </emu-clause>
 
 <emu-clause id="sec-canonicalizeunicodelocaleid" aoid="CanonicalizeUnicodeLocaleId">
 <h1>CanonicalizeUnicodeLocaleId ( _locale_ )</h1>
 
 <p>
-The CanonicalizeUnicodeLocaleId abstract operation returns the canonical and case-regularized form of the _locale_ argument (which must be a String value that is a structurally valid Unicode BCP 47 locale identifier as verified by the IsStructurallyValidLanguageTag abstract operation).
+The CanonicalizeUnicodeLocaleId abstract operation returns the canonical and case-regularized form of the _locale_ argument (which must be a String value for which IsStructurallyValidLanguageTag(_locale_) equals *true*).
 The following steps are taken:
 </p>

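The `tlang` condition added by this commit can be illustrated with a minimal Python sketch. It is not taken from the specification or from any implementation: the helper names `tlang_variants` and `tlang_has_duplicate_variants` are hypothetical, and the code assumes its input already satisfies the `unicode_locale_id` grammar with no backwards compatibility syntax, so it only looks for duplicates and performs no grammar validation.

```python
import re
from typing import List

# unicode_variant_subtag = alphanum{5,8} | digit alphanum{3}
VARIANT = re.compile(r"(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3})")
# tkey = alpha digit; the first tkey ends the tlang component
TKEY = re.compile(r"[a-z][0-9]")


def tlang_variants(locale: str) -> List[str]:
    """Return the lowercased variant subtags of the tlang component.

    Assumption: `locale` already matches the unicode_locale_id grammar.
    """
    # Lowercasing makes the later comparison ASCII case-insensitive.
    subtags = locale.lower().split("-")

    # Locate the "t" singleton. Stop at "x": private use subtags may
    # contain anything, including a literal "t".
    start = None
    for i, subtag in enumerate(subtags[1:], start=1):
        if subtag == "x":
            break
        if subtag == "t":
            start = i + 1
            break
    if start is None:
        return []

    # Collect tlang: everything up to the first tkey or next singleton.
    tlang = []
    for subtag in subtags[start:]:
        if len(subtag) == 1 or TKEY.fullmatch(subtag):
            break
        tlang.append(subtag)

    # tlang[0] is the language subtag, which can also be 5-8 letters,
    # so it is skipped before matching the variant pattern.
    return [s for s in tlang[1:] if VARIANT.fullmatch(s)]


def tlang_has_duplicate_variants(locale: str) -> bool:
    """True if the tlang component repeats a variant subtag."""
    variants = tlang_variants(locale)
    return len(set(variants)) != len(variants)
```

Under these assumptions, `tlang_has_duplicate_variants("en-t-de-Fonipa-FONIPA")` is true because the comparison ignores ASCII case, while `"en-fonipa-t-de-fonipa"` is not flagged: the `unicode_language_id` and the `tlang` component are checked for duplicates separately.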