1 Identification of Locales, Currencies, Time Zones, and Measurement Units

This clause describes the String values used in the ECMAScript 2020 Internationalization API Specification to identify locales, currencies, time zones, and measurement units.

1.1 Case Sensitivity and Case Mapping

The String values used to identify locales, currencies, and time zones are interpreted in a case-insensitive manner, treating the Unicode Basic Latin characters "A" to "Z" (U+0041 to U+005A) as equivalent to the corresponding Basic Latin characters "a" to "z" (U+0061 to U+007A). No other case folding equivalences are applied. When mapping to upper case, a mapping shall be used that maps characters in the range "a" to "z" (U+0061 to U+007A) to the corresponding characters in the range "A" to "Z" (U+0041 to U+005A) and maps no other characters to the latter range.

EXAMPLES "ß" (U+00DF) must not match or be mapped to "SS" (U+0053, U+0053). "ı" (U+0131) must not match or be mapped to "I" (U+0049).
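
The mapping described above can be sketched as follows. This is a minimal illustration, not normative text; the function names toASCIIUpperCase and equalsIgnoringASCIICase are assumptions chosen for this example.

```javascript
// Maps only "a".."z" (U+0061..U+007A) to "A".."Z" (U+0041..U+005A).
// Every other character is left unchanged.
function toASCIIUpperCase(value) {
  return value.replace(/[a-z]/g, (ch) =>
    String.fromCharCode(ch.charCodeAt(0) - 0x20)
  );
}

// Case-insensitive comparison in the sense of this clause.
function equalsIgnoringASCIICase(a, b) {
  return toASCIIUpperCase(a) === toASCIIUpperCase(b);
}

console.log(toASCIIUpperCase("usd"));            // "USD"
console.log(toASCIIUpperCase("ß"));              // "ß" (unchanged)
console.log(equalsIgnoringASCIICase("ı", "I"));  // false
```

String.prototype.toUpperCase is deliberately avoided here, because it applies full Unicode case mapping and would turn "ß" into "SS" and "ı" into "I".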

1.2 Language Tags

The ECMAScript 2020 Internationalization API Specification identifies locales using language tags that are Unicode BCP 47 locale identifiers, which may include extensions such as those registered through RFC 6067. Their canonical form is that of a Unicode BCP 47 Locale Identifier, as specified in Unicode Technical Standard #35 LDML § 3.3 BCP 47 Conformance.

Unicode BCP 47 Locale Identifiers are structurally valid when they match the syntactical formatting criteria of Unicode Technical Standard 35, section 3.2, or successor, but it is not required to validate them according to the Unicode validation data. All structurally valid language tags are valid for use with the APIs defined by this standard. However, the set of locales and thus language tags that an implementation supports with adequate localizations is implementation dependent. The constructors Collator, NumberFormat, DateTimeFormat, and PluralRules map the language tags used in requests to locales supported by their respective implementations.

1.2.1 Unicode Locale Extension Sequences

This standard uses the term "Unicode locale extension sequence" - as described in unicode_locale_extensions in Unicode BCP 47 - for any substring of a language tag that is not part of a private use subtag sequence, starts with a separator "-" and the singleton "u", and includes the maximum sequence of following non-singleton subtags and their preceding "-" separators.
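
As an illustration of this definition, the sketch below extracts the Unicode locale extension sequence from a tag. It assumes the input is already a structurally valid language tag (so the first subtag is a language subtag), and the name findUnicodeExtensionSequence is hypothetical, not part of this standard.

```javascript
// Returns the Unicode locale extension sequence of `tag`, including its
// leading "-" separator, or undefined if the tag has none.
// Assumes `tag` is a structurally valid language tag.
function findUnicodeExtensionSequence(tag) {
  const subtags = tag.split("-");
  // Index 0 is the language subtag, so the search starts at index 1.
  for (let i = 1; i < subtags.length; i++) {
    const subtag = subtags[i];
    // Everything after the singleton "x" is private use and is ignored.
    if (subtag === "x" || subtag === "X") return undefined;
    if (subtag === "u" || subtag === "U") {
      // Take the maximal run of following non-singleton subtags.
      let end = i + 1;
      while (end < subtags.length && subtags[end].length > 1) end++;
      return "-" + subtags.slice(i, end).join("-");
    }
  }
  return undefined;
}

console.log(findUnicodeExtensionSequence("de-DE-u-co-phonebk")); // "-u-co-phonebk"
console.log(findUnicodeExtensionSequence("en-x-u-foo"));         // undefined
```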

1.2.2 IsStructurallyValidLanguageTag ( `locale` )

The IsStructurallyValidLanguageTag abstract operation verifies that the locale argument (which must be a String value)

represents a well-formed Unicode BCP 47 Locale Identifier as specified in Unicode Technical Standard 35 section 3.2, or successor,
does not include duplicate variant subtags, and
does not include duplicate singleton subtags.

The abstract operation returns true if locale can be generated from the EBNF grammar in section 3.2 of the Unicode Technical Standard 35, or successor, starting with unicode_locale_id, and does not contain duplicate variant or singleton subtags (other than as a private use subtag). It returns false otherwise. Terminal value characters in the grammar are interpreted as the Unicode equivalents of the ASCII octet values given.
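
The abstract operation itself is not exposed to scripts, but its result can be observed indirectly: Intl.getCanonicalLocales throws a RangeError when a tag fails this check. The wrapper below is a sketch built on that behaviour; the name isStructurallyValidLanguageTag is an assumption for this example.

```javascript
// Observes the structural validity check through the public API.
// Intl.getCanonicalLocales throws a RangeError for tags that are not
// structurally valid.
function isStructurallyValidLanguageTag(locale) {
  try {
    Intl.getCanonicalLocales(locale);
    return true;
  } catch (e) {
    if (e instanceof RangeError) return false;
    throw e; // anything else is unexpected
  }
}

console.log(isStructurallyValidLanguageTag("en-US"));        // true
console.log(isStructurallyValidLanguageTag("de-1996-1996")); // false: duplicate variant
console.log(isStructurallyValidLanguageTag("en-a-bb-a-cc")); // false: duplicate singleton
```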

1.2.3 CanonicalizeLanguageTag ( `locale` )

The CanonicalizeLanguageTag abstract operation returns the canonical and case-regularized form of the locale argument (which must be a String value that is a structurally valid Unicode BCP 47 Locale Identifier as verified by the IsStructurallyValidLanguageTag abstract operation). A conforming implementation shall take the steps specified in the “BCP 47 Language Tag to Unicode BCP 47 Locale Identifier” algorithm, from Unicode Technical Standard #35 LDML § 3.3.1 BCP 47 Language Tag Conversion.
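
The effect of canonicalization can be observed through Intl.getCanonicalLocales, which returns the canonical forms of the tags it is given. The short example below shows only case regularization; the wrapper name canonicalizeLanguageTag is an assumption for this example.

```javascript
// Returns the canonical form of a single structurally valid language tag.
// Throws a RangeError if the tag is not structurally valid.
function canonicalizeLanguageTag(locale) {
  return Intl.getCanonicalLocales(locale)[0];
}

console.log(canonicalizeLanguageTag("EN-us"));      // "en-US"
console.log(canonicalizeLanguageTag("zh-hant-tw")); // "zh-Hant-TW"
```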

1.2.4 DefaultLocale ()

The DefaultLocale abstract operation returns a String value representing the structurally valid (1.2.2) and canonicalized (1.2.3) BCP 47 language tag for the host environment's current locale.
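
DefaultLocale is not directly callable from scripts. One common way to approximate it, shown here as a sketch, is to construct a formatter without a locales argument and read back the resolved locale. The result comes from locale negotiation, so it reflects the default locale only as far as the implementation supports it.

```javascript
// Approximates the default locale via a formatter created without a
// `locales` argument. The value depends on the host environment.
const resolvedDefaultLocale = new Intl.NumberFormat().resolvedOptions().locale;

console.log(resolvedDefaultLocale); // host dependent
```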

1.3 Currency Codes

The ECMAScript 2020 Internationalization API Specification identifies currencies using 3-letter currency codes as defined by ISO 4217. Their canonical form is upper case.

All well-formed 3-letter ISO 4217 currency codes are allowed. However, the set of combinations of currency code and language tag for which localized currency symbols are available is implementation dependent. Where a localized currency symbol is not available, the ISO 4217 currency code is used for formatting.

1.3.1 IsWellFormedCurrencyCode ( `currency` )

The IsWellFormedCurrencyCode abstract operation verifies that the currency argument (which must be a String value) represents a well-formed 3-letter ISO currency code. The following steps are taken:

Let normalized be the result of mapping currency to upper case as described in 1.1.
If the number of elements in normalized is not 3, return false.
If normalized contains any character that is not in the range "A" to "Z" (U+0041 to U+005A), return false.
Return true.
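
The steps above can be sketched as follows. This is a minimal illustration that inlines the upper-case mapping of 1.1; the function name is an assumption for this example. It checks only the form of the code, not whether ISO 4217 actually assigns it.

```javascript
// Sketch of the IsWellFormedCurrencyCode steps.
function isWellFormedCurrencyCode(currency) {
  // Step 1: map "a".."z" to "A".."Z" and nothing else (see 1.1).
  const normalized = currency.replace(/[a-z]/g, (ch) =>
    String.fromCharCode(ch.charCodeAt(0) - 0x20)
  );
  // Step 2: exactly three elements.
  if (normalized.length !== 3) return false;
  // Step 3: only "A".."Z" allowed.
  if (/[^A-Z]/.test(normalized)) return false;
  // Step 4.
  return true;
}

console.log(isWellFormedCurrencyCode("usd"));  // true
console.log(isWellFormedCurrencyCode("US"));   // false
console.log(isWellFormedCurrencyCode("U$D"));  // false
```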

1.4 Time Zone Names

The ECMAScript 2020 Internationalization API Specification identifies time zones using the Zone and Link names of the IANA Time Zone Database. Their canonical form is the corresponding Zone name in the casing used in the IANA Time Zone Database.

All registered Zone and Link names are allowed. Implementations must recognize all such names, and use best available current and historical information about their offsets from UTC and their daylight saving time rules in calculations. However, the set of combinations of time zone name and language tag for which localized time zone names are available is implementation dependent.

1.4.1 IsValidTimeZoneName ( `timeZone` )

The IsValidTimeZoneName abstract operation verifies that the timeZone argument (which must be a String value) represents a valid Zone or Link name of the IANA Time Zone Database.

The abstract operation returns true if timeZone, converted to upper case as described in 1.1, is equal to one of the Zone or Link names of the IANA Time Zone Database, converted to upper case as described in 1.1. It returns false otherwise.

1.4.2 CanonicalizeTimeZoneName ( `timeZone` )

The CanonicalizeTimeZoneName abstract operation returns the canonical and case-regularized form of the timeZone argument (which must be a String value that is a valid time zone name as verified by the IsValidTimeZoneName abstract operation). The following steps are taken:

Let ianaTimeZone be the Zone or Link name of the IANA Time Zone Database such that timeZone, converted to upper case as described in 1.1, is equal to ianaTimeZone, converted to upper case as described in 1.1.
If ianaTimeZone is a Link name, let ianaTimeZone be the corresponding Zone name as specified in the "backward" file of the IANA Time Zone Database.
If ianaTimeZone is "Etc/UTC" or "Etc/GMT", return "UTC".
Return ianaTimeZone.
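
The validity check of 1.4.1 and the steps above can be sketched together. The ZONES and LINKS tables below are a hypothetical, heavily abbreviated stand-in for the IANA Time Zone Database, and all function names are assumptions for this example; a real implementation would consult the complete database.

```javascript
// Illustrative sample only: a tiny subset of Zone names and Link names.
const ZONES = ["America/New_York", "Asia/Kolkata", "Etc/GMT", "Etc/UTC", "Europe/Berlin"];
const LINKS = new Map([
  ["Asia/Calcutta", "Asia/Kolkata"],
  ["GMT", "Etc/GMT"],
  ["US/Eastern", "America/New_York"],
  ["UTC", "Etc/UTC"],
]);

// Upper-case mapping restricted to "a".."z", as described in 1.1.
const toASCIIUpper = (s) =>
  s.replace(/[a-z]/g, (ch) => String.fromCharCode(ch.charCodeAt(0) - 0x20));

// Finds the Zone or Link name matching `timeZone` case-insensitively.
function findIanaName(timeZone) {
  const wanted = toASCIIUpper(timeZone);
  return [...ZONES, ...LINKS.keys()].find((name) => toASCIIUpper(name) === wanted);
}

function isValidTimeZoneName(timeZone) {
  return findIanaName(timeZone) !== undefined;
}

// Assumes isValidTimeZoneName(timeZone) is true.
function canonicalizeTimeZoneName(timeZone) {
  let ianaTimeZone = findIanaName(timeZone);
  // Resolve a Link name to its Zone name.
  if (LINKS.has(ianaTimeZone)) ianaTimeZone = LINKS.get(ianaTimeZone);
  if (ianaTimeZone === "Etc/UTC" || ianaTimeZone === "Etc/GMT") return "UTC";
  return ianaTimeZone;
}

console.log(canonicalizeTimeZoneName("asia/calcutta")); // "Asia/Kolkata"
console.log(canonicalizeTimeZoneName("etc/gmt"));       // "UTC"
console.log(isValidTimeZoneName("Mars/Olympus"));       // false
```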

The Intl.DateTimeFormat constructor allows this time zone name; if the time zone is not specified, the host environment's current time zone is used. Implementations shall support UTC and the host environment's current time zone (if different from UTC) in formatting.

1.4.3 DefaultTimeZone ()

The DefaultTimeZone abstract operation returns a String value representing the valid (1.4.1) and canonicalized (1.4.2) time zone name for the host environment's current time zone.
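
DefaultTimeZone is not directly callable from scripts, but its result is generally visible through a DateTimeFormat constructed without a timeZone option. The snippet below is a sketch; the value it prints depends entirely on the host environment.

```javascript
// Reads the time zone a formatter falls back to when none is specified.
const resolvedDefaultTimeZone = new Intl.DateTimeFormat().resolvedOptions().timeZone;

console.log(resolvedDefaultTimeZone); // host dependent
```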

1.5 Measurement Unit Identifiers

The ECMAScript 2020 Internationalization API Specification identifies measurement units using a core unit identifier as defined by Unicode Technical Standard #35, Part 2, Section 6. Their canonical form is a string containing all lowercase letters with zero or more hyphens.

Only a limited set of core unit identifiers is allowed. An illegal core unit identifier results in a RangeError.

1.5.1 IsWellFormedUnitIdentifier ( `unitIdentifier` )

The IsWellFormedUnitIdentifier abstract operation verifies that the unitIdentifier argument (which must be a String value) represents a well-formed core unit identifier as defined in UTS #35, Part 2, Section 6. In addition to obeying the UTS #35 core unit identifier syntax, unitIdentifier must be one of the identifiers sanctioned by UTS #35 or be a compound unit composed of two sanctioned simple units. The following steps are taken:

If the result of IsSanctionedSimpleUnitIdentifier(unitIdentifier) is true, then
1. Return true.
If the substring "-per-" does not occur exactly once in unitIdentifier, then
1. Return false.
Let numerator be the substring of unitIdentifier from the beginning to just before "-per-".
If the result of IsSanctionedSimpleUnitIdentifier(numerator) is false, then
1. Return false.
Let denominator be the substring of unitIdentifier from just after "-per-" to the end.
If the result of IsSanctionedSimpleUnitIdentifier(denominator) is false, then
1. Return false.
Return true.

1.5.2 IsSanctionedSimpleUnitIdentifier ( `unitIdentifier` )

The IsSanctionedSimpleUnitIdentifier abstract operation verifies that the given core unit identifier is among the simple units sanctioned in the current version of the ECMAScript standard, a subset of the Validity Data as described in UTS #35, Part 1, Section 3.11; the list may grow over time. As discussed in UTS #35, a simple unit is one that does not have a numerator and denominator. The following steps are taken:

If unitIdentifier is listed in Table 1 below, return true.
Else, return false.

Table 1: Simple units sanctioned for use in ECMAScript

Simple Unit
acre
bit
byte
celsius
centimeter
day
degree
fahrenheit
fluid-ounce
foot
gallon
gigabit
gigabyte
gram
hectare
hour
inch
kilobit
kilobyte
kilogram
kilometer
liter
megabit
megabyte
meter
mile
mile-scandinavian
milliliter
millimeter
millisecond
minute
month
ounce
percent
petabyte
pound
second
stone
terabit
terabyte
week
yard
year
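
The steps of 1.5.1 and 1.5.2 can be sketched together using Table 1. This is a minimal illustration; the function and constant names are assumptions for this example.

```javascript
// Table 1: simple units sanctioned for use in ECMAScript.
const SANCTIONED_SIMPLE_UNITS = new Set([
  "acre", "bit", "byte", "celsius", "centimeter", "day", "degree",
  "fahrenheit", "fluid-ounce", "foot", "gallon", "gigabit", "gigabyte",
  "gram", "hectare", "hour", "inch", "kilobit", "kilobyte", "kilogram",
  "kilometer", "liter", "megabit", "megabyte", "meter", "mile",
  "mile-scandinavian", "milliliter", "millimeter", "millisecond", "minute",
  "month", "ounce", "percent", "petabyte", "pound", "second", "stone",
  "terabit", "terabyte", "week", "yard", "year",
]);

function isSanctionedSimpleUnitIdentifier(unitIdentifier) {
  return SANCTIONED_SIMPLE_UNITS.has(unitIdentifier);
}

function isWellFormedUnitIdentifier(unitIdentifier) {
  if (isSanctionedSimpleUnitIdentifier(unitIdentifier)) return true;
  // "-per-" must occur exactly once (overlapping occurrences count too).
  const first = unitIdentifier.indexOf("-per-");
  if (first === -1) return false;
  if (unitIdentifier.indexOf("-per-", first + 1) !== -1) return false;
  const numerator = unitIdentifier.slice(0, first);
  if (!isSanctionedSimpleUnitIdentifier(numerator)) return false;
  const denominator = unitIdentifier.slice(first + "-per-".length);
  if (!isSanctionedSimpleUnitIdentifier(denominator)) return false;
  return true;
}

console.log(isWellFormedUnitIdentifier("kilometer-per-hour"));          // true
console.log(isWellFormedUnitIdentifier("furlong"));                     // false
console.log(isWellFormedUnitIdentifier("meter-per-second-per-second")); // false
```

A Set is used for the table so that the membership test does not depend on the order of the entries.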