1 Identification of Locales, Currencies, Time Zones, and Measurement Units

This clause describes the String values used in the ECMAScript 2019 Internationalization API Specification to identify locales, currencies, time zones, and measurement units.

1.1 Case Sensitivity and Case Mapping

The String values used to identify locales, currencies, and time zones are interpreted in a case-insensitive manner, treating the Unicode Basic Latin characters "A" to "Z" (U+0041 to U+005A) as equivalent to the corresponding Basic Latin characters "a" to "z" (U+0061 to U+007A). No other case folding equivalences are applied. When mapping to upper case, a mapping shall be used that maps characters in the range "a" to "z" (U+0061 to U+007A) to the corresponding characters in the range "A" to "Z" (U+0041 to U+005A) and maps no other characters to the latter range.

EXAMPLES "ß" (U+00DF) must not match or be mapped to "SS" (U+0053, U+0053). "ı" (U+0131) must not match or be mapped to "I" (U+0049).
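
The mapping described in this subclause can be illustrated with a minimal sketch. The function names toAsciiUpperCase and asciiCaseInsensitiveEquals are hypothetical and are not defined by this standard; the sketch only shows why the built-in String.prototype.toUpperCase is unsuitable here.

```javascript
// Hypothetical helper: maps only "a".."z" (U+0061..U+007A) to "A".."Z"
// (U+0041..U+005A) and leaves every other code unit untouched.
function toAsciiUpperCase(str) {
  return str.replace(/[a-z]/g, (ch) =>
    String.fromCharCode(ch.charCodeAt(0) - 0x20)
  );
}

// Hypothetical helper: case-insensitive comparison in the sense of 1.1.
function asciiCaseInsensitiveEquals(a, b) {
  return toAsciiUpperCase(a) === toAsciiUpperCase(b);
}

console.log(toAsciiUpperCase("usd"));               // "USD"
console.log(toAsciiUpperCase("stra\u00DFe"));       // "STRAßE" (ß is left alone)
console.log("stra\u00DFe".toUpperCase());           // "STRASSE" (full Unicode mapping)
console.log(asciiCaseInsensitiveEquals("\u0131", "I")); // false
```

The built-in toUpperCase applies the full Unicode case mapping, which is exactly what the EXAMPLES above rule out.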

1.2 Language Tags

The ECMAScript 2019 Internationalization API Specification identifies locales using language tags as defined by IETF BCP 47 (RFCs 5646 and 4647 or their successors), which may include extensions such as those registered through RFC 6067. Their canonical form is specified in RFC 5646 section 4.5 or its successor.

BCP 47 language tags that meet those validity criteria of RFC 5646 section 2.2.9 that can be verified without reference to the IANA Language Subtag Registry are considered structurally valid. All structurally valid language tags are valid for use with the APIs defined by this standard. However, the set of locales and thus language tags that an implementation supports with adequate localizations is implementation dependent. The constructors Collator, NumberFormat, DateTimeFormat, and PluralRules map the language tags used in requests to locales supported by their respective implementations.

1.2.1 Unicode Locale Extension Sequences

This standard uses the term "Unicode locale extension sequence" for any substring of a language tag that is not part of a private use subtag sequence, starts with a separator "-" and the singleton "u", and includes the maximum sequence of following non-singleton subtags and their preceding "-" separators.
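
As an informal illustration, the following sketch extracts such sequences from a tag that is assumed to be structurally valid already. The function name unicodeExtensionSequences is hypothetical, and the regular expressions are one possible reading of the definition above, not text taken from this standard.

```javascript
// Hypothetical helper: returns every Unicode locale extension sequence in a
// structurally valid language tag. Anything from the private use singleton
// "x" onward is excluded before searching.
function unicodeExtensionSequences(tag) {
  // Position of the private use sequence ("x-..." at the start, or "-x-...").
  const privateUseStart = tag.search(/(^|-)x-/i);
  const searchable =
    privateUseStart === -1 ? tag : tag.slice(0, privateUseStart);
  // "-u" followed by the longest run of non-singleton (2 to 8 character) subtags.
  return searchable.match(/-u(?:-[0-9a-z]{2,8})+/gi) || [];
}

console.log(unicodeExtensionSequences("de-DE-u-co-phonebk"));      // ["-u-co-phonebk"]
console.log(unicodeExtensionSequences("en-US-u-ca-gregory-t-hi")); // ["-u-ca-gregory"]
console.log(unicodeExtensionSequences("en-x-u-foo"));              // []
```

In the last call the "u" subtag belongs to a private use sequence, so no extension sequence is reported.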

1.2.2 IsStructurallyValidLanguageTag ( locale )

The IsStructurallyValidLanguageTag abstract operation verifies that the locale argument (which must be a String value)

  • represents a well-formed BCP 47 language tag as specified in RFC 5646 section 2.1, or successor,
  • does not include duplicate variant subtags, and
  • does not include duplicate singleton subtags.

The abstract operation returns true if locale can be generated from the ABNF grammar in section 2.1 of the RFC, starting with Language-Tag, and does not contain duplicate variant or singleton subtags (other than as a private use subtag). It returns false otherwise. Terminal value characters in the grammar are interpreted as the Unicode equivalents of the ASCII octet values given.
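
The two duplicate checks can be sketched as follows. This is a partial illustration only: it assumes the tag has already been confirmed to match the RFC 5646 grammar, and it does not attempt that grammar check itself. The function name hasDuplicateVariantOrSingleton is hypothetical.

```javascript
// Hypothetical helper covering only the duplicate checks. Assumes `tag` is
// already known to be well-formed per RFC 5646 section 2.1 (ASCII only).
function hasDuplicateVariantOrSingleton(tag) {
  const subtags = tag.toLowerCase().split("-");
  // A tag consisting only of private use subtags is exempt.
  if (subtags[0] === "x") return false;

  const seenVariants = new Set();
  const seenSingletons = new Set();
  let inExtension = false;

  // Index 0 is the primary language subtag, so start at 1.
  for (let i = 1; i < subtags.length; i++) {
    const subtag = subtags[i];
    if (subtag.length === 1) {
      if (subtag === "x") break; // private use: the rest is exempt
      if (seenSingletons.has(subtag)) return true;
      seenSingletons.add(subtag);
      inExtension = true;
      continue;
    }
    // Variant: 5 to 8 alphanumerics, or a digit followed by 3 alphanumerics.
    // Subtags inside an extension sequence are not variants.
    if (!inExtension && /^(?:[0-9a-z]{5,8}|[0-9][0-9a-z]{3})$/.test(subtag)) {
      if (seenVariants.has(subtag)) return true;
      seenVariants.add(subtag);
    }
  }
  return false;
}

console.log(hasDuplicateVariantOrSingleton("de-DE-1901-1901"));  // true
console.log(hasDuplicateVariantOrSingleton("en-a-bbb-a-ccc"));   // true
console.log(hasDuplicateVariantOrSingleton("en-a-bbb-x-a-ccc")); // false
```

A complete implementation of the abstract operation would return true only when the grammar check succeeds and this helper returns false.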

1.2.3 CanonicalizeLanguageTag ( locale )

The CanonicalizeLanguageTag abstract operation returns the canonical and case-regularized form of the locale argument (which must be a String value that is a structurally valid BCP 47 language tag as verified by the IsStructurallyValidLanguageTag abstract operation). A conforming implementation shall take the steps specified in RFC 5646 section 4.5, or successor, to bring the language tag into canonical form, and to regularize the case of the subtags. Furthermore, a conforming implementation shall not take the steps to bring a language tag into "extlang form", nor shall it reorder variant subtags.

The specifications for extensions to BCP 47 language tags, such as RFC 6067, may include canonicalization rules for the extension subtag sequences they define that go beyond the canonicalization rules of RFC 5646 section 4.5. Implementations are allowed, but not required, to apply these additional rules.
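
The effect of canonicalization can be observed from script through Intl.getCanonicalLocales, which throws a RangeError for tags that are not structurally valid. The wrapper name canonicalizeOrNull is hypothetical. Only the case regularization shown here is relied upon; engines that implement a later edition of this specification may apply further canonicalization steps.

```javascript
// Hypothetical wrapper: returns the canonical form of a single tag, or null
// when the engine rejects the tag as structurally invalid.
function canonicalizeOrNull(tag) {
  try {
    return Intl.getCanonicalLocales(tag)[0];
  } catch (e) {
    if (e instanceof RangeError) return null;
    throw e;
  }
}

console.log(canonicalizeOrNull("EN-us"));      // "en-US"
console.log(canonicalizeOrNull("zh-hans-cn")); // "zh-Hans-CN"
console.log(canonicalizeOrNull("en_US"));      // null ("_" is not a valid separator)
```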

1.2.4 DefaultLocale ()

The DefaultLocale abstract operation returns a String value representing the structurally valid (1.2.2) and canonicalized (1.2.3) BCP 47 language tag for the host environment's current locale.
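
DefaultLocale is not directly exposed to script. A rough approximation, under the assumption that a formatter constructed without a locales argument resolves its locale from the host default, is sketched below. The function name approximateDefaultLocale is hypothetical, and the result is not guaranteed to equal the value of DefaultLocale(), because locale resolution may fall back to a different supported locale.

```javascript
// Hypothetical helper. Assumption: with no `locales` argument, the resolved
// locale is derived from the host environment's current locale.
function approximateDefaultLocale() {
  return new Intl.NumberFormat().resolvedOptions().locale;
}

console.log(approximateDefaultLocale()); // host dependent
```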

1.3 Currency Codes

The ECMAScript 2019 Internationalization API Specification identifies currencies using 3-letter currency codes as defined by ISO 4217. Their canonical form is upper case.

All well-formed 3-letter ISO 4217 currency codes are allowed. However, the set of combinations of currency code and language tag for which localized currency symbols are available is implementation dependent. Where a localized currency symbol is not available, the ISO 4217 currency code is used for formatting.

1.3.1 IsWellFormedCurrencyCode ( currency )

The IsWellFormedCurrencyCode abstract operation verifies that the currency argument (which must be a String value) represents a well-formed 3-letter ISO currency code. The following steps are taken:

  1. Let normalized be the result of mapping currency to upper case as described in 1.1.
  2. If the number of elements in normalized is not 3, return false.
  3. If normalized contains any character that is not in the range "A" to "Z" (U+0041 to U+005A), return false.
  4. Return true.
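
The steps above can be sketched as follows. The function name mirrors the abstract operation but is otherwise an illustration, not a normative definition.

```javascript
// Sketch of IsWellFormedCurrencyCode, following the four steps above.
function isWellFormedCurrencyCode(currency) {
  // Step 1: upper-case using the ASCII-only mapping of 1.1.
  const normalized = currency.replace(/[a-z]/g, (ch) =>
    String.fromCharCode(ch.charCodeAt(0) - 0x20)
  );
  // Step 2: exactly three elements (code units).
  if (normalized.length !== 3) return false;
  // Step 3: nothing outside "A".."Z".
  if (/[^A-Z]/.test(normalized)) return false;
  // Step 4.
  return true;
}

console.log(isWellFormedCurrencyCode("usd")); // true
console.log(isWellFormedCurrencyCode("US1")); // false
console.log(isWellFormedCurrencyCode("zzz")); // true
```

The check is purely syntactic: "zzz" passes even though no such currency is assigned in ISO 4217.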

1.4 Time Zone Names

The ECMAScript 2019 Internationalization API Specification identifies time zones using the Zone and Link names of the IANA Time Zone Database. Their canonical form is the corresponding Zone name in the casing used in the IANA Time Zone Database.

All registered Zone and Link names are allowed. Implementations must recognize all such names, and use best available current and historical information about their offsets from UTC and their daylight saving time rules in calculations. However, the set of combinations of time zone name and language tag for which localized time zone names are available is implementation dependent.

1.4.1 IsValidTimeZoneName ( timeZone )

The IsValidTimeZoneName abstract operation verifies that the timeZone argument (which must be a String value) represents a valid Zone or Link name of the IANA Time Zone Database.

The abstract operation returns true if timeZone, converted to upper case as described in 1.1, is equal to one of the Zone or Link names of the IANA Time Zone Database, converted to upper case as described in 1.1. It returns false otherwise.
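
A minimal sketch of this comparison follows. The array SAMPLE_IANA_NAMES is a small hand-picked excerpt used only for illustration; a real implementation would consult the complete set of Zone and Link names of the IANA Time Zone Database. All identifiers in the sketch are hypothetical.

```javascript
// Assumption: a tiny excerpt standing in for the full IANA name list.
const SAMPLE_IANA_NAMES = [
  "UTC", "Etc/UTC", "Etc/GMT",
  "Europe/Berlin", "America/New_York",
  "Asia/Kolkata", "Asia/Calcutta",
];

// ASCII-only upper-casing as described in 1.1.
function toAsciiUpperCase(str) {
  return str.replace(/[a-z]/g, (ch) =>
    String.fromCharCode(ch.charCodeAt(0) - 0x20)
  );
}

const UPPERCASED_NAMES = new Set(SAMPLE_IANA_NAMES.map(toAsciiUpperCase));

// Sketch of IsValidTimeZoneName against the sample list.
function isValidTimeZoneName(timeZone) {
  return UPPERCASED_NAMES.has(toAsciiUpperCase(timeZone));
}

console.log(isValidTimeZoneName("europe/berlin"));   // true
console.log(isValidTimeZoneName("Europe/Atlantis")); // false
```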

1.4.2 CanonicalizeTimeZoneName ( timeZone )

The CanonicalizeTimeZoneName abstract operation returns the canonical and case-regularized form of the timeZone argument (which must be a String value that is a valid time zone name as verified by the IsValidTimeZoneName abstract operation). The following steps are taken:

  1. Let ianaTimeZone be the Zone or Link name of the IANA Time Zone Database such that timeZone, converted to upper case as described in 1.1, is equal to ianaTimeZone, converted to upper case as described in 1.1.
  2. If ianaTimeZone is a Link name, let ianaTimeZone be the corresponding Zone name as specified in the "backward" file of the IANA Time Zone Database.
  3. If ianaTimeZone is "Etc/UTC" or "Etc/GMT", return "UTC".
  4. Return ianaTimeZone.
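
The steps above can be sketched as follows. SAMPLE_ZONES and SAMPLE_LINKS are small hand-picked excerpts standing in for the IANA Time Zone Database and its "backward" file; they and all function names are assumptions made for illustration.

```javascript
// Assumption: tiny excerpts standing in for the full database.
const SAMPLE_ZONES = ["Etc/UTC", "Etc/GMT", "Europe/Berlin", "Asia/Kolkata"];
const SAMPLE_LINKS = new Map([
  ["UTC", "Etc/UTC"],
  ["GMT", "Etc/GMT"],
  ["Asia/Calcutta", "Asia/Kolkata"],
]);

// ASCII-only upper-casing as described in 1.1.
function toAsciiUpperCase(str) {
  return str.replace(/[a-z]/g, (ch) =>
    String.fromCharCode(ch.charCodeAt(0) - 0x20)
  );
}

// Sketch of CanonicalizeTimeZoneName against the sample data.
function canonicalizeTimeZoneName(timeZone) {
  // Step 1: find the database name that matches case-insensitively.
  const wanted = toAsciiUpperCase(timeZone);
  const allNames = [...SAMPLE_ZONES, ...SAMPLE_LINKS.keys()];
  let ianaTimeZone = allNames.find((n) => toAsciiUpperCase(n) === wanted);
  if (ianaTimeZone === undefined) {
    // The precondition (a valid name) was violated by the caller.
    throw new RangeError("not a valid time zone name: " + timeZone);
  }
  // Step 2: resolve a Link name to its Zone name.
  if (SAMPLE_LINKS.has(ianaTimeZone)) {
    ianaTimeZone = SAMPLE_LINKS.get(ianaTimeZone);
  }
  // Step 3: both UTC-equivalent zones canonicalize to "UTC".
  if (ianaTimeZone === "Etc/UTC" || ianaTimeZone === "Etc/GMT") return "UTC";
  // Step 4.
  return ianaTimeZone;
}

console.log(canonicalizeTimeZoneName("asia/calcutta")); // "Asia/Kolkata"
console.log(canonicalizeTimeZoneName("gmt"));           // "UTC"
console.log(canonicalizeTimeZoneName("EUROPE/BERLIN")); // "Europe/Berlin"
```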

The Intl.DateTimeFormat constructor allows this time zone name; if the time zone is not specified, the host environment's current time zone is used. Implementations shall support UTC and the host environment's current time zone (if different from UTC) in formatting.

1.4.3 DefaultTimeZone ()

The DefaultTimeZone abstract operation returns a String value representing the valid (1.4.1) and canonicalized (1.4.2) time zone name for the host environment's current time zone.
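
The value can be observed from script, since a DateTimeFormat constructed without a timeZone option uses the host environment's current time zone. The function name hostTimeZone below is hypothetical.

```javascript
// Hypothetical helper: reads back the time zone that a DateTimeFormat
// resolves to when no timeZone option is supplied.
function hostTimeZone() {
  return new Intl.DateTimeFormat().resolvedOptions().timeZone;
}

console.log(hostTimeZone()); // host dependent
```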

1.5 Measurement Unit Identifiers

The ECMAScript 2019 Internationalization API Specification identifies measurement units using core unit identifiers as defined by Unicode Technical Standard #35, Part 2, Section 6. Their canonical form is a string containing all lowercase letters with zero or more hyphens.

Only a limited set of core unit identifiers is allowed. An illegal core unit identifier results in a RangeError.

1.5.1 IsWellFormedUnitIdentifier ( unitIdentifier )

The IsWellFormedUnitIdentifier abstract operation verifies that the unitIdentifier argument (which must be a String value) represents a well-formed core unit identifier as defined in UTS #35, Part 2, Section 6. In addition to obeying the UTS #35 core unit identifier syntax, unitIdentifier must be one of the identifiers sanctioned by UTS #35 or be a compound unit composed of two sanctioned simple units. The following steps are taken:

  1. If the result of IsSanctionedSimpleUnitIdentifier(unitIdentifier) is true, then
    1. Return true.
  2. If the substring "-per-" does not occur exactly once in unitIdentifier, then
    1. Return false.
  3. Let numerator be the substring of unitIdentifier from the beginning to just before "-per-".
  4. If the result of IsSanctionedSimpleUnitIdentifier(numerator) is false, then
    1. Return false.
  5. Let denominator be the substring of unitIdentifier from just after "-per-" to the end.
  6. If the result of IsSanctionedSimpleUnitIdentifier(denominator) is false, then
    1. Return false.
  7. Return true.
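
The steps above can be sketched as follows. To keep the example self-contained, the simple-unit check is passed in as a parameter and demonstrated with a five-element sample set; both the parameter and the sample set are assumptions made for illustration.

```javascript
// Sketch of IsWellFormedUnitIdentifier. The second parameter is a
// hypothetical stand-in for IsSanctionedSimpleUnitIdentifier.
function isWellFormedUnitIdentifier(unitIdentifier, isSanctionedSimpleUnit) {
  // Step 1: a sanctioned simple unit is well-formed as is.
  if (isSanctionedSimpleUnit(unitIdentifier)) return true;
  // Step 2: "-per-" must occur exactly once.
  const separator = "-per-";
  const first = unitIdentifier.indexOf(separator);
  if (first === -1 || unitIdentifier.indexOf(separator, first + 1) !== -1) {
    return false;
  }
  // Steps 3 and 4: check the numerator.
  const numerator = unitIdentifier.slice(0, first);
  if (!isSanctionedSimpleUnit(numerator)) return false;
  // Steps 5 and 6: check the denominator.
  const denominator = unitIdentifier.slice(first + separator.length);
  if (!isSanctionedSimpleUnit(denominator)) return false;
  // Step 7.
  return true;
}

// Assumption: a small sample of sanctioned simple units.
const sampleUnits = new Set(["kilometer", "hour", "liter", "mile", "gallon"]);
const isSampleUnit = (unit) => sampleUnits.has(unit);

console.log(isWellFormedUnitIdentifier("kilometer-per-hour", isSampleUnit)); // true
console.log(isWellFormedUnitIdentifier("furlong-per-hour", isSampleUnit));   // false
```

No case mapping is applied, so "KILOMETER" is rejected.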

1.5.2 IsSanctionedSimpleUnitIdentifier ( unitIdentifier )

The IsSanctionedSimpleUnitIdentifier abstract operation verifies that the given core unit identifier is among the simple units sanctioned in the current version of the ECMAScript standard, a subset of the Validity Data as described in UTS #35, Part 1, Section 3.11; the list may grow over time. As discussed in UTS #35, a simple unit is one that does not have a numerator and denominator. The following steps are taken:

  1. If unitIdentifier is listed in Table 1 below, return true.
  2. Else, return false.

Table 1: Simple units sanctioned for use in ECMAScript

Simple Unit
acre
bit
byte
celsius
centimeter
day
degree
fahrenheit
fluid-ounce
foot
gallon
gigabit
gigabyte
gram
hectare
hour
inch
kilobit
kilobyte
kilogram
kilometer
liter
megabit
megabyte
meter
mile
mile-scandinavian
millimeter
milliliter
millisecond
minute
month
ounce
percent
petabyte
pound
second
stone
terabit
terabyte
week
yard
year
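
The lookup in Table 1 can be sketched as a set membership test. The constant name SANCTIONED_SIMPLE_UNITS is hypothetical; its contents are copied from Table 1.

```javascript
// The 43 simple units of Table 1.
const SANCTIONED_SIMPLE_UNITS = new Set([
  "acre", "bit", "byte", "celsius", "centimeter", "day", "degree",
  "fahrenheit", "fluid-ounce", "foot", "gallon", "gigabit", "gigabyte",
  "gram", "hectare", "hour", "inch", "kilobit", "kilobyte", "kilogram",
  "kilometer", "liter", "megabit", "megabyte", "meter", "mile",
  "mile-scandinavian", "millimeter", "milliliter", "millisecond", "minute",
  "month", "ounce", "percent", "petabyte", "pound", "second", "stone",
  "terabit", "terabyte", "week", "yard", "year",
]);

// Sketch of IsSanctionedSimpleUnitIdentifier: an exact, case-sensitive lookup.
function isSanctionedSimpleUnitIdentifier(unitIdentifier) {
  return SANCTIONED_SIMPLE_UNITS.has(unitIdentifier);
}

console.log(isSanctionedSimpleUnitIdentifier("fluid-ounce")); // true
console.log(isSanctionedSimpleUnitIdentifier("furlong"));     // false
```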