Regular Expression '\R' Escape for ECMAScript

[?UnicodeMode, ?N]

)

(

[?UnicodeMode, ?N]

)

(

[?UnicodeMode, ?N]

)

(

[?UnicodeMode, ?N]

)

Quantifier

QuantifierPrefix

{

[~Sep]

}

{

[~Sep]

}

{

[~Sep]

[~Sep]

}

[UnicodeMode, N]

[?UnicodeMode, ?N]

[?UnicodeMode]

(

GroupSpecifier

[?UnicodeMode]

[?UnicodeMode, ?N]

)

(

[?UnicodeMode, ?N]

)

SyntaxCharacter

one of

(

)

[

]

{

}

PatternCharacter

but not SyntaxCharacter

[UnicodeMode, N]

[+UnicodeMode]

[?UnicodeMode]

[?UnicodeMode]

[+N]

[?UnicodeMode]

[UnicodeMode]

[lookahead ∉ DecimalDigit]

HexEscapeSequence

[?UnicodeMode]

[?UnicodeMode]

one of

one of

[UnicodeMode]

[empty]

[?UnicodeMode]

[UnicodeMode]

[?UnicodeMode]

[UnicodeMode]

RegExpIdentifierStart

[?UnicodeMode]

RegExpIdentifierName

[?UnicodeMode]

RegExpIdentifierPart

[?UnicodeMode]

RegExpIdentifierStart

[UnicodeMode]

IdentifierStartChar

[+UnicodeMode]

[~UnicodeMode]

UnicodeLeadSurrogate

UnicodeTrailSurrogate

RegExpIdentifierPart

[UnicodeMode]

IdentifierPartChar

[+UnicodeMode]

[~UnicodeMode]

UnicodeLeadSurrogate

UnicodeTrailSurrogate

[UnicodeMode]

[+UnicodeMode]

[+UnicodeMode]

[+UnicodeMode]

[+UnicodeMode]

[~UnicodeMode]

[+UnicodeMode]

}

any Unicode code point in the inclusive range 0xD800 to 0xDBFF

UnicodeTrailSurrogate

any Unicode code point in the inclusive range 0xDC00 to 0xDFFF

Each \u HexTrailSurrogate for which the choice of associated u HexLeadSurrogate is ambiguous shall be associated with the nearest possible u HexLeadSurrogate that would otherwise have no corresponding \u HexTrailSurrogate.

HexLeadSurrogate

Hex4Digits

but only if the MV of Hex4Digits is in the inclusive range 0xD800 to 0xDBFF

HexTrailSurrogate

Hex4Digits

but only if the MV of Hex4Digits is in the inclusive range 0xDC00 to 0xDFFF

HexNonSurrogate

Hex4Digits

but only if the MV of Hex4Digits is not in the inclusive range 0xD800 to 0xDFFF
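The three range conditions above partition every four-digit hexadecimal value into exactly one of HexLeadSurrogate, HexTrailSurrogate, or HexNonSurrogate. The following is a minimal, non-normative sketch of that partition; the function name `classifyHex4Digits` and the `SurrogateKind` type are hypothetical and are used only for illustration.

```typescript
// Hypothetical labels for the three Hex4Digits-based productions.
type SurrogateKind = "lead" | "trail" | "non-surrogate";

// Classifies a Hex4Digits source text by its mathematical value (MV).
function classifyHex4Digits(hex: string): SurrogateKind {
  if (!/^[0-9a-fA-F]{4}$/.test(hex)) {
    throw new RangeError(`expected exactly four hex digits, got "${hex}"`);
  }
  const mv = parseInt(hex, 16);
  // HexLeadSurrogate: MV in the inclusive range 0xD800 to 0xDBFF.
  if (mv >= 0xd800 && mv <= 0xdbff) return "lead";
  // HexTrailSurrogate: MV in the inclusive range 0xDC00 to 0xDFFF.
  if (mv >= 0xdc00 && mv <= 0xdfff) return "trail";
  // HexNonSurrogate: MV outside the inclusive range 0xD800 to 0xDFFF.
  return "non-surrogate";
}
```

Under these assumptions, `classifyHex4Digits("D83D")` yields `"lead"` and `classifyHex4Digits("DE00")` yields `"trail"`.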

IdentityEscape

[UnicodeMode]

[+UnicodeMode]

SyntaxCharacter

[+UnicodeMode]

[~UnicodeMode]

but not UnicodeIDContinue

DecimalEscape

NonZeroDigit

UnicodePropertyValueExpression

[~Sep]

opt

[lookahead ∉ DecimalDigit]

CharacterClassEscape

[UnicodeMode]

[+UnicodeMode]

}

[+UnicodeMode]

UnicodePropertyValueExpression

}

UnicodePropertyValueExpression

UnicodePropertyName

UnicodePropertyValue

LoneUnicodePropertyNameOrValue

UnicodePropertyName

UnicodePropertyNameCharacters

UnicodePropertyNameCharacter

UnicodePropertyNameCharacters

opt

UnicodePropertyValue

UnicodePropertyValueCharacters

LoneUnicodePropertyNameOrValue

UnicodePropertyValueCharacters

UnicodePropertyValueCharacter

UnicodePropertyValueCharacters

opt

UnicodePropertyValueCharacter

UnicodePropertyNameCharacter

DecimalDigit

UnicodePropertyNameCharacter

ControlLetter

CharacterClass

[UnicodeMode]

[

[lookahead ≠ ^]

[?UnicodeMode]

]

[

[?UnicodeMode]

]

NonemptyClassRangesNoDash

[UnicodeMode]

[empty]

[?UnicodeMode]

[UnicodeMode]

[?UnicodeMode]

[?UnicodeMode]

[?UnicodeMode]

ClassAtom

[?UnicodeMode]

ClassAtom

[?UnicodeMode]

NonemptyClassRangesNoDash

[?UnicodeMode]

[UnicodeMode]

ClassAtom

[?UnicodeMode]

ClassAtomNoDash

[?UnicodeMode]

NonemptyClassRangesNoDash

[?UnicodeMode]

[?UnicodeMode]

[?UnicodeMode]

[?UnicodeMode]

[UnicodeMode]

[?UnicodeMode]

[UnicodeMode]

but not one of \ or ] or -

[?UnicodeMode]

[UnicodeMode]

[+UnicodeMode]

CharacterClassEscape

[?UnicodeMode]

CharacterEscape

[?UnicodeMode]

Note

A number of productions in this section are given alternative definitions in section A.1.1.

1.1.2 Pattern Semantics

1.1.2.1 Runtime Semantics: CompileAtom

The syntax-directed operation CompileAtom takes argument direction (forward or backward). It returns a Matcher.

AtomEscape

Return a new Matcher with parameters (x, c) that captures direction and performs the following steps when called:
1. Assert: x is a State.
2. Assert: c is a Continuation.
3. Let e be x's endIndex.
4. If direction is forward, let f be e + 1.
5. Else, let f be e - 1.
6. If f < 0 or f > InputLength, return failure.
7. Let index be min(e, f).
8. Let ch be the character Input[index].
9. Let cc be Canonicalize(ch).
10. Let A be a CharSet containing the characters <LF>, <VT>, <FF>, <CR>, <NL>, <LS>, and <PS>.
11. If there does not exist a member a of A such that Canonicalize(a) is cc, return failure.
12. If direction is forward and cc is the character <CR> and index + 1 < InputLength, then
  1. Let nextCh be the character Input[index + 1].
  2. Let nextCc be Canonicalize(nextCh).
  3. If nextCc is the character <LF>, set f to f + 1.
13. Else, if direction is backward and cc is the character <LF> and index - 1 ≥ 0, then
  1. Let prevCh be the character Input[index - 1].
  2. Let prevCc be Canonicalize(prevCh).
  3. If prevCc is the character <CR>, set f to f - 1.
14. Let cap be x's captures List.
15. Let y be the State(f, cap).
16. Return c(y).
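The Matcher steps above can be sketched, non-normatively, as a function that returns the new endIndex or `null` on failure. This sketch makes several assumptions that are not part of the algorithm text: the input is a string of UTF-16 code units, Canonicalize is treated as the identity (none of the listed characters has a case mapping), <NL> is taken to mean U+0085, the continuation call is replaced by simply returning the new index, and the backward bounds check admits a <CR> at index 0 (that is, `index - 1 ≥ 0`). The names `matchLineBreak`, `Direction`, and `LINE_BREAKS` are hypothetical.

```typescript
type Direction = "forward" | "backward";

// CharSet A from step 10; U+0085 for <NL> is an assumption of this sketch.
const LINE_BREAKS = new Set<string>([
  "\n", "\v", "\f", "\r", "\u0085", "\u2028", "\u2029",
]);

// Returns the endIndex after matching one line break, or null on failure.
function matchLineBreak(
  input: string,
  endIndex: number,
  direction: Direction,
): number | null {
  const e = endIndex;
  // Steps 4-5: tentatively consume one code unit in the given direction.
  let f = direction === "forward" ? e + 1 : e - 1;
  // Step 6: fail if that would leave the input.
  if (f < 0 || f > input.length) return null;
  // Steps 7-8: the code unit being examined.
  const index = Math.min(e, f);
  const ch = input.charAt(index);
  // Steps 9-11: Canonicalize is assumed to be the identity here.
  if (!LINE_BREAKS.has(ch)) return null;
  // Step 12: going forward, a <CR> also consumes an immediately following <LF>.
  if (direction === "forward" && ch === "\r" && index + 1 < input.length) {
    if (input.charAt(index + 1) === "\n") f += 1;
  // Step 13: going backward, an <LF> also consumes an immediately preceding <CR>.
  } else if (direction === "backward" && ch === "\n" && index - 1 >= 0) {
    if (input.charAt(index - 1) === "\r") f -= 1;
  }
  // Steps 14-16: the caller would build the State (f, cap) and call c.
  return f;
}
```

For example, with the input `"a\r\nb"`, a forward match starting at endIndex 1 consumes both code units of the <CR><LF> pair, and a backward match starting at endIndex 3 consumes the same pair in the other direction.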

A Additional ECMAScript Features for Web Browsers

A.1 Additional Syntax

A.1.1 Regular Expressions Patterns

The syntax of 1.1.1 is modified and extended as follows. These changes introduce ambiguities that are broken by the ordering of grammar productions and by contextual information. When parsing using the following grammar, each alternative is considered only if previous production alternatives do not match.

This alternative pattern grammar and semantics only changes the syntax and semantics of BMP patterns. The following grammar extensions include productions parameterized with the [UnicodeMode] parameter. However, none of these extensions change the syntax of Unicode patterns recognized when parsing with the [UnicodeMode] parameter present on the goal symbol.
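The rule that each alternative is considered only if previous production alternatives do not match amounts to an ordered choice. The following is a minimal, non-normative sketch of that idea only; it is not the grammar itself. The names `Alternative`, `parseOrdered`, and `escapeAlternatives` are hypothetical, and the second alternative is a stand-in named `"fallback"` rather than any specific production.

```typescript
// A hypothetical alternative: returns a value on success, null on no match.
interface Alternative<T> {
  name: string;
  tryParse: (input: string) => T | null;
}

// Tries alternatives in order; a later one is consulted only if all
// earlier ones failed to match.
function parseOrdered<T>(
  alternatives: Alternative<T>[],
  input: string,
): { name: string; value: T } | null {
  for (const alt of alternatives) {
    const value = alt.tryParse(input);
    if (value !== null) return { name: alt.name, value };
  }
  return null;
}

// Illustration: a decimal escape is accepted only when its number does not
// exceed the number of capturing groups; otherwise the fallback applies.
function escapeAlternatives(nCapturingParens: number): Alternative<number>[] {
  return [
    {
      name: "DecimalEscape",
      tryParse: (s) => {
        if (!/^[1-9][0-9]*$/.test(s)) return null;
        const n = parseInt(s, 10);
        return n <= nCapturingParens ? n : null;
      },
    },
    {
      // Stand-in for whichever later alternative applies; returns the
      // code unit of the first character.
      name: "fallback",
      tryParse: (s) => (s.length > 0 ? s.charCodeAt(0) : null),
    },
  ];
}
```

With two capturing groups, `parseOrdered(escapeAlternatives(2), "2")` selects the first alternative; with only one group, the same input falls through to the second.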

Syntax

Term

[UnicodeMode, N]

[+UnicodeMode]

Assertion

[+UnicodeMode, ?N]

[+UnicodeMode]

Atom

[+UnicodeMode, ?N]

Quantifier

[+UnicodeMode]

Atom

[+UnicodeMode, ?N]

[~UnicodeMode]

QuantifiableAssertion

[?N]

Quantifier

[~UnicodeMode]

Assertion

[~UnicodeMode, ?N]

[~UnicodeMode]

[?N]

[~UnicodeMode]

[?N]

[UnicodeMode, N]

[+UnicodeMode]

(

[+UnicodeMode, ?N]

)

[+UnicodeMode]

(

[+UnicodeMode, ?N]

)

[~UnicodeMode]

QuantifiableAssertion

[?N]

(

[?UnicodeMode, ?N]

)

(

[?UnicodeMode, ?N]

)

QuantifiableAssertion

[N]

(

[~UnicodeMode, ?N]

)

(

[~UnicodeMode, ?N]

)

ExtendedAtom

[N]

AtomEscape

[~UnicodeMode, ?N]

[lookahead = c]

CharacterClass

[~UnicodeMode]

(

[~UnicodeMode, ?N]

)

(

[~UnicodeMode, ?N]

)

InvalidBracedQuantifier

ExtendedPatternCharacter

InvalidBracedQuantifier

{

[~Sep]

}

{

[~Sep]

}

{

[~Sep]

[~Sep]

}

ExtendedPatternCharacter

but not one of ^

(

)

[

AtomEscape

[UnicodeMode, N]

[+UnicodeMode]

DecimalEscape

[~UnicodeMode]

DecimalEscape

but only if the CapturingGroupNumber of DecimalEscape is ≤ NcapturingParens

[+UnicodeMode]

CharacterClassEscape

[?UnicodeMode]

CharacterEscape

[?UnicodeMode, ?N]

[+N]

[?UnicodeMode]

[UnicodeMode, N]

[lookahead ∉ DecimalDigit]

HexEscapeSequence

LegacyOctalEscapeSequence

[?UnicodeMode]

[~UnicodeMode]

IdentityEscape

[?UnicodeMode, ?N]

IdentityEscape

[UnicodeMode, N]

[+UnicodeMode]

SyntaxCharacter

[+UnicodeMode]

[~UnicodeMode]

SourceCharacterIdentityEscape

[?N]

SourceCharacterIdentityEscape

[N]

[~N]

but not c

[+N]

but not one of c or k

ClassAtomNoDash

[UnicodeMode, N]

but not one of \ or ] or -

[?UnicodeMode, ?N]

[lookahead = c]