Stage 3 Draft / September 6, 2024

RegExp.escape

22 Text Processing

22.2 RegExp (Regular Expression) Objects

22.2.5 Properties of the RegExp Constructor

22.2.5.1 RegExp.escape ( S )

This function returns a copy of S in which characters that are potentially special in a regular expression Pattern have been replaced by equivalent escape sequences.

It performs the following steps when called:

  1. If S is not a String, throw a TypeError exception.
  2. Let escaped be the empty String.
  3. Let cpList be StringToCodePoints(S).
  4. For each code point c of cpList, do
    1. If escaped is the empty String and c is matched by either DecimalDigit or AsciiLetter, then
      1. NOTE: Escaping a leading digit ensures that the output can be placed directly after a \0 character escape or a DecimalEscape such as \1 and still match S, rather than being interpreted as an extension of the preceding escape sequence. Escaping a leading ASCII letter does the same for the context after \c.
      2. Let numericValue be the numeric value of c.
      3. Let hex be Number::toString(𝔽(numericValue), 16).
      4. Assert: The length of hex is 2.
      5. Set escaped to the string-concatenation of the code unit 0x005C (REVERSE SOLIDUS), "x", and hex.
    2. Else,
      1. Set escaped to the string-concatenation of escaped and EncodeForRegExpEscape(c).
  5. Return escaped.
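
The effect of step 4.1 can be illustrated with a short JavaScript sketch. The helper name escapeLeadingCharacter is hypothetical and covers only the leading-character branch, not the whole algorithm; the patterns are compiled without the u flag, so the Annex B legacy interpretation of \10 applies.

```javascript
// Hypothetical helper mirroring only step 4.1: hex-escape a leading
// ASCII digit or ASCII letter as \xHH.
function escapeLeadingCharacter(ch) {
  // For [0-9A-Za-z] the hexadecimal form always has exactly two digits.
  const hex = ch.codePointAt(0).toString(16);
  return "\\x" + hex;
}

// Without escaping, "0" extends the preceding \1 into \10. With only one
// capture group, \10 is read as a legacy octal escape (U+0008).
const naive = new RegExp("^(a)\\1" + "0" + "$");
console.log(naive.test("aa0")); // false

// With the leading digit escaped, \1 stays a backreference.
const guarded = new RegExp("^(a)\\1" + escapeLeadingCharacter("0") + "$");
console.log(guarded.test("aa0")); // true

// The same hazard exists for a letter placed after \c: \cJ is the
// control escape for U+000A, not the letter "J".
const control = new RegExp("^\\c" + "J" + "$");
console.log(control.test("J"));  // false
console.log(control.test("\n")); // true
```
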

Note

Despite having similar names, EscapeRegExpPattern and RegExp.escape do not perform similar actions. The former escapes a pattern for representation as a string, while this function escapes a string for representation inside a pattern.
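
The difference in direction can be seen in a small JavaScript sketch. It assumes that EscapeRegExpPattern is observed through the source accessor, and it guards the call to RegExp.escape because the function may not be available in every host.

```javascript
// Pattern to string: the source accessor renders an existing pattern as
// text that could appear between the slashes of a regular expression literal.
const fromPattern = new RegExp("a/b");
console.log(fromPattern.source); // a\/b

// A "." that is not escaped stays a wildcard when a string is used as a pattern.
const unescaped = new RegExp("a.b");
console.log(unescaped.test("axb")); // true

// String to pattern: RegExp.escape makes the string match only itself.
// Guarded, since this sketch may run where the proposal is not implemented.
if (typeof RegExp.escape === "function") {
  const literal = new RegExp(RegExp.escape("a.b"));
  console.log(literal.test("a.b")); // true
  console.log(literal.test("axb")); // false
}
```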

22.2.5.1.1 EncodeForRegExpEscape ( c )

The abstract operation EncodeForRegExpEscape takes argument c (a code point) and returns a String. It returns a string representing a Pattern for matching c. If c is a syntax character, U+002F (SOLIDUS), white space, a line terminator, one of a fixed set of other ASCII punctuators, or a surrogate, the returned value is an escape sequence. Otherwise, the returned value is a string representation of c itself. It performs the following steps when called:

  1. If c is matched by SyntaxCharacter or c is U+002F (SOLIDUS), then
    1. Return the string-concatenation of the code unit 0x005C (REVERSE SOLIDUS) and UTF16EncodeCodePoint(c).
  2. Else if c is the code point listed in some cell of the “Code Point” column of Table 63, then
    1. Return the string-concatenation of the code unit 0x005C (REVERSE SOLIDUS) and the string in the “ControlEscape” column of the row whose “Code Point” column contains c.
  3. Let otherPunctuators be the string-concatenation of ",-=<>#&!%:;@~'`" and the code unit 0x0022 (QUOTATION MARK).
  4. Let toEscape be StringToCodePoints(otherPunctuators).
  5. If toEscape contains c, c is matched by either WhiteSpace or LineTerminator, or c has the same numeric value as a leading surrogate or trailing surrogate, then
    1. Let cNum be the numeric value of c.
    2. If cNum ≤ 0xFF, then
      1. Let hex be Number::toString(𝔽(cNum), 16).
      2. Return the string-concatenation of the code unit 0x005C (REVERSE SOLIDUS), "x", and StringPad(hex, 2, "0", start).
    3. Let escaped be the empty String.
    4. Let codeUnits be UTF16EncodeCodePoint(c).
    5. For each code unit cu of codeUnits, do
      1. Set escaped to the string-concatenation of escaped and UnicodeEscape(cu).
    6. Return escaped.
  6. Return UTF16EncodeCodePoint(c).
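
The two algorithms above can be approximated in plain JavaScript as follows. This is a non-normative sketch, not a polyfill: the names encodeForRegExpEscape and regExpEscapeSketch are hypothetical, the ControlEscape table is assumed to hold the usual five entries (t, n, v, f, r), and \s is used as a stand-in for "matched by either WhiteSpace or LineTerminator".

```javascript
// SyntaxCharacter, as a string of its members.
const SYNTAX_CHARACTERS = "^$\\.*+?()[]{}|";
// Assumed contents of the ControlEscape table: code point -> escape letter.
const CONTROL_ESCAPES = new Map([
  [0x0009, "t"], [0x000a, "n"], [0x000b, "v"], [0x000c, "f"], [0x000d, "r"],
]);
// otherPunctuators, including U+0022 (QUOTATION MARK).
const OTHER_PUNCTUATORS = ",-=<>#&!%:;@~'`\"";

function encodeForRegExpEscape(c) {
  const ch = String.fromCodePoint(c);
  // Step 1: syntax characters and the solidus get a backslash prefix.
  if (SYNTAX_CHARACTERS.includes(ch) || ch === "/") return "\\" + ch;
  // Step 2: code points that have a ControlEscape use it.
  if (CONTROL_ESCAPES.has(c)) return "\\" + CONTROL_ESCAPES.get(c);
  // Steps 3-5: other punctuators, white space, line terminators, surrogates.
  const isSurrogate = c >= 0xd800 && c <= 0xdfff;
  if (OTHER_PUNCTUATORS.includes(ch) || /^\s$/.test(ch) || isSurrogate) {
    if (c <= 0xff) return "\\x" + c.toString(16).padStart(2, "0");
    let escaped = "";
    for (let i = 0; i < ch.length; i++) {
      // Counterpart of UnicodeEscape: \u plus four lowercase hex digits.
      escaped += "\\u" + ch.charCodeAt(i).toString(16).padStart(4, "0");
    }
    return escaped;
  }
  // Step 6: everything else is returned unchanged.
  return ch;
}

function regExpEscapeSketch(S) {
  if (typeof S !== "string") throw new TypeError("S must be a String");
  let escaped = "";
  // for...of iterates by code point and yields lone surrogates as-is.
  for (const ch of S) {
    if (escaped === "" && /^[0-9A-Za-z]$/.test(ch)) {
      // Leading ASCII digit or letter: always \xHH.
      escaped = "\\x" + ch.codePointAt(0).toString(16);
    } else {
      escaped += encodeForRegExpEscape(ch.codePointAt(0));
    }
  }
  return escaped;
}

console.log(regExpEscapeSketch("foo.bar")); // \x66oo\.bar
console.log(regExpEscapeSketch("1+1"));     // \x31\+1
console.log(regExpEscapeSketch("a b"));     // \x61\x20b
```

Under these assumptions, the escaped text matches only the original string when placed between anchors, for example new RegExp("^" + regExpEscapeSketch(s) + "$", "u").test(s).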

A Copyright & Software License

Copyright Notice

© 2024 Jordan Harband, Kevin Gibbons

Software License

All Software contained in this document ("Software") is protected by copyright and is being made available under the "BSD License", included below. This Software may be subject to third party rights (rights from parties other than Ecma International), including patent rights, and no licenses under such third party rights are granted under this license even if the third party concerned is a member of Ecma International. SEE THE ECMA CODE OF CONDUCT IN PATENT MATTERS AVAILABLE AT https://ecma-international.org/memento/codeofconduct.htm FOR INFORMATION REGARDING THE LICENSING OF PATENT CLAIMS THAT ARE REQUIRED TO IMPLEMENT ECMA INTERNATIONAL STANDARDS.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
  2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
  3. Neither the name of the authors nor Ecma International may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE ECMA INTERNATIONAL "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL ECMA INTERNATIONAL BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.