Lookarounds are zero-width assertions that match a string without consuming anything. ECMAScript has lookahead assertions that does this in forward direction, but the language is missing a way to do this backward which the lookbehind assertions provide. With lookbehind assertions, one can make sure that a pattern is or isn't preceded by another, e.g. matching a dollar amount without capturing the dollar sign.
See the proposal repository for background material and discussion.
Each \u
u
u
\u
The production
A Pattern evaluates (“compiles”) to an internal procedure value.
With argument direction.
The production
The production
The |
regular expression operator separates two alternatives. The pattern first tries to match the left |
produce
/a|ab/.exec("abc")
returns the result "a"
and not "ab"
. Moreover,
/((a)|(ab))((c)|(bc))/.exec("abc")
returns the array
["abc", "a", "a", undefined, "bc", undefined, "bc"]
and not
["abc", "ab", undefined, "ab", "c", "c", undefined]
The order in which the two alternatives are tried is independent of the value of direction.
With argument direction.
The production
The production
Consecutive
With argument direction.
The production
The AssertionTester is independent of direction.
The production
The production
The abstract operation RepeatMatcher takes eight parameters, a Matcher m, an integer min, an integer (or ∞) max, a Boolean greedy, a State x, a Continuation c, an integer parenIndex, and an integer parenCount, and performs the following steps:
An
If the
Compare
/a[a-z]{2,4}/.exec("abcdefghi")
which returns "abcde"
with
/a[a-z]{2,4}?/.exec("abcdefghi")
which returns "abc"
.
Consider also
/(aa|aabaac|ba|b|c)*/.exec("aabaac")
which, by the choice point ordering above, returns the array
["aaba", "ba"]
and not any of:
["aabaac", "aabaac"]
["aabaac", "c"]
The above ordering of choice points can be used to write a regular expression that calculates the greatest common divisor of two numbers (represented in unary notation). The following example calculates the gcd of 10 and 15:
"aaaaaaaaaa,aaaaaaaaaaaaaaa".replace(/^(a+)\1*,\1+$/,"$1")
which returns the gcd in unary notation "aaaaa"
.
Step 4 of the RepeatMatcher clears
/(z)((a+)?(b+)?(c))*/.exec("zaacbbbcac")
which returns the array
["zaacbbbcac", "z", "ac", "a", undefined, "c"]
and not
["zaacbbbcac", "z", "ac", "a", "bbb", "c"]
because each iteration of the outermost *
clears all captured Strings contained in the quantified
Step 1 of the RepeatMatcher's d closure states that, once the minimum number of repetitions has been satisfied, any more expansions of
/(a*)*/.exec("b")
or the slightly more complicated:
/(a*)b\1+/.exec("baaaac")
which returns the array
["b", ""]
The production
Even when the y
flag is used with a pattern, ^
always matches only at the beginning of Input, or (if Multiline is
The production
The production
The production
The production
The production
The production
The production
The abstract operation WordCharacters performs the following steps:
a
|
b
|
c
|
d
|
e
|
f
|
g
|
h
|
i
|
j
|
k
|
l
|
m
|
n
|
o
|
p
|
q
|
r
|
s
|
t
|
u
|
v
|
w
|
x
|
y
|
z
|
A
|
B
|
C
|
D
|
E
|
F
|
G
|
H
|
I
|
J
|
K
|
L
|
M
|
N
|
O
|
P
|
Q
|
R
|
S
|
T
|
U
|
V
|
W
|
X
|
Y
|
Z
|
0
|
1
|
2
|
3
|
4
|
5
|
6
|
7
|
8
|
9
|
_
|
The abstract operation IsWordChar takes an integer parameter e and performs the following steps:
With argument direction.
The production
The production
The production
The production
The production
The production
With argument direction.
The abstract operation CharacterSetMatcher takes twothree arguments, a CharSet A, a Boolean flag invert and an integer direction, and performs the following steps:
The abstract operation Canonicalize takes a character parameter ch and performs the following steps:
String.prototype.toUpperCase
using s as the Parentheses of the form (
)
serve both to group the components of the \
followed by a nonzero decimal number), referenced in a replace String, or returned as part of an array from the regular expression matching internal procedure. To inhibit the capturing behaviour of parentheses, use the form (?:
)
instead.
The form (?=
)
specifies a zero-width positive lookahead. In order for it to succeed, the pattern inside (?=
form (this unusual behaviour is inherited from Perl). This only matters when the
For example,
/(?=(a+))/.exec("baaabac")
matches the empty String immediately after the first b
and therefore returns the array:
["", "aaa"]
To illustrate the lack of backtracking into the lookahead, consider:
/(?=(a+))a*b\1/.exec("baaabac")
This expression returns
["aba", "a"]
and not:
["aaaba", "a"]
The form (?!
)
specifies a zero-width negative lookahead. In order for it to succeed, the pattern inside
/(.*?)a(?!(a+)b\2c)\2(.*)/.exec("baaabaac")
looks for an a
not immediately followed by some positive number n of a
's, a b
, another n a
's (specified by the first \2
) and a c
. The second \2
is outside the negative lookahead, so it matches against
["baaabaac", "ba", undefined, "abaac"]
In case-insignificant matches when Unicode is "ß"
(U+00DF) to "SS"
. It may however map a code point outside the Basic Latin range to a character within, for example, "ſ"
(U+017F) to "s"
. Such characters are not mapped if Unicode is /[a-z]/i
, but they will match /[a-z]/ui
.
With argument direction.
The production
The production
The production
An escape sequence of the form \
followed by a nonzero decimal number n matches the result of the nth set of capturing parentheses (