javascript - Regex infinite backtracking in Eloquent JS -
why parser jump straight b
after comes out of braces, according excerpt, instead of visiting +
operator first?
"for example, if confused while writing binary-number regular expression, might accidentally write
/([01]+)+b/
.""if tries match long series of zeros , ones no trailing b character, matcher first go through inner loop until runs out of digits. notices there no b, backtracks 1 position, goes through outer loop once, , gives again, trying backtrack out of inner loop once more. continue try every possible route through these 2 loops. means amount of work doubles each additional character. few dozen characters, resulting match take practically forever."
from http://eloquentjavascript.net/09_regexp.html#backtracking
you correct, repetion first. text bit misleading, omits part jump outer loop, first retry of outer loop before attempting match b
.
so happens step-by-step? let's first match /([01]+)+b/
against "01b"
:
- the first term
([01]+)+
matched against01b
.- this matches inner term
([01]+)
against01b
, (we ignore capture group)- matches
[01]
against01b
, succeeds. - it goes on match
[01]
again, on1b
, again succeeds. - it goes on match
[01]
again, onb
, fails. backtracks , continuesb
.
- matches
- after
([01]+)
suceeded against01
, goes on match([01]+)
again, onb
.- which matches
[01]
againstb
, fails. backtracks , continuesb
.
- which matches
- this matches inner term
- after
([01]+)+
succeeded against01
, matches second termb
againstb
. succeeds. - it reaches end of regex, , returns match.
ok far? we've seen 2 times backtracking here, each +
repetion failed make round.
let's match /([01]+)+b/
against "01"
:
- the first term
([01]+)+
matched against01
.- this matches inner term
([01]+)
against01
, (we ignore capture group)- matches
[01]
against01
, succeeds. - it goes on match
[01]
again, on1
, again succeeds. - it goes on match
[01]
again, on empty string, fails. reports back.
- matches
- after
([01]+)
suceeded against01
, goes on match([01]+)
again, on empty string.- which tries match
[01]
, fails , reports back.
- which tries match
- this matches inner term
- after
([01]+)+
succeeded against01
, matches second termb
against empty string. fails. it backtracks outer loop. - the outer loop backtracks inner loop.
- the inner loop innermost, , have matched more items minimum requires accepts give last one. we're @ having matched
([01]+)
against0
only, continue on1
. - the outer loop matches inner term
([01]+)
- second time - against1
,- matches
[01]
against1
, suceedds. - it goes on match
[01]
again, on empty string, fails. reports back.
- matches
- after
([01]+){2}
suceeded against01
, goes on match([01]+)
again, on empty string.- which tries match
[01]
, fails , reports back.
- which tries match
- the inner loop innermost, , have matched more items minimum requires accepts give last one. we're @ having matched
- after
([01]+)+
succeeded against01
, matches second termb
against empty string. fails. it backtracks outer loop. - the outer loop backtracks latter match of inner loop
- which has reached minimum of matches , fails give them up.
- the outer loop gives second match. we're @ matching
b
against1
.- which fails, , backtracks.
- the outer loop backtracks first match of inner loop
- which has reached minimum of matches well, , fails give them up.
- the outer loop fails give minimum of 1 match.
- the
b
not matched anywhere, match fails.
Comments
Post a Comment