javascript - Regex infinite backtracking in Eloquent JS -
why parser jump straight b after comes out of braces, according excerpt, instead of visiting + operator first?
"for example, if confused while writing binary-number regular expression, might accidentally write
/([01]+)+b/."
"if tries match long series of zeros , ones no trailing b character, matcher first go through inner loop until runs out of digits. notices there no b, backtracks 1 position, goes through outer loop once, , gives again, trying backtrack out of inner loop once more. continue try every possible route through these 2 loops. means amount of work doubles each additional character. few dozen characters, resulting match take practically forever."
from http://eloquentjavascript.net/09_regexp.html#backtracking
you correct, repetion first. text bit misleading, omits part jump outer loop, first retry of outer loop before attempting match b.
so happens step-by-step? let's first match /([01]+)+b/ against "01b":
- the first term
([01]+)+matched against01b.- this matches inner term
([01]+)against01b, (we ignore capture group)- matches
[01]against01b, succeeds. - it goes on match
[01]again, on1b, again succeeds. - it goes on match
[01]again, onb, fails. backtracks , continuesb.
- matches
- after
([01]+)suceeded against01, goes on match([01]+)again, onb.- which matches
[01]againstb, fails. backtracks , continuesb.
- which matches
- this matches inner term
- after
([01]+)+succeeded against01, matches second termbagainstb. succeeds. - it reaches end of regex, , returns match.
ok far? we've seen 2 times backtracking here, each + repetion failed make round.
let's match /([01]+)+b/ against "01":
- the first term
([01]+)+matched against01.- this matches inner term
([01]+)against01, (we ignore capture group)- matches
[01]against01, succeeds. - it goes on match
[01]again, on1, again succeeds. - it goes on match
[01]again, on empty string, fails. reports back.
- matches
- after
([01]+)suceeded against01, goes on match([01]+)again, on empty string.- which tries match
[01], fails , reports back.
- which tries match
- this matches inner term
- after
([01]+)+succeeded against01, matches second termbagainst empty string. fails. it backtracks outer loop. - the outer loop backtracks inner loop.
- the inner loop innermost, , have matched more items minimum requires accepts give last one. we're @ having matched
([01]+)against0only, continue on1. - the outer loop matches inner term
([01]+)- second time - against1,- matches
[01]against1, suceedds. - it goes on match
[01]again, on empty string, fails. reports back.
- matches
- after
([01]+){2}suceeded against01, goes on match([01]+)again, on empty string.- which tries match
[01], fails , reports back.
- which tries match
- the inner loop innermost, , have matched more items minimum requires accepts give last one. we're @ having matched
- after
([01]+)+succeeded against01, matches second termbagainst empty string. fails. it backtracks outer loop. - the outer loop backtracks latter match of inner loop
- which has reached minimum of matches , fails give them up.
- the outer loop gives second match. we're @ matching
bagainst1.- which fails, , backtracks.
- the outer loop backtracks first match of inner loop
- which has reached minimum of matches well, , fails give them up.
- the outer loop fails give minimum of 1 match.
- the
bnot matched anywhere, match fails.

Comments
Post a Comment