javascript - Regex infinite backtracking in Eloquent JS -


why parser jump straight b after comes out of braces, according excerpt, instead of visiting + operator first?

"for example, if confused while writing binary-number regular expression, might accidentally write /([01]+)+b/."

enter image description here

"if tries match long series of zeros , ones no trailing b character, matcher first go through inner loop until runs out of digits. notices there no b, backtracks 1 position, goes through outer loop once, , gives again, trying backtrack out of inner loop once more. continue try every possible route through these 2 loops. means amount of work doubles each additional character. few dozen characters, resulting match take practically forever."

from http://eloquentjavascript.net/09_regexp.html#backtracking

you correct, repetion first. text bit misleading, omits part jump outer loop, first retry of outer loop before attempting match b.

so happens step-by-step? let's first match /([01]+)+b/ against "01b":

  1. the first term ([01]+)+ matched against 01b.
    1. this matches inner term ([01]+) against 01b , (we ignore capture group)
      1. matches [01] against 01b, succeeds.
      2. it goes on match [01] again, on 1b, again succeeds.
      3. it goes on match [01] again, on b, fails. backtracks , continues b.
    2. after ([01]+) suceeded against 01, goes on match ([01]+) again, on b.
      1. which matches [01] against b, fails. backtracks , continues b.
  2. after ([01]+)+ succeeded against 01, matches second term b against b. succeeds.
  3. it reaches end of regex, , returns match.

ok far? we've seen 2 times backtracking here, each + repetion failed make round.

let's match /([01]+)+b/ against "01":

  1. the first term ([01]+)+ matched against 01.
    1. this matches inner term ([01]+) against 01 , (we ignore capture group)
      1. matches [01] against 01, succeeds.
      2. it goes on match [01] again, on 1, again succeeds.
      3. it goes on match [01] again, on empty string, fails. reports back.
    2. after ([01]+) suceeded against 01, goes on match ([01]+) again, on empty string.
      1. which tries match [01], fails , reports back.
  2. after ([01]+)+ succeeded against 01, matches second term b against empty string. fails. it backtracks outer loop.
  3. the outer loop backtracks inner loop.
    1. the inner loop innermost, , have matched more items minimum requires accepts give last one. we're @ having matched ([01]+) against 0 only, continue on 1.
    2. the outer loop matches inner term ([01]+) - second time - against 1 ,
      1. matches [01] against 1, suceedds.
      2. it goes on match [01] again, on empty string, fails. reports back.
    3. after ([01]+){2} suceeded against 01, goes on match ([01]+) again, on empty string.
      1. which tries match [01], fails , reports back.
  4. after ([01]+)+ succeeded against 01, matches second term b against empty string. fails. it backtracks outer loop.
  5. the outer loop backtracks latter match of inner loop
    1. which has reached minimum of matches , fails give them up.
  6. the outer loop gives second match. we're @ matching b against 1.
    1. which fails, , backtracks.
  7. the outer loop backtracks first match of inner loop
    1. which has reached minimum of matches well, , fails give them up.
  8. the outer loop fails give minimum of 1 match.
  9. the b not matched anywhere, match fails.

Comments

Popular posts from this blog

css - SVG using textPath a symbol not rendering in Firefox -

Java 8 + Maven Javadoc plugin: Error fetching URL -

datatable - Matlab struct computations -