regex - R: (*SKIP)(*FAIL) for multiple patterns
Given test <- c('met','meet','eel','elm'), I need a single line of code that matches any e that is not in 'me' or 'ee'. I wrote (ee|me)(*SKIP)(*F)|e, which excludes 'met' and 'eel', but not 'meet'. Is this because | is an exclusive or? At any rate, is there a solution that returns only 'elm'?
For the record, I know I can use (?<![me])e(?!e), but I would like to know what the solution is with (*SKIP)(*F), and why my line is wrong.
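A minimal R sketch that reproduces the behaviour described in the question. It assumes base R's grepl() with perl = TRUE, since backtracking verbs are a PCRE feature, and it spells the verbs in upper case because PCRE does not recognise lower-case verb names. The variable names attempt and lookaround are only illustrative.

```r
test <- c('met', 'meet', 'eel', 'elm')

# The attempt from the question: 'meet' still counts as a match
attempt <- '(ee|me)(*SKIP)(*F)|e'
hits_attempt <- grepl(attempt, test, perl = TRUE)
print(test[hits_attempt])      # "meet" "elm"

# The lookaround version: only 'elm' is left
lookaround <- '(?<![me])e(?!e)'
hits_lookaround <- grepl(lookaround, test, perl = TRUE)
print(test[hits_lookaround])   # "elm"
```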
This is the correct solution with (*SKIP)(*F):
(?:me+|ee+)(*SKIP)(*FAIL)|e
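Demo on regex101, using the following test cases:
met meet eel elm degree zookeeper meee
Only the e in elm, the first e in degree, and the last e in zookeeper are matched.
Since an e in ee is forbidden and an e right after m is forbidden, any e in a substring of consecutive e is forbidden, whether or not the run follows an m. This explains the sub-pattern (?:me+|ee+): each branch consumes the whole run, so no e is left over for the second alternative.
While I am aware that this method is not extensible, it is at least logically correct.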
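The test cases above can be checked directly in R. This is a small sketch, assuming base R's gregexpr() with perl = TRUE; the names cases, pattern, found and positions are illustrative and not part of the original answer.

```r
cases <- c('met', 'meet', 'eel', 'elm', 'degree', 'zookeeper', 'meee')
pattern <- '(?:me+|ee+)(*SKIP)(*FAIL)|e'

# gregexpr() returns the 1-based start of every match, or -1 when there is none
found <- gregexpr(pattern, cases, perl = TRUE)

# Keep only real match positions for each test case
positions <- lapply(found, function(m) {
  pos <- as.integer(m)
  pos[pos > 0]
})
names(positions) <- cases
print(positions)
```

Only elm (position 1), degree (position 2) and zookeeper (position 8) should end up with a match position; the other entries stay empty.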
Analysis of other solutions
Solution 0
(ee|me)(*SKIP)(*F)|e
Let's use meet as an example:
meet         # (ee|me)(*SKIP)(*F)|e
^            # ^

meet         # (ee|me)(*SKIP)(*F)|e
  ^          #        ^

meet         # (ee|me)(*SKIP)(*F)|e
  ^          #               ^
             # Forbid backtracking to the pattern on the left.
             # Set the index of the bump-along advance to the current position.

meet         # (ee|me)(*SKIP)(*F)|e
  ^          #                   ^
             # Pattern failed. No choice left. Bump along.
             # Note that backtracking to before (*SKIP) is forbidden,
             # so the e in the second branch is not tried.

meet         # (ee|me)(*SKIP)(*F)|e
  ^          # ^
             # Can't match ee or me. Try the other branch.

meet         # (ee|me)(*SKIP)(*F)|e
   ^         #                     ^
             # Found a match `e`.
The problem is due to the fact that me consumes the first e, so ee fails to match, leaving the second e available for matching.
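The position of the unwanted match can be confirmed with a short R sketch, assuming base R's regexpr() with perl = TRUE; the variable names m and start are illustrative.

```r
# regexpr() reports the 1-based start of the first match
m <- regexpr('(ee|me)(*SKIP)(*F)|e', 'meet', perl = TRUE)
start <- as.integer(m)

print(start)                          # 3: the second e of 'meet'
print(substr('meet', 1, start - 1))   # "me": the text consumed before (*SKIP)
```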
Solution 1
\w*(ee|me)\w*(*SKIP)(*FAIL)|e
This just skips every word that contains ee or me, which means it fails to match anything in degree and zookeeper.
Solution 2
(?:ee|mee?)(*SKIP)(?!)|e
Similar problem to solution 0. When there are 3 e in a row (as in meee), the first 2 e are matched by mee?, leaving the third e available for matching.
Solution 3
(?:^.*[me]e)(*SKIP)(*FAIL)|e
This throws away the input up to the last me or ee, which means that any valid e before the last me or ee will not be matched, such as the first e in degree.
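To see the failures of solutions 1 to 3 side by side, here is a rough R sketch. It assumes base R's grepl() with perl = TRUE and reuses the test cases from the regex101 demo; the names cases, solutions and result are illustrative. A correct pattern would report TRUE only for elm, degree and zookeeper.

```r
cases <- c('met', 'meet', 'eel', 'elm', 'degree', 'zookeeper', 'meee')
solutions <- c(
  solution1 = '\\w*(ee|me)\\w*(*SKIP)(*FAIL)|e',
  solution2 = '(?:ee|mee?)(*SKIP)(?!)|e',
  solution3 = '(?:^.*[me]e)(*SKIP)(*FAIL)|e'
)

# One column per pattern, one row per test case; TRUE means some e was matched
result <- sapply(solutions, function(p) grepl(p, cases, perl = TRUE))
rownames(result) <- cases
print(result)
```

Solution 1 misses degree and zookeeper, solution 2 wrongly matches meee, and solution 3 misses degree.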