Re: Regex question - alternation

From: Niki Estner (niki.estner_at_cube.net)
Date: Fri, 23 Jul 2004 12:39:46 +0200

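I think you are right as long as the text you are matching is well formed.
In that case the parens are matched by the first two alternatives, because
those are leftmost in the alternation and so are tried first.
However, with a text like "(()", which is not balanced and shouldn't be
matched as a whole, the first attempt fails and the regex engine backtracks.
If "." were in the alternation, the engine would then give the second "(" to
the "." (which pushes nothing onto the stack), and the whole "(()" would
match.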
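
To make the backtracking visible, here is a rough sketch in Python. Python's
re module has no balancing groups, so this is not the .NET engine: it is a
small hand-written backtracking matcher that imitates the quoted pattern, with
an integer depth standing in for the Stack group. The names match_at, search,
dot and not_paren are made up for this illustration.

```python
def match_at(text, start, any_char):
    """Imitate the quoted pattern anchored at `start`.

    `any_char` is a predicate standing in for the third alternative
    (either "." or "[^()]"). Returns the end index of the match, or None.
    """
    if start >= len(text) or text[start] != "(":
        return None  # the leading literal \( did not match

    def loop(pos, depth):
        # Greedy (?: ... )*: try one more iteration first,
        # with the alternatives tried left to right.
        if pos < len(text):
            c = text[pos]
            if c == "(":                      # \( (?<Stack>)  : push
                end = loop(pos + 1, depth + 1)
                if end is not None:
                    return end
            if c == ")" and depth > 0:        # \) (?<-Stack>) : pop, fails on empty stack
                end = loop(pos + 1, depth - 1)
                if end is not None:
                    return end
            if any_char(c):                   # "." or "[^()]" : stack untouched
                end = loop(pos + 1, depth)
                if end is not None:
                    return end
        # Leave the loop: (?(Stack) ^ | \) ) succeeds only with an empty stack.
        if depth == 0 and pos < len(text) and text[pos] == ")":
            return pos + 1
        return None

    return loop(start + 1, 0)


def search(text, any_char):
    """Return the first match found scanning left to right, or None."""
    for start in range(len(text)):
        end = match_at(text, start, any_char)
        if end is not None:
            return text[start:end]
    return None


def dot(c):
    return c != "\n"        # "." without the single-line option


def not_paren(c):
    return c not in "()"    # "[^()]"


print(search("(()", dot))        # (()
print(search("(()", not_paren))  # ()
```

With the dot predicate, the second "(" ends up being consumed without a push,
so the unbalanced "(()" is accepted from the first character. With not_paren
that escape route does not exist, and only the inner "()" is found.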

Hope this is in any way understandable...

Niki

"Jon Shemitz" <jon@midnightbeach.com> wrote in
news:41004DA6.784F67DE@midnightbeach.com...
> I have a regex that matches nested parens (it's a slightly rewritten
> version of the one in Friedl's "Mastering Regular Expressions, 2e"):
>
> #[IgnorePatternWhitespace]
> \(                       # a literal (
>
> (?:                      # non-capture group
>     \( (?<Stack>)        # on nested (, push empty capture
>   | \) (?<-Stack>)       # on nested ), pop empty capture
>   | [^()]                # anything except ( or )
> )*                       # any number of chars between parens
>
> (?(Stack)                # if stack not empty:
>     ^                    # then, match beginning of string (ie, fail)
>   | \) )                 # else, match literal )
>
> Now, this works just fine, even on empty parens "()". What I don't
> understand is why I have to use /[^()]/ instead of "." in the clause
> that matches anything except parens.
>
> As I understand alternation, the left alternative is matched first. If
> it matches, the right alternative is skipped. (This certainly seems to
> explain the capture behavior when I match /that|th([a-z]+)t/ against
> "that thought": on the first Match, the second Group fails to match;
> while on the second Match, the second Group matches "ough".)
>
> Why, then, doesn't /./ work instead of /[^()]/? Since a left or right
> paren should have already been matched ....
>
> --
>
> programmer, author http://www.midnightbeach.com
> and father http://www.midnightbeach.com/hs
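
The alternation experiment from the quoted question can be reproduced roughly
as follows. This sketch uses Python's re module rather than .NET, on the
assumption that both are backtracking engines that try alternatives left to
right. It also assumes the pattern was meant to read th([a-z]+)t, since a
single [a-z] cannot capture "ough". The variable names are made up for the
example.

```python
import re

# Left alternative wins where it can; the capture group sits in the right one.
pattern = re.compile(r"that|th([a-z]+)t")
results = [(m.group(0), m.group(1)) for m in pattern.finditer("that thought")]
print(results)  # [('that', None), ('thought', 'ough')]

# "Left matched, so right is skipped" is only provisional: if the rest of
# the pattern fails, the engine comes back and tries the next alternative.
backtracked = re.match(r"(?:a|ab)c", "abc")
print(backtracked.group(0))  # abc
```

In the second pattern the left alternative "a" matches first, but "c" then
fails against "b", so the engine returns to the alternation and takes "ab".
That same second chance is what lets a "." alternative pick up a paren that
the paren alternatives had already matched.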


