Re: Regex Matches


Gabriel Lozano-Morán wrote:
> It would match two times if you put an extra G at index 4 in the matches
> string:
> GGATGGGATG
>
> Gabriel Lozano-Morán

Well, yes, but I think what the OP wanted to know is why Regex
doesn't re-scan after a match. That is, in the string GGATGGATG, the
Regex will match the initial substring GGATG. After that, where does
the Regex processor start looking for the next match? Does it resume
one character after the start of the first match, so that it matches
against the substring GATGGATG, in which case it would find a second
match at the fifth character of the original string (the fourth
character of the substring)? Or does it resume after the last
character of the first match, so that it matches against only GATG,
which yields no second match?

Regex appears to display the latter behaviour, according to the OP.
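
For what it's worth, here is a minimal sketch that should reproduce what the OP describes. The class and method names (MatchesDemo, FindIndexes) are made up for illustration; only Regex.Matches and Match.Index come from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public class MatchesDemo
{
    // Hypothetical helper: collects the start index of every match
    // that Regex.Matches reports for the given pattern.
    public static List<int> FindIndexes(string input, string pattern)
    {
        var indexes = new List<int>();
        // Matches resumes scanning after the END of each match, so an
        // occurrence that overlaps the previous match is not reported.
        foreach (Match m in Regex.Matches(input, pattern))
        {
            indexes.Add(m.Index);
        }
        return indexes;
    }

    public static void Main()
    {
        List<int> found = FindIndexes("GGATGGATG", "GGATG");
        Console.WriteLine("Matches found: {0}", found.Count);
        foreach (int index in found)
        {
            Console.WriteLine("Match at index {0}", index);
        }
    }
}
```

With the OP's string this reports a single match at index 0; the overlapping occurrence at index 4 is skipped.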

I checked the RegexOptions enumeration, and don't see any flag for
Rescan. I have seen this option for other Regex pattern matchers, but
it doesn't appear to be in the .NET one.

One thing the OP could do is use Match instead of Matches:

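using System;
using System.Text.RegularExpressions;

string dna = "GGATGGATG";
int matchIndex = 0;
Regex r = new Regex("GGATG");
Match sequence = r.Match(dna, matchIndex);
// Success is false once no further match is found.
while (sequence.Success)
{
    matchIndex = sequence.Index;
    Console.WriteLine("Sequence matched at index {0}", matchIndex);
    // Restart one character past the START of this match, so that
    // overlapping matches are found as well.
    matchIndex++;
    sequence = r.Match(dna, matchIndex);
}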

Or something like that. Then he could determine where Regex should
start searching again after it finds a match.
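
Another option, if the OP would rather stay with Matches, might be to wrap the pattern in a zero-width lookahead so that each match consumes no characters. This is only a sketch under that assumption: OverlapDemo and OverlappingIndexes are hypothetical names, and the helper treats its second argument as a literal string (via Regex.Escape), not as a general pattern.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public class OverlapDemo
{
    // Hypothetical helper: returns the start index of every occurrence
    // of a literal string, overlapping occurrences included.
    public static List<int> OverlappingIndexes(string input, string literal)
    {
        // (?=...) is a zero-width lookahead: it asserts that the literal
        // follows at this position but consumes nothing, so the next
        // scan starts only one character further on.
        Regex r = new Regex("(?=" + Regex.Escape(literal) + ")");
        var indexes = new List<int>();
        foreach (Match m in r.Matches(input))
        {
            indexes.Add(m.Index);
        }
        return indexes;
    }

    public static void Main()
    {
        foreach (int index in OverlappingIndexes("GGATGGATG", "GGATG"))
        {
            Console.WriteLine("Sequence matched at index {0}", index);
        }
    }
}
```

For GGATGGATG this prints indexes 0 and 4. The catch is that each Match is empty (Length 0), so the matched text has to be read from the original string, or captured with a group inside the lookahead.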
