Re: RegEx problem



Thank you, it works nicely, and it was a very good description of how to
read a regex.


"Jesse Houwing" wrote:

* jac wrote, On 28-6-2007 17:26:
Hi,


I have a problem with the following code and can't find the bug:

// Set [8,9,54]
ArrayList aArray = new ArrayList();
regStr = new Regex(@"\[(?:(\d+)[,]?)*(\d+)\]");
if (text != null && regStr.IsMatch(text))
{
    Match m = regStr.Match(text);
    GroupCollection groups = m.Groups;
    number = 0;
    for (int i = 1; i < groups.Count; i++)
    {
        foreach (Capture c in groups[i].Captures)
        {
            aArray.Add(c.Value.ToString());
            number++;
        }
    }
}

[8,9]   : this works; my aArray contains 8 and 9
[16,5]  : OK; I have 16 and 5
[16,34] : not OK; I have 3 items in my array: 16, 3 and 4
[16]    : not OK; I have 2 items in my array: 1 and 6

Why does m.Groups have 3 groups for [16,34]? The same goes for [16]: why
does m.Groups have 2 groups?
I think it must be the last part of my regex, (\d+). This should be one
group even if it contains several digits. How can I solve this?

Thanks in advance,
jac



\[(?<number>\d+)(?:,(?<number>\d+))*\]

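should do the trick. Currently the pattern leaves the engine too many
options, because both the comma and the whole first group are optional
(which they shouldn't be). That lets the engine backtrack and split the
last number between the two groups, so 34 becomes 3 and 4.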

The new expression reads

find a [
find a number (one or more digits)
optionally find a comma followed by a number
repeat optional group if possible
find a ]

Both numbers are captured in the same named group, which makes it easier
to extract the values:

Match m = regStr.Match(text);
foreach (Capture c in m.Groups["number"].Captures)
{
    aArray.Add(c.Value);
}

number = aArray.Count;
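
Put together, the pattern and the capture loop above could look roughly like the following small console program. This is only a sketch: the class name NumberSetDemo and the helper ExtractNumbers are made up for illustration, and it uses List<string> instead of the ArrayList from the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class NumberSetDemo
{
    // Same pattern as above: both groups share the name "number",
    // so all captures end up in one group, in match order.
    static readonly Regex SetPattern =
        new Regex(@"\[(?<number>\d+)(?:,(?<number>\d+))*\]");

    // Hypothetical helper: returns the numbers of the first "[...]" set
    // found in text, or an empty list if there is no match.
    public static List<string> ExtractNumbers(string text)
    {
        var values = new List<string>();
        if (text == null)
            return values;

        Match m = SetPattern.Match(text);
        if (!m.Success)
            return values;

        foreach (Capture c in m.Groups["number"].Captures)
            values.Add(c.Value);

        return values;
    }

    static void Main()
    {
        // The inputs from the original question, plus the [8,9,54] sample.
        foreach (string input in new[] { "[8,9]", "[16,34]", "[16]", "[8,9,54]" })
            Console.WriteLine(input + " -> " + string.Join(" | ", ExtractNumbers(input)));
    }
}
```

With this pattern [16,34] should give two items (16 and 34) and [16] a single item, because a number can no longer be split across two groups.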

Alternatively, you could use string.Split with '[', ',' and ']' as
separator characters, which would probably be faster as well. You can
instruct string.Split to drop empty entries.

string[] results = "[16,23,1]".Split(new char[] { ',', '[', ']' },
    StringSplitOptions.RemoveEmptyEntries);
int number = results.Length;
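
As a slightly fuller sketch of the Split approach (the class name SplitDemo and the helper SplitNumbers are only illustrative), wrapped so it can be run on a few inputs:

```csharp
using System;

static class SplitDemo
{
    static readonly char[] Separators = { ',', '[', ']' };

    // Hypothetical helper: splits on the bracket and comma characters
    // and drops the empty entries that the brackets would otherwise produce.
    public static string[] SplitNumbers(string text)
    {
        if (text == null)
            return new string[0];

        return text.Split(Separators, StringSplitOptions.RemoveEmptyEntries);
    }

    static void Main()
    {
        // The last input is deliberately malformed.
        foreach (string input in new[] { "[16,23,1]", "[16]", "[a,,7" })
            Console.WriteLine(input + " -> " + SplitNumbers(input).Length + " item(s)");
    }
}
```

Note that Split does not check the format at all: the malformed "[a,,7" still yields two entries ("a" and "7"), whereas the regex would simply not match it. If the input is not known to be well formed, some extra validation would be needed.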

I'd prefer this solution over the regex one.

Jesse
