Re: Regex repeating capture

"Jay" <JaythePCguy@xxxxxxxxx> wrote in message news:1170180040.876508.261640@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
I know what the identifiers are, so I'm okay with replacing the .{4}
with (Identifier1|Identifier2|...|IdentifierN) at run time. However, I
cannot blindly end the data capture on an asterisk.
"*CZ1 2.3 4*A56 *fuuuS24364 08 23 72" is also valid, provided *A6 is not
a valid identifier. The data capture can only end when it encounters
another valid identifier.

On Jan 30, 12:52 pm, "Jay" <JaythePC...@xxxxxxxxx> wrote:
The identifier is at least 2 characters long, but has no upper limit.

Thanks,
Jay

On Jan 30, 12:36 pm, "Mythran" <kip_pot...@xxxxxxxxxxx> wrote:

> "Mythran" <kip_pot...@xxxxxxxxxxx> wrote in message
> news:406FDAFC-735E-433F-A47A-478A660F0679@xxxxxxxxxxxxxxxx

> > <jayluc...@xxxxxxxxx> wrote in message
> > news:1170174574.488763.29890@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> >> Howdy,

> >> I'm trying to break an input string into multiple pieces using a
> >> series of delimiters that start with an asterisk. Following the
> >> asterisk is a multiple-character identifier immediately followed by a
> >> data string of variable length. The input string may contain more than
> >> one identifier anywhere in the string.

> >> Here is an example:
> >> *CZ1 2.3 4-56 *fuuuS24364 08 23 72

> >> I'd like to break this into
> >> CZ
> >> 1 2.3 4-56
> >> fuuu
> >> S24364 08 23 72

> >> I have tried the pattern (?:\*(CZ|fuuu)(.*)), which produces the
> >> following output:
> >> CZ
> >> 1 2.3 4-56 *fuuuS24364 08 23 72

> >> How can I force it to repeat the capturing?

> >> Thanks,
> >> Jay

> > So, to split based on an * using a regular expression:

> > string pattern = @"\*(?<Text>[^\*]+)";
> > string input = "*CZ1 2.3 4-56 *fuuuS24364 08 23 72";
> > Match match = Regex.Match(input, pattern);

> > while (match.Success) {
> >     Console.WriteLine(match.Groups["Text"].Value);
> >     match = match.NextMatch();
> > }

> > HTH,
> > Mythran

> Ahh, I didn't know you wanted to break it out into identifier, text,
> identifier, text... thus the previous post should be obliterated :P
> Do you know if the identifier is always 4 characters? Hope so; the
> following example shows how to achieve this:

> string pattern = @"\*(?<Identifier>.{4})(?<Value>[^\*]+)";
> string input = "*CZ1 2.3 4-56 *fuuuS24364 08 23 72";
> Match match = Regex.Match(input, pattern);

> while (match.Success) {
>     Console.WriteLine(
>         "Identifier: {0} - Value: {1}",
>         match.Groups["Identifier"].Value,
>         match.Groups["Value"].Value
>     );
>     match = match.NextMatch();
> }

> HTH,
> Mythran


How many identifiers are there? If the list is small (say, fewer than 10 or so), you can use the regex alternation character '|' in the pattern to list the valid identifiers, and match on those instead of on the asterisk alone.

HTH,
Mythran
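
For what it's worth, here is a rough sketch of how the alternation idea could be combined with Jay's requirement that a value only ends at the next valid identifier. It is not from the thread itself: the class name IdentifierSplitter, the methods BuildPattern and Split, and the decision to trim whitespace from the values are all assumptions made for illustration. The pattern is built at run time from the identifier list, and a lookahead stops the lazy value capture only at "*" followed by a known identifier (or at the end of the input), so "*A56" stays inside the value when A56 is not in the list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original thread.
public static class IdentifierSplitter
{
    // Builds the pattern at run time from the list of valid identifiers.
    public static string BuildPattern(IEnumerable<string> identifiers)
    {
        // Escape each identifier and try longer ones first, so that a longer
        // identifier is preferred over a shorter one that is its prefix.
        string alternation = string.Join("|",
            identifiers.OrderByDescending(id => id.Length)
                       .Select(id => Regex.Escape(id))
                       .ToArray());

        // Value is captured lazily and may contain '*'. The lookahead ends it
        // only at "*<valid identifier>" or at the end of the input.
        return @"\*(?<Identifier>" + alternation + @")"
             + @"(?<Value>.*?)"
             + @"(?=\*(?:" + alternation + @")|$)";
    }

    // Returns (identifier, value) pairs in the order they appear in the input.
    public static List<KeyValuePair<string, string>> Split(
        string input, IEnumerable<string> identifiers)
    {
        var list = identifiers.ToList();
        if (list.Count == 0)
            throw new ArgumentException("At least one identifier is required.");

        var regex = new Regex(BuildPattern(list), RegexOptions.Singleline);
        var result = new List<KeyValuePair<string, string>>();

        foreach (Match m in regex.Matches(input))
        {
            // Trimming the value is an assumption; drop Trim() to keep
            // the whitespace that precedes the next identifier.
            result.Add(new KeyValuePair<string, string>(
                m.Groups["Identifier"].Value,
                m.Groups["Value"].Value.Trim()));
        }
        return result;
    }

    public static void Main()
    {
        string input = "*CZ1 2.3 4*A56 *fuuuS24364 08 23 72";
        foreach (var part in Split(input, new[] { "CZ", "fuuu" }))
        {
            Console.WriteLine("Identifier: {0} - Value: {1}", part.Key, part.Value);
        }
        // Prints:
        // Identifier: CZ - Value: 1 2.3 4*A56
        // Identifier: fuuu - Value: S24364 08 23 72
    }
}
```

Note that if "A56" (or "A5") were added to the identifier list, the first value would end at "*A56" instead, which matches the rule that only a valid identifier terminates the capture.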
