Re: regular expression question



Hi Ludwig,

It is not possible to answer your question as you've stated it. Here's why:

> i'm using the regular expression \b\w to find the beginning of a word,
> in my C# application. If the word is 'public', for example, it works.
> However, if the word is '<public', it does not work: it seems that <
> is not a valid character, so the beginning of the word starts at
> the letter 'p' instead of '<'.

You have not defined your terms. You use the word "word," but you have not
defined what that is supposed to mean in your situation. In regular
expressions, there are no words, only characters. The "\w" character class
matches a word *character*. A word character is defined in regular
expressions as a letter, a digit, or an underscore (in .NET this covers
Unicode letters and digits, not just ASCII).

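So, the character '<' is not defined in regular expressions as a word
character, and therefore does not belong to the set defined by your rule.
The "\b" anchor matches only at a position between a word character and a
non-word character, so in '<public' the first place "\b\w" can match is
the 'p', not the '<'.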
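
To see this for yourself, here is a minimal C# sketch. The class and method
names (WordStartDemo, WordStartIndexes) are made up for illustration; only
the pattern \b\w comes from your post.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class WordStartDemo
{
    // Returns the index of every position where \b\w matches in the input.
    public static int[] WordStartIndexes(string input)
    {
        return Regex.Matches(input, @"\b\w")
                    .Cast<Match>()
                    .Select(m => m.Index)
                    .ToArray();
    }
}
```

WordStartDemo.WordStartIndexes("public") returns the single index 0, while
WordStartDemo.WordStartIndexes("<public") returns the single index 1: the
'<' at index 0 is skipped because it is not a word character.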

However, while you have stated that you *do* want to identify the character
'<' as the "beginning of a word," you have not stated exactly what the rule
is, only a small part of it. For example, by what you've told me, the
following character sequences could all be "words" -

Hello Ludwig ('H', 'L') The first letters of each word are identified.

Hello, <Ludwig> ('H', '<') The first letter of "Hello" and the beginning '<'
are identified.

Hello, !!!!!!! ('H', '!') The first letter of "Hello" and the beginning '!'
are identified. This is possible because you have not stated what characters
you do *not* consider to be the beginnings of words.

And so on. In other words, a regular expression is shorthand for a rule that
defines a pattern. You need to explicitly define what the rule is in order
for me to create a regular expression that satisfies that rule.
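
To illustrate how much the rule matters, here is a rough C# sketch of two
rules you *might* mean. Both rules, both patterns, and the names
WordRuleDemo and StartIndexes are my own assumptions, not something you
have stated, so treat them as examples rather than as the answer.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical rules, for illustration only.
public static class WordRuleDemo
{
    // Assumed rule A: a word begins at any non-whitespace character that is
    // at the start of the string or is preceded by whitespace.
    public const string WhitespaceDelimited = @"(?<!\S)\S";

    // Assumed rule B: a word is a run of word characters, optionally
    // preceded by a single '<'.
    public const string OptionalAngleBracket = @"<?\b\w";

    // Returns the index at which each match of the pattern begins.
    public static int[] StartIndexes(string input, string pattern)
    {
        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Index)
                    .ToArray();
    }
}
```

For the string "Hello, <Ludwig>" both rules report indexes 0 and 7 ('H' and
'<'). For "Hello, !!!!!!!" rule A reports 0 and 7 ('H' and '!'), but rule B
reports only 0. Which one is "correct" depends entirely on the rule you
actually want.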

--
HTH,

Kevin Spencer
Microsoft MVP
Professional Numbskull

Show me your certification without works,
and I'll show my certification
*by* my works.

"Ludwig" <none@xxxxxxxx> wrote in message
news:v6ea22tttuqisnt52vuvdg0voopljbm8ht@xxxxxxxxxx
Hi,

i'm using the regular expression \b\w to find the beginning of a word,
in my C# application. If the word is 'public', for example, it works.
However, if the word is '<public', it does not work: it seems that <
is not a valid character, so the beginning of the word starts at
the letter 'p' instead of '<'.

Because I'm not an expert in regular expressions, maybe someone of you
guys can help me? I need the correct regex to find the beginning of
the word '<public' in a string.

Thanks...

Kind regards,
Ludwig

