Re: RegEx: How to ignore the number of whitespaces?



That is helpful, but I still have a few questions.

"Users of my programme input sequences of arbitrary Unicode characters
(from now on, referred to as "patterns"). These patterns are supposed
to match other given sequences of Unicode characters (from now on,
referred to as "strings").
<snip>
I'm looking for the easiest way to quickly convert the pattern into a
standard regular expression.

This sounds like the "patterns" are performing the work of regular
expressions, matching character sequences in strings. What I don't
understand is why you want to create a new regular expression syntax which
your users must learn, then convert it to the original, rather than using
the original? Or perhaps I'm misunderstanding your intention altogether?

Second, what are the limitations of the "arbitrary Unicode characters?"
There are over a million Unicode code points, and if we confine ourselves to
a single character set, we are still talking about alphanumeric characters,
punctuation, diacritical characters, and non-printing characters. I will
assume that some of these are not within the set of "arbitrary" characters
you're referencing. But I don't know which ones are allowed, and which ones
are not.

Certain subsequences of a pattern may be marked as optional. These may
be found in the string, but need not.
Certain subsequences of a pattern may be marked as a set of
alternatives. Exactly one of them must be found in the string, neither
more nor less.

Okay, we've discussed "arbitrary," but now you will need to define the term
"marked." As the "patterns" are pure text, the "marks" must also be text.
But what constitutes a "text" character and a "mark" character, and how do
you escape text characters to create marks?
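
For what it's worth, the space rules in your post (one space in the
pattern accepts one or more spaces in the string, and leading and trailing
spaces are ignored) don't depend on the answers to the questions above, so
here is a minimal sketch of just that part. It is written in Python's re
module rather than .NET's Regex class, the function name pattern_to_regex
is made up for illustration, and it assumes every non-space character in
the pattern is meant literally. It does nothing about the "optional" and
"alternatives" marks, since I don't know yet what those look like.

```python
import re

def pattern_to_regex(pattern: str) -> str:
    """Hypothetical helper: turn a plain-text pattern into a regex string.

    Assumption: only the space rules are handled; every other character
    in the pattern is treated as a literal.
    """
    # Splitting on spaces and dropping empty pieces removes leading and
    # trailing spaces and collapses runs of spaces in the pattern itself.
    words = [w for w in pattern.split(" ") if w]
    # Escape each word so characters such as "." or "(" match literally,
    # then require one or more spaces between consecutive words.
    body = " +".join(re.escape(w) for w in words)
    # Allow (but never require) spaces at both ends of the string.
    return " *" + body + " *"

rx = pattern_to_regex("personal computer")
print(bool(re.fullmatch(rx, "  personal    computer ")))
print(bool(re.fullmatch(rx, "personalcomputer")))
```

The same idea should carry over to .NET by escaping each word and joining
the results with " +", but treat that as an untested assumption on my part.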

--
HTH,

Kevin Spencer
Microsoft MVP

Printing Components, Email Components,
FTP Client Classes, Enhanced Data Controls, much more.
DSI PrintManager, Miradyne Component Libraries:
http://www.miradyne.net


"Florian Haag" <florianhaag@xxxxxxxx> wrote in message
news:Ofgo6JssHHA.1208@xxxxxxxxxxxxxxxxxxxxxxx
Kevin Spencer wrote:

If you can explain the requirements of the pattern you're trying to
match, without using any regular expression terminology, I can help.

Hi,
thanks for your response!

Hope this is something like what you meant:
"Users of my programme input sequences of arbitrary Unicode characters
(from now on, referred to as "patterns"). These patterns are supposed
to match other given sequences of Unicode characters (from now on,
referred to as "strings").

Certain subsequences of a pattern may be marked as optional. These may
be found in the string, but need not.
Certain subsequences of a pattern may be marked as a set of
alternatives. Exactly one of them must be found in the string, neither
more nor less.
A pattern will never require more than one space character without any
other characters in between to be found in a string.
A pattern will accept any number of space characters (greater than
zero) without any other characters in between in the string at a
position where a space character is expected.
A pattern will ignore any space characters at the beginning and at the
end of a string.
A pattern will never require any space characters at the beginning and
at the end of a string."

I'm looking for the easiest way to quickly convert the pattern into a
standard regular expression.

Thanks in advance,
Florian




