Re: regex syntax

jg wrote:

I know I have to deal with
yyyy-mm-dd ( and variants thereof with dot or slash as separator instead of dash, single digit month or day)
yyyy-MMM-dd ( or just space instead of -)
MMM d, yy ( or yyyy)
and the tougher ones like
d MMM yyyy
d MMM yy

I have created a regex for you that works with all those samples. Here it is:


(?<year>\d{4})[-\./\s](?<month>\d{1,2})[-\./\s](?<day>\d{1,2})$ |
(?<year>\d{4})[-\s](?<month>JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)[-\s](?<day>\d{1,2})$ |
(?<month>JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)\s(?<day>\d{1,2}),\s*?(?<year>\d{4}|\d{2})$ |
(?<day>\d{1,2})\s(?<month>JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)\s(?<year>\d{4}|\d{2})$


I tried this with the following samples, constructed from the templates you gave:

2005-03-08
2005.03.08
2005/03/08
2005 03 08
2005 3 08
2005 3 8
2005 03 8
2005-MAR-08
2005 MAR 08
2005 MAR 8
MAR 8, 2005
MAR 08, 2005
MAR 8, 05
MAR 08, 05
8 MAR 2005
8 MAR 05
08 MAR 2005
08 MAR 05
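
For anyone without Regulator at hand, the same check can be sketched in Python. This is only a rough translation under a few assumptions: Python's re module writes named groups as (?P<name>...) and rejects a group name that appears twice in one pattern, so the four parts are compiled separately and tried in order. The names MONTHS, PATTERNS and parse_date are made up for this sketch.

```python
import re

MONTHS = "JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC"

# The four parts of the expression, one pattern each, in the original order.
PATTERNS = [
    r"(?P<year>\d{4})[-./\s](?P<month>\d{1,2})[-./\s](?P<day>\d{1,2})$",
    r"(?P<year>\d{4})[-\s](?P<month>" + MONTHS + r")[-\s](?P<day>\d{1,2})$",
    r"(?P<month>" + MONTHS + r")\s(?P<day>\d{1,2}),\s*?(?P<year>\d{4}|\d{2})$",
    r"(?P<day>\d{1,2})\s(?P<month>" + MONTHS + r")\s(?P<year>\d{4}|\d{2})$",
]
COMPILED = [re.compile(p, re.IGNORECASE) for p in PATTERNS]


def parse_date(text):
    """Return (year, month, day) as strings, or None if no part matches.

    match() also anchors at the start of the string, which is slightly
    stricter than the original expression (it only anchors the end).
    """
    for rx in COMPILED:
        m = rx.match(text)
        if m:
            return m.group("year"), m.group("month"), m.group("day")
    return None


SAMPLES = [
    "2005-03-08", "2005.03.08", "2005/03/08", "2005 03 08", "2005 3 08",
    "2005 3 8", "2005 03 8", "2005-MAR-08", "2005 MAR 08", "2005 MAR 8",
    "MAR 8, 2005", "MAR 08, 2005", "MAR 8, 05", "MAR 08, 05",
    "8 MAR 2005", "8 MAR 05", "08 MAR 2005", "08 MAR 05",
]

if __name__ == "__main__":
    for sample in SAMPLES:
        print(sample, "->", parse_date(sample))
```

The groups come back as plain strings, so turning "MAR" and a two-digit year into an actual date is still up to the caller.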

As you can see, the expression consists of four alternative parts. Each of these has a $ anchor at the end, which you'll want to remove before using the expression on your own long string. The anchor is only needed to test the expression in Regulator with multiple samples.

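I tested this with the IgnoreWhitespace and IgnoreCase options switched on. IgnoreWhitespace is required because the expression is split across several lines, and IgnoreCase lets the month names match in any case.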
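
To illustrate what using the expression on a longer string might look like once the $ anchors are gone, here is a rough Python sketch. It makes some assumptions: re.VERBOSE stands in for IgnoreWhitespace and re.IGNORECASE for IgnoreCase, and because Python does not accept a repeated group name, each alternative gets a numeric suffix (year1, year2, ...). Those suffixed names, DATE_RE and find_dates are all inventions of this sketch, not part of the original expression.

```python
import re

MONTHS = "JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC"

# The four alternatives without the trailing $, combined into one pattern.
# Group names carry a suffix per alternative because Python's re module
# does not allow the same group name twice.
DATE_RE = re.compile(
    r"""
      (?P<year1>\d{4})[-./\s](?P<month1>\d{1,2})[-./\s](?P<day1>\d{1,2})
    | (?P<year2>\d{4})[-\s](?P<month2>%(m)s)[-\s](?P<day2>\d{1,2})
    | (?P<month3>%(m)s)\s(?P<day3>\d{1,2}),\s*?(?P<year3>\d{4}|\d{2})
    | (?P<day4>\d{1,2})\s(?P<month4>%(m)s)\s(?P<year4>\d{4}|\d{2})
    """ % {"m": MONTHS},
    re.IGNORECASE | re.VERBOSE,
)


def find_dates(text):
    """Return a list of (year, month, day) string tuples found in text."""
    results = []
    for m in DATE_RE.finditer(text):
        groups = m.groupdict()
        # Exactly one alternative took part in the match; find which one.
        for suffix in "1234":
            if groups["year" + suffix] is not None:
                results.append((
                    groups["year" + suffix],
                    groups["month" + suffix],
                    groups["day" + suffix],
                ))
                break
    return results


if __name__ == "__main__":
    text = "Shipped 2005-03-08, invoiced Mar 9, 05 and paid 10 MAR 2005."
    print(find_dates(text))
```

Without any anchors the expression will also match inside longer runs of digits, so depending on the input it may be worth tightening it further.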

Hope this helps!

(If you have any trouble with the regex, I could send you the saved Regulator file. Just in case things get mangled in the message or something.)


Oliver Sturm
--
omnibus ex nihilo ducendis sufficit unum
Spaces inserted to prevent google email destruction:
MSN oliver @ sturmnet.org
Jabber sturm @ amessage.de
ICQ 27142619
http://www.sturmnet.org/blog