Re: Regex question
- From: "Josh Einstein" <josheinstein@xxxxxxxxxxx>
- Date: Fri, 26 Dec 2008 14:43:38 -0500
What Jeff is saying is that your approach is backwards. Take a step back and think about what you're trying to do. If you want to pull a valid date from a string, then you have to write a pattern that defines the lexical structure of the date you're trying to extract.
First, here's the easy button: www.regexlib.com
Second, regular expressions are not by any means easy. What you're trying to do is simple by Regex standards, but still requires a lot more specificity than "a digit or a slash". For example, in Regex you can specify quantifiers and alternation constructs.
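\d{1,2}(-|/)\d{1,2}(-|/)(\d{4}|\d{2})
This matches a 1- or 2-digit month, a 1- or 2-digit day, and a 2- or 4-digit year, each separated by a - or /. The 4-digit year has to come first in the alternation: alternatives are tried left to right, so with \d{2} first, 09/19/2008 would only match as 09/19/20.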
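To make that concrete, here is a minimal sketch that runs a pattern like this over the sample strings from the original post. The class and variable names are made up for illustration, and it only pulls out the first candidate from each string without checking that it is a real date.

```csharp
using System;
using System.Text.RegularExpressions;

class DateExtractDemo
{
    // The 4-digit year alternative is listed first so the alternation
    // does not stop after the first two digits of a year like "2008".
    static readonly Regex DatePattern =
        new Regex(@"\d{1,2}(-|/)\d{1,2}(-|/)(\d{4}|\d{2})");

    static void Main()
    {
        // Sample inputs taken from the original question, plus one with no date.
        string[] inputs =
        {
            "05/07/08(-4%)",
            "09/19/08 DOM 55",
            "09/19/2008 DOM 53",
            "FOR 09/15/08 -23",
            "no date here"
        };

        foreach (string input in inputs)
        {
            // Match() returns the first occurrence; Success is false if none.
            Match m = DatePattern.Match(input);
            Console.WriteLine(m.Success
                ? input + " -> " + m.Value
                : input + " -> (no match)");
        }
    }
}
```

For the four sample strings this extracts 05/07/08, 09/19/08, 09/19/2008 and 09/15/08, and reports no match for the last one.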
However, that doesn't guarantee you a valid date. For example, it matches 99/99/9999. Not to mention the fact that it's only valid for US dates with a pretty specific format.
So another poster had recommended a second pass using DateTime.TryParse. Regex will help you get most of the way there but trying to construct a pattern that will ensure a valid date within the range allowed by T-SQL would be a really lousy use of Regex in the first place.
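A rough sketch of that two-pass idea follows: the regex finds candidates and DateTime.TryParseExact decides whether each one is a real date. The format list, the invariant culture, and the 1753 lower bound (the minimum for a SQL Server datetime column) are my assumptions about your data, and TryExtractDate is just a hypothetical helper name. Adjust all of them to what you actually expect.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

class DateValidateDemo
{
    // First pass: find things that look like dates.
    static readonly Regex Candidate =
        new Regex(@"\d{1,2}(-|/)\d{1,2}(-|/)(\d{4}|\d{2})");

    // Assumed US-style month/day/year layouts; mixed separators
    // such as 09-19/08 are deliberately not accepted here.
    static readonly string[] Formats =
        { "M/d/yyyy", "M/d/yy", "M-d-yyyy", "M-d-yy" };

    // Assumption: the target column is a SQL Server datetime,
    // which cannot store anything before 1753-01-01.
    static readonly DateTime SqlMin = new DateTime(1753, 1, 1);

    // Hypothetical helper: returns the first candidate that parses
    // as a real date within the assumed range.
    static bool TryExtractDate(string input, out DateTime date)
    {
        foreach (Match m in Candidate.Matches(input))
        {
            // Second pass: let the framework validate month/day/year.
            if (DateTime.TryParseExact(m.Value, Formats,
                    CultureInfo.InvariantCulture, DateTimeStyles.None,
                    out date)
                && date >= SqlMin)
            {
                return true;
            }
        }
        date = default(DateTime);
        return false;
    }

    static void Main()
    {
        string[] inputs = { "09/19/2008 DOM 53", "FOR 09/15/08 -23", "99/99/9999" };
        foreach (string input in inputs)
        {
            DateTime d;
            Console.WriteLine(TryExtractDate(input, out d)
                ? input + " -> " + d.ToString("yyyy-MM-dd")
                : input + " -> (no valid date)");
        }
    }
}
```

Here 99/99/9999 still matches the regex but is rejected by the parse step, which is the whole point of the second pass. Two-digit years are expanded according to the culture's calendar settings, so check that the century you get is the one you want.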
Josh
"tshad" <tfs@xxxxxxxxxxxxxx> wrote in message news:ekbDF9tZJHA.1352@xxxxxxxxxxxxxxxxxxxxxxx
"Jeff Johnson" <i.get@xxxxxxxxxxx> wrote in message news:%23%23J8PgFZJHA.4596@xxxxxxxxxxxxxxxxxxxxxxx
"tshad" <tfs@xxxxxxxxxxxxxx> wrote in message news:OpJPJBkYJHA.5828@xxxxxxxxxxxxxxxxxxxxxxx
This is really a regex question.
I am wondering if anyone knows a good regular expression that would pull a valid date from a string.
I have used:
strValue = Regex.Replace(valueIn, @"[^\d/]", "");
which works most of the time.
But I have some cases where I have strings like:
05/07/08(-4%)
09/19/08 DOM 55
09/19/2008 DOM 53
FOR 09/15/08 -23
Stop using Replace to get rid of the stuff you don't want, because clearly it's causing problems. Instead, examine all the possible inputs you might get and then craft a regex to EXTRACT those parts. Then TEST what you've extracted to see if it's a date. After all, 99/76/23 might fit the regex, but it isn't a valid date.
And how would you suggest I do that??? These are just examples of some of the inputs I am getting. I can't really get rid of any parts as I don't know what will be where.
I have no control over what the user will enter in this case.
I need to be able to find a date in the input. Using a variety of possible (probable) date formats, I should be able to extract the date from the input - if one exists.
That was what I was looking for.
If you are positive that the separator will always be a slash and that you'll only have digits (not 10/Dec/2008), you might get away with this:
Regex dateRegex = new Regex(@"\d{1,2}/\d{1,2}/\d{2,4}");
What I was planning to do - just wasn't sure of the regex.
Then you'll use the Match() method (or Matches) and see if you get anything, and then DateTime.TryParse[Exact]() to see if it's a real date.
Thanks,
Tom