Re: RegExp irregularity in JScript

From: Roland Hall (nobody_at_nowhere)
Date: 03/04/04


Date: Thu, 4 Mar 2004 12:36:30 -0600


"Bryan Donaldson" wrote:
: Is replying to yourself as bad as talking to yourself? Perhaps a better
: topic for an alt.psychology forum...
:
: In any event.. we decided to take this page to a .Net server and move the
: test code to the code-behind.
:
: When we did, the pattern that had been giving us trouble worked as
: expected.
:
: So, we believe the VBScript Regular Expression class (version 1.0 through
: 5.5) is broken in how it handles multiple positive look-ahead when they're
: followed by the {n,m} construct.
:
: ^(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{4,8}$
:
: In particular, the lower bound seems to be accepting only strings that
: have N of the matching forward look-ahead at the END of the string. The
: limiting length M is validated for the entire entry
:
: 1aAAAA works 1AAAAAa fails AAAA1a fails 1Aaaa fails 12Aa fails
:
: 1Aaaaaaa works 1Aaaaaaaa fails (as expected)
:
: If the filter is reduced, removing the decimal character requirement :
: ^(?=.*[a-z])(?=.*[A-Z]).{4,8}$
:
: then 1Aaaa works, but 1Aaa fails.

Actually, I'm using 5.6, and I disagree with that analysis. I'll explain
why.

First, I'm using JavaScript, not JScript or VBScript, but that's really a
red herring.
Second:
Your pattern has the anchors ^ and $, so everything must fall between the
beginning and the end of a line, excluding \n.
(?=) is a forward look-ahead: each one tests on its own without consuming
any characters. So the pattern makes 3 look-ahead tests plus one more test
at the end, which I'll get to in a moment.
First test: .*\d which is zero or more of any character followed by a digit.
.+\d would require at least one character before the digit.
Second test: .*[a-z] which is zero or more of any character followed by a
lowercase letter a-z.
Third test: .*[A-Z] which is zero or more of any character followed by an
uppercase letter A-Z.
Fourth test: .{4,8} any character, minimum 4, maximum 8. Unlike the
look-aheads, this one does consume characters.
This is why aA123 does not work for me but aA1234 does. You can check that by
putting 4 characters that satisfy any one of the tests at the end.
These all work: 1aABCD 1Aabcd Aa1234...
As I see it, .{4,8} is not being applied to the whole string, including the
characters the look-aheads matched, as you apparently want.
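
A small harness like the sketch below is one way to check the sample strings
from this thread on whatever engine is at hand. The helper name checkSamples
is made up for illustration, and console.log stands in for whatever output
call your host provides. The results are deliberately not listed here: the 5.x
script engines discussed in this thread and a current ECMAScript engine may
not agree on every sample.

```javascript
// Pattern quoted from the thread: three look-aheads, then a 4-8 length check.
var pattern = /^(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{4,8}$/;

// Sample strings mentioned in this thread.
var samples = ["1aAAAA", "1AAAAAa", "AAAA1a", "1Aaaa", "12Aa",
               "1Aaaaaaa", "1Aaaaaaaa", "aA123", "aA1234"];

// checkSamples is a hypothetical helper: it returns an object mapping each
// sample to true/false as reported by the engine running this code.
function checkSamples(re, list) {
  var results = {};
  for (var i = 0; i < list.length; i++) {
    results[list[i]] = re.test(list[i]);
  }
  return results;
}

var results = checkSamples(pattern, samples);
for (var key in results) {
  console.log(key + " -> " + results[key]);
}
```

Running the same list on both engines side by side shows quickly which
samples they disagree on.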

What I did in my original example was to break it up and not test everything
at once. If I had to test everything in one pattern, I thought, I would have
to come up with every possible ordering. Below is an example that will test
for at least 1 lowercase, 1 uppercase and 1 digit in any order:
([a-z]+[A-Z]+[0-9]+)|([a-z]+[0-9]+[A-Z]+)|([A-Z]+[a-z]+[0-9]+)|([A-Z]+[0-9]+[a-z]+)|([0-9]+[a-z]+[A-Z]+)|([0-9]+[A-Z]+[a-z]+)

Note: [0-9]+ can be replaced with \d+

It does not, however, limit the string to a minimum of 4 and a maximum of 8
characters. It only tests for 3 because my groups are in threes. Following
this method, I would need a very large expression.
However, if you break it up, and pass each test as a variable, it's simple.
You can search for a lowercase letter which is required /[a-z]+/
...uppercase /[A-Z]+/
and digit /[0-9]+/

I'm using [0-9]+ here because the backslash in \d+ would be treated as an
escape, so you would actually need \\d+ to pass it as a string in JavaScript.

Obviously the first test should test the length of the string, minimum 4,
maximum 8.
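
Put together, the split-up approach could look something like the sketch
below. The function name isValidPassword is hypothetical, and the three
patterns are the ones listed above.

```javascript
// isValidPassword is a hypothetical name for the split-up check.
function isValidPassword(s) {
  // Length first: no point looking at the characters if the size is wrong.
  if (s.length < 4 || s.length > 8) {
    return false;
  }
  // Then each required character class on its own. The + is harmless here;
  // test() only needs one match anywhere in the string.
  return /[a-z]+/.test(s) && /[A-Z]+/.test(s) && /[0-9]+/.test(s);
}

console.log(isValidPassword("aA123"));      // true
console.log(isValidPassword("1Aa"));        // false, only 3 characters
console.log(isValidPassword("1Aaaaaaaa"));  // false, 9 characters
console.log(isValidPassword("abcdef"));     // false, no uppercase or digit
```

Since no look-ahead is involved, this version does not depend on how a given
engine handles (?=).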
And, if you still wanted to keep your original design, by passing only 1
pattern, you do not have to pass it as a regular expression, but rather an
encoded one or an array.
var s = "[a-z]+   [A-Z]+   [0-9]+"; // delimiter is 3 spaces
var sCol = s.split("   ");

Array contents:
sCol[0]="[a-z]+"
sCol[1]="[A-Z]+"
sCol[2]="[0-9]+"

You can obviously grab the size of the array and loop through it. My point
in the first example, where I split it up and tested the length first, was
that there is no reason to test the validity of the characters if there are
too few or too many of them. The rest is just testing them individually.
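
A rough sketch of that loop follows. The function name passesAll is made up
for illustration, and the 4 to 8 length limits are the ones from the original
pattern.

```javascript
// One encoded string carrying all three patterns, delimited by 3 spaces.
var s = "[a-z]+   [A-Z]+   [0-9]+";
var sCol = s.split("   ");

// passesAll is a hypothetical helper: length check first, then every
// pattern in the array must match somewhere in the input.
function passesAll(input, patterns) {
  if (input.length < 4 || input.length > 8) {
    return false;
  }
  for (var i = 0; i < patterns.length; i++) {
    if (!new RegExp(patterns[i]).test(input)) {
      return false;
    }
  }
  return true;
}

console.log(passesAll("aA123", sCol));  // true
console.log(passesAll("aaaa", sCol));   // false, no uppercase or digit
```

Adding a requirement then only means adding another pattern to the string.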

You see, this is all you need if you first test the length of the string.

I have not tested it with .NET, but IIRC I read that JScript.NET adds some
new functionality to regular expressions.
I'll write a page later on that allows for the creation of complex passwords
and passing multiple arguments as one. You could just select from a list of
how difficult you want to make it.

-- 
Roland Hall
/* This information is distributed in the hope that it will be useful, but
without any warranty; without even the implied warranty of merchantability
or fitness for a particular purpose. */
Technet Script Center - http://www.microsoft.com/technet/scriptcenter/
WSH 5.6 Documentation - http://msdn.microsoft.com/downloads/list/webdev.asp
MSDN Library - http://msdn.microsoft.com/library/default.asp

