Re: String.split emulation for EcmaScript 3rd ed compliance needed
From: Marek Mand (._at_.)
Date: 03/15/04
- Next message: Rob Allen: "Select by uniqueidentifier field type fails"
- Previous message: Bob Barrows [MVP]: "Re: DOM Reference/Resources"
- In reply to: Lasse Reichstein Nielsen: "Re: String.split emulation for EcmaScript 3rd ed compliance needed"
- Next in thread: Marek Mand: "Re: String.split emulation for EcmaScript 3rd ed compliance needed"
- Reply: Marek Mand: "Re: String.split emulation for EcmaScript 3rd ed compliance needed"
- Reply: Lasse Reichstein Nielsen: "Re: String.split emulation for EcmaScript 3rd ed compliance needed"
- Messages sorted by: [ date ] [ thread ]
Date: Mon, 15 Mar 2004 18:17:37 +0200
Lasse Reichstein Nielsen wrote:
> Marek Mand <.@.> writes:
>>is toString() somehow better than the source property (are there known
>>proven use cases when it has failed in the past)?
> No, I merely forgot the source property (I have never used it before :)
{I am going to intentionally full-quote about 90% of Your post for
easier reading. For a quick jump: there are serious issues with the
newly posted code.
}
It is also supported by browsers other than MSIE, so there is no reason
to despise it. Using it would also add semantics to the code, compared
to obscure toString voodoo.
> It is probably faster. I avoid using the dynamic properties of the
> RegExp object because they are not part of ECMAScript (and because
> global properties are bad design :).
The presence of those properties does indeed vary a lot, even if You
look only at JScript, where most of them were added in JScript 5.5.
The most horrible situation with my code came after I dumped
leftContext and rightContext, because they are so new in JScript, and
started to depend on the presence of RegExp.lastIndex to calculate the
contexts with substring ops. That resulted in an endless loop on Opera,
which didn't have that property (probably a critical bug, but they don't
have many script-aware and interested persons in the NGs), so it
probably pushed the same string into the result array eternally. ;D
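
For illustration, the contexts can be computed without any of the
dynamic RegExp properties, using only the index property of the array
that exec returns (part of ECMAScript 3rd ed). This is a sketch, not the
code discussed above; the name contexts is hypothetical.

```javascript
// Hypothetical helper: compute the text left and right of the first
// match using only the exec result, not RegExp.leftContext,
// RegExp.rightContext or RegExp.lastIndex.
function contexts(re, str) {
  re.lastIndex = 0; // so a regex with the "g" option searches from the start
  var m = re.exec(str);
  if (m == null) return null;
  var end = m.index + m[0].length; // index just past the match
  return {
    left: str.substring(0, m.index),
    match: m[0],
    right: str.substring(end)
  };
}

var c = contexts(/<[^<>]+>/, "A<B>bold");
// c.left is "A", c.match is "<B>", c.right is "bold"
```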
> If you know the target browser supports them, you can use the search
> method to find both the match and its offset in one go. Otherwise,
> it'll take two (or using the replace method, which also gets both
> as arguments).
I would stay as far away as possible from replace with a function
argument in a non-controlled (web) environment, as it is also
relatively new. But whoa, there could be another great challenge
here: emulation code that implements the callback version of
string.replace! =D
While many of the methods are easily emulated, I am afraid apply and
call aren't among those, so I would stay far away from them unless
there really is no other way to achieve the goal, even with bloated code.
>>Yet the resulting array of those of mine functions and the functions
>>kindly written by You arent still conformant to the ecma spec:
>>(For example,
>>"A<B>bold</B>and<CODE>coded</CODE>".split(/<(\/)?([^<>]+)>/) evaluates
>>to (page 104) the array
>>["A", undefined, "B", "bold", "/", "B", "and", undefined,
>>"CODE", "coded", "/", "CODE", ""].)
> Ah yes (section 15.5.4.14). The splicing of captured substrings into
> the result. I wasn't even aware of that.
Glad I could show You something "new".
As the core syntax and behaviour of the scripting language is quite
simple, we often don't dig into the details, or we forget them quickly,
since such complexities are very rarely met and dealt with in everyday
programming practice.
Apart from that, split also has an optional "limit" argument, whose
support is a minor priority compared to getting ECMA-compliant results
out of the method. The special case of "re" being undefined, which
should result in no split, is also a kindergarten game.
> A fixed mySplit-function would be:
> ---
> function mySplit(string,re) {
> // make re have g option
> var reString = re.toString();
> reString = reString.substring(reString.indexOf("/")+1,
> reString.lastIndexOf("/"));
> re = new RegExp(reString,"g");
>
> var parts = [];
> var lastIdx = 0;
> string.replace(re,function(match){
> var idx = arguments[arguments.length-2]; // index of match
> parts.push(string.substring(lastIdx,idx));
> parts.push.apply(parts,
> Array.prototype.slice.call(arguments,1,arguments.length-2));
> lastIdx = idx+match.length; // index past match
> return match;});
> parts.push(string.substring(lastIdx));
> return parts;
> }
> ---
> And for mySplit2:
> ---
> function mySplit2(string,re) {
> var parts=[];
> var match;
> var workString = string;
> while((match = re.exec(workString))!=null) {
> var idx = workString.indexOf(match[0]);
> parts.push(workString.substring(0,idx));
> parts.push.apply(parts,match.slice(1));
> workString = workString.substring(idx+match[0].length);
> re.lastIndex = 0; // in case it has "g" option
> }
> parts.push(workString);
> return parts;
> }
> ---
I will call them mySplit3 and mySplit4 respectively to avoid confusion;
please do so too if new code differs from older versions.
The emulation code has serious flaws and does not work as it should.
CASE #1
<q ecma-262 15.5.4.14>
while "ab".split(/a*/) evaluates to the array ["","b"].)
</q>
<using mySplit3>
alert(mySplit3( "ab", /a*/ ));
<outcome MSIE6>
[,,b,]
</outcome>
</using>
<using mySplit4>
alert(mySplit4( "ab", /a*/ ));
<outcome MSIE6>
THE ALGORITHM SEEMS TO BE STUCK IN ETERNAL LOOP.
</outcome>
</using>
CASE #2
<q ecma-262 15.5.4.14>
For example, "ab".split(/a*?/) evaluates to the array ["a","b"],
</q>
<using mySplit3>
alert(mySplit3( "ab", /a*?/ ));
<outcome MSIE6>
[,a,b,]
</outcome>
</using>
<using mySplit4>
alert(mySplit4( "ab", /a*?/ ));
<outcome MSIE6>
THE ALGORITHM SEEMS TO BE STUCK IN ETERNAL LOOP.
</outcome>
</using>
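
Both failures appear to come from empty matches: the replace-based
version records every empty match as a separator, and in the exec-based
version an empty match leaves workString unchanged, so the loop never
advances. Below is a sketch that follows the steps of 15.5.4.14 more
closely and uses neither apply, call nor a function argument. The name
specSplit is hypothetical, and the sketch assumes that exec on a regex
with the g option honours lastIndex and that unmatched captures come
back as undefined.

```javascript
// Hypothetical sketch of an ECMA-262 3rd ed style split for a RegExp
// separator. p is the start of the current piece, q the search position.
function specSplit(str, re, limit) {
  var lim = (limit === undefined) ? 4294967295 : (limit >>> 0);
  var parts = [];
  if (lim === 0) return parts;
  if (re === undefined) return [str]; // no separator: no split
  var flags = "g" + (re.ignoreCase ? "i" : "") + (re.multiline ? "m" : "");
  var g = new RegExp(re.source, flags); // global copy, original untouched
  if (str.length === 0) {
    // Empty input: [] if the pattern matches the empty string, else [""].
    return g.test("") ? [] : [str];
  }
  var p = 0, q = 0, m, e, i;
  while (q < str.length) {
    g.lastIndex = q;
    m = g.exec(str);
    // A match that starts at the very end of the string is not a separator.
    if (m == null || m.index >= str.length) break;
    e = m.index + m[0].length; // index just past the match
    if (e === p) { // empty match at the start of the piece: skip it
      q = q + 1;
      continue;
    }
    parts[parts.length] = str.substring(p, m.index);
    if (parts.length === lim) return parts;
    for (i = 1; i < m.length; i++) { // splice in the captured substrings
      parts[parts.length] = m[i];
      if (parts.length === lim) return parts;
    }
    p = e;
    q = p;
  }
  parts[parts.length] = str.substring(p);
  return parts;
}

var case1 = specSplit("ab", /a*/);  // ["", "b"]
var case2 = specSplit("ab", /a*?/); // ["a", "b"]
```

The key step is the e === p check: an empty match at the start of the
current piece is skipped by moving the search position forward one
character, which is what keeps the loop from getting stuck.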
> You could also add the submatches using a loop, if you worry about
> whether the browser supports apply and call. I just took the quick
> way out :)
[]
> I still prefer to use replace with a function argument. It gets all
> the right values passed as arguments.
Oh no, apply and call and a function argument. If there is any possible
way to avoid them, it must be done at all costs in the web environment
for some years to come, to achieve broad reach. I think that when a new
browser with JS support appears on the market, those are the most
'complex' parts of the script spec, so support for them is implemented
only in later builds of its script interpreter.
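
The loop alternative mentioned in the quoted text could look roughly
like this. It is only a sketch; the name pushSubmatches is hypothetical.
It also avoids push, which is itself an Edition 3 addition.

```javascript
// Hypothetical helper: append the captured substrings of an exec result
// to parts with a plain loop instead of parts.push.apply(...).
function pushSubmatches(parts, match) {
  for (var i = 1; i < match.length; i++) {
    parts[parts.length] = match[i]; // unmatched captures stay undefined
  }
  return parts;
}

var m = /<(\/)?([^<>]+)>/.exec("A<B>bold");
var out = pushSubmatches(["A"], m);
// out is ["A", undefined, "B"]
```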
Thanks for Your participation; I hope You have found this thread
interesting among the gray mass of the usual 'how to open a window'
problem posts! =D
--
marekmand
if (typeof delete undefined == 'unknown'){;} // do nothing