Re: text compare takes very long...

On 13 Oct, 14:47, "Mike Williams" <m...@xxxxxxxxxxxxxxxxx> wrote:
"Co" <vonclausow...@xxxxxxxxx> wrote in message

news:1192270828.787010.50550@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Mike, Great stuff, just one remark. Could you add
another boolean so the user can choose if he wants
to run it case sensitive or not?

I've been wondering how long it would be before someone asked that question
;-) It is of course not quite so simple as "adding another Boolean", as I'm
sure you are aware, but it can certainly be done. In the original code (the
code that does not include the SAFEARRAY stuff) you could very easily make
the comparison case insensitive by simply changing the following:

b1 = StrConv(String1, vbFromUnicode)
b2 = StrConv(String2, vbFromUnicode)

to:

b1 = StrConv(LCase(String1), vbFromUnicode)
b2 = StrConv(LCase(String2), vbFromUnicode)
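
A minimal sketch of how the Boolean that Co asks for might wrap this change
in the original (non-SAFEARRAY) code. The helper name ToAnsiBytes and its
IgnoreCase parameter are hypothetical and are not part of the posted routine:

```vb
' Hypothetical helper (not from the original post): converts a String to
' the ANSI byte array that the original routine compares, lower-casing
' the text first when IgnoreCase is True.
Private Function ToAnsiBytes(ByVal Text As String, _
        Optional ByVal IgnoreCase As Boolean = False) As Byte()
    If IgnoreCase Then
        ToAnsiBytes = StrConv(LCase$(Text), vbFromUnicode)
    Else
        ToAnsiBytes = StrConv(Text, vbFromUnicode)
    End If
End Function
```

With that in place the two conversion lines would become
b1 = ToAnsiBytes(String1, IgnoreCase) and b2 = ToAnsiBytes(String2, IgnoreCase),
assuming the comparison routine itself is given an IgnoreCase parameter.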

However, in the modified faster code it is not quite so easy. The simplest
way would be to perform an OR of the data words with the value 32, but
before doing that it would be wise to investigate where best to actually
do it. In the earlier versions (where the number of loops in the main
algorithm was equal to the product of the two string lengths) it would be
best to do it to the two Integer arrays themselves, making sure that you
revert the contents of those arrays to their original condition at the
end of the routine to avoid altering the String data in the calling
procedure (because the original strings will respond to any changes you make
to the Integer data). However, in the latest and fastest routine (where the
number of loops in the new algorithm is much less than the product of the
two string lengths) it might be best to instead do it in the algorithm
itself.

Before writing any amendments I might be inclined to place a loop counter
inside the new algorithm and test it on various different input strings to
see how many loops occur and how many ORs are required in each loop, and
then decide where to place the amendment on the basis of whether or not the
number of ORs required in the main algorithm (on average) exceeds twice the
sum of the two string lengths.

I haven't actually got time to look into any of that stuff at the moment
(Saturday and the weekend and all that, and of course the all-important
England versus France rugby on TV here in the UK tonight) but perhaps others
here might have a go at it. It would certainly be interesting to investigate
both methods I have suggested, and perhaps other alternative methods, so
that we end up with a flexible and yet still fast routine. It's quite
interesting stuff, this :-)

Mike

So did you win?

Marco
