Re: What is the Fastest way for adding string items to some array/collection in sorted order?

Tech-Archive recommends: Fix windows errors by optimizing your registry

From: [SolarAngel] (not-for-mail)
Date: 11/21/04


Date: Sun, 21 Nov 2004 02:56:49 +0100

I have found very fast solution (of course C) that satisfies my needs q=)
time calculation on my pc is around 0.037ms for each item that is added to
collection total time is around 3.7s, listbox (visible=false) total time is around >10s,
total count of items in test is 100 000 (unsorted).
Test is performed to almost always add item at beginning of array,
which is worst scenario in this case, for example:
for i=99999 to 0 step -1
    dc.AddItemSortedWD "Item " & cstr(i)
next i

make a note for larger array time will raise almost exponential.
do not use arrays larger than 500 000 elements,
everything larger than that will take a while q=).
for small arrays do not use this, you don't need it!

Here is source (you could preserve little more time if you remove returning NewIndex from both VB & C source, and declare both as
Subs (in C as void)):

'VB DECLARE (make a note X.dll is dll name where you have compiled your C dll)
Private Declare Function StrArrAddItemSortedWD Lib "X.dll" (ByVal lpszArr As Long, ByRef lpArrCount As Long, ByVal lpsz As Long) As
Long
private arr() as string
private arrcount as long
private arrCountRes as long ' for reserved memory (it saves very little time but it is also good idea to have reserved memory,
empty string uses only 4 bytes - 1000 items * 4 bytes = 4Kb, and for 1 000 000 items it will be 4Mb)
private mNewIndex as long

'VB Call
Rem AddItemSortedWD
' add szItemString (DUPLICATES)
' note: szItemString is Empty String when API call returns
'
Public Function AddItemSortedWD(ByVal szItemString As String) As Long
  If arrCountRes < arrCount Then arrCountRes = arrCount: ReDim Preserve arr(0 To arrCount)
  mNewIndex = StrArrAddItemSortedWD(VarPtr(arr(0)), arrCount, VarPtr(szItemString))
  AddItemSortedWD = mNewIndex
End Function

// C Source
long ZERO = 0;
//
// Add item lpsz to lpszarr array in sorted position (WITH DUPLICATES)
//
long __declspec(dllexport) StrArrAddItemSortedWD(long* lpszarr, long* arrCount, long* lpsz)
{
long ac = *arrCount;
double p = ac/2;
double step = p;
long index = 0;
  while (step>=0.5)
  {
    step = step / 2;
    switch ( wcscmp((BSTR)*lpsz, (BSTR)lpszarr[(long)p]) ){
      case 1:
        p += step; break;
      case -1:
        p -= step; break;
      case 0:
        goto label_addItem;
    }
  };
label_addItem:
  index =(long)(p+0.5); // fix for direct casting double->long (MSKB:Article ID: Q12297)
  // if valid index (it should be valid always)
  if(0<=index){
    // do we need to move other elements or to set at the end of array
    if(index<ac) memmove(lpszarr+index+1, lpszarr+index, (ac-index)*4); else index = ac;
    // add item, clean item pointer, increment array counter
    memmove(lpszarr+index, lpsz, 4);
    memmove(lpsz, &ZERO, 4);
    *arrCount = ++ac;
  }
  // return inserted index
  return index;
}

[SolarAngel]

"[SolarAngel]" <not-for-mail> wrote in message news:OOAQGDGzEHA.1404@TK2MSFTNGP11.phx.gbl...
| Thanks Mayayana (cool name/nick),
|
| the problem is that I don't need sorting routime I have a lot of them. q;-)
| the problem is to make adding routine doing it faster, cause for large arrays is better to add in sorted order than to sort array
| everytime.
| Cause I can't perdict when user or some other event will add item to this collection, and some events for searching..., so that I
| can perform sorting on end when few or more items are added. Multithreaded environment!
|
| I will make same code in C to see what will happen
| It was a while when I wrote last C DLL (8 months) q=)
|
| Thanks again for tips
|
| BTW: vbspeed.com doesn't work
| ---
| The requested URL could not be retrieved
| While trying to retrieve the URL: http://www.vbspeed.com/
| The following error was encountered: Unable to determine IP address from host name for www.vbspeed.com
| The dnsserver returned: Name Error: The domain name does not exist.
| This means that:
| The cache was not able to resolve the hostname presented in the URL. Check if the address is correct.
| ---
|
| [SolarAngel]
|
|
| "mayayana" <mayaXXyana1a@mindYYspring.com> wrote in message news:KpAmd.2511$pK6.73@newsread2.news.atl.earthlink.net...
| | A couple of places that you might look for more info.:
| |
| | 1) vbspeed.com
| |
| | 2) Matthew Curland's Advanced Visual Basic 6
| | book. On p. 235 is a section that discusses sorting
| | routines.
| | _____________________________
| |
| | mayayXXana1a@mindYYspring.com
| | For return email remove XX and YY.
| | _____________________________
| | [SolarAngel] <not-for-mail> wrote in message
| | news:u$#g9uFzEHA.4028@TK2MSFTNGP15.phx.gbl...
| | > Thanks for MUtility
| | >
| | > Nice thing for sorting but it seams to me that DSW project or I have
| | problem with OLEAUTO.H do I need something to know?:
| | > ---
| | > Compiling...
| | > StdAfx.cpp
| | > c:\program files\microsoft visual studio\vc98\include\oleauto.h(30) :
| | error C2146: syntax error : missing ';' before identifier
| | > 'IID_StdOle'
| | > c:\program files\microsoft visual studio\vc98\include\oleauto.h(30) :
| | fatal error C1004: unexpected end of file found
| | > Error executing cl.exe.
| | >
| | > MUtility.dll - 2 error(s), 0 warning(s)
| | > ---
| | >
| | > But thing is still that I need something that adds strings in sorted array
| | in sorted position (or I will have to sort that array
| | > every time!).
| | > as ListBox sorts every item that is added to his collection/list. Some
| | thing like that.
| | > I have something that does the job just a little bit faster than ListBox
| | (visible=true),
| | > but when I set visible to false, ListBox is awesome faster than my code.
| | >
| | > at least you have gave me an idea, try to write same thing in C, maybe VB
| | wrappes some code check and it looses some time for that.
| | >
| | > I use RtlMoveMemory and RtlZeroMemory for moving pointer (strings),
| | > also I reserve memory in advance so that VB doesn't loose time for redim.
| | > Code is very similar to next:
| | >
| | > Private Declare Sub memcpy Lib "KERNEL32" Alias "RtlMoveMemory" (ByVal
| | ulDest As Long, ByVal ulSrc As Long, ByVal Length As Long)
| | > Private Declare Sub memzero Lib "KERNEL32" Alias "RtlZeroMemory" (ByVal
| | ulDest As Long, ByVal Length As Long)
| | >
| | > Private Type UDTDynCollection
| | > ItemString As String
| | > End Type
| | > Private ITEMMEMORYBLOCK As Long ' Len(UDTDynCollection) =4 in this case
| | > Private arr() As UDTDynCollection
| | > Private arrCount As Long
| | > Private arrCountRes As Long
| | >
| | > Public Sub ReserveMemory(ByVal ulItemCount As Long)
| | > If arrCount > ulItemCount Then Exit Sub
| | > ReDim Preserve arr(0 To ulItemCount)
| | > arrCountRes = ulItemCount
| | > End Sub
| | >
| | > Public Sub AddItemSorted(ByVal szItemString As String)
| | > Dim p As Single
| | > Dim step As Single
| | > Dim n As Long
| | > p = arrCount / 2
| | > step = p
| | > Do While step >= 0.5
| | > step = step / 2
| | > Select Case StrComp(szItemString, arr(CLng(p)).ItemString,
| | vbTextCompare)
| | > Case 1: p = p + step: n = 1
| | > Case -1: p = p - step
| | > Case 0: Exit Sub ' don't add item allready exist
| | > End Select
| | > Loop
| | > AddItem szItemString, CLng(p) + n
| | > End Sub
| | >
| | > Public Sub AddItem(ByVal ItemStringParam As String, Optional ByVal Index
| | As Long = -1)
| | > If arrCountRes < arrCount Then arrCountRes = arrCount: ReDim Preserve
| | arr(0 To arrCount)
| | > If Index >= 0 And Index < arrCount Then
| | > memcpy VarPtr(arr(Index + 1)), VarPtr(arr(Index)), (arrCount - Index)
| | * ITEMMEMORYBLOCK
| | > memzero VarPtr(arr(Index)), ITEMMEMORYBLOCK
| | > ElseIf Index = -1 Then
| | > Index = arrCount
| | > End If
| | > memcpy VarPtr(arr(Index).ItemString), VarPtr(ItemStringParam),
| | ITEMMEMORYBLOCK
| | > memzero VarPtr(ItemStringParam), ITEMMEMORYBLOCK
| | > arrCount = arrCount + 1
| | > End Sub
| | >
| | > [SolarAngel]
| | >
| | >
| | > "tlviewer" <tlviewerSHRUB@yahooCHENEY.com> wrote in message
| | news:eVHg3Z$yEHA.3236@TK2MSFTNGP15.phx.gbl...
| | >
| | > "[SolarAngel]" <not-for-mail> wrote in message
| | news:OCUK4a3yEHA.2600@TK2MSFTNGP09.phx.gbl...
| | > > Well I know this is very strange,
| | > > but I have tested some algorithms for sorting string items, and one
| | works just a little bit better than sorting in ListBox Object,
| | > > but when I set visible property to false in ListBox wonder has happened
| | time is split by 2, which is in VB fastest thing I have
| | > > found (Haven't yet tested ListBox with reserved memory) but
| | unfortunately ListBox control doesn't support more items than 32768 to
| | > > be exact 65535 with API's, and I expect more than 100 000 items in
| | collection.
| | > >
| | > > So if anybody knows some good control of class or any API, please send
| | me example
| | > >
| | > > Thanks
| | > >
| | > > [SolarAngel]
| | > >
| | > >
| | >
| | > This archive tests a HuthSort against a Qsort (c++) in VB and
| | > VBScript on an array of 100,000 strings.
| | >
| | > http://4.11.211.97/cpp_qsort.zip
| | >
| | > hth,
| | > tlviewer
| | >
| | > --
| | >
| | >
| |
| |
|
|



Relevant Pages

  • Re: error with bin.base64 decoding
    ... Private Const mcClassName As String = "CBase64" ... Private Declare Sub RtlMoveMemory Lib "kernel32.dll" _ ... 'Encode encodes the byte array Source() to a string using the BASE64 ... 'Decode decodes a BASE64 encoded string to an one dimensional, ...
    (microsoft.public.vb.general.discussion)
  • Re: string array / collection ? what should I use ?
    ... > Use array of user-defined types. ... > sPerson as String ... > Private Sub InitScript ... >> First Witch, Boil thou first i' the charmed pot. ...
    (microsoft.public.vb.general.discussion)
  • Re: Best way to INSERT
    ... Private mlngCurrentIndex As Long ... Public Property Get StringValue() As String ... Public Sub AddText ... 'Class to encapsulate array generation of strings. ...
    (microsoft.public.inetserver.asp.general)
  • Re: Base64 encoding/decoding in VB6
    ... > Does anyone know how I can encode binary data to produce a character string ... Private Declare Sub RtlMoveMemory Lib "kernel32.dll" _ ... 'Encode encodes the byte array Source() to a string using the BASE64 ...
    (microsoft.public.vb.general.discussion)
  • Re: Double vector from C++
    ... private MyVec() as t_Struct ... Private MyVec() As MyVecSub' not legal in VB ... An 'array' where each element is an 'array' of t_Structs. ... For Each jnk In MyVec ...
    (microsoft.public.vb.general.discussion)