Re: Fast linked list
From: Murrgon (murrgon_at_hotmail.com)
Date: Thu, 23 Dec 2004 11:50:06 -0500
Mark Randall wrote:
> However, in my experience allocating large chunks of memory and copying the
> contents is a hidiously slow process compared to a linked list.
> While I understand that end-to-end enumeration is speed critical, this leads
> me to presume that there must be a large amount of data and as such, arrays
> would be somewhat slow in the process of adding. While not an essential
> requirement as stated it must be considered for over-all system performance.
> Typically, with some form of LL | DLL implimentation you already have your
> next location available from the last, or you must call a function
> I agree, vector would be quick, provided you allocated sufficent space
> beforehand for all the elements because as you know, in such an event the
> old array is deallocated and a new one is created each time.
> While I understand LL's are my personal preference for situations like this,
> as the processor (at least Intel) seems to work best allocating small
> amounts of memory rather than huge chunks of it. But that is for the most
> part irrelevant in this case.
> As far as vector goes, yes, useful - its speed is in its pointer addition.
> However, for end-to-end, you already know the next item of a Purely Linked
> List as soon as you have the previous one, there is no need to recalculate
> it from the base. Downside, an extra 4 bytes (LL) or 8 bytes (DLL).
Actually, random insertions into a vector/dynamic array are not as slow
as one is lead to believe. The thing most algorithms courses never even
mention is cache, which can make a world of difference in how your data
structure is used.
Consider a linked list that stores pointers, or integers if you prefer.
Now insert 10000 elements into the list in sorted order (i.e. random
insertions into arbitrary locations in the list). One of the main
features of a list is that it nicely handles inserting a new element
at the front, middle and back of the sequence. Or does it?
Inserting in the front of the list is a no brainer and is O(1) time.
If you store the tail of the list as well, inserting at the back is
O(1) as well, otherwise it will be O(n). Inserting in the middle is
worst case O(n). This is the kind of stuff they tell you in your
algorithms course. What they fail to mention is that for *every*
traversal from one list cell to the next, you are extremely likely
to cause a cache miss. This kills your performance. So although you
have fast memory allocation and insertion itself, the traversal is the
pits because you are jumping all over memory.
Now consider a dynamic array/vector. If we do the same thing and
insert 10000 pointers/integers in random locations in the array
something interesting happens. First off, inserting in the middle
of the array will cause a mass copy of all proceeding elements
down one array index. This may seem like it would be time consuming
but in reality, when you take the cache into consideration, it does
The reason it doesn't take as long as you think is because the copy
will happen in cache memory (at least until the array becomes larger
than the size of the cache). Arrays store elements in a continuous
block of memory and reading/writing through memory in a linear
fashion is exactly what the cache was made to speed up. This also
means that when you need to traverse the array, it is just about as
fast as you can get; each element is next to each other in memory so
you don't incurr nearly as many cache misses.
So what do you think happens when the size of your data structure
exceeds the size of the cache? Well, as far as the array/vector
goes, it just caches it in chunks and you hardly notice any
difference in preformance. The linked list however takes a big
hit because it is cache unfriendly and continously iterating over
elements within in the list to either do insertions or just
general iteration just kills the cache.
If you want to compare speeds, random insertions into an array and
random insertions into a list are just about equivalent in time
until you hit the cache boundary. Then the array takes over in
speed. That seems counter intuitive, I know. I thought so too.
I don't know if you can find it, but there was a lecture at the
GDC in 2001 called "Cache Conscious Coding". The lecturer had
some hard numbers on speed differences between lists and arrays.
Ultimately, it boils down to this: if iteration is most important
some type of array is the best way to go. You also don't incurr
the extra overhead of having to store extra pointers in your