Re: Sorting algorithms was: Factorial
From: U-CDK_CHARLES\\Charles (_at_cdksystems.com)
Date: 05/27/04
- Next message: U-CDK_CHARLES\\Charles: "Re: Shortest day of year"
- Previous message: Lee Miller: "Re: BEWARE OF GODADDY--BAD BUSINESS PRACTICE"
- In reply to: Bob O`Bob: "Sorting algorithms was: Factorial"
- Next in thread: J French: "Re: Sorting algorithms was: Factorial"
- Reply: J French: "Re: Sorting algorithms was: Factorial"
- Messages sorted by: [ date ] [ thread ]
Date: Thu, 27 May 2004 07:25:31 -0700
On Wed, 26 May 2004 20:46:24 -0700, Bob O`Bob
<filterbob@yahoogroups.com> wrote:
> Ulrich Korndoerfer wrote:
>>
>
> Can you folks make some specific suggestions for a specific real-world need?
>
> I repeatedly need to sort tab-separated-value files of 600megabytes or so,
> sometimes up to a GB. And I expect this file size to creep up, perhaps
> approaching 2GB as soon as a year from now. Though actually the ability to
> sort even larger files might lead to an overall improvement in the efficiency
> of the whole system it's a part of. Line lengths vary, though they're rarely
> under 90 bytes, or over 1200.
>
I think you don't want to "sort" at all.
What you need is a filing system that is self sorting, with decent
performance.
If this is an academic project, DAGS on B+-trees. It's an n-way tree,
with n chosen to fit in the amount of data your OS reads from the HD in
a single read.
Way back in the dark ages of 720kB floppy disks, we chose between a
B-tree and hashing, but chose hashing as it saved a disc access, back
when dics access took seconds rather than ms. Most of the time, we
didn't care about the ordering.
If this is a commercial project, or an acedemic project where you can
spend money, there are a number of commercial libraries available that
do this for you.
- Next message: U-CDK_CHARLES\\Charles: "Re: Shortest day of year"
- Previous message: Lee Miller: "Re: BEWARE OF GODADDY--BAD BUSINESS PRACTICE"
- In reply to: Bob O`Bob: "Sorting algorithms was: Factorial"
- Next in thread: J French: "Re: Sorting algorithms was: Factorial"
- Reply: J French: "Re: Sorting algorithms was: Factorial"
- Messages sorted by: [ date ] [ thread ]