Re: Fast File Searching


bo_dong_at_yahoo.com
Date: 02/17/05


Date: 17 Feb 2005 13:34:38 -0800

Thanks to everyone for the replies.

Many people suggested that I use a database instead of the file
system. Here is my personal view on the subject.

A database is a great solution when the relationship structure of the
data is simple and flat. A typical scenario is a few large,
spreadsheet-like tables linked into a star schema.

On the other hand, the file system is a great solution when the
relationship structure is complex and hierarchical, as in the
investment accounting application I am working on (Client -> Entity ->
Account -> Asset -> Lot -> Transactions).

For this application, I use folders to represent the client, entity,
account, asset, and lot levels, and I use files to hold transactions,
positions, prices, and security descriptions. The benefits as I see
them:

1. You can add or remove a level, a column, or an attribute without
redesigning the whole schema.
2. You can navigate the hierarchy and browse the data using Windows
Explorer and Notepad, with no need for SQL statements or stored
procedures.
3. Windows has built-in features to manage the file system, such as
file permissions, user rights, backup, sharing, and change detection.
4. It is very easy to program against this hierarchical tree. I can use
Directory.GetDirectories() and a foreach loop to get to the right file,
and then use StreamReader and StreamWriter to read or update it.
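
A minimal sketch of what point 4 could look like, under stated
assumptions: the class name LedgerTree, the file name
"transactions.txt", and the fixed depth of five folder levels (client,
entity, account, asset, lot) are all hypothetical and only there for
illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class LedgerTree
{
    // Hypothetical file name; the real application may use another one.
    private const string TransactionFileName = "transactions.txt";

    // Builds <root>/<level1>/<level2>/.../transactions.txt from the given level names.
    public static string TransactionFile(string root, params string[] levels)
    {
        string path = root;
        foreach (string level in levels)
            path = Path.Combine(path, level);
        return Path.Combine(path, TransactionFileName);
    }

    // Appends one line to the file, creating any missing folders first.
    public static void AppendTransaction(string file, string line)
    {
        Directory.CreateDirectory(Path.GetDirectoryName(file));
        using (StreamWriter writer = new StreamWriter(file, true))
            writer.WriteLine(line);
    }

    // Reads every transaction line found exactly five folder levels below the root.
    public static List<string> ReadAllTransactions(string root)
    {
        List<string> lines = new List<string>();
        Collect(root, 5, lines);
        return lines;
    }

    private static void Collect(string dir, int levelsBelow, List<string> lines)
    {
        if (levelsBelow == 0)
        {
            // We are in a lot folder: read its transaction file if it exists.
            string file = Path.Combine(dir, TransactionFileName);
            if (!File.Exists(file)) return;
            using (StreamReader reader = new StreamReader(file))
            {
                string line;
                while ((line = reader.ReadLine()) != null)
                    lines.Add(line);
            }
            return;
        }
        // Otherwise descend one level (client -> entity -> account -> ...).
        foreach (string child in Directory.GetDirectories(dir))
            Collect(child, levelsBelow - 1, lines);
    }
}
```

Note that ReadAllTransactions touches every folder in the tree, so its
cost grows with the size of the tree. That full walk is exactly the
part a search index would be meant to avoid.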

I have used Informix, Oracle, and SQL Server databases before, so I
know there is no magic to them. A database can only do what the
operating system and the file system that host its files allow.

The database beats the file system at search speed thanks to prebuilt
index files (essentially long sorted lists of keys with pointers to
the records). Can't I build an index file myself and do quick searches
throughout my tree without using a database? From what I know, WinFS
(the file system planned for the next version of Windows) is supposed
to address this issue, blurring the line between database and file
system.
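
To make the idea concrete, here is a rough sketch of such a homemade
index, not a description of how any database engine does it. It walks
the tree once, writes a sorted list of "key<TAB>path" lines, and then
answers lookups with a binary search. The class name TreeIndex, the
choice of the file name as the key, and the tab-separated format are
all assumptions made for illustration. The index file should live
outside the indexed tree, and it has to be rebuilt (or updated)
whenever files are added, moved, or removed.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class TreeIndex
{
    // Walks the whole tree once and writes "key<TAB>path" lines sorted by key.
    // Assumption: the key is the file name and paths contain no tab characters.
    public static void Build(string root, string indexFile)
    {
        List<string> entries = new List<string>();
        foreach (string path in Directory.GetFiles(root, "*", SearchOption.AllDirectories))
            entries.Add(Path.GetFileName(path) + "\t" + path);
        entries.Sort((a, b) => string.CompareOrdinal(KeyOf(a), KeyOf(b)));
        File.WriteAllLines(indexFile, entries.ToArray());
    }

    // Loads the index into memory; do this once and reuse it for many lookups.
    public static string[] Load(string indexFile)
    {
        return File.ReadAllLines(indexFile);
    }

    // Returns every path stored under the given key, using binary search.
    public static List<string> Find(string[] index, string key)
    {
        // Lower bound: first entry whose key is not smaller than the search key.
        int lo = 0, hi = index.Length;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (string.CompareOrdinal(KeyOf(index[mid]), key) < 0)
                lo = mid + 1;
            else
                hi = mid;
        }
        // Entries with equal keys are adjacent, so scan forward from there.
        List<string> hits = new List<string>();
        for (int i = lo; i < index.Length && KeyOf(index[i]) == key; i++)
            hits.Add(index[i].Substring(index[i].IndexOf('\t') + 1));
        return hits;
    }

    private static string KeyOf(string entry)
    {
        return entry.Substring(0, entry.IndexOf('\t'));
    }
}
```

Each lookup then costs on the order of log2(N) comparisons for N
indexed files instead of a walk over every folder. The trade-off is
the same one a database faces: the index must be kept in sync with the
data it points to.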

So my question is: where can I find information on quick search
algorithms and database search techniques?

Thanks.

