Re: Storing docs in database vs storing in file system

From: Zach Wells (no_zwells_spam_at_ain1.com)
Date: 07/21/04


Date: Wed, 21 Jul 2004 10:56:32 -0500

Brian Beam wrote:
> My apologies if this has been asked a thousand times already...
>
> I'm building a simple document management system. My current solution stores
> documents directly in the database. However, I'm wondering what the
> advantages (disadvantages) are to this method vs. storing the documents in
> the file system and simply keeping a pointer to each file in the database.
>

I've done this exact thing both ways. One solution stored the files in a
directory structure, the other in the database itself. In both cases I
was dealing with 10's of thousands of files, all of which needed to be
searchable and viewable.

After doing both methods, I personally found the storage in sql method
to be better. Granted, this was just for *my* particular situation.

Here is why I liked the SQL storage solution better:

Actually though, before I get to that, I want to respond to one of your
"cons". You say "Database size: smaller database when storing
pointers". While the database itself my be smaller you are still using
the same basic amount of disk space since you have to store the file
*somewhere*.

Anyway, the reasons I like the sql solution better were as follows:
1) I could use the MS SQL full text indexing for very fast & powerful
searches. This was opposed to using a 3rd party search tool to search
the files when they were stored in the directory structure. I could
easily put the searching functionality right into my app.

2) Directory limits. I'm not sure how well ntfs handles it but I know
fat32 chokes when you start getting a few thousand files in a single
directory. This caused me to have to come up with a more complex
directory structure to store all the files. I was storing resumes so
what I did was create a directory structure where there was a directory
for with the first two letters of the persons last name. So, Jones and
Johnson would both be stored in the same subdirectory. This worked ok
but still had a good number of subdirectories under my resume directory.

3) Using the database storage system I was easily able to store multiple
copies of the same file without having to come up with some crazy naming
scheme. For example, again storing resumes, I would be able to store the
original formatted version (word), a pure text version and then any
number of specially formatted versions.

4) Security. Security was a big plus. Since it wouldn't take a rocket
scientist to figure out how to burn the entire directory onto a CD and
walk out the door with thousands of resumes (which is the bread and
butter of the industry I work in), being able to secure the resumes was big.

5) Backup. You can have all your backing up done in one routine. You
don't have to worry about one working, but the other not. Not to
mention, the trouble of backing up 40,000 individual files is a very
long process.

The biggest con I remember dealing with was the fact that it was extra
work getting the data into and out of SQL server, for which there isn't
a ton of good documentation. I was using ADO at the time (ado, not
ado.net) and only had one or two examples to work from. It also required
that I create a temporary file on the local machine, which wasn't hard,
just extra

Zach


Loading