Re: Some questions about sparse files
- From: nickdu <nicknospamdu@xxxxxxxxxxxxxxxx>
- Date: Sun, 28 Dec 2008 06:26:00 -0800
The index is the cookie I returned from my Store() method. The consumer was
given the cookie and they give it back to me when they want to retrieve the
blob.
Since what I provided was pseudocode, I was not too concerned with performance.
--
Thanks,
Nick
nicknospamdu@xxxxxxxxxxxxxxxx
remove "nospam" change community. to msn.com
"m" wrote:
Okay - how would you get the index? This is what you need to be able to do
better than the file system does for this plan to make sense. And you need
to be able to say that the benefit of using your sparse file is sufficient
to outweigh the extra problems with backup & maintenance.
BTW: this pseudocode is inefficient and can be improved by using page-size
aligned reads. The seek is unnecessary and will preclude multiple readers -
use overlapped IO. Both reads can be consolidated for small blobs (less than
one buffer in size); for large blobs, multiple reads will be required.
All of this is true regardless of how the file(s) that store the data are
arranged.
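To make the point about consolidated, seek-free reads concrete, here is a minimal sketch in Python. It assumes a hypothetical record layout (an 8-byte length prefix followed by the payload) and a 4096-byte page; neither comes from the original pseudocode. os.pread is POSIX-only; on Windows the comparable approach would be ReadFile with the offset carried in an OVERLAPPED structure. The sketch shows only the offset-based access pattern, not unbuffered IO.

```python
import os
import struct

PAGE = 4096                   # assumed page size for this sketch
HEADER = struct.Struct("<Q")  # assumed 8-byte little-endian length prefix


def read_blob(fd, offset):
    """Read a length-prefixed blob stored at a page-aligned offset."""
    # One positional read fetches the header and the start of the payload,
    # so no shared file pointer is moved and readers do not interfere.
    first = os.pread(fd, PAGE, offset)
    if len(first) < HEADER.size:
        raise EOFError("no blob header at this offset")
    (length,) = HEADER.unpack_from(first)
    data = bytearray(first[HEADER.size:HEADER.size + length])
    pos = offset + len(first)
    # Larger blobs need further reads, requested in whole-page multiples.
    while len(data) < length:
        want = length - len(data)
        chunk = os.pread(fd, -(-want // PAGE) * PAGE, pos)
        if not chunk:
            raise EOFError("blob truncated")
        data += chunk[:want]
        pos += len(chunk)
    return bytes(data)
```

A small blob is served by the single first read; only blobs larger than one page fall into the loop.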
Also, storing metadata creates the possibility of file corruption - consider
the case of process termination during a blob insert. If the metadata is
updated, and the blob data isn't, there will be garbage returned when the
app asks for that blob; but if the metadata isn't updated, the insert will
be lost completely; and, even worse, if a partial metadata update is made,
then the whole file might be unreadable. These risks can be mitigated by
using FILE_FLAG_NO_BUFFERING + FILE_FLAG_WRITE_THROUGH, but you must write
code that can handle the consequences of corruption and / or to check for
these conditions and try to correct them. This is something else that the
file system does for you directly.
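One way to detect the failure cases described above is sketched below in Python. The record layout (a magic number, a length and a CRC-32 ahead of the payload), the MAX_BLOB limit and the function names are all assumptions made for illustration, not something taken from the thread. The payload is flushed before the header is written, so a crash in between leaves a slot that reads as empty rather than as garbage.

```python
import os
import struct
import zlib

MAGIC = 0x424C4F42              # hypothetical marker ("BLOB") for a valid header
RECORD = struct.Struct("<IQI")  # assumed header: magic, payload length, CRC-32
MAX_BLOB = 1 << 30              # assumed upper bound; rejects garbage lengths


def write_record(f, offset, blob):
    """Write the payload first, then the header that makes it visible."""
    f.seek(offset + RECORD.size)
    f.write(blob)
    f.flush()
    os.fsync(f.fileno())        # payload is on disk before the header
    f.seek(offset)
    f.write(RECORD.pack(MAGIC, len(blob), zlib.crc32(blob)))
    f.flush()
    os.fsync(f.fileno())


def read_record(f, offset):
    """Return the blob, or None if the record is missing or damaged."""
    f.seek(offset)
    raw = f.read(RECORD.size)
    if len(raw) < RECORD.size:
        return None
    magic, length, crc = RECORD.unpack(raw)
    if magic != MAGIC or length > MAX_BLOB:
        return None
    blob = f.read(length)
    if len(blob) != length or zlib.crc32(blob) != crc:
        return None
    return blob
```

This only detects damage; deciding how to repair or skip a bad record is still up to the application.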
"nickdu" <nicknospamdu@xxxxxxxxxxxxxxxx> wrote in message
news:A3F5E9E5-3FC4-49A7-8F34-666735135A77@xxxxxxxxxxxxxxxx
Hmmm, I replied to this yet it didn't seem to make it to this thread.
The indexing would look something similar to:
GetBlob(int index, byte *bytes, int size)
{
    Seek(_sparse, (index + 1) * 1GB, SEEKOFFSET_ORIGIN);
    Read(_sparse, &length, sizeof(length));
    Read(_sparse, bytes, min(length, size));
}
Most likely I would use the first block to store information, like the next
free block. Other than that the indexing would be as shown above. Just
like indexing an array. That's the beauty of sparse files, right? You can
be wasteful and pick a huge block size and the OS will only consume the
actual amount of space you use. Of course the OS must have to do some
bookkeeping, but better the OS than me.
The NTFS change journal uses a sparse file so I assume it saw a benefit in
using it.
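The array-style indexing described above could be fleshed out roughly as follows. This is a sketch in Python, not the poster's implementation: the SlotStore class, the 1 MiB slot size (the post uses 1 GB) and the 8-byte header format are assumptions made for illustration. Note that on NTFS a file only becomes sparse once it is marked with FSCTL_SET_SPARSE; this sketch does not do that, so it shows the indexing scheme only.

```python
import os
import struct

SLOT_SIZE = 1 << 20           # 1 MiB per slot here; the post uses 1 GB
HEADER = struct.Struct("<Q")  # assumed 8-byte little-endian integer


class SlotStore:
    """Hypothetical fixed-slot blob store; block 0 holds the next free index."""

    def __init__(self, path):
        if not os.path.exists(path):
            with open(path, "wb") as f:
                f.write(HEADER.pack(0))  # next free index starts at 0
        self._f = open(path, "r+b")

    def _next_free(self):
        self._f.seek(0)
        return HEADER.unpack(self._f.read(HEADER.size))[0]

    def store(self, blob):
        """Write blob into the next free slot and return its index (the cookie)."""
        if len(blob) > SLOT_SIZE - HEADER.size:
            raise ValueError("blob does not fit in one slot")
        index = self._next_free()
        # Block 0 is reserved, so blob i lives at (i + 1) * SLOT_SIZE.
        self._f.seek((index + 1) * SLOT_SIZE)
        self._f.write(HEADER.pack(len(blob)) + blob)
        self._f.seek(0)
        self._f.write(HEADER.pack(index + 1))
        self._f.flush()
        return index

    def get_blob(self, index):
        """Return the blob stored under the given cookie."""
        if not 0 <= index < self._next_free():
            raise IndexError("unknown cookie")
        self._f.seek((index + 1) * SLOT_SIZE)
        (length,) = HEADER.unpack(self._f.read(HEADER.size))
        return self._f.read(length)

    def close(self):
        self._f.close()
```

The cookie returned by store() is simply the slot index, so retrieval needs no lookup table; the cost is that no blob can exceed the slot size. The circular-queue behavior from the original question is left out here.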
--
Thanks,
Nick
"m" wrote:
What technique do you plan to use for indexing your blobs that will be
better than a file system? This is, after all, what they are designed to
do.
"nickdu" <nicknospamdu@xxxxxxxxxxxxxxxx> wrote in message
news:ABA4112C-311F-48C1-A39E-9A62F4648620@xxxxxxxxxxxxxxxx
I have a couple questions regarding sparse files.
1. We're looking for an easy and efficient way to store blobs (arrays of
bytes) of data in some sort of circular queue and return some sort of key
someone can use for later access to that blob. The blobs can have varying
lengths. I was wondering whether sparse files would be a reasonable approach
for this as it appears it's an efficient mechanism for storing messages of
varying length (at least that's what I gather).
I guess you could also store each blob in its own file, but then I think the
overhead of creating a file per blob (may have millions) might be costly.
I could also store all the blobs in a single "normal" (non-sparse) file, but
then I think the housekeeping of walking the chain of blobs (most likely
I'll need to chain them if I don't use a sparse file) might be costly in
terms of performance and also adds to the code I'll have to write. Though I
guess NTFS is keeping its own list of sections of the file that contain
data.
2. If I want to copy a sparse file to other Windows machines running NTFS do
I need to write my own code to do that or does CopyFile() handle sparse files
such that it only copies the parts of the file which contain data such that
the copied file is exactly the same as the source file?
--
Thanks,
Nick