Re: How to split a compressed file programmatically?



On 2007-10-29 07:45:46 -0700, Brian Roisentul <brianroisentul@xxxxxxxxxxx> said:

> BTW, this is my code, maybe i did something wrong:

The most serious issue I see is that you don't properly account for the fractional remainder of the file when you calculate the number of parts. Your division will return the number of full-size partitions of the file, but in most cases you'll have some extra bytes at the end that you're not saving.
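To make the arithmetic concrete, here is a minimal sketch of a part count that does account for the remainder. The class and method names are made up for illustration and are not from your code.

```csharp
using System;

static class PartCount
{
    // Hypothetical helper: number of parts needed to hold fileLength bytes
    // when each part holds at most partSize bytes (rounds up, not down).
    public static long Count(long fileLength, long partSize)
    {
        if (partSize <= 0)
            throw new ArgumentOutOfRangeException("partSize");

        // Plain integer division truncates: 10500000 / 1000000 == 10,
        // which silently drops the last 500000 bytes.
        // Adding (partSize - 1) before dividing rounds up instead.
        return (fileLength + partSize - 1) / partSize;
    }
}
```

For a 10,500,000-byte file split into 1,000,000-byte parts this gives 11 parts: ten full ones plus a final part of 500,000 bytes.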

You should keep track of how many bytes you've actually written, and then after writing out the full-size partitions, write a final partition containing whatever's left. Personally, I would forget about the calculation altogether and just write a loop that keeps writing bytes in chunks as large as you want or however many bytes you have remaining, whichever is less, until you have no more bytes to write. But how exactly you do this isn't so important as making sure you do it right.

There are other things about the code that are less-than-perfect (a couple of examples are mentioned after this paragraph), but as near as I can tell, the above is the most serious problem. I didn't bother to inspect the code that reconstructs the file, but assuming it was written with similar care as the code that splits the file, it likely has problems too.

I'm a bit bewildered at the "long.Parse(size_part.ToString())" business. The "size_part" variable is already a long; what possible value is there in converting that to a string and then back to a long?

You also have a strange calculation that screws up the "offset" variable; it turns out not to matter because you don't really need the variable at all. But it's still odd.

And why allocate a new buffer for each chunk you want to write, and why does that buffer have to be the length of the original file, and given that you're allocating a new buffer each time, why read the data anywhere other than the beginning of the buffer?

For splitting the file, I would recommend code that looks more like this:

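    // Requires "using System.IO;". Assumes "this.ext" is the part-file
    // extension field from your existing class.
    void splitFile(string path, string path_parts, int size_part)
    {
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            int ipart = 0;
            // One buffer, allocated once and reused for every part
            byte[] arrBytes = new byte[size_part];
            int cbRead;

            // Read() returns 0 at end of file; the last chunk is usually
            // shorter than size_part, and cbRead tells us how short.
            while ((cbRead = fs.Read(arrBytes, 0, size_part)) > 0)
            {
                string filename = Path.Combine(path_parts, Path.GetFileNameWithoutExtension(path) + ipart.ToString() + this.ext);

                using (FileStream fsOut = new FileStream(filename, FileMode.Create))
                {
                    // Write only the bytes actually read, not the whole buffer
                    fsOut.Write(arrBytes, 0, cbRead);
                }

                ipart++;
            }
        }
    }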

In other words, just read chunks of the original file until you can't read any more, writing them out one by one, each to a new file.

Reading the parts and reconstructing the file should be similarly simple. And remember, the more complicated you make the code, the easier it is to create a bug in the code. The single most important thing you can do to ensure your code is correct and free of bugs is to make it as simple as you can.
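As a rough sketch of what "similarly simple" could look like for the reconstruction: the class name, method name, and parameters below are hypothetical, and the code assumes the caller passes the part files in the correct order.

```csharp
using System.IO;

static class FileJoiner
{
    // Hypothetical helper: appends each part, in the order given, to outputPath.
    public static void JoinFiles(string[] partPaths, string outputPath)
    {
        // Small fixed buffer, allocated once; the size is an arbitrary choice
        byte[] buffer = new byte[64 * 1024];

        using (FileStream fsOut = new FileStream(outputPath, FileMode.Create))
        {
            foreach (string partPath in partPaths)
            {
                using (FileStream fsIn = new FileStream(partPath, FileMode.Open, FileAccess.Read))
                {
                    int cbRead;

                    // Copy this part until Read() reports end of file
                    while ((cbRead = fsIn.Read(buffer, 0, buffer.Length)) > 0)
                    {
                        fsOut.Write(buffer, 0, cbRead);
                    }
                }
            }
        }
    }
}
```

One thing to watch for: if you find the parts by listing a directory and sorting the names as strings, "name10" sorts before "name2". It's safer to build the part names from a counter, the same way the split code does, and stop when the next file doesn't exist.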

Pete
