Re: How to write array of doubles to stream without using a loop?

From: John Dumais (nospam_doomer_at_sonic.net)
Date: 01/14/05


Date: Fri, 14 Jan 2005 04:58:35 GMT

Jon Skeet [C# MVP] wrote:

> Have you measured the overhead? Is it definitely causing a problem?
> There's a tendency to assume that things like this are bottlenecks
> without experimentation - that may not be the case here, but it's worth
> knowing before we get into much riskier code.
>

Not specifically in C#, because I can't get an apples-to-apples
comparison there, but my experience has been that most run-time
libraries work better when you supply them with one hunk of data
as opposed to supplying many little hunks in a loop. The attached
example illustrates this. It's in C++, because I can't come up
with a reasonable comparison in C#. Sorry in advance -- I wrote
this illustration on my Linux machine, since I don't have a
Windows PC at home. I ran the program 10 times in a loop and got
the following results.

> Using loop
> Elapsed time: 2 seconds, 223613 micro-seconds
> Writing in a single hunk
> Elapsed time: 1 seconds, 725197 micro-seconds
> Using loop
> Elapsed time: 3 seconds, 256561 micro-seconds
> Writing in a single hunk
> Elapsed time: 1 seconds, 277398 micro-seconds
> Using loop
> Elapsed time: 2 seconds, 224807 micro-seconds
> Writing in a single hunk
> Elapsed time: 1 seconds, 45266 micro-seconds
> Using loop
> Elapsed time: 2 seconds, 215541 micro-seconds
> Writing in a single hunk
> Elapsed time: 1 seconds, 231558 micro-seconds
> Using loop
> Elapsed time: 2 seconds, 221532 micro-seconds
> Writing in a single hunk
> Elapsed time: 1 seconds, 278577 micro-seconds
> Using loop
> Elapsed time: 2 seconds, 220584 micro-seconds
> Writing in a single hunk
> Elapsed time: 1 seconds, 11936 micro-seconds
> Using loop
> Elapsed time: 2 seconds, 215263 micro-seconds
> Writing in a single hunk
> Elapsed time: 1 seconds, 28349 micro-seconds
> Using loop
> Elapsed time: 2 seconds, 223527 micro-seconds
> Writing in a single hunk
> Elapsed time: 1 seconds, 273473 micro-seconds
> Using loop
> Elapsed time: 2 seconds, 226312 micro-seconds
> Writing in a single hunk
> Elapsed time: 1 seconds, 7940 micro-seconds
> Using loop
> Elapsed time: 2 seconds, 217676 micro-seconds
> Writing in a single hunk
> Elapsed time: 1 seconds, 23728 micro-seconds
>

The program...

> #include <cstdio>
> #include <cstdlib>
> #include <sys/time.h>
>
> typedef void(*WriterFuncPtr)(FILE*, double*, size_t);
>
> // Time a single call to funcPtr and print the elapsed wall-clock time.
> void timeIt(WriterFuncPtr funcPtr, FILE *fp, double *data,
>             size_t numDataPoints)
> {
>     timeval startTime = {0, 0};
>     timeval endTime = {0, 0};
>     timeval elapsedTime = {0, 0};
>
>     if( ! gettimeofday(&startTime, 0)){
>         funcPtr(fp, data, numDataPoints);
>
>         if( ! gettimeofday(&endTime, 0)){
>             // Borrow one second if the microsecond field would go negative.
>             if(startTime.tv_usec > endTime.tv_usec){
>                 endTime.tv_usec += 1000000;
>                 endTime.tv_sec--;
>             }
>
>             elapsedTime.tv_usec = endTime.tv_usec - startTime.tv_usec;
>             elapsedTime.tv_sec = endTime.tv_sec - startTime.tv_sec;
>
>             // Cast explicitly: time_t and suseconds_t need not be long.
>             printf("Elapsed time: %ld seconds, %ld micro-seconds\n",
>                    (long)elapsedTime.tv_sec, (long)elapsedTime.tv_usec);
>         }
>     }
> }
>
> // Debugging aid: print every double stored in the file.
> void checkFileContents(FILE *fp)
> {
>     double data = 0;
>
>     // Print only after a successful read, so the last value is not
>     // printed a second time when fread hits end of file.
>     while(fread(&data, sizeof(data), 1, fp) == 1){
>         printf("%f\n", data);
>     }
> }
>
> // One fwrite call per double.
> void writeDataUsingLoop(FILE *fp, double *data, size_t numDataPoints)
> {
>     for(size_t i = 0; i < numDataPoints; ++i){
>         (void)fwrite(data++, sizeof(double), 1, fp);
>     }
> }
>
> // A single fwrite call for the whole array.
> void writeDataInOneHunk(FILE *fp, double *data, size_t numDataPoints)
> {
>     (void)fwrite(data, sizeof(double), numDataPoints, fp);
> }
>
> int main(void)
> {
>     // "wb": the data is raw binary, so avoid text-mode translation.
>     FILE *fp = fopen("data", "wb");
>     if(fp){
>         const size_t numDataPoints = 8 * 1024 * 1024;
>         double *data = (double*)malloc(numDataPoints * sizeof(double));
>         if(data){
>             for(size_t i = 0; i < numDataPoints; ++i){
>                 data[i] = (double)i;
>             }
>
>             printf("Using loop\n");
>             timeIt(writeDataUsingLoop, fp, data, numDataPoints);
>
>             fclose(fp);
>
>             fp = fopen("data", "wb");
>             if(fp){
>                 printf("Writing in a single hunk\n");
>                 timeIt(writeDataInOneHunk, fp, data, numDataPoints);
>
>                 fclose(fp);
>                 fp = 0;
>             }
>
>             free(data);
>         }
>         else{
>             // Allocation failed: still close the file opened above.
>             fclose(fp);
>         }
>     }
>
>     /*
>     fp = fopen("data", "rb");
>     if(fp){
>         checkFileContents(fp);
>
>         fclose(fp);
>         fp = 0;
>     }
>     */
>
>     return 0;
> }
>

Thanks,


