Re: + or += (was: why not use i++ instead of ++i in for()?)

Tech-Archive recommends: Fix windows errors by optimizing your registry

From: Bo Persson (bop_at_gmb.dk)
Date: 02/24/04

  • Next message: Hendrik Schober: "Re: + or += (was: why not use i++ instead of ++i in for()?)"
    Date: Tue, 24 Feb 2004 19:40:57 +0100
    
    

    "Hendrik Schober" <SpamTrap@gmx.de> skrev i meddelandet
    news:usBYr7v%23DHA.684@tk2msftngp13.phx.gbl...
    > Bo Persson <bop@gmb.dk> wrote:
    > > [...]
    > > > > > s = a + b + c + d;
    > > > > >
    > > > > > is not as efficient as
    > > > >
    > > > > This all depends on the implementation, and the lengths of the
    strings
    > > > > involved. If some, or all, of the strings are short, there might not
    be many
    > > > > allocations at all.
    > > >
    > > > Still, strings are copied several times.
    > >
    > > Sure, but if the strings are short, and the implementation uses the
    small
    > > string optimization, there doesn't have to be any dynamic memory
    allocations
    > > at all.
    >
    >
    > Yes, but the same is always true for
    > 'operator+=()'. So this argument is
    > moot.

    But I'm not the one that said that the following ugly-looking code is always
    better.

    >
    > > > > > s = a;
    > > > > > s += b;
    > > > > > s += c;
    > > > > > s += d;
    > > > > >
    >
    > > > Am I missing something?
    > >
    > > That the OP made a statement that some obscure code is always more
    efficient
    > > than the simple and obvious version.
    >
    > I don't think the code is obscure, but YMMV.

    Obviously. :-)

    > I
    > don't think it will always be more efficient
    > either, but I do think it has a good chance
    > to be in many cases. OTOH, I still fail to
    > see how it could ever be less efficient.
    > I do use such code in libraries that might
    > be used where performance matters. I would
    > be very interested in an example where this
    > might do harm.

    So you say that it might be faster, sometimes. The OP said it was more
    efficient, always.

    >
    > > It's not, it all depends on the
    dynamic
    > > sizes of the strings involved.
    >
    > With the small string optimization (SSO),
    > allocations might be prevented depending on
    > the size of the strings. OTOH, copying cannot
    > be prevented by SSO.
    > FWIW, a somehow canonical implementation of
    > 'operator+()' looks like this:
    >
    > T operator+(const T& l, const T& r)
    > {
    > T tmp(l);
    > tmp += r; // invoking 'operator+=()' here
    > return tmp;
    > }
    >
    > This makes it obvious that, regarding
    > performance, 'operator+()' can never be
    > better than 'operator+=()', but it will
    > often be worse.
    >

    If you are unlucky, this will allocate a buffer and copy l once, then
    reallocate the buffer and copy l again, before adding r.

    Another implementation looks like this:

    T operator+(const T& l, const T& r)
    {
       T tmp;
        tmp.reserve(l.size() + r.size());

        return tmp.append(l).append(r);
    }

    A maximum of one memory allocation. Copies each character exactly once.

    > > Try this for example:
    > >
    > > #include <iostream>
    > > #include <string>
    > >
    > > int main()
    > > {
    > > const std::string a = "abcd";
    > > const std::string b = "bcda";
    > > const std::string c = "cdab";
    > > const std::string d = "dabc";
    > >
    > > std::string s = a + b + c + d;
    > >
    > > std::cout << s;
    > >
    > > return 0;
    > > }
    > >
    > > On my system there are no dynamic allocations for the strings. It's all
    done
    > > on the local stack.
    >
    >
    > Still, this creates three temporaries,
    > where Mark's code creates none. And his
    > code will not trigger any allocations
    > either.
    > What is your point?

    My point is:

    Best case, three temporaries on stack, but no allocations.

    Mark's best case, no temporaries, but s is updated 3 more times.

    Worst case, three temporaries allocated, s is reallocated on assignment.
    Four in total.

    Mark's worst case, s is reallocated on assignment, and for each append. Four
    in total.

    s = a + b + c + d;

    Looks much better. :-)

     Bo Persson


  • Next message: Hendrik Schober: "Re: + or += (was: why not use i++ instead of ++i in for()?)"

    Relevant Pages

    • Re: Reserve_Capacity for Unbounded_String?
      ... things, be it strings, I/O, or anything else. ... the only way to save allocations in an increasing length string is ... to overallocate the memory; ... the first rule of performance improvement is to find a better ...
      (comp.lang.ada)
    • Re: why not use i++ instead of ++i in for()?
      ... > more allocations than in the above ... A really smart implementation could defer the actual concatenation until the ... returning a proxy object that gets the final length before it starts working. ...
      (microsoft.public.vc.stl)
    • Re: FastMM newbie questions
      ... So now I have a clean project, ... Can I somehow get source code references to those? ... strings are ref counted and will go away when the references to ... you're seeing all memory allocations; ...
      (borland.public.delphi.language.basm)
    • + or += (was: why not use i++ instead of ++i in for()?)
      ... strings are copied several times. ... there doesn't have to be any dynamic memory allocations ... OTOH I see that it would ... >> have a good chance to do less copying. ...
      (microsoft.public.vc.stl)
    • Re: What I can to do with old PL/1 code?
      ... IBM1221W Statement uses count bytes for temporaries. ... This message is produced if a statement uses more bytes for temporaries than allowed by the STORAGE compiler option. ... Programs containing such statements would run in as little as 50K as long as the actual lengths of the strings were modest. ...
      (comp.lang.pl1)