Re: Why crash ?




"John Carson" <jcarson_n_o_sp_am_@xxxxxxxxxxxxxxx> wrote in message
news:egq7%23F1pGHA.4912@xxxxxxxxxxxxxxxxxxxxxxx
"Ulrich Eckhardt" <eckhardt@xxxxxxxxxxxxxx> wrote in message
news:tqgko3-qp1.ln1@xxxxxxxxxxxxxxxxxxxxxx
Eberhard Schefold wrote:
Arnaud Debaene schrieb:
That's a side effect of the "small string optimization":
std::basic_string has an internal 16-byte buffer that is used to
hold the string content when it is short enough; this avoids
dynamic memory allocation for small strings.
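
One way to observe this buffer, as a minimal sketch: check whether the pointer returned by data() falls inside the string object itself. The helper name stored_inline is made up for illustration, and what it reports for short strings depends on the standard library implementation in use.

```cpp
#include <functional>
#include <string>

// Hypothetical helper: returns true if the character data of `s` lives
// inside the string object itself (its internal buffer) rather than in
// separately allocated memory.
template <typename CharT>
bool stored_inline(const std::basic_string<CharT>& s) {
    const char* obj_begin = reinterpret_cast<const char*>(&s);
    const char* obj_end = obj_begin + sizeof(s);
    const char* data = reinterpret_cast<const char*>(s.data());
    std::less<const char*> before;  // gives a total order over pointers
    return !before(data, obj_begin) && before(data, obj_end);
}
```

A string far longer than the object itself, such as std::string(1000, 'x'), can never be stored inline. Whether a short string such as "abc" is stored inline is up to the implementation.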

Yes, but naively, I would assume that the optimal buffer size
depended on a certain number of characters, and therefore would
double with wstring. I might well be wrong, though.

The effort to copy said buffer (which contains PODs and can thus be
copied with memcpy) is the same, regardless of what it holds. The
effort to allocate memory dynamically is also the same, no matter
what it will finally contain. Both only depend on the size of the
buffer, so that's a natural approach.


If you are arguing that a buffer size based on bytes rather than
characters is natural, then I am not convinced.

My understanding is that there is a tradeoff between speed and
memory usage. Allocating a large default buffer increases speed if
it means that you don't need to make an additional allocation, but
costs extra memory if that buffer ends up not being fully utilized.

If the foregoing is correct, then it is not obvious that the loss of
speed in going from 15 to 7 characters in the wide character case is
justified by the greater memory saving in that case. That presumably
depends on how one values speed vs memory and on the size
distribution of your strings. To take an extreme case, suppose that
all your strings are 15 characters long. Then if you care at all
about speed, you are plainly not going to want a 7 character limit.
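
The 15 and 7 character limits mentioned here can be checked on a given implementation by asking a default-constructed string for its capacity. This is a small sketch only; default_capacity is a hypothetical helper, and the values it returns are implementation-specific, not guaranteed by the standard.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: capacity of an empty string of the given character
// type. On implementations with a small string buffer this is the number
// of characters that fit without any dynamic allocation.
template <typename CharT>
std::size_t default_capacity() {
    return std::basic_string<CharT>().capacity();
}
```

Comparing default_capacity<char>() with default_capacity<wchar_t>() shows how much the limit shrinks for wide strings on the platform at hand. The size of wchar_t itself also varies between platforms.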

The calculation isn't done exactly that way. The idea behind the
small string optimization is that the std::string object has a certain
size. If you allocate dynamic storage, the string object has to store
at least a pointer and the amount allocated.

If you don't allocate dynamically, this bookkeeping area can be reused
to store the characters directly in the string object. The fact that
wide characters are twice the size of narrow characters also reduces
the number of chars that will fit in this area. That a space
optimization works best for smaller objects is just a fact of life.
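
As a rough sketch of that idea (not the actual library source): the struct below is a made-up layout in which a 16-byte area holds either the characters themselves or a pointer to heap storage. The names SsoLayoutSketch, kBufferBytes and inline_capacity are assumptions for illustration, and char16_t stands in for a 2-byte wide character.

```cpp
#include <cstddef>

// Assumed size of the shared area, matching the 16 bytes mentioned in
// this thread.
constexpr std::size_t kBufferBytes = 16;

// Characters that fit in the shared area, leaving room for a terminator.
template <typename CharT>
constexpr std::size_t inline_capacity() {
    return kBufferBytes / sizeof(CharT) - 1;
}

// Hypothetical layout: the same bytes serve as an inline buffer for short
// strings or as a pointer to dynamically allocated storage for long ones.
template <typename CharT>
struct SsoLayoutSketch {
    union {
        CharT buffer[kBufferBytes / sizeof(CharT)];  // short strings
        CharT* heap;                                 // long strings
    } storage;
    std::size_t size;      // characters currently stored
    std::size_t capacity;  // characters that fit without reallocating
};

static_assert(inline_capacity<char>() == 15, "15 narrow characters fit");
static_assert(inline_capacity<char16_t>() == 7, "7 two-byte characters fit");
```

In this sketch the object has the same size for both character types; only the number of characters that fit in the shared area changes.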

If we were to make the std::wstring object larger, we would end up
with sizeof(wstring) == 36 or something equally cache-unfriendly.

What percentage of wide character strings fit in the 8-15 char range?
:-)



Bo Persson

