Re: Problem with using char* to return string by reference


Hendrik Schober wrote:
There is, however, one problem with all this:
'std::basic_string<>' was not designed for multi-byte
encodings. Therefore, when you put multi-byte encoded
strings into it (and if we're talking Unicode, except
for UTF-32 all encodings are multi-byte, since Unicode
specifies >2^16 characters), you're on your own.
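
To make the "you're on your own" point concrete, here is a minimal sketch, assuming the string holds well-formed UTF-8. 'std::string::size()' reports bytes (code units), not characters, so anything that counts or indexes characters needs extra code. The helper name 'utf8_code_points' is hypothetical and not part of any library:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (an illustration, not a standard facility):
// counts UTF-8 code points by skipping continuation bytes, which
// always have the bit pattern 10xxxxxx.
// Assumption: the input is well-formed UTF-8; no validation is done.
std::size_t utf8_code_points(const std::string& s)
{
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) {  // lead byte or plain ASCII byte
            ++n;
        }
    }
    return n;
}
```

For the two-byte sequence "\xC3\xA9" (U+00E9 in UTF-8), 'size()' returns 2 while the helper returns 1. Even a code point count is not a count of what the user sees as characters, since combining marks and ligatures can join several code points into one displayed unit.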

Indeed, the situation is very complicated:
http://d-type.com/layout/index.htm

Consider bi-directional caret and selection handling in Arabic and Hebrew, ligature reordering substitutions in Hindi/Telugu, etc.

Tom