Re: Is there a Unicode equivalent to ASCIIZ Strings?


On Thu, 16 Nov 2006 11:59:57 -0500, Joseph M. Newcomer
<newcomer@xxxxxxxxxxxx> wrote:

> But in UTF8 you *must* use CharNext and CharPrev to advance one character at a time.

Yes, I am (and was) aware of this.
In fact, in my opinion, one of the "bad things" about UTF-8 is that it
gives up "one character, one position" (the "good thing" is that it
saves space when storing characters).
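
To make the "one character, one position" point concrete, here is a minimal sketch in C that counts characters (code points) in a NUL-terminated UTF-8 string. The function name utf8_count_code_points is hypothetical, and the sketch assumes the input is already well-formed UTF-8; it does no validation.

```c
#include <stddef.h>

/* Hypothetical helper: count code points in a NUL-terminated UTF-8 string.
   Assumes well-formed input; malformed sequences are not detected. */
size_t utf8_count_code_points(const char *s)
{
    size_t count = 0;
    for (; *s != '\0'; ++s) {
        /* Continuation bytes have the form 10xxxxxx.
           Every other byte starts a new character. */
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

The byte length and the character count differ as soon as the string leaves ASCII, so an index into the byte array is not a character position.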

On the other hand, I thought that in UTF-16 I could assume "one
character, one position", but this thread showed me I was wrong. (But
in that case, why use UTF-16 at all? Use UTF-8 instead, as I wrote
previously.)
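
The UTF-16 case can be sketched the same way. Characters outside the Basic Multilingual Plane are stored as a surrogate pair: a high surrogate (0xD800 to 0xDBFF) followed by a low surrogate (0xDC00 to 0xDFFF). The function name utf16_count_code_points is hypothetical, and uint16_t stands in for a 16-bit wchar_t; treating an unpaired surrogate as one character is a choice made for this sketch only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count code points in a NUL-terminated UTF-16 string.
   An unpaired surrogate is counted as one character (a sketch-only choice). */
size_t utf16_count_code_points(const uint16_t *s)
{
    size_t count = 0;
    while (*s != 0) {
        /* s[1] is safe to read: *s is not the terminator, so s[1]
           is at worst the terminating 0. */
        if (*s >= 0xD800 && *s <= 0xDBFF && s[1] >= 0xDC00 && s[1] <= 0xDFFF)
            s += 2;   /* surrogate pair: two code units, one character */
        else
            s += 1;   /* BMP character: one code unit */
        ++count;
    }
    return count;
}
```

For example, U+1D11E (musical G clef) is stored as the two code units 0xD834 0xDD1E, so a string holding it followed by "A" has three code units but only two characters.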

Mr Asm


