Re: Byte size of characters when encoding


From: Vladimir (xozar_at_tut.by)
Date: 07/09/04


Date: Sat, 10 Jul 2004 01:01:17 +0300


> > Method UnicodeEncoding.GetMaxByteCount(charCount) returns charCount * 2.
> > Method UTF8Encoding.GetMaxByteCount(charCount) returns charCount * 4.
> >
> > But why that?
>
> Strings in .NET are already Unicode (UTF-16) encoded. So if you encode the
> string to an array of bytes, you get 2 bytes per character.
>
> However, for UTF8 encoding a single Unicode character can be encoded
> using up to 4 bytes in the worst case. charCount*4 is just a worst case
> scenario if the string happened to contain only characters that required
> 4 byte encoding.

Are you saying that two instances of the Char struct can occupy 8 bytes
when encoded as UTF-8?
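
One way to check this is to count, for a few sample characters, how many UTF-16 code units (what .NET calls a Char) and how many UTF-8 bytes each one needs. The sketch below uses Python rather than .NET purely as an illustration; the helper names utf16_units and utf8_bytes are made up for this example, and it assumes well-formed text with no unpaired surrogates.

```python
# Illustration only: Python stands in for .NET here.
# Assumption: a .NET Char corresponds to one UTF-16 code unit, so we count
# code units by encoding to UTF-16 and dividing the byte length by 2.

def utf16_units(text: str) -> int:
    """Number of UTF-16 code units (the equivalent of .NET Chars)."""
    return len(text.encode("utf-16-le")) // 2

def utf8_bytes(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))

# Hypothetical sample characters, one from each UTF-8 length class.
samples = {
    "ASCII U+0041": "A",
    "Cyrillic U+042F": "\u042f",
    "CJK U+9999": "\u9999",
    "Gothic U+10330 (outside the BMP)": "\U00010330",
}

for label, text in samples.items():
    print(f"{label}: {utf16_units(text)} Char(s), {utf8_bytes(text)} UTF-8 byte(s)")
```

In this sketch the only character that needs 4 UTF-8 bytes is the one outside the Basic Multilingual Plane, and it already takes two Chars (a surrogate pair) in UTF-16. A single Char from the BMP needs at most 3 bytes, so two Chars of well-formed text would need at most 6 bytes, not 8. If that holds for .NET as well, charCount * 4 is simply a conservative upper bound rather than a size that real text reaches.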


