Re: MBCS vs UNICODE



"--== Alain ==--" <nospam@xxxxxxxxxxx> wrote in message
news:uCEHHMk5GHA.4064@xxxxxxxxxxxxxxxxxxxxxxx
> Hi,
>
> If I understood correctly, MBCS is also used for East Asian languages such as
> Chinese, Japanese and so on, as Unicode is.

A major difference between Unicode and MBCS is that there is only one
Unicode (as far as the Windows API is concerned), but there is a very large
number of multi-byte character sets. Usually a single MBCS is only useful for
a single language. Of course, you can use any MBCS available in Windows to
represent (American) English, but you cannot use a single MBCS to represent
both Japanese and Korean, or even Greek and Russian. With Unicode, however,
you can represent many languages with a single character set, and you don't
have to worry about code pages as you do with MBCS. And if you do it right,
and if VC 2020 finally implements full Unicode (UCS-4) support, your
Unicode app might even be able to display Klingon text.
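
To make the code page point concrete, here is a minimal sketch of why the same bytes mean different things under different code pages. The enum, the function names and the three-entry table are made up for illustration; real code pages define many more mappings, and on Windows you would let the system do the lookup instead of hand-writing it.

```cpp
#include <cassert>
#include <string>

// Hypothetical, heavily truncated model of two Windows code pages.
// Only the bytes 0xE1..0xE3 are covered; real tables are much larger.
enum class CodePage { Cp1251_Cyrillic, Cp1253_Greek };

// Decode one byte to a Unicode code point under the given code page.
// Bytes outside the tiny table become U+FFFD (REPLACEMENT CHARACTER).
char32_t decode_byte(CodePage cp, unsigned char b) {
    if (b < 0x80) return b;                    // ASCII is shared by both code pages
    if (b < 0xE1 || b > 0xE3) return 0xFFFD;   // not in this illustrative table
    unsigned offset = b - 0xE1u;
    switch (cp) {
        case CodePage::Cp1251_Cyrillic:        // 0xE1..0xE3 are Cyrillic be, ve, ghe
            return static_cast<char32_t>(0x0431 + offset);
        case CodePage::Cp1253_Greek:           // 0xE1..0xE3 are Greek alpha, beta, gamma
            return static_cast<char32_t>(0x03B1 + offset);
    }
    return 0xFFFD;
}

// Decode a whole byte string under one code page.
std::u32string decode(CodePage cp, const std::string& bytes) {
    std::u32string out;
    for (unsigned char b : bytes) out.push_back(decode_byte(cp, b));
    return out;
}

// Usage: identical bytes, two different texts.
void example() {
    const std::string bytes = "\xE1\xE2\xE3";
    assert(decode(CodePage::Cp1251_Cyrillic, bytes) == U"\u0431\u0432\u0433");
    assert(decode(CodePage::Cp1253_Greek, bytes) == U"\u03B1\u03B2\u03B3");
}
```

Once decoded, both results can live in the same Unicode string, which is exactly what a single code page cannot offer for Greek and Russian together.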

> So which one is the best format?
> I see a lot of applications developed with Unicode.
>
> What are the pros and cons of Unicode vs. MBCS (apart from the 2-byte
> character encoding)?

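Unicode is much easier to handle than MBCS. Practically all characters in
everyday use have the same size (2 bytes in the UTF-16 encoding that Windows
uses; only rare supplementary characters need two such units), and you can
use such characters and strings as easily as single-byte character sets like
ASCII or ANSI.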
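
As a rough sketch of the difference in handling, compare counting characters in a Shift-JIS (code page 932) string with counting them in a UTF-16 string. The function names are my own; the lead-byte ranges are those of Shift-JIS, and the UTF-16 helper assumes text without surrogate pairs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Lead-byte ranges of Shift-JIS (Windows code page 932).
bool is_sjis_lead_byte(unsigned char b) {
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Count characters in a Shift-JIS string. A lead byte and the trail byte
// that follows it form one character, so the trail byte must be skipped.
std::size_t sjis_char_count(const std::string& s) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (is_sjis_lead_byte(static_cast<unsigned char>(s[i])) && i + 1 < s.size())
            ++i;  // skip the trail byte
        ++count;
    }
    return count;
}

// With UTF-16, every character of the Basic Multilingual Plane is one
// code unit, so for such text the element count is the character count.
// Assumption: the string contains no surrogate pairs.
std::size_t utf16_bmp_char_count(const std::u16string& s) {
    return s.size();
}

// Usage: "A" followed by two kanji (U+65E5, U+672C).
void example() {
    const std::string sjis = "A\x93\xFA\x96\x7B";   // 5 bytes in Shift-JIS
    assert(sjis.size() == 5);
    assert(sjis_char_count(sjis) == 3);
    assert(utf16_bmp_char_count(u"A\u65E5\u672C") == 3);
}
```

Note that the last Shift-JIS byte, 0x7B, is the ASCII code of '{'. Any code that looks at single bytes will misread it.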

> Until now I have written applications only in ANSI. I want my application
> to work with East Asian languages (e.g. Asian characters), so what should I
> consider the best solution?

Unicode is probably easier to use if you want to target multiple languages
simultaneously or if you don't know the language in advance. A well-written
MBCS application might work with whatever code page the user of your app has
selected, but you have to be very careful, and many string functions don't
work well with true multi-byte character sets.
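
A typical trap, sketched under the assumption that the text is Shift-JIS (code page 932): some kanji have 0x5C, the ASCII backslash, as their trail byte, so a byte-wise search finds a path separator that is not there. The function names below are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Lead-byte ranges of Shift-JIS (Windows code page 932).
bool sjis_lead(unsigned char b) {
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Naive search: treats every byte as a character of its own.
std::size_t find_naive(const std::string& s, char target) {
    return s.find(target);
}

// Lead-byte aware search: steps over double-byte characters as a unit,
// so a trail byte is never compared against the target.
std::size_t find_sjis(const std::string& s, char target) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (sjis_lead(static_cast<unsigned char>(s[i])) && i + 1 < s.size()) {
            ++i;        // skip the trail byte, whatever its value
            continue;
        }
        if (s[i] == target) return i;
    }
    return std::string::npos;
}

// Usage: the kanji encoded as 0x95 0x5C, followed by a real backslash.
void example() {
    const std::string path = std::string("\x95\x5C") + "\\dir";
    assert(find_naive(path, '\\') == 1);   // false hit inside the kanji
    assert(find_sjis(path, '\\') == 2);    // the real separator
}
```

In a real MBCS build you would not write this loop yourself; the Microsoft C runtime provides `_mbschr` and the Win32 API provides `IsDBCSLeadByteEx` for this purpose. With wide strings the problem does not arise, because a backslash is always a code unit of its own.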

Also, API calls expecting or returning strings are slightly faster with
Unicode than they are with MBCS or even plain ANSI. Internally Windows
NT/2000/XP only uses Unicode and all MBCS strings must be translated into
Unicode before Windows can actually use them.
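
The extra cost can be pictured with a small portable model. Everything in it is hypothetical: `set_title_w` and `set_title_a` only stand in for a pair of "W" and "A" entry points, and the conversion assumes ISO-8859-1, where each byte equals its code point. Real code pages need a table lookup, which Windows performs with `MultiByteToWideChar`.

```cpp
#include <cassert>
#include <string>

// Hypothetical storage that plays the role of the system's internal state.
std::u16string g_last_title;

// Stand-in for a native "W" entry point: it takes UTF-16 and uses it directly.
void set_title_w(const std::u16string& text) {
    g_last_title = text;
}

// Stand-in for the byte-oriented "A" entry point. It cannot do the work
// itself; it converts to UTF-16 and forwards to the "W" version.
// Assumption: the active code page is ISO-8859-1 (byte value == code point).
void set_title_a(const std::string& text) {
    std::u16string wide;
    wide.reserve(text.size());
    for (unsigned char b : text)
        wide.push_back(static_cast<char16_t>(b));  // per-call conversion cost
    set_title_w(wide);
}

// Usage: both calls end up with the same UTF-16 text.
void example() {
    set_title_a("caf\xE9");
    assert(g_last_title == u"caf\u00E9");
    set_title_w(u"caf\u00E9");
    assert(g_last_title == u"caf\u00E9");
}
```

The byte-string path pays for an allocation and a conversion on every call, and the same again in the other direction for calls that return strings.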

HTH
Heinz

