Re: UNICODE to MBCS
- From: "CLoser" <Matthew.A.Cox@xxxxxxxxx>
- Date: 1 Mar 2006 05:48:43 -0800
I believe I have figured part of this out from various sources on the
net. What appears to be happening is that when I read the
Unicode-encoded (UTF-16) ini file, the Windows API
GetPrivateProfileStringA internally converts the text from Unicode to
the ANSI code page. I believe this because of the following tests.
In each of the tests below I read the Unicode-encoded file into an
MBCS-compiled DLL using the Win32 API GetPrivateProfileStringA. I then
displayed the result in a document (Crystal Report) using the Arial
Unicode font.
OS = English Regional Settings
INI = UNICODE (UTF-16) containing code points for English and Chinese
characters
When the regional settings were set to English, and thus using the
1252 code page, the English characters came through after a
conversion in the Win32 API from Unicode >> English ANSI. But the
Chinese characters were all unresolved and appeared as '?'. This
makes sense because the 1252 code page does not have equivalent code
points for the Chinese characters.
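
The effect in this first test can be reproduced outside the Win32 API. The sketch below is only an illustration: it uses Python's codecs as a stand-in for the conversion that GetPrivateProfileStringA appears to perform, and the sample string is made up. The assumption is that Python's cp1252 table with errors="replace" behaves like the Windows conversion for characters that have no mapping at all.

```python
# Stand-in for the Unicode >> English ANSI step (Python codecs, not Win32).
text = "Hello 中文"  # hypothetical ini value with English and Chinese

# Roughly what a UTF-16 ini file holds on disk (little-endian, BOM omitted).
utf16_bytes = text.encode("utf-16-le")

# Decode the UTF-16 data, then narrow it to code page 1252.
# Characters with no cp1252 equivalent are replaced by '?'.
ansi_1252 = utf16_bytes.decode("utf-16-le").encode("cp1252", errors="replace")

print(ansi_1252)  # b'Hello ??'
```

Each Chinese character collapses to a single '?', and the original code point cannot be recovered from the ANSI string afterwards.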
OS = Chinese
INI = UNICODE (UTF-16) containing code points for English and Chinese
characters
Under these conditions the same conversion was happening, except that a
different code page was being used because my default locale at that
time was set to Chinese. Here the Unicode >> Chinese ANSI conversion
displayed everything correctly because the Chinese code points had
equivalents in the ANSI code page (code page 936?). And of course the
English displayed because most (all?) code pages have 32-127
devoted to ASCII.
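
A small sketch of why the second test works, again using Python's codecs as a stand-in for the Win32 conversion. It assumes the Chinese ANSI code page in play is 936 (Python's "cp936" is an alias for its GBK codec, which may differ from the Windows table in a few edge cases); the sample string is made up.

```python
# Stand-in for the Unicode >> Chinese ANSI step (Python codecs, not Win32).
text = "Hello 中文"  # hypothetical ini value with English and Chinese

# Strict encoding: this would raise if any character had no cp936 mapping.
ansi_936 = text.encode("cp936")

# The ASCII part is one byte per character and identical in both code pages.
ascii_same = "Hello ".encode("cp936") == "Hello ".encode("cp1252")

# Nothing is lost: decoding with the same code page restores the text.
round_trip = ansi_936.decode("cp936")

print(len(ansi_936), ascii_same, round_trip == text)
```

The Chinese characters take two bytes each, which is the multi-byte part of MBCS: 6 ASCII bytes plus 2 x 2 bytes for the Chinese characters.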
I then took the above tests a step further and noticed that I could
place code points from other languages into the ini file and have them
appear when using the Chinese regional settings. I was seeing Russian
and Japanese in the result, which added to my confusion. I then did some
more looking around and found that there are some mega code pages that
have been developed recently for the Chinese language, like GB
18030-2000, which is based on GB 2312-1980. These code pages include not
only Chinese but a whole array of characters from different languages.
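
The Russian and Japanese observation can be checked the same way. This is a sketch using Python's "cp936" (GBK) codec as a stand-in for the Chinese ANSI code page; whether it matches the Windows table exactly is an assumption, and the sample words are made up.

```python
# Cyrillic and Japanese kana samples (hypothetical ini values).
samples = {
    "Russian": "Привет",
    "Japanese kana": "こんにちは",
}

# Strict encoding raises UnicodeEncodeError if a character is not in the
# code page, so reaching the end means every character had a mapping.
encoded_lengths = {}
for label, s in samples.items():
    encoded = s.encode("cp936")
    assert encoded.decode("cp936") == s  # lossless round trip
    encoded_lengths[label] = len(encoded)

print(encoded_lengths)
```

Both samples encode as two bytes per character, so the Chinese code page really does carry Cyrillic and kana alongside the Chinese characters.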
Given that, I can't help but wonder how far that support goes. Would it
be safe for me to set a system's regional settings to Chinese when I
really want to show Japanese text? Will these mega code pages cover me?
My gut feeling is no. There are most likely certain language constructs
that are only available in that language's own code page.
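
One way to test that gut feeling against real text is to list the characters a given code page cannot represent. The helper below is a hypothetical sketch (the name unmappable and the sample strings are mine), and it relies on Python's codec tables, which are assumed to be close to but not guaranteed identical to the Windows ones. The Japanese sample deliberately includes half-width katakana as a candidate worth checking.

```python
def unmappable(text, codec):
    """Return the characters of text that codec cannot encode (no duplicates)."""
    missing = []
    for ch in text:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


japanese = "日本語のテキスト ｶﾀｶﾅ"  # hypothetical Japanese sample
korean = "한국어"                   # hypothetical Korean sample

print("Japanese under cp936:", unmappable(japanese, "cp936"))
print("Japanese under cp932:", unmappable(japanese, "cp932"))
print("Korean under cp936:  ", unmappable(korean, "cp936"))
```

An empty list means the code page covers the text; anything listed would show up as '?' after the ANSI conversion. Running the real ini contents through a check like this is more reliable than guessing from the code page's reputation.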
I hope this helps someone out there.