Re: WideCharToMultiByte returns default character when input language for non-Unicode programs set to English



"David Liebtag" <liebtag@xxxxxxxxxx> wrote in message
news:eHba2KXkGHA.2200@xxxxxxxxxxxxxxxxxxxx
I'm having a problem with WideCharToMultiByte. Maybe I just don't
understand how it is supposed to work.

I have support for Japanese installed on my machine. If I set the
language for non-Unicode programs to Japanese, and call
WideCharToMultiByte to translate a Unicode string to Multibyte and
use codepage CP_ACP or 932, it works fine.

However, if I set the language for non-Unicode programs to English
(United States), reboot, and do the same thing, the Japanese
characters are replaced by the default character (which is a question
mark in code page 932).

How are you determining this? From the screen display of the text string or
by examining the numerical value of the characters in the text string?

GetCPInfo still reports that 932 is installed and available.

Can anyone explain why WideCharToMultiByte is not returning the
Shift-JIS representation of the Japanese characters?

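What WideCharToMultiByte does depends only on the arguments you use when you
call it. It simply translates a buffer of WCHARs into a buffer of CHARs using
the code page you name in the call. If you pass 932 explicitly, the currently
active code page on your machine has nothing to do with the result. CP_ACP is
different: it means the system's ANSI code page, which follows the language
for non-Unicode programs. With English (United States) that is 1252, which
has no Japanese characters, so they come out as the default character.

Of course, to actually see those CHARs displayed properly on screen, you need
to have the code page for your system set appropriately.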

The following program snippet illustrates how it works.

We start with the Unicode character 0x547D, which corresponds to the
Shift-JIS multibyte character 0x96BD. The program converts the Unicode string
to a multibyte string, so the multibyte string should contain the two bytes
0x96 and 0xBD. This can be verified by converting the characters to integers
and outputting the integers.


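// do the conversion; 932 is passed explicitly, so the result does not
// depend on the system's ANSI code page
WCHAR wch[] = {(WCHAR)0x547D, 0};
CHAR ch[10];
// the input length is -1, so count includes the terminating null
int count = WideCharToMultiByte(932, 0, wch, -1, ch, sizeof ch, 0, 0);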

// show it worked by outputting characters as integers
// (%02X keeps the leading zero of any byte below 0x10)
int charNumber[] = {(unsigned char)ch[0], (unsigned char)ch[1]};
TCHAR buffer[50];
_stprintf(buffer, _T("multibyte character number is 0x%02X%02X"),
charNumber[0], charNumber[1]);
TextOut(hdc, 0, 0, buffer, (int)_tcslen(buffer));

// Now we display both the Unicode and multibyte strings. The
// Unicode string's display is independent of the code page, but the
// multibyte string will only display correctly if the appropriate
// code page is in effect
TextOutW(hdc, 0, 30, wch, (int)wcslen(wch));
TextOutA(hdc, 0, 60, ch, count-1);
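
The byte values can also be checked without drawing anything, which takes
the display code page out of the picture entirely. The following is only a
minimal sketch in portable C; format_hex_bytes is a hypothetical helper of
my own naming, not part of the Win32 API, and it assumes the caller supplies
an output buffer large enough for three characters per byte.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not a Win32 function): writes each byte of `bytes`
   as two uppercase hex digits, separated by single spaces, into `out`.
   Returns the number of characters written, or -1 if `out` is too small. */
int format_hex_bytes(const char *bytes, size_t len, char *out, size_t out_size)
{
    size_t pos = 0;
    size_t i;

    if (out_size == 0)
        return -1;
    out[0] = '\0';

    for (i = 0; i < len; ++i) {
        /* Go through unsigned char first: where char is signed, a byte
           such as 0x96 would otherwise be sign-extended before printing. */
        int n = snprintf(out + pos, out_size - pos,
                         i ? " %02X" : "%02X",
                         (unsigned)(unsigned char)bytes[i]);
        if (n < 0 || (size_t)n >= out_size - pos)
            return -1;              /* output would have been truncated */
        pos += (size_t)n;
    }
    return (int)pos;
}
```

Called on the buffer from the snippet above as
format_hex_bytes(ch, count - 1, out, sizeof out), this should give "96 BD"
if the conversion produced the Shift-JIS bytes, and "3F" if the default
character (the question mark) was substituted.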

If there is a better news group for this question, please tell me
which one to use.

microsoft.public.win32.programmer.international

--
John Carson

