RE: converting application to UNICODE

Hi Srishti,

>I am totally new to .NET so I'm not sure of this, but I read somewhere
>that if I converted my application to .NET (managed C++) and specified
>Encoding.UTF8 it should work?
>
>Would this be a better approach (make it a .NET application), or should I
>convert my existing VC++ 6.0 code to Unicode (wcscat, wcscpy, etc.)?

VC++ .NET cannot convert your MFC program into a managed C++ program for
you, so the practical approach is to migrate your current code to
Unicode-compatible code. To do this, take the following steps (condensed
from <<Developing International Software>>, MS Press):

1. Modify your code to use generic data types.
For example, change char and char* to TCHAR and TCHAR*, which are defined
in the Win32 file WINDOWS.H, or to _TCHAR as defined in the Visual C++ file
TCHAR.H. Replace instances of LPSTR and LPCH with LPTSTR and LPTCH.

2. Modify your code to use generic function prototypes.
For example, use the C run-time call _tcslen instead of strlen, and use the
Win32 API SetWindowText instead of SetWindowTextA.

3. Surround any character or string literal with the TEXT macro.
The TEXT macro conditionally places an "L" in front of a character
literal or a string literal definition.

4. Create generic versions of your data structures.
Type definitions for string or character fields in structures should
resolve correctly based on the UNICODE compile-time flag.

5. Change your build process. When you want to build a Unicode version of
your application, both the Win32 compile-time flag UNICODE and the C
run-time compile-time flag _UNICODE must be defined.

6. Adjust pointer arithmetic.
Subtracting char* values yields an answer in terms of bytes; subtracting
wchar_t* values yields an answer in terms of 16-bit units. When
determining the number of bytes (for example, when allocating memory for a
string), multiply the length of the string in characters by sizeof(TCHAR).
When determining the number of characters from the number of bytes, divide
by sizeof(TCHAR).

7. Check for any code that assumes a character is always 1 byte long.
Code that assumes a character's value is always less than 256 (for
example, code that uses a character value as an index into a table of size
256) must be changed. Make sure your definition of NULL is 16 bits long.
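
As a rough illustration of steps 1, 2, 3 and 6 above, here is a minimal
sketch in C. The helper names (tchar_bytes, dup_tstring, offset_in_bytes)
are made up for this example and do not come from the book. It uses _T from
TCHAR.H, which plays the same role as the TEXT macro. The #else branch is
only an assumed stand-in so the code also compiles without the Windows
headers; on Visual C++ the real TCHAR.H does that switching for you.

```c
/* Define both UNICODE and _UNICODE for the wide build; neither for ANSI. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

#ifdef _WIN32
#include <windows.h>
#include <tchar.h>
#else
/* Assumed stand-in for TCHAR.H so the sketch compiles off Windows. */
#if defined(UNICODE) || defined(_UNICODE)
typedef wchar_t TCHAR;
#define _T(x) L##x
#define _tcslen wcslen
#else
typedef char TCHAR;
#define _T(x) x
#define _tcslen strlen
#endif
#endif

/* Step 6: bytes needed for cch characters plus the terminating null. */
size_t tchar_bytes(size_t cch)
{
    return (cch + 1) * sizeof(TCHAR);
}

/* Steps 1 and 2: generic type (TCHAR) and generic run-time call (_tcslen).
   Returns a heap copy the caller must free, or NULL if malloc fails. */
TCHAR *dup_tstring(const TCHAR *src)
{
    size_t bytes;
    TCHAR *copy;

    assert(src != NULL);
    bytes = tchar_bytes(_tcslen(src));   /* length is in characters */
    copy = (TCHAR *)malloc(bytes);       /* malloc wants bytes */
    if (copy != NULL)
        memcpy(copy, src, bytes);        /* copies the terminator too */
    return copy;
}

/* Step 6: pointer subtraction counts TCHARs; scale it to get bytes. */
size_t offset_in_bytes(const TCHAR *start, const TCHAR *pos)
{
    return (size_t)(pos - start) * sizeof(TCHAR);
}
```

A call such as dup_tstring(_T("abc")) then compiles unchanged in both the
ANSI and the Unicode build (step 3), and only sizeof(TCHAR) differs between
the two.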

For more detailed directions on how to migrate to Unicode applications in
VC++, I suggest you refer to Chapter 3, "Unicode", of the book
<<Developing International Software>>, 2nd edition.


Hope this helps!

Best regards,

Gary Chang
Microsoft Community Support
--------------------
Get Secure! - www.microsoft.com/security
Register to Access MSDN Managed Newsgroups!
http://support.microsoft.com/default.aspx?scid=/servicedesks/msdn/nospam.asp&SD=msdn

This posting is provided "AS IS" with no warranties, and confers no rights.