Re: Multi language application



Pierre,

I second Giovanni on this issue. Just to add my 2 cents: the <code> block below implies that you created your app as a "Unicode app".

Using UTF-8 for data storage is just fine. It doesn't prevent you from using UTF-16 internally. Whether you choose UTF-8 or UTF-16 is really up to you, but in a "Unicode app" (necessary if you want a text box to support multiple scripts and code pages), you're likely to write less code with UTF-16 strings.
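To make the relationship between UTF-8 storage and UTF-16 in memory concrete, here is a minimal portable sketch of what such a conversion does. It is only an illustration: the name Utf8ToUtf16Portable is hypothetical, it uses std::string and std::u16string rather than CStringA/CStringW, and in an MFC app you would more likely call ::MultiByteToWideChar as Giovanni shows below.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from this thread's code): decodes UTF-8 bytes
// into UTF-16 code units. Throws std::invalid_argument on malformed input.
std::u16string Utf8ToUtf16Portable(const std::string& utf8)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < utf8.size())
    {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        char32_t cp = 0;        // code point being assembled
        std::size_t extra = 0;  // number of continuation bytes expected
        char32_t minCp = 0;     // smallest code point allowed for this length

        if (lead < 0x80)                { cp = lead;        extra = 0; minCp = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; minCp = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; minCp = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; minCp = 0x10000; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (extra > utf8.size() - i - 1)
            throw std::invalid_argument("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k)
        {
            const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        // Reject overlong forms, surrogates and out-of-range values.
        if (cp < minCp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid UTF-8 code point");

        if (cp < 0x10000)
        {
            // Basic Multilingual Plane: one UTF-16 code unit.
            out.push_back(static_cast<char16_t>(cp));
        }
        else
        {
            // Supplementary planes: two code units (a surrogate pair).
            cp -= 0x10000;
            assert(cp <= 0xFFFFF);
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

For instance, the two UTF-8 bytes C3 A9 (é) decode to the single UTF-16 unit 0x00E9, while the four bytes F0 9F 98 80 become the surrogate pair 0xD83D 0xDE00.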

Warning: ad ahead:

Regarding the translation of your app, you may want to check out appTranslator. It makes these tasks easier: translating your app, managing relations with translators, managing translations for new versions of your app, and so on.

Regards,

Serge.
http://www.apptranslator.com - Localization tool for your C++/MFC applications



"Giovanni Dicanio" <giovanni.dicanio@xxxxxxxxxxx> wrote in message news:egHaM18oIHA.3652@xxxxxxxxxxxxxxxxxxxxxxx
Hi,

I think that if you want to correctly display Unicode characters in Windows, you must use UTF-16 (which is the Unicode encoding used internally by Windows).

However, if for some reason you do want to use UTF-8 inside your application, I think that you can do that (e.g. storing your strings in CStringA or std::string), and then convert your UTF-8 strings to UTF-16 just before passing them to Windows controls (like an edit control) for display.

You can use the ::MultiByteToWideChar Win32 API to convert from UTF-8 to UTF-16, and pass the UTF-16 string to Windows controls.

e.g.

<code>

// Your UTF-8 string
CStringA utf8;
// ... process your utf8 string ...

// Convert UTF-8 string to Unicode UTF-16
// to be used by Windows
CStringW utf16 = ConvertUtf8ToUtf16( utf8 );

// Pass the UTF-16 string to Windows, e.g. to an edit control.
// (SetWindowText accepts a wide string only in a Unicode build.)
pSomeEditCtrl->SetWindowText( utf16 );

</code>

Here is a function I developed to convert strings from UTF-8 to UTF-16; feel free to use it in your code if you need it:

<code>

// Headers needed by this function (in an MFC project they are usually
// already pulled in through the precompiled header).
#include <atlstr.h>   // CStringA, CStringW, ATLASSERT, AtlThrowLastWin32
#include <vector>     // std::vector

// ======================================================================
//
// FUNCTION: ConvertUtf8ToUtf16
// AUTHOR: Giovanni Dicanio
//
// Converts from Unicode UTF-8 string to UTF-16.
//
// On error: ASSERTs in debug builds; in release builds throws using
// AtlThrow or AtlThrowLastWin32 (see the documentation of these functions
// for more details).
//
// This function should work since VS2003 (VC++7.1), but not on VC6,
// because of lack of newer ATL-MFC shared classes and functions like
// CStringA/W in VC6.
//
// ======================================================================
CStringW ConvertUtf8ToUtf16( const CStringA & utf8 )
{
    //
    // Special case of empty string
    //
    if ( utf8.IsEmpty() )
    {
        return L"";
    }


    //
    // Consider byte count corresponding to total string length,
    // including end-of-string (\0) character
    //
    const int utf8ByteCount = utf8.GetLength() + 1;


    //
    // Get size of destination UTF-16 buffer, in wchar_t's
    //
    int utf16Size = ::MultiByteToWideChar(
        CP_UTF8,                          // convert from UTF-8
        MB_ERR_INVALID_CHARS,             // error on invalid chars
        static_cast<const char *>(utf8),  // source UTF-8 string
        utf8ByteCount,                    // total length of source UTF-8 string,
                                          // in bytes, including end-of-string \0
        NULL,                             // unused - no conversion done in this step
        0                                 // request size of destination buffer, in wchar_t's
        );
    ATLASSERT( utf16Size != 0 );
    if ( utf16Size == 0 )
    {
        AtlThrowLastWin32();
    }


    //
    // Allocate destination buffer to store UTF-16 string
    //
    std::vector< wchar_t > utf16Buffer( utf16Size );


    //
    // Do the conversion from UTF-8 to UTF-16
    //
    int result = ::MultiByteToWideChar(
        CP_UTF8,                          // convert from UTF-8
        MB_ERR_INVALID_CHARS,             // error on invalid chars
        static_cast<const char *>(utf8),  // source UTF-8 string
        utf8ByteCount,                    // total length of source UTF-8 string,
                                          // in bytes, including end-of-string \0
        &utf16Buffer[0],                  // destination buffer
        utf16Size                         // size of destination buffer, in wchar_t's
        );
    ATLASSERT( result != 0 );
    if ( result == 0 )
    {
        AtlThrowLastWin32();
    }


    //
    // Build UTF-16 string from conversion buffer
    // (the buffer is \0-terminated because the terminator was converted too)
    //
    return CStringW( &utf16Buffer[0] );
}

</code>
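
Going the other way (for example, reading text back from an edit control and saving it as UTF-8) needs the reverse conversion. On Windows that would typically be done with ::WideCharToMultiByte and CP_UTF8. The sketch below is not part of the function above; it is a portable illustration of the encoding step itself, using the hypothetical name Utf16ToUtf8Portable and std::u16string/std::string instead of CStringW/CStringA.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (an assumption, not from the thread): encodes UTF-16
// code units as UTF-8 bytes. Throws std::invalid_argument on unpaired
// surrogates.
std::string Utf16ToUtf8Portable(const std::u16string& utf16)
{
    std::string out;
    for (std::size_t i = 0; i < utf16.size(); ++i)
    {
        char32_t cp = utf16[i];

        if (cp >= 0xD800 && cp <= 0xDBFF)
        {
            // High surrogate: must be followed by a low surrogate.
            if (i + 1 >= utf16.size())
                throw std::invalid_argument("unpaired high surrogate");
            const char32_t low = utf16[i + 1];
            if (low < 0xDC00 || low > 0xDFFF)
                throw std::invalid_argument("unpaired high surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
            ++i; // the low surrogate has been consumed
        }
        else if (cp >= 0xDC00 && cp <= 0xDFFF)
        {
            throw std::invalid_argument("unpaired low surrogate");
        }

        assert(cp <= 0x10FFFF);

        if (cp < 0x80)
        {
            // 1 byte: plain ASCII
            out.push_back(static_cast<char>(cp));
        }
        else if (cp < 0x800)
        {
            // 2 bytes
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
        else if (cp < 0x10000)
        {
            // 3 bytes
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
        else
        {
            // 4 bytes
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

Plain ASCII text comes out byte-for-byte unchanged, which is one reason UTF-8 is convenient for storage; other characters take two to four bytes each.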


HTH,
Giovanni




"Nord Pierre" <non> wrote in message news:480cb5d3$0$8138$426a34cc@xxxxxxxxxxxxxxx

Hello,

I would like to know if there is any way to create an app (without 16-bit Unicode) that can be translated later into different languages (including languages that need non-US-ASCII characters). I'm quite allergic to 16-bit Unicode, but I'm ready to work a lot with UTF-8. I also have, in one app, a window that can display many messages in English, French, German, Japanese, Chinese, Korean... and I would like to know if there is a way to display each message (one after the other) correctly (all the source text is in UTF-8). It's a classic CEdit control.

Thanks

PS: sorry if I'm dreaming of an easy solution :)



