Re: Dangerous behavior of CString



Norman Diamond wrote:
"David Wilkinson" <no-reply@xxxxxxxxxxxx> wrote in message news:ee97QkgEGHA.984@xxxxxxxxxxxxxxxxxxxxxxx

CString str(_T("Hello "));
str += "world."; // 8-bit string!!


As you posted, you know what you needed here. But still, your code looks even stranger than the explanation that you posted.

CString str(_T("Hello "));


You use the _T() macro in a place where you didn't need it, though of course it doesn't hurt a bit and it helps readability and it might help efficiency.

str += "world."; // 8-bit string!!


You omitted the _T() macro in a place where you did need it.

This combination looks very strange, because if you decided to use the _T() macro only sometimes, how did you come to a decision that's exactly backwards?

Does anybody use these implicit 8-bit <---> 16-bit conversion features in CString?


Yes. If I'm reading a data file or serial port or something, and the raw data are multibyte but the compilation is Unicode (or vice versa), then sometimes the converting constructors in CString are convenient. On those occasions, I don't have to code calls to MultiByteToWideChar and WideCharToMultiByte and wrap them in #ifdef UNICODE etc.

Norman:

I think you misunderstood my post (which was therefore not clear, I guess). The snippet

CString str(_T("Hello "));
str += "world.";        // 8-bit string!!
AfxMessageBox(str);

was just meant to be an example of something that "shouldn't" compile in a Unicode build, but does because CString has an implicit conversion constructor from 8-bit to 16-bit.

I did not actually write code like this; in fact I was pretty careful always to use the _T macro with any literal strings. The problem came with strings that were not literal strings.

My app has business logic that is written entirely in ISO C++, using std::string (8-bit). I did this so I could relatively easily port to some other platform. I do not want to convert this business logic to 16-bit, because it would be a lot of work, and because it would be counterproductive (most other operating systems do not use 16-bit Unicode as their native format).

In the 8-bit version of my code, I would freely pass strings between the business logic and the MFC GUI classes, and the whole app worked in the 8-bit code page of the user. In order for the 16-bit version to be Unicode-aware I need the 8-bit business logic strings to be UTF-8 and the 16-bit GUI strings to be UTF-16. So I wrote myself easy-to-use converters CU2T and CT2U which in a Unicode build convert between UTF-8 and UTF-16.
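The post does not show how CU2T is implemented. As a rough, portable sketch of the UTF-8 to UTF-16 direction only: the name Utf8ToUtf16 and the use of std::u16string are assumptions made for illustration, and on Windows a real converter would more likely wrap MultiByteToWideChar with CP_UTF8.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (NOT the author's CU2T): decode UTF-8 into UTF-16.
// Malformed or truncated input yields U+FFFD. This sketch does not reject
// overlong encodings, so it is not a validating decoder.
std::u16string Utf8ToUtf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else { out.push_back(0xFFFD); ++i; continue; }  // stray byte

        if (i + extra >= in.size()) {  // sequence runs past the end
            out.push_back(0xFFFD);
            break;
        }
        bool ok = true;
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (c & 0x3F);
        }
        if (!ok) { out.push_back(0xFFFD); ++i; continue; }
        i += extra + 1;

        if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF) {
            out.push_back(0xFFFD);  // not a valid Unicode scalar value
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000;  // encode as a surrogate pair
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

With something like this, the two UTF-8 bytes 0xC3 0xA9 become the single UTF-16 code unit U+00E9, independent of the user's code page.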

Suppose in my business logic I have a method like

std::string GetDisplayString() const;

and that I want to display this string in a message box. In my 8-bit version I would just do

AfxMessageBox(m_pBL->GetDisplayString().c_str());

In Unicode build this does not compile, so I change it to

AfxMessageBox(CU2T(m_pBL->GetDisplayString().c_str()));

No problem. But suppose I had done

CString str = _T("Display string = ");
str += m_pBL->GetDisplayString().c_str();
AfxMessageBox(str);

This code compiles in a Unicode build!! But it does the conversion using the current 8-bit code page, which is not what I want (the string is UTF-8). In this case, the compiler does not help me identify the code that needs to be changed. So potentially, I have to examine every line of my code by hand.
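To make the failure concrete, here is a small portable sketch of what goes wrong. The function WidenAsLatin1 is a hypothetical stand-in for a code-page conversion: it assumes a Latin-1 style code page, where each byte maps to the code point of the same value (Windows-1252 agrees with this for the two bytes used here). It is not what CString literally calls.

```cpp
#include <string>

// Hypothetical stand-in for an 8-bit code-page conversion:
// every input byte becomes one UTF-16 code unit of the same value.
std::u16string WidenAsLatin1(const std::string& bytes) {
    std::u16string out;
    for (unsigned char b : bytes) {
        out.push_back(static_cast<char16_t>(b));
    }
    return out;
}
```

Feeding it the UTF-8 encoding of "é" (bytes 0xC3 0xA9) gives two code units, U+00C3 and U+00A9, which display as "Ã©" rather than the single U+00E9 that was intended. Nothing fails at compile time or at run time; the text is just wrong.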

If CString did not have the implicit conversion constructor, then the compiler would have identified every line that needed to be changed. Or if I had typedef'd std::basic_string<TCHAR> as tstring and done

tstring str = _T("Display string = ");
str += m_pBL->GetDisplayString().c_str();
AfxMessageBox(str.c_str());

the compiler would also have helped me when I switched to a Unicode build.
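The difference between the two string types can be checked at compile time. In the sketch below, LenientString is a made-up class that only imitates the relevant behavior (an implicit constructor from const char* on a wide string); it is not CString, and its byte-by-byte widening is just a placeholder for a code-page conversion.

```cpp
#include <string>
#include <type_traits>

// Hypothetical wide string with an implicit narrow-to-wide constructor,
// imitating the CString behavior discussed above.
class LenientString {
public:
    LenientString(const wchar_t* s) : data_(s) {}
    // Implicit conversion from 8-bit text (placeholder: widen each byte).
    LenientString(const char* s) {
        for (; *s != '\0'; ++s) {
            data_.push_back(static_cast<wchar_t>(static_cast<unsigned char>(*s)));
        }
    }
    LenientString& operator+=(const LenientString& rhs) {
        data_ += rhs.data_;
        return *this;
    }
    const std::wstring& str() const { return data_; }
private:
    std::wstring data_;
};

// The lenient class silently accepts 8-bit text...
static_assert(std::is_convertible<const char*, LenientString>::value,
              "narrow text converts implicitly");
// ...while std::wstring (tstring in a Unicode build) refuses it,
// so every mixed-width line becomes a compile error.
static_assert(!std::is_constructible<std::wstring, const char*>::value,
              "std::wstring cannot be built from const char*");
```

With LenientString, `s += "b"` compiles and quietly converts; the same line with std::wstring is rejected, which is exactly the help from the compiler that is missing with CString.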

So for me, the implicit conversion feature of CString is a big liability. Actually, Bob Eaton has an interesting idea: modify the MFC headers to remove the conversion constructors. I may do this.

David Wilkinson




