Re: Are _T() and TEXT() macros equivalent?



Remember that billions of lines of code existed in the C language before anyone thought of
supporting Unicode. If you start arguing that C ought to have these functions, that's
fine, but unfortunately, it didn't, and doesn't, and all of that code would break.
Furthermore, code written that would compile and run with the Unicode option enabled in
this way would break horribly if compiled with the ANSI option. So instead the programmer
is required to make the decisions, which means the programmer is claiming to be aware of
the implications of the changes in representation.
joe
On Mon, 9 Apr 2007 17:30:20 -0700, "David Ching" <dc@xxxxxxxxxxxxxxxxxxxxxx> wrote:

"Joseph M. Newcomer" <newcomer@xxxxxxxxxxxx> wrote in message
news:1skl13578pilh48i5bb493mqddpclh9fn1@xxxxxxxxxx
That was my point; if you define 'char' to be Unicode, it violates sizeof(char) premises
that are wired deeply into a lot of programs. The resulting chaos of changing 'char' such
that sizeof(char) == 2 would be disastrous. Think of all the places where, in doing
Unicode, we have to *sizeof(TCHAR) or /sizeof(TCHAR).
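
For readers who have not run into the *sizeof(TCHAR) and /sizeof(TCHAR) pattern, here is a minimal sketch of it. XCHAR, USE_WIDE, BUF_CHARS and the three helper functions are hypothetical names invented for this illustration; XCHAR stands in for TCHAR only so that the sketch builds without the Windows headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* XCHAR is a hypothetical stand-in for TCHAR. Define USE_WIDE to get
   the wide ("Unicode") build; leave it undefined for the narrow build. */
#ifdef USE_WIDE
typedef wchar_t XCHAR;
#else
typedef char XCHAR;
#endif

#define BUF_CHARS 64

/* The "/sizeof" case: how many characters fit in a given number of bytes. */
size_t chars_in_bytes(size_t bytes)
{
    return bytes / sizeof(XCHAR);
}

/* The "*sizeof" case: how many bytes a given number of characters needs. */
size_t bytes_for_chars(size_t chars)
{
    return chars * sizeof(XCHAR);
}

/* memset takes a byte count, so a character count must be scaled first.
   Passing `chars` directly would clear only part of a wide buffer. */
void clear_buffer(XCHAR *buf, size_t chars)
{
    assert(buf != NULL);
    memset(buf, 0, bytes_for_chars(chars));
}
```

In the narrow build both conversions are no-ops, which is why code that skips them appears to work until it is recompiled with the wide character type.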


It seems to me the reason sizeof(char) == 1 is "wired deeply into" a lot of
programs is to work around the lack of something like _cb() and _cch()
reserved words which would return the number of bytes and the number of
chars, respectively, in the variable passed in as the parameter. Had these
been available, people would have used them instead of sizeof(char) and it
would be a piece of cake to

#define char wchar_t


and have it just work.
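
The _cb() and _cch() reserved words proposed above do not exist in C. As a rough sketch of the idea, they can be approximated with macros built on sizeof. CB, CCH, narrow_buf, wide_buf and check_counts are hypothetical names for this illustration, and the macros are only correct for true arrays, not for pointers or for arrays passed as function parameters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical approximations of the proposed _cb() and _cch().
   CB yields the size of an array in bytes; CCH yields its size in
   elements (characters). Neither works on a pointer. */
#define CB(arr)  (sizeof(arr))
#define CCH(arr) (sizeof(arr) / sizeof((arr)[0]))

char    narrow_buf[32];
wchar_t wide_buf[32];

/* The character counts agree regardless of character width;
   only the byte counts differ. */
void check_counts(void)
{
    assert(CCH(narrow_buf) == CCH(wide_buf));
    assert(CB(narrow_buf) == CCH(narrow_buf) * sizeof(char));
    assert(CB(wide_buf) == CCH(wide_buf) * sizeof(wchar_t));
}
```

Code written against CCH() for character counts and CB() for byte counts would not care which character type is in use, which is the property the proposal is after. The limitation to true arrays is one reason a macro cannot fully replace a language-level feature here.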

I would say the correct behavior at this point is to introduce these
concepts now and let the programmer beware that using them sporadically in
existing codebases may not work. That's no different than giving both
release and debug builds of DLLs and not supporting them being mixed and
matched.

But instead, the standard insists on preserving some ancient rule that says
sizeof(char) == 1, and we all have to suffer the shortcomings (e.g. using
_T()) forever and ever. For reasons I don't understand or appreciate.

-- David




Joseph M. Newcomer [MVP]
email: newcomer@xxxxxxxxxxxx
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm