Re: More MSDN lies: RtlStringCchLength
- From: "Norman Diamond" <ndiamond@xxxxxxxxxxxxxxxx>
- Date: Wed, 24 Aug 2005 18:09:59 +0900
"Jochen Kalmbach [MVP]" <nospam-Jochen.Kalmbach@xxxxxxxxx> wrote in message news:%23It%23$PIqFHA.3720@xxxxxxxxxxxxxxxxxxxxxxx
Hi Norman!
Instead of "character" MS should have used "codepoint".
But "codepoint" doesn't mean byte because each character has a codepoint.
No
Wrong.
Codepoint means simply a defined value in a defined code-space.
Right.
So for MBCS (A) the code-space is from 0x00 to 0xFF
Wrong. The code-space depends on which code page is in use. Surely you know that the codepoint for "笹" is larger than 0xFF and the codepoint for "塚" is larger than 0xFF. By the way these characters exist in Chinese too, and in Chinese codepages these characters also have codepoints larger than 0xFF (and probably different from the codepoints that they occupy in codepage 932).
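As an illustration of the point about code pages, here is a minimal sketch in Python. It is not from the thread, and it assumes that Python's `cp932` and `gbk` codecs are acceptable stand-ins for Windows code pages 932 and 936. The helper name `mbcs_value` is made up for this example. It prints the double-byte value of each character in both code pages, and each value is larger than 0xFF.

```python
def mbcs_value(ch: str, codec: str) -> int:
    """Return the encoded bytes of ch in the given codec as one big-endian integer."""
    return int.from_bytes(ch.encode(codec), "big")


# Both characters need two bytes in a Japanese and in a Chinese code page.
for ch in "笹塚":
    for codec in ("cp932", "gbk"):
        print(f"{ch} in {codec}: 0x{mbcs_value(ch, codec):X}")
```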
For wchar (W) the code-space is from 0x0000 to 0xFFFF
Yes (for Microsoft's wchar code-space).
And this is exactly what these functions are using as "characters".
No, because some parts of MSDN use "characters" to really mean "characters".
"byte" is only true for A-versions, for W-version it is false,
You are right. MSDN should say that the A version counts bytes and the W version counts characters.
No. W also does not count characters. It also counts codepoints!!!
Wrong. Wrong. Right. The W version does count characters and the W version also counts codepoints. But the A version does not count codepoints. MSDN lies about the A version.
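The difference between the two counts can be sketched as follows. This is a Python model of the behaviour described in this message, not a call to RtlStringCchLength itself. The function names are hypothetical, and `cp932` is assumed as the active code page for the A version.

```python
def a_version_count(s: str, codec: str = "cp932") -> int:
    """Model of a char-based length routine: it reports bytes."""
    return len(s.encode(codec))


def w_version_count(s: str) -> int:
    """Model of a WCHAR-based length routine: it reports 16-bit units."""
    return len(s.encode("utf-16-le")) // 2


name = "笹塚"  # two characters
print(len(name), a_version_count(name), w_version_count(name))  # prints: 2 4 2
```

For this string the W-style count matches the number of characters, while the A-style count is the number of bytes.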
Unicode currently has a range from 0x0 to 0x10FFFF, so it does not fit into wchar!
True, Microsoft's Unicode is a subset of real Unicode (except for a few exceptions, I think).
Therefore you also need to do some kind of multi-codepoint for one character.
No. To handle all of Unicode's codepoints you need multi-Microsoft-somethings, which Microsoft doesn't handle (except for a few exceptions, I think).
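A short Python sketch of what happens above 0xFFFF follows. It assumes that the "multi-Microsoft-somethings" correspond to UTF-16 surrogate pairs, a term the thread itself does not use. The function name `utf16_units` is hypothetical.

```python
def utf16_units(s: str) -> int:
    """Count the 16-bit units needed to store s as UTF-16."""
    return len(s.encode("utf-16-le")) // 2


bmp_char = "\u7b39"      # 笹, fits in one 16-bit unit
astral_char = "\U00020000"  # a codepoint above 0xFFFF

print(utf16_units(bmp_char))     # prints: 1
print(utf16_units(astral_char))  # prints: 2
```

Under that assumption, a length routine that counts 16-bit units would report 2 for a single codepoint outside the 0x0000 to 0xFFFF range.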
MSDN really needs fixing (again).
Yes. It should update the corresponding docs to replace "characters" with "codepoint".
Well, notice that the cited MSDN pages already give two separate function headers instead of trying to unify them the way user-space MSDN pages do. So I thought that it would not be too hard to fix the pages clearly and correctly by giving two separate descriptions too. But you want to see a single word fixed in a single description. In that case, the word "characters" could be replaced by "TCHARs". But I'm still not sure if TCHARs are supposed to exist in kernel mode or not -- although ntddk.h and wdm.h export definitions of some subset of the user-mode TCHAR stuff, it seems that maybe that's a bug and maybe these headers weren't supposed to export any TCHAR definitions.