Re: CommandLineToArgvA?
- From: "Igor Tandetnik" <itandetnik@xxxxxxxx>
- Date: Sun, 7 Jun 2009 12:49:09 -0400
Vincent Fatica wrote:
> On Sun, 7 Jun 2009 11:44:47 -0400, "Igor Tandetnik"
> <itandetnik@xxxxxxxx> wrote:
>
>> No. But if the current system codepage is in fact CP1253 (Windows
>> codepage for Greek), and the caller did want to pass some Greek
>> characters to you, you will silently convert them to accented Latin
>> characters that just happen to have the same codes in Latin-1 aka
>> ISO-8859-1 codepage (which is what Unicode codepoints U+0000 through
>> U+00FF correspond to, for historical reasons).
>>
>> For example, GREEK CAPITAL LETTER ALPHA is code 193 (hex 0xC1) in
>> CP1253. But you will interpret it as U+00C1, LATIN CAPITAL LETTER A
>> WITH ACUTE.
>
> What's the problem? When I convert each Unicode argv back to MBCS
> with
>
> while ( *p++ = (CHAR) *wp++ );
>
> won't it go back to 193 (and again be interpreted as GREEK CAPITAL
> LETTER ALPHA)? I don't think CommandLineToArgvW cares whether it's
> GREEK CAPITAL LETTER ALPHA or LATIN CAPITAL LETTER A WITH ACUTE. I'm
> assuming CommandLineToArgvW only **interprets** whitespace,
> backslashes, and double-quotes.

Ah, I didn't realize you were going to Unicode and back. Anyway, you'd
still have problems with true double-byte encodings, like Chinese Big5
or Japanese Shift-JIS. In these encodings, some characters are
represented by two bytes, called the lead byte and the trailing byte.
The lead byte always has its high bit set, but the trailing byte can
fall in the ASCII range (0x40 through 0x7E in both encodings), so it can
happen to be the same as the ASCII code for backslash (0x5C). Space
(0x20) and double quote (0x22) lie below that range and cannot appear as
trailing bytes, but a stray backslash is enough to cause trouble.
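
To make the lead byte / trailing byte layout concrete, here is a minimal
sketch in portable C. It assumes the lead-byte ranges of Windows codepage
932 (Shift-JIS), 0x81-0x9F and 0xE0-0xFC, hardcoded for illustration. The
helper names sjis_is_lead_byte and sjis_count_trail_equal are made up for
this example. In a real Windows program you would ask IsDBCSLeadByte or
IsDBCSLeadByteEx instead of hardcoding ranges.

```c
#include <stddef.h>

/* Assumption: CP932 (Windows Shift-JIS) lead-byte ranges.
   Hypothetical helper; on Windows, IsDBCSLeadByteEx does this job. */
static int sjis_is_lead_byte(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Walk s[0..len) one character at a time and count the trailing bytes
   whose value equals `ascii` (for example '\\'). Single-byte characters
   with that value are not counted. */
static size_t sjis_count_trail_equal(const unsigned char *s, size_t len,
                                     unsigned char ascii)
{
    size_t i = 0, hits = 0;
    while (i < len) {
        if (sjis_is_lead_byte(s[i]) && i + 1 < len) {
            if (s[i + 1] == ascii)
                ++hits;
            i += 2;   /* skip the whole double-byte character */
        } else {
            ++i;      /* ordinary single-byte character */
        }
    }
    return hits;
}
```

For the two bytes 0x95 0x5C (one kanji in Shift-JIS),
sjis_count_trail_equal(s, 2, '\\') returns 1: the second byte has the
same value as an ASCII backslash even though no backslash character is
present.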
Your naive algorithm will convert such a double-byte character to two
independent Unicode codepoints. CommandLineToArgvW could then interpret
the codepoint corresponding to the trailing byte as a special character:
a backslash in front of a double quote, for instance, turns that quote
into a literal character instead of a delimiter. As a result, a) some
parameter will be split or merged in the wrong place, and b) when your
algorithm converts back from Unicode to MBCS, you may end up with a lead
byte not followed by its trailing byte (or followed by an unrelated
ASCII character that will be misinterpreted as a trailing byte).
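
The failure can be sketched in portable C without calling the Windows
API. The function naive_widen below mirrors the byte-by-byte widening
discussed in this thread; find_escaped_quote is a made-up helper that
looks for the pair backslash + double quote, which CommandLineToArgvW
treats as an escaped, literal quote. Both names are hypothetical, and the
sample bytes assume Shift-JIS input.

```c
#include <stddef.h>
#include <wchar.h>

/* Naive widening: every byte becomes its own code unit, with no regard
   for lead/trailing byte pairs. dst must hold len + 1 elements. */
static void naive_widen(const unsigned char *src, size_t len, wchar_t *dst)
{
    size_t i;
    for (i = 0; i < len; ++i)
        dst[i] = (wchar_t)src[i];
    dst[len] = L'\0';
}

/* Return the index of the first backslash that is immediately followed
   by a double quote, or -1 if there is none. */
static long find_escaped_quote(const wchar_t *w)
{
    long i;
    for (i = 0; w[i] != L'\0'; ++i)
        if (w[i] == L'\\' && w[i + 1] == L'"')
            return i;
    return -1;
}
```

Take the four bytes 0x22 0x95 0x5C 0x22: a quoted argument holding the
single Shift-JIS character 0x95 0x5C. After naive_widen, the wide string
is '"', U+0095, '\', '"', and find_escaped_quote returns 2. The closing
quote now looks escaped, so the argument no longer ends where the caller
intended.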
--
With best wishes,
Igor Tandetnik
With sufficient thrust, pigs fly just fine. However, this is not
necessarily a good idea. It is hard to be sure where they are going to
land, and it could be dangerous sitting under them as they fly
overhead. -- RFC 1925