Re: #define and (brackets)
- From: "Igor Tandetnik" <itandetnik@xxxxxxxx>
- Date: Fri, 28 Nov 2008 14:00:14 -0500
"Alan Carre" <alan@xxxxxxxxxxxxxxxxx> wrote in message
news:ezDUT0XUJHA.4372@xxxxxxxxxxxxxxxxxxxx
"Igor Tandetnik" <itandetnik@xxxxxxxx> wrote in message
news:%23$3ZwcXUJHA.6092@xxxxxxxxxxxxxxxxxxxxxxx
lots and lots and lots of information which I can hardly (pre)process
myself...
Ok well, then so it separates everything into these tokens and then
pulls them out (or pops them off or whatever) as separate entities
which presumably are all separable by spaces, correct?
Yes, you could take a sequence of tokens generated after phase 6, and
output them to a file separated by spaces, if you are so inclined. When
the resulting file is translated, after phase 6 the same sequence of
tokens would be produced as that from the original file.
.. You know, actually I find that a little difficult to comprehend:
We all know that white-space characters are significant in the case
of consecutive minus-sign tokens.
And that's precisely why, in the process above, you would add a space
between two - tokens, so that they don't accidentally get mistaken for a
single -- token when the file is re-translated. It doesn't matter by
that point whether the two - tokens were explicitly present in the
source file (in which case they would necessarily have been separated by
whitespace) or were produced by the preprocessor's manipulation of the
token stream (in which case they might not be).
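A small illustration of the second case, as a sketch only. The macro NEG_TEN and the function twice_negated are hypothetical names, not something from this thread. The two - tokens never sit next to each other in the source text, yet after macro expansion they are adjacent, and they stay two separate tokens:

```cpp
// Hypothetical macro, for illustration only: expands to the tokens ["-", "10"].
#define NEG_TEN -10

// After preprocessing, the returned expression is the token sequence
// ["-", "-", "10"]: two unary minus operators applied to 10.
// The two "-" tokens are never merged into a single "--" token.
int twice_negated() {
    return -NEG_TEN;
}
```

Calling twice_negated() yields 10. If the preprocessor worked by pasting text, the result would read "--10", which would not compile.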
So how can that be? How can it be
that there exists a point where we can eliminate all whitespace
without changing the program?
Roughly, by that time, the program is not represented by a single
string, but by an array of strings, where each element stands for a
single token. Consider:
const char* sequence1[] = {"-", "-"};
const char* sequence2[] = {"--"};
The two sequences are obviously different, when compared
element-by-element. But if you were to concatenate their elements into a
single string, you would end up with the same string. And then it would
be impossible to tell which sequence this string was originally produced
from.
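The same point as a compilable sketch. The helper join_without_spaces is a made-up name for this illustration; it glues the tokens together with nothing in between, as if all whitespace had been removed:

```cpp
#include <string>
#include <vector>

// Concatenate the tokens with no separator between them.
std::string join_without_spaces(const std::vector<std::string>& tokens) {
    std::string out;
    for (const std::string& t : tokens) {
        out += t;
    }
    return out;
}

// The two token sequences from the text above.
const std::vector<std::string> sequence1 = {"-", "-"};
const std::vector<std::string> sequence2 = {"--"};
```

Here sequence1 != sequence2 holds, while join_without_spaces(sequence1) and join_without_spaces(sequence2) both produce the string "--".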
The compiler proper (phase 7 onward) works on this array of strings,
detecting sequences of tokens that form various language constructs.
I mean what about *pnNum1/ *pnNumber ?
I can't remove that space following the division symbol, the whole
program would become a comment...
Phase 3 (tokenizer) turns the sequence of characters "*pnNum1/
*pnNumber" into the sequence of tokens ["*", "pnNum1", "/", ws, "*",
"pnNumber"]. "ws" here stands for a special whitespace token: phase 3
needs to preserve it only because the sequence might be passed to #
operator in phase 4 (preprocessor), where whitespace is still somewhat
significant (it generates a single space character in the resulting
string literal). After phase 6, ws tokens are dropped, and the compiler
(phase 7) sees only ["*", "pnNum1", "/", "*", "pnNumber"].
In any case, what if one's macro was, in fact, "--X" (ie.
intentionally decrementing X)?
You mean, with "--" actually appearing in the source text? It would be
parsed as a single token in phase 3, and would then travel as a single
token through the rest of the process. Phase 3 neither knows nor cares
whether the token is or isn't part of a macro.
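A toy tokenizer makes the "single token in phase 3" point concrete. This is only a sketch under heavy assumptions, not how any real compiler is written: the function name tokenize is invented, it knows identifiers, numbers, single-character punctuators and the two-character "--", it ignores comments and string literals, and it drops whitespace right away instead of keeping ws tokens. When it sees two adjacent minus characters it takes the longest token it can, so "--" is never split:

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Toy tokenizer for illustration only (see the assumptions above).
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < text.size()) {
        const unsigned char c = static_cast<unsigned char>(text[i]);
        if (std::isspace(c)) {
            ++i;  // whitespace only separates tokens
        } else if (std::isalnum(c) || c == '_') {
            // Identifier or number: consume all following word characters.
            const std::size_t start = i;
            while (i < text.size() &&
                   (std::isalnum(static_cast<unsigned char>(text[i])) ||
                    text[i] == '_')) {
                ++i;
            }
            tokens.push_back(text.substr(start, i - start));
        } else if (c == '-' && i + 1 < text.size() && text[i + 1] == '-') {
            tokens.push_back("--");  // longest match: "--" rather than "-", "-"
            i += 2;
        } else {
            tokens.push_back(std::string(1, text[i]));  // any other punctuator
            ++i;
        }
    }
    return tokens;
}
```

With this sketch, tokenize("--X") gives ["--", "X"], while tokenize("- -X") gives ["-", "-", "X"]. The whitespace decided where the tokens end, and after that it is no longer needed.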
Should the preprocessor separate those
minus signs as well?
Of course not. After phase 3, the preprocessor (phase 4) sees a sequence
of two tokens ["--", "X"]. It's possible that X is a macro, which for
example expands into a sequence of tokens ["-", "10"]. Then, after
preprocessing, the resulting sequence becomes ["--", "-", "10"], which
is fed to the compiler (phase 7).
How come they are treated differently?
In what way do you believe they are treated differently?
Is the
preprocessor aware of C++ syntax?
No (unless you count preprocessing directives as part of the C++ syntax,
which formally they are, but I understand what you are trying to say
here).
Is it actually compiling when it's
deciphering macro expressions?
No, in the sense that it doesn't parse C++ language constructs other
than preprocessing directives.
The ultimate arbiter is the compiler.
I wonder what you mean by _the_ compiler. Which implementation, and
which version of that implementation, run with which command line
options, do you hold up as the ultimate arbiter?
Is there such a thing as a "compiler bug"? What happens when two
different compilers contradict each other, by producing different
results when given the same source - which one is more ultimate than the
other? Heck, what happens when the same compiler contradicts itself (as
was shown in this thread when the same source file either passes or
fails compilation when processed by the same compiler in two different
ways)?
It's like saying that the ultimate arbiter of what an electrical plug
should look like is the electrical outlet in your wall. If you have two
of them, slightly different, which one is defective? If a particular
plug doesn't fit the outlet, why would you automatically assume the plug
is defective, and not the outlet? That's what we have standards for.
The C++ language is what the C++ standard says it is. A C++ compiler
that doesn't follow the C++ standard is called non-conforming, which is
just a euphemism for "buggy". The way you demonstrate compiler bugs is
you produce an example program, figure out from the normative language
of the standard how it should behave, then observe that it behaves
differently when processed by a particular compiler.
--
With best wishes,
Igor Tandetnik
With sufficient thrust, pigs fly just fine. However, this is not
necessarily a good idea. It is hard to be sure where they are going to
land, and it could be dangerous sitting under them as they fly
overhead. -- RFC 1925