Odd preprocessor behaviour?

Is the following code standard-compliant, and if so what should it do?
And where in the standard defines the behaviour?

#include <stdio.h>

#define DEF defined XXX

int main(void)
{
    int defined = 2;
#if ! DEF
#define XXX +
    printf("%d\n", DEF 1);
#endif
    return 0;
}

How about the following code?

#include <stdio.h>

#define CAT(a) defined a ## X

int main(void)
{
#if CAT(XX)
    printf("XXX defined\n");
#endif
    return 0;
}

As far as I can see the relative precedence of the join (##) operator
and the defined operator (in an #if) are not stated anywhere.

(Incidentally, using GCC 2.95.4 the first example works and parses the
'defined' in the macros as an operator in the #if and as an identifier
(variable) in the printf statement. The second fails with an error
(`defined' without an identifier). GCC 3.0 allows both but gives a
warning about using 'defined' during macro expansion -- but expands and
uses it, implementing the join operator before testing for 'XXX'
defined. I haven't tried other compilers yet...)

If a compiler (or preprocessor) were to say that using the 'defined'
operator during macro expansion is always an error, would it be breaking
the standard-compliance (and if so, where in the standard)? Is there a
difference between C89, C99 and C++ preprocessor behaviour in this?

Thanks,

Chris C
Nov 14 '05 #1
Chris Croughton wrote:
Is the following code standard-compliant, and if so what should it do?
And where in the standard defines the behaviour?

#include <stdio.h>

#define DEF defined XXX

int main(void)
{
int defined = 2;
#if ! DEF
#define XXX +
printf("%d\n", DEF 1);
#endif
return 0;
}

How about the following code?

#include <stdio.h>

#define CAT(a) defined a ## X

int main(void)
{
#if CAT(XX)
printf("XXX defined\n");
#endif
return 0;
}

As far as I can see the relative precedence of the join (##) operator
and the defined operator (in an #if) are not stated anywhere.

(Incidentally, using GCC 2.95.4 the first example works and parses the
'defined' in the macros as an operator in the #if and as an identifier
(variable) in the printf statement. The second fails with an error
(`defined' without an identifier). GCC 3.0 allows both but gives a
warning about using 'defined' during macro expansion -- but expands and
uses it, implementing the join operator before testing for 'XXX'
defined. I haven't tried other compilers yet...)

If a compiler (or preprocessor) were to say that using the 'defined'
operator during macro expansion is always an error, would it be breaking
the standard-compliance (and if so, where in the standard)? Is there a
difference between C89, C99 and C++ preprocessor behaviour in this?

In section 6.10.1, the C99 standard says:

"Preprocessing directives of the forms

#if <constant-expression> new-line ...

Prior to evaluation, macro invocations in the list of preprocessing
tokens that will become the controlling constant expression are
replaced... If the token "defined" is generated as a result of this
replacement process ... the behaviour is undefined."

so in

#if CAT(XX)

you get undefined behaviour. And joins are evaluated before #if's. At
least that's my reading.

Robert

Nov 14 '05 #2
Chris Croughton <ch***@keristor.net> wrote:
Is the following code standard-compliant, and if so what should it do?
And where in the standard defines the behaviour?
n869.txt 6.10.1#3:
If the token `defined' is generated as a result of this replacement
process or use of the `defined' unary operator does not match one
of the two specified forms prior to macro replacement, the behavior
is undefined.
(I added the single quotes for better readability - in the Std this is
in Courier font.)
#include <stdio.h>

#define DEF defined XXX

int main(void)
{
int defined = 2;
#if ! DEF
UB.
#define XXX +
printf("%d\n", DEF 1);
#endif
return 0;
}

How about the following code?

#include <stdio.h>

#define CAT(a) defined a ## X

int main(void)
{
#if CAT(XX)
UB (for same reason).
printf("XXX defined\n");
#endif
return 0;
}

As far as I can see the relative precedence of the join (##) operator
and the defined operator (in an #if) are not stated anywhere.
In a "single preprocessing loop" `#' operator is applied first (during
the expansion); `##' is resolved at the last step (just before
the result of partial expansion is subjected to further expansion
again - see "Rescanning...").

This way they cannot coexist in a single "preprocessing expression",
because string tokens (or string token and something-else) don't form
a single valid token when pasted:
/*BAD CODE*/
#define DOUBLE_STR(x) # x ## # x
DOUBLE_STR(abc)
won't work, because
"abc""abc"
is not a valid pp-token (they are, actually, two valid pp-tokens, and would
be merged into a single C token "abcabc" before proper code translation).
(Incidentally, using GCC 2.95.4 the first example works and parses the
'defined' in the macros as an operator in the #if and as an identifier
(variable) in the printf statement. The second fails with an error
(`defined' without an identifier). GCC 3.0 allows both but gives a
warning about using 'defined' during macro expansion -- but expands and
uses it, implementing the join operator before testing for 'XXX'
defined. I haven't tried other compilers yet...)
All of them are "correct" in the sense that they don't break the Standard.
As you have learned, you can't count on anything here, though.
If a compiler (or preprocessor) were to say that using the 'defined'
operator during macro expansion is always an error, would it be breaking
the standard-compliance (and if so, where in the standard)?
Of course not! Because it's just undefined - anything is allowed.
Is there a
difference between C89, C99 and C++ preprocessor behaviour in this?


No.

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`
Nov 14 '05 #3
S.Tobias <si***@famous.bedbug.pals.invalid> wrote:
In a "single preprocessing loop" `#' operator is applied first (during


I meant: single preprocessing loop step

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`
Nov 14 '05 #4
On Mon, 22 Nov 2004 23:00:46 GMT, Robert Harris
<ro*****************@blueyonder.co.uk> wrote:
Chris Croughton wrote:

As far as I can see the relative precedence of the join (##) operator
and the defined operator (in an #if) are not stated anywhere.
In section 6.10.1, the C99 standard says:

"Preprocessing directives of the forms

#if <constant-expression> new-line ...

Prior to evaluation, macro invocations in the list of preprocessing
tokens that will become the controlling constant expression are
replaced... If the token "defined" is generated as a result of this
replacement process ... the behaviour is undefined."


Ah, I think I read that as the token 'defined' being generated by using
## for instance (def##ined), but I see that it does include "generated
by the expansion of a macro". That makes sense.
so in

#if CAT(XX)

you get undefined behaviour. And joins are evaluated before #if's. At
least that's my reading.


So if it's undefined, my preprocessing program can do whatever it wants
and no one can complain about it because they shouldn't be doing it
anyway. Not that my program was supposed to be a full preprocessor
originally, it's just getting that way...

Thanks,

Chris C
Nov 14 '05 #5
On 22 Nov 2004 23:52:56 GMT, S.Tobias
<si***@FamOuS.BedBuG.pAlS.INVALID> wrote:
Chris Croughton <ch***@keristor.net> wrote:
Is the following code standard-compliant, and if so what should it do?
And where in the standard defines the behaviour?


n869.txt 6.10.1#3:
If the token `defined' is generated as a result of this replacement
process or use of the `defined' unary operator does not match one
of the two specified forms prior to macro replacement, the behavior
is undefined.
(I added the single quotes for better readability - in the Std this is
in Courier font.)


Yes, I see, I was misreading the term 'generated' (I was thinking of it
being something like def##ined, rather than a token within a macro).
As far as I can see the relative precedence of the join (##) operator
and the defined operator (in an #if) are not stated anywhere.


In a "single preprocessing loop" `#' operator is applied first (during
the expansion); `##' is resolved at the last step (just before
the result of partial expansion is subjected to further expansion
again - see "Rescanning...").

This way they cannot coexist in a single "preprocessing expression",
because string tokens (or string token and something-else) don't form
a single valid token when pasted:
/*BAD CODE*/
#define DOUBLE_STR(x) # x ## # x
DOUBLE_STR(abc)
won't work, because
"abc""abc"
is not a valid pp-token (they are, actually, two valid pp-tokens, and would
be merged into a single C token "abcabc" before proper code translation).


Ah, I see the difference, it's a valid sequence of characters only at
the lexical level, but not as a single token.
(Incidentally, using GCC 2.95.4 the first example works and parses the
'defined' in the macros as an operator in the #if and as an identifier
(variable) in the printf statement. The second fails with an error
(`defined' without an identifier). GCC 3.0 allows both but gives a
warning about using 'defined' during macro expansion -- but expands and
uses it, implementing the join operator before testing for 'XXX'
defined. I haven't tried other compilers yet...)


All of them are "correct" in the sense that they don't break the Standard.
As you have learned, you can't count on anything here, though.


Indeed. However, for my purpose (writing a preprocessor) it is good
because it means that whatever I do will not be wrong, since if anyone
uses the construct they can't depend on it being portable anyway.
If a compiler (or preprocessor) were to say that using the 'defined'
operator during macro expansion is always an error, would it be breaking
the standard-compliance (and if so, where in the standard)?


Of course not! Because it's just undefined - anything is allowed.


And it doesn't even have to be documented (unlike implementation-defined
behaviour).
Is there a
difference between C89, C99 and C++ preprocessor behaviour in this?


No.


That's a relief, I really did not want to get into multiple switches to
handle different standards...

Thanks,

Chris C

Nov 14 '05 #6
Chris Croughton <ch***@keristor.net> wrote:
On 22 Nov 2004 23:52:56 GMT, S.Tobias
<si***@FamOuS.BedBuG.pAlS.INVALID> wrote:
Indeed. However, for my purpose (writing a preprocessor) it is good

Be warned: writing a C preprocessor is not an especially easy task!
If you need a stand-alone pp, `mcpp' is said to be good.
If you need a cold shower, have a look at ::boost::preprocessor.

Is there a
difference between C89, C99 and C++ preprocessor behaviour in this?


No.

That's a relief, I really did not want to get into multiple switches to
handle different standards...


Hey, not so fast! I answered your question on differences "in this",
nothing else. If you're writing an include-all preprocessor, you'll
have to have some.

I don't think there are major differences in expansion algorithm between
those standards, but there are differences nevertheless. C99 added a
few things to the preprocessor, the most prominent being the ellipsis.
I don't think there is big change from C89 to C++, but there are slight
differences in the text from the beginning on; I think this might be related
to new keywords (arithmetic operators) in C++; it might be worth asking
in C++ NG what those differences are.

Last but not least, there are bound to be differences between
implementations of the same standard. C preprocessor is part of the
compiler and expression evaluation in the #if directive depends on how
certain types are implemented in the language (some people say here "C
preprocessor doesn't know C" - I think this is not the whole story).
For example, in C89:
#if 0xffffffff + 1
will depend on the value of LONG_MAX.
--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`
Nov 14 '05 #7
On 23 Nov 2004 15:25:51 GMT, S.Tobias
<si***@FamOuS.BedBuG.pAlS.INVALID> wrote:
Chris Croughton <ch***@keristor.net> wrote:
On 22 Nov 2004 23:52:56 GMT, S.Tobias
<si***@FamOuS.BedBuG.pAlS.INVALID> wrote:
Indeed. However, for my purpose (writing a preprocessor) it is good


Be warned: writing a C preprocessor is not an especially easy task!
If you need a stand-alone pp, `mcpp' is said to be good.
If you need a cold shower, have a look at ::boost::preprocessor.


See my other recent thread, I'm writing a pre-preprocessor, as in the
idea of scpp and rmif, to process out 'known' #ifs and defines (and
known undefined symbols). It doesn't have to do everything (indeed, it
already does more than I need it to do, things like the join operator
are extra but if I'm going to add them I may as well get them right).
It doesn't deal with any preprocessor lines apart from #if #elif #else
#endif, #define and #undef.
>> Is there a
>> difference between C89, C99 and C++ preprocessor behaviour in this?
>
> No.

That's a relief, I really did not want to get into multiple switches to
handle different standards...


Hey, not so fast! I answered your question on differences "in this",
nothing else. If you're writing an include-all preprocessor, you'll
have to have some.


Ah yes, I do know about others.
I don't think there are major differences in expansion algorithm between
those standards, but there are differences nevertheless. C99 added a
few things to the preprocessor, the most prominent being the ellipsis.
Oh yes, I've already decided that I'm not going to do the varargs part
(among other things because the GCC syntax for it is a de facto standard
for a large chunk of code and it's not compatible with the C99 syntax).

Incidentally, is it possible to get a copy of the C89 standard now?
I've found several things where I want to be maximally compatible but I
only have the C99 standard.
I don't think there is big change from C89 to C++, but there are slight
differences in the text from the beginning on; I think this might be related
to new keywords (arithmetic operators) in C++; it might be worth asking
in C++ NG what those differences are.
Comparing them is a pain, grep notices differences in line lengths and
word wrapping.
Last but not least, there are bound to be differences between
implementations of the same standard. C preprocessor is part of the
compiler and expression evaluation in the #if directive depends on how
certain types are implemented in the language (some people say here "C
preprocessor doesn't know C" - I think this is not the whole story).
Indeed it isn't. The C preprocessor doesn't need to know anything about
the compiler for C code (indeed, C preprocessors are frequently used for
assembler preprocessing), but in most cases it does know something about
it (common implementation of integer operators, for instance).
For example, in C89:
#if 0xffffffff + 1
will depend on the value of LONG_MAX.


Or in C99 on the value of INTMAX_MAX. My program tries to work out the
"largest integer type", but that's one area where I'm knowingly not
conforming because I'm not distinguishing between signed and unsigned
values (I treat them all as signed).

Chris C
Nov 14 '05 #8
Chris Croughton <ch***@keristor.net> wrote:
On 23 Nov 2004 15:25:51 GMT, S.Tobias
<si***@FamOuS.BedBuG.pAlS.INVALID> wrote:
Incidentally, is it possible to get a copy of the C89 standard now?


I suppose a simple answer would be: no, unless you're rich.
This has been discussed in c.s.c., so check for answers there.

To have access to C89 text I bought Schildt's book on Amazon, and
keep my right I shut when I read it (but that might not be enough,
http://www.lysator.liu.se/c/schildt.html#6-1-3-1).

For practical purposes I prefer to work with
http://danpop.home.cern.ch/danpop/ansi.c

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`
Nov 14 '05 #9
S.Tobias <si***@famous.bedbug.pals.invalid> wrote:
keep my right I shut when I read it (but that might not be enough,

damn fingers!: ^^^ eye

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`
Nov 14 '05 #10
On 24 Nov 2004 13:55:36 GMT, S.Tobias
<si***@FamOuS.BedBuG.pAlS.INVALID> wrote:
Chris Croughton <ch***@keristor.net> wrote:
On 23 Nov 2004 15:25:51 GMT, S.Tobias
<si***@FamOuS.BedBuG.pAlS.INVALID> wrote:
Incidentally, is it possible to get a copy of the C89 standard now?


I suppose a simple answer would be: no, unless you're rich.
This has been discussed in c.s.c., so check for answers there.

To have access to C89 text I bought Schildt's book on Amazon, and
keep my right I shut when I read it (but that might not be enough,
http://www.lysator.liu.se/c/schildt.html#6-1-3-1).


Yes, getting the annotations wrong is one thing but getting the
specification itself wrong... One of the pages is also missing (or
rather one is duplicated instead of the page which should have followed
it).

I like Clive's comments (there and in other places)...
For practical purposes I prefer to work with
http://danpop.home.cern.ch/danpop/ansi.c


<innocent>
I downloaded that ansi.c file and tried to compile it, but GCC reported
lots of errors...
</innocent>

Thanks, that's a very useful link. I have somewhere a marked-up draft
from the Working Committee (a cow-orker was a commentator), but several
things changed between then and the ANSI standard being published...

Chris C
Nov 14 '05 #11
On 23 Nov 2004 15:25:51 GMT, S.Tobias
<si***@FamOuS.BedBuG.pAlS.INVALID> wrote:
Last but not least, there are bound to be differences between
implementations of the same standard. C preprocessor is part of the
compiler and expression evaluation in the #if directive depends on how
certain types are implemented in the language (some people say here "C
preprocessor doesn't know C" - I think this is not the whole story).
For example, in C89:
#if 0xffffffff + 1
will depend on the value of LONG_MAX.


There's another one which comes out of your description of making
'illegal' tokens.

#define CAT(a,b) a ## b

char *p = CAT("a", "b");
int i = CAT(1 + 8, / 2);

makes the invalid preprocessing tokens "a""b" and 8/ when expanded.
However, if the preprocessor is run separately from the compiler the
expansion is the perfectly valid code

char *p = "a""b";
int i = 1 + 8/ 2;

(GCC gives warnings about "invalid preprocessor token", but then goes on
to compile the code happily doing what one would expect. But it's UB, a
compiler could fail or produce complete rubbish if it wanted.)

Gah. What was that saying about a mouse designed by a committee? (It
could be worse. Perl /is/ worse...) <g>

Chris C
Nov 14 '05 #12

"Chris Croughton" <ch***@keristor.net> wrote in message
news:sl******************@ccserver.keris.net...
Gah. What was that saying about a mouse designed by a committee?
They say a camel is a horse designed by committee...
(It could be worse. Perl /is/ worse...) <g>


I'd love to hear Dijkstra's comments on that language.
Nov 14 '05 #13
On Thu, 25 Nov 2004 13:15:40 +0100, dandelion
<da*******@meadow.net> wrote:
"Chris Croughton" <ch***@keristor.net> wrote in message
news:sl******************@ccserver.keris.net...
Gah. What was that saying about a mouse designed by a committee?


They say a camel is a horse designed by committee...


I think it was an elephant which was a mouse designed by committee. Or
by IBM <g>...
(It could be worse. Perl /is/ worse...) <g>


I'd love to hear Dijkstra's comments on that language.


Only from a safe distance!

Chris C
Nov 14 '05 #14
