Odd preprocessor behaviour?

Is the following code standard-compliant, and if so what should it do?
And where in the standard defines the behaviour?

#include <stdio.h>

#define DEF defined XXX

int main(void)
{
int defined = 2;
#if ! DEF
#define XXX +
printf("%d\n", DEF 1);
#endif
return 0;
}

How about the following code?

#include <stdio.h>

#define CAT(a) defined a ## X

int main(void)
{
#if CAT(XX)
printf("XXX defined\n");
#endif
return 0;
}

As far as I can see the relative precedence of the join (##) operator
and the defined operator (in an #if) are not stated anywhere.

(Incidentally, using GCC 2.95.4 the first example works and parses the
'defined' in the macros as an operator in the #if and as an identifier
(variable) in the printf statement. The second fails with an error
(`defined' without an identifier). GCC 3.0 allows both but gives a
warning about using 'defined' during macro expansion -- but expands and
uses it, implementing the join operator before testing for 'XXX'
defined. I haven't tried other compilers yet...)

If a compiler (or preprocessor) were to say that using the 'defined'
operator during macro expansion is always an error, would it be breaking
the standard-compliance (and if so, where in the standard)? Is there a
difference between C89, C99 and C++ preprocessor behaviour in this?

Thanks,

Chris C
Nov 14 '05 #1
Chris Croughton wrote:
Is the following code standard-compliant, and if so what should it do?
And where in the standard defines the behaviour?

#include <stdio.h>

#define DEF defined XXX

int main(void)
{
int defined = 2;
#if ! DEF
#define XXX +
printf("%d\n", DEF 1);
#endif
return 0;
}

How about the following code?

#include <stdio.h>

#define CAT(a) defined a ## X

int main(void)
{
#if CAT(XX)
printf("XXX defined\n");
#endif
return 0;
}

As far as I can see the relative precedence of the join (##) operator
and the defined operator (in an #if) are not stated anywhere.

(Incidentally, using GCC 2.95.4 the first example works and parses the
'defined' in the macros as an operator in the #if and as an identifier
(variable) in the printf statement. The second fails with an error
(`defined' without an identifier). GCC 3.0 allows both but gives a
warning about using 'defined' during macro expansion -- but expands and
uses it, implementing the join operator before testing for 'XXX'
defined. I haven't tried other compilers yet...)

If a compiler (or preprocessor) were to say that using the 'defined'
operator during macro expansion is always an error, would it be breaking
the standard-compliance (and if so, where in the standard)? Is there a
difference between C89, C99 and C++ preprocessor behaviour in this?

In section 6.10.1, the C99 standard says:

"Preprocessing directives of the forms

#if <constant-expression> new-line ...

Prior to evaluation, macro invocations in the list of preprocessing
tokens that will become the controlling constant expression are
replaced... If the token "defined" is generated as a result of this
replacement process ... the behaviour is undefined."

so in

#if CAT(XX)

you get undefined behaviour. And joins are evaluated before #if's. At
least that's my reading.

Robert

Nov 14 '05 #2
Chris Croughton <ch***@keristor.net> wrote:
Is the following code standard-compliant, and if so what should it do?
And where in the standard defines the behaviour?
n869.txt 6.10.1#3:
If the token `defined' is generated as a result of this replacement
process or use of the `defined' unary operator does not match one
of the two specified forms prior to macro replacement, the behavior
is undefined.
(I added the single quotes for better readability - in the Std this is
in Courier font.)
#include <stdio.h>

#define DEF defined XXX

int main(void)
{
int defined = 2;
#if ! DEF

UB.

#define XXX +
printf("%d\n", DEF 1);
#endif
return 0;
}

How about the following code?

#include <stdio.h>

#define CAT(a) defined a ## X

int main(void)
{
#if CAT(XX)

UB (for same reason).

printf("XXX defined\n");
#endif
return 0;
}

As far as I can see the relative precedence of the join (##) operator
and the defined operator (in an #if) are not stated anywhere.

In a "single preprocessing loop" `#' operator is applied first (during
the expansion); `##' is resolved at the last step (just before
the result of partial expansion is subjected to further expansion
again - see "Rescanning...").

This way they cannot coexist in a single "preprocessing expression",
because string tokens (or string token and something-else) don't form
a single valid token when pasted:
/*BAD CODE*/
#define DOUBLE_STR(x) # x ## # x
DOUBLE_STR(abc)
won't work, because
"abc""abc"
is not a valid pp-token (they are, actually, two valid pp-tokens, and would
be merged into a single C token "abcabc" before proper code translation).
(Incidentally, using GCC 2.95.4 the first example works and parses the
'defined' in the macros as an operator in the #if and as an identifier
(variable) in the printf statement. The second fails with an error
(`defined' without an identifier). GCC 3.0 allows both but gives a
warning about using 'defined' during macro expansion -- but expands and
uses it, implementing the join operator before testing for 'XXX'
defined. I haven't tried other compilers yet...)
All of them are "correct" in sense that they don't break the Standard.
As you have learned, you can't count on anything here, though.
If a compiler (or preprocessor) were to say that using the 'defined'
operator during macro expansion is always an error, would it be breaking
the standard-compliance (and if so, where in the standard)?
Of course not! Because it's just undefined - anything is allowed.
Is there a
difference between C89, C99 and C++ preprocessor behaviour in this?


No.

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`
Nov 14 '05 #3
S.Tobias <si***@famous.bedbug.pals.invalid> wrote:
In a "single preprocessing loop" `#' operator is applied first (during


I meant: single preprocessing loop step

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`
Nov 14 '05 #4
On Mon, 22 Nov 2004 23:00:46 GMT, Robert Harris
<ro*****************@blueyonder.co.uk> wrote:
Chris Croughton wrote:

As far as I can see the relative precedence of the join (##) operator
and the defined operator (in an #if) are not stated anywhere.
In section 6.10.1, the C99 standard says:

"Preprocessing directives of the forms

#if <constant-expression> new-line ...

Prior to evaluation, macro invocations in the list of preprocessing
tokens that will become the controlling constant expression are
replaced... If the token "defined" is generated as a result of this
replacement process ... the behaviour is undefined."


Ah, I think I read that as the token 'defined' being generated by using
## for instance (def##ined), but I see that it does include "generated
by the expansion of a macro". That makes sense.
so in

#if CAT(XX)

you get undefined behaviour. And joins are evaluated before #if's. At
least that's my reading.


So if it's undefined, my preprocessing program can do whatever it wants
and no one can complain about it because they shouldn't be doing it
anyway. Not that my program was supposed to be a full preprocessor
originally, it's just getting that way...

Thanks,

Chris C
Nov 14 '05 #5
On 22 Nov 2004 23:52:56 GMT, S.Tobias
<si***@FamOuS.BedBuG.pAlS.INVALID> wrote:
Chris Croughton <ch***@keristor.net> wrote:
Is the following code standard-compliant, and if so what should it do?
And where in the standard defines the behaviour?


n869.txt 6.10.1#3:
If the token `defined' is generated as a result of this replacement
process or use of the `defined' unary operator does not match one
of the two specified forms prior to macro replacement, the behavior
is undefined.
(I added the single quotes for better readability - in the Std this is
in Courier font.)


Yes, I see, I was misreading the term 'generated' (I was thinking of it
being something like def##ined, rather than a token within a macro).
As far as I can see the relative precedence of the join (##) operator
and the defined operator (in an #if) are not stated anywhere.


In a "single preprocessing loop" `#' operator is applied first (during
the expansion); `##' is resolved at the last step (just before
the result of partial expansion is subjected to further expansion
again - see "Rescanning...").

This way they cannot coexist in a single "preprocessing expression",
because string tokens (or string token and something-else) don't form
a single valid token when pasted:
/*BAD CODE*/
#define DOUBLE_STR(x) # x ## # x
DOUBLE_STR(abc)
won't work, because
"abc""abc"
is not a valid pp-token (they are, actually, two valid pp-tokens, and would
be merged into a single C token "abcabc" before proper code translation).


Ah, I see the difference, it's a valid sequence of characters only at
the lexical level, but not as a single token.
(Incidentally, using GCC 2.95.4 the first example works and parses the
'defined' in the macros as an operator in the #if and as an identifier
(variable) in the printf statement. The second fails with an error
(`defined' without an identifier). GCC 3.0 allows both but gives a
warning about using 'defined' during macro expansion -- but expands and
uses it, implementing the join operator before testing for 'XXX'
defined. I haven't tried other compilers yet...)


All of them are "correct" in sense that they don't break the Standard.
As you have learned, you can't count on anything here, though.


Indeed. However, for my purpose (writing a preprocessor) it is good
because it means that whatever I do will not be wrong, since if anyone
uses the construct they can't depend on it being portable anyway.
If a compiler (or preprocessor) were to say that using the 'defined'
operator during macro expansion is always an error, would it be breaking
the standard-compliance (and if so, where in the standard)?


Of course not! Because it's just undefined - anything is allowed.


And it doesn't even have to be documented (unlike implementation-defined
behaviour).
Is there a
difference between C89, C99 and C++ preprocessor behaviour in this?


No.


That's a relief, I really did not want to get into multiple switches to
handle different standards...

Thanks,

Chris C

Nov 14 '05 #6
Chris Croughton <ch***@keristor.net> wrote:
On 22 Nov 2004 23:52:56 GMT, S.Tobias
<si***@FamOuS.BedBuG.pAlS.INVALID> wrote:
Indeed. However, for my purpose (writing a preprocessor) it is good

Be warned: writing C preprocessor is not an especially easy task!
If you need a stand-alone pp, `mcpp' is said to be good.
If you need a cold shower, have a look at ::boost::preprocessor.

Is there a
difference between C89, C99 and C++ preprocessor behaviour in this?


No.

That's a relief, I really did not want to get into multiple switches to
handle different standards...


Hey, not so fast! I answered your question on differences "in this",
nothing else. If you're writing an include-all preprocessor, you'll
have to have some.

I don't think there are major differences in expansion algorithm between
those standards, but there are differences nevertheless. C99 added a
few things to the preprocessor, the most prominent being the ellipsis.
I don't think there is big change from C89 to C++, but there are slight
differences in the text from the beginning on; I think this might be related
to new keywords (arithmetic operators) in C++; it might be worth asking
in C++ NG what those differences are.

Last but not least, there are bound to be differences between
implementations of the same standard. C preprocessor is part of the
compiler and expression evaluation in the #if directive depends on how
certain types are implemented in the language (some people say here "C
preprocessor doesn't know C" - I think this is not the whole story).
For example, in C89:
#if 0xffffffff + 1
will depend on the value of LONG_MAX.
--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`
Nov 14 '05 #7
On 23 Nov 2004 15:25:51 GMT, S.Tobias
<si***@FamOuS.BedBuG.pAlS.INVALID> wrote:
Chris Croughton <ch***@keristor.net> wrote:
On 22 Nov 2004 23:52:56 GMT, S.Tobias
<si***@FamOuS.BedBuG.pAlS.INVALID> wrote:
Indeed. However, for my purpose (writing a preprocessor) it is good


Be warned: writing C preprocessor is not an especially easy task!
If you need a stand-alone pp, `mcpp' is said to be good.
If you need a cold shower, have a look at ::boost::preprocessor.


See my other recent thread, I'm writing a pre-preprocessor, as in the
idea of scpp and rmif, to process out 'known' #ifs and defines (and
known undefined symbols). It doesn't have to do everything (indeed, it
already does more than I need it to do, things like the join operator
are extra but if I'm going to add them I may as well get them right).
It doesn't deal with any preprocessor lines apart from #if #elif #else
#endif, #define and #undef.
>> Is there a
>> difference between C89, C99 and C++ preprocessor behaviour in this?
>
> No.

That's a relief, I really did not want to get into multiple switches to
handle different standards...


Hey, not so fast! I answered your question on differences "in this",
nothing else. If you're writing an include-all preprocessor, you'll
have to have some.


Ah yes, I do know about others.
I don't think there are major differences in expansion algorithm between
those standards, but there are differences nevertheless. C99 added a
few things to the preprocessor, the most prominent being the ellipsis.
Oh yes, I've already decided that I'm not going to do the varargs part
(among other things because the GCC syntax for it is a de facto standard
for a large chunk of code and it's not compatible with the C99 syntax).

Incidentally, is it possible to get a copy of the C89 standard now?
I've found several things where I want to be maximally compatible but I
only have the C99 standard.
I don't think there is big change from C89 to C++, but there are slight
differences in the text from the beginning on; I think this might be related
to new keywords (arithmetic operators) in C++; it might be worth asking
in C++ NG what those differences are.
Comparing them is a pain, grep notices differences in line lengths and
word wrapping.
Last but not least, there are bound to be differences between
implementations of the same standard. C preprocessor is part of the
compiler and expression evaluation in the #if directive depends on how
certain types are implemented in the language (some people say here "C
preprocessor doesn't know C" - I think this is not the whole story).
Indeed it isn't. The C preprocessor doesn't need to know anything about
the compiler for C code (indeed, C preprocessors are frequently used for
assembler preprocessing), but in most cases it does know something about
it (common implementation of integer operators, for instance).
For example, in C89:
#if 0xffffffff + 1
will depend on the value of LONG_MAX.


Or in C99 on the value of INTMAX_MAX. My program tries to work out the
"largest integer type", but that's one area where I'm knowingly not
conforming because I'm not distinguishing between signed and unsigned
values (I treat them all as signed).

Chris C
Nov 14 '05 #8
Chris Croughton <ch***@keristor.net> wrote:
On 23 Nov 2004 15:25:51 GMT, S.Tobias
<si***@FamOuS.BedBuG.pAlS.INVALID> wrote:
Incidentally, is it possible to get a copy of the C89 standard now?


I suppose a simple answer would be: no, unless you're rich.
This has been discussed in c.s.c., so check for answers there.

To have access to C89 text I bought Schildt's book on Amazon, and
keep my right I shut when I read it (but that might not be enough,
http://www.lysator.liu.se/c/schildt.html#6-1-3-1).

For practical purposes I prefer to work with
http://danpop.home.cern.ch/danpop/ansi.c

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`
Nov 14 '05 #9
S.Tobias <si***@famous.bedbug.pals.invalid> wrote:
keep my right I shut when I read it (but that might not be enough,

damn fingers!: ^^^ eye

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`
Nov 14 '05 #10
On 24 Nov 2004 13:55:36 GMT, S.Tobias
<si***@FamOuS.BedBuG.pAlS.INVALID> wrote:
Chris Croughton <ch***@keristor.net> wrote:
On 23 Nov 2004 15:25:51 GMT, S.Tobias
<si***@FamOuS.BedBuG.pAlS.INVALID> wrote:
Incidentally, is it possible to get a copy of the C89 standard now?


I suppose a simple answer would be: no, unless you're rich.
This has been discussed in c.s.c., so check for answers there.

To have access to C89 text I bought Schildt's book on Amazon, and
keep my right I shut when I read it (but that might not be enough,
http://www.lysator.liu.se/c/schildt.html#6-1-3-1).


Yes, getting the annotations wrong is one thing but getting the
specification itself wrong... One of the pages is also missing (or
rather one is duplicated instead of the page which should have followed
it).

I like Clive's comments (there and in other places)...
For practical purposes I prefer to work with
http://danpop.home.cern.ch/danpop/ansi.c


<innocent>
I downloaded that ansi.c file and tried to compile it, but GCC reported
lots of errors...
</innocent>

Thanks, that's a very useful link. I have somewhere a marked-up draft
from the Working Committee (a cow-orker was a commentator), but several
things changed between then and the ANSI standard being published...

Chris C
Nov 14 '05 #11
On 23 Nov 2004 15:25:51 GMT, S.Tobias
<si***@FamOuS.BedBuG.pAlS.INVALID> wrote:
Last but not least, there are bound to be differences between
implementations of the same standard. C preprocessor is part of the
compiler and expression evaluation in the #if directive depends on how
certain types are implemented in the language (some people say here "C
preprocessor doesn't know C" - I think this is not the whole story).
For example, in C89:
#if 0xffffffff + 1
will depend on the value of LONG_MAX.


There's another one which comes out of your description of making
'illegal' tokens.

#define CAT(a,b) a ## b

char *p = CAT("a", "b");
int i = CAT(1 + 8, / 2);

makes the invalid preprocessing tokens "a""b" and 8/ when expanded.
However, if the preprocessor is run separately from the compiler the
expansion is the perfectly valid code

char *p = "a""b";
int i = 1 + 8/ 2;

(GCC gives warnings about "invalid preprocessor token", but then goes on
to compile the code happily doing what one would expect. But it's UB, a
compiler could fail or produce complete rubbish if it wanted.)

Gah. What was that saying about a mouse designed by a committee? (It
could be worse. Perl /is/ worse...) <g>

Chris C
Nov 14 '05 #12

"Chris Croughton" <ch***@keristor.net> wrote in message
news:sl******************@ccserver.keris.net...
Gah. What was that saying about a mouse designed by a committee?
They say a camel is a horse designed by committee...
(It could be worse. Perl /is/ worse...) <g>


I'd love to hear Dijkstra's comments on that language.
Nov 14 '05 #13
On Thu, 25 Nov 2004 13:15:40 +0100, dandelion
<da*******@meadow.net> wrote:
"Chris Croughton" <ch***@keristor.net> wrote in message
news:sl******************@ccserver.keris.net...
Gah. What was that saying about a mouse designed by a committee?


They say a camel is a horse designed by committee...


I think it was an elephant which was a mouse designed by committee. Or
by IBM <g>...
(It could be worse. Perl /is/ worse...) <g>


I'd love to hear Dijkstra's comments on that language.


Only from a safe distance!

Chris C
Nov 14 '05 #14
