Conditional compilation sans the cpp

Steven T. Hatton

I've made no secret of the fact that I really dislike the C preprocessor in
C++. No aspect of the language has caused me more trouble. No aspect of
the language has caused more code I've read to be difficult to understand.
I've described it as GOTOs on steroids, and that's what it is!

One argument against abolishing it is that it is useful for conditional
compilation when porting code, etc. Well, it seems to me C++ supports that
natively. According to TC++PL(SE) §24.3.7.2, if a block of code is
bracketed with an if(CONDITION){...}, the entire expression is "compiled
away" when CONDITION==false.

Can this not be used in place of the typical #ifdef ... #endif used for
conditional compilation?
--
STH
Hatton's Law: "There is only One inviolable Law"
KDevelop: http://www.kdevelop.org SuSE: http://www.suse.com
Mozilla: http://www.mozilla.org

Jul 22 '05 #1


Kai-Uwe Bux

Steven T. Hatton wrote:

I've made no secret of the fact that I really dislike the C preprocessor in
C++. No aspect of the language has caused me more trouble. No aspect of
the language has caused more code I've read to be difficult to understand.
I've described it as GOTOs on steroids, and that's what it is!

One argument against abolishing it is that it is useful for conditional
compilation when porting code, etc. Well, it seems to me C++ supports that
natively. According to TC++PL(SE) §24.3.7.2, if a block of code is
bracketed with an if(CONDITION){...}, the entire expression is "compiled
away" when CONDITION==false.

Can this not be used in place of the typical #ifdef ... #endif used for
conditional compilation?

This will work in many places, however, there are some instances where your
idea would not work, namely when the code is not within the body of a
function:

(a) the preprocessor allows you to conditionally include files.
(b) it allows you to change the signature of a function.
(c) it allows you to include or exclude certain members in a class.

There are probably many more.
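
A minimal sketch of points (b) and (c), assuming a made-up build flag TRACK_NAMES and a made-up Widget class (neither appears in the thread). It shows the preprocessor changing a constructor signature and adding a data member, which an if statement inside a function body has no way to express:

```cpp
#include <cassert>
#include <string>

// TRACK_NAMES is a hypothetical build flag; Widget is an illustrative class.
class Widget {
public:
#ifdef TRACK_NAMES
    // With the flag: the constructor takes an extra argument (point b).
    Widget(int id, const std::string& name) : id_(id), name_(name) {}
    const std::string& name() const { return name_; }
#else
    // Without the flag: a different signature and no name at all.
    explicit Widget(int id) : id_(id) {}
#endif
    int id() const { return id_; }

private:
    int id_;
#ifdef TRACK_NAMES
    std::string name_;   // the class gains a member (point c)
#endif
};

// Callers need the same switch, which is how this style tends to spread.
inline Widget make_widget(int id) {
    assert(id >= 0);     // illustrative precondition
#ifdef TRACK_NAMES
    return Widget(id, "unnamed");
#else
    return Widget(id);
#endif
}
```

The helper make_widget is only there to show that code using a conditionally shaped class usually ends up conditional as well.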
Best

Kai-Uwe Bux

Jul 22 '05 #2

lilburne

Steven T. Hatton wrote:

I've made no secret of the fact that I really dislike the C preprocessor in
C++. No aspect of the language has caused me more trouble. No aspect of
the language has caused more code I've read to be difficult to understand.
I've described it as GOTOs on steroids, and that's what it is!

One argument against abolishing it is that it is useful for conditional
compilation when porting code, etc. Well, it seems to me C++ supports that
natively. According to TC++PL(SE) §24.3.7.2, if a block of code is
bracketed with an if(CONDITION){...}, the entire expression is "compiled
away" when CONDITION==false.

Can this not be used in place of the typical #ifdef ... #endif used for
conditional compilation?

#if defined(DEVELOPMENT) && defined(USE_DEBUG)
#define DEBUG_FUNC(FUNC) FUNC
#else
/* also covers builds where DEVELOPMENT is not defined at all */
#define DEBUG_FUNC(FUNC)
#endif

int sort(class Collection& col, sortType type)
{
// sort the collection some how
...

DEBUG_FUNC(check_sorted(col,type));
}

is less clutter than:

int sort(class Collection& col, sortType type)
{
// sort the collection some how
...

if (DEBUG_FUNC) {
check_sorted(col,type);
}
}

and you still have the problem of how to get the compiler to know that
DEBUG_FUNC is always false?

Then there is also <cassert> or your favourite replacement.

Jul 22 '05 #3

Karl Heinz Buchegger

"Steven T. Hatton" wrote:

I've made no secret of the fact that I really dislike the C preprocessor in
C++. No aspect of the language has caused me more trouble. No aspect of
the language has caused more code I've read to be difficult to understand.
I've described it as GOTOs on steroids, and that's what it is!

One argument against abolishing it is that it is useful for conditional
compilation when porting code, etc. Well, it seems to me C++ supports that
natively. According to TC++PL(SE) §24.3.7.2, if a block of code is
bracketed with an if(CONDITION){...}, the entire expression is "compiled
away" when CONDITION==false.

But it nevertheless runs through the compiler.

Can this not be used in place of the typical #ifdef ... #endif used for
conditional compilation?

Not in all cases.

Consider:
You want to write a function that deals with directories. On one
system the calls for this are called, e.g., GetFirstFile and GetNextFile.
On another system the very same calls are called, e.g., GetFirst and GetNext.
On a third system the whole mechanism works in a completely different
way.
So the point here is: while on system A there are functions GetFirstFile and
GetNextFile, those functions aren't even available on system B or system C. So
you need a way to 'hide' those function calls from the compiler.

#if defined(SYSTEM_A)

  i = GetFirstFile( Directory );
  while( i ) {
    i = GetNextFile( FileName );
    Process( FileName );
  }

#elif defined(SYSTEM_B)

  i = GetFirst( Directory );
  while( i ) {
    i = GetNext( FileName );
    Process( FileName );
  }

#elif defined(SYSTEM_C)

  ReadDirectory( Directory, DirStruct );
  for( i = 0; i < DirStruct.NrEntries; ++i )
    Process( DirStruct.File[i] );

#endif

When compiling for SYSTEM_A you actually need the compiler to not even
*see* the code for the implementations in SYSTEM_B or SYSTEM_C, since SYSTEM_A
simply doesn't have those functions available.
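
The difference can be sketched in a few lines. This is only an illustration under assumptions: kUseSystemA and directory_backend are invented names, and the strings stand in for the real platform calls. A branch guarded by a constant is still parsed and name-checked, while a branch removed by #if never reaches the compiler:

```cpp
#include <cassert>
#include <string>

// kUseSystemA and directory_backend are hypothetical names for this sketch.
const bool kUseSystemA = false;

std::string directory_backend(const std::string& directory) {
    assert(!directory.empty());   // illustrative precondition

    if (kUseSystemA) {
        // This branch can never run, yet its contents are still compiled.
        // Uncommenting the next line breaks the build on every platform
        // that does not declare GetFirstFile:
        // GetFirstFile(directory.c_str());
    }

#if defined(SYSTEM_A)
    return "GetFirstFile/GetNextFile";   // seen only when SYSTEM_A is defined
#elif defined(SYSTEM_B)
    return "GetFirst/GetNext";
#else
    return "generic";                    // fallback when no macro is set
#endif
}
```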

--
Karl Heinz Buchegger
kb******@gascad.at

Jul 22 '05 #4

Steven T. Hatton

lilburne wrote:

int sort(class Collection& col, sortType type)
{
// sort the collection some how
...

DEBUG_FUNC(check_sorted(col,type));
}

is less clutter than:

int sort(class Collection& col, sortType type)
{
// sort the collection some how
...

if (DEBUG_FUNC) {
check_sorted(col,type);
}

This form is more consistent with C++, more immediately intelligible, and
doesn't require the code to be modified behind my back. If you are really
concerned about clutter you could do this:

if (DEBUG_FUNC) {check_sorted(col,type);}

Or depending on its return type:

if (DEBUG_FUNC && check_sorted(col, type)) {}

and you still have the problem of how to get the compiler to know that
DEBUG_FUNC is always false?

const bool DEBUG_FUNC = false;

Then there is also <cassert> or your favourite replacement.

I'm not really sure why I need those. What does assert give me that I can't
get from native C++? If the sensible thing were done, there would be a way
to deterministically resolve names in programs to declarations and definitions
in libraries as needed without the use of header files. Such a mechanism
could be used to locate the definition of DEBUG_FUNC wherever it may be
relative to the code you are compiling. I'm assuming it is a C++ constant,
and not a macro mangle. But as a general rule, I dislike the use of
globals, be they functions, classes, variables, or constants. I would not
want to encourage such practices as a general means of changing the
character of a program.
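
As a rough sketch of the constant-based approach, assuming an invented namespace build_config and a simplified check_sorted that works on a vector of ints rather than the Collection type from the earlier post:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

namespace build_config {
    // Hypothetical switch; a real project would define it in one place.
    const bool debug_checks = true;
}

// Simplified stand-in for check_sorted from the thread.
bool check_sorted(const std::vector<int>& v) {
    for (std::size_t i = 1; i < v.size(); ++i)
        if (v[i - 1] > v[i]) return false;
    return true;
}

// Returns the number of elements sorted, or -1 if the debug check fails.
int sort_values(std::vector<int>& v) {
    std::sort(v.begin(), v.end());
    // Plain C++: a compiler may drop this branch when debug_checks is
    // false, but check_sorted must still be declared and must still compile.
    if (build_config::debug_checks && !check_sorted(v)) {
        return -1;
    }
    return static_cast<int>(v.size());
}
```

Scoping the constant in a namespace keeps it out of the global namespace, though the question of where its value comes from for each build remains open.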

I really wish someone with the power to influence what goes into the
Standard would identify a means to accomplish the few things of value that
the cpp provides in a way that is more deterministic and coherent.

I've come to believe the greatest advantage Java has over C++ regarding ease
of use is not introspection, not garbage collection, not the elimination of
user manipulated pointers, not the restriction against using assignment as
a boolean condition, not the lack of MI, not the simpler and more uniform
syntax, not the extensive set of easy to use libraries. What Java has
over C++ is that it doesn't use the CPP. And it has, hands down, a better
exception handling mechanism. It is simply too hard to build tools that
can evaluate your code in the context of your development environment with
the incoherence introduced by the CPP.

Stroustrup's comment in §24.3.7.2 regarding the use of NDEBUG in conjunction
with /assert/ is: "Like all macro magic, this use of NDEBUG is too
low-level, messy, and error-prone."

A simple example of what is fundamentally wrong with C++'s reliance on the
#include <header> approach is when I tried using /size_t/ in a translation
unit that didn't #include anything else from the Standard Library. It was
undefined. I found it by looking it up in ISO/IEC 14882:2003, where I
discovered it is defined in <cstddef>. The fact that /size_t/ had
always been available without my #including <cstddef> means there are
multiple paths through the headers that can result in such identifiers
being introduced into my code silently. That is a bad thing. It leads to
dependencies on things I am not aware of. THAT DOESN'T SCALE!
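
A small sketch of the explicit form, with count_nonzero as an invented example function. Including <cstddef> directly means the translation unit no longer relies on some other header happening to pull the name in:

```cpp
#include <cstddef>   // the header the standard names for std::size_t

// count_nonzero is an illustrative function, not from the thread.
std::size_t count_nonzero(const int* data, std::size_t n) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i)
        if (data[i] != 0) ++count;
    return count;
}
```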
--
STH
Hatton's Law: "There is only One inviolable Law"
KDevelop: http://www.kdevelop.org SuSE: http://www.suse.com
Mozilla: http://www.mozilla.org

Jul 22 '05 #5

lilburne

Steven T. Hatton wrote:

lilburne wrote:

int sort(class Collection& col, sortType type)
{
// sort the collection some how
...

DEBUG_FUNC(check_sorted(col,type));
}

is less clutter than:

int sort(class Collection& col, sortType type)
{
// sort the collection some how
...

if (DEBUG_FUNC) {
check_sorted(col,type);
}

This form is more consistent with C++, more immediately intelligible, and
doesn't require the code to be modified behind my back. If you are really
concerned about clutter you could do this:

if (DEBUG_FUNC) {check_sorted(col,type);}

Or depending on its return type:

if (DEBUG_FUNC && check_sorted(col, type)) {}

To say that the rewrite you have given reduces clutter, you can't have
had the privilege of looking at functions that have these
DEBUG_FUNC(...) statements every two or three lines.

and you still have the problem of how to get the compiler to know that
DEBUG_FUNC is always false?

const bool DEBUG_FUNC = false;

How do you get that into the program? By having different headers, or by
editing a header? If the former, how do you choose between the headers?
If the latter, how do you deal with the recompilation of 100s of source files?

Then there is also <cassert> or your favourite replacement.

I'm not really sure why I need those. What does assert give me that I can't
get from native C++?

Earlier you were discussing rectangles and points. The nearest thing we
have is a 3d box. This is the code for returning what we consider the
maximum point of a 3d box, which is pretty typical of our 'one-line'
functions:

Point3D Box3D::max() const
{
ASSERT_STATE(!invalid(),"Box3D::max");
ASSERT_STATE(m_arr[1][0] < infinity(),"Box3D::max | infinite");
ASSERT_STATE(m_arr[1][1] < infinity(),"Box3D::max | infinite");
ASSERT_STATE(m_arr[1][2] < infinity(),"Box3D::max | infinite");
return Point3D(m_arr[1][0],m_arr[1][1],m_arr[1][2]);
}
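
The thread does not show how ASSERT_STATE is defined. The following is only a guess at the general shape of such a macro, with invented names (g_state_failures, report_state_failure, Interval) and a reporting function that counts failures instead of aborting:

```cpp
#include <cstdio>

// Everything here is an assumption; the real ASSERT_STATE is not shown.
static int g_state_failures = 0;

inline void report_state_failure(const char* expr, const char* where) {
    ++g_state_failures;
    std::fprintf(stderr, "state check failed: %s (%s)\n", expr, where);
}

#ifndef NDEBUG
#define ASSERT_STATE(cond, where) \
    ((cond) ? (void)0 : report_state_failure(#cond, where))
#else
#define ASSERT_STATE(cond, where) ((void)0)   // release build: no code at all
#endif

// One-dimensional stand-in for Box3D, for illustration only.
struct Interval {
    double lo, hi;
    bool invalid() const { return lo > hi; }
    double max() const {
        ASSERT_STATE(!invalid(), "Interval::max");
        return hi;
    }
};
```

The macro form matters here because the condition is neither evaluated nor even compiled when NDEBUG is defined, and the #cond stringizing has no direct equivalent in an ordinary function.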

Stroustrup's comment in §24.3.7.2 regarding the use of NDEBUG in conjunction
with /assert/ is: "Like all macro magic, this use of NDEBUG is too
low-level, messy, and error-prone."

Anyone not using /assert/ has messy, error-prone code.

A simple example of what is fundamentally wrong with C++'s reliance on the
#include <header> approach is when I tried using /size_t/ in a translation
unit that didn't #include anything else from the Standard Library. It was
undefined. I found it by looking it up in ISO/IEC 14882:2003, where I
discovered it is defined in <cstddef>. The fact that /size_t/ had
always been available without my #including <cstddef> means there are
multiple paths through the headers that can result in such identifiers
being introduced into my code silently. That is a bad thing. It leads to
dependencies on things I am not aware of. THAT DOESN'T SCALE!

Well, there are many applications that contain 1,000,000s of
lines of code. The application I work on has about 10,000,000. Defines
in header files and conditional compilations really aren't a problem.

Jul 22 '05 #6

Steven T. Hatton

Karl Heinz Buchegger wrote:

"Steven T. Hatton" wrote:

I've made no secret of the fact that I really dislike the C preprocessor in
C++. No aspect of the language has caused me more trouble. No aspect of
the language has caused more code I've read to be difficult to understand.
I've described it as GOTOs on steroids, and that's what it is!

One argument against abolishing it is that it is useful for conditional
compilation when porting code, etc. Well, it seems to me C++ supports that
natively. According to TC++PL(SE) §24.3.7.2, if a block of code is
bracketed with an if(CONDITION){...}, the entire expression is "compiled
away" when CONDITION==false.

But it nevertheless runs through the compiler.

It would appear that is the case with g++. It doesn't simply ignore
unreachable code. I know Stroustrup and some others have been playing with
a design for some kind of macro firewall that is supposed to isolate code
from any effects of the preprocessor. But that can't be completely
effective. The CPP is capable of rewriting any code not exclusively
protected, and if you depend on any of the stuff it mangles, all bets are
off.

I fully understand that the CPP has been used to do some great stuff. The
damn thing's been around since the mid 70s or so. It couldn't have
survived this long if it wasn't useful. I probably have most of my major
problems with understanding what it's doing to me under normal
circumstances behind me. OTOH, if Stroustrup has been stung by other
people's macros mangling his code, I'm sure it can and will happen to a
mere mortal like me.

Not only that, but take a look at a translation unit some time. I was in
shock when I saw everything the CPP sucked in to build a tiny little file.
There's got to be a better way!

--
STH
Hatton's Law: "There is only One inviolable Law"
KDevelop: http://www.kdevelop.org SuSE: http://www.suse.com
Mozilla: http://www.mozilla.org

Jul 22 '05 #7

Karl Heinz Buchegger

"Steven T. Hatton" wrote:

I fully understand that the CPP has been used to do some great stuff. The
damn thing's been around since the mid 70s or so. It couldn't have
survived this long if it wasn't useful. I probably have most of my major
problems with understanding what it's doing to me under normal
circumstances behind me.

My understanding is this:
The CPP is nothing other than a glorified text editor that runs before
the actual compiler even sees the source code. The tricky thing is:
The text editor is controlled by a programming language and the tricky
part is that the statements of that programming language are embedded
in the text to edit itself.

--
Karl Heinz Buchegger
kb******@gascad.at

Jul 22 '05 #8

Dietmar Kuehl

"Steven T. Hatton" <su******@setidava.kushan.aa> wrote:

One argument against abolishing it it that it is useful for conditional
compilation when porting code, etc. Well, it seems to me C++ supports that
natively. According to TC++PL(SE) §24.3.7.2 if a block of code is
bracketted with an if(CONDITION){...} the entire expression is "compiled
away" when CONDITION==false.

However, the body of the function is still analysed for correctness,
i.e. it is just an optimization, not really conditional compilation. To some
extent you can use templates for conditional compilation: a template which
is not instantiated is, to some degree, not checked for correctness.
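
A small sketch of the template point, using invented types DirListing and OtherListing. Expressions that depend on the template parameter are checked only for the types the template is actually instantiated with; names that do not depend on the parameter are still checked when the template is defined, which is the "to some degree" caveat:

```cpp
// Two illustrative directory types; only one declares NrEntries.
struct DirListing   { int NrEntries; };
struct OtherListing { int count; };

// d.NrEntries depends on Dir, so it is checked per instantiation only.
template <class Dir>
int entry_count(const Dir& d) {
    return d.NrEntries;
}

int listed_entries() {
    DirListing d = { 3 };
    // Instantiates entry_count<DirListing> and nothing else.
    // entry_count(OtherListing()) would be an error, but only if written.
    return entry_count(d);
}
```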
--
<mailto:di***********@yahoo.com> <http://www.dietmar-kuehl.de/>
<http://www.contendix.com> - Software Development & Consulting

Jul 22 '05 #9

Thomas Matthews

lilburne wrote:

Steven T. Hatton wrote:
lilburne wrote:

Well, there are many applications that contain 1,000,000s of
lines of code. The application I work on has about 10,000,000. Defines
in header files and conditional compilations really aren't a problem.

On large programs that produce many combinations, the
defines become a large problem. I've worked on projects
where the OS command line was not large enough to support
all the #defines.

I prefer using standard interfaces and letting the linker
or build script choose which source file to implement
the interface with.

For example, my project at work has a lot of
#ifdef PLATFORM_IS_BIG_ENDIAN
#else
#endif
for converting message fields to the native byte ordering.

I would rather have a single function:
value = Apply_Endian_Conversion(old_value);
and have different implementations that the build
script chooses. IMHO, this technique provides cleaner
code (easier to read) and fewer compilation and
porting problems.
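
A sketch of how that might look, assuming a 32-bit field and a hypothetical file layout in which the build script links exactly one of two implementation files. Only the byte-swapping variant is shown; the other variant would simply return its argument:

```cpp
#include <cstdint>

// Shared declaration, e.g. in a header that every client includes.
std::uint32_t Apply_Endian_Conversion(std::uint32_t value);

// Hypothetical endian_swap.cpp: linked in when wire and host byte order
// differ. The alternative file would just return value unchanged.
std::uint32_t Apply_Endian_Conversion(std::uint32_t value) {
    return ((value & 0x000000FFu) << 24) |
           ((value & 0x0000FF00u) << 8)  |
           ((value & 0x00FF0000u) >> 8)  |
           ((value & 0xFF000000u) >> 24);
}
```

Client code contains no #ifdef at all; the platform decision lives in the build description.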

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.comeaucomputing.com/learn/faq/
Other sites:
http://www.josuttis.com -- C++ STL Library book

Jul 22 '05 #10

lilburne

Thomas Matthews wrote:

lilburne wrote:

Steven T. Hatton wrote:
lilburne wrote:
Well, there are many applications that contain 1,000,000s of
lines of code. The application I work on has about 10,000,000. Defines
in header files and conditional compilations really aren't a problem.

On large programs that produce many combinations, the
defines become a large problem. I've worked on projects
where the OS command line was not large enough to support
all the #defines.

I didn't suggest that sticking them all on the command line was a good idea:
-DTHIS -DTHAT -DOTHER etc. Yeuck.

I prefer using standard interfaces and letting the linker
or build script choose which source file to implement
the interface with.

For example, my project at work has a lot of
#ifdef PLATFORM_IS_BIG_ENDIAN
#else
#endif
for converting message fields to the native byte ordering.

I would rather have a single function:
value = Apply_Endian_Conversion(old_value);
and have different implementations that the build
script chooses. IMHO, this technique provides cleaner
code (easier to read) and fewer compilation and
porting problems.

Well yes we have wrapper functions for platform specific functions so
that all the client code has to do is:

value = Apply_Endian_Conversion(old_value);

such functions don't crop up that often, and once written few ever
venture into them again.

Mostly in application code we have

#ifdef WIN32K
#else
#endif

but even that doesn't occur much outside of code that deals with widgets
and forms, and most of that is wrapped too.

Jul 22 '05 #11

Steven T. Hatton

At this point I have to acknowledge that, to achieve my objectives, C++
would need to provide a native means of conditional compilation different
from simply excluding blocks on the basis of some boolean constant whose
value is known at compile time. Either that, or the programmer would need
to predeclare all identifiers, even ones not relevant to the current target
platform. And could not have code that required the compiler to know the
definition at compile time.

There are other issues involved with the CPP that are worth considering.
I've discussed some of them in the following.

Karl Heinz Buchegger wrote:

"Steven T. Hatton" wrote:

I fully understand that the CPP has been used to do some great stuff. The
damn thing's been around since the mid 70s or so. It couldn't have
survived this long if it wasn't useful. I probably have most of my major
problems with understanding what it's doing to me under normal
circumstances behind me.

My understanding is this:
The CPP is nothing other than a glorified text editor that runs before
the actual compiler even sees the source code. The tricky thing is:
The text editor is controlled by a programming language and the tricky
part is that the statements of that programming language are embedded
in the text to edit itself.

All of this is correct. But I'm not sure that's the most problematic aspect
of the CPP. Though the CPP and its associated preprocessor directives do
constitute a very simple language (nowhere near the power of sed or awk),
it obscures the concept of translation unit by supporting nested #includes.
When a person is trying to learn C++, the additional complexity can
obfuscate C++ mistakes. It's hard to determine if certain errors are CPP
or C++ related.

IMO, the CPP (#include) is a workaround to compensate for C++'s failure to
specify a mapping between identifiers used within a translation unit and
the declarations and definitions they refer to.

As an example let's consider the source in the examples from
_Accelerated_C++:_Practical_Programming_by_Example_ by Andrew Koenig and
Barbara E. Moo:

http://acceleratedcpp.com/

// unix-source/chapter03/avg.cc
#include <iomanip>
#ifndef __GNUC__
#include <ios>
#endif
#include <iostream>
#include <string>

using std::cin; using std::setprecision;
using std::cout; using std::string;
using std::endl; using std::streamsize;

I chose to use this as an example because it's done right (with the
exception that the code should have been in a namespace.) All identifiers
from the Standard Library are introduced into the translation unit through
using declarations. Logically, the using declaration provides enough
information to deterministically map between an identifier and the
declaration it represents in the Standard Library. The #include CPP
directives are necessary because ISO/IEC 14882 doesn't require the
implementation to resolve these mappings. I believe - and have suggested
on comp.std.c++ - that it should be the job of the implementation to
resolve these mappings.

Now a tricky thing that comes into play is the relationship between
declaration and definition. I have to admit that falls into the category
of religious faith for me. Under most circumstances, it simply works, when
it doesn't I play with the environment variables, and linker options until
something good happens.

I believe what is happening is this: When I compile a program with
declarations in the header files I've #included somewhere in the whole
mess, the compiler can do everything that doesn't require allocating memory
without knowing the definitions associated with the declarations.
(by /compiler/ I mean the entire CPP, lexer, parser, compiler and linker
system) When it comes time to use the definition which is contained in a
source file, the source file has to be available to the compiler either
directly, or through access to an object file produced by compiling the
defining source file.

For example, if I try to compile a program with all declarations in header
files which are #included in appropriate places in the source, but neglect
to name one of the defining source files on the command line that initiates
the compilation, the program will "compile" but fail to link. This results
in a somewhat obscure message about an undefined reference to something
named in the source. I believe that providing the object file resulting
from compiling the defining source, rather than that defining source
itself, will solve this problem.
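
The split between declaration and definition can be sketched in one listing. The three file names in the comments are hypothetical; here everything sits in a single translation unit so that it links:

```cpp
// geometry.hh (hypothetical): a declaration is all that the caller's
// translation unit needs in order to compile.
double area(double width, double height);

// main.cc (hypothetical): compiles against the declaration alone.
double room_area() {
    return area(3.0, 4.0);
}

// geometry.cc (hypothetical): the definition. Leave this file, or the
// object file built from it, off the link line and the linker reports an
// undefined reference to area.
double area(double width, double height) {
    return width * height;
}
```

Supplying either the defining source file or the object file compiled from it gives the linker the definition it needs.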

The counterpart to this in Java is accomplished using the following:

* import statement

* package name

* directory structure in identifier semantics

* classpath

* javap

* commandline switches to specify source locations

Mapping this to C++ seems to go as follows:

* import statement

This is pretty much the same as a combination of a using declaration and
a #include. A Java import statement looks like this:

import org.w3c.dom.Document;

In C++ that translates into something like:

#include <org/w3c/dom/Document.hh>
using org::w3c::dom::Document;

* package name

This is roughly analogous to the C++ namespace, and is intended to support
the same concept of component that C++ namespaces are intended to support.
In Java there is a direct mapping between file names and package names.
For example, if your source files are rooted at /java/source/
(c:\java\source) and you have a package named org.w3c.dom, the name of the file
containing the source for org.w3c.dom.Document will
be /java/source/org/w3c/dom/Document.java. Using good organizational
practices, a programmer will have his compiled files placed in another,
congruent, directory structure, e.g., /java/classes/ is the root of the
class file hierarchy, and the class file produced by
compiling /java/source/org/w3c/dom/Document.java will reside
in /java/classes/org/w3c/dom/Document.class. This is analogous to placing
C++ library files in /usr/local/lib/org/w3c/dom
_and_ /usr/local/include/org/w3c/dom.

* directory structure in identifier semantics

In Java the location of the root of the class file hierarchy is communicated
to the java compiler, and JVM using the $CLASSPATH variable. In C++ (g++)
the same is accomplished using various variables such as $INCLUDE_PATH
(IIRC) $LIBRARY_PATH $LD_LIBRARY_PATH and -L -I -l switches on the
compiler.

Once Java knows where the root of the class file hierarchy is, it can find
individual class files based on the fully qualified identifier name. For
example:

import org.w3c.dom.Document;

means go find $CLASSPATH/org/w3c/dom/Document.class

The C++ Standard does not specify any mapping between file names and
identifiers. In particular, it does not specify a mapping between
namespaces and directories. Nor does it specify a mapping between class
names and file names.

* classpath

As discussed above the $CLASSPATH is used to locate the roots of directory
hierarchies containing the compiled Java 'object' files. To the compiler,
this functions similarly to the use of $LIBRARY_PATH for g++. It also
provides the service that the -I <path/to/include> serves in g++

* javap

The way the include path functionality of C++ is supported in Java is
through the use of the same mechanism that enables javap to provide the
interface for a given Java class.

For example:

Thu Aug 19 09:40:27:> javap org.w3c.dom.Document
Compiled from "Document.java"
interface org.w3c.dom.Document extends org.w3c.dom.Node{
public abstract org.w3c.dom.DOMImplementation getImplementation();
...
public abstract org.w3c.dom.Attr createAttribute(java.lang.String);
throws org/w3c/dom/DOMException
....
}

What Javap tells me about a Java class is very similar to what I would want
a header file to tell me about a C++ class.

* commandline switches to specify source locations

This was tacked on for completeness. Basically, it means I can tell javac
what classpath and source path to use when compiling. If a class isn't
defined in the source files provided, then it must be available in compiled
form in the class path.
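
To make the mapping concrete, here is a sketch of what the C++ side of the import example might look like if namespaces mirrored directories. The Document class body is a placeholder invented for illustration, not the real DOM interface, and the header path in the comment is the convention proposed above rather than anything the Standard specifies:

```cpp
#include <string>

// Under the proposed layout this would live in
// <include root>/org/w3c/dom/Document.hh.
namespace org { namespace w3c { namespace dom {
    class Document {
    public:
        explicit Document(const std::string& root) : root_(root) {}
        const std::string& rootName() const { return root_; }
    private:
        std::string root_;
    };
} } }

// Client side: the #include would locate the file, and the using
// declaration names the entity. Only the second step is visible here.
using org::w3c::dom::Document;

std::string describe(const Document& doc) {
    return "root element: " + doc.rootName();
}
```

In Java the single import line does both jobs, because the qualified name already determines where the class file is found.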

One final feature of Java which makes life much easier is the use of .jar
files. A C++ analog would be to create a tar file containing object files
and associated header files that compilers and linkers could use by
having them specified on the commandline or in an environment variable.

I know there are C++ programmers reading this and thinking it is blasphemous
to even compare Java to C++. My response is that Java was built using C++
as a model. The mechanisms described above are, for the most part, simply
a means of accomplishing the same thing that the developers of Java had
been doing by hand with C and C++ for years. There is nothing internal to
the Java syntax other than the mapping between identifier names and file
names that this mechanism relies on. This system works well. The only
thing preventing such an approach from becoming part of the C++ standard is
inertia, and the reluctance of C++ programmers to consider that there may
be better ways of doing things.

The world will be a better place when there is such a thing as a C++ .car
file analogous to a Java .jar file. Granted, these will not be binary
compatible from platform to platform, but in many ways that doesn't matter.
--
STH
Hatton's Law: "There is only One inviolable Law"
KDevelop: http://www.kdevelop.org SuSE: http://www.suse.com
Mozilla: http://www.mozilla.org

Jul 22 '05 #12
