
What has C++ become?

I was looking over someone's C++ code today and despite
having written perfectly readable C++ code myself,
the stuff I was looking at was worse than legalese.
The people who are guiding the development of C++
have really made a mess of things, I mean templates
and competing libraries and all that just render the
code impossible to comprehend. Sure there is
going to be a certain amount of complexity,
that's a given, but if code is not readable except by
a kind of clergy then there is something wrong with
the language. Of course, I suppose the code I was
looking at could have been deliberately obfuscated
so that the developer could maintain control over it,
but shouldn't a language (or its libraries) be designed
to prevent that?
Jun 27 '08
Walter Bright <wa****@digitalmars-nospamm.com> writes:
I don't understand why, for instance:

int foo(int x)
{
x = 3 + x;
for (int i = 0; i < 10; i++)
x += 7;
return x;
}

is inappropriate for compile time execution. Of course, it could be
rewritten in FP style, but since I'm programming in C++, why shouldn't
straightforward C++ work?
First, notice that nothing prevents a C or C++ compiler from noticing that
foo is a pure function, and therefore that any call such as:

int y=foo(42);

can be executed at compilation time.
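
(In fairness to the thread's vintage: no 2008 compiler was required to do
this, but the constexpr facility mentioned later in this thread eventually
grew to cover exactly this case. A minimal sketch, assuming a compiler with
C++14's relaxed constexpr rules, well beyond what was on the table here:)

constexpr int foo(int x)
{
    x = 3 + x;
    for (int i = 0; i < 10; i++)
        x += 7;
    return x;
}

// 3 + 42, plus ten increments of 7: the compiler evaluates this itself.
static_assert(foo(42) == 115, "computed at compile time");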
The real reason why you would want to have some code executed at
compilation time is really to do metaprogramming.

The reason code such as the above is inappropriate for compile time, in blub
languages, is that the types and data structures available in those
languages ARE NOT the types and data structures used by the compiler.

Well, you could 'try' to use string^W oops, there's no string data
type in C. Ok, let's try to use std::string in C++:

#include <string>
#include <sstream>

std::string genFun(std::string name, std::string body){
    std::ostringstream s;
    s << "void " << name << "(){" << std::endl;
    s << "cout<<\"Entering " << name << "\"<<endl;" << std::endl;
    s << body << std::endl;
    s << "cout<<\"Exiting " << name << "\"<<endl;" << std::endl;
    s << "}" << std::endl;
    return s.str();
}

and now we'd need some way to hook this function into the compiler,
let's assume a keyword 'macro' to do that:

macro genFun("example","cout<<\"In example function.\"<<endl;")

this would make the compiler run the expression following the keyword,
replace the macro and expression with the resulting string, and
interpret it in place.

Not very nice, is it?

What you are really longing for is Lisp with its s-expressions...

Regardless of the merits of FP programming, regular C++ programming is
not FP and it is not necessary for compile time evaluation to be FP.
Indeed, compiler hooks (macros) and programming style are totally orthogonal.

C++ has opened the door on that, I agree. Where I don't agree is that
C++ has stumbled on the ideal way of doing compile time programming,
That's the least that can be said about it...
or that FP is ideal for it, or that compile time programming should be
in a different language than runtime programming.

I am not an expert in C++ TMP. But when I do examine examples of it,
it always seems like a lot of effort is expended doing rather simple
things.
Indeed.

Furthermore, because every step in evaluating TMP requires the
construction of a unique template instantiation, this puts some rather
onerous memory and time constraints on doing more complicated things
with TMP. Perhaps these are merely technical limitations that will be
surmounted by advancing compiler technology, but I just don't see how
the result could be better than thousands of times slower than
executing the equivalent at run time.
Well, technically, since the solution has been known for about 50
years, these are not technical limitations, but psychological ones.

--
__Pascal Bourguignon__
Jun 27 '08 #51
On Jun 3, 5:39 pm, Noah Roberts <u...@example.net> wrote:
James Kanze wrote:
On Jun 2, 7:27 pm, rpbg...@yahoo.com (Roland Pibinger) wrote:
On Sun, 1 Jun 2008 16:34:58 -0700 (PDT), plenty...@yahoo.com wrote:
I recall having the same experience, the *first* time I
looked at a C program, having before that seen only Pascal,
Modula-2, Basic and assembly. But I've seen C++ many times
now, albeit mostly my own which is deliberately readable.
You can safely ignore this geek style 'template programming'
because it will never reach the mundane area of real-world
programming.
First, you can't ignore anything, because you never know where
it will crop up. And like most things, it will be more or less
readable, depending on who wrote it.
What is true is that at the application level, there is very
little need for meta-programming; it is mostly used in low level
libraries (like the standard library).
Well, first of all, I don't think that the standard library, where it
actually makes use of generic/meta programming techniques, is "low
level". It is very much application level - stacks, lists,
vectors...this isn't hardware talking stuff.
It's not talking to the hardware, but it is still very low
level. A vector is not (usually) an application level
abstraction, but rather a tool used in application level
abstractions.
There is nothing low level about abstract data types. It is
exactly the opposite of low level in my opinion.
It's about the lowest level you can get. What's below it?
Second, I disagree that there's little need for it in the
application level programming. We, where I work, actually use
it a moderate amount and to great advantage. For instance, we
are an engineering firm and use a data type that uses
metaprogramming techniques to provide type safe dimensional
analysis. Since adopting this it has already saved us
numerous man hours in debugging.
But is the meta-programming in the application itself, or in the
lower level tools you use to implement it? (Not that I would
expect much metaprogramming in type safe dimensional analysis.)
We use boost::units and some other stuff that I wrote on top
of it. Other areas it is used is in a variety of generic
functions that use enable_if to choose or disqualify template
instantiations.
So as one that is not afraid of TMP and uses it *in the
application layer* I really have to disagree with those
claiming it has no place there.
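
(For readers unfamiliar with the technique Noah describes, a minimal
enable_if sketch; the names and the arithmetic constraint are illustrative
assumptions, not his actual code:)

#include <boost/utility/enable_if.hpp>
#include <boost/type_traits/is_arithmetic.hpp>

// This overload participates in overload resolution only when T is an
// arithmetic type; for any other T it is silently disqualified (SFINAE).
template <typename T>
typename boost::enable_if<boost::is_arithmetic<T>, T>::type
twice(T x) { return x + x; }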
You can't really use templates too much in the application layer
anyway, because of the coupling they induce (unless all of your
compilers support export). And the whole point about being the
application level is that it is specific to the application;
it's not generic. What makes code the application level is that
it deals with concrete abstractions, like ClientOrder or
BookingInstruction (currently) or IPAddress (where I was
before). Just the opposite of template based generics.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jun 27 '08 #52
Vidar Hasfjord wrote:
Here I assume that you actually propose a very limited subset of
imperative features for compile-time processing; not that the whole
language should be available for processing at compile-time. A subset
of imperative features can be supported, as seen by constexpr function
in C++09 and by CTFE in D, but they are limited and required to live
by the rules of functional programming. For example, compile-time
functions must be 'pure', ie. the result must only depend on the
arguments, the function can have no side-effects and no state can
escape the function. I intuitively think this is good.
I think that a subset approach is fine, but a different language
approach is not. For example, in C++, you cannot write a factorial
function that works at both compile time and run time.
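
(A sketch of the duplication being pointed at, in the C++03 of the day:
the compile-time and run-time factorials must be written twice, in two
quite different styles.)

// Compile-time version: a recursive template metaprogram.
template <unsigned N>
struct Factorial { static const unsigned value = N * Factorial<N - 1>::value; };

template <>
struct Factorial<0> { static const unsigned value = 1; };

// Run-time version: the same function, written again imperatively.
unsigned factorial(unsigned n)
{
    unsigned r = 1;
    for (unsigned i = 2; i <= n; ++i)
        r *= i;
    return r;
}

int table[Factorial<5>::value]; // 120, computed by the compiler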

Or did you actually mean that you think the ideal would be something
akin to Metacode?
I don't know anything about Metacode.
Jun 27 '08 #53
On Jun 5, 1:14 am, Walter Bright <wal...@digitalmars-nospamm.com>
wrote:
Vidar Hasfjord wrote:
[...]
I think that a subset approach is fine, but a different language
approach is not. For example, in C++, you cannot write a factorial
function that works at both compile time and run time.
I agree CTFE is convenient. I view it as part of the reasoning
abilities of the language that allows ordinary code to cross into the
compile-time domain. My observation is that code can only cross into
the compile-time domain when it adheres to principles of functional
programming; that its output is solely dependent on its input and
that it has no external side-effects.

CTFE can't do computation on compile-time entities such as types. To
make imperative meta-programming pervasive you would need further
extensions to the language (such as Metacode).
I don't know anything about Metacode.
Metacode is an experimental extension to C++ to allow imperative meta-
programming and reflection. I don't think it was ever formally
proposed, but you can find a presentation at the ISO site:

http://www.open-std.org/jtc1/sc22/wg...2003/n1471.pdf

Regards,
Vidar Hasfjord
Jun 27 '08 #54
Vidar Hasfjord wrote:
I agree CTFE is convenient. I view it as part of the reasoning
abilities of the language that allows ordinary code to cross into the
compile-time domain. My observation is that code can only cross into
the compile-time domain when it adheres to principles of functional
programming; that its output is solely dependent on its input and
that it has no external side-effects.
I generally agree with that, but I keep finding ways to expand the
domain of CTFE. But I don't think CTFE will ever be doing things like
spawning threads, writing files, etc., and I don't think it should even
if possible (because of security concerns).

CTFE can't do computation on compile-time entities such as types. To
make imperative meta-programming pervasive you would need further
extensions to the language (such as Metacode).
You're quite right, CTFE operates on values, not types. To do types
needs some other facility. In the D programming language, one can create
arrays of types, and then operate on them at compile time using
conventional array notation and operators. This seems to work out rather
nicely, is easy to implement, and sidesteps the need to instantiate
large numbers of template classes.
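
(For contrast, the nearest C++03 analogue to D's arrays of types goes
through library metafunctions rather than ordinary array syntax --- a
minimal sketch using Boost.MPL:)

#include <boost/mpl/vector.hpp>
#include <boost/mpl/push_back.hpp>
#include <boost/mpl/at.hpp>

namespace mpl = boost::mpl;

typedef mpl::vector<int, double> Types;            // a compile-time sequence of types
typedef mpl::push_back<Types, char>::type Types2;  // "append" yields a new sequence
typedef mpl::at_c<Types2, 2>::type Third;          // "index into it": Third is char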

Metacode is an experimental extension to C++ to allow imperative meta-
programming and reflection. I don't think it was ever formally
proposed, but you can find a presentation at the ISO site:

http://www.open-std.org/jtc1/sc22/wg...2003/n1471.pdf
Thanks for the pointer!
Jun 27 '08 #55
On Jun 3, 7:53 pm, Walter Bright <wal...@digitalmars-nospamm.com>
wrote:
Noah Roberts wrote:
Well, I prefer something more like so:
template <typename S, typename T>   // compute index of T in S
struct IndexOf
    : mpl::distance
      <
        typename mpl::begin<S>::type
      , typename mpl::find<S, T>::type
      >
{};
But to each his own. Neither is particularly difficult to understand.

Suppose we could write it as:

int IndexOf(S, T)
{
return distance(begin(S), find(S, T));

}

Would you prefer the latter? I sure would.
I agree that meta-function applications should look more like
traditional function applications. I believe C++09 will allow this
short and sweet style:

template <typename S, typename T>
using index_of = distance <begin <S>, find <S, T>>;

("Template aliases for C++", N2258)

Regards,
Vidar Hasfjord
Jun 27 '08 #56
James Kanze wrote:
But is the meta-programming in the application itself, or in the
lower level tools you use to implement it?
Well, since you are going to assert an arbitrary point of separation
between the two, seemingly generated solely to support, and dependent on,
your conclusion, obviously only main() counts as application programming and
no, it's not a template meta-program.
(Not that I would
expect much metaprogramming in type safe dimensional analysis.)
I would suggest you go look at boost's, and other versions like Quan, then.

You can't really use templates too much in the application layer
anyway, because of the coupling they induce (unless all of your
compilers support export). And the whole point about being the
application level is that it is specific to the application;
it's not generic. What makes code the application level is that
it deals with concrete abstractions, like ClientOrder or
BookingInstruction (currently) or IPAddress (where I was
before). Just the opposite of template based generics.
Hehehe, where do you get this stuff??

* templates induce coupling? :p
* IPAddress is "application specific"? :p
* You don't use templates in the application layer? :p
Jun 27 '08 #57
Walter Bright wrote:
Vidar Hasfjord wrote:
>Here I assume that you actually propose a very limited subset of
imperative features for compile-time processing; not that the whole
language should be available for processing at compile-time. A subset
of imperative features can be supported, as seen by constexpr function
in C++09 and by CTFE in D, but they are limited and required to live
by the rules of functional programming. For example, compile-time
functions must be 'pure', ie. the result must only depend on the
arguments, the function can have no side-effects and no state can
escape the function. I intuitively think this is good.

I think that a subset approach is fine, but a different language
approach is not. For example, in C++, you cannot write a factorial
function that works at both compile time and run time.
I think that would only introduce confusion, which is already there,
about the difference between the compiler program and the one the
compiler is compiling. Having a strong split, conceptually and
practically, between the two is important.

Furthermore, meta-programming is not conducive to the language proper (as
has been explained). This means that to make them the same language you
would need to push toward the meta-programming model, not the other way.
Personally, I don't want to have to write THAT much code in that
manner. If I did I'd be using LISP or something.
Jun 27 '08 #58
Noah Roberts wrote:
Walter Bright wrote:
>I think that a subset approach is fine, but a different language
approach is not. For example, in C++, you cannot write a factorial
function that works at both compile time and run time.

I think that would only introduce confusion, which is already there,
about the difference between the compiler program and the one the
compiler is compiling. Having a strong split, conceptually and
practically, between the two is important.
Is anyone confused by:

const int X = 5;
int a[X + 3];

? The array dimension is computed at compile time. I don't think it is
necessary to conceptually make a difference.
Furthermore, meta-programming is not conducive to the language proper (as
has been explained). This means that to make them the same language you
would need to push toward the meta-programming model, not the other way.
Personally, I don't want to have to write THAT much code in that
manner. If I did I'd be using LISP or something.
I think we are all so used to the conventional limits of compile time
programming we don't even notice the severe restrictions. I know that
was (and still is) true for me.
Jun 27 '08 #59
In article <48***********************@news.free.fr>,
mi************@free.fr says...

[ ... ]
But multiplication is not expected to be commutative over the matrix
group (or other non-Abelian rings). In math, the + sign is reserved for
commutative operations.
Not so -- a group is defined over an operation. If that operation is not
commutative, the group is non-Abelian. It may be true that study of non-
Abelian groups tends more often to look at multiplication than addition,
but it is not true that the + sign is reserved for commutative
operations. It's often true (probably far more often than not) but not
always.

--
Later,
Jerry.

The universe is a figment of its own imagination.
Jun 27 '08 #60
Noah Roberts <us**@example.net> writes:
Walter Bright wrote:
>Vidar Hasfjord wrote:
>>Here I assume that you actually propose a very limited subset of
imperative features for compile-time processing; not that the whole
language should be available for processing at compile-time. A subset
of imperative features can be supported, as seen by constexpr function
in C++09 and by CTFE in D, but they are limited and required to live
by the rules of functional programming. For example, compile-time
functions must be 'pure', ie. the result must only depend on the
arguments, the function can have no side-effects and no state can
escape the function. I intuitively think this is good.
I think that a subset approach is fine, but a different language
approach is not. For example, in C++, you cannot write a factorial
function that works at both compile time and run time.

I think that would only introduce confusion, which is already there,
about the difference between the compiler program and the one the
compiler is compiling. Having a strong split, conceptually and
practically, between the two is important.
As has been shown by 50 years of metaprogramming in lisp.

Furthermore, meta-programming is not conducive to the language proper
(as has been explained). This means that to make them the same
language you would need to push toward the meta-programming model, not
the other way. Personally, I don't want to have to write THAT much
code in that manner. If I did I'd be using LISP or something.
Why not?

Why would you want to automate the job of accountants or graphic artists,
but not your own?

--
__Pascal Bourguignon__
Jun 27 '08 #61
On Jun 5, 5:58 pm, Noah Roberts <u...@example.net> wrote:
James Kanze wrote:
But is the meta-programming in the application itself, or in the
lower level tools you use to implement it?
Well, since you are going to assert an arbitrary point of
separation between the two, seemingly generated solely to support,
and dependent on, your conclusion, obviously only main() counts as
application programming and no, it's not a template
meta-program.
(Not that I would expect much metaprogramming in type safe
dimensional analysis.)
I would suggest you go look at boost's, and other versions
like Quan, then.
And where would they use meta-programming, except for
obfuscation? (Not all templates are metaprogramming.)
You can't really use templates too much in the application
layer anyway, because of the coupling they induce (unless
all of your compilers support export). And the whole point
about being the application level is that it is specific to
the application; it's not generic. What makes code the
application level is that it deals with concrete
abstractions, like ClientOrder or BookingInstruction
(currently) or IPAddress (where I was before). Just the
opposite of template based generics.
Hehehe, where do you get this stuff??
Practical experience.
* templates induce coupling? :p
And how, unless all of your compilers support export.
* IPAddress is "application specific"? :p
It was in my application (dynamic allocation of IP addresses).
* You don't use templates in the application layer? :p
They've been banned at the higher levels in most coding
guidelines I've seen, because of the coupling problems they
induce.

It's just a question of good software engineering. You don't
introduce complexity (or coupling) where it isn't needed. You
don't throw in or use features just because they're the in
thing. Templates (like many other things) have a cost. With
most current compilers, part of that cost is a significant
increase in coupling, which becomes very expensive the higher up
toward the application level you go. Whereas the benefits of
templates are mostly present at the lower levels. As always,
one might find some exceptions, but in general, templates
aren't used much at the application level when engineering
criteria are used to decide.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jun 27 '08 #62

"James Kanze" <ja*********@gmail.comwrote in message
news:35b4bd62-2e24-4560-9fc6-
They've been banned at the higher levels in most coding
guidelines I've seen, because of the coupling problems they
induce.
Interesting. std::string being a typedef for a class template, similarly
std::vector, std::list, std::map, std::set.

Are all these banned?

regards
Andy Little

Jun 27 '08 #63
James Kanze wrote:
>* templates induce coupling? :p

And how, unless all of your compilers support export.
Header dependencies and code coupling are very, very different things.
Templates *reduce* coupling.
Jun 27 '08 #64
Pascal J. Bourguignon wrote:
Noah Roberts <us**@example.net> writes:
>Furthermore, meta-programming is not conducive to the language proper
(as has been explained). This means that to make them the same
language you would need to push toward the meta-programming model, not
the other way. Personally, I don't want to have to write THAT much
code in that manner. If I did I'd be using LISP or something.

Why not?

Why would you want to automate the job of accountants or graphic artists,
but not your own?
You have a point, and I do where I can, but having to work without
things like assignment is difficult for me. The arguments for it have
been pretty good in this thread so I don't think that's something that
should change. So maybe it's laziness, maybe it's prudence...who knows.
Jun 27 '08 #65
On Jun 6, 3:36 pm, "kwikius" <a...@servocomm.freeserve.co.uk> wrote:
"James Kanze" <james.ka...@gmail.com> wrote in message
news:35b4bd62-2e24-4560-9fc6-
They've been banned at the higher levels in most coding
guidelines I've seen, because of the coupling problems they
induce.
Interesting. std::string being a typedef for a class template, similarly
std::vector, std::list, std::map, std::set.
Are all these banned?
If you consider them application level code, yes. In the places
I've worked, they've been considered part of the standard
library, and application programmers weren't allowed to modify
them.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jun 27 '08 #66
On Jun 6, 5:12 pm, Noah Roberts <u...@example.net> wrote:
James Kanze wrote:
* templates induce coupling? :p
And how, unless all of your compilers support export.
Header dependencies and code coupling are very, very different things.
They're related, but yes: I should have made it clear that I was
talking about compiler dependencies, and not design coupling.
Templates *reduce* coupling.
They can be used for design decoupling, especially in lower
level software. It's not automatic, though; a poorly designed
template can also increase coupling.

The important thing to realise is that they're a tool. Like
most (or even all) tools, they have a cost. If the advantages
of using the tool outweigh the cost, then you should use it. If
they don't, then you shouldn't.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Jun 27 '08 #67
On Jun 6, 6:04 pm, James Kanze <james.ka...@gmail.com> wrote:
On Jun 6, 3:36 pm, "kwikius" <a...@servocomm.freeserve.co.uk> wrote:
"James Kanze" <james.ka...@gmail.com> wrote in message
news:35b4bd62-2e24-4560-9fc6-
They've been banned at the higher levels in most coding
guidelines I've seen, because of the coupling problems they
induce.
Interesting. std::string being a typedef for a class template, similarly
std::vector, std::list, std::map, std::set.
Are all these banned?

If you consider them application level code, yes. In the places
I've worked, they've been considered part of the standard
library, and application programmers weren't allowed to modify
them.
There's a difference between application development and library
development. I have done both.

At the application level in quan, my physical quantities library (and
I use quan a lot in my own applications), only typedefs are used for
common quantities, which are provided by the library. The format is:

quan::length::mm x;

quan::force_per_length::kN_per_m F1;

(I hope the intended quantities and units are obvious)

This is entirely similar to std::string, except that there are a large
number of quantities. Nevertheless no template parameters are used in
my own application code, though underneath there is a large amount of
template machinery.
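
(A hedged illustration of what that underlying machinery buys; the
value-construction syntax below is an assumption for the sake of the
example, not taken from quan's documentation:)

quan::length::mm x(100.0);     // assumed construction-from-value syntax
quan::length::mm y = x;        // same quantity, same unit: fine
// quan::force_per_length::kN_per_m F = x;  // would not compile:
//                                          // a length is not a force per length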

From time to time I see pronouncements that templates are for experts
only; however, this is really FUD, as the only way to become an expert is
to start as a non-expert.

regards
Andy Little

Jun 27 '08 #68
James Kanze wrote:
On Jun 6, 5:12 pm, Noah Roberts <u...@example.net> wrote:
>James Kanze wrote:
>>>* templates induce coupling? :p
>>And how, unless all of your compilers support export.
>Header dependencies and code coupling are very, very different things.

They're related, but yes: I should have made it clear that I was
talking about compiler dependencies, and not design coupling.
Just to clarify, your objections are practical (tool limitations) rather
than philosophical?

If that is the case and you can't get a better hammer, use a bigger one.

I like to include build times as part of my project requirements
(and yes, I do test it!). If the build times get too long, treat this
like any other design issue. Weigh the time/cost of design changes to
the code against design changes to the build environment. On past
projects, adding another machine to the build farm has been the more
cost effective option. This is probably more typical today with
plummeting hardware costs and rising labour costs.

--
Ian Collins.
Jun 27 '08 #69
On Jun 7, 11:08 pm, Ian Collins <ian-n...@hotmail.com> wrote:
James Kanze wrote:
On Jun 6, 5:12 pm, Noah Roberts <u...@example.net> wrote:
James Kanze wrote:
* templates induce coupling? :p
>And how, unless all of your compilers support export.
Header dependencies and code coupling are very, very
different things.
They're related, but yes: I should have made it clear that I
was talking about compiler dependencies, and not design
coupling.
Just to clarify, your objections are practical (tool
limitations) rather than philosophical?
My objections are always practical, rather than philosophical.
I'm a practicing programmer, not a philosopher. Using templates
today has a very definite cost.
If that is the case and you can't get a better hammer, use a
bigger one.
In other words, C++ isn't the language I should be using for
large applications? From what I can see, it's not really a very
good language, but all of the others are worse.

Note that the standard actually addressed this particular
problem, at least partially, with export, which the compiler
implementors have pretty much ignored. Part of the reason, no
doubt, is that it mainly affects application level code. And
there's really not that much use for templates at that level;
they're generally more appropriate for low level library code.

(The fact that there is a dependency on the implementation of
std::vector isn't generally considered a problem: std::vector is
part of the compiler, and when you upgrade the compiler, you do
a clean build anyway, regardless of how long it takes.)
I like to include build times as part of my project
requirements (and yes, I do test it!). If the build times get
too long, treat this like any other design issue. Weigh the
time/cost of design changes to the code against design changes
to the build environment. On past projects, adding another
machine to the build farm has been the more cost effective
option. This is probably more typical today with plummeting
hardware costs and rising labour costs.
The problem is less total build time (at least until it starts
reaching the point where you can't do a clean build over the
week-end); it is recompilation times due to a change. In large
applications, for example, header files are generally frozen
early, and only allowed to change exceptionally. Recompile
times aren't the only reason for this, of course, but they're
part of it.

As for adding a machine to the build farm: throwing more
hardware at a problem is often the simplest and most economic
solution (although in this case, the problem is perhaps more
linked with IO throughput than with actual CPU power---and
adding a machine can actually make things worse, by increasing
network load). But practically, in most enterprises, it's part
of a different budget:-(.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jun 27 '08 #70
James Kanze wrote:
>
As for adding a machine to the build park: throwing more
hardware at a problem is often the simplest and most economic
solution (although in this case, the problem is perhaps more
linked with IO throughput than with actual CPU power---and
adding a machine can actually make things worse, but increasing
network load). But practically, in most enterprises, it's part
of a different budget:-(.
On my last couple of C++ projects, I was fortunate enough to be
responsible for both the build farm design and budget as well as the
software design. So neither problem arose :)

--
Ian Collins.
Jun 27 '08 #71
Ian Collins wrote:
On my last couple of C++ projects, I was fortunate enough to be
responsible for both the build farm design and budget as well as the
software design. So neither problem arose :)
Wow, I didn't know people actually used build farms for C++! How many
lines of code was that?
Jun 27 '08 #72
Walter Bright wrote:
Ian Collins wrote:
>On my last couple of C++ projects, I was fortunate enough to be
responsible for both the build farm design and budget as well as the
software design. So neither problem arose :)

Wow, I didn't know people actually used build farms for C++! How many
lines of code was that?
We never bothered to count.

I have been using distributed building for C and C++ for over a decade
now. All that's required is sensible compiler licensing and a decent
make system.

--
Ian Collins.
Jun 27 '08 #73
On Jun 8, 10:58 pm, Walter Bright <wal...@digitalmars-nospamm.com>
wrote:
Ian Collins wrote:
On my last couple of C++ projects, I was fortunate enough to be
responsible for both the build farm design and budget as well as the
software design. So neither problem arose :)
Wow, I didn't know people actually used build farms for C++!
How many lines of code was that?
And how many different versions does he need? If you have
separate debug and release versions, for each program, on each
target platform, you can easily end up with ten or fifteen
complete builds. And with enough templates in the header files,
it doesn't take very many lines of source code (less than a
million, even) to end up needing a build farm, just to be able to do
a clean build over the week-end.

Of course, you usually have the hardware anyway. Just tell the
programmers to not turn their machines off when they go home for
the week-end.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jun 27 '08 #74
James Kanze wrote:
The important thing to realise is that they're a tool. Like
most (or even all) tools, they have a cost. If the advantages
of using the tool outweigh the cost, then you should use it. If
they don't, then you shouldn't.
Well, I can agree with that but you seem to be making stuff up to argue
against using a tool. Like asserting, without basis, that templates are
only useful in "lower level code", only decouple in "lower level code",
and various other things that, quite frankly, make no sense at all.

You can't use screws where they are useful if you've got some sort of
weird prejudice against screwdrivers.
Jun 27 '08 #75
James Kanze wrote:
On Jun 8, 10:58 pm, Walter Bright <wal...@digitalmars-nospamm.com>
wrote:
>Ian Collins wrote:
>>On my last couple of C++ projects, I was fortunate enough to be
responsible for both the build farm design and budget as well as the
software design. So neither problem arose :)
>Wow, I didn't know people actually used build farms for C++!
How many lines of code was that?

And how many different versions does he need? If you have
separate debug and release versions, for each program, on each
target platform, you can easily end up with ten or fifteen
complete builds. And with enough templates in the header files,
it doesn't take very many lines of source code (less than a
million, even) to end up needing a build farm, just to be able to do
a clean build over the week-end.
This project was about 300K lines including tests. A distributed clean
build (which included a code generation phase) took about 12 minutes,
which was too long (10 was the design limit). Any longer and
productivity would have been hit enough to add another node.

--
Ian Collins.
Jun 27 '08 #76
Ian Collins wrote:
James Kanze wrote:
>On Jun 8, 10:58 pm, Walter Bright <wal...@digitalmars-nospamm.com>
wrote:
>>Ian Collins wrote:
On my last couple of C++ projects, I was fortunate enough to be
responsible for both the build farm design and budget as well as the
software design. So neither problem arose :)
Wow, I didn't know people actually used build farms for C++!
How many lines of code was that?
And how many different versions does he need? If you have
separate debug and release versions, for each program, on each
target platform, you can easily end up with ten or fifteen
complete builds. And with enough templates in the header files,
it doesn't take very many lines of source code (less than a
million, even) to end up needing a build farm, just to be able to do
a clean build over the week-end.
This project was about 300K lines including tests. A distributed clean
build (which included a code generation phase) took about 12 minutes,
which was too long (10 was the design limit). Any longer and
productivity would have been hit enough to add another node.
I've looked into trying to make the C++ compiler multithreaded (so it
could use multi core computers) many times. There just isn't any way to
do it, compiling C++ is fundamentally a sequential operation. The only
thing you can do is farm out the separate source files for separate
builds. The limit achievable there is when there is one node per source
file.

My experiences with trying to accelerate C++ compilation led to many
design decisions in the D programming language. Each pass (lexing,
parsing, semantic analysis, etc.) is logically separate from the others,
meaning that each can be farmed out to a separate thread. The import
file reads can be asynchronous. The lexing, parsing, and semantic
analysis of an imported module is independent of where and how it is
imported.

While the D compiler is not currently multithreaded, the process is
inherently multithreadable, and I'll be very interested to see how fast
it can go with a multicore CPU.
Jun 27 '08 #77
On Jun 10, 7:59 am, Walter Bright <wal...@digitalmars-nospamm.com>
wrote:
Ian Collins wrote:
This project was about 300K lines including tests. A distributed clean
build (which included a code generation phase) took about 12 minutes,
which was too long (10 was the design limit). Any longer and
productivity would have been hit enough to add another node.

I've looked into trying to make the C++ compiler multithreaded (so it
could use multi core computers) many times. There just isn't any way to
do it, compiling C++ is fundamentally a sequential operation. The only
thing you can do is farm out the separate source files for separate
builds. The limit achievable there is when there is one node per source
file.
The problem of distributed building is best solved by a combination of
the build system and the compiler. The build system is responsible for
farming out jobs to cores and the compiler has to be parallel-build
aware. Template instantiation is one area where some form of locking
of generated instantiation files may be required.

The two I use are gcc/GNU make, which supports parallel building, and
Sun CC/dmake, which supports parallel and distributed building.

The number of jobs per core depends on the nature of the code and
should be tuned for each project. Over a number of C++ projects I
have found 2 to 4 jobs per core to be a sweet spot. The projects all
used the many small source file model which works best with parallel
(and more so, distributed) building.

Parallel or distributed building has to be designed into your process
from day one. Poorly designed makefiles or code layout can lose you
many of the possible gains.

--
Ian
Jun 27 '08 #78
On Jun 10, 3:17 am, Noah Roberts <u...@example.net> wrote:
James Kanze wrote:
The important thing to realise is that they're a tool. Like
most (or even all) tools, they have a cost. If the advantages
of using the tool outweigh the cost, then you should use it. If
they don't, then you shouldn't.

Well, I can agree with that but you seem to be making stuff up to argue
against using a tool. Like asserting, without basis, that templates are
only useful in "lower level code", only decouple in "lower level code",
and various other things that, quite frankly, make no sense at all.
I think James is pretty clear in his mention of a cost/benefit trade-
off.

If your process is designed for rapid building to offset the cost of
extra coupling then the advantages of templates may outweigh the
cost. If a clean build of your project takes a long time, the
productivity cost will outweigh any benefits.

--

Ian.

Jun 27 '08 #79
On Jun 9, 9:59 pm, Walter Bright <wal...@digitalmars-nospamm.com>
wrote:
Ian Collins wrote:
James Kanze wrote:
On Jun 8, 10:58 pm, Walter Bright <wal...@digitalmars-nospamm.com>
wrote:
Ian Collins wrote:
On my last couple of C++ projects, I was fortunate enough to be
responsible for both the build farm design and budget as well as the
software design. So neither problem arose :)
Wow, I didn't know people actually used build farms for C++!
How many lines of code was that?
And how many different versions does he need? If you have
separate debug and release versions, for each program, on each
target platform, you can easily end up with ten or fifteen
complete builds. And with enough templates in the header files,
it doesn't take very many lines of source code (less than a
million, even) to end up needing a build farm, just to be able to do
a clean build over the week-end.
This project was about 300K lines including tests. A
distributed clean build (which included a code generation
phase) took about 12 minutes, which was too long (10 was the
design limit). Any longer and productivity would have been
hit enough to add another node.
I've looked into trying to make the C++ compiler multithreaded
(so it could use multi core computers) many times. There just
isn't any way to do it, compiling C++ is fundamentally a
sequential operation. The only thing you can do is farm out
the separate source files for separate builds. The limit
achievable there is when there is one node per source file.
The input must be scanned sequentially, I'm pretty sure, since a
#define along the way can clearly affect how the following
source is read. And I rather suspect that it must also be
parsed sequentially, since the grammar is not context
free---whether a symbol is the name of a type, the name of a
template, or something else, affects parsing. But once you've
got your parse trees, couldn't you parallelize the processing of
each function: low-level optimization and code generation?

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jun 27 '08 #80
On Jun 10, 12:39 am, ian-n...@hotmail.com wrote:
On Jun 10, 3:17 am, Noah Roberts <u...@example.net> wrote:
James Kanze wrote:
The important thing to realise is that they're a tool.
Like most (or even all) tools, they have a cost. If the
advantages of using the tool outweigh the cost, then you
should use it. If they don't, then you shouldn't.
Well, I can agree with that but you seem to be making stuff
up to argue against using a tool. Like asserting, without
basis, that templates are only useful in "lower level code",
only decouple in "lower level code", and various other
things that, quite frankly, make no sense at all.
I think James is pretty clear in his mention of a cost/benefit
trade- off.
If your process is designed for rapid building to offset the
cost of extra coupling then the advantages of templates may
outweigh the cost. If a clean build of your project takes a
long time, the productivity cost will outweigh any benefits.
The clean build isn't the problem. You can schedule that
overnight, or for a weekend. (For my library, a clean build for
all of the versions I support under Unix takes something like
eight hours. Which doesn't bother me too much.) The problem is
the incremental builds when someone bug-fixes something in the
implementation. For non-templates, that means recompiling a
single .cc file; for templates, recompiling all source files
which include the header. A difference between maybe 5 seconds,
and a couple of minutes. Which is a very significant difference
if you're sitting in front of the computer, waiting for it to
finish.
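
(One conventional mitigation, sketched under the assumption that the set
of instantiating types is known up front: keep the template definition in
a single .cc file and instantiate it explicitly, so a bug fix recompiles
one translation unit. File and function names here are illustrative.)

// ratio.h -- declaration only; no implementation in the header
template <typename T> T ratio(T num, T den);

// ratio.cc -- the definition, plus explicit instantiations
template <typename T> T ratio(T num, T den) { return num / den; }

template double ratio<double>(double, double);
template float  ratio<float>(float, float);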

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jun 27 '08 #81
James Kanze wrote:
On Jun 10, 12:39 am, ian-n...@hotmail.com wrote:
>If your process is designed for rapid building to offset the
cost of extra coupling then the advantages of templates may
outweigh the cost. If a clean build of your project takes a
long time, the productivity cost will outweigh any benefits.

The clean build isn't the problem. You can schedule that
overnight, or for a weekend. (For my library, a clean build for
all of the versions I support under Unix takes something like
eight hours. Which doesn't bother me too much.) The problem is
the incremental builds when someone bug-fixes something in the
implementation. For non-templates, that means recompiling a
single .cc file; for templates, recompiling all source files
which include the header. A difference between maybe 5 seconds,
and a couple of minutes. Which is a very significant difference
if you're sitting in front of the computer, waiting for it to
finish.
You can say the same for a change to any header. There's always
something else to look at for a couple of minutes.

--
Ian Collins.
Jun 27 '08 #82
Walter Bright wrote:
My experiences with trying to accelerate C++ compilation led to many
design decisions in the D programming language. Each pass (lexing,
parsing, semantic analysis, etc.) is logically separate from the others,
Arguably, this is just a workaround for the basic problem that C++ (and
presumably D, as well) is a language where the program must be completely
recompiled and linked before execution. Incremental development where
new code can be directly loaded and tested in a running object image is
imho a more productive model for large program development.
Jun 27 '08 #83
James Kanze wrote:
The input must be scanned sequentially, I'm pretty sure, since a
#define along the way can clearly affect how the following
source is read.
Token pasting is another feature that mucks up all hope of doing things
non-sequentially.
And I rather suspect that it must also be
parsed sequentially, since the grammar is not context
free---whether a symbol is the name of a type, the name of a
template, or something else, affects parsing.
Take a look at the rules for looking up names. What names the compiler
'sees' depends very much on a sequential view of the input, which
affects overloading, which affects ...
But once you've
got your parse trees, couldn't you parallelize the processing of
each function: low-level optimization and code generation?
Yes, you could probably do that in parallel for each function, though
you'd have to do a complex merge process to turn the result into a
single object file. I decided that wasn't worth the effort, because the
bulk of the time spent was in the front end which wasn't parallelizable.
The big gains would be in asynchronously processing all those header files.

P.S. Even worse for C++ is that header files must be reprocessed for
every source file compilation. So, if you have m source files, each with
a header file, and every source file winds up #include'ing every header
(a normal, if regrettable, situation), compilation times are O(m*m). The
D programming language is designed so that import files compile
independently of where they are imported, so compilation times are O(m).

P.P.S. Yes, I know all about precompiled headers in C++, but there is no
way to make pch perfectly language conformant. You have to accept some
deviation from the standard to use them.
Jun 27 '08 #84
Matthias Buelow wrote:
Walter Bright wrote:
>My experiences with trying to accelerate C++ compilation led to many
design decisions in the D programming language. Each pass (lexing,
parsing, semantic analysis, etc.) is logically separate from the others,

Arguably, this is just a workaround for the basic problem that C++ (and
presumably D, aswell) is a language where the program must be completely
recompiled and linked before execution. Incremental development where
new code can be directly loaded and tested in a running object image is
imho a more productive model for large program development.
Back when vertebrates were just emerging from the slime, when I was
working on compilers for Symantec, the request came in for the linker to
acquire incremental linking ability because the competition's linker
could do incremental builds. When I pointed out that our linker could do
a full link faster than the incremental linkers could do an incremental
link, the point became moot.

Back to the present, I suggest that if the full build can be made fast
enough, there is no reason for incremental builds. I think Borland also
made that point well with their original Turbo Pascal release.
Jun 27 '08 #85
Matthias Buelow wrote:
Walter Bright wrote:
>My experiences with trying to accelerate C++ compilation led to many
design decisions in the D programming language. Each pass (lexing,
parsing, semantic analysis, etc.) is logically separate from the others,

Arguably, this is just a workaround for the basic problem that C++ (and
presumably D, aswell) is a language where the program must be completely
recompiled and linked before execution. Incremental development where
new code can be directly loaded and tested in a running object image is
imho a more productive model for large program development.
A model which isn't unusual in C or C++ development; consider device
drivers and other loadable modules or plugins.

--
Ian Collins.
Jun 27 '08 #86
James Kanze wrote:
The clean build isn't the problem. You can schedule that
overnight, or for a weekend. (For my library, a clean build for
all of the versions I support under Unix takes something like
eight hours. Which doesn't bother me too much.) The problem is
the incremental builds when someone bug-fixes something in the
implementation. For non-templates, that means recompiling a
single .cc file; for templates, recompiling all source files
which include the header. A difference between maybe 5 seconds,
and a couple of minutes. Which is a very significant difference
if you're sitting in front of the computer, waiting for it to
finish.
A full build of the dmd compiler (using dmc++) takes 18 seconds on an
Intel 1.6 GHz machine <g>. 33 seconds for g++ on AMD 64 4000.
Jun 27 '08 #87
Ian Collins wrote:
You can say the same for a change to any header. There's always
something else to look at for a couple of minutes..
Nearly instant rebuilds are a transformative experience for development.
Going off for 2 minutes to get coffee, read slashdot, etc., gets one out
of the 'zone'.
Jun 27 '08 #88
Walter Bright wrote:
Ian Collins wrote:
>You can say the same for a change to any header. There's always
something else to look at for a couple of minutes..

Nearly instant rebuilds are a transformative experience for development.
Going off for 2 minutes to get coffee, read slashdot, etc., gets one out
of the 'zone'.
My "something else" was the next problem or test.

--
Ian Collins.
Jun 27 '08 #89
On Jun 10, 4:24 am, Walter Bright <wal...@digitalmars-nospamm.com>
wrote:
James Kanze wrote:
The clean build isn't the problem. You can schedule that
overnight, or for a weekend. (For my library, a clean build for
all of the versions I support under Unix takes something like
eight hours. Which doesn't bother me too much.) The problem is
the incremental builds when someone bug-fixes something in the
implementation. For non-templates, that means recompiling a
single .cc file; for templates, recompiling all source files
which include the header. A difference between maybe 5 seconds,
and a couple of minutes. Which is a very significant difference
if you're sitting in front of the computer, waiting for it to
finish.

A full build of the dmd compiler (using dmc++) takes 18 seconds on an
Intel 1.6 GHz machine <g>. 33 seconds for g++ on AMD 64 4000.
Do you give any thought to bringing either of those compilers on-line?
I think it would be a good idea. I know of two C++ compilers that
have taken small steps toward being available on-line.

Brian Wood
Ebenezer Enterprises
www.webEbenezer.net
Jun 27 '08 #90
Ian Collins wrote:
Walter Bright wrote:
>Ian Collins wrote:
>>You can say the same for a change to any header. There's always
something else to look at for a couple of minutes..
Nearly instant rebuilds are a transformative experience for development.
Going off for 2 minutes to get coffee, read slashdot, etc., gets one out
of the 'zone'.

My "something else" was the next problem or test.
I've tried many times to multitask. I'll have the test suite running in
one window, a compile in a second, and edit documentation in a third.
All closely related, but I find that inevitably I get confabulated
switching mental contexts between them and screw things up.
Jun 27 '08 #91
co**@mailvault.com wrote:
On Jun 10, 4:24 am, Walter Bright <wal...@digitalmars-nospamm.com>
wrote:
>A full build of the dmd compiler (using dmc++) takes 18 seconds on an
Intel 1.6 GHz machine <g>. 33 seconds for g++ on AMD 64 4000.

Do you give any thought to bringing either of those compilers on-
line?
I think it would be a good idea. I know of two C++ compilers that
have taken small steps toward being available on-line.
I'm familiar with Comeau's online C++ compiler, but you cannot link or
run the result, so I don't really see the point in it.
Jun 27 '08 #92
James Kanze wrote:
[...]

The clean build isn't the problem. You can schedule that
overnight, or for a weekend. (For my library, a clean build for
all of the versions I support under Unix takes something like
eight hours. Which doesn't bother me too much.) The problem is
the incremental builds when someone bug-fixes something in the
implementation. For non-templates, that means recompiling a
single .cc file; for templates, recompiling all source files
which include the header. A difference between maybe 5 seconds,
and a couple of minutes. Which is a very significant difference
if you're sitting in front of the computer, waiting for it to
finish.
I love it when compilation takes more than a couple of seconds: I have
extra time to think! Sometimes it ends with killing the compilation
and doing something else, rather than trying the result.

Michael Furman

>
--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jun 27 '08 #93
Walter Bright wrote:
Ian Collins wrote:
>Walter Bright wrote:
>>Ian Collins wrote:
You can say the same for a change to any header. There's always
something else to look at for a couple of minutes..
Nearly instant rebuilds are a transformative experience for development.
Going off for 2 minutes to get coffee, read slashdot, etc., gets one out
of the 'zone'.

My "something else" was the next problem or test.

I've tried many times to multitask. I'll have the test suite running in
one window, a compile in a second, and edit documentation in a third.
All closely related, but I find that inevitably I get confabulated
switching mental contexts between them and screw things up.
I know, it's a male thing :(

My builds always run the unit tests, so that's one less window to worry
about.

This all goes to show that whatever you can do to improve build times
is worth the effort!

--
Ian Collins.
Jun 27 '08 #94
Ian Collins wrote:
This all goes to show that whatever you can do to improve build times
is worth the effort!
I agree. I'm looking forward to having a 64 core or whatever CPU and
being able to build huge D programs in the blink of an eye!
Jun 27 '08 #95
On Jun 10, 12:05 pm, Walter Bright <wal...@digitalmars-nospamm.com>
wrote:
James Kanze wrote:
The input must be scanned sequentially, I'm pretty sure,
since a #define along the way can clearly affect how the
following source is read.
Token pasting is another feature that mucks up all hope of
doing things non-sequentially.
Generally speaking, the pre-processor is a real problem for a
lot of reasons.
And I rather suspect that it must also be parsed
sequentially, since the grammar is not context
free---whether a symbol is the name of a type, the name of a
template, or something else, affects parsing.
Take a look at the rules for looking up names. What names the
compiler 'sees' depends very much on a sequential view of the
input, which affects overloading, which affects ...
Yes. Even without the problems in the grammar, name binding
supposes some degree of sequential reading.
But once you've got your parse trees, couldn't you
parallelize the processing of each function: low-level
optimization and code generation?
Yes, you could probably do that in parallel for each function,
though you'd have to do a complex merge process to turn the
result into a single object file.
The standard doesn't require a single object file:-).
I decided that wasn't worth the effort, because the bulk of
the time spent was in the front end which wasn't
parallelizable. The big gains would be in asynchronously
processing all those header files.
It could potentially be a significant gain if you did extensive
optimization. Except that, of course, extensive optimization,
today, means going beyond function boundaries, and we're back to
where we started. There probably are possibilities for
parallelization in some of the most advances optimization
techniques, but I've not studied the issues enough to be sure.
(In the end, much advanced optimization involves visiting nodes
in a graph, and I think that there are ways to parallelize
this, although I don't know whether they are pratical or only
theoretical.)
P.S. Even worse for C++ is that header files must be
reprocessed for every source file compilation. So, if you have
m source files, each with a header file, and every source file
winds up #include'ing every header (a normal, if regrettable,
situation), compilation times are O(m*m).
And for the application headers, even farming the compiles out
to different machines (in parallel) may not work; since the
application headers will normally reside on one machine, you may
end up saturating the network. (I've seen this in real life.
The usual ethernet degrades rapidly when the number of
collisions gets too high.)
The D programming language is designed so that import files
compile independently of where they are imported, so
compilation times are O(m).
P.P.S. Yes, I know all about precompiled headers in C++, but
there is no way to make pch perfectly language conformant. You
have to accept some deviation from the standard to use them.
I'm not sure of that, but you certainly need more infrastructure
than is present in any compiler I know of currently (except
maybe Visual Age). Basically, the compiler needs a data base:
the first time it sees a header, it notes all of the macros (and
name bindings?) used in that header, and stores the information
(including the macro definitions) in a data base. The next time
it sees the header, it checks whether all of the definitions are
the same, and uses the results of the previous compilation if
so.
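A sketch of what such a database lookup might look like. Every type
and name here is invented for illustration; no real compiler's
internals are being described:

#include <map>
#include <string>

struct CompiledHeader { /* serialized symbol table, bindings, ... */ };

struct CacheEntry {
    // every macro the header consulted, with its definition at the
    // time the cached result was produced ("" encodes "not defined")
    std::map<std::string, std::string> macros_seen;
    CompiledHeader result;
};

using MacroEnv = std::map<std::string, std::string>;

// Returns the cached compilation if every macro the header depends on
// still has the definition it had then; otherwise null (recompile).
const CompiledHeader* lookup(const std::map<std::string, CacheEntry>& cache,
                             const std::string& header_path,
                             const MacroEnv& env)
{
    auto it = cache.find(header_path);
    if (it == cache.end())
        return nullptr;                  // never compiled yet
    for (const auto& [name, old_def] : it->second.macros_seen) {
        auto e = env.find(name);
        const std::string current = (e == env.end()) ? "" : e->second;
        if (current != old_def)
            return nullptr;              // a relevant macro changed
    }
    return &it->second.result;           // safe to reuse
}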

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jun 27 '08 #96
On Jun 10, 11:36 am, Ian Collins <ian-n...@hotmail.com> wrote:
James Kanze wrote:
On Jun 10, 12:39 am, ian-n...@hotmail.com wrote:
If your process is designed for rapid building to offset
the cost of extra coupling then the advantages of templates
may outweigh the cost. If a clean build of your project
takes a long time, the productivity cost will outweigh any
benefits.
The clean build isn't the problem. You can schedule that
overnight, or for a weekend. (For my library, a clean build
for all of the versions I support under Unix takes something
like eight hours. Which doesn't bother me too much.) The
problem is the incremental builds when someone bug-fixes
something in the implementation. For non-templates, that
means recompiling a single .cc file; for templates,
recompiling all source files which include the header. A
difference between maybe 5 seconds, and a couple of minutes.
Which is a very significant difference if you're sitting in
front of the computer, waiting for it to finish.
You can say the same for a change to any header.
Yes. Which is why you don't want to modify headers more often
than necessary. And why you ban as many implementation details
as possible in the headers, and use the compilation firewall
idiom rather regularly. And strictly limit the use of templates
and inline functions, since both require the implementation in
the header.
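The compilation firewall idiom mentioned above, in its usual modern
spelling (a generic sketch, not code from the thread):

// widget.h -- clients see only a pointer to the implementation, so
// implementation changes never force them to recompile.
#include <memory>

class Widget {
public:
    Widget();
    ~Widget();          // defined in widget.cc, where Impl is complete
    void frob();
private:
    struct Impl;        // deliberately incomplete here
    std::unique_ptr<Impl> pimpl_;
};

// widget.cc -- every detail lives in this one translation unit.
struct Widget::Impl { int state = 0; };

Widget::Widget() : pimpl_(new Impl) {}
Widget::~Widget() = default;
void Widget::frob() { ++pimpl_->state; }

A bug fix inside Impl or frob now recompiles widget.cc alone, which is
exactly the 5-second case described above.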

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jun 27 '08 #97
On Jun 11, 4:54 am, Michael Furman <MichaelFur...@Yahoo.com> wrote:
James Kanze wrote:
....
The clean build isn't the problem. You can schedule that
overnight, or for a weekend. (For my library, a clean build
for all of the versions I support under Unix takes something
like eight hours. Which doesn't bother me too much.) The
problem is the incremental builds when someone bug-fixes
something in the implementation. For non-templates, that
means recompiling a single .cc file; for templates,
recompiling all source files which include the header. A
difference between maybe 5 seconds, and a couple of minutes.
Which is a very significant difference if you're sitting in
front of the computer, waiting for it to finish.
I love when compilation takes more than a couple of seconds: I
have extra time to think! Sometimes it ends with killing the
compilation and doing something else, rather than trying the
result.
Interesting development process. I usually try to think before
editing, much less compiling.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jun 27 '08 #98
James Kanze wrote:
On Jun 10, 11:36 am, Ian Collins <ian-n...@hotmail.com> wrote:
>James Kanze wrote:
>>On Jun 10, 12:39 am, ian-n...@hotmail.com wrote:
>>>If your process is designed for rapid building to offset
the cost of extra coupling then the advantages of templates
may outweigh the cost. If a clean build of your project
takes a long time, the productivity cost will outweigh any
benefits.
>>The clean build isn't the problem. You can schedule that
overnight, or for a weekend. (For my library, a clean build
for all of the versions I support under Unix takes something
like eight hours. Which doesn't bother me too much.) The
problem is the incremental builds when someone bug-fixes
something in the implementation. For non-templates, that
means recompiling a single .cc file; for templates,
recompiling all source files which include the header. A
difference between maybe 5 seconds, and a couple of minutes.
Which is a very significant difference if you're sitting in
front of the computer, waiting for it to finish.
>You can say the same for a change to any header.

Yes. Which is why you don't want to modify headers more often
than necessary. And why you ban as many implementation details
as possible in the headers, and use the compilation firewall
idiom rather regularly. And strictly limit the use of templates
and inline functions, since both require the implementation in
the header.
Inline functions tend not to be too big an issue in practice. They tend
to be trivial and "write once".

As for templates, I leave that call to the team. By their nature,
templates tend to be used for utility functions which, like trivial
inline function, tend to change less frequently than specific
application code.

Another common situation is that a template is introduced to avoid
unnecessary duplication; it is only used where the duplication would
otherwise occur. If the template weren't there, the duplicated code
would have to change instead of the template. Without the template,
the cost in recompilation would be the same, but the cost of coding
changes would be greater.
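For instance (a toy case invented here to make the point):

template <typename T>
T clamp_value(T value, T lo, T hi)
{
    // one definition serves int, double, and any other ordered type;
    // a fix to the logic is made exactly once
    return value < lo ? lo : (hi < value ? hi : value);
}

Without it, the same body would be copied into clamp_int,
clamp_double, and so on, and any change would have to be repeated in
each copy.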

So I wouldn't go so far as to say strictly limit the use of templates
and inline functions, just treat them like any tool.

Gratuitous use of templates is another matter, but peer pressure should
solve that problem.

--
Ian Collins.
Jun 27 '08 #99
In article <667de91e-9fa1-4631-9e0d-caddab7db006@d45g2000hsc.googlegroups.com>,
ja*********@gmail.com says...

[ ... ]
(In the end, much advanced optimization involves visiting nodes
in a graph, and I think that there are ways to parallelize
this, although I don't know whether they are pratical or only
theoretical.)
Yes and no. For example, a depth-first-search of a general graph has
been studied pretty extensively. I don't know of anything _proving_ that
it can't be done in parallel productively, but there are definitely some
pretty strong indications in that direction*.

OTOH, I believe for a compiler you're dealing primarily with DAGs. I'm
pretty sure a depth-first search of a DAG _can_ be done in parallel
productively -- at least if they're large enough for the savings from
parallelization to overcome communication overhead and such.
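One way to see why a DAG helps: nodes can be processed in independent
"waves" once all their predecessors are done. A minimal,
self-contained sketch (the interface is invented; a real compiler
would likely use a work-stealing scheduler rather than batched
std::async calls):

#include <functional>
#include <future>
#include <vector>

void process_dag_in_parallel(
    const std::vector<std::vector<int>>& successors, // node -> children
    const std::function<void(int)>& visit)           // per-node work, thread-safe
{
    const int n = static_cast<int>(successors.size());
    std::vector<int> indegree(n, 0);
    for (const auto& out : successors)
        for (int child : out)
            ++indegree[child];

    std::vector<int> wave;          // nodes with no pending predecessors
    for (int v = 0; v < n; ++v)
        if (indegree[v] == 0)
            wave.push_back(v);

    while (!wave.empty()) {
        // everything in a wave is mutually independent: run concurrently
        std::vector<std::future<void>> tasks;
        for (int v : wave)
            tasks.push_back(std::async(std::launch::async, visit, v));
        for (auto& t : tasks)
            t.get();

        std::vector<int> next;
        for (int v : wave)
            for (int child : successors[v])
                if (--indegree[child] == 0)   // last predecessor finished
                    next.push_back(child);
        wave = std::move(next);
    }
}

The communication overhead mentioned above shows up here as the
per-wave synchronization; whether it pays off depends on the wave
sizes, which is exactly the "are the graphs large enough" question.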

I'm not sure whether a compiler typically generates DAGs that large or
not. I've written a few small compilers, but don't recall having
instrumented the size of graphs they worked with. My guess is that if
you only do function-level optimization, they're usually going to be too
small for it to help, but if you do global optimization, they might
easily become large enough -- but that's purely my feeling; I don't have
any solid data to support it, and we all know how undependable that is.

[ ... ]
And for the application headers, even farming the compiles out
to different machines (in parallel) may not work; since the
application headers will normally reside on one machine, you may
end up saturating the network. (I've seen this in real life.
The usual Ethernet degrades rapidly when the number of
collisions gets too high.)
Does anybody really use hubs anymore? Using switched Ethernet,
collisions are quite rare, even when the network is _heavily_ loaded.

--
Later,
Jerry.

The universe is a figment of its own imagination.
Jun 27 '08 #100
