The preprocessor is just a pass

P: n/a
Is it accurate to say that "the preprocessor is just a pass in the parsing
of the source file"?

I responded to that comment by saying that the preprocessor is not just a
pass. It processes statements that the compiler does not process. The good
people in the alt.comp.lang.learn.c-c++ newsgroup insist that the
preprocessor is just one of many passes. The preprocessor processes a
grammar unique to the preprocessor and only that grammar.

The discussion is at:

What in fact is the preprocessor?
http://groups.google.com/group/alt.c...1df10e2b29fbc2
May 27 '07 #1
31 Replies


P: n/a
Sam of California wrote:
Is it accurate to say that "the preprocessor is just a pass in the parsing
of the source file"?

I responded to that comment by saying that the preprocessor is not just a
pass. It processes statements that the compiler does not process. The good
people in the alt.comp.lang.learn.c-c++ newsgroup insist that the
preprocessor is just one of many passes. The preprocessor processes a
grammar unique to the preprocessor and only that grammar.

The discussion is at:

What in fact is the preprocessor?
http://groups.google.com/group/alt.c...1df10e2b29fbc2

What I don't understand is your statement

"The preprocessor is not just a pass. It processes statements that the
compiler does not process. The language is very clear that the
preprocessor's statements are totally different from the compiler."

The second sentence is true, the third sentence is true. I don't
understand how the first sentence follows.

The preprocessor is just a pass, as far as I am concerned. By 'just a
pass' I mean that the preprocessor can be totally separated from the
other phases (or passes) that precede and follow it, i.e. the output of
each pass is the input to the next pass that follows it. Maybe you have
a different definition of 'just a pass'.
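
To make that definition concrete, here is a minimal sketch of two toy passes chained together, where the whole output of the first is the input of the second. The function names `splice_lines` and `count_lines` are invented for illustration; no real compiler is claimed to be organised this way.

```cpp
#include <cassert>   // for the assert-based check mentioned in the usage note
#include <cstddef>
#include <string>

// Hypothetical toy pass 1: delete every backslash-newline pair,
// roughly what the standard's line-splicing phase does.
std::string splice_lines(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '\\' && i + 1 < in.size() && in[i + 1] == '\n') {
            ++i;  // skip the newline as well as the backslash
        } else {
            out += in[i];
        }
    }
    return out;
}

// Hypothetical toy pass 2: count the newline-terminated lines that remain.
// It only ever sees the complete output of pass 1.
std::size_t count_lines(const std::string& in) {
    std::size_t n = 0;
    for (char c : in) {
        if (c == '\n') ++n;
    }
    return n;
}
```

Called from `main`, `assert(count_lines(splice_lines("a\\\nb\n")) == 1);` should pass: the two physical lines become one logical line before the second pass ever runs.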

john
May 27 '07 #2

P: n/a
"John Harrison" <jo*************@hotmail.com> wrote in message
news:Ek*****************@newsfe7-gui.ntli.net...
>
"The preprocessor is not just a pass. It processes statements that the
compiler does not process. The language is very clear that the
preprocessor's statements are totally different from the compiler."

The second sentence is true, the third sentence is true. I don't
understand how the first sentence follows.
Of course the first sentence is vague. It can be interpreted in many ways. I
clarify the first sentence with the subsequent sentences.
May 27 '07 #3

P: n/a
Sam of California wrote:
"John Harrison" <jo*************@hotmail.com> wrote in message
news:Ek*****************@newsfe7-gui.ntli.net...
>"The preprocessor is not just a pass. It processes statements that the
compiler does not process. The language is very clear that the
preprocessor's statements are totally different from the compiler."

The second sentence is true, the third sentence is true. I don't
understand how the first sentence follows.

Of course the first sentence is vague. It can be interpreted in many ways. I
clarify the first sentence with the subsequent sentences.

Well it seems to me that you are defining 'a pass' in a certain way. And
as a consequence of the way you have defined 'a pass' it is true that
preprocessing is not just a pass.

No doubt those who disagreed with you (me included) defined 'a pass' in
a different way, so they are right as well.

This kind of argument about definitions is very boring, so I'm not
taking any further part, unless you have a substantive point to make. At
the moment I don't see it.

john
May 27 '07 #4

P: n/a
On Sun, 27 May 2007 08:11:04 -0700, Sam of California wrote:
>Is it accurate to say that "the preprocessor is just a pass in the parsing
of the source file"?

I responded to that comment by saying that the preprocessor is not just a
pass.
How can a processor be a pass? It is something which performs a pass, at
most.

Informally, I use the terms "preprocessing" or "preprocessing phase"
to identify --roughly-- the sequence of what the ISO standard defines
as phase 3 and phase 4 of translation -- but only when there isn't any
need to be more precise (it is also the name of one of the directories
in the Breeze source tree, for instance). In any case, the standard
doesn't use the term "preprocessor", nor "preprocessing" as a
standalone noun (it uses expressions such as "preprocessing
directive", though, which may make somewhat reasonable the personal
terminology choice explained above. It arose exactly because I didn't
like to use the term "preprocessor").

--
Gennaro Prota -- C++ Developer, For Hire
https://sourceforge.net/projects/breeze/
May 27 '07 #5

P: n/a
Sam of California wrote:
Is it accurate to say that "the preprocessor is just a pass in the parsing
of the source file"?
Very casually, yes.
I responded to that comment by saying that the preprocessor is not just a
pass. It processes statements that the compiler does not process. The good
people in the alt.comp.lang.learn.c-c++ newsgroup insist that the
preprocessor is just one of many passes. The preprocessor processes a
grammar unique to the preprocessor and only that grammar.
Historically, the preprocessor was a separate program. It read
preprocessor-ready code and wrote processed C code, without the extra #
statements and such. Then the C compiler read the raw C code.

Nowadays we naturally use only one compiling program. The vestiges of the
CPP filter have migrated into it, including a separate lexing system. So the
CPP does respect "" quotes and // comment markers, but does not respect
delimiters like {}. It is unaware they delimit blocks.
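
A small sketch of that behaviour, for anyone who wants to try it. All the macro and function names here (`WIDGET`, `BEGIN_BLOCK`, `braces_demo` and so on) are made up for the example.

```cpp
#include <cassert>   // for the assert-based checks mentioned in the usage note
#include <cstring>

#define WIDGET replaced
#define STR_(x) #x
#define STR(x) STR_(x)   // two levels so the argument is expanded before stringizing

// The comment is removed before the directive is processed,
// so it does not become part of the macro body.
#define ANSWER 42 // not part of the macro

// Braces mean nothing to the preprocessor: each macro body below is
// unbalanced on its own, and that is perfectly legal.
#define BEGIN_BLOCK {
#define END_BLOCK }

inline int braces_demo() BEGIN_BLOCK
    return 42;
END_BLOCK

inline int answer_doubled() { return ANSWER * 2; }

// WIDGET inside a string literal is left alone; outside, it is expanded.
inline const char* inside_quotes() { return "WIDGET"; }
inline const char* outside_quotes() { return STR(WIDGET); }
```

From `main`, `assert(std::strcmp(inside_quotes(), "WIDGET") == 0);` and `assert(std::strcmp(outside_quotes(), "replaced") == 0);` should both pass.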

Explaining the system gets easier if you treat the preprocessor as a second
pass thru the text of the program. The historical note helps.

--
Phlip
http://flea.sourceforge.net/PiglegToo_1.html
May 27 '07 #6

P: n/a
On May 27, 7:36 pm, Gennaro Prota <address@spam_this.com> wrote:
On Sun, 27 May 2007 08:11:04 -0700, Sam of California wrote:
Is it accurate to say that "the preprocessor is just a pass in the parsing
of the source file"?
I responded to that comment by saying that the preprocessor is not just a
pass.
How can a processor be a pass; something which performs a pass, at
most.
Informally, I use the terms "preprocessing" or "preprocessing phase"
to identify --roughly-- the sequence of what the ISO standard defines
as phase 3 and phase 4 of translation -- but only when there isn't any
need to be more precise (it is also the name of one of the directories
in the Breeze source tree, for instance). In any case, the standard
doesn't use the term "preprocessor", nor "preprocessing" as a
standalone noun (it uses expressions such as "preprocessing
directive", though, which may make somewhat reasonable the personal
terminology choice explained above. It arose exactly because I didn't
like to use the term "preprocessor").
To quote the standard (§2.1/7): "[In phase 7] Each preprocessing
token is converted into a token." I've always understood
everything which preceded this (i.e. phases 1-6) to be
"preprocessing", and the "preprocessor" whatever does the
"preprocessing". With regards to "passes", I rather suspect
that very few compilers today use a separate pass for this; it's
generally integrated one way or another into the tokenization of
the input. Which doesn't mean that we can't speak of the
preprocessor, just because it isn't a separate pass.

--
James Kanze (Gabi Software) email: ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

May 27 '07 #7

P: n/a
Is it accurate to say that "the preprocessor is just a pass in the parsing
of the source file"?
The C++ preprocessing step is very well defined, and every C++ programmer
knows precisely what it does. On the other hand, a "pass" is not a well
defined term. So the question "is preprocessing a pass" makes no sense. The
question you have to answer first is "what is a pass". Then, the answer to
your original question will follow automatically.

Alternatively, you can spend two weeks discussing whether the preprocessor is
something you haven't defined, or isn't. Good luck.

May 27 '07 #8

P: n/a
"Marcin Kalicinski" <ka****@poczta.onet.pl> wrote in message
news:pI******************************@eclipse.net.uk...
>
The C++ preprocessing step is very well defined
Yes, thank you.
On the other hand, a "pass" is not a well defined term.
Yes.
So, question "is preprocessing a pass" makes no sense. The question you
have to answer first is "what is a pass". Then, the answer to your
original question will follow automatically.
Yes. The problem is that everyone is putting emphasis on "pass" and
commenting on that only. Then they say, using various terminology, that my
question is nonsense (makes no sense).

Everyone is ignoring the important part, except for brief comments such as
yours. You say that the "C++ preprocessing step is very well defined", and
that is the important part that is ignored or only briefly commented on.

I am sorry if I fail at using the correct terminology. What I am saying is
that describing the preprocessor as (just) a pass is saying that there is
nothing unique about preprocessor statements or directives. If "preprocessor
statements" is incorrect terminology then what should I say? Is
"preprocessor directives" correct? Is it correct to say that the
preprocessor's grammar is separate from all the rest of the grammar for
C/C++?
May 28 '07 #9

P: n/a
>
I am sorry if I fail at using the correct terminology. What I am saying is
that describing the preprocessor as (just) a pass is saying that there is
nothing unique about preprocessor statements or directives. If "preprocessor
statements" is incorrect terminology then what should I say? Is
"preprocessor directives" correct? Is it correct to say that the
preprocessor's grammar is separate from all the rest of the grammar for
C/C++?
Yes that last statement is true. In fact the tokens that the
preprocessing grammar operates on are different from the tokens that the
main C++ grammar operates on.
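
A minimal sketch of the difference, with hypothetical names (`STR`, `CAT`, `pp_number_text`, `pasted`). The sequence `1.2.3` is a single preprocessing token (a pp-number) but could never be converted into a valid token of the language proper; it is legal here only because it is turned into a string literal before that conversion happens.

```cpp
#include <cassert>   // for the assert-based checks mentioned in the usage note
#include <cstring>

#define STR(x) #x          // stringize a preprocessing token sequence
#define CAT(a, b) a##b     // paste two preprocessing tokens into one

// 1.2.3 is one pp-number; it never has to become a real token
// because # turns it into the string literal "1.2.3" first.
inline const char* pp_number_text() { return STR(1.2.3); }

// 4 and 2 are pasted into the single pp-number 42, which is later
// converted into an ordinary integer literal token.
constexpr int pasted() { return CAT(4, 2); }

static_assert(pasted() == 42, "token pasting produced the literal 42");
```

From `main`, `assert(std::strcmp(pp_number_text(), "1.2.3") == 0);` should pass.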

But you introduced this terminology 'just a pass'. I still don't think
that 'the preprocessor's grammar is separate from all the rest of the
grammar' justifies the statement 'the preprocessor is not just a pass',
quite the opposite I would say. The very fact that the two grammars are
unrelated *encourages* me to describe preprocessing as just a pass.

At least I'm sure you can agree that using this terminology is confusing,
just look at this thread and the last. So if you want to find out more
about preprocessing I suggest you drop it.

john
May 28 '07 #10

P: n/a
On May 27, 11:40 pm, "Marcin Kalicinski" <kal...@poczta.onet.pl>
wrote:
Is it accurate to say that "the preprocessor is just a pass in the parsing
of the source file"?
The C++ preprocessing step is very well defined, and every C++ programmer
knows precisely what it does.
Except that no two know exactly the same thing. The standard
speaks of phases of compilation. Taken literally, "the C++
preprocessing step" would be phase 4. Generally however, one
tends to speak of the preprocessor for everything through phase
6.
On the other hand, a "pass" is not a well defined term.
It is in compiler technology. It's a separate phase which
treats the entire program. Most compilers today use four
passes: a front-end, which does preprocessing, tokenizing and
parsing; a "middle-end", which does more or less processor
independent optimizations, such as common subexpression
elimination; a back-end, which does code generation; and a
peephole optimizer, which does peephole optimization. But of
course there are a lot of variants: most compilers will skip the
"middle-end" unless you've asked for optimization, and many
merge the back-end and the peephole optimizer into a single
pass.

The original C compilers, way back when, did use a separate pass
for the preprocessor, but I rather doubt that any compiler does
so today.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

May 28 '07 #11

P: n/a
On 27 May 2007 13:42:33 -0700, James Kanze wrote:
>On May 27, 7:36 pm, Gennaro Prota <address@spam_this.com> wrote:
>On Sun, 27 May 2007 08:11:04 -0700, Sam of California wrote:
>Is it accurate to say that "the preprocessor is just a pass in the parsing
of the source file"?
>I responded to that comment by saying that the preprocessor is not just a
pass.
>How can a processor be a pass; something which performs a pass, at
most.
>Informally, I use the terms "preprocessing" or "preprocessing phase"
to identify --roughly-- the sequence of what the ISO standard defines
as phase 3 and phase 4 of translation -- but only when there isn't any
need to be more precise (it is also the name of one of the directories
in the Breeze source tree, for instance). In any case, the standard
doesn't use the term "preprocessor", nor "preprocessing" as a
standalone noun (it uses expressions such as "preprocessing
directive", though, which may make somewhat reasonable the personal
terminology choice explained above. It arose exactly because I didn't
like to use the term "preprocessor").

To quote the standard (§2.1/7): "[In phase 7] Each preprocessing
token is converted into a token." I've always understood
everything which preceded this (i.e. phases 1-6) to be
"preprocessing", and the "preprocessor" whatever does the
"preprocessing".
Yes, that's another possibility. As I said, neither term is defined
by the standard, and everything is quite vague. People use the terms
quite informally.

Speaking of terminology and personal preferences, I've always felt
that the standard could have given a name to the phases, rather than
just numbering them. In that case, I'd see something like (off the top
of my head: don't focus too much on the names):

Character Mapping (1)
Line Splicing (2)
Pre-tokenization (3)
Preprocessing (4)
Execution Character Set Mapping (5)
Literal Concatenation (6)
Tokenization (7a)
Syntactical and Semantic Analysis (7b)
Translation (7c)
Instantiation (8)
Linking (9)

I have noted in parentheses how they --more or less-- correspond to
the numbers used in the standard; in practice though if one used the
names then the separation would be slightly different and likely end
up in 10/12 items (in effect, what I've always felt odd isn't that
much that there are no names; rather it's the strange grouping of
things --see especially 7-- which in turn becomes manifest if you try
to give a *fitting* name to those groups).
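
As a rough illustration of why the ordering in the list above matters, here is a sketch in which line splicing (2), macro expansion (4) and literal concatenation (6) all act on the same few lines, in that order. The names `GREETING`, `PUNCT` and `greeting` are invented for the example.

```cpp
#include <cassert>   // for the assert-based check mentioned in the usage note
#include <cstring>

// (2) The backslash-newline is removed first, so the directive
//     below is one logical line by the time it is processed.
#define GREETING "Hello, " \
                 "world"
#define PUNCT "!"

// (4) GREETING and PUNCT are expanded, leaving three adjacent string literals.
// (6) The adjacent literals are then concatenated into a single one.
inline const char* greeting() { return GREETING PUNCT; }
```

From `main`, `assert(std::strcmp(greeting(), "Hello, world!") == 0);` should pass.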

Judging by my perception of the word "preprocessor", I see (3) and (4)
as definitely in its area of concern, though (3) is probably somehow
an "implementation detail"; (1) seems something which is logically
preceding and (2) is somehow borderline.
>With regards to "passes", I rather suspect
that very few compilers today use a separate pass for this; it's
generally integrated one way or another into the tokenization of
the input. Which doesn't mean that we can't speak of the
preprocessor, just because it isn't a separate pass.
One could speak of it as a conceptual entity (for those who like to
show off: the "abstract machine" doing preprocessing :-)). But the
facts are... the one rigorous specification we have, the standard,
doesn't define the term.

To sum it up, the original question is simply ill-posed. The
translation of a C++ program conceptually happens in phases, as
described in the standard. One may decide to call preprocessing some
specific sub-sequence, and compilation some other, but there's no such
official terminology. To some, a compiler is what performs phases
(7a) through (8), inclusive, and a linker is what performs (9). Others mean by
"compiler", or "translator", the executor of the whole translation.

--
Gennaro Prota -- C++ Developer, For Hire
https://sourceforge.net/projects/breeze/
May 28 '07 #12

P: n/a
On May 28, 7:48 pm, Gennaro Prota <address@spam_this.com> wrote:
On 27 May 2007 13:42:33 -0700, James Kanze wrote:
On May 27, 7:36 pm, Gennaro Prota <address@spam_this.com> wrote:
On Sun, 27 May 2007 08:11:04 -0700, Sam of California wrote:
Is it accurate to say that "the preprocessor is just a pass in the parsing
of the source file"?
I responded to that comment by saying that the preprocessor is not just a
pass.
How can a processor be a pass; something which performs a pass, at
most.
Informally, I use the terms "preprocessing" or "preprocessing phase"
to identify --roughly-- the sequence of what the ISO standard defines
as phase 3 and phase 4 of translation -- but only when there isn't any
need to be more precise (it is also the name of one of the directories
in the Breeze source tree, for instance). In any case, the standard
doesn't use the term "preprocessor", nor "preprocessing" as a
standalone noun (it uses expressions such as "preprocessing
directive", though, which may make somewhat reasonable the personal
terminology choice explained above. It arose exactly because I didn't
like to use the term "preprocessor").
To quote the standard (§2.1/7): "[In phase 7] Each preprocessing
token is converted into a token." I've always understood
everything which preceded this (i.e. phases 1-6) to be
"preprocessing", and the "preprocessor" whatever does the
"preprocessing".
Yes, that's another possibility. As I said, neither term is defined
by the standard, and everything is quite vague. People use the terms
quite informally.
There's also the historical context to be considered. In
Johnson's pcc, the preprocessor was a separate pass, before the
compiler front-end pass. Roughly speaking, the preprocessor
read your code, broke it up into preprocessing tokens, did what
it did, and then spit out text. The front end then read this
text, and broke it up into language tokens, and parsed it. Line
breaks had significance in the pre-processor, but not in the
front-end.

Based on this, it seems logical to make the break at the point
where preprocessor tokens are converted into language tokens,
and all white space (including new-lines) ceases to have any
significance.
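
A small sketch of that asymmetry, using the made-up names `LIMIT` and `limit_plus_five`: the directive has to end at the end of its line, while the declaration that follows can be scattered over as many lines as you like.

```cpp
#include <cassert>   // for the assert-based check mentioned in the usage note

// To the preprocessor the newline is significant: it ends the directive,
// so the macro body is exactly the single token 10.
#define LIMIT 10

// After preprocessing, newlines are just white space, so this
// oddly laid out definition is an ordinary function.
constexpr
int
limit_plus_five
(
)
{
return
LIMIT
+
5
;
}

static_assert(limit_plus_five() == 15, "LIMIT expands to 10 only");
```

From `main`, `assert(limit_plus_five() == 15);` should pass.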
Speaking of terminology and personal preferences, I've always felt
that the standard could have given a name to the phases, rather than
just numbering them.
The sole role of the phases in the standard is to define the
order in which the different actions take place. Numbers are
very good for defining order. Everyone knows that 1 comes
before 2, but it must be explicitly stated that character
mapping comes before line splicing.
In that case, I'd see something like (off the top
of my head: don't focus too much on the names):
Character Mapping (1)
Line Splicing (2)
Pre-tokenization (3)
Preprocessing (4)
Execution Character Set Mapping (5)
Literal Concatenation (6)
Tokenization (7a)
Syntactical and Semantic Analysis (7b)
Translation (7c)
Instantiation (8)
Linking (9)
I have noted in parentheses how they --more or less-- correspond to
the numbers used in the standard; in practice though if one used the
names then the separation would be slightly different and likely end
up in 10/12 items (in effect, what I've always felt odd isn't that
much that there are no names; rather it's the strange grouping of
things --see especially 7-- which in turn becomes manifest if you try
to give a *fitting* name to those groups).
The order of the different operations in 7 is implicit---you
can't translate without having "syntactically and semantically
analyzed", and you can't syntactically and semantically analyse
without having language tokens. Since there are no alternatives
in the order, there's no need to separate into separate phases
to define the order.
Judging by my perception of the word "preprocessor", I see (3) and (4)
as definitely in its area of concern, though (3) is probably somehow
an "implementation detail"; (1) seems something which is logically
preceding and (2) is somehow borderline.
So what are phases 1 and 2: a prepreprocessor?
With regards to "passes", I rather suspect
that very few compilers today use a separate pass for this; it's
generally integrated one way or another into the tokenization of
the input. Which doesn't mean that we can't speak of the
preprocessor, just because it isn't a separate pass.
One could speak of it as a conceptual entity (for those who like to
show off: the "abstract machine" doing preprocessing :-)). But the
facts are... the one rigorous specification we have, the standard,
doesn't define the term.
To sum it up, the original question is simply ill-posed. The
translation of a C++ program conceptually happens in phases, as
described in the standard. One may decide to call preprocessing some
specific sub-sequence, and compilation some other, but there's no such
official terminology. To some, a compiler is what performs phases from
(7a) to 8, included. A linker what performs (9). Others mean by
"compiler", or "translator", the executor of the whole translation.
The traditional break (in C) has been: preprocessor: phases 1
through 6, compiler: phase 7, linker: phase 9. (Phase 8 is
concerned with instantiating templates, and doesn't have a place
in traditional C.) If you're talking about passes, however,
most compilers today will use a single pass for everything
through your 7b, above, then up to three passes for 7c, and
linking remains separate. Where phase 8 fits in varies, but I
suspect that a lot of modern compilers cram it into the first
pass as well.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

May 29 '07 #13

P: n/a
On 29 May 2007 00:46:24 -0700, James Kanze wrote:
>To quote the standard (§2.1/7): "[In phase 7] Each preprocessing
token is converted into a token." I've always understood
everything which preceded this (i.e. phases 1-6) to be
"preprocessing", and the "preprocessor" whatever does the
"preprocessing".
>Yes, that's another possibility. As I said, neither terms is defined
by the standard, and everything is quite vague. People use the terms
quite informally.

There's also the historical context to be considered. In
Johnson's pcc, the preprocessor was a separate pass, before the
compiler front-end pass. Roughly speaking, the preprocessor
read your code, broke it up into preprocessing tokens, did what
it did, and then spit out text. The front end then read this
text, and broke it up into language tokens, and parsed it. Line
breaks had significance in the pre-processor, but not in the
front-end.
Yes. There's also another context (many refer to this one, I guess):
that of separate preprocessor executables. Borland CPP32 is an
example (and actually the most conformant preprocessor I know of --for
some odd reason the one built into the compiler isn't even close).
They usually take your source code as input and produce a textual file
which can be fed to the compiler proper (which is just what most
compilers can do with the appropriate command line switches, too); but
as far as I remember (it's not that I really use them) they won't
output UCNs, for instance. Which means that conceptually the compiler
proper has to perform (1) again. So, I see (1) as something which
logically comes "first": it is more or less necessary each time you
read a textual file, even if you start with phase 7 directly. (But let
me know if my point is clear :-))
[...]
The order of the different operations in 7 is implicit---you
can't translate without having "syntactically and semantically
analyzed", and you can't syntactically and semantically analyse
without having language tokens. Since there are no alternatives
in the order, there's no need to separate into separate phases
to define the order.
I see, but I'd have liked it anyway, stylistically/logically.
>Judging by my perception of the word "preprocessor", I see (3) and (4)
as definitely in its area of concern, though (3) is probably somehow
an "implementation detail"; (1) seems something which is logically
preceding and (2) is somehow borderline.

So what are phases 1 and 2: a prepreprocessor?
Sort of :-) Seriously, I think of "preprocessing" as the
transformation which happens on the source text by executing all the
preprocessing directives. That is *my own* perception of the word, as
I said, and it probably originates from the fact that, before I had
the standard, I thought that all that preprocessing is about was
executing #includes and expanding #defines.

--
Gennaro Prota -- C++ Developer, For Hire
https://sourceforge.net/projects/breeze/
May 29 '07 #14

P: n/a
(My post is pretty much "an aside" to the highly technical nature of the
thread. How should I actually post in that way? Putting an "Aside:" in front
of the "Re:"?).

"James Kanze" <ja*********@gmail.com> wrote in message
news:11*********************@q69g2000hsb.googlegroups.com...
On May 28, 7:48 pm, Gennaro Prota <address@spam_this.com> wrote:
On 27 May 2007 13:42:33 -0700, James Kanze wrote:
On May 27, 7:36 pm, Gennaro Prota <address@spam_this.com> wrote:
On Sun, 27 May 2007 08:11:04 -0700, Sam of California wrote:
Is it accurate to say that "the preprocessor is just a pass in the
parsing
of the source file"?
I responded to that comment by saying that the preprocessor is not
just a
pass.
How can a processor be a pass; something which performs a pass, at
most.
Informally, I use the terms "preprocessing" or "preprocessing phase"
to identify --roughly-- the sequence of what the ISO standard defines
as phase 3 and phase 4 of translation -- but only when there isn't any
need to be more precise (it is also the name of one of the directories
in the Breeze source tree, for instance). In any case, the standard
doesn't use the term "preprocessor", nor "preprocessing" as a
standalone noun (it uses expressions such as "preprocessing
directive", though, which may make somewhat reasonable the personal
terminology choice explained above. It arose exactly because I didn't
like to use the term "preprocessor").
To quote the standard (§2.1/7): "[In phase 7] Each preprocessing
token is converted into a token." I've always understood
everything which preceded this (i.e. phases 1-6) to be
"preprocessing", and the "preprocessor" whatever does the
"preprocessing".
Yes, that's another possibility. As I said, neither term is defined
by the standard, and everything is quite vague. People use the terms
quite informally.
"There's also the historical context to be considered. In
Johnson's pcc, the preprocessor was a separate pass, before the
compiler front-end pass. Roughly speaking, the preprocessor
read your code, broke it up into preprocessing tokens, did what
it did, and then spit out text. The front end then read this
text, and broke it up into language tokens, and parsed it. Line
breaks had significance in the pre-processor, but not in the
front-end."

I'm all for "the good ol' days" if it makes compiler system construction
easier. Do you high-end engineers feel that only the current "state of the
art" machinery is worthy of building upon? (If you know me yet, you know
what I think: that if it's really complex, then it's not foundational).

"Based on this, it seems logical to make the break at the point
where preprocessor tokens are converted into language tokens,
and all white space (including new-lines) ceases to have any
significance."

See, now that's something I could grok if I wanted to learn about compiler
construction and wanted to develop a compiler. (I hope people still want to
build "simple" compilers, because I surely don't want to do it!)

All "remote references" aside, aren't things like "optimizing compilers" for
scientific computing and the like only now? I mean, I want to build my
program with multiple threads (!) (yes, and with C++!). Pretty risky once
you turn on the optimizations huh?

<Thoughts about "saving the preprocessor's life".... no wait, "giving it its
life back!", omitted>.
Speaking of terminology and personal preferences, I've always felt
that the standard could have given a name to the phases, rather than
just numbering them.
"The sole role of the phases in the standard is to define the
order in which the different actions take place. Numbers are
very good for defining order. Everyone knows that 1 comes
before 2, but it must be explicitly stated that character
mapping comes before line splicing."

Isn't that a programmer's dream (!): a sequential list of things to program.
(I admit, a bit boring, but fine work when the brain is only at half
capacity).
In that case, I'd see something like (off the top
of my head: don't focus too much on the names):
Character Mapping (1)
Line Splicing (2)
Pre-tokenization (3)
Preprocessing (4)
Execution Character Set Mapping (5)
Literal Concatenation (6)
Tokenization (7a)
Syntactical and Semantic Analysis (7b)
Translation (7c)
Instantiation (8)
Linking (9)
Damn, I'm learning too much about this stuff now and feel like I'm "going
backwards" again! :P (No worries though, I'm never going to write a
compiler!)
To sum it up, the original question is simply ill-posed. The
translation of a C++ program conceptually happens in phases, as
described in the standard. One may decide to call preprocessing some
specific sub-sequence, and compilation some other, but there's no such
official terminology. To some, a compiler is what performs phases from
(7a) to 8, included. A linker what performs (9). Others mean by
"compiler", or "translator", the executor of the whole translation.
"The traditional break (in C) has been: preprocessor: phases 1
through 6, compiler: phase 7, linker phase 9. (Phase 8 is
concerned with instantiating templates, and doesn't have a place
in traditional C.) If you're talking about passes, however,
most compilers today will use a single pass for everything
through your 7b, above, then up to three passes for 7c, and
linking remains separate. Where phase 8 fits in varies, but I
suspect that a lot of modern compilers cram it into the first
pass as well."

And if one wanted to get that kind of info formally, where would someone get
that? Certainly not the dragon compiler book (?). (Or would "one" just hire
you or your company to use that knowledge?)

John

May 30 '07 #15

P: n/a
On May 30, 6:22 am, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:

[It would help if you'd use standard Internet protocol
citations. It's hard to follow who says what in your
postings.]
"James Kanze" <james.ka...@gmail.com> wrote in message
[...]
All "remote references" aside, aren't things like "optimizing
compilers" for scientific computing and the like only now?
They're for anyone who needs the performance. Modern RISC
processors are designed so that you almost always need some
optimization.
I mean, I want to build my program with multiple threads (!)
(yes, and with C++!). Pretty risky once you turn on the
optimizations huh?
Not really. Even after optimizing, the compiler must respect
the appropriate guarantees. And not optimizing doesn't give you
any additional guarantees.

[...]
"The traditional break (in C) has been: preprocessor: phases 1
through 6, compiler: phase 7, linker phase 9. (Phase 8 is
concerned with instantiating templates, and doesn't have a place
in traditional C.) If you're talking about passes, however,
most compilers today will use a single pass for everything
through your 7b, above, then up to three passes for 7c, and
linking remains separate. Where phase 8 fits in varies, but I
suspect that a lot of modern compilers cram it into the first
pass as well."
And if one wanted to get that kind of info formally, where
would someone get that?
I don't know. The "traditional break" is just one of those
things you knew if you were programming under Unix back in the
1980's. What modern compilers do is a result of observation,
using a number of different modern compilers. (I seem to recall
having seen it actually mentioned in the documentation for g++,
but I could easily be wrong.)
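
For anyone who wants to see the phase split from inside a program, here is a minimal sketch, assuming any conforming C++ compiler. The macro names (VERSION, STRINGIZE, EXPAND_THEN_STRINGIZE) and the two functions are made up for illustration. Macro expansion and stringizing happen in phase 4, adjacent string literals are joined in phase 6, and only then does the compiler proper (phase 7) analyse the result.

```cpp
#include <cassert>
#include <cstring>

// Phase 4: directives are executed and macro invocations expanded.
#define VERSION 7
#define STRINGIZE(x) #x
#define EXPAND_THEN_STRINGIZE(x) STRINGIZE(x)

// Phase 6: the three adjacent string literals below are concatenated
// into one literal before phase 7 analyses the function.
const char* banner()
{
    return "phase " EXPAND_THEN_STRINGIZE(VERSION) " is the compiler";
}

// The # operator does not macro-expand its operand, so a single
// level of stringizing yields the macro's name, not its value.
const char* unexpanded()
{
    return STRINGIZE(VERSION);
}
```

With g++, the -E option stops after preprocessing and prints the expanded source, which is a convenient way to look at what the later phases are given.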
Certainly not the dragon compiler book (?). (Or would "one" just hire
you or your company to use that knowledge?)
What would a company want to do with that kind of knowledge?
It's really purely anecdotal, and certainly has no effect on
how you write C++ or develop programs. (That doesn't mean that
it is uninteresting. Just that it doesn't have any real
financial value, that a company would pay for.)

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

May 30 '07 #16

P: n/a

"James Kanze" <ja*********@gmail.com> wrote in message
news:11*********************@h2g2000hsg.googlegroups.com...
On May 30, 6:22 am, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:

" [It would help if you'd use standard Internet protocol
citations. It's hard to follow who says what in your
postings.]"

For some reason, my newsgroup reader doesn't insert the > symbols in certain
responses (on yours for instance!). I don't know why!
"James Kanze" <james.ka...@gmail.com> wrote in message
[...]
All "remote references" aside, aren't things like "optimizing
compilers" for scientific computing and the like only now?
"They're for anyone who needs the performance. Modern RISC
processors are designed so that you almost always need some
optimization."

I was suggesting that most programs don't really need it because processors
are so fast these days and the majority of programs aren't of the
"scientific-computing" kind.
I mean, I want to build my program with multiple threads (!)
(yes, and with C++!). Pretty risky once you turn on the
optimizations huh?
"Not really. Even after optimizing, the compiler must respect
the appropriate guarantees. And not optimizing doesn't give you
any additional guarantees."

Preventing some reordering of statements within blocks by not turning on
optimization is what I was thinking.

[...]
"The traditional break (in C) has been: preprocessor: phases 1
through 6, compiler: phase 7, linker phase 9. (Phase 8 is
concerned with instantiating templates, and doesn't have a place
in traditional C.) If you're talking about passes, however,
most compilers today will use a single pass for everything
through your 7b, above, then up to three passes for 7c, and
linking remains separate. Where phase 8 fits in varies, but I
suspect that a lot of modern compilers cram it into the first
pass as well."
And if one wanted to get that kind of info formally, where
would someone get that?
I don't know. The "traditional break" is just one of those
things you knew if you were programming under Unix back in the
1980's. What modern compilers do is a result of observation,
using a number of different modern compilers. (I seem to recall
having seen it actually mentioned in the documentation for g++,
but I could easily be wrong.)
Certainly not the dragon compiler book (?). (Or would "one" just hire
you or your company to use that knowledge?)
"What would a company want to do with that kind of knowledge?"

Build a compiler for a new language.

"It's really purely anecdotal, and certainly has no effect on
how you write C++ or develop programs. (That doesn't mean that
it is uninteresting. Just that it doesn't have any real
financial value, that a company would pay for.)"
Building a compiler for a new language?

John
Jun 1 '07 #17

P: n/a
On Jun 1, 10:44 pm, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
"James Kanze" <james.ka...@gmail.com> wrote in message
news:11*********************@h2g2000hsg.googlegroups.com...
On May 30, 6:22 am, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
" [It would help if you'd use standard Internet protocol
citations. It's hard to follow who says what in your
postings.]"
For some reason, my newsgroup reader doesn't insert the >
symbols in certain responses (on yours for instance!). I don't
know why!
Then it's time to change newsreaders. Even Google gets this
right.
"James Kanze" <james.ka...@gmail.com> wrote in message
[...]
All "remote references" aside, aren't things like "optimizing
compilers" for scientific computing and the like only now?
"They're for anyone who needs the performance. Modern RISC
processors are designed so that you almost always need some
optimization."
I was suggesting that most programs don't really need it
because processors are so fast these days and the majority of
programs aren't of the "scientific-computing" kind.
Most programs are IO bound, and of course, they won't use
optimization. But you don't have to be in scientific computing
to find exceptions.
I mean, I want to build my program with multiple threads (!)
(yes, and with C++!). Pretty risky once you turn on the
optimizations huh?
"Not really. Even after optimizing, the compiler must respect
the appropriate guarantees. And not optimizing doesn't give you
any additional guarantees."
Preventing some reordering of statements within blocks by not
turning on optimization is what I was thinking.
But you don't have any more guarantees. I'm most familiar with
the Posix environment, and Posix compliant compilers give the
guarantees you need, regardless of optimization. If you don't
want things like write order rearranged, you need special
hardware instructions---the compiler inserts these where needed
according to the guarantees it gives, and not otherwise.
Regardless of the level of optimization.
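
A minimal sketch of the kind of guarantee meant here. It is written with the C++11 thread library, which postdates this thread, on the assumption that std::mutex plays the same role as a Posix mutex; the names shared_counter, counter_mutex and run_workers are hypothetical. The lock and unlock operations are the points where the implementation has to emit whatever ordering instructions the hardware needs, and it has to do so whether or not the optimizer is on.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

long shared_counter = 0;    // ordinary, non-atomic data
std::mutex counter_mutex;   // every access to shared_counter holds this

// Starts `threads` workers that each increment the counter `iterations`
// times under the mutex, waits for them, and returns the final value.
long run_workers(int threads, int iterations)
{
    shared_counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([iterations] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<std::mutex> lock(counter_mutex);
                ++shared_counter;   // cannot be moved outside the lock
            }
        });
    }
    for (std::thread& worker : pool) {
        worker.join();
    }
    return shared_counter;
}
```

Calling run_workers(4, 10000) should give 40000 at any optimization level. Take the lock away and the program has a data race, optimized or not.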
[...]
"What would a company want to do with that kind of knowledge?"
Build a compiler for a new language.
If you want to write a compiler, you hire someone who knows
about compilers, and modern machine architecture. It is a
specialized field.
"It's really purely anecdotal, and certainly has no effect on
how you write C++ or develop programs. (That doesn't mean that
it is uninteresting. Just that it doesn't have any real
financial value, that a company would pay for.)"
Building a compiler for a new language?
I don't know many companies in that business:-). But seriously,
you don't hire beginners for that. You hire people who know
compilers.

--
James Kanze (Gabi Software) email: ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Jun 2 '07 #18

P: n/a
On Fri, 1 Jun 2007 15:44:35 -0500, JohnQ wrote:
" [It would help if you'd use standard Internet protocol
citations. It's hard to follow who says what in your
postings.]"

For some reason, my newsgroup reader doesn't insert the > symbols in certain
responses (on yours for instance!). I don't know why!
<OT>
You might want to google for "GNKSA" and "Outlook Express 6"
</OT>

--
Genny
Jun 2 '07 #19

P: n/a

"James Kanze" <ja*********@gmail.com> wrote in message
news:11*********************@q69g2000hsb.googlegroups.com...
On Jun 1, 10:44 pm, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
"James Kanze" <james.ka...@gmail.com> wrote in message
news:11*********************@h2g2000hsg.googlegroups.com...
On May 30, 6:22 am, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
" [It would help if you'd use standard Internet protocol
citations. It's hard to follow who says what in your
postings.]"
For some reason, my newsgroup reader doesn't insert the >
symbols in certain responses (on yours for instance!). I don't
know why!
"Then it's time to change newsreaders. Even Google gets this
right."

I'll keep an eye on it now, but I think it may only be happening
in response to your posts.
"James Kanze" <james.ka...@gmail.com> wrote in message
[...]
All "remote references" aside, aren't things like "optimizing
compilers" for scientific computing and the like only now?
"They're for anyone who needs the performance. Modern RISC
processors are designed so that you almost always need some
optimization."
I was suggesting that most programs don't really need it
because processors are so fast these days and the majority of
programs aren't of the "scientific-computing" kind.
"Most programs are IO bound, and of course, they won't use
optimization. But you don't have to be in scientific computing
to find exceptions."

So, optimization can be avoided is what you're saying.
I mean, I want to build my program with multiple threads (!)
(yes, and with C++!). Pretty risky once you turn on the
optimizations huh?
"Not really. Even after optimizing, the compiler must respect
the appropriate guarantees. And not optimizing doesn't give you
any additional guarantees."
Preventing some reordering of statements within blocks by not
turning on optimization is what I was thinking.
"But you don't have any more guarantees. I'm most familiar with
the Posix environment, and Posix compliant compilers give the
guarantees you need, regardless of optimization. If you don't
want things like write order rearranged, you need special
hardware instructions---the compiler inserts these where needed
according to the guarantees it gives, and not otherwise.
Regardless of the level of optimization."

That's the theory at least. Better safe than sorry. Turning on
optimization for testing whether funny things happen is a good
idea though.
[...]
"What would a company want to do with that kind of knowledge?"
Build a compiler for a new language.
"If you want to write a compiler, you hire someone who knows
about compilers, and modern machine architecture. It is a
specialized field."

Or learn it yourself if you are young enough.
"It's really purely anecdotal, and certainly has no effect on
how you write C++ or develop programs. (That doesn't mean that
it is uninteresting. Just that it doesn't have any real
financial value, that a company would pay for.)"
Building a compiler for a new language?
"I don't know many companies in that business:-). But seriously,
you don't hire beginners for that. You hire people who know
compilers."

Isn't it dependent on how complex the language is? Are front
ends or back ends harder to create? Don't both of those exist
premade already (EDG?).

John
Jun 5 '07 #20

P: n/a
On Jun 5, 7:34 pm, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
"James Kanze" <james.ka...@gmail.com> wrote in message
news:11*********************@q69g2000hsb.googlegroups.com...
On Jun 1, 10:44 pm, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
"James Kanze" <james.ka...@gmail.com> wrote in message
news:11*********************@h2g2000hsg.googlegroups.com...
On May 30, 6:22 am, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
" [It would help if you'd use standard Internet protocol
citations. It's hard to follow who says what in your
postings.]"
For some reason, my newsgroup reader doesn't insert the >
symbols in certain responses (on yours for instance!). I don't
know why!
"Then it's time to change newsreaders. Even Google gets this
right."
I'll keep an eye on it now, but I think it may only be happening
in response to your posts.
No one else seems to have that problem.
"James Kanze" <james.ka...@gmail.com> wrote in message
[...]
All "remote references" aside, aren't things like "optimizing
compilers" for scientific computing and the like only now?
"They're for anyone who needs the performance. Modern RISC
processors are designed so that you almost always need some
optimization."
I was suggesting that most programs don't really need it
because processors are so fast these days and the majority of
programs aren't of the "scientific-computing" kind.
"Most programs are IO bound, and of course, they won't use
optimization. But you don't have to be in scientific computing
to find exceptions."
So, optimization can be avoided is what you're saying.
Sure. In practice, you don't activate optimization unless you
need it. Many systems are IO bound, and there's no point in
being complicated when you don't have to be.
I mean, I want to build my program with multiple threads (!)
(yes, and with C++!). Pretty risky once you turn on the
optimizations huh?
"Not really. Even after optimizing, the compiler must respect
the appropriate guarantees. And not optimizing doesn't give you
any additional guarantees."
Preventing some reordering of statements within blocks by not
turning on optimization is what I was thinking.
"But you don't have any more guarantees. I'm most familiar with
the Posix environment, and Posix compliant compilers give the
guarantees you need, regardless of optimization. If you don't
want things like write order rearranged, you need special
hardware instructions---the compiler inserts these where needed
according to the guarantees it gives, and not otherwise.
Regardless of the level of optimization."
That's the theory at least.
In practice, as well.

Optimization is an additional complication for the compiler, and
so does increase the risk of error. But that's independent of
multi-threading or not---I don't think I've ever seen a case
where optimization caused a threading problem that wasn't
already there before.
Better safe than sorry. Turning on optimization for testing
whether funny things happen is a good idea though.
[...]
"It's really purely anecdotal, and certainly has no effect on
how you write C++ or develop programs. (That doesn't mean that
it is uninteresting. Just that it doesn't have any real
financial value, that a company would pay for.)"
Building a compiler for a new language?
"I don't know many companies in that business:-). But seriously,
you don't hire beginners for that. You hire people who know
compilers."
Isn't it dependent on how complex the language is?
Certainly. It takes about six man-months to develop a C
compiler front-end (without optimizer); I'd guess off hand that
it takes at least four times that for C++, maybe even a lot
more.
Are front ends or back ends harder to create?
They're different. A good optimizer can be extremely difficult.
The real difference, perhaps, is that having chosen your
language, you've chosen the complexity of the front end; the
complexity of what follows depends largely on how much you want
to optimize. Traditionally, you'd count on about two man-months
to develop a simple back-end, but on a modern pipelined
architecture, such a back-end is likely to have very poor
performance.
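
To make the front-end/back-end split concrete, here is a toy sketch, nowhere near the scale of a C compiler and not taken from any real one. The front end parses a small expression grammar into a tree, and the "simple back-end" walks that tree and emits text for an imaginary stack machine. All names (Node, Parser, emit, compile) and the push/add/mul instruction set are assumptions made up for the example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Shared between the two halves: a tiny syntax tree.
struct Node {
    char op;                         // '+', '*', or 'n' for a number
    int value;                       // meaningful only when op == 'n'
    std::unique_ptr<Node> lhs, rhs;  // null for numbers
};

// Front end: recursive descent over
//   expr := term ('+' term)*     term := number ('*' number)*
// No whitespace handling and no error reporting, to keep it short.
class Parser {
public:
    explicit Parser(std::string text) : src(std::move(text)), pos(0) {}

    std::unique_ptr<Node> parseExpr() {
        std::unique_ptr<Node> left = parseTerm();
        while (pos < src.size() && src[pos] == '+') {
            ++pos;
            std::unique_ptr<Node> right = parseTerm();
            left = binary('+', std::move(left), std::move(right));
        }
        return left;
    }

private:
    std::unique_ptr<Node> parseTerm() {
        std::unique_ptr<Node> left = parseNumber();
        while (pos < src.size() && src[pos] == '*') {
            ++pos;
            std::unique_ptr<Node> right = parseNumber();
            left = binary('*', std::move(left), std::move(right));
        }
        return left;
    }

    std::unique_ptr<Node> parseNumber() {
        int value = 0;
        while (pos < src.size()
               && std::isdigit(static_cast<unsigned char>(src[pos]))) {
            value = value * 10 + (src[pos] - '0');
            ++pos;
        }
        return std::unique_ptr<Node>(new Node{'n', value, nullptr, nullptr});
    }

    static std::unique_ptr<Node> binary(char op, std::unique_ptr<Node> l,
                                        std::unique_ptr<Node> r) {
        return std::unique_ptr<Node>(
            new Node{op, 0, std::move(l), std::move(r)});
    }

    std::string src;
    std::size_t pos;
};

// Back end: post-order walk emitting code for the imaginary stack machine.
std::string emit(const Node& n)
{
    if (n.op == 'n') {
        return "push " + std::to_string(n.value) + "\n";
    }
    return emit(*n.lhs) + emit(*n.rhs) + (n.op == '+' ? "add\n" : "mul\n");
}

// Driver: front end, then back end, with nothing in between.
std::string compile(const std::string& text)
{
    Parser parser(text);
    return emit(*parser.parseExpr());
}
```

For "2+3*4" this produces push 2, push 3, push 4, mul, add, one instruction per line. The language fixes how much work the Parser has to do; everything an optimizer would add sits between parseExpr and emit, and that part can grow without limit.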
Don't both of those exist premade already (EDG?).
EDG will sell you a front end for some languages, probably for a
lot, lot less than it would cost you to develop it yourself.
There are still reasons why some compilers don't use it,
however: g++, of course, because it is not GPL; Sun and
Microsoft, probably because they want to maintain more control
in house; other companies for perhaps other reasons.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Jun 6 '07 #21

P: n/a

"James Kanze" <ja*********@gmail.com> wrote in message
news:11*********************@q69g2000hsb.googlegroups.com...
On Jun 5, 7:34 pm, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
"James Kanze" <james.ka...@gmail.com> wrote in message
news:11*********************@q69g2000hsb.googlegroups.com...
On Jun 1, 10:44 pm, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
"James Kanze" <james.ka...@gmail.com> wrote in message
news:11*********************@h2g2000hsg.googlegroups.com...
On May 30, 6:22 am, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
" [It would help if you'd use standard Internet protocol
citations. It's hard to follow who says what in your
postings.]"
For some reason, my newsgroup reader doesn't insert the >
symbols in certain responses (on yours for instance!). I don't
know why!
"Then it's time to change newsreaders. Even Google gets this
right."
I'll keep an eye on it now, but I think it may only be happening
in response to your posts.
"No one else seems to have that problem."

Weird huh?
"James Kanze" <james.ka...@gmail.com> wrote in message
[...]
All "remote references" aside, aren't things like "optimizing
compilers" for scientific computing and the like only now?
"They're for anyone who needs the performance. Modern RISC
processors are designed so that you almost always need some
optimization."
I was suggesting that most programs don't really need it
because processors are so fast these days and the majority of
programs aren't of the "scientific-computing" kind.
"Most programs are IO bound, and of course, they won't use
optimization. But you don't have to be in scientific computing
to find exceptions."
So, optimization can be avoided is what you're saying.
"Sure. In practice, you don't activate optimization unless you
need it. Many systems are IO bound, and there's no point in
being complicated when you don't have to be."

Well that was part of _my_ point to begin with.
I mean, I want to build my program with multiple threads (!)
(yes, and with C++!). Pretty risky once you turn on the
optimizations huh?
"Not really. Even after optimizing, the compiler must respect
the appropriate guarantees. And not optimizing doesn't give you
any additional guarantees."
Preventing some reordering of statements within blocks by not
turning on optimization is what I was thinking.
"But you don't have any more guarantees. I'm most familiar with
the Posix environment, and Posix compliant compilers give the
guarantees you need, regardless of optimization. If you don't
want things like write order rearranged, you need special
hardware instructions---the compiler inserts these where needed
according to the guarantees it gives, and not otherwise.
Regardless of the level of optimization."
That's the theory at least.
"In practice, as well."

You're making the assumption that the additional complexity of
optimization added on top of the non-optimizing compiler does
not introduce any more potential for error (MT case or other).
I would err toward the risk-averse side OTOH.

"Optimization is an additional complication for the compiler, and
so does increase the risk of error. But that's independent of
multi-threading or not---I don't think I've ever seen a case
where optimization caused a threading problem that wasn't
already there before."

OK, you agree then, good. Same thought exactly.
Better safe than sorry. Turning on optimization for testing
whether funny things happen is a good idea though.
[...]
"It's really purely anecdotal, and certainly has no effect on
how you write C++ or develop programs. (That doesn't mean that
it is uninteresting. Just that it doesn't have any real
financial value, that a company would pay for.)"
Building a compiler for a new language?
"I don't know many companies in that business:-). But seriously,
you don't hire beginners for that. You hire people who know
compilers."
**********************

<QUOTE>
Isn't it dependent on how complex the language is?
Certainly. It takes about six man-months to develop a C
compiler front-end (without optimizer); I'd guess off hand that
it takes at least four times that for C++, maybe even a lot
more.
Are front ends or back ends harder to create?
They're different. A good optimizer can be extremely difficult.
The real difference, perhaps, is that having chosen your
language, you've chosen the complexity of the front end; the
complexity of what follows depends largely on how much you want
to optimize. Traditionally, you'd count on about two man-months
to develop a simple back-end, but on a modern pipelined
architecture, such a back-end is likely to have very poor
performance.
Don't both of those exist premade already (EDG?).
EDG will sell you a front end for some languages, probably for a
lot, lot less than it would cost you to develop it yourself.
There are still reasons why some compilers don't use it,
however: g++, of course, because it is not GPL; Sun and
Microsoft, probably because they want to maintain more control
in house; other companies for perhaps other reasons.
</QUOTE>

So, ... if I have a "simpler language" and don't need optimization,
my implementation costs will be WAY less (than C or C++,
especially the latter). :) The ideal would be to be able to feed
a grammar, or better yet, parameters, to a tool and have it spit out
the compiler: an end to "languages of great complexity"? There's
a research potential: "Investigation of Parametric Language
Specification and Implications on Compiler Generation". The
intermediate code for the purposes of the research could be C++
(actually, very controlled constructs and usage of C++).

John
Jun 6 '07 #22

P: n/a

"James Kanze" <ja*********@gmail.com> wrote in message
news:11*********************@q69g2000hsb.googlegroups.com...
On Jun 5, 7:34 pm, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
"James Kanze" <james.ka...@gmail.com> wrote in message
news:11*********************@q69g2000hsb.googlegroups.com...
On Jun 1, 10:44 pm, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
"James Kanze" <james.ka...@gmail.com> wrote in message
news:11*********************@h2g2000hsg.googlegroups.com...
On May 30, 6:22 am, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
" [It would help if you'd use standard Internet protocol
citations. It's hard to follow who says what in your
postings.]"
For some reason, my newsgroup reader doesn't insert the >
symbols in certain responses (on yours for instance!). I don't
know why!
"Then it's time to change newsreaders. Even Google gets this
right."
I'll keep an eye on it now, but I think it may only be happening
in response to your posts.
"No one else seems to have that problem."

I tried to find another post/poster in which OE 6 wouldn't put the '>'
symbol in front of the lines but I couldn't find any. Your posts are the
only ones that cause the anomaly. FYI.

John
Jun 7 '07 #23

P: n/a
On Jun 8, 12:56 am, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
"James Kanze" <james.ka...@gmail.com> wrote in message
news:11*********************@q69g2000hsb.googlegroups.com...
On Jun 5, 7:34 pm, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
"James Kanze" <james.ka...@gmail.com> wrote in message
news:11*********************@q69g2000hsb.googlegroups.com...
On Jun 1, 10:44 pm, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
"James Kanze" <james.ka...@gmail.com> wrote in message
news:11*********************@h2g2000hsg.googlegroups.com...
On May 30, 6:22 am, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
" [It would help if you'd use standard Internet protocol
citations. It's hard to follow who says what in your
postings.]"
For some reason, my newsgroup reader doesn't insert the >
symbols in certain responses (on yours for instance!). I don't
know why!
"Then it's time to change newsreaders. Even Google gets this
right."
I'll keep an eye on it now, but I think it may only be happening
in response to your posts.
"No one else seems to have that problem."
I tried to find another post/poster in which OE 6 wouldn't put the '>'
symbol in front of the lines but I couldn't find any. Your posts are the
only ones that cause the anomaly. FYI.
So what is non-conformant in my posts? If nothing, then you
probably have to change newsreaders.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Jun 8 '07 #24

P: n/a

"James Kanze" <ja*********@gmail.com> wrote in message
news:11**********************@k79g2000hse.googlegroups.com...
On Jun 8, 12:56 am, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
"James Kanze" <james.ka...@gmail.com> wrote in message
news:11*********************@q69g2000hsb.googlegroups.com...
On Jun 5, 7:34 pm, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
"James Kanze" <james.ka...@gmail.com> wrote in message
news:11*********************@q69g2000hsb.googlegroups.com...
On Jun 1, 10:44 pm, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
"James Kanze" <james.ka...@gmail.com> wrote in message
news:11*********************@h2g2000hsg.googlegroups.com...
On May 30, 6:22 am, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
" [It would help if you'd use standard Internet protocol
citations. It's hard to follow who says what in your
postings.]"
For some reason, my newsgroup reader doesn't insert the >
symbols in certain responses (on yours for instance!). I don't
know why!
"Then it's time to change newsreaders. Even Google gets this
right."
I'll keep an eye on it now, but I think it may only be happening
in response to your posts.
"No one else seems to have that problem."
I tried to find another post/poster in which OE 6 wouldn't put the '>'
symbol in front of the lines but I couldn't find any. Your posts are the
only ones that cause the anomaly. FYI.
"So what is non-conformant in my posts?"

I don't know. Maybe nothing. But obviously something is different about them
since I can't find anyone else's posts that cause the same anomaly. It
could be my OE 6 or the default settings. I dunno. But since it's only your
posts that are causing it, I tend to think you have some kind of "special
setup" on your end. Do you? I mean are you using something bizarre or do you
have something set up "specially"? From your post headers, I see the extended
ASCII character set is being used. That's a shot in the dark, but I can see
how that could maybe be an issue possibly. Why not use just 7-bit ASCII (is
it an option?)? I don't know what NNTP expects, but I do know that Windoze
has its own interpretation of "extended ASCII" (Latin-1).

"If nothing, then you probably have to change newsreaders."

You say that so casually. If I was a heavy user of USENET, I might seek out
another program. I used to use Gravity a loooong time ago, but I'm not going
to change up now because I use newsgroups so lightly right now.

If you wanna go back and forth on this issue, we should take it out of the
group bandwidth probably.

John
Jun 9 '07 #25

P: n/a
On Jun 9, 7:13 am, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
"James Kanze" <james.ka...@gmail.com> wrote in message
news:11**********************@k79g2000hse.googlegroups.com...
On Jun 8, 12:56 am, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
"James Kanze" <james.ka...@gmail.com> wrote in message
news:11*********************@q69g2000hsb.googlegroups.com...
On Jun 5, 7:34 pm, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
"James Kanze" <james.ka...@gmail.com> wrote in message
news:11*********************@q69g2000hsb.googlegroups.com...
On Jun 1, 10:44 pm, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
"James Kanze" <james.ka...@gmail.com> wrote in message
news:11*********************@h2g2000hsg.googlegroups.com...
On May 30, 6:22 am, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
" [It would help if you'd use standard Internet protocol
citations. It's hard to follow who says what in your
postings.]"
For some reason, my newsgroup reader doesn't insert the >
symbols in certain responses (on yours for instance!). I don't
know why!
"Then it's time to change newsreaders. Even Google gets this
right."
I'll keep an eye on it now, but I think it may only be happening
in response to your posts.
"No one else seems to have that problem."
I tried to find another post/poster in which OE 6 wouldn't put the '>'
symbol in front of the lines but I couldn't find any. Your posts are the
only ones that cause the anomaly. FYI.
"So what is non-conformant in my posts?"
I don't know. Maybe nothing. But obviously something is
different about them since I can't find anyone else's posts
that cause the same anomaly.
So what's different?
It could be my OE 6 or the default settings. I dunno. But
since it's only your posts that are causing it, I tend to
think you have some kind of "special setup" on your end. Do
you?
If you think that Google is a "special setup". (It is, in many
ways, but I don't think I'm the only one posting through
Google.)
I mean are you using something bizarre or do you have
something set up "specially"? From your post headers, I see the
extended ASCII character set is being used. That's a shot in
the dark, but I can see how that could maybe be an issue
possibly. Why not use just 7-bit ASCII (is it an option?)?
Because it doesn't have all of the characters I need. Check out
my .sig---there are accents in my address.

Note that from what little I can see, Google translates what I
send into UTF-8 anyway; what you see is NOT what I have on my
machines.
I don't know what NNTP expects, but I do know that Windoze
has its own interpretation of "extended ASCII" (Latin-1).
That should be irrelevant. There shouldn't be any special
characters in my headers, and normally, there aren't any in the
body of my post. I'll modify my usual .sig for this one, so
there won't be any special characters there, either, and we'll
see.
"If nothing, then you probably have to change newsreaders."
You say that so casually.
If it's not conformant, you shouldn't be using it. It's not like
you don't have a choice.
If I was a heavy user of USENET, I might seek out another
program. I used to use Gravity a loooong time ago, but I'm not
going to change up now because I use newsgroups so lightly
right now.
If you wanna go back and forth on this issue, we should take
it out of the group bandwidth probably.
Logically, yes, but since the problem only seems to appear when
you respond to my postings, it's a little difficult.

What you might try for starters is to send me an exact copy of
the last message you received from me (one that causes this
problem), including any headers. At least like that, I can see
if there is something unusual about what ends up under NNTP from
me (after Google has gotten through with it).

--
James Kanze (Gabi Software) email: ja*********@gmail.com
Conseils en informatique orientee objet/
Beratung in objektorientierter Datenverarbeitung
9 place Semard, 78210 St.-Cyr-l'Ecole, France, +33 (0)1 30 23 00 34

Jun 9 '07 #26

P: n/a
James Kanze wrote:
On Jun 9, 7:13 am, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
I don't know what NNTP expects, but I do know that Windoze
has its own interpretation of "extended ASCII" (Latin-1).

That should be irrelevant. There shouldn't be any special
characters in my headers, and normally, there aren't any in the
body of my post. I'll modify my usual .sig for this one, so
there won't be any special characters there, either, and we'll
see.
That's odd. The message I am just replying to has the header
Content-Type: text/plain; charset="iso-8859-1"
whereas the one I read immediately afterwards
<11**********************@c77g2000hse.googlegroups.com> contains
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
although it has accented characters in your .sig and I can read them
probably because iso-8859-15 is my default reading character set. I
wonder if this depends on the server I use for reading your posts as
well?
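
For reference, the two headers work together: with quoted-printable, a byte outside 7-bit ASCII travels as "=" followed by two hex digits, and the charset label says which character that byte stands for. A minimal sketch of the decoding side, assuming well-formed input with bare "\n" line ends; the function names hexValue and decodeQuotedPrintable are hypothetical and this is not any newsreader's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Value of one hexadecimal digit, or -1 if `c` is not a hex digit.
int hexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Turns "=XX" escapes back into single bytes and drops "=\n" soft line
// breaks. Anything that is not a valid escape is copied through unchanged.
std::string decodeQuotedPrintable(const std::string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '=' && i + 1 < in.size() && in[i + 1] == '\n') {
            ++i;  // soft line break: skip both characters
        } else if (in[i] == '=' && i + 2 < in.size()
                   && hexValue(in[i + 1]) >= 0 && hexValue(in[i + 2]) >= 0) {
            out += static_cast<char>(hexValue(in[i + 1]) * 16
                                     + hexValue(in[i + 2]));
            i += 2;  // skip the two hex digits
        } else {
            out += in[i];
        }
    }
    return out;
}
```

Decoding "S=E9mard" gives the byte 0xE9 between "S" and "mard". That byte is e-acute in ISO 8859-1 and ISO 8859-15, but it is not a valid us-ascii character at all, so a reader that trusts the us-ascii label has to guess.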

Ralf
Jun 10 '07 #27

P: n/a
On Jun 10, 12:45 pm, Ralf Goertz
<r_goe...@expires-2006-11-30.arcornews.de> wrote:
James Kanze wrote:
On Jun 9, 7:13 am, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
I don't know what NNTP expects, but I do know that Windoze
has its own interpretation of "extended ASCII" (Latin-1).
That should be irrelevant. There shouldn't be any special
characters in my headers, and normally, there aren't any in the
body of my post. I'll modify my usual .sig for this one, so
there won't be any special characters there, either, and we'll
see.
That's odd. The message I am just replying to has the header
Content-Type: text/plain; charset="iso-8859-1"
whereas the one I read immediately afterwards
<1181425516.429532.273...@c77g2000hse.googlegroups.com> contains
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
although it has accented characters in your .sig and I can read them
probably because iso-8859-15 is my default reading character set. I
wonder if this depends on the server I use for reading your posts as
well?
That is wierd. My local set-up is supposed to use ISO 8859-1.
The message you are replying to was sent from home; perhaps the
other was sent from work, and there is something different in
the way Firefox is set up there. I notice that in my local
environment here, LC_CTYPE is set to "en_US". I'll check what
I've got set at work---I thought I was using "C", so that file
names would be sorted normally.

--
James Kanze (Gabi Software) email: ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Jun 10 '07 #28

P: n/a
On 2007-06-10 04:16:00 -0700, James Kanze <ja*********@gmail.com> said:
> On Jun 10, 12:45 pm, Ralf Goertz
> <r_goe...@expires-2006-11-30.arcornews.de> wrote:
>> James Kanze wrote:
>>> On Jun 9, 7:13 am, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
>>> wrote:
>>>> I don't know what NNTP expects, but I do know that Windoze
>>>> has its own interpretation of "extended ASCII" (Latin-1).
>>> That should be irrelevant. There shouldn't be any special
>>> characters in my headers, and normally, there aren't any in the
>>> body of my post. I'll modify my usual .sig for this one, so
>>> there won't be any special characters there, either, and we'll
>>> see.
>> That's odd. The message I am just replying to has the header
>>> Content-Type: text/plain; charset="iso-8859-1"
>> whereas the one I read immediately afterwards
>> <1181425516.429532.273...@c77g2000hse.googlegroups.com> contains
>>> Content-Type: text/plain; charset="us-ascii"
>>> Content-Transfer-Encoding: quoted-printable
>> although it has accented characters in your .sig and I can read them
>> probably because iso-8859-15 is my default reading character set. I
>> wonder if this depends on the server I use for reading your posts as
>> well?
>
> That is weird. My local set-up is supposed to use ISO 8859-1.
> The message you are replying to was sent from home; perhaps the
> other was sent from work, and there is something different in
> the way Firefox is set up there. I notice that in my local
> environment here, LC_CTYPE is set to "en_US". I'll check what
> I've got set at work---I thought I was using "C", so that file
> names would be sorted normally.

FYI this message also had a 'Content-Type: text/plain;
charset="us-ascii"' header. Perhaps this is (yet another) bug in Google
Groups. *sigh*

--
Clark S. Cox III
cl*******@gmail.com

Jun 10 '07 #29

P: n/a
Ralf Goertz wrote:
> James Kanze wrote:
>> On Jun 9, 7:13 am, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
>> wrote:
>>> I don't know what NNTP expects, but I do know that Windoze
>>> has its own interpretation of "extended ASCII" (Latin-1).
>> That should be irrelevant. There shouldn't be any special
>> characters in my headers, and normally, there aren't any in the
>> body of my post. I'll modify my usual .sig for this one, so
>> there won't be any special characters there, either, and we'll
>> see.
>
> That's odd. The message I am just replying to has the header
>> Content-Type: text/plain; charset="iso-8859-1"
>
> whereas the one I read immediately afterwards
> <11**********************@c77g2000hse.googlegroups.com> contains
>> Content-Type: text/plain; charset="us-ascii"
>> Content-Transfer-Encoding: quoted-printable

As I recall, Google Groups uses quoted-printable in all replies. That's
been an ongoing concern, especially when coupled with some versions of
Outlook Express having a bug that doesn't quote those messages
correctly (leaves out the >).
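
For anyone unsure what quoted-printable actually does to a message body, here is a minimal Python sketch. It is only an illustration and not what Google Groups runs; the variable names are made up, and the sample text is one line of the .sig discussed in this thread. It assumes the body is ISO 8859-1, as in the headers quoted above.

```python
import quopri

# One line of the signature discussed in this thread (accented characters).
sig = "9 place Sémard, 78210 St.-Cyr-l'École, France"

# Step 1: the charset (here ISO 8859-1) maps characters to bytes.
latin1_bytes = sig.encode("iso-8859-1")

# Step 2: quoted-printable rewrites every non-ASCII byte as =XX (hex),
# so the result is pure 7-bit ASCII and survives 7-bit transports.
qp_bytes = quopri.encodestring(latin1_bytes)
print(qp_bytes.decode("ascii"))

# A conforming reader undoes both steps, in reverse order.
decoded = quopri.decodestring(qp_bytes).decode("iso-8859-1")
print(decoded == sig)
```

The point is that the charset and the transfer encoding are two separate layers, so a reader that mishandles quoted-printable can go wrong even when the declared charset is correct.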


Brian
Jun 10 '07 #30

P: n/a
Headers from the post that works, followed by headers from a post that
causes the anomaly.

Here are the headers from the post that works:

Path: newsdbm02.news.prodigy.net!newsdst02.news.prodigy.net!prodigy.com!newscon02.news.prodigy.net!prodigy.net!border1.nntp.dca.giganews.com!nntp.giganews.com!postnews.google.com!p47g2000hsd.googlegroups.com!not-for-mail
From: James Kanze <ja*********@gmail.com>
Newsgroups: comp.lang.c++
Subject: Re: The preprocessor is just a pass (OT: Problem responding to your
 posts)
Date: Sat, 09 Jun 2007 19:37:47 -0000
Organization: http://groups.google.com
Lines: 103
Message-ID: <11**********************@p47g2000hsd.googlegroups.com>
References: <46**********************@roadrunner.com>
 <el********************************@4ax.com>
 <11*********************@m36g2000hse.googlegroups.com>
 <he********************************@4ax.com>
 <11*********************@q69g2000hsb.googlegroups.com>
 <k_*****************@newssvr22.news.prodigy.net>
 <11*********************@h2g2000hsg.googlegroups.com>
 <Ly****************@newssvr23.news.prodigy.net>
 <11*********************@q69g2000hsb.googlegroups.com>
 <n8******************@newssvr23.news.prodigy.net>
 <11*********************@q69g2000hsb.googlegroups.com>
 <62******************@newssvr11.news.prodigy.net>
 <11**********************@k79g2000hse.googlegroups.com>
 <_E******************@newssvr19.news.prodigy.net>
NNTP-Posting-Host: 86.70.189.206
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
X-Trace: posting.google.com 1181417867 4781 127.0.0.1 (9 Jun 2007 19:37:47
 GMT)
X-Complaints-To: gr**********@google.com
NNTP-Posting-Date: Sat, 9 Jun 2007 19:37:47 +0000 (UTC)
In-Reply-To: <_E******************@newssvr19.news.prodigy.net>
User-Agent: G2/1.0
X-HTTP-UserAgent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US;
 rv:1.8.1.1) Gecko/20061208 Firefox/2.0.0.1,gzip(gfe),gzip(gfe)
Complaints-To: gr**********@google.com
Injection-Info: p47g2000hsd.googlegroups.com; posting-host=86.70.189.206;
 posting-account=uN4QgA0AAAC_qk3WofNKjyjXNSBMXL2b
X-Received-Date: Sat, 09 Jun 2007 15:37:50 EDT (newsdbm02.news.prodigy.net)

Here are the headers from a post that causes the problem:

Path: newsdbm02.news.prodigy.net!newsdst02.news.prodigy.net!prodigy.com!newscon02.news.prodigy.net!prodigy.net!border1.nntp.dca.giganews.com!nntp.giganews.com!postnews.google.com!k79g2000hse.googlegroups.com!not-for-mail
From: James Kanze <ja*********@gmail.com>
Newsgroups: comp.lang.c++
Subject: Re: The preprocessor is just a pass (OT: Problem responding to your
 posts)
Date: Fri, 08 Jun 2007 09:25:45 -0000
Organization: http://groups.google.com
Lines: 42
Message-ID: <11**********************@k79g2000hse.googlegroups.com>
References: <46**********************@roadrunner.com>
 <el********************************@4ax.com>
 <11*********************@m36g2000hse.googlegroups.com>
 <he********************************@4ax.com>
 <11*********************@q69g2000hsb.googlegroups.com>
 <k_*****************@newssvr22.news.prodigy.net>
 <11*********************@h2g2000hsg.googlegroups.com>
 <Ly****************@newssvr23.news.prodigy.net>
 <11*********************@q69g2000hsb.googlegroups.com>
 <n8******************@newssvr23.news.prodigy.net>
 <11*********************@q69g2000hsb.googlegroups.com>
 <62******************@newssvr11.news.prodigy.net>
NNTP-Posting-Host: 62.160.54.162
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
X-Trace: posting.google.com 1181294746 26013 127.0.0.1 (8 Jun 2007 09:25:46
 GMT)
X-Complaints-To: gr**********@google.com
NNTP-Posting-Date: Fri, 8 Jun 2007 09:25:46 +0000 (UTC)
In-Reply-To: <62******************@newssvr11.news.prodigy.net>
User-Agent: G2/1.0
X-HTTP-UserAgent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.4)
 Gecko/20070515 Firefox/2.0.0.4,gzip(gfe),gzip(gfe)
Complaints-To: gr**********@google.com
Injection-Info: k79g2000hse.googlegroups.com; posting-host=62.160.54.162;
 posting-account=uN4QgA0AAAC_qk3WofNKjyjXNSBMXL2b
X-Received-Date: Fri, 08 Jun 2007 05:25:46 EDT (newsdbm02.news.prodigy.net)
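
Comparing the two dumps by eye is tedious, so here is a rough Python sketch of how one might diff just the MIME-related headers. The two strings are abbreviated copies of the headers above (only the Mime-Version and Content-* lines are kept), and the helper name mime_headers is made up for this example. Whether the extra header is really what trips up OE 6 is still only a guess.

```python
from email import message_from_string

# Abbreviated copies of the two header sets above (MIME-related lines only).
working = (
    "Mime-Version: 1.0\n"
    'Content-Type: text/plain; charset="iso-8859-1"\n'
    "\n"
)
problem = (
    "Mime-Version: 1.0\n"
    'Content-Type: text/plain; charset="iso-8859-1"\n'
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
)

def mime_headers(raw):
    """Return the Mime-*/Content-* headers of a raw message as a dict."""
    msg = message_from_string(raw)
    return {
        name.lower(): value
        for name, value in msg.items()
        if name.lower().startswith(("mime-", "content-"))
    }

a = mime_headers(working)
b = mime_headers(problem)

# Headers present only in the problem post, and headers whose values differ.
only_in_problem = {k: v for k, v in b.items() if k not in a}
changed = {k for k in a if k in b and a[k] != b[k]}

print(only_in_problem)
print(changed)
```

On these two posts the charset is the same in both; the only MIME-level difference is the Content-Transfer-Encoding: quoted-printable line in the post that causes the problem.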
Jun 10 '07 #31

P: n/a
Yep! This post works.

John

"James Kanze" <ja*********@gmail.com> wrote in message
news:11**********************@p47g2000hsd.googlegroups.com...
On Jun 9, 7:13 am, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
>"James Kanze" <james.ka...@gmail.com> wrote in message
>news:11**********************@k79g2000hse.googlegroups.com...
On Jun 8, 12:56 am, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
"James Kanze" <james.ka...@gmail.com> wrote in message
news:11*********************@q69g2000hsb.googlegroups.com...
On Jun 5, 7:34 pm, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
"James Kanze" <james.ka...@gmail.com> wrote in message
news:11*********************@q69g2000hsb.googlegroups.com...
On Jun 1, 10:44 pm, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
"James Kanze" <james.ka...@gmail.com> wrote in message
news:11*********************@h2g2000hsg.googlegroups.com...
On May 30, 6:22 am, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
" [It would help if you'd use standard Internet protocol
citations. It's hard to follow who says what in your
postings.]"
For some reason, my newsgroup reader doesn't insert the >
symbols in certain responses (on yours for instance!). I don't
know why!
"Then it's time to change newsreaders. Even Google gets this
right."
I'll keep an eye on it now, but I think it may only be happening
in response to your posts.
"No one else seems to have that problem."
I tried to find another post/poster in which OE 6 wouldn't put the '>'
symbol in front of the lines but I couldn't find any. Your posts are
the only ones that cause the anomaly. FYI.
>"So what is non-conformant in my posts?"
>I don't know. Maybe nothing. But obviously something is
different about them since I can't find anyone else's posts
that causes the same anomaly.

So what's different?
>It could be my OE 6 or the default settings. I dunno. But
since it's only your posts that are causing it, I tend to
think you have some kind of "special setup" on your end. Do
you?

If you think that Google is a "special setup". (It is, in many
ways, but I don't think I'm the only one posting through
Google.)
>I mean are you using something bizarre or do you have
something setup "specially"? From your post headers, I see the
extended ASCII character set is being used. That's a shot in
the dark, but I can see how that could maybe be an issue
possibly. Why not use just 7-bit ASCII (is it an option?)?

Because it doesn't have all of the characters I need. Check out
my .sig---there are accents in my address.

Note that from what little I can see, Google translates what I
send into UTF-8 anyway; what you see is NOT what I have on my
machines.
>I don't know what NNTP expects, but I do know that Windoze
has its own interpretation of "extended ASCII" (Latin-1).

That should be irrelevant. There shouldn't be any special
characters in my headers, and normally, there aren't any in the
body of my post. I'll modify my usual .sig for this one, so
there won't be any special characters there, either, and we'll
see.
>"If nothing, then you probably have to change newsreaders."
>You say that so casually.

If it's not conformant, you shouldn't be using it. It's not like
you don't have a choice.
>If I was a heavy user of USENET, I might seek out another
program. I used to use Gravity a loooong time ago, but I'm not
going to change up now because I use newsgroups so lightly
right now.
>If you wanna go back and forth on this issue, we should take
it out of the group bandwidth probably.

Logically, yes, but since the problem only seems to appear when
you respond to my postings, it's a little difficult.

What you might try for starters is to send me an exact copy of
the last message you received from me (one that causes this
problem), including any headers. At least like that, I can see
if there is something unusual about what ends up under NNTP from
me (after Google has gotten through with it).

--
James Kanze (Gabi Software) email: ja*********@gmail.com
Conseils en informatique orientee objet/
Beratung in objektorientierter Datenverarbeitung
9 place Semard, 78210 St.-Cyr-l'Ecole, France, +33 (0)1 30 23 00 34

Jun 11 '07 #32
