roberson@ibd.nrc-cnrc.gc.ca (Walter Roberson) writes:
> In article <1126630993.723249.185830@g49g2000cwa.googlegroups.com>,
> Lucas Zimmerman <netbogus@gmail.com> wrote:
>>I tried to compile the following code with gcc:
>>------
>>#include <stdio.h>
>>@
>>
>>int main(void) {
>> return 0;
>>}
>>-------
>
>>the output was:
>>t.c:2: error: syntax error at '@' token
>
>>My question then is: why gcc says `syntax error'?
>
> Why not?
>
>>I'm not
>>sure what is happening here but I think the lexical analyzer
>>is passing '@' as a valid token to the parser and then parser
>>says `ok, I'm not expecting a @ so, syntax error'.
>
>>am I missing something? I thought lex would be responsible
>>for giving this error message since '@' is (AFAIC) not a valid
>>C token.
>
> It appears to me that you are assuming that the program 'lex' is
> being used to do lexical analysis, and that the result is passed
> to gcc. gcc does not, however, use 'lex': it has its own built-in
> lexical analyzer as -part- of its processing. gcc doesn't even
> have a seperate preprocessing program (e.g., "cpp"): it does
> everything up to an intermediate code representation in a single
> unified program. There might be a bunch of different routines
> that that unified program calls upon, but that part is all one
> program, so all the error messages are going to appear to be
> from the same program.

Or perhaps he was using "lex" as an abbreviation of "lexical
analyzer". (In any case, the "lex" program *generates* a lexical
analyzer.)

Some versions of gcc do use a separate preprocessor. For example,
"gcc -v" with version 2.95.2 shows that it invokes "cpp" followed by
"cc1". Later versions just invoke "cc1". (Later phases aren't
invoked if there's a failure in an earlier phase.)

This is off-topic, except that it illustrates that a compiler has a
lot of freedom in how it implements the translation phases described
in section 5.1.1.2 of the standard.

With gcc versions 3.4.4 and 4.0.0, the error message I get is
"error: stray '@' in program".

Also, note that a lone @ character *is* a valid preprocessor token,
though it isn't a valid token. This means that this:

    #if 0
    @
    #endif
    int main(void){}

is a legal program, but this:

    #if 0
    "
    #endif
    int main(void){}

isn't (it invokes undefined behavior).

The point of all this is that, although the standard defines 8
distinct translation phases, an implementation is not required to
implement them as separate sequential phases. As long as it processes
legal programs correctly and issues diagnostics where required, it can
do whatever it likes.
--
Keith Thompson (The_Other_Keith)
kst-u@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.