How to remove // comments

jacob navia

Recently, a heated debate started because poor Mr Heathfield
was unable to compile a program with // comments.

Here is a utility for him, so that he can (at last) compile my
programs :-)

More seriously, this code takes 560 bytes. Amazing, isn't it? C is very
compact, you can do great things in a few bytes.

Obviously I have avoided here, in consideration for his pedantic
compiler flags, any C99 issues, so it will compile in obsolete
compilers, and with only ~600 bytes you can run it in the toaster!

--------------------------------------------------------------cut here

/* This program reads a C source file and writes it modified to stdout
All // comments will be replaced by /* ... */ comments, to easy the
porting to old environments or to post it in usenet, where
// comments can be broken in several lines, and messed up.
*/

#include <stdio.h>

/* This function reads a character and writes it to stdout */
static int Fgetc(FILE *f)
{
int c = fgetc(f);
if (c != EOF)
putchar(c);
return c;
}

/* This function skips strings */
static int ParseString(FILE *f)
{
int c = Fgetc(f);
while (c != EOF && c != '"') {
if (c == '\\')
c = Fgetc(f);
if (c != EOF)
c = Fgetc(f);
}
if (c == '"')
c = Fgetc(f);
return c;
}
/* Skips multi-line comments */
static int ParseComment(FILE *f)
{
int c = Fgetc(f);

while (1) {
while (c != '*') {
c = Fgetc(f);
if (c == EOF)
return EOF;
}
c = Fgetc(f);
if (c == '/')
break;
}
return Fgetc(f);
}

/* Skips // comments. Note that we use fgetc here and NOT Fgetc */
/* since we want to modify the output before gets echoed */
static int ParseCppComment(FILE *f)
{
int c = fgetc(f);

while (c != EOF && c != '\n') {
putchar(c);
c = fgetc(f);
}
if (c == '\n') {
puts(" */");
c = Fgetc(f);
}
return c;
}

/* Checks if a comment is followed after a '/' char */
static int CheckComment(int c,FILE *f)
{
if (c == '/') {
c = fgetc(f);
if (c == '*') {
putchar('*');
c = ParseComment(f);
}
else if (c == '/') {
putchar('*');
c = ParseCppComment(f);
}
else {
putchar(c);
c = Fgetc(f);
}
}
return c;
}

/* Skips chars between simple quotes */
static int ParseQuotedChar(FILE *f)
{
int c = Fgetc(f);
while (c != EOF && c != '\'') {
if (c == '\\')
c = Fgetc(f);
if (c != EOF)
c = Fgetc(f);
}
if (c == '\'')
c = Fgetc(f);
return c;
}
int main(int argc,char *argv[])
{
FILE *f;
int c;
if (argc == 1) {
fprintf(stderr,"Usage: %s <file.c>\n",argv[0]);
return EXIT_FAILURE;
}
f = fopen(argv[1],"r");
if (f == NULL) {
fprintf(stderr,"Can't find %s\n",argv[1]);
return EXIT_FAILURE;
}
c = Fgetc(f);
while (c != EOF) {
/* Note that each of the switches must advance the character */
/* read so that we avoid an infinite loop. */
switch (c) {
case '"':
c = ParseString(f);
break;
case '/':
c = CheckComment(c,f);
break;
case '\'':
c = ParseQuotedChar(f);
break;
default:
c = Fgetc(f);
}
}
fclose(f);
return 0;
}

Oct 19 '06 #1

Richard Heathfield

jacob navia said:

Recently, a heated debate started because of poor mr heathfield
was unable to compile a program with // comments.

Not so. It's not difficult to compile a program with // "comments" under
gcc. All I have to do is invoke gcc in non-conforming mode, thus foregoing
opportunities for useful diagnostic messages - something I'm not prepared
to do lightly.

Here is a utility for him, so that he can (at last) compile my
programs :-)

Alas, not yet. You see, the utility itself won't compile:

foo.c: In function `main':
foo.c:104: `EXIT_FAILURE' undeclared (first use in this function)
foo.c:104: (Each undeclared identifier is reported only once
foo.c:104: for each function it appears in.)
make: *** [foo.o] Error 1

Sometimes, words fail me.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)

Oct 20 '06 #2

Peter Nilsson

jacob navia wrote:

... poor mr heathfield ... Here is a utility for him ...

Tediously childish.

--------------------------------------------------------------cut here

/* This program reads a C source file and writes it modified to stdout
All // comments will be replaced by /* ... */ comments, to easy the

Perhaps you should write a utility that also fixes nested comments that
are not allowed by C90 or C99.

porting to old environments or to post it in usenet, where
// comments can be broken in several lines, and messed up.
*/

I'm sure there are alternative one line perl scripts floating around.

#include <stdio.h>

Does this header define the identifier EXIT_FAILURE which you use
further on? If so, your implementation is not conforming.

<snip>

Some test cases for you to consider...

int c = a //* ... */
b;
int d = '??''; // this is a // comment, is it translated?

--
Peter

Oct 20 '06 #3

Old Wolf

jacob navia wrote:

Recently, a heated debate started because of poor mr heathfield
was unable to compile a program with // comments.

Here is a utility for him, so that he can (at last) compile my
programs :-)

Hey, thanks :) One of the things on my TODO list was to
write such a utility, so I can compile a large project in ANSI
conformance mode and see if the compiler throws up any
errors. The project source is conforming (afaik!) except for the
use of // comments.

Oct 20 '06 #4

Richard Heathfield

Peter Nilsson said:

<snip>

Some test cases for you to consider...

int c = a //* ... */
b;
int d = '??''; // this is a // comment, is it translated?

After I hacked the code to get it to compile, it failed both those tests,
and it also failed the following two tests:

/\
/ this is a BCPL-style comment

and

// /* Comment */

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)

Oct 20 '06 #5

Keith Thompson

jacob navia <ja***@jacob.remcomp.fr> writes:
[...]

Obviously I have avoided here, in consideration for his pedantic
compiler flags, any C99 issues,

Yes, you have.

so it will compile in obsolete
compilers,

No, it won't.

[...]

You *really* *really* need to try compiling your code before you post
it.

If whatever compiler you used actually accepted the code you posted,
then it's buggy.

You might also consider not acting as if portability is some horrible
burden being imposed on you personally, rather than just a good idea.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Oct 20 '06 #6

Walter Bright

Peter Nilsson wrote:

Some test cases for you to consider...

int c = a //* ... */
b;
int d = '??''; // this is a // comment, is it translated?

A trigraph case:

char* d = "??/""; // "

but of course I've never seen trigraphs outside of a test suite.

There's also backslash line splicing:

// this is the start of a comment \
that continues on this line

Oct 20 '06 #7

jacob navia

Walter Bright wrote:

Peter Nilsson wrote:

>Some test cases for you to consider...

int c = a //* ... */
b;
int d = '??''; // this is a // comment, is it translated?

A trigraph case:

char* d = "??/""; // "

but of course I've never seen trigraphs outside of a test suite.

Me neither. But I do not support trigraphs anyway. They are an
unnecessary feature. We had several lengthy discussions about this in
comp.std.c.

There's also backslash line splicing:

// this is the start of a comment \
that continues on this line

Yes, I added that one.

Oct 20 '06 #8

Ben Bacarisse

Richard Heathfield <in*****@invalid.invalid> writes:

jacob navia said:

>Recently, a heated debate started because of poor mr heathfield
was unable to compile a program with // comments.

Not so. It's not difficult to compile a program with // "comments" under
gcc. All I have to do is invoke gcc in non-conforming mode, thus foregoing
opportunities for useful diagnostic messages - something I'm not prepared
to do lightly.

>Here is a utility for him, so that he can (at last) compile my
programs :-)

Alas, not yet. You see, the utility itself won't compile:

foo.c: In function `main':
foo.c:104: `EXIT_FAILURE' undeclared (first use in this function)
foo.c:104: (Each undeclared identifier is reported only once
foo.c:104: for each function it appears in.)
make: *** [foo.o] Error 1

Sometimes, words fail me.

I think there is a deeper irony. Did you relax your compiler options to get
this far? If so, it allowed the non-standard nested comment to pass.
I get a syntax error at the word "easy".

A program to correct non-C89 comments relies on an extension to the
comment syntax to compile!

--
Ben.

Oct 20 '06 #9

Ben Bacarisse

jacob navia <ja***@jacob.remcomp.fr> writes:

Recently, a heated debate started because of poor mr heathfield
was unable to compile a program with // comments.

Here is a utility for him, so that he can (at last) compile my
programs :-)

More seriously, this code takes 560 bytes. Amazing isn't it? C is very
compact, you can do great things in a few bytes.

You can do *some* things in 560 bytes. Great things, it seems, need a
few more. You need to:

(a) include <stdlib.h>
(b) remove the nested comment.
(c) fix the logic bugs.

Given the admittedly clumsy but valid:

return 1//* what divisor? */2;

your program produces the invalid:

return 1/** what divisor? */2; */

And on the more plausible:

// I don't like */ delimiters

we get:

/* I don't like */ delimiters */

--
Ben.

Oct 20 '06 #10

jacob navia

Ben Bacarisse wrote:

jacob navia <ja***@jacob.remcomp.fr> writes:

>>Recently, a heated debate started because of poor mr heathfield
was unable to compile a program with // comments.

Here is a utility for him, so that he can (at last) compile my
programs :-)

More seriously, this code takes 560 bytes. Amazing isn't it? C is very
compact, you can do great things in a few bytes.

You can do *some* things in 560 bytes. Great things, it seems need a
few more. You need to:

(a) include <stdlib.h>

yeah

(b) remove the nested comment.

yes

(c) fix the logic bugs.

On given the admittedly clumsy but valid:

return 1//* what divisor? */2;

This is the same as finding a spurious */ in
a cpp comment. Corrected

you program produces the invalid:

return 1/** what divisor? */2; */

And on the more plausible:

// I don't like */ delimiters

we get:

/* I don't like */ delimiters */

!!!!

Corrected, thanks

Oct 20 '06 #11

Walter Bright

jacob navia wrote:

Walter Bright wrote:
>A trigraph case:

char* d = "??/""; // "

but of course I've never seen trigraphs outside of a test suite.

Me neither. But I do not support trigraphs anyway. They are an
unnecessary feature. We had several lengthy discussions about this in
comp.std.c.

Trigraphs are a worthless feature. Nevertheless, they are in the
standard, and it's much less effort to implement them than it is to
constantly have to justify otherwise.

Aside from such trivial defects, overall the C standard was a vast
improvement over existing practice at the time: having multiple compiler
switches to be quirk-compatible with this or that dialect.

Oct 20 '06 #12

jacob navia

I edited the code before posting, without recompiling it.
Big mistake. I added a nested comment, and when replacing the
EXIT_FAILURE because of Keith's remarks in another thread
I forgot to add the stdlib.h include.

Besides, I have fixed the few logic bugs pointed out by you:
1) continuation lines that become comments
e.g. /\
/ comment
will become
/*
comment */
2) If a sequence */ is found in a cpp comment it will be replaced by
* /, i.e. a blank will be inserted. There is no other way to do that.
3) Trigraphs are NOT supported.
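
Point 2 above can be illustrated in isolation. The following is only a sketch: convert_line_comment is a hypothetical helper that works on a string holding the text after the two slashes, whereas the posted program streams characters from a FILE. It assumes the caller supplies a large enough buffer.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of the posted program: rewrites the text of
   one line comment (everything after the two slashes) as a block comment.
   A blank is inserted between a star and a following slash so that the
   pair cannot close the new comment early.
   out must have room for 2 * strlen(body) + 6 bytes. */
static void convert_line_comment(const char *body, char *out)
{
    char last = '\0';              /* previously copied character */
    assert(body != NULL && out != NULL);
    *out++ = '/';
    *out++ = '*';
    for (; *body != '\0'; body++) {
        if (*body == '/' && last == '*')
            *out++ = ' ';          /* split the would-be terminator */
        *out++ = *body;
        last = *body;
    }
    strcpy(out, " */");            /* close the block comment */
}
```

Given the comment text from Ben Bacarisse's test case, " I don't like */ delimiters", the helper should produce a single well-formed block comment with "* /" in the middle.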

Thanks to all people that participated. Updated program below.
----------------------------------------------------cut here
#include <stdio.h>
#include <stdlib.h>
/* This function reads a character and writes it to stdout */
static int Fgetc(FILE *f)
{
int c = fgetc(f);
if (c != EOF)
putchar(c);
return c;
}

/* Skips strings */
static int ParseString(FILE *f)
{
int c = Fgetc(f);
while (c != EOF && c != '"') {
if (c == '\\')
c = Fgetc(f);
if (c != EOF)
c = Fgetc(f);
}
if (c == '"')
c = Fgetc(f);
return c;
}
/* Skips multi-line comments */
static int ParseComment(FILE *f)
{
int c = Fgetc(f);

while (1) {
while (c != '*') {
c = Fgetc(f);
if (c == EOF)
return EOF;
}
c = Fgetc(f);
if (c == '/')
break;
}
return Fgetc(f);
}

/* Skips // comments */
static int ParseCppComment(FILE *f)
{
int c = fgetc(f);

while (c != EOF && c != '\n') {
int last;
putchar(c);
last = c;
c = fgetc(f);
if (c == '/' && last == '*')
putchar(' ');
}
if (c == '\n') {
puts(" */");
c = Fgetc(f);
}
return c;
}

/* Checks if a comment is followed after a '/' char */
static int CheckComment(int c,FILE *f)
{
c = fgetc(f);
if (c == '*') {
putchar('*');
c = ParseComment(f);
}
else if (c == '/') {
putchar('*');
c = ParseCppComment(f);
}
else if (c == '\\') {
c = fgetc(f);
if (c == '\n') {
c = fgetc(f);
if (c == '/') {
printf("*\n");
c = ParseCppComment(f);
}
else printf("\\\n%c",c);
}
else {
putchar('\\');
putchar(c);
}
}
else {
putchar(c);
c = Fgetc(f);
}
return c;
}

/* Skips chars between simple quotes */
static int ParseQuotedChar(FILE *f)
{
int c = Fgetc(f);
while (c != EOF && c != '\'') {
if (c == '\\')
c = Fgetc(f);
if (c != EOF)
c = Fgetc(f);
}
if (c == '\'')
c = Fgetc(f);
return c;
}
int main(int argc,char *argv[])
{
FILE *f;
int c;
if (argc == 1) {
fprintf(stderr,"Usage: %s <file.c>\n",argv[0]);
return EXIT_FAILURE;
}
f = fopen(argv[1],"r");
if (f == NULL) {
fprintf(stderr,"Can't find %s\n",argv[1]);
return EXIT_FAILURE;
}
c = Fgetc(f);
while (c != EOF) {
/* Note that each of the switches must advance the character */
/* read so that we avoid an infinite loop. */
switch (c) {
case '"':
c = ParseString(f);
break;
case '/':
c = CheckComment(c,f);
break;
case '\'':
c = ParseQuotedChar(f);
break;
default:
c = Fgetc(f);
}
}
fclose(f);
return 0;
}

Oct 20 '06 #13

Ben Bacarisse

jacob navia <ja***@jacob.remcomp.fr> writes:

Ben Bacarisse wrote:
>jacob navia <ja***@jacob.remcomp.fr> writes:

>>>Recently, a heated debate started because of poor mr heathfield
was unable to compile a program with // comments.

Here is a utility for him, so that he can (at last) compile my
programs :-)

More seriously, this code takes 560 bytes. Amazing isn't it? C is very
compact, you can do great things in a few bytes.
You can do *some* things in 560 bytes. Great things, it seems need a
few more. You need to:
(a) include <stdlib.h>
yeah

>(b) remove the nested comment.
yes

>(c) fix the logic bugs.
On given the admittedly clumsy but valid:
return 1//* what divisor? */2;

This is the same as finding a spurious */ in
a cpp comment.

I don't think so. The point of this test case is that it does not
*have* a CPP comment at all.

Corrected

No. Your new version produces:

return 1/** what divisor? * /2; */

(which is not a valid statement) from

return 1//* what divisor? */2;

which is, I think, a valid way to write return 1/2;

--
Ben.

Oct 20 '06 #14

jacob navia

Ben Bacarisse wrote:

jacob navia <ja***@jacob.remcomp.fr> writes:

>>Ben Bacarisse wrote:

>>>jacob navia <ja***@jacob.remcomp.fr> writes:
Recently, a heated debate started because of poor mr heathfield
was unable to compile a program with // comments.

Here is a utility for him, so that he can (at last) compile my
programs :-)

More seriously, this code takes 560 bytes. Amazing isn't it? C is very
compact, you can do great things in a few bytes.

You can do *some* things in 560 bytes. Great things, it seems need a
few more. You need to:
(a) include <stdlib.h>

yeah

>>>(b) remove the nested comment.

yes

>>>(c) fix the logic bugs.
On given the admittedly clumsy but valid:
return 1//* what divisor? */2;

This is the same as finding a spurious */ in
a cpp comment.

I don't think so. The point of this test case is that it does not
*have* a CPP comment at all.

>>Corrected

No. Your new version produces:

return 1/** what divisor? * /2; */

(which is not a valid statement) from

return 1//* what divisor? */2;

which is, I think, a valid way to write return 1/2;

No. MSVC for instance will pre-proccess your statement to
return 1

without anything beyond the //
gcc will do the same
lcc-win32 will do the same

Oct 20 '06 #15

Walter Bright

Ben Bacarisse wrote:

No. Your new version produces:

return 1/** what divisor? * /2; */

(which is not a valid statement) from

return 1//* what divisor? */2;

which is, I think, a valid way to write return 1/2;

Jacob has that right. //* is lexed as the start of a // comment, not a
divide followed by the start of a /* comment. It's the same reason that:

i/*p;
i++; /* comment */
*p+3;

is parsed as:

(i * p) + 3;

i.e. the maximal munch rule.
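
The disagreement about Ben's example comes down to which rules the lexer applies. As a small sketch (line_comments_enabled is a name made up here, not something from the thread), the same construct can be used to find out at run time how the compiler treated it:

```c
#include <assert.h>

/* Returns 1 when the compiler treats two slashes as the start of a comment
   (C99 and later, or C89 with the common extension). Returns 0 under
   strict C89, where the same text is a division followed by a block
   comment, so the initializer is 1 / 2. */
static int line_comments_enabled(void)
{
    int r = 1 //* block comment under strict C89 */ 2
        ;
    assert(r == 0 || r == 1);
    return r;
}
```

With a compiler that accepts // comments, the rest of the initializer line is discarded and r is 1. In a strictly conforming C89 mode the expression is 1 / 2, which is 0. This is why a converter cannot be neutral about the input dialect.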

Walter Bright
www.digitalmars.com C, C++, D programming language compilers

Oct 20 '06 #16

Ben Bacarisse

jacob navia <ja***@jacob.remcomp.fr> writes:

Ben Bacarisse wrote:
>jacob navia <ja***@jacob.remcomp.fr> writes:

>>>Ben Bacarisse wrote:

jacob navia <ja***@jacob.remcomp.fr> writes:
>Recently, a heated debate started because of poor mr heathfield
>was unable to compile a program with // comments.
>
>Here is a utility for him, so that he can (at last) compile my
>programs :-)
>
>More seriously, this code takes 560 bytes. Amazing isn't it? C is very
>compact, you can do great things in a few bytes.

You can do *some* things in 560 bytes. Great things, it seems need a
few more. You need to:
(a) include <stdlib.h>

yeah
(b) remove the nested comment.

yes
(c) fix the logic bugs.
On given the admittedly clumsy but valid:
return 1//* what divisor? */2;

This is the same as finding a spurious */ in
a cpp comment.
I don't think so. The point of this test case is that it does not
*have* a CPP comment at all.

>>>Corrected
No. Your new version produces:
return 1/** what divisor? * /2; */
(which is not a valid statement) from
return 1//* what divisor? */2;
which is, I think, a valid way to write return 1/2;

No. MSVC for instance will pre-proccess your statement to
return 1

without anything beyond the //
gcc will do the same
lcc-win32 will do the same

Yes, I had assumed your program would be "C89 safe", but I can see now that
there is no reasonable way that it could be.

--
Ben.

Oct 20 '06 #17

Ben Bacarisse

Walter Bright <wa****@digitalmars-nospamm.com> writes:

Ben Bacarisse wrote:

>No. Your new version produces:
return 1/** what divisor? * /2; */
(which is not a valid statement) from
return 1//* what divisor? */2;
which is, I think, a valid way to write return 1/2;

Jacob has that right. //* is lexed as the start of a // comment

Yes, ack'd already. I had stupidly thought the program should be C89
neutral, but the input will never be C89 if it has // intended as a
comment.

So, who wants to do moving declarations up to the top of the enclosing
block? :-)

--
Ben.

Oct 20 '06 #18

Bart

jacob navia wrote:

Besides, I have fixed the few logic bugs pointed out by you:
1) continuation lines that become comments
e.g. /\
/ comment
will become
/*
comment */

But the more likely

//\
comment

Won't work.

You also forgot the case:

#include <ftp://domain.com/myfile.h>

And your program output is very misleading when given the input:

#error // comments not allowed

Regards,
Bart.

Oct 20 '06 #19

jacob navia

Bart wrote:

jacob navia wrote:

>>Besides, I have fixed the few logic bugs pointed out by you:
1) continuation lines that become comments
e.g. /\
/ comment
will become
/*
comment */

But the more likely

//\
comment

Won't work.

Fixed

You also forgot the case:

#include <ftp://domain.com/myfile.h>

????
Well, URLs in #include directives...

Not yet.

And your program output is very misleading when given the input:

#error // comments not allowed

If they are not allowed...

Oct 20 '06 #20

Bart

jacob navia wrote:

Bart wrote:
You also forgot the case:

#include <ftp://domain.com/myfile.h>

????
Well, URLs in #include directives...

I don't remember seeing anything that forbids it.

And your program output is very misleading when given the input:

#error // comments not allowed

If they are not allowed...

That was just an example to show that your little program may entirely
change the meaning of an #error message. What if you had:

#error This is never supposed to happen (possible cause: // comments).

Regards,
Bart.

Oct 20 '06 #21

Bart

Bart wrote:

jacob navia wrote:
Bart wrote:
You also forgot the case:
>
#include <ftp://domain.com/myfile.h>
????
Well, URLs in #include directives...

I don't remember seeing anything that forbids it.

And your program output is very misleading when given the input:
>
#error // comments not allowed
If they are not allowed...

That was just an example to show that your little program may entirely
change the meaning of an #error message. What if you had:

#error This is never supposed to happen (possible cause: // comments).

Or since we're already talking about URLs, the more likely:

#error Please see http://domain.com/xyz for more information about this
error.

Regards,
Bart.

Oct 20 '06 #22

Jalapeno

Walter Bright wrote:

Peter Nilsson wrote:
Some test cases for you to consider...

int c = a //* ... */
b;
int d = '??''; // this is a // comment, is it translated?

A trigraph case:

char* d = "??/""; // "

but of course I've never seen trigraphs outside of a test suite.

Haven't worked in a z/OS shop before, huh? (or a Sys 370 one either)

It only takes an hour or two of working with int a??(8??); to get used
to them (and they become second nature quickly when you see them all
day long).

Oct 20 '06 #23

Jalapeno

Jalapeno wrote:

Walter Bright wrote:
Peter Nilsson wrote:
Some test cases for you to consider...
>
int c = a //* ... */
b;
int d = '??''; // this is a // comment, is it translated?
A trigraph case:

char* d = "??/""; // "

but of course I've never seen trigraphs outside of a test suite.

Haven't worked in a z/OS shop before, huh? (or a Sys 370 one either)

It only takes an hour or two of working with int a??(8??); to get used
to them (and they become second nature quickly when you see them all
day long).

Just for kicks I created a terminal emulator macro that put the '[' and
']' into a source file and the resultant int aÝ8¨; is more
difficult to read than int a??(8??); (at least to me).

The code compiles exactly the same.

Oct 20 '06 #24

CBFalconer

jacob navia wrote:

Walter Bright wrote:
>Peter Nilsson wrote:

>>Some test cases for you to consider...

int c = a //* ... */
b;
int d = '??''; // this is a // comment, is it translated?

A trigraph case:

char* d = "??/""; // "

but of course I've never seen trigraphs outside of a test suite.

Me neither. But I do not support trigraphs anyway. They are an
unnecessary feature. We had several lengthy discussions about
this in comp.std.c.

I guess you have never seen a system without the following chars in
its char set. From N869:

5.2.1.1 Trigraph sequences

[#1] All occurrences in a source file of the following
sequences of three characters (called trigraph sequences11))
are replaced with the corresponding single character.

??= #    ??) ]    ??! |
??( [    ??' ^    ??> }
??/ \    ??< {    ??- ~
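
For a tool like the one in this thread, the replacement described in that paragraph could run as a separate pass before comment handling. The following is a minimal sketch, assuming the whole input is held in a string. replace_trigraphs is a hypothetical name, and the nine mappings are taken from the table above.

```c
#include <assert.h>
#include <string.h>

/* Replaces the nine trigraph sequences in src, writing the result to dst,
   which must hold at least strlen(src) + 1 bytes. Question marks are
   written as \? escapes so that a compiler with trigraphs enabled does
   not rewrite this file itself. */
static void replace_trigraphs(const char *src, char *dst)
{
    static const char third[] = "=()!'></-";    /* third character  */
    static const char repl[]  = "#[]|^}{\\~";   /* its replacement  */
    assert(src != NULL && dst != NULL);
    while (*src != '\0') {
        const char *hit = NULL;
        if (src[0] == '\?' && src[1] == '\?' && src[2] != '\0')
            hit = strchr(third, src[2]);
        if (hit != NULL) {
            *dst++ = repl[hit - third];   /* emit the single character */
            src += 3;
        } else {
            *dst++ = *src++;              /* copy everything else as is */
        }
    }
    *dst = '\0';
}
```

Applied to the declaration Jalapeno mentions, the output is the ordinary bracketed form. Running this pass first would also let the string and comment logic see the backslash in Walter Bright's test case.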

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Oct 20 '06 #25

Walter Bright

Jalapeno wrote:

Walter Bright wrote:
>Peter Nilsson wrote:
>>Some test cases for you to consider...

int c = a //* ... */
b;
int d = '??''; // this is a // comment, is it translated?
A trigraph case:

char* d = "??/""; // "

but of course I've never seen trigraphs outside of a test suite.

Haven't worked in a z/OS shop before, huh? (or a Sys 370 one either)

No, I haven't. Nor has anyone I've worked with.

It only takes an hour or two of working with int a??(8??); to get used
to them (and they become second nature quickly when you see them all
day long).

I suppose one can get used to anything <g>.

Do you need to run non-trigraph C code through a source translator to
get it on to your z/OS system?

Oct 20 '06 #26

Walter Bright

Bart wrote:

jacob navia wrote:
>Bart wrote:
>>And your program output is very misleading when given the input:

#error // comments not allowed
If they are not allowed...

That was just an example to show that your little program may entirely
change the meaning of an #error message. What if you had:

#error This is never supposed to happen (possible cause: // comments).

I don't think that's a reasonable test case, since presumably C code
that uses // comments would not reasonably expect that #error line to work.

Oct 20 '06 #27

Andrey Koptyaev

try this:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define BSIZE 200

int main (int argc,char *argv[]){
char *buf;
FILE *in,*out;
void comm(char *);
char *str1="//";
char *str2="\x22\x2f\x2f\x22";
char *buf1,*substr;
int i;

if (argc<3){
printf("to low parameters\n");
return 1;
}
in=fopen(argv[1],"rb");
if (in==NULL){
printf("file not opening %s\n",argv[1]);
return 1;
}
out=fopen(argv[2],"wb");
buf=malloc(BSIZE);
while(fgets(buf,BSIZE,in) != NULL){
if (!(substr=strstr(buf,str2))){
if (substr=strstr(buf,str1)){
buf1=calloc(strlen(buf)+3,1);
for (i=0;i<(strlen(buf)-strlen(substr));i++)
buf1[i]=buf[i];
buf1=strcat(buf1,"/*");
for (i=strlen(buf)-strlen(substr)+2;i<(strlen(buf)-2);i++)
buf1[i]=buf[i];
buf1=strcat(buf1,"*/");
buf1=strcat(buf1,"\x0d\x0a");
fputs(buf1,out);
free(buf1);
}
else
fputs(buf,out);
}
else
fputs(buf,out);
}
fclose(in);
fclose(out);
free(buf);
return 0;
}

Oct 20 '06 #28

Jalapeno

Walter Bright wrote:

Jalapeno wrote:
Walter Bright wrote:
but of course I've never seen trigraphs outside of a test suite.

Haven't worked in a z/OS shop before, huh? (or a Sys 370 one either)

No, I haven't. Nor has anyone I've worked with.

It only takes an hour or two of working with int a??(8??); to get used
to them (and they become second nature quickly when you see them all
day long).

I suppose one can get used to anything <g>.

Do you need to run non-trigraph C code through a source translater to
get it on to your z/OS system?

Not so much to _get_ the source text to the mainframe but for it to be
usable it'll need to be in EBCDIC.

A standard ASCII to EBCDIC conversion utility (like one used in a
typical terminal emulator) that uploads source text from a PC to the
mainframe will see the '[' as 0x5B and the ']' as 0x5D and will
translate them to the EBCDIC '[' as 0xAD and EBCDIC ']' as 0xBD.

so the ASCII text statement:

char x[8]; which in ASCII is

0x63 0x68 0x61 0x72 0x20 0x78 0x5B 0x38 0x5D 0x3B

will be translated in a "typical" terminal emulator utility to:

0x83 0x88 0x81 0x99 0x40 0xA7 0xAD 0xF8 0xBD 0x5E

but on the screen that looks like:

char xÝ8¨; and not char x[8];

this compiles but looks horrible on the screen and you can't type those
characters when you edit, you have to copy and paste those characters
(or create a macro). Even though the '[' and ']' exist in EBCDIC the
3270 family of terminals do not have those characters to type in or to
display.

If I manually change the characters Ý and ¨ using the terminal
emulator keyboard to '[' and ']', which the Windows keyboard has, the
encoding becomes 0xBA for '[' and 0xBB for ']' and you have

0x83 0x88 0x81 0x99 0x40 0xA7 0xBA 0xF8 0xBB 0x5E

which becomes a syntax error and won't compile.
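
The translation step described above amounts to a table lookup. The sketch below covers only the ten characters of the example statement, with the byte values quoted in this post. to_ebcdic and translate are hypothetical names, and a real utility would use a full 256-entry table for a specific EBCDIC code page.

```c
#include <assert.h>
#include <stddef.h>

/* ASCII to EBCDIC for only the characters of "char x[8];", using the byte
   values quoted in the post. Returns 0 for any other character. */
static unsigned char to_ebcdic(char c)
{
    switch (c) {
    case 'c': return 0x83;
    case 'h': return 0x88;
    case 'a': return 0x81;
    case 'r': return 0x99;
    case ' ': return 0x40;
    case 'x': return 0xA7;
    case '[': return 0xAD;   /* the emulator's choice for the bracket */
    case '8': return 0xF8;
    case ']': return 0xBD;
    case ';': return 0x5E;
    default:  return 0;
    }
}

/* Translates a NUL-terminated ASCII string; out needs one byte per input
   character and is not terminated. */
static void translate(const char *ascii, unsigned char *out)
{
    assert(ascii != NULL && out != NULL);
    for (; *ascii != '\0'; ascii++)
        *out++ = to_ebcdic(*ascii);
}
```

The two bracket entries are the ones that matter here: typing the brackets on the emulator keyboard yields 0xBA and 0xBB instead, which is what the compiler rejects.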

Our code base apparently contains "vendor" supplied source in the char
xÝ8¨; format and "home grown" (and IBM supplied sample) source in the
char x??(8??); format. We don't normally modify the vendor source so
there isn't any need to replace the ugly "screen" characters with
trigraphs but the "home grown" code is edited much more frequently and
I've become used to dealing with trigraphs.

Oct 20 '06 #29

Walter Bright

Jalapeno wrote:

Walter Bright wrote:
>Do you need to run non-trigraph C code through a source translater to
get it on to your z/OS system?

Not so much to _get_ the source text to the mainframe but for it to be
usable it'll need to be in EBCDIC.

That's what I expected. That pretty much means that trigraphs are a
reasonable solution for such systems, but that since the characters
must be translated anyway, there's not much reason to support trigraphs
in the C language standard itself.

Our code base apparently contains "vendor" supplied source in the char
xÝ8¨; format and "home grown" (and IBM supplied sample) source in the
char x??(8??); format. We don't normally modify the vendor source so
there isn't any need to replace the ugly "screen" characters with
trigraphs but the "home grown" code is edited much more frequently and
I've become used to dealing with trigraphs.

Oct 20 '06 #30

jacob navia

Walter Bright wrote:

That's what I expected. That pretty much means that trigraphs are a
reasonable solution for such systems, but that since the characters must
be translated anyway, there's not much reason to support trigraphs in
the C language standard itself.

EXACTLY.

Why should the language specs be cluttered with such details?
Why should *I* bother about that?

jacob

Oct 20 '06 #31

Jalapeno

Walter Bright wrote:

Jalapeno wrote:
Walter Bright wrote:
Do you need to run non-trigraph C code through a source translater to
get it on to your z/OS system?
Not so much to _get_ the source text to the mainframe but for it to be
usable it'll need to be in EBCDIC.

That's what I expected. That pretty much means that trigraphs are a
reasonable solution for such systems, but that since the characters
must be translated anyway, there's not much reason to support trigraphs
in the C language standard itself.

Character translation is only necessary if the text originates on an
ASCII system. Since all the "home grown" code here (and that supplied
by IBM) originates on EBCDIC systems absolutely no translations are
necessary and trigraphs are useful. All the world is not a PC. The
standard acknowledges that. I also understand that you don't find much
reason to have trigraphs supported. Some people use them, a lot. IBM's
Mainframes haven't disappeared, they've just been renamed "Servers" ;o).

Oct 20 '06 #32

jacob navia

Andrey Koptyaev wrote:

try this:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define BSIZE 200

int main (int argc,char *argv[]){
char *buf;
FILE *in,*out;
void comm(char *);
char *str1="//";
char *str2="\x22\x2f\x2f\x22";
char *buf1,*substr;
int i;

if (argc<3){
printf("to low parameters\n");
return 1;
}
in=fopen(argv[1],"rb");
if (in==NULL){
printf("file not opening %s\n",argv[1]);
return 1;
}
out=fopen(argv[2],"wb");
buf=malloc(BSIZE);
while(fgets(buf,BSIZE,in) != NULL){
if (!(substr=strstr(buf,str2))){
if (substr=strstr(buf,str1)){
buf1=calloc(strlen(buf)+3,1);
for (i=0;i<(strlen(buf)-strlen(substr));i++)
buf1[i]=buf[i];
buf1=strcat(buf1,"/*");
for (i=strlen(buf)-strlen(substr)+2;i<(strlen(buf)-2);i++)
buf1[i]=buf[i];
buf1=strcat(buf1,"*/");
buf1=strcat(buf1,"\x0d\x0a");
fputs(buf1,out);
free(buf1);
}
else
fputs(buf,out);
}
else
fputs(buf,out);
}
fclose(in);
fclose(out);
free(buf);
return 0;
}

Excuse me but this will blindly search a // sequence anywhere in the
line you get. Even within character strings:

char *a = "cpp coment is // isn't it?";
and there you go, you destroy the source.
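
A line-based approach can avoid that particular failure by tracking whether the scan is inside a literal. This is a rough sketch only: find_line_comment is a hypothetical function, and it deliberately ignores block comments, trigraphs and backslash line splicing, all of which were raised earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the offset of the first two-slash sequence in line that lies
   outside string and character literals, or -1 if there is none. */
static long find_line_comment(const char *line)
{
    char quote = '\0';              /* open quote character, if any */
    size_t i;
    assert(line != NULL);
    for (i = 0; line[i] != '\0'; i++) {
        char c = line[i];
        if (quote != '\0') {
            if (c == '\\' && line[i + 1] != '\0')
                i++;                /* skip the escaped character */
            else if (c == quote)
                quote = '\0';       /* literal ends here */
        } else if (c == '"' || c == '\'') {
            quote = c;              /* literal starts here */
        } else if (c == '/' && line[i + 1] == '/') {
            return (long)i;
        }
    }
    return -1;
}
```

With this check the string in the example above is left alone, because the slashes are seen while a double quote is still open.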

You ignore all the discussion, and you put this program...

C'mon...

You can't do this in such a BRUTE force fashion...
If I write
char *a = "//";
it will replace it with

Oct 20 '06 #33

Yevgen Muntyan

jacob navia wrote:

Walter Bright wrote:

>>
That's what I expected. That pretty much means that trigraphs are a
reasonable solution for such systems, but that since the characters
must be translated anyway, there's not much reason to support
trigraphs in the C language standard itself.

EXACTLY.

Why should the language specs be cluttered with such details?
Why should *I* bother about that?

You should do whatever you like; but note that you are fooling people
if you are saying you are producing a C compiler, since people do not
expect a C compiler to intentionally ignore some parts of C standard.

You are not saying "a compiler system adding some sugar to C language
and removing some standard parts from it" on your web site, are you?
Web site says "lcc-win32 C compiler system".

Regards,
Yevgen

Oct 20 '06 #34

jacob navia

Yevgen Muntyan wrote:

jacob navia wrote:

>Walter Bright wrote:

>>>
That's what I expected. That pretty much means that trigraphs are a
reasonable solution for such systems, but that since the characters
must be translated anyway, there's not much reason to support
trigraphs in the C language standard itself.

EXACTLY.

Why should the language specs be cluttered with such details?
Why should *I* bother about that?

You should do whatever you like; but note that you are fooling people
if you are saying you are producing a C compiler, since people do not
expect a C compiler to intentionally ignore some parts of C standard.

You are not saying "a compiler system adding some sugar to C language
and removing some standard parts from it" on your web site, are you?
Web site says "lcc-win32 C compiler system".

Regards,
Yevgen

Who is talking about the C compiler?
We are talking about this utility to eliminate // comments, which is
the subject of this thread!!!

lcc-win32, by the way, will warn you about any trigraphs it sees by
default. If you want to use trigraphs you have to set the option
-ansic.

This is NONSENSE for all users that are NOT EBCDIC and do NOT work in
mainframes. By the way, the venerable 3270 is DEAD SINCE CONCEPTION
and one of the nice things about the microcomputers that appeared in the
eighties was these wonderful KEYBOARDS where we could type any character
we wish... Nice isn't it?

Oct 20 '06 #35

Yevgen Muntyan

jacob navia wrote:

Yevgen Muntyan wrote:

>jacob navia wrote:

>>Walter Bright wrote:
That's what I expected. That pretty much means that trigraphs are a
reasonable solution for such systems, but that since the characters
must be translated anyway, there's not much reason to support
trigraphs in the C language standard itself.
EXACTLY.

Why should the language specs be cluttered with such details?
Why should *I* bother about that?

You should do whatever you like; but note that you are fooling people
if you are saying you are producing a C compiler, since people do not
expect a C compiler to intentionally ignore some parts of C standard.

You are not saying "a compiler system adding some sugar to C language
and removing some standard parts from it" on your web site, are you?
Web site says "lcc-win32 C compiler system".

Regards,
Yevgen

Who is talking about the C compiler?

Below is what made me think your compiler does not support trigraphs
(this is your reply elsethread). If you meant "my compiler supports
trigraphs but I do not support them" (not sure what you actually meant
then), then I apologize.
Walter Bright wrote:

Peter Nilsson wrote:

>Some test cases for you to consider...

int c = a //* ... */
b;
int d = '??''; // this is a // comment, is it translated?

A trigraph case:

char* d = "??/""; // "

but of course I've never seen trigraphs outside of a test suite.

Me neither. But I do not support trigraphs anyway. They are an
unnecessary feature. We had several lengthy discussions about this in
comp.std.c.

Oct 20 '06 #36

Keith Thompson

"Jalapeno" <ja*******@mac.com> writes:

Walter Bright wrote:
>Peter Nilsson wrote:
Some test cases for you to consider...

int c = a //* ... */
b;
int d = '??''; // this is a // comment, is it translated?

A trigraph case:

char* d = "??/""; // "

but of course I've never seen trigraphs outside of a test suite.

Haven't worked in a z/OS shop before, huh? (or a Sys 370 one either)

It only takes an hour or two of working with int a??(8??); to get used
to them (and they become second nature quickly when you see them all
day long).

Fascinating. There have been raging arguments about trigraphs both
here and in comp.std.c for years. I think you're the first person
I've seen who actually *uses* them. Maybe mainframe users just don't
post to Usenet very often?

In my own experience, and that of most people here, trigraphs have
caused far more problems than they solve; if a trigraph appears in a C
source file, it's far more likely to be accidental than intentional
(unless the code is deliberately obfuscated). For example:

fprintf(stderr, "Unexpected error, what happened??!\n");
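
To make the accident concrete: ??! is the trigraph for |, so a conforming compiler sees that string as "Unexpected error, what happened|\n". Below is a rough sketch of the replacement done in translation phase 1. The helper names trigraph_char and replace_trigraphs are hypothetical, not from any library.

```c
#include <stddef.h>

/* Map the third character of a trigraph to its replacement, or return 0 if
   two question marks followed by `c` do not form one of the nine standard
   trigraphs. */
static char trigraph_char(char c)
{
    switch (c) {
    case '=':  return '#';
    case '(':  return '[';
    case '/':  return '\\';
    case ')':  return ']';
    case '\'': return '^';
    case '<':  return '{';
    case '!':  return '|';
    case '>':  return '}';
    case '-':  return '~';
    default:   return 0;
    }
}

/* Hypothetical helper: copy `src` to `dst`, replacing trigraphs the way
   translation phase 1 does. `dst` must have room for strlen(src) + 1 bytes;
   the output is never longer than the input. */
static void replace_trigraphs(const char *src, char *dst)
{
    while (*src != '\0') {
        char r;

        if (src[0] == '?' && src[1] == '?'
            && (r = trigraph_char(src[2])) != 0) {
            *dst++ = r;             /* three characters become one */
            src += 3;
        } else {
            *dst++ = *src++;        /* everything else is copied as-is */
        }
    }
    *dst = '\0';
}
```

The replacement happens before string literals are recognised, which is why the message above is affected. Writing the sequence as ?\?! in the source is the usual way to keep it intact.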

Since there is currently no active effort to publish a new C standard,
it looks like we're stuck with the current situation for the
foreseeable future, but some of us are still trying to come up with a
better solution. For example, I've proposed *disabling* trigraphs by
default, but enabling them if there's some unique marker at the top of
the file.

For any change like this, there's a danger of breaking existing code,
but for those of us outside the IBM mainframe world, it would probably
accidentally *fix* more code than it would break.

Also, why do you use trigraphs rather than digraphs? They were added
in a 1995 update to the standard (I think that's right); you could
write a[8] as a<:8:> rather than as a??(8??).
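
For comparison, a small sketch written with the digraph spellings <: :> <% %> for [ ] { }. It assumes a compiler that implements the 1995 amendment or C99; the function name sum8 is only an illustration.

```c
/* Sum the integers 0..7 using digraph spellings of [ ] { }.
   Needs a compiler that implements the 1995 amendment (or C99). */
static int sum8(void)
<%
    int a<:8:>;
    int i, total = 0;

    for (i = 0; i < 8; i++)
        a<:i:> = i;
    for (i = 0; i < 8; i++)
        total += a<:i:>;
    return total;
%>
```

Unlike trigraphs, digraphs are ordinary tokens, so they are left alone inside string literals and cannot change a message by accident.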

Any thoughts?

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Oct 20 '06 #37

Walter Roberson

In article <45**********************@news.orange.fr>,
jacob navia <ja***@jacob.remcomp.fr> wrote:

>This is NONSENSE for all users that are NOT EBCDIC and do NOT work in
mainframes. By the way, the venerable 3270 is DEAD SINCE CONCEPTION

It was? You only had to wait 3 years for DEC to introduce the VT52,
whose 9600 bps serial interface wasn't up to the task of
connecting 17500 terminals to a single 16 megabyte computer.

>and one of the nice things about the microcomputers that appeared in the
eighties was these wonderful KEYBOARDS where we could type any character
we wish... Nice isn't it?

"In the eighties" was literally a decade after the introduction
of the "dead since conception" 3270. And it took another decade (at least)
before all the codepages were in place.

--
If you lie to the compiler, it will get its revenge. -- Henry Spencer

Oct 20 '06 #38

Richard Heathfield

Walter Bright said:

jacob navia wrote:
>Walter Bright wrote:
>>A trigraph case:

char* d = "??/""; // "

but of course I've never seen trigraphs outside of a test suite.

Me neither. But I do not support trigraphs anyway. They are an
unnecessary feature. We had several lengthy discussions about this in
comp.std.c.

Trigraphs are a worthless feature.

This "worthless feature" is sometimes the only way you can get C code to
compile on a particular implementation, because the native character set of
the implementation doesn't contain such fancy characters as { or [ - so to
dismiss it as worthless is to display mere parochialism. I've worked on a
system that had no end of trouble with [ and ] but was quite at home with
??( and ??)

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)

Oct 20 '06 #39

Richard Heathfield

Walter Bright said:

Jalapeno wrote:
>Walter Bright wrote:
>>Do you need to run non-trigraph C code through a source translator to
get it on to your z/OS system?

Not so much to _get_ the source text to the mainframe but for it to be
usable it'll need to be in EBCDIC.

That's what I expected. That pretty much means that trigraphs are a
reasonable solution for such systems, but that since the characters
must be translated anyway, there's not much reason to support trigraphs
in the C language standard itself.

If trigraphs were *not* supported in the Standard, you'd have a heck of a
job getting the same source base to run on, say, MS-DOS (or, nowadays,
Windows) and MVS. Just because you don't use 'em yourself, that doesn't
mean they're not useful.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)

Oct 20 '06 #40

Richard Heathfield

Keith Thompson said:

<snip>

Also, why do you use trigraphs rather than digraphs?

Because not all compiler vendors have caught up with 1995 (let alone 1999!).

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)

Oct 20 '06 #41

Richard Heathfield

Ben Bacarisse said:

Richard Heathfield <in*****@invalid.invalid> writes:

<snip>

>>
Sometimes, words fail me.

I think there is a deeper irony. Did you relax your compiler options to get
this far?

No.

If so, it allowed the non standard nested comment to pass.
I get a syntax error at the word "easy".

Ah, that may explain it. I appear to have omitted to grab that introductory
comment when compiling the code.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)

Oct 20 '06 #42

Keith Thompson

Richard Heathfield <in*****@invalid.invalid> writes:

Walter Bright said:
>Jalapeno wrote:
>>Walter Bright wrote:
Do you need to run non-trigraph C code through a source translator to
get it on to your z/OS system?

Not so much to _get_ the source text to the mainframe but for it to be
usable it'll need to be in EBCDIC.

That's what I expected. That pretty much means that trigraphs are a
reasonable solution for such systems, but that since the characters
must be translated anyway, there's not much reason to support trigraphs
in the C language standard itself.

If trigraphs were *not* supported in the Standard, you'd have a heck of a
job getting the same source base to run on, say, MS-DOS (or, nowadays,
Windows) and MVS. Just because you don't use 'em yourself, that doesn't
mean they're not useful.

The source would have to be translated between EBCDIC and ASCII
anyway. If trigraphs weren't supported by the standard, some other
solution (or even the same one?) would undoubtedly be supported by
mainframe compilers, and there would be utilities that would perform
both EBCDIC<->ASCII translation and whatever mapping is necessary.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Oct 20 '06 #43

Keith Thompson

jacob navia <ja***@jacob.remcomp.fr> writes:
[...]

This is NONSENSE for all users that are NOT EBCDIC and do NOT work in
mainframes. By the way, the venerable 3270 is DEAD SINCE CONCEPTION
and one of the nice things about the microcomputers that appeared in the
eighties was these wonderful KEYBOARDS where we could type any character
we wish... Nice isn't it?

Tell that to "Jalapeno", a real live trigraph user who's been posting
in this very thread.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Oct 20 '06 #44

Walter Bright

Jalapeno wrote:

Character translation is only necessary if the text originates on an
ASCII system. Since all the "home grown" code here (and that supplied
by IBM) originates on EBCDIC systems absolutely no translations are
necessary and trigraphs are useful. All the world is not a PC. The
standard acknowledges that. I also understand that you don't find much
reason to have trigraphs supported. Some people use them, a lot. IBM's
Mainframes haven't disappeared, they've just been renamed "Servers" ;o).

I understand that. My (badly explained) point was that since trigraphs
failed to make C source code portable, trigraphs shouldn't have been
part of the C standard.

Oct 20 '06 #45

Mark McIntyre

On Fri, 20 Oct 2006 08:33:32 +0200, in comp.lang.c , jacob navia
<ja***@jacob.remcomp.fr> wrote:

>Me neither. But I do not support trigraphs anyway.

Just to be clear, you confirm that your C implementation is
deliberately nonconforming.

>They are an unnecessary feature.

And you feel able to speak for the *entire* C programming community
when you make that statement, and the C standards committee of experts
from throughout the world are wrong.

>We had several lengthy discussions about this in
comp.std.c.

No doubt. :-)

--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan

Oct 20 '06 #46

Mark McIntyre

On Fri, 20 Oct 2006 07:56:46 -0400, in comp.lang.c , CBFalconer
<cb********@yahoo.com> wrote:

>jacob navia wrote:
>Me neither. But I do not support trigraphs anyway. They are an
unnecessary feature.

I guess you have never seen a system without the following chars in
its char set.

ISTR that Jacob believes that only Intel 32-bit windows platforms
exist, and all other Osen are a figment of everyone's imagination.

I mean, who could possibly build a machine that doesn't have a # or {
symbol on the keyboard or in the character set? Other than IBM, Dec
and Apple of course. Who don't exist.

And he wonders why he attracts flames.
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan

Oct 20 '06 #47

Mark McIntyre

On Fri, 20 Oct 2006 20:52:19 +0200, in comp.lang.c , jacob navia
<ja***@jacob.remcomp.fr> wrote:

>By the way, the venerable 3270 is DEAD SINCE CONCEPTION

What a complete mutt you are. There are entire banks out there whose
entire back offices run entirely on IBM mainframes with 3270s hanging
off them. Sure, emulators these days, but still 3270s.

>and one of the nice things about the microcomputers that appeared in the
eighties was these wonderful KEYBOARDS where we could type any character
we wish..

Go on then, type a # on a UK G3 Apple Mac keyboard. Or on a Tektronix
4100 keyboard, if memory serves me correctly (or was it { and } they
don't have?). And while we're at it, try £ on any US keyboard.
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan

Oct 20 '06 #48

Mark McIntyre

On Fri, 20 Oct 2006 20:52:19 +0200, in comp.lang.c , jacob navia
<ja***@jacob.remcomp.fr> wrote:

>This is NONSENSE

Have you noticed that by making a series of pointless throwaway
inflammatory remarks, you have diverted all attention from your code?

Nobody is bothering to read it any more. That's a shame as it might
have been interesting.

--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan

Oct 20 '06 #49

Mark McIntyre

On Fri, 20 Oct 2006 09:46:32 +0200, in comp.lang.c , jacob navia
<ja***@jacob.remcomp.fr> wrote:

> return 1//* what divisor? */2;

which is, I think, a valid way to write return 1/2;

No. MSVC for instance will preprocess your statement to
return 1

Only if invoked in non-conforming mode. Remember that MSVC is not a
C99 compiler, it adheres to C89.
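
The two readings can be reproduced with a rough sketch of comment removal (translation phase 3, where each comment is replaced by one space). The name strip_comments and its line_comments flag are hypothetical, and the sketch ignores string and character literals to stay short.

```c
#include <stddef.h>

/* Hypothetical helper: copy `src` to `dst`, replacing every comment with one
   space, as translation phase 3 does. When `line_comments` is nonzero, "//"
   starts a comment (C99 rules); when it is zero, only block comments exist
   (C89 rules). Simplification: string and character literals are not
   recognised. `dst` must have room for strlen(src) + 1 bytes. */
static void strip_comments(const char *src, int line_comments, char *dst)
{
    while (*src != '\0') {
        if (line_comments && src[0] == '/' && src[1] == '/') {
            while (*src != '\0' && *src != '\n')
                src++;                      /* the newline itself is kept */
            *dst++ = ' ';
        } else if (src[0] == '/' && src[1] == '*') {
            src += 2;
            while (*src != '\0' && !(src[0] == '*' && src[1] == '/'))
                src++;
            if (*src != '\0')
                src += 2;                   /* skip the closing delimiter */
            *dst++ = ' ';
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
}
```

Given the line `return 1//* what divisor? */2;`, calling it with line_comments set to 0 leaves `return 1/ 2;`, while setting it to 1 leaves `return 1 ` and drops the rest of the line, semicolon included.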

--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan

Oct 20 '06 #50
