How to remove // comments

Recently, a heated debate started because poor Mr Heathfield
was unable to compile a program with // comments.

Here is a utility for him, so that he can (at last) compile my
programs :-)

More seriously, this code takes 560 bytes. Amazing, isn't it? C is very
compact, you can do great things in a few bytes.

Obviously I have avoided here, in consideration for his pedantic
compiler flags, any C99 issues, so it will compile in obsolete
compilers, and with only ~600 bytes you can run it in the toaster!

--------------------------------------------------------------cut here

/* This program reads a C source file and writes it, modified, to stdout.
   All // comments are replaced by old-style block comments, to ease
   porting to old environments or posting to Usenet, where
   // comments can be broken across several lines and messed up.
*/

#include <stdio.h>
#include <stdlib.h> /* EXIT_FAILURE */

/* This function reads a character and writes it to stdout */
static int Fgetc(FILE *f)
{
    int c = fgetc(f);
    if (c != EOF)
        putchar(c);
    return c;
}

/* This function skips strings */
static int ParseString(FILE *f)
{
    int c = Fgetc(f);
    while (c != EOF && c != '"') {
        if (c == '\\')
            c = Fgetc(f);
        if (c != EOF)
            c = Fgetc(f);
    }
    if (c == '"')
        c = Fgetc(f);
    return c;
}

/* Skips multi-line comments */
static int ParseComment(FILE *f)
{
    int c = Fgetc(f);

    while (1) {
        while (c != '*') {
            c = Fgetc(f);
            if (c == EOF)
                return EOF;
        }
        c = Fgetc(f);
        if (c == '/')
            break;
    }
    return Fgetc(f);
}

/* Skips // comments. Note that we use fgetc here and NOT Fgetc */
/* since we want to modify the output before it gets echoed */
static int ParseCppComment(FILE *f)
{
    int c = fgetc(f);

    while (c != EOF && c != '\n') {
        putchar(c);
        c = fgetc(f);
    }
    /* Close the comment even if the file ends without a newline */
    puts(" */");
    if (c == '\n')
        c = Fgetc(f);
    return c;
}

/* Checks whether a comment follows a '/' char */
static int CheckComment(int c, FILE *f)
{
    if (c == '/') {
        c = fgetc(f);
        if (c == '*') {
            putchar('*');
            c = ParseComment(f);
        }
        else if (c == '/') {
            putchar('*');
            c = ParseCppComment(f);
        }
        else if (c != EOF) {
            /* Not a comment: echo the character and hand it back to */
            /* main, so that a quote right after '/' is still parsed */
            putchar(c);
        }
    }
    return c;
}

/* Skips chars between single quotes */
static int ParseQuotedChar(FILE *f)
{
    int c = Fgetc(f);
    while (c != EOF && c != '\'') {
        if (c == '\\')
            c = Fgetc(f);
        if (c != EOF)
            c = Fgetc(f);
    }
    if (c == '\'')
        c = Fgetc(f);
    return c;
}

int main(int argc, char *argv[])
{
    FILE *f;
    int c;
    if (argc == 1) {
        fprintf(stderr, "Usage: %s <file.c>\n", argv[0]);
        return EXIT_FAILURE;
    }
    f = fopen(argv[1], "r");
    if (f == NULL) {
        fprintf(stderr, "Can't find %s\n", argv[1]);
        return EXIT_FAILURE;
    }
    c = Fgetc(f);
    while (c != EOF) {
        /* Note that each of the switches must advance the character */
        /* read so that we avoid an infinite loop. */
        switch (c) {
        case '"':
            c = ParseString(f);
            break;
        case '/':
            c = CheckComment(c, f);
            break;
        case '\'':
            c = ParseQuotedChar(f);
            break;
        default:
            c = Fgetc(f);
        }
    }
    fclose(f);
    return 0;
}

Oct 19 '06
On Sun, 22 Oct 2006 18:54:58 -0700, in comp.lang.c , Walter Bright
<wa****@digitalmars-nospamm.com> wrote:
>Mark McIntyre wrote:
>On Sat, 21 Oct 2006 19:35:34 GMT, in comp.lang.c , Keith Thompson
<ks***@mib.org> wrote:
>>Mark McIntyre <ma**********@spamcop.net> writes:
On Fri, 20 Oct 2006 16:40:37 -0700, in comp.lang.c , Walter Bright
<wa****@digitalmars-nospamm.com> wrote:
[...]
No sane person is going to invent a new character encoding
that doesn't include ASCII.
Apparently nobody told IBM.
It's unlikely *now* that anyone would invent a new encoding that's not
based on ASCII.

I'm not even sure that's true. I can see the Chinese deciding on some
totally new encoding scheme more suitable for their needs.

If their needs don't include communicating with the rest of the world
One could argue, that since there's more of them than us, we should
adapt...
>or the internet
Puhleeze. There are already many thousands of websites which are paged
entirely exclusively in non-ASCII. In a few years, I predict a
majority of websites will have non-ASCII names.
>or using the C, C++, Perl, Java, Ruby, Python, or D
programming languages, then they should go for it.
It may surprise you to learn this, but nations using Western lettering
are in a minority.
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
Oct 23 '06 #81
Keith Thompson wrote:
"Jalapeno" <ja*******@mac.com> writes:
Walter Bright wrote:
Peter Nilsson wrote:
Some test cases for you to consider...

int c = a //* ... */
b;
int d = '??''; // this is a // comment, is it translated?

A trigraph case:

char* d = "??/""; // "

but of course I've never seen trigraphs outside of a test suite.
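
The first test case above depends on whether the compiler treats // as a comment opener. Here is a minimal sketch of that ambiguity. It is not from the original posts, and the function name slash_slash_is_comment is made up for illustration. Under C99 rules the initializer is just 2. Under strict C90 rules the first slash is a division and the rest is a block comment, so the initializer is 2 / 2.

```c
/* Hypothetical helper: reports which reading the compiler chose. */
int slash_slash_is_comment(void)
{
    int v = 2 //* is this a block comment? */ 2
        ;
    /* C99 and later: v == 2.  Strict C90: v == 2 / 2 == 1. */
    return v == 2;
}
```

Built as C99 or later this should return 1. Built in a strict C90 mode (for example gcc -std=c90) it should return 0, so a converter has to decide which of the two readings it follows.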
Haven't worked in a z/OS shop before, huh? (or a Sys 370 one either)

It only takes an hour or two of working with int a??(8??); to get used
to them (and they become second nature quickly when you see them all
day long).

Fascinating. There have been raging arguments about trigraphs both
here and in comp.std.c for years. I think you're the first person
I've seen who actually *uses* them. Maybe mainframe users just don't
post to Usenet very often?
The first? Wow. I can't speak for anyone but myself. I came to Usenet
looking for information and people interested in old hardware, and
"discovered" comp.lang.c as a side effect. C isn't and never was the
most popular way to program mainframes. There are large code bases but
they are minuscule compared to the COBOL and PL/I code bases. In the
early 1980's we started using Pascal but it died fairly quickly. I have
seen a lot of C code on mainframes that is nothing more than "portable
assembler". The specific nature of the coding techniques for the MVS
system would make comp.lang.c fairly useless as a resource to those
programmers, I suppose.

In my own experience, and that of most people here, trigraphs have
caused far more problems than they solve; if a trigraph appears in a C
source file, it's far more likely to be accidental than intentional
(unless the code is deliberately obfuscated). For example:

fprintf(stderr, "Unexpected error, what happened??!\n");
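
To make the accident concrete: with trigraphs enabled, ??! in that string is replaced by | before the string literal is even tokenized. What follows is a rough sketch of that replacement step, assuming it runs over a plain in-memory string. The name replace_trigraphs is invented here and is not any compiler's API.

```c
#include <string.h>

/* Hypothetical helper: replaces the nine C trigraphs in place and
   returns s.  The third character of each trigraph is looked up in
   'from'; the replacement sits at the same index in 'to'. */
char *replace_trigraphs(char *s)
{
    static const char from[] = "=(/)'<!>-";
    static const char to[]   = "#[\\]^{|}~";
    char *in = s;
    char *out = s;

    while (*in != '\0') {
        const char *p;
        if (in[0] == '?' && in[1] == '?' && in[2] != '\0'
            && (p = strchr(from, in[2])) != NULL) {
            *out++ = to[p - from];   /* three characters become one */
            in += 3;
        } else {
            *out++ = *in++;          /* everything else is copied */
        }
    }
    *out = '\0';
    return s;
}
```

Fed the message above, this should give "what happened|", and fed a??(8??) it should give a[8]. When testing it, write the question marks in the test strings as \? so that the compiler does not replace them first.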
When I first started in C octal numbers caused some subtle bugs. ;o)
Since there is currently no active effort to publish a new C standard,
it looks like we're stuck with the current situation for the
foreseeable future, but some of us are still trying to come up with a
better solution. For example, I've proposed *disabling* trigraphs by
default, but enabling them if there's some unique marker at the top of
the file.

For any change like this, there's a danger of breaking existing code,
but for those of us outside the IBM mainframe world, it would probably
accidentally *fix* more code than it would break.
I have neither a love nor hate for trigraphs. They are just the syntax
used. I originally responded to a poster who said he had never seen
trigraphs outside of a test suite. I have. That doesn't mean I advocate
using them. But they are in use.
Also, why do you use trigraphs rather than digraphs? They were added
in a 1995 update to the standard (I think that's right); you could
write a[8] as a<:8:> rather than as a??(8??).
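
For readers who have not met digraphs, here is a small sketch of how they read in practice. The function name sum_digraph is made up for illustration, and it assumes a compiler with digraph support switched on. Unlike trigraphs, digraphs are ordinary tokens, so they are left alone inside string literals.

```c
/* Hypothetical example: <% %> stand for { } and <: :> for [ ]. */
int sum_digraph(void)
<%
    int a<:3:> = <% 1, 2, 3 %>;
    return a<:0:> + a<:1:> + a<:2:>;
%>
```

This should compile to exactly the same code as the version written with braces and brackets, and the function returns 6.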

Any thoughts?
Well, why didn't you tell me in 1995? ;o) Looking at the docs for
the compiler (which is C92 compliant, i.e. ANSI/ISO 9899:1990[1992]
(formerly ANSI X3.159-1989 C)) digraphs are available but the default
compiler switch is NODIGRAPH. So, since apparently nobody who has
worked here knew of digraphs, the compiler switch was never turned on.
IBM claims their newest compiler is C99 compliant, but it requires an
operating system upgrade to at least z/OS 1.7 to use that compiler. We
won't be upgrading the OS for at least another year.

Really, it is all just syntax. I got used to them and can go back and
forth without any trouble. YMMV, of course. Like anything in C, if you
know the pitfalls, it's easier to avoid them.

Oct 23 '06 #82

Walter Bright wrote:
Jalapeno wrote:
Character translation is only necessary if the text originates on an
ASCII system. Since all the "home grown" code here (and that supplied
by IBM) originates on EBCDIC systems absolutely no translations are
necessary and trigraphs are useful. All the world is not a PC. The
standard acknowledges that. I also understand that you don't find much
reason to have trigraphs supported. Some people use them, a lot. IBM's
Mainframes haven't disappeared, they've just been renamed "Servers" ;o).

I understand that. My (badly explained) point was that since trigraphs
failed to make C source code portable, trigraphs shouldn't have been
part of the C standard.
I am not sure I understand your point. Portability is supposed to be a
two way street.

On the IBM mainframe, the 3270 terminal (really 91.9% is terminal
emulation on windows these days) does not have certain characters from
the C basic execution character set. The 3270 has many (IMO) better
characters.

EBCDIC however has, for instance, the '[' and ']' symbols in its set of
characters. They are there. It isn't a translation problem per se. It
is just that when the C standard was being formulated there was no way
to type them from a 3270 terminal.

There was absolutely no problem taking C source code from Unix or
Windows, for example, and translating the ASCII to EBCDIC and compiling
the source. Trigraphs mean that source typed in on a 3270 can be sent
to a Unix system via EBCDIC to ASCII translation and still compile
without having to edit the source. (system specific parts excepted)

I am not advocating trigraphs. I do see your point. There were
realities in the hardware in the 1980's and 1990's that were there. I
am sure IBM had a presence with the Standards committee.

Just understand that my whole existence in this thread is because you
said you had never seen trigraphs outside a test suite. They do exist.
It is legacy code, I know, but it is there. And it is updated
periodically.

Oct 23 '06 #83
Walter Bright wrote:
>
.... snip ...
>
3) Please explain how C99 makes it possible to make a conforming C
implementation for RADIX50 encoding,
http://en.wikipedia.org/wiki/RADIX-50.
Assuming you meant 'impossible', RADIX-50 can only hold 40
characters, 26 alpha, 10 numeric, space, and three others. No room
for the fundamental C char set.
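
A short sketch of the arithmetic behind that: three characters from a 40-character set fit in one 16-bit word because 40 * 40 * 40 = 64000. The character ordering below is my assumption (the PDP-11 variant, as far as I recall it), and rad50_pack is a made-up name, not a real library call.

```c
#include <string.h>

/* Assumed ordering: space, A-Z, '$', '.', '%', then 0-9 (40 codes). */
static const char RAD50_SET[] = " ABCDEFGHIJKLMNOPQRSTUVWXYZ$.%0123456789";

/* Hypothetical helper: packs the first three characters of s into one
   value below 64000, or returns -1 if s is shorter than three
   characters or contains a character outside the set. */
long rad50_pack(const char *s)
{
    long word = 0;
    int i;

    for (i = 0; i < 3; i++) {
        const char *p = (s[i] != '\0') ? strchr(RAD50_SET, s[i]) : NULL;
        if (p == NULL)
            return -1;               /* e.g. '[', '{' or '#' */
        word = word * 40 + (long)(p - RAD50_SET);
    }
    return word;
}
```

With this ordering "ABC" packs to 1*1600 + 2*40 + 3 = 1683, while anything containing brackets, braces or lower-case letters is rejected, which is the point being made about the C character set.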

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Oct 23 '06 #84
jxh

CBFalconer wrote:
jxh wrote:
jacob navia wrote:
Recently, a heated debate started because of poor mr heathfield
was unable to compile a program with // comments.

Here is a utility for him, so that he can (at last) compile my
programs :-)
The code below is considerably larger, but it should get the job
done. It actually removes all comments.

... snip code ...

If you just want to delete all comments, my public domain uncmnt.c
is considerably shorter. ...

<http://cbfalconer.home.att.net/download/>
Very nice. It doesn't handle other cases besides trigraphs, though.

--
-- James

Oct 23 '06 #85
Mark McIntyre <ma**********@spamcop.net> writes:
On Sun, 22 Oct 2006 18:54:58 -0700, in comp.lang.c , Walter Bright
<wa****@digitalmars-nospamm.com> wrote:
>>Mark McIntyre wrote:
>>On Sat, 21 Oct 2006 19:35:34 GMT, in comp.lang.c , Keith Thompson
<ks***@mib.org> wrote:

Mark McIntyre <ma**********@spamcop.net> writes:
On Fri, 20 Oct 2006 16:40:37 -0700, in comp.lang.c , Walter Bright
<wa****@digitalmars-nospamm.com> wrote:
[...]
>No sane person is going to invent a new character encoding
>that doesn't include ASCII.
Apparently nobody told IBM.
It's unlikely *now* that anyone would invent a new encoding that's not
based on ASCII.

I'm not even sure that's true. I can see the Chinese deciding on some
totally new encoding scheme more suitable for their needs.

If their needs don't include communicating with the rest of the world

One could argue, that since there's more of them than us, we should
adapt...
>>or the internet

Puhleeze. There are already many thousands of websites which are paged
entirely exclusively in non-ASCII. In a few years, I predict a
majority of websites will have non-ASCII names.
Obviously the encodings used for Chinese and/or Japanese characters
are non-ASCII, but are they necessarily *incompatible* with ASCII?
Chinese in particular has *lots* of characters it has to represent;
reserving the first 128 codes for ASCII (including digits and
punctuation marks, which can be used in Chinese text) doesn't seem too
onerous.

Unicode is a superset of ASCII, and it can represent Chinese
characters easily enough. *If* it catches on world-wide, we can
continue to assume that the ASCII subset needed by C will be
available.
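
One way to see the superset property in practice, assuming UTF-8 as the encoding (the post only says Unicode, so that choice is mine): code points below 128 are stored as the same single byte that ASCII uses. The function name utf8_encode is made up for this sketch.

```c
#include <stddef.h>

/* Hypothetical helper: encodes one code point as UTF-8 into out and
   returns the number of bytes written, or 0 if cp is a surrogate or
   lies beyond U+10FFFF. */
size_t utf8_encode(unsigned long cp, unsigned char out[4])
{
    if (cp < 0x80UL) {                       /* ASCII: byte unchanged */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800UL) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800UL && cp <= 0xDFFFUL)    /* surrogates are invalid */
        return 0;
    if (cp < 0x10000UL) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp < 0x110000UL) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

So '?' (0x3F) stays the single byte 0x3F, while a Chinese character such as U+4E2D becomes the three bytes E4 B8 AD, none of which falls in the ASCII range.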

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Oct 23 '06 #86
On Mon, 23 Oct 2006 20:14:19 GMT, in comp.lang.c , Keith Thompson
<ks***@mib.org> wrote:
>Obviously the encodings used for Chinese and/or Japanese characters
are non-ASCII, but are they necessarily *incompatible* with ASCII?
Quite possibly not, although people have in the past been known to
deliberately write for incompatibility, due to personal, commercial or
nationalistic reasons. This is however probably offtopic in CLC...
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
Oct 23 '06 #87
"Jalapeno" <ja*******@mac.com> writes:
[...]
EBCDIC however has, for instance, the '[' and ']' symbols in its set of
characters. They are there. It isn't a translation problem per se. It
is just that when the C standard was being formulated there was no way
to type them from a 3270 terminal.
Really? My understanding is that there are multiple versions of
EBCDIC, some of which *don't* have '[' and ']' characters. Wikipedia
<http://en.wikipedia.org/wiki/EBCDIC> shows a table of something
called CCSID 500, which does have '[' and ']', along with accented
characters (which, if I understand correctly, "classic" EBCDIC didn't
have).

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Oct 23 '06 #88
Mark McIntyre <ma**********@spamcop.net> writes:
On Mon, 23 Oct 2006 20:14:19 GMT, in comp.lang.c , Keith Thompson
<ks***@mib.org> wrote:
>>Obviously the encodings used for Chinese and/or Japanese characters
are non-ASCII, but are they necessarily *incompatible* with ASCII?

Quite possibly not, although people have in the past been known to
deliberately write for incompatibility, due to personal, commercial or
nationalistic reasons. This is however probably offtopic in CLC...
It's not entirely off-topic. The future evolution of character sets
could have a major effect on future C standards. If we can't assume,
for example, that the '?' character will always be available, we'll
have to think about alternatives. Though there's probably not much
point in inventing specific solutions until and unless we see an
actual character set that *doesn't* have '?', and that people want to
use to write C programs.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Oct 23 '06 #89
Mark McIntyre wrote:
On Sun, 22 Oct 2006 18:54:58 -0700, in comp.lang.c , Walter Bright
>>I'm not even sure that's true. I can see the Chinese deciding on some
totally new encoding scheme more suitable for their needs.
If their needs don't include communicating with the rest of the world
One could argue, that since there's more of them than us, we should
adapt...
You can argue that. But don't expect to be taken seriously. The Chinese
and Japanese regularly mix in western letters in their web pages, books,
and magazines.

You're suggesting that we (and the Chinese) should throw out the entire
computer infrastructure, and rewrite/rebuild everything from scratch.
>or the internet
Puhleeze. There are already many thousands of websites which are paged
entirely exclusively in non-ASCII. In a few years, I predict a
majority of websites will have non-ASCII names.
The internet encodings are all supersets of ascii. That is not going to
change.
>or using the C, C++, Perl, Java, Ruby, Python, or D
programming languages, then they should go for it.
It may surprise you to learn this, but nations using Western lettering
are in a minority.
How can a C99 compiler work with totally non-western lettering?
Oct 23 '06 #90
