comments across platforms


Hi everybody,

this may be a stupid question:
i want to strip comments from a .cpp
file.

cpp comments look like:

// (two slashes) .... comment until newline -->|

but how do i catch newlines in different os
( UNIX / Mac / Win -> CR, LF, CR+LF)? are they always defined as '\n',
and the compilers will take care of it? or do i have to check
the file type in advance? how can i assure i strip the comments
of a MacOS file correctly under Win32?

best regards + many thanks,
hendrik

Aug 8 '05 #1
Hendrik Wendler wrote:
Hi everybody,

this may be a stupid question:
i want to strip comments from a .cpp
file.

cpp comments look like:

// (two slashes) .... comment until newline -->|

but how do i catch newlines in different os
( UNIX / Mac / Win -> CR, LF, CR+LF)? are they always defined as '\n',
and the compilers will take care of it?
That depends on (a) how you open the file and (b) what platform the file
was written on: one of the challenges is to read, say, a UNIX file on
Windows, or vice versa.
or do i have to check
the file type in advance?
Usually not. Besides, there is no sure way.
how can i assure i strip the comments
of a MacOS file correctly under Win32?


The three platforms that have the line breaks differently all have the \n
symbol in there somewhere. Our approach was always to look for that (BTW,
that's probably why 'std::getline' has '\n' as the default value for the
terminator symbol), and always weed out \r from the string obtained. Of
course, that requires opening the file as _binary_, not "text".

V
Aug 8 '05 #2

"Hendrik Wendler" <we************ @spatialknowled ge.com> wrote in message
news:dd******** *****@news.t-online.com...

Hi everybody,

this may be a stupid question:
i want to strip comments from a .cpp
file.

cpp comments look like:

// (two slashes) .... comment until newline -->|

but how do i catch newlines in different os
( UNIX / Mac / Win -> CR, LF, CR+LF)? are they always defined as '\n',
Yes, within your program, newline characters are expressed
as '\n', regardless of host platform.
and the compilers will take care of it?
Yes, in 'text' mode (the default for iostreams). In 'binary'
mode, no translation occurs.
or do i have to check
the file type in advance? how can i assure i strip the comments
of a MacOS file correctly under Win32?


It's not as simple as you might imagine (regardless of the
newline issue). Comments in C++ can also be expressed within
the 'C-style' delimiters /* and */. You'll need to keep track
of those and make sure each 'start' delimiter is matched by
exactly one 'end' delimiter.
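
To illustrate the bookkeeping this involves, here is a rough sketch of a small state machine that removes both comment styles from source text already held in memory. It is only a sketch under stated assumptions: raw string literals, digit separators, trigraphs and backslash-newline splicing are ignored, and the name strip_comments is made up. Each block comment is replaced by a single space so the tokens on either side of it do not merge.

```cpp
#include <cstddef>
#include <string>

// Remove // and /* */ comments from C++ source text.
// Assumes no raw string literals, digit separators, trigraphs or line splicing.
std::string strip_comments(const std::string& src) {
    enum class State { Code, LineComment, BlockComment, String, Char };
    State state = State::Code;
    std::string out;
    for (std::size_t i = 0; i < src.size(); ++i) {
        const char c = src[i];
        const char next = (i + 1 < src.size()) ? src[i + 1] : '\0';
        switch (state) {
        case State::Code:
            if (c == '/' && next == '/') {
                state = State::LineComment;
                ++i;  // skip the second slash
            } else if (c == '/' && next == '*') {
                state = State::BlockComment;
                ++i;  // skip the '*'
            } else {
                if (c == '"') state = State::String;
                else if (c == '\'') state = State::Char;
                out += c;
            }
            break;
        case State::LineComment:
            // Either '\r' or '\n' ends the comment; the line break is kept.
            if (c == '\r' || c == '\n') {
                state = State::Code;
                out += c;
            }
            break;
        case State::BlockComment:
            if (c == '*' && next == '/') {
                state = State::Code;
                ++i;         // skip the closing '/'
                out += ' ';  // a comment counts as one space
            }
            break;
        case State::String:
        case State::Char:
            out += c;
            if (c == '\\' && i + 1 < src.size()) {
                out += src[++i];  // copy the escaped character untouched
            } else if ((state == State::String && c == '"') ||
                       (state == State::Char && c == '\'')) {
                state = State::Code;
            }
            break;
        }
    }
    return out;
}
```

The String and Char states are there so that something like "// not a comment" inside a literal is copied through unchanged.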
-Mike
Aug 8 '05 #3
Is that all you want to do or is it part of something bigger and more
complex? For the former, although it is an interesting exercise to do
it in C++ (you _do_ want to do it in C++ no? :) I would suggest perl.
For the latter, if you are writing a more complex program parser or
something of that sort, then a lexical analyzer generator is better (a
lexical analyzer will usually have states. So for example, you can look
for a \r followed by a \n or a plain \n and so on).

-vijai.

Aug 8 '05 #4
Victor Bazarov wrote:
The three platforms that have the line breaks differently all have the \n
symbol in there somewhere. Our approach was always to look for that (BTW,
that's probably why 'std::getline' has '\n' as the default value for the
terminator symbol), and always weed out \r from the string obtained. Of
course, that requires opening the file as _binary_, not "text".


I thought:
Windows: "\r\n"
*nix: "\n"
mac: "\r"

A positive PITA.

Ben
--
I'm not just a number. To many, I'm known as a String...
Aug 8 '05 #5
* Hendrik Wendler:

this may be a stupid question:
i want to strip comments from a .cpp
file.

cpp comments look like:

// (two slashes) .... comment until newline -->|

but how do i catch newlines in different os
( UNIX / Mac / Win -> CR, LF, CR+LF)? are they always defined as '\n',
and the compilers will take care of it?
The compiler's associated standard library takes care of it, following that
particular platform's convention.

or do i have to check
the file type in advance? how can i assure i strip the comments
of a MacOS file correctly under Win32?


In that scenario you'll need to open the file in binary mode, and check for
either '\r' (Mac), '\n' (Unix), or '\r' followed by '\n' (Windows). One simple
algorithm for your cross-platform application is to simply regard any of
'\r' or '\n' as end-of-line, and copy the characters faithfully. Of course
that may not work for some obscure platform where, say, files are
record-oriented with fixed length lines, no end-of-line character (e.g., the
HP3000 under MPE I think it was called, early eighties...).

OT extra note: I now checked that MPE thing via Google, and found to my
astonishment (Wikipedia) that the HP3000 series, introduced in 1973, was
still sold up till 2003 (!), with service available until 2007! Ouch. It
must hurt to use those old beasties, not to mention _buying_ them -- I
wonder if anyone's running old PDP-11s, and perhaps even buying them? Must
be some pointy-haired bosses doing this. It's really scary.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Aug 8 '05 #6
Hendrik Wendler wrote:

Hi everybody,

this may be a stupid question:
i want to strip comments from a .cpp
file.

cpp comments look like:

// (two slashes) .... comment until newline -->|

but how do i catch newlines in different os
( UNIX / Mac / Win -> CR, LF, CR+LF)? are they always defined as '\n',
and the compilers will take care of it? or do i have to check
the file type in advance? how can i assure i strip the comments
of a MacOS file correctly under Win32?


If you're transferring files between systems use ftp in ascii mode. It
will translate line endings, except on certain brain-damaged versions of
Linux. Once you've transferred a file, use

sed -e 's!//.*$!!' < source-file > target-file

I think that's the right script command, but you should probably check.
The quotation marks keep the shell from interpreting the script.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Aug 9 '05 #7
Ben Pope wrote:
Victor Bazarov wrote:
The three platforms that have the line breaks differently all have
the \n symbol in there somewhere. Our approach was always to look
for that (BTW, that's probably why 'std::getline' has '\n' as the
default value for the terminator symbol), and always weed out \r
from the string obtained. Of course, that requires opening the file
as _binary_, not "text".


I thought:
Windows: "\r\n"
*nix: "\n"
mac: "\r"

A positive PITA.


You're right. Up to Mac OS 9 it had \r only. Now, with OS X and later,
they have switched to "normal" UNIX \n.

V
Aug 9 '05 #8
On Mon, 08 Aug 2005 15:27:06 +0200, Hendrik Wendler
<we************ @spatialknowled ge.com> wrote in comp.lang.c++:

Hi everybody,

this may be a stupid question:
i want to strip comments from a .cpp
file.

cpp comments look like:

// (two slashes) .... comment until newline -->|

but how do i catch newlines in different os
( UNIX / Mac / Win -> CR, LF, CR+LF)? are they always defined as '\n',
and the compilers will take care of it? or do i have to check
the file type in advance? how can i assure i strip the comments
of a MacOS file correctly under Win32?

best regards + many thanks,
hendrik


There is nothing guaranteed by the C++ standard, but there's a
technique I have used for more than 20 years in C that will handle the
three platforms you mention. It is a little more work than just using
std::getline(), however.

Open your file in binary mode. Read the file character by character,
or read it in chunks into a buffer and step through it character by
character.

A '\r' character is always considered an end of line token. A '\n'
character is also considered an end of line token, EXCEPT when it is
immediately preceded by a '\r'.

This will work for *x, Windows/MS-DOS, and Mac.
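
A small sketch of that rule, assuming the file contents have already been read in binary mode into a std::string (split_lines is a made-up name, not anything from the standard library):

```cpp
#include <string>
#include <vector>

// Split text into lines using the rule above:
// '\r' always ends a line; '\n' ends a line unless it directly follows '\r'.
std::vector<std::string> split_lines(const std::string& data) {
    std::vector<std::string> lines;
    std::string current;
    char prev = '\0';
    for (char c : data) {
        if (c == '\r') {
            lines.push_back(current);
            current.clear();
        } else if (c == '\n') {
            if (prev != '\r') {
                lines.push_back(current);
                current.clear();
            }
            // otherwise this '\n' completes a "\r\n" pair already counted
        } else {
            current += c;
        }
        prev = c;
    }
    if (!current.empty()) {
        lines.push_back(current);  // last line had no terminator
    }
    return lines;
}
```

With this, "a\r\nb\nc\rd" comes out as the four lines a, b, c and d, whichever convention each line used.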

Where it won't work is for platforms where end of line is not
indicated by a character, i.e., fixed block text files, and there are
probably a few dinosaurs like that still out there. It also
won't work if some pathological file system stores text files with
"\n\r" in that order.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
Aug 9 '05 #9
Victor Bazarov wrote:
Ben Pope wrote:
Victor Bazarov wrote:
The three platforms that have the line breaks differently all have
the \n symbol in there somewhere. Our approach was always to look
for that (BTW, that's probably why 'std::getline' has '\n' as the
default value for the terminator symbol), and always weed out \r
from the string obtained. Of course, that requires opening the file
as _binary_, not "text".


I thought:
Windows: "\r\n"
*nix: "\n"
mac: "\r"

A positive PITA.

You're right. Up to MacOS 9 it had \r only. Now, OsX and after
they switched to "normal" UNIX \n.


Yeah, I was wondering that with a colleague today. We presumed that since it was basically *nix (BSD) it would be \n in OSX.

Ben
--
I'm not just a number. To many, I'm known as a String...
Aug 9 '05 #10
