Hi everybody,
this may be a stupid question:
i want to strip comments from a .cpp
file.
cpp comments look like:
// (two slashes) .... comment until newline -->|
but how do i catch newlines in different os
( UNIX / Mac / Win -> CR, LF, CR+LF)? are they always defined as '\n',
and the compilers will take care of it? or do i have to check
the file type in advance? how can i assure i strip the comments
of a MacOS file correctly under Win32?
best regards + many thanks,
hendrik
Hendrik Wendler wrote: Hi everybody,
this may be a stupid question: i want to strip comments from a .cpp file.
cpp comments look like:
// (two slashes) .... comment until newline -->|
but how do i catch newlines in different os ( UNIX / Mac / Win -> CR, LF, CR+LF)? are they always defined as '\n', and the compilers will take care of it?
That depends on (a) how you open the file and (b) what platform the file
was written on: one of the challenges is to read, say, a UNIX file on
Windows, or vice versa.
or do i have to check the file type in advance?
Usually not. Besides, there is no sure way.
how can i assure i strip the comments of a MacOS file correctly under Win32?
The three platforms that have the line breaks differently all have the \n
symbol in there somewhere. Our approach was always to look for that (BTW,
that's probably why 'std::getline' has '\n' as the default value for the
terminator symbol), and always weed out \r from the string obtained. Of
course, that requires opening the file as _binary_, not "text".
V
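A minimal sketch of the approach described in the reply above, assuming the file was opened with std::ios::binary so the library does no newline translation. The name read_lines is a hypothetical helper, not something from the thread. Only a trailing '\r' is removed here, which covers CR+LF files. A CR-only file (classic Mac) would come back as one long line, a limitation raised further down the thread.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads '\n'-terminated lines and drops a trailing '\r'.
// Assumption: `in` was opened in binary mode, e.g.
//   std::ifstream in("some_file.cpp", std::ios::binary);
std::vector<std::string> read_lines(std::istream& in) {
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {            // '\n' is the default delimiter
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();                    // CR+LF input: remove the leftover CR
        }
        lines.push_back(line);
    }
    return lines;
}

// Convenience overload for text that is already in memory.
std::vector<std::string> read_lines(const std::string& text) {
    std::istringstream in(text);
    return read_lines(in);
}
```

Calling read_lines on the bytes "a\r\nb\nc" should give the three lines "a", "b" and "c".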
"Hendrik Wendler" <we************ @spatialknowled ge.com> wrote in message
news:dd******** *****@news.t-online.com... Hi everybody,
this may be a stupid question: i want to strip comments from a .cpp file.
cpp comments look like:
// (two slashes) .... comment until newline -->|
but how do i catch newlines in different os ( UNIX / Mac / Win -> CR, LF, CR+LF)? are they always defined as '\n',
Yes, within your program, newline characters are expressed
as '\n', regardless of host platform.
and the compilers will take care of it?
Yes, in 'text' mode (the default for iostreams). In 'binary'
mode, no translation occurs.
or do i have to check the file type in advance? how can i assure i strip the comments of a MacOS file correctly under Win32?
It's not as simple as you might imagine (regardless of the
newline issue). Comments in C++ can also be expressed within
the 'C-style' delimiters /* and */. You'll need to keep track
of those and make sure each 'start' delimiter is matched by
exactly one 'end' delimiter.
-Mike
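To illustrate the point about C-style comments, here is a rough sketch of a small state machine that removes both // and /* */ comments while leaving string and character literals alone. The name strip_comments is hypothetical and nothing here comes from the thread. Since C-style comments do not nest, the sketch ends a block comment at the first */ it sees. It assumes the whole file is already in a std::string and does not handle raw string literals, digit separators such as 1'000, trigraphs, or backslash-newline continuations.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns `src` with // and /* */ comments removed.
// A block comment is replaced by one space so adjacent tokens stay apart.
// Either '\r' or '\n' ends a // comment and is copied to the output.
std::string strip_comments(const std::string& src) {
    enum class State { Code, LineComment, BlockComment, String, Char };
    State state = State::Code;
    std::string out;
    for (std::size_t i = 0; i < src.size(); ++i) {
        const char c = src[i];
        const char next = (i + 1 < src.size()) ? src[i + 1] : '\0';
        switch (state) {
        case State::Code:
            if (c == '/' && next == '/') {
                state = State::LineComment;
                ++i;                              // skip the second '/'
            } else if (c == '/' && next == '*') {
                state = State::BlockComment;
                ++i;                              // skip the '*'
            } else {
                if (c == '"') state = State::String;
                else if (c == '\'') state = State::Char;
                out += c;
            }
            break;
        case State::LineComment:
            if (c == '\r' || c == '\n') {         // end of comment on any platform
                state = State::Code;
                out += c;
            }
            break;
        case State::BlockComment:
            if (c == '*' && next == '/') {        // first "*/" closes the comment
                state = State::Code;
                ++i;
                out += ' ';
            }
            break;
        case State::String:
        case State::Char:
            out += c;
            if (c == '\\' && i + 1 < src.size()) {
                out += next;                      // copy the escaped character as-is
                ++i;
            } else if ((state == State::String && c == '"') ||
                       (state == State::Char && c == '\'')) {
                state = State::Code;
            }
            break;
        }
    }
    return out;
}
```

Line breaks inside a removed block comment disappear with it, so line numbers in the output can shift. A tool that needs stable line numbers would have to copy those characters through instead.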
Is that all you want to do or is it part of something bigger and more
complex? For the former, although it is an interesting exercise to do
it in C++ (you _do_ want to do it in C++ no? :) I would suggest perl.
For the latter, if you are writing a more complex program parser or
something of that sort, then a lexical analyzer generator is better (a
lexical analyzer will usually have states. So for example, you can look
for a \r followed by a \n or a plain \n and so on).
-vijai.
Victor Bazarov wrote: The three platforms that have the line breaks differently all have the \n symbol in there somewhere. Our approach was always to look for that (BTW, that's probably why 'std::getline' has '\n' as the default value for the terminator symbol), and always weed out \r from the string obtained. Of course, that requires opening the file as _binary_, not "text".
I thought:
Windows: "\r\n"
*nix: "\n"
mac: "\r"
A positive PITA.
Ben
--
I'm not just a number. To many, I'm known as a String...
* Hendrik Wendler: this may be a stupid question: i want to strip comments from a .cpp file.
cpp comments look like:
// (two slashes) .... comment until newline -->|
but how do i catch newlines in different os ( UNIX / Mac / Win -> CR, LF, CR+LF)? are they always defined as '\n', and the compilers will take care of it?
The compiler's associated standard library takes care of it for that
particular platform's convention.
or do i have to check the file type in advance? how can i assure i strip the comments of a MacOS file correctly under Win32?
In that scenario you'll need to open the file in binary mode, and check for
either '\r' (Mac), '\n' (Unix), or '\r' followed by '\n' (Windows). One simple
algorithm for your cross-platform application is to simply regard any of
'\r' or '\n' as end-of-line, and copy the characters faithfully. Of course
that may not work for some obscure platform where, say, files are
record-oriented with fixed length lines, no end-of-line character (e.g., the
HP3000 under MPE I think it was called, early eighties...).
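
The "regard any of '\r' or '\n' as end-of-line, and copy the characters faithfully" idea might look roughly like the sketch below. The name strip_line_comments is hypothetical, and both streams are assumed to have been opened with std::ios::binary. It is deliberately naive: a "//" inside a string literal is treated as a comment, and /* */ comments are ignored.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: copies `in` to `out`, dropping everything from "//"
// up to (but not including) the next '\r' or '\n'.
// Assumption: both streams are in binary mode, so no newline translation occurs.
void strip_line_comments(std::istream& in, std::ostream& out) {
    bool in_comment = false;
    char c;
    while (in.get(c)) {
        if (in_comment) {
            if (c == '\r' || c == '\n') {   // either character ends the comment
                in_comment = false;
                out.put(c);                 // the line ending is copied unchanged
            }
        } else if (c == '/' && in.peek() == '/') {
            in.get(c);                      // consume the second slash
            in_comment = true;
        } else {
            out.put(c);
        }
    }
}

// Convenience wrapper for text that is already in memory.
std::string strip_line_comments(const std::string& text) {
    std::istringstream in(text);
    std::ostringstream out;
    strip_line_comments(in, out);
    return out.str();
}
```

Because the line-ending bytes are passed through untouched, a Mac, Unix or Windows file keeps its original convention in the output.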
OT extra note: I now checked that MPE thing via Google, and found to my
astonishment (Wikipedia) that the HP3000 series, introduced in 1973, was
still sold up till 2003 (!), with service available until 2007! Ouch. It
must hurt to use those old beasties, not to mention _buying_ them -- I
wonder if anyone's running old PDP-11s, and perhaps even buying them? Must
be some pointy-haired bosses doing this. It's really scary.
--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Hendrik Wendler wrote: Hi everybody,
this may be a stupid question: i want to strip comments from a .cpp file.
cpp comments look like:
// (two slashes) .... comment until newline -->|
but how do i catch newlines in different os ( UNIX / Mac / Win -> CR, LF, CR+LF)? are they always defined as '\n', and the compilers will take care of it? or do i have to check the file type in advance? how can i assure i strip the comments of a MacOS file correctly under Win32?
If you're transferring files between systems use ftp in ascii mode. It
will translate line endings, except on certain brain-damaged versions of
Linux. Once you've transferred a file, use
sed -e 's!//.*$!!' < source-file > target-file
I think that's the right script command, but you should probably check.
The quotation marks around the script keep the shell from interpreting it.
--
Pete Becker
Dinkumware, Ltd. ( http://www.dinkumware.com)
Ben Pope wrote: Victor Bazarov wrote: The three platforms that have the line breaks differently all have the \n symbol in there somewhere. Our approach was always to look for that (BTW, that's probably why 'std::getline' has '\n' as the default value for the terminator symbol), and always weed out \r from the string obtained. Of course, that requires opening the file as _binary_, not "text".
I thought: Windows: "\r\n" *nix: "\n" mac: "\r"
A positive PITA.
You're right. Up to Mac OS 9 it had \r only. Now, with OS X and after,
they switched to "normal" UNIX \n.
V
On Mon, 08 Aug 2005 15:27:06 +0200, Hendrik Wendler
<we************ @spatialknowled ge.com> wrote in comp.lang.c++: Hi everybody,
this may be a stupid question: i want to strip comments from a .cpp file.
cpp comments look like:
// (two slashes) .... comment until newline -->|
but how do i catch newlines in different os ( UNIX / Mac / Win -> CR, LF, CR+LF)? are they always defined as '\n', and the compilers will take care of it? or do i have to check the file type in advance? how can i assure i strip the comments of a MacOS file correctly under Win32?
best regards + many thanks, hendrik
There is nothing guaranteed by the C++ standard, but there's a
technique I have used for more than 20 years in C that will handle the
three platforms you mention. It is a little more work than just using
std::getline(), however.
Open your file in binary mode. Read the file character by character,
or read it in chunks into a buffer and step through it character by
character.
A '\r' character is always considered an end of line token. A '\n'
character is also considered an end of line token, EXCEPT when it is
immediately preceded by a '\r'.
This will work for *x, Windows/MS-DOS, and Mac.
Where it won't work is for platforms where end of line is not
indicated by a character, i.e., fixed block text files, and there are
probably a few dinosaurs like that still out there. It also
won't work if some pathological file system stores text files with
"\n\r" in that order.
--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
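Jack Klein's rule above ('\r' always ends a line; '\n' ends a line unless it immediately follows a '\r') could be sketched as follows. The name split_lines is hypothetical, and the input is assumed to be the raw bytes of a file read in binary mode.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits raw file contents into lines.
// '\r' always terminates a line; '\n' terminates a line unless the
// previous character was '\r' (in which case it is the tail of CR+LF).
std::vector<std::string> split_lines(const std::string& data) {
    std::vector<std::string> lines;
    std::string current;
    bool prev_was_cr = false;
    for (char c : data) {
        if (c == '\r') {
            lines.push_back(current);
            current.clear();
            prev_was_cr = true;
        } else if (c == '\n') {
            if (!prev_was_cr) {                 // lone LF: Unix line ending
                lines.push_back(current);
                current.clear();
            }
            prev_was_cr = false;                // LF after CR was already counted
        } else {
            current += c;
            prev_was_cr = false;
        }
    }
    if (!current.empty()) {
        lines.push_back(current);               // last line without a terminator
    }
    return lines;
}
```

As the post warns, a file that stores "\n\r" would be counted as two line endings by this rule.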
Victor Bazarov wrote: Ben Pope wrote:
Victor Bazarov wrote:
The three platforms that have the line breaks differently all have the \n symbol in there somewhere. Our approach was always to look for that (BTW, that's probably why 'std::getline' has '\n' as the default value for the terminator symbol), and always weed out \r from the string obtained. Of course, that requires opening the file as _binary_, not "text".
I thought: Windows: "\r\n" *nix: "\n" mac: "\r"
A positive PITA.
You're right. Up to Mac OS 9 it had \r only. Now, with OS X and after, they switched to "normal" UNIX \n.
Yeah, I was wondering that with a colleague today. We presumed that since it was basically *nix (BSD) it would be \n in OSX.
Ben
--
I'm not just a number. To many, I'm known as a String...