White space and >>

Hello all,

[ Disclaimer: I am a complete C++ newbie ]

I want to read lines from a text file, where each line has the
following syntax:

token1:token2:token3

There could be white space between tokens and ':'
There could be white space before token1 or after token3.

Because I will need to access every line several times, later in my
program, I first store every line in a string vector:

// Do you guys put the & near the type or near the parameter name?
static void read_lines(vector<string> &v)
{
    ifstream ifs(INFILE); // input file stream

    if (!ifs) // the stream tests false when the file could not be opened
    {
        cerr << "Unable to open input file " << INFILE << ".\n";
        exit(EXIT_FAILURE);
    }

    string line;

    while (getline(ifs, line))
    {
        // Ignore empty lines and comments.
        if (line.empty() || line[0] == HASH) continue;

        v.push_back(line);
    }
}

Does that part look OK?

Later on, when I am dealing with a specific line, I create a
stringstream object so I can use the >> operator.

Ideally, I would simply write:

{
    istringstream myss(mystring);
    string token1, token2, token3;

    myss >> token1;
    myss >> token2;
    myss >> token3;
}

But this doesn't work because ':' is not treated as white space. Is
there a simple solution?

Is my approach completely wrong?

Nudge

Jul 19 '05 #1
Grumble wrote:
> Hello all,
>
> [ Disclaimer: I am a complete C++ newbie ]
>
> I want to read lines from a text file, where each line has the following
> syntax:
>
> token1:token2:token3
>
> There could be white space between tokens and ':'
> There could be white space before token1 or after token3.


I forgot to mention that it is valid for token1 to be empty, but it
is not valid for token2 and token3 to be empty.

I see a problem. Consider

: t2 : t3

myss >> token1;
myss >> token2;
myss >> token3;

If the >> operator considers ':' to be white space, then I will end
up with token1 = "t2" which is not what I want...

On the other hand, consider

t1:t2:t3

If ':' is not treated as white space, or perhaps some kind of
special delimiter, then I will end up with token1="t1:t2:t3" which
is wrong too...

Errr, how can I get the "ignore white space" behavior, along with
the "split at the delimiter" behavior together?
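
To make the dilemma above concrete, here is a minimal sketch that reproduces both outcomes. It is not from the thread: the facet `colon_is_space` and the helper `read_tokens` are hypothetical names, and the approach assumes that imbuing the stream with a custom `std::ctype<char>` table is an acceptable way to make `>>` treat ':' as white space.

```cpp
#include <cstddef>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical facet (not from the thread): classifies ':' as white space
// in addition to whatever the classic "C" locale already treats as space.
struct colon_is_space : std::ctype<char> {
    colon_is_space() : std::ctype<char>(make_table()) {}

    static const mask* make_table() {
        // Copy the classic table once, then mark ':' as a space character.
        static std::vector<mask> table(classic_table(),
                                       classic_table() + table_size);
        table[':'] |= space;
        return &table[0];
    }
};

// Extracts up to three tokens with >> and returns how many were read.
// When colon_as_space is true, the stream is imbued with the facet above.
std::size_t read_tokens(const std::string& line, bool colon_as_space,
                        std::string tokens[3]) {
    std::istringstream myss(line);
    if (colon_as_space) {
        // The locale takes ownership of the facet and deletes it later.
        myss.imbue(std::locale(myss.getloc(), new colon_is_space));
    }
    std::size_t count = 0;
    while (count < 3 && (myss >> tokens[count])) ++count;
    return count;
}
```

With the default locale, `read_tokens("t1:t2:t3", false, t)` reads a single token, "t1:t2:t3". With the facet, `read_tokens(": t2 : t3", true, t)` reads only two tokens and puts "t2" into the first slot, so the empty token1 is lost, exactly as feared.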

Nudge

Jul 19 '05 #2


Grumble wrote:

> If ':' is not treated as white space, or perhaps some kind of
> special delimiter, then I will end up with token1="t1:t2:t3" which
> is wrong too...
>
> Errr, how can I get the "ignore white space" behavior, along with
> the "split at the delimiter" behavior together?


I think you are barking up the wrong tree.

Take your string.

Locate the 2 ':' characters.

Split the string into 3 separate strings using the ':' positions
you have determined earlier.

You now have 3 strings, each one containing maybe some
leading whitespace, the token, maybe some trailing whitespace.

Get rid of leading and trailing whitespace in each string
and you are left with the tokens alone.

Not every problem is worth solving with clever uses of streams.
Sometimes simple string manipulation is simpler.
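
The steps above could be sketched roughly as follows. The names `trim`, `split_line`, and `kWhitespace` are illustrative and do not come from the thread, and the white-space set is an assumption because the thread never shows its WHITESPACE constant. The validity rule (token1 may be empty, token2 and token3 may not) is taken from the earlier post.

```cpp
#include <string>

// Assumed white-space set; the thread's WHITESPACE constant is never shown.
const char* const kWhitespace = " \t\r\n";

// Returns s without leading and trailing white space.
std::string trim(const std::string& s) {
    const std::string::size_type first = s.find_first_not_of(kWhitespace);
    if (first == std::string::npos) return std::string();  // all white space
    const std::string::size_type last = s.find_last_not_of(kWhitespace);
    return s.substr(first, last - first + 1);
}

// Splits "token1:token2:token3" at the first two colons and trims each part.
// Returns false if a colon is missing or if token2 or token3 is empty.
bool split_line(const std::string& line,
                std::string& t1, std::string& t2, std::string& t3) {
    const std::string::size_type c1 = line.find(':');
    if (c1 == std::string::npos) return false;
    const std::string::size_type c2 = line.find(':', c1 + 1);
    if (c2 == std::string::npos) return false;

    t1 = trim(line.substr(0, c1));
    t2 = trim(line.substr(c1 + 1, c2 - c1 - 1));
    t3 = trim(line.substr(c2 + 1));  // everything after the second colon
    return !t2.empty() && !t3.empty();
}
```

For the line ": t2 : t3" this leaves t1 empty and sets t2 and t3 to "t2" and "t3"; for "t1:t2" it returns false because the second colon is missing.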

--
Karl Heinz Buchegger
kb******@gascad.at
Jul 19 '05 #3
Karl Heinz Buchegger wrote:

> Grumble wrote:
>> If ':' is not treated as white space, or perhaps some kind of
>> special delimiter, then I will end up with token1="t1:t2:t3" which
>> is wrong too...
>>
>> Errr, how can I get the "ignore white space" behavior, along with
>> the "split at the delimiter" behavior together?
>
> I think you are barking up the wrong tree.
>
> Take your string.
>
> Locate the 2 ':' characters.
>
> Split the string into 3 separate strings using the ':' positions
> you have determined earlier.
>
> You now have 3 strings, each one containing maybe some
> leading whitespace, the token, maybe some trailing whitespace.
>
> Get rid of leading and trailing whitespace in each string
> and you are left with the tokens alone.
>
> Not every problem is worth solving with clever uses of streams.
> Sometimes simple string manipulation is simpler.

How disappointing :-)

What you describe is what I have done, but I was hoping for a shorter
solution (in terms of lines of code).

void extract_field(string &field, const string &line, size_t lpos, size_t rpos)
{
    string temp = line.substr(lpos, rpos-lpos);

    lpos = temp.find_first_not_of(WHITESPACE);
    rpos = temp.find_first_of(WHITESPACE, lpos);

    if (lpos == string::npos) // temp contains only white space.
    {
        field.erase();
    }
    else
    {
        field = temp.substr(lpos, rpos-lpos);
    }
}

{
    string opt_name, opt_type, opt_val;

    size_t lpos = 0, rpos; // left and right position.

    // Extract option name from line and strip white space.
    rpos = line.find_first_of(COLON, lpos);
    extract_field(opt_name, line, lpos, rpos);
    lpos = rpos+1;

    // Extract option type from line and strip white space.
    rpos = line.find_first_of(COLON, lpos);
    extract_field(opt_type, line, lpos, rpos);
    lpos = rpos+1;

    // Extract option value list from line.
    opt_val = line.substr(lpos);
}

IMO, the above is far less elegant than:

myss >> opt_name;
myss >> opt_type;
myss >> opt_val;
// modulo error handling of course

I might use getline() to split my line into 3 strings... then use an
istringstream to strip leading and trailing white space...

I have a related question: at some point I have a string, and I want
to concatenate an int at the end.

string s("toto");
int n=7;

s = s + n; // It would be nice if this resulted in s = "toto7" :-)

Am I supposed to use C's sprintf? A stringstream?

Nudge

Jul 19 '05 #4
Hi Grumble,

"Grumble" <in*****@kma.eu.org> wrote in message
news:bp**********@news-rocq.inria.fr...
> How disappointing :-)
>
> I was hoping for a shorter
> solution (in terms of lines of code).


You could take an intensive look at the C++ stream library. There are ways
to do it, if you really want to. ;-)

If it's not a life-or-death matter of doing it in an object-oriented way, or
if you just want something short, reading a single line of text using "cin"
followed by a sscanf() on the input buffer might be shorter than writing
classes for sorting out stream input.

The best solution would be using a class for regular expressions (perhaps
with streams support).

Your problem could be parsed by a regular expression like
"/\w+[ \t]*:[ \t]*\w+[ \t]*:[ \t]*\w+/", which means "one or more word
characters followed by zero or more blank or tab characters, followed by a
colon, followed by ... etc."

Languages like Perl or PHP have regular expression support at the language
or library level, and I'm sure there's a regexp library for C++ as well. :-)
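
As a rough illustration of the regular-expression idea, the sketch below uses `std::regex`, which only became part of the standard library with C++11, well after this thread (Boost.Regex was the usual choice at the time). The function name `match_line` is hypothetical. The pattern differs from the one quoted above in two assumed ways: it adds capture groups, and it uses `\w*` for the first token because the thread allows token1 to be empty.

```cpp
#include <regex>
#include <string>

// Matches "token1:token2:token3" with optional blanks or tabs around each
// token. Inside the brackets, "\t" is a literal tab character produced by
// the C++ string literal. token1 uses \w* because it may be empty.
bool match_line(const std::string& line,
                std::string& t1, std::string& t2, std::string& t3) {
    static const std::regex pattern(
        "[ \t]*(\\w*)[ \t]*:[ \t]*(\\w+)[ \t]*:[ \t]*(\\w+)[ \t]*");
    std::smatch m;
    // regex_match succeeds only if the pattern covers the whole line.
    if (!std::regex_match(line, m, pattern)) return false;
    t1 = m[1].str();
    t2 = m[2].str();
    t3 = m[3].str();
    return true;
}
```

Note that `\w` only covers letters, digits, and underscore, so tokens containing other characters would need a wider character class.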

I hope that helps.

Regards,
Ekkehard Morgenstern.
Jul 19 '05 #5
In article <bp**********@news-rocq.inria.fr>,
Grumble <in*****@kma.eu.org> wrote:

> I want to read lines from a text file, where each line has the
> following syntax:
>
> token1:token2:token3

[snip code that reads the file into a vector of strings, one line per
string]

> Does that part look OK?

Looks OK to me.

> Later on, when I am dealing with a specific line, I create a
> stringstream object so I can use the >> operator.
>
> Ideally, I would simply write:
>
> {
>     istringstream myss(mystring);
>     string token1, token2, token3;
>
>     myss >> token1;
>     myss >> token2;
>     myss >> token3;
> }
>
> But this doesn't work because ':' is not treated as white space. Is
> there a simple solution?


Use getline() on myss, and tell it to use ':' as the separator, where
appropriate.

getline (myss, token1, ':');
getline (myss, token2, ':');
getline (myss, token3);

The tokens you pick up will also include whatever whitespace happens to
lie in between the colons.
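
One possible way to combine this with white-space stripping is sketched below: split with getline(), then run each piece through a second istringstream so that >> discards the surrounding white space. The helper names `strip` and `parse_line` are made up for this example, and the sketch assumes that no token contains interior white space.

```cpp
#include <sstream>
#include <string>

// Strips leading and trailing white space by letting >> skip it.
// Assumes the token has no interior white space; if it does, only the
// first word survives.
std::string strip(const std::string& raw) {
    std::istringstream iss(raw);
    std::string token;
    iss >> token;  // token stays empty when raw is blank
    return token;
}

// Splits line at ':' with getline(), then strips each piece.
// Returns false when a colon is missing or when token2 or token3 is empty.
bool parse_line(const std::string& line,
                std::string& token1, std::string& token2, std::string& token3) {
    std::istringstream myss(line);
    std::string raw1, raw2, raw3;
    // eofbit is set when getline() hits the end before finding ':'.
    if (!std::getline(myss, raw1, ':') || myss.eof()) return false;
    if (!std::getline(myss, raw2, ':') || myss.eof()) return false;
    std::getline(myss, raw3);  // rest of the line; may read nothing
    token1 = strip(raw1);
    token2 = strip(raw2);
    token3 = strip(raw3);
    return !token2.empty() && !token3.empty();
}
```

An empty first field such as ": t2 : t3" still parses, because getline() counts the discarded delimiter as extracted input and therefore does not fail.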

--
Jon Bell <jt*******@presby.edu> Presbyterian College
Dept. of Physics and Computer Science Clinton, South Carolina USA
Jul 19 '05 #6


Grumble wrote:

> Karl Heinz Buchegger wrote:
>
>> Grumble wrote:
>>> If ':' is not treated as white space, or perhaps some kind of
>>> special delimiter, then I will end up with token1="t1:t2:t3" which
>>> is wrong too...
>>>
>>> Errr, how can I get the "ignore white space" behavior, along with
>>> the "split at the delimiter" behavior together?
>>
>> I think you are barking up the wrong tree.
>>
>> Take your string.
>>
>> Locate the 2 ':' characters.
>>
>> Split the string into 3 separate strings using the ':' positions
>> you have determined earlier.
>>
>> You now have 3 strings, each one containing maybe some
>> leading whitespace, the token, maybe some trailing whitespace.
>>
>> Get rid of leading and trailing whitespace in each string
>> and you are left with the tokens alone.
>>
>> Not every problem is worth solving with clever uses of streams.
>> Sometimes simple string manipulation is simpler.
>
> How disappointing :-)

Depends :-)

> What you describe is what I have done, but I was hoping for a shorter
> solution (in terms of lines of code).
>
> void extract_field(string &field, const string &line, size_t lpos, size_t rpos)
> {
>     string temp = line.substr(lpos, rpos-lpos);
>
>     lpos = temp.find_first_not_of(WHITESPACE);
>     rpos = temp.find_first_of(WHITESPACE, lpos);
>
>     if (lpos == string::npos) // temp contains only white space.
>     {
>         field.erase();
>     }
>     else
>     {
>         field = temp.substr(lpos, rpos-lpos);
>     }
> }

I would refactor the above into 2 functions:

A function TrimWhitespace
and a function ExtractField (which uses TrimWhitespace)

The reason?
A function for trimming a string is a good thing to have in your
toolbox and will come in handy hundreds of times.

And the function has gotten shorter and your toolbox has grown
by one additional function :-)
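
A minimal sketch of that refactoring might look like the following. Only the two function names come from the post; the bodies are an assumption. This version classifies white space with `std::isspace` instead of the thread's WHITESPACE constant, and unlike the original extract_field it keeps any white space inside the field.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Toolbox helper: returns a copy of s without leading/trailing white space.
std::string TrimWhitespace(const std::string& s) {
    // The cast avoids undefined behaviour when char is signed and negative.
    auto not_space = [](char c) {
        return std::isspace(static_cast<unsigned char>(c)) == 0;
    };
    auto first = std::find_if(s.begin(), s.end(), not_space);
    auto last = std::find_if(s.rbegin(), s.rend(), not_space).base();
    // last points one past the final non-space character.
    return first < last ? std::string(first, last) : std::string();
}

// Copies line[lpos, rpos) into field, trimmed. rpos may be string::npos,
// in which case the field runs to the end of the line. lpos must not
// exceed line.size(), otherwise substr() throws std::out_of_range.
void ExtractField(std::string& field, const std::string& line,
                  std::size_t lpos, std::size_t rpos) {
    field = TrimWhitespace(line.substr(lpos, rpos - lpos));
}
```

ExtractField shrinks to a single statement, and TrimWhitespace can be reused anywhere a string needs trimming.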

[snip]

> I might use getline() to split my line into 3 strings...

OK

> then use an
> istringstream to strip leading and trailing white space...

Or use your new TrimWhitespace() function from your personal
toolbox :-)
A good programmer has collected a bag of little helper functions
like this one over the years.

> I have a related question: at some point I have a string, and I want
> to concatenate an int at the end.
>
> string s("toto");
> int n=7;
>
> s = s + n; // It would be nice if this resulted in s = "toto7" :-)
>
> Am I supposed to use C's sprintf? A stringstream?


A stringstream.
You might also look at Boost for its lexical_cast.
www.boost.org
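
For illustration, the stringstream route could look like this minimal sketch. The helper name `append_int` is hypothetical and not part of any library.

```cpp
#include <sstream>
#include <string>

// Appends the decimal representation of n to s via an output string stream.
std::string append_int(const std::string& s, int n) {
    std::ostringstream oss;
    oss << s << n;  // operator<< formats the int as text
    return oss.str();
}
```

With this helper, `append_int("toto", 7)` yields "toto7". Since C++11 the standard library also offers `std::to_string`, so `s += std::to_string(n);` does the same job without a stream.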

--
Karl Heinz Buchegger
kb******@gascad.at
Jul 19 '05 #7
