parsing a string

I have a string like:

"FL:1234ABCD:3:FileName With Spaces.txt\n"

and I want to read the values separated by ':' into variables. I tried to
use sscanf like this:

sscanf("FL:%s:%d:%s\n", lGuid, &lID, lFileName);

but lGuid just continues until whitespace or \0 is found. I need a way
to set the ':' as a separator, and a way for lFileName to be readable even
with whitespace in it.
Basically, a version of sscanf with specifiable separators/terminators is
what I require. Is there anything to help me? At the moment, I am skipping
colons and using memcpy(), which is messy.

Thanks
Allan
Jul 22 '05 #1
"Allan Bruce" <al*****@TAKEAWAYf2s.com> wrote in message
news:c5**********@news.freedom2surf.net...
How's about something like this?

#include <string>
#include <sstream>
using namespace std;

int main()
{
    string s = "FL:1234ABCD:3:FileName With Spaces.txt\n";
    string FileName;
    int ID;
    string GUID;
    istringstream stream(s);
    stream.ignore(3);                // ignore "FL:"
    getline(stream, GUID, ':');      // read up to the next ':' and discard it
    stream >> ID;
    stream.ignore();                 // ignore the ':' after id
    getline(stream, FileName, '\n'); // rest of the line, spaces included

    return 0;
}
Jul 22 '05 #2
Allan Bruce <al*****@TAKEAWAYf2s.com> wrote:
// invalid_input and isValid are assumed to be defined elsewhere
const std::string str = "FL:1234ABCD:3:FileName With Spaces.txt\n";
std::istringstream iss( str );

std::string FL;
std::getline( iss, FL, ':' );
if( FL!="FL" || !iss.good() ) throw invalid_input(str);

std::string GUID;
std::getline( iss, GUID, ':' );
if( !isValid(GUID) || !iss.good() ) throw invalid_input(str);

int ID;
iss >> ID;
if( !iss.good() ) throw invalid_input(str);

char ch = '\0';
iss >> ch;
if( ch!=':' || !iss.good() ) throw invalid_input(str);

std::string fname;
std::getline( iss, fname );
// getline() consumes the trailing '\n' without setting eofbit, so test for
// failure and then peek() to confirm that nothing follows the file name
if( iss.fail() || iss.peek()!=std::char_traits<char>::eof() ) throw invalid_input(str);
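
The snippet above relies on invalid_input and isValid without showing them. A minimal sketch of what they might look like follows. Both definitions are assumptions for illustration only (nothing in this thread defines them), and the rule that a GUID is a non-empty run of hexadecimal digits is just a guess based on the sample string.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>

// Hypothetical exception type: carries the offending input line in what().
struct invalid_input : std::runtime_error {
    explicit invalid_input(const std::string& line)
        : std::runtime_error("invalid input: " + line) {}
};

// Hypothetical check: a GUID is a non-empty run of hexadecimal digits.
bool isValid(const std::string& guid) {
    if (guid.empty()) return false;
    for (std::string::size_type i = 0; i < guid.size(); ++i) {
        // cast first: passing a negative char to isxdigit is undefined
        if (!std::isxdigit(static_cast<unsigned char>(guid[i]))) return false;
    }
    return true;
}
```

Adjust isValid to whatever your real GUID format is; the point is only that the check lives in one place.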
HTH,

Schobi

--
Sp******@gmx.de is never read
I'm Schobi at suespammers dot org

"Sometimes compilers are so much more reasonable than people."
Scott Meyers
Jul 22 '05 #3

"Allan Bruce" <al*****@TAKEAWAYf2s.com> wrote in message
news:c5**********@news.freedom2surf.net...
Thanks for the help guys - I wanted something shorter hand, and found out
some things about sscanf. Now I just need to use:

sscanf(xiBuffer, "FD:%[^:]%s:%[^:]%d:%d", lGUID, &lLastPktFlag,
&lDataLength);

and it works a treat
Allan
Jul 22 '05 #4
"Allan Bruce" <al*****@TAKEAWAYf2s.com> wrote in message
news:c5**********@news.freedom2surf.net...
Thanks for the help guys - I wanted something shorter hand, and found out
some things about sscanf. Now I just need to use:

sscanf(xiBuffer, "FD:%[^:]%s:%[^:]%d:%d", lGUID, &lLastPktFlag,
&lDataLength);


It is shorter, but I feel obliged to point out that it is not safer. If you
get any of the types wrong, either in the format string or in the argument
list, the behavior of sscanf is undefined. If the type sscanf tries to write
is smaller than the variable you pass it, it'll likely fail rather
benevolently, but if it's the other way around you have the possibility of a
stack buffer overflow and all the security problems associated with that. The
same goes for overrunning your string buffers, which can be dangerous on the
heap and even more so on the stack. The
question you need to ask yourself is whether you can trust yourself (and any
future maintainers of the code) to always ensure type safety and buffer
length safety where sscanf does not enforce it, and (even more importantly)
whether you can always trust the input to be well-formed. If the string is
coming from a network/internet source especially, you need something a lot
more secure than sscanf, since network input can't be trusted. Local file
input is slightly better, but might still be malformed, either purposely or
through filesystem malfunction. This still applies even if some other part of
your program generates the string, because then you are assuming that that part
of the program is 100% correct and can never generate a wrong string, and
that the memory in which the string resides is not corrupted by some
external (or internal) force.

Of course I can't force you to use any kind of construct, but I just hope
you're aware of the dangers that lie with sscanf, and that those few more
lines of code of the examples I and Hendrik provided can easily help avoid
them.
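
To make the "don't trust the input" point concrete, here is a rough sketch of the stream approach wrapped in a function that simply reports failure instead of assuming a well-formed line. FileRecord and parseFileRecord are names invented for this example, and the rule that the GUID and file name must be non-empty is an assumption too.

```cpp
#include <sstream>
#include <string>

// Hypothetical record type for a "FL:<guid>:<id>:<file name>" line.
struct FileRecord {
    std::string guid;
    int id;
    std::string fileName;
};

// Returns true and fills 'out' only if the whole line is well-formed;
// 'out' is left untouched on failure.
bool parseFileRecord(const std::string& line, FileRecord& out) {
    std::istringstream in(line);
    std::string tag;
    FileRecord rec;
    char sep = '\0';

    if (!std::getline(in, tag, ':') || tag != "FL") return false;
    if (!std::getline(in, rec.guid, ':') || rec.guid.empty()) return false;
    if (!(in >> rec.id) || !in.get(sep) || sep != ':') return false;
    if (!std::getline(in, rec.fileName) || rec.fileName.empty()) return false;

    out = rec;
    return true;
}
```

Because every field is a std::string or is read through operator>>, there is no buffer length to get wrong and no format string that can disagree with the argument types.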

--
Unforgiven
Jul 22 '05 #5
"Allan Bruce" <al*****@TAKEAWAYf2s.com> wrote in message news:<c5**********@news.freedom2surf.net>...
Use std::istringstream ... this has get and getline functions that
allow you to specify a termination character. You might find
something like this useful:

#include <string>
#include <sstream>
#include <iostream>

class my_parser {
    std::string text;
public:
    static const int max_field_size=100;

    explicit my_parser(const char *s) : text(s) {}

    void tokenize(char *fields[], int nf, const char delimiter=':') {
        // parses internal string, breaking at instances of <delimiter>
        // which are thrown away. Returns separated values as fields
        std::istringstream buffer(text);
        int f=0;
        while (f<nf) {
            buffer.getline(fields[f], max_field_size, delimiter);
            ++f;
        }
    }
};

int main() {
    using namespace std;
    const char* s="FL:1234ABCD:3:FileName With Spaces.txt\n";
    char **fields = new char*[4];
    for (int i=0; i<4; ++i)
        fields[i]=new char[my_parser::max_field_size];

    my_parser parse(s);
    parse.tokenize(fields, 4);
    for (int i=0; i<4; ++i)
        cout << fields[i] << endl;

    // release the buffers allocated above
    for (int i=0; i<4; ++i)
        delete[] fields[i];
    delete[] fields;

    return 0;
}

Obviously this is somewhat lobotomized, but I only spent a few minutes
on it .. it compiles (under gcc 3.3) and runs, giving the desired
result. Hopefully you can get the gist and adapt it to your purposes.
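
One possible adaptation, sketched here rather than taken from the post above: let std::string and std::vector own the storage, so there are no fixed-size buffers to allocate or free. The name splitFields and the maxFields parameter are made up for this example. The last field takes the rest of the line, so a file name that itself contains a ':' stays in one piece.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Splits 'text' at each 'delimiter'. At most 'maxFields' fields are produced,
// and the last one keeps any remaining delimiters.
std::vector<std::string> splitFields(const std::string& text,
                                     std::size_t maxFields,
                                     char delimiter = ':') {
    std::vector<std::string> fields;
    std::istringstream buffer(text);
    std::string field;
    // read all but the final field, stopping at each delimiter
    while (fields.size() + 1 < maxFields &&
           std::getline(buffer, field, delimiter)) {
        fields.push_back(field);
    }
    // whatever is left, up to the end of the line, is the final field
    if (maxFields > 0 && std::getline(buffer, field)) {
        fields.push_back(field);
    }
    return fields;
}
```

Calling splitFields(s, 4) on the sample string should give four fields; if the result has fewer than four, the input was short and the caller can reject it.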

HTH, Dave Moore
Jul 22 '05 #6
Allan Bruce wrote:

Thanks for the help guys - I wanted something shorter hand, and found out
some things about sscanf. Now I just need to use:

sscanf(xiBuffer, "FD:%[^:]%s:%[^:]%d:%d", lGUID, &lLastPktFlag,
&lDataLength);

and it works a treat


Dear god, man, think of the children!!

Seriously, that code has very serious problems and should not be used.
Never, NEVER use %s, or %[ in scanf() without specifying a correct field
width. It's a security exploit waiting to happen. It's gets()
reinvented. Just... no.

By the time you manage to get proper field widths in there, you'll find
a few things: 1) It's nearly as complex as the stringstream solutions.
Maybe even more complex. 2) It's less flexible than the stringstream
solutions. 3) It's less clear and maintainable than the stringstream
solutions. 4) Over the life of your program, it will harbor more bugs
than the stringstream solution.

Besides that, it's still wrong. I count five conversion fields, and only
three additional arguments. That gives undefined behavior (likely
result: stack corruption leading to a program crash or random bizarre
behavior). I also see %s followed by a ':'. That can never succeed,
since %s will not stop on a ':' character.

-Kevin
--
My email address is valid, but changes periodically.
To contact me please use the address from a recent posting.
Jul 22 '05 #7
