
Problem using sscanf...

Hi,

using sscanf, I'm trying to retrieve something, but nothing seems to
work.

Here's the pattern: SS%*sþ0þ%6s
Here's the data: SS000000395000000000DC-þ0þ799829þ11745 03725þ

Actually, I would like to retrieve the "799829" from the data, but it
always fails. I thought that the "%*sþ0þ" would work as if I was
using "%*21cþ0þ", but it doesn't.

Can someone tell me why ?

Regards,

AM.

Mar 21 '07 #1
On Mar 21, 3:19 pm, "Alex Mathieu" <alex.math...@gmail.com> wrote:
Can someone tell me why ?
Nope, but comp.lang.c might help, or you might consider using strtok
or Boost.Tokenizer or even std::tr1::regex (aka Boost.Regex).
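
For reference, the strtok route could look roughly like the sketch below. It assumes 'þ' is the single byte 0xFE (ISO-8859-1), and the names `D`, `SAMPLE` and `nth_token` are made up for illustration. The sample record is written without the space in its first field, on the assumption that the space is a display artifact.

```c
#include <stdio.h>
#include <string.h>

/* Assumption: the delimiter 'þ' is the single byte 0xFE (ISO-8859-1). */
#define D "\xFE"

/* Sample record from the thread, written with explicit delimiter bytes. */
static const char SAMPLE[] =
    "SS000000395000000000DC-" D "0" D "799829" D "11745 03725" D;

/* Returns the token at position `index` (0-based), or NULL if there are
   fewer tokens. strtok writes NUL bytes into `record`, so the caller must
   pass a modifiable copy, never a string literal. */
static const char *nth_token(char *record, int index)
{
    char *tok = strtok(record, D);
    while (tok != NULL && index-- > 0)
        tok = strtok(NULL, D);
    return tok;
}
```

Copying `SAMPLE` into a local `char` array and calling `nth_token(copy, 2)` yields "799829". Keep in mind that strtok treats a run of delimiters as one, so empty fields are silently skipped, and that it keeps hidden state between calls.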

Cheers! --M

Mar 21 '07 #2
On Mar 21, 8:19 pm, "Alex Mathieu" <alex.math...@gmail.com> wrote:
Can someone tell me why ?
I think your problem is that "%*s" keeps reading until it hits
whitespace, so it swallows the "þ0þ" marker along with everything
before it. By the time sscanf gets to the literal "þ0þ" in your
pattern, that part of the input is already gone.
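
A small sketch that reproduces the failure is given here. It assumes 'þ' is the single byte 0xFE (ISO-8859-1); `D`, `SAMPLE` and `try_original_pattern` are hypothetical names, and the sample record is written without the space in its first field, on the assumption that the space is a display artifact.

```c
#include <stdio.h>

/* Assumption: the delimiter 'þ' is the single byte 0xFE (ISO-8859-1). */
#define D "\xFE"

/* Sample record from the thread, written with explicit delimiter bytes. */
static const char SAMPLE[] =
    "SS000000395000000000DC-" D "0" D "799829" D "11745 03725" D;

/* Tries the original pattern "SS%*sþ0þ%6s" and returns sscanf's result.
   %*s skips every non-whitespace byte, delimiters included, so it runs
   straight past the þ0þ marker. The literal delimiter that follows in the
   format then has nothing to match, and %6s is never reached. */
static int try_original_pattern(const char *record, char out[7])
{
    out[0] = '\0';
    return sscanf(record, "SS%*s" D "0" D "%6s", out);
}
```

The return value is less than 1 (0 or EOF, depending on where the library gives up), and `out` stays empty.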

You really should consider using a regular expressions library, as
mlimber already said.

Alex

Mar 21 '07 #3
Alex Mathieu wrote:
Can someone tell me why ?
Because scanf is greedy. It will match till it cannot match anymore.
What your scan pattern does is read in the entire 'word' which means
that it will read till it hits a whitespace. Try /something like/ this:

"SS%*[^-]-þ0þ%[^þ]%*s"

What this says is to read in a string that consists of every character
that is not a '-', read in the -þ0þ, then read in a string that consists
of everything but a 'þ', then read in the rest of the 'word' (this last
step is not necessary if you are using sscanf).
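
A minimal sketch of that scanset pattern as a self-contained function follows. It assumes 'þ' is the single byte 0xFE (ISO-8859-1) and adds a width of 6 so the destination cannot overflow; `D`, `SAMPLE` and `extract_id` are hypothetical names.

```c
#include <stdio.h>

/* Assumption: the delimiter 'þ' is the single byte 0xFE (ISO-8859-1). */
#define D "\xFE"

/* Sample record from the thread, written with explicit delimiter bytes. */
static const char SAMPLE[] =
    "SS000000395000000000DC-" D "0" D "799829" D "11745 03725" D;

/* Returns 1 on success and stores at most 6 bytes plus a NUL in out. */
static int extract_id(const char *record, char out[7])
{
    /* "SS"       literal prefix
       %*[^-]-    skip everything up to the first '-', then match the '-'
       D "0" D    the literal marker þ0þ
       %6[^þ]     at most 6 bytes that are not the delimiter */
    return sscanf(record, "SS%*[^-]-" D "0" D "%6[^" D "]", out) == 1;
}
```

With the sample record, `extract_id(SAMPLE, out)` returns 1 and leaves "799829" in `out`. The delimiter is spliced in through adjacent string literals because a hex escape such as `\xFE` would otherwise absorb a following hex digit like the `0` in the marker.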

Note: using scanf can be difficult to get right. I also find the return
value not very useful as it doesn't tell me where the parse completed in
case I want to continue from there, but it could be useful for other
purposes.

When I use scanf, I usually use a %n in the format, and limit the string
read in to stop buffer overflows like this:

int byteOffset = 0;  // must be initialised: sscanf leaves it untouched
                     // if it never reaches %n.
char stuff[7];
stuff[sizeof(stuff)-1] = '\0';  // ensure null termination of the string
                                // without initialising the rest of it.

// Note the "%6[^þ]": the width keeps the stuff buffer from overflowing.
sscanf(buffer, "SS%*[^-]-þ0þ%6[^þ]þ%n", stuff, &byteOffset);
if (byteOffset != 0) {
    printf("You have read in the string %s.\n", stuff);
}
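
To illustrate the "continue from there" part, here is a sketch that uses the offset stored by %n to start a second sscanf call at the next field. It assumes 'þ' is the single byte 0xFE (ISO-8859-1); `D`, `SAMPLE` and `read_id_then_next` are hypothetical names.

```c
#include <stdio.h>

/* Assumption: the delimiter 'þ' is the single byte 0xFE (ISO-8859-1). */
#define D "\xFE"

/* Sample record from the thread, written with explicit delimiter bytes. */
static const char SAMPLE[] =
    "SS000000395000000000DC-" D "0" D "799829" D "11745 03725" D;

/* Reads the field after þ0þ into id, then resumes at the byte offset
   reported by %n and reads the following field into next.
   Returns 1 only if both fields were read. */
static int read_id_then_next(const char *record, char id[7], char next[16])
{
    int offset = 0;  /* stays 0 unless sscanf gets as far as %n */

    if (sscanf(record, "SS%*[^-]-" D "0" D "%6[^" D "]" D "%n",
               id, &offset) != 1 || offset == 0)
        return 0;

    /* record + offset points just past the delimiter that closed id. */
    return sscanf(record + offset, "%15[^" D "]", next) == 1;
}
```

On the sample record this leaves "799829" in `id` and "11745 03725" in `next`. The scanset does not stop at whitespace, so the space inside the last field is kept.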

However, if you change the size of stuff to hold fewer elements then
you must change this number too, which is a potential maintenance
problem. To get around that I would do something like this:

// not sure if there is a header that contains these two macros:
#define STRINGIZE2(x) #x
#define STRINGIZE(x) STRINGIZE2(x)

#define LEN 6  // max characters to read in; the buffer needs LEN+1
               // bytes so there is room for the terminating '\0'.
int byteOffset = 0;  // must be initialised: sscanf leaves it untouched
                     // if it never reaches %n.
char stuff[LEN+1];
stuff[LEN] = '\0';  // ensure null termination of the string,
                    // without initialising the rest of it.

// Note the "%" STRINGIZE(LEN) "[^þ]": this keeps the stuff buffer
// from overflowing, while allowing you to modify the length at
// a single point some time later.
sscanf(buffer, "SS%*[^-]-þ0þ%" STRINGIZE(LEN) "[^þ]þ%n", stuff,
       &byteOffset);
if (byteOffset != 0) {
    printf("You have read in the string %s.\n", stuff);
}
#undef LEN  // remove extraneous macros from the global macro namespace
#undef STRINGIZE
#undef STRINGIZE2
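
The stringizing trick can be checked in isolation with a sketch like the one below. It assumes 'þ' is the single byte 0xFE (ISO-8859-1); `D`, `ID_LEN`, `ID_FORMAT`, `SAMPLE` and `extract_id_sized` are hypothetical names. The macro holds the field width, and the buffer is one byte larger, since a scanf width does not count the terminating NUL.

```c
#include <stdio.h>
#include <string.h>

#define STRINGIZE2(x) #x
#define STRINGIZE(x) STRINGIZE2(x)  /* expands x before stringizing it */

/* Assumption: the delimiter 'þ' is the single byte 0xFE (ISO-8859-1). */
#define D "\xFE"

#define ID_LEN 6  /* max characters stored, not counting the NUL */
#define ID_FORMAT "SS%*[^-]-" D "0" D "%" STRINGIZE(ID_LEN) "[^" D "]"

/* Sample record from the thread, written with explicit delimiter bytes. */
static const char SAMPLE[] =
    "SS000000395000000000DC-" D "0" D "799829" D "11745 03725" D;

/* Returns 1 on success; out must have room for ID_LEN + 1 bytes. */
static int extract_id_sized(const char *record, char out[ID_LEN + 1])
{
    return sscanf(record, ID_FORMAT, out) == 1;
}
```

Here `ID_FORMAT` is assembled at compile time and contains "%6[^þ]", so changing `ID_LEN` updates the width and the buffer size together.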

Because of the difficulty in its use, many people choose not to use it.
However, if used correctly, it can be very fast at parsing.

FYI, I wrote this without testing it. There may be errors in the code
posted.

Hope this helps.
Adrian
--
_____________________________________________________________________
\/Adrian_Hawryluk BSc. - Specialties: UML, OOPD, Real-Time Systems\/
\ My newsgroup writings are licensed under a Creative Commons /
\ Attribution-Share Alike 3.0 License /
\_______[http://creativecommons.org/licenses/by-sa/3.0/]______/
\/_______[blog:_http://adrians-musings.blogspot.com/]______\/
Mar 21 '07 #4
Yeah, seen this way this could be part of a solution and I REALLY
thank you.

Thing is that the sscanf is used in a log injector in our systems, so
I'm only specifying the pattern and the data to deal with... not very
much latitude. However, your solution with the "SS%*[^-]-þ0þ%[^þ]%*s"
pattern could help me for a while.

Actually, my problem is that I want to retrieve info from a data
chunk where the data is enclosed between "þ", where "þ" is used as a
delimiter. Using regex it would be easy to retrieve the data with
something like þ*þ..., but with sscanf... this doesn't seem very
possible...

I'll try to think about a way to retrieve info easily from this
message...

However, thanks a lot for your very complete answer and the time you
took to write it down. It's not lost time, I'll try to implement that
solution on my own for test purposes :)

Regards,

Alexandre M.
Montréal, Québec


Mar 22 '07 #5
Alex Mathieu wrote:
Actually, my problem is that I want to retrieve info from a data
chunk where the data is enclosed between "þ", where "þ" is used as a
delimiter.
No problem Alex, but please don't top post or quote unnecessarily, it is
considered rude on the newsgroups.

Re your more general problem of getting data between the delimiters.
Use something like this:

// x is the max number of chars to read in.
// STRINGIZE expands x before stringizing it, so RECORD(LEN) gives the
// same result as RECORD(7); a bare #x would produce "%LEN[^þ]þ".
#define STRINGIZE2(x) #x
#define STRINGIZE(x) STRINGIZE2(x)
#define RECORD(x) "%" STRINGIZE(x) "[^þ]þ"
#define RECORD_IGNORE "%*[^þ]þ"

char str[8] = {};  // init entire array to '\0'. Slightly more
                   // overhead than just init the last one to '\0' but
                   // for small non-time-critical applications, it
                   // should be fine.
sscanf(buffer, RECORD_IGNORE RECORD(7) RECORD_IGNORE, str);

- or -

#define LEN 7
char str[LEN+1] = {};
sscanf(buffer, RECORD_IGNORE RECORD(LEN) RECORD_IGNORE, str);

- *don't do* -

#define DIM 8
char str[DIM] = {};
sscanf(buffer, RECORD_IGNORE RECORD(DIM-1) RECORD_IGNORE, str);

as your string will become "%*[^þ]þ%8-1[^þ]þ%*[^þ]þ" which doesn't make
sense.
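
For pulling out an arbitrary field, one field at a time, a sketch along these lines may be easier to maintain than one long format string. It assumes 'þ' is the single byte 0xFE (ISO-8859-1); `D`, `FIELD_MAX`, `FIELD_FMT`, `SAMPLE` and `get_field` are hypothetical names. It also handles empty fields, which a `%[` conversion alone cannot match.

```c
#include <stdio.h>

#define STRINGIZE2(x) #x
#define STRINGIZE(x) STRINGIZE2(x)

/* Assumption: the delimiter 'þ' is the single byte 0xFE (ISO-8859-1). */
#define D "\xFE"

#define FIELD_MAX 31  /* max characters stored per field, without the NUL */
#define FIELD_FMT "%" STRINGIZE(FIELD_MAX) "[^" D "]%n"

/* Sample record from the thread, written with explicit delimiter bytes. */
static const char SAMPLE[] =
    "SS000000395000000000DC-" D "0" D "799829" D "11745 03725" D;

/* Copies field number `index` (0-based) into out.
   Returns 1 on success, 0 if the record has fewer fields or if an earlier
   field is longer than FIELD_MAX. A trailing delimiter does not start a
   new (empty) field. */
static int get_field(const char *record, int index, char out[FIELD_MAX + 1])
{
    const char *p = record;

    for (int i = 0; ; ++i) {
        int used = 0;

        out[0] = '\0';
        if (*p == '\0')
            return 0;                 /* ran out of fields */
        if (*p != D[0]) {             /* non-empty field: let sscanf copy it */
            if (sscanf(p, FIELD_FMT, out, &used) != 1)
                return 0;
            p += used;                /* %n reported how far we got */
        }
        /* An empty field is handled by hand and leaves out as "". */
        if (i == index)
            return 1;
        if (*p != D[0])
            return 0;                 /* no delimiter where one was expected */
        ++p;                          /* step over the delimiter */
    }
}
```

On the sample record, `get_field(SAMPLE, 2, out)` leaves "799829" in `out`, and asking for field 4 returns 0 because only four fields exist.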

Good luck.
Adrian
Mar 22 '07 #6
