frustrated by fscanf and sscanf

a
After reading section 12 of the comp.lang.c FAQ and googling again, I still
cannot find any threads about reading a series of numbers. The input files,
which are loosely structured, look like this:

<presence/absence of n space/tab on the first n lines>
12

<presence/absence of n space/tab here>0<presence/absence of n space/tab
here>90 10 23 43 0 0 0 0 0 0 0
90 0 0 0 0 88 0 0 0 0 0 0
10 0 0 0 0 0 26 16 0 0 0 0
23 0 0 0 0 0 0 0 0 0 0 0
43 0 0 0 0 0 0 0 0 0 0 0
0 88 0 0 0 0 0 0 1 0 0 0
0 0 26 0 0 0 0 0 0 0 0 0
0 0 16 0 0 0 0 0 0 96 0 0
0 0 0 0 0 1 0 0 0 0 29 0
0 0 0 0 0 0 0 96 0 0 0 37
0 0 0 0 0 0 0 0 29 0 0 0
0 0 0 0 0 0 0 0 0 37 0 0
<presence/absence of n space/tab/newline here>
0 36 54 26 59 72 9 34 79 17 46 95
36 0 73 35 90 58 30 78 35 44 79 36
54 73 0 21 10 97 58 66 69 61 54 63
26 35 21 0 93 12 46 40 37 48 68 85
59 90 10 93 0 64 5 29 76 16 5 76
72 58 97 12 64 0 96 55 38 54 0 34
9 30 58 46 5 96 0 83 35 11 56 37
34 78 66 40 29 55 83 0 44 12 15 80
79 35 69 37 76 38 35 44 0 64 39 33
17 44 61 48 16 54 11 12 64 0 70 86
46 79 54 68 5 0 56 15 39 70 0 18
95 36 63 85 76 34 37 80 33 86 18 0
<presence/absence of n space/tab/newline here>
The unusual but deterministic behaviour is that some rows can be read
successfully but not others. In this case, the first matrix is read
successfully, but for the second one nothing can be read (I think; printf shows
all zeros) until the "5 96 0 83" (the 7th row of the 2nd matrix), and after that
the reading goes on and off again (Frustrated >.< )

My code, after trying feof, fgets, sscanf and so on, is now as follows:

while(!feof(file) && i < SIZE * SIZE + SIZE * SIZE) { // SIZE, an integer obtained from reading the 1st line
    fscanf(file, "%lf", &r[i]);
    i++;
}

I know they are integer matrices, but the double data type is needed for
further development.
Nov 28 '07 #1
On Nov 27, 6:48 pm, "a" <a...@a.com> wrote:
<snip full quote of the original post>

Suggestion:
The scanf() function is evil, as everyone knows.
I suggest using fgets() to read a line, and then parse the line into
individual numbers with something like this:
=======================================================================
#include <string.h>
#include <limits.h>
#include <stdlib.h>
#include <ctype.h>

/* The default delimiters are chosen as some ordinary white space
   characters: */
static const char default_delimiters[] = {' ', '\n', '\t', '\r', '\f', 0};

/*
 * The tokenize() function is similar to a reentrant version of strtok().
 * It parses tokens from 'string', where tokens are substrings separated
 * by characters from 'delimiter_list'.
 * To get the first token from 'string', tokenize() is called with
 * 'string' as its first parameter.
 * Remaining tokens from 'string' are obtained by calling tokenize()
 * with NULL for the first parameter.
 * The string of delimiters, identified by 'delimiter_list', can change
 * from call to call.
 * If the string of delimiters is NULL or empty, then the standard list
 * 'default_delimiters' (see above) is used.
 * tokenize() modifies the memory pointed to by 'string', because it
 * writes null characters into the buffer.
 */
char *tokenize(char *string, const char *delimiter_list,
               char **placeholder)
{
    if (delimiter_list == NULL)
        delimiter_list = default_delimiters;

    if (delimiter_list[0] == 0)
        delimiter_list = default_delimiters;

    if (string == NULL)
        string = *placeholder;

    if (string == NULL)
        return NULL;
    /*
     * The strspn() function computes the length of the initial segment
     * of the first string that consists entirely of characters contained
     * in the second string.
     */
    string += strspn(string, delimiter_list);
    if (!string[0]) {
        *placeholder = string;
        return NULL;
    } else {
        char *token;
        token = string;
        /*
         * The strpbrk() function finds the first occurrence in the first
         * string of any character contained in the second string.
         */
        string = strpbrk(token, delimiter_list);
        if (string == NULL)
            *placeholder = token + strlen(token);
        else {
            *string++ = 0;
            *placeholder = string;
        }
        return token;
    }
}

#ifdef UNIT_TEST
char test_string0[] = "This is a test. This is only a test. "
    "If it were an actual emergency, you would be dead.";
char test_string1[] = "This is a also a test. This is only a test. "
    "If it were an actual emergency, you would be dead. 12345";
char test_string2[] = "The quick brown fox jumped over the "
    "lazy dog's back 1234567890 times.";
char test_string3[] = " \t\r\n\fThe quick brown fox jumped "
    "over the lazy dog's back 1234567890 times.";
char test_string4[] = "This is a test. This is only a test. "
    "If it were an actual emergency, you would be dead.";
char test_string5[] = "This is a also a test. This is only a test. "
    "If it were an actual emergency, you would be dead. 12345";
char test_string6[] = "The quick brown fox jumped over the "
    "lazy dog's back 1234567890 times.";
char test_string7[] = " \t\r\n\fThe quick brown fox jumped "
    "over the lazy dog's back 1234567890 times.";

#include <stdio.h>

/* File-scope array, so it starts zero-filled and the delimiter list
   built below stays null-terminated. */
char whitespace[UCHAR_MAX + 1];

/* This test will create token separators as any whitespace or any
   punctuation marks: */
void init_whitespace(void)
{
    int i;
    int index = 0;
    for (i = 0; i < UCHAR_MAX; i++) {
        if (isspace(i)) {
            whitespace[index++] = (char) i;
        }
        if (ispunct(i)) {
            whitespace[index++] = (char) i;
        }
    }
}

/*
   TNX Gerd.
*/
void spin_test(char *test_string, char *white)
{
    char *p = NULL;
    char *token;
    token = tokenize(test_string, white, &p);
    while (token) {
        puts(token);
        token = tokenize(NULL, white, &p);
    }
}

int main(void)
{
    init_whitespace();
    puts("Whitespace is whitespace+punctuation");
    spin_test(test_string0, whitespace);
    spin_test(test_string1, whitespace);
    spin_test(test_string2, whitespace);
    spin_test(test_string3, whitespace);
    puts("Whitespace is simple whitespace");
    spin_test(test_string4, NULL);
    spin_test(test_string5, NULL);
    spin_test(test_string6, NULL);
    spin_test(test_string7, NULL);
    return 0;
}
#endif
=======================================================================

And then read the numbers one at a time using sscanf() on the
fragment, checking the return of sscanf() each time. The problem with
scanf() is that you don't know where it went off. By splitting into
pieces you can easily find out where the trouble spots are and more
quickly diagnose the changes you will have to make.

There are (of course) many other alternatives.
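
As a rough illustration of the approach described above, the sketch below splits one line into whitespace-separated fragments and converts each fragment with sscanf(), checking the return value every time. It rests on assumptions: the helper name parse_numbers() and its parameters are made up for this example, and the standard strtok() stands in for tokenize() so that the block compiles on its own.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original post): parse up to 'max'
 * numbers from one line that was read with fgets().
 * Returns the number of values stored in 'out', or -1 if a fragment is
 * not a valid number; in that case *bad_token receives the 1-based
 * position of the offending fragment. The line buffer is modified. */
int parse_numbers(char *line, double *out, int max, int *bad_token)
{
    static const char delims[] = " \t\r\n\f";
    int count = 0;
    char *tok = strtok(line, delims);   /* stand-in for tokenize() */

    while (tok != NULL && count < max) {
        char trailing;
        /* Exactly one conversion must succeed: 0 means "not a number",
         * 2 means junk follows the number, as in "12x". */
        if (sscanf(tok, "%lf%c", &out[count], &trailing) != 1) {
            if (bad_token != NULL)
                *bad_token = count + 1;
            return -1;
        }
        count++;
        tok = strtok(NULL, delims);
    }
    return count;
}
```

Called once per fgets() line, this reports exactly which fragment of which line failed, which is the information a single long run of fscanf() calls does not give you. Note that it simply stops after 'max' values, so a line that is too long is not reported as an error.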
Nov 28 '07 #2
a wrote:
> After reading section 12 of the comp.lang.c FAQ and googling again, I
> still cannot find any threads about reading a series of numbers. The
> input files, which are loosely structured, look like this:

<snip description of input file structure>

Your description of the file format makes no sense at all. Do you mean
that the format of lines can change within the file?

Why not use a language like Perl which is expressly designed for such
purposes, at least to re-structure the file into a consistent format,
and then have the C program read the sanitised file?

> The unusual but deterministic behaviour is that some rows can be read
> successfully but not others.

This indicates that your format specifiers are working for some rows,
but not for others. Reading data whose format varies on the fly with
the *scanf() family of functions is tricky. At the very least, why
don't you capture a line completely with fgets() and then try to pick
it apart with sscanf()?

You should error check _every_ call to _every_ library function. This
way you can easily narrow down the input failure to a particular line
(provided you use the fgets()/sscanf() method I described). Then we can
say more about your problem.

> In this case, the first matrix is read successfully, but for the
> second one nothing can be read (I think; printf shows all zeros)
> until the "5 96 0 83" (the 7th row of the 2nd matrix), and after that
> the reading goes on and off again (Frustrated >.< )

Such vague information, along with statements like "I think...", is not
(unfortunately) going to be enough to help you. You must provide us with
the compilable source code of a minimal program that still exhibits your
problem, along with a sample of your input file or a clearer
description of its format.

> My code, after trying feof, fgets, sscanf and so on, is now as follows:
>
> while(!feof(file) && i < SIZE * SIZE + SIZE * SIZE) {

feof() and ferror() make sense only _after_ a read operation has failed.
They are used to determine _why_ the read failed, because of
end-of-file or an error.

In this loop control expression you are trying to do too many things at
once and doing them incorrectly too.
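
To make the point about feof() concrete, here is a minimal sketch of a loop that is driven by the return value of fscanf() and consults feof()/ferror() only afterwards, to find out why reading stopped. The helper names read_doubles() and why_stopped() are invented for this illustration and appear nowhere in the original program.

```c
#include <stdio.h>

/* Hypothetical helper: read up to 'max' doubles from 'fp' into 'out'.
 * The loop condition tests the result of the read itself, so no
 * element is counted unless fscanf() really converted it. */
size_t read_doubles(FILE *fp, double *out, size_t max)
{
    size_t n = 0;

    while (n < max && fscanf(fp, "%lf", &out[n]) == 1)
        n++;
    return n;
}

/* Hypothetical helper: call this only after read_doubles() returned
 * fewer values than expected, to learn why the last read failed. */
const char *why_stopped(FILE *fp)
{
    if (ferror(fp))
        return "read error";
    if (feof(fp))
        return "end of file";
    return "input that is not a number";
}
```

A caller would compare the returned count with the expected 2 * SIZE * SIZE and print why_stopped(fp) when they differ.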

> // SIZE, an integer obtained from reading the 1st line
> fscanf(file, "%lf", &r[i]);
> i++;
> }

This is hopeless. Please read in every line with fgets() and try to
convert it with sscanf(). Please check all library calls for failure.
Something like this:

#include <stdio.h>
#include <stdlib.h>
#define MAX_LINE 128

int main(void) {
    char line[MAX_LINE];
    int retval;
    FILE *fp = fopen("input.file", "r");

    if (!fp) return EXIT_FAILURE;

    while (fgets(line, MAX_LINE, fp) != NULL) {
        retval = sscanf(line, "%WHATEVER_FORMAT", &MATRIX_ELEMENT);
        /* If retval does not contain the number of items you expect the
           call to have successfully read and converted then something
           went wrong with your format specifier and the concerned line.
        */
        /* Other processing */
    }
    /* NOW you can check feof() and ferror() to determine why fgets()
       returned NULL
    */
    return STATUS;
}
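
One way the placeholder sscanf() call above might be filled in for a row holding a variable number of values is sketched below. It walks along the line using the %n directive, which stores how many characters sscanf() has consumed so far. The function name scan_row() and the choice to treat leftover text as an error are assumptions made for this example only.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: convert every number on 'line' into 'row'.
 * Returns how many values were stored, or -1 if anything other than
 * white space is left over (a non-numeric fragment, or more than
 * 'max' numbers on the line). */
int scan_row(const char *line, double *row, int max)
{
    const char *p = line;
    int count = 0;
    int used = 0;

    /* %n does not count as a conversion, so success is still 1. */
    while (count < max && sscanf(p, "%lf%n", &row[count], &used) == 1) {
        p += used;
        count++;
    }
    while (isspace((unsigned char)*p))
        p++;
    return (*p == '\0') ? count : -1;
}
```

For the 12 x 12 input in this thread, a caller would expect scan_row() to return 12 for each matrix row and 0 for a blank line, and could print the line number whenever it returns anything else.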

<snip>

Nov 28 '07 #3
a

"santosh" <sa*********@gmail.com> wrote in message
news:fi**********@aioe.org...
> a wrote:
>> After reading section 12 of the comp.lang.c FAQ and googling again, I
>> still cannot find any threads about reading a series of numbers. The
>> input files, which are loosely structured, look like this:
>
> <snip description of input file structure>
>
> Your description of the file format makes no sense at all. Do you mean
> that the format of lines can change within the file?
>
> Why not use a language like Perl which is expressly designed for such
> purposes, at least to re-structure the file into a consistent format,
> and then have the C program read the sanitised file?

I agree that Perl does that elegantly with regular expressions. However,
because the numbers are to be placed into a matrix which will be operated on
by C code, I need to write the whole program in C.
Nov 28 '07 #4
a wrote:
> After reading section 12 of the comp.lang.c FAQ and googling again, I still
> cannot find any threads about reading a series of numbers. The input files,
> which are loosely structured, look like this:
....
> The unusual but deterministic behaviour is that some rows can be read
> successfully but not others. In this case, the first matrix is read
> successfully, but for the second one nothing can be read (I think; printf
> shows all zeros) until the "5 96 0 83" (the 7th row of the 2nd matrix), and
> after that the reading goes on and off again (Frustrated >.< )

I took your example data, removing what I assumed was explanatory text
that's not in your actual data, and stored it in a file. I wrapped your
code fragment in a complete program, which set up everything
appropriately. You used feof() inappropriately, and SIZE violates the
usual conventions for naming what must be a variable in this program,
but I left those things uncorrected, since they shouldn't affect the
results.

// This code was written based upon a message posted by a@a.com on the
// usenet newsgroup comp.lang.c
// Message-ID: <fi***********@justice.itsc.cuhk.edu.hk>
// Date: Wed, 28 Nov 2007 10:48:27 +0800
// The lines from that message are marked with //a. The rest of this program
// was written by James Kuyper to fill in a suitable context.
#include <stdint.h>   // SIZE_MAX
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int SIZE;
    int retval = EXIT_SUCCESS;
    const char filename[] = "test.dat";
    FILE *file = fopen(filename, "r");
    double *r;

    if(file == NULL)
    {
        perror(filename);
        return EXIT_FAILURE;
    }
    if(fscanf(file, "%d ", &SIZE) != 1)
    {
        perror("SIZE");
        retval = EXIT_FAILURE;
    }
    else if(SIZE < 1 || SIZE_MAX/2/SIZE/SIZE < 1)
    {
        fprintf(stderr, "Unacceptable value for SIZE:%d\n", SIZE);
        retval = EXIT_FAILURE;
    }
    else if((r = malloc(2*SIZE*SIZE*sizeof(*r))) == NULL)
    {
        fprintf(stderr, "Insufficient memory");
        retval = EXIT_FAILURE;
    }
    else
    {
        int i = 0;   // must start at 0: it indexes r[] and counts elements

        printf("Reading 2 %dX%d arrays of double.\n", SIZE, SIZE);
        while(!feof(file) && i< SIZE * SIZE + SIZE * SIZE) { //a
            int n =
            fscanf(file, "%lf", &r[i]); //a
            if(n != 1)
            {
                fprintf(stderr, "fscanf() returned %d\n", n);
                break;
            }
            i++; //a
        } //a
        if(ferror(file))
        {
            perror(filename);
            retval = EXIT_FAILURE;
        }
        printf("Elements read:%d\n", i);

        free(r);
    }

    fclose(file);
    return retval;
}

I compiled and ran my version of your program, with the following results:

~/testprog(77) make scan_array
cc -std=c99 -pedantic -Wall -Wpointer-arith -Wcast-align -Wwrite-strings
-Wstrict-prototypes -Wmissing-prototypes -c -o scan_array.o scan_array.c
cc scan_array.o -o scan_array
~/testprog(78) scan_array
Reading 2 12X12 arrays of double.
Elements read:288

Whatever the problem with your actual program is, it comes from
something that's different from what I wrote. Therefore, what you should
do is simplify your code as much as possible, while still demonstrating
the problem. Then post your ENTIRE program, not just a fragment. Post
your actual input file, not one with textual explanations stuck in the
middle, and finally include the exact output from your posted program,
when run using your posted data. Only then will we be able to help you further.
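
The SIZE test in the program above is there to keep the malloc() size computation from overflowing. A slightly more thorough version of that idea is sketched below; unlike the SIZE_MAX/2/SIZE/SIZE test it also accounts for the element size. The function name matrix_bytes() and its parameters are assumptions introduced for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: compute the number of bytes needed for 'count'
 * n-by-n matrices whose elements are 'elem_size' bytes each.
 * Returns 1 and stores the result in *bytes on success; returns 0 if
 * an argument is unusable or the product would not fit in a size_t. */
int matrix_bytes(int n, size_t count, size_t elem_size, size_t *bytes)
{
    size_t total;

    if (n < 1 || count == 0 || elem_size == 0)
        return 0;

    total = (size_t)n;
    if (total > SIZE_MAX / (size_t)n)     /* n * n would overflow */
        return 0;
    total *= (size_t)n;
    if (total > SIZE_MAX / count)         /* ... * count would overflow */
        return 0;
    total *= count;
    if (total > SIZE_MAX / elem_size)     /* ... * elem_size would overflow */
        return 0;
    total *= elem_size;

    *bytes = total;
    return 1;
}
```

With the data in this thread, matrix_bytes(12, 2, sizeof(double), &bytes) succeeds and yields room for 288 doubles, which matches the element count reported above.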
Nov 28 '07 #5
a
Thank you all. After simplifying my code, I debugged it successfully.
Nov 28 '07 #6
