frustrated by fscanf and sscanf

a
After reading section 12 of the comp.lang.c FAQ and googling again, I still
cannot find a thread about reading a series of numbers. The input file,
which is somewhat structured, is exemplified below:

<presence/absence of n space/tab on the first n lines>
12

<presence/absence of n space/tab here>0<presence/absence of n space/tab here>90 10 23 43 0 0 0 0 0 0 0
90 0 0 0 0 88 0 0 0 0 0 0
10 0 0 0 0 0 26 16 0 0 0 0
23 0 0 0 0 0 0 0 0 0 0 0
43 0 0 0 0 0 0 0 0 0 0 0
0 88 0 0 0 0 0 0 1 0 0 0
0 0 26 0 0 0 0 0 0 0 0 0
0 0 16 0 0 0 0 0 0 96 0 0
0 0 0 0 0 1 0 0 0 0 29 0
0 0 0 0 0 0 0 96 0 0 0 37
0 0 0 0 0 0 0 0 29 0 0 0
0 0 0 0 0 0 0 0 0 37 0 0
<presence/absence of n space/tab/newline here>
0 36 54 26 59 72 9 34 79 17 46 95
36 0 73 35 90 58 30 78 35 44 79 36
54 73 0 21 10 97 58 66 69 61 54 63
26 35 21 0 93 12 46 40 37 48 68 85
59 90 10 93 0 64 5 29 76 16 5 76
72 58 97 12 64 0 96 55 38 54 0 34
9 30 58 46 5 96 0 83 35 11 56 37
34 78 66 40 29 55 83 0 44 12 15 80
79 35 69 37 76 38 35 44 0 64 39 33
17 44 61 48 16 54 11 12 64 0 70 86
46 79 54 68 5 0 56 15 39 70 0 18
95 36 63 85 76 34 37 80 33 86 18 0
<presence/absence of n space/tab/newline here>
The unusual but deterministic behaviour is that some rows are read
successfully and others are not. In this case, the first matrix is read
successfully, but nothing from the second one seems to be read (I think;
printf shows all zeros) until the 5 96 0 83 (the 7th row of the 2nd matrix),
and after that the reading goes on and off again (Frustrated >.< )

My code, after trying feof, fgets, sscanf and so on, is now as follows:

while(!feof(file) && i< SIZE * SIZE + SIZE * SIZE) { //SIZE, an integer obtained from reading the 1st line
    fscanf(file, "%lf", &r[i]);
    i++;
}

I know they are integer matrices, but the double type is needed for
further development.
Nov 28 '07 #1
On Nov 27, 6:48 pm, "a" <a...@a.com> wrote:
<snip>
Suggestion:
The scanf() function is evil, as everyone knows.
I suggest using fgets() to read a line, and then parse the line into
individual numbers with something like this:
=======================================================================
#include <string.h>
#include <limits.h>
#include <stdlib.h>
#include <ctype.h>

/* The default delimiters are chosen as some ordinary white space characters: */
static const char default_delimiters[] = {' ', '\n', '\t', '\r', '\f', 0};

/*
 * The tokenize() function is similar to a reentrant version of strtok().
 * It parses tokens from 'string', where tokens are substrings separated by
 * characters from 'delimiter_list'.
 * To get the first token from 'string', tokenize() is called with 'string'
 * as its first parameter.
 * Remaining tokens from 'string' are obtained by calling tokenize() with
 * NULL for the first parameter.
 * The string of delimiters, identified by 'delimiter_list', can change from
 * call to call.
 * If the string of delimiters is NULL, then the standard list
 * 'default_delimiters' (see above) is used.
 * tokenize() modifies the memory pointed to by 'string', because it writes
 * null characters into the buffer.
 */
char *tokenize(char *string, const char *delimiter_list, char **placeholder)
{
    if (delimiter_list == NULL)
        delimiter_list = default_delimiters;

    if (delimiter_list[0] == 0)
        delimiter_list = default_delimiters;

    if (string == NULL)
        string = *placeholder;

    if (string == NULL)
        return NULL;
    /*
     * The strspn() function computes the length of the initial segment of
     * the first string that consists entirely of characters contained in
     * the second string.
     */
    string += strspn(string, delimiter_list);
    if (!string[0]) {
        *placeholder = string;
        return NULL;
    } else {
        char *token;
        token = string;
        /*
         * The strpbrk() function finds the first occurrence of any character
         * contained in the second string found in the first string.
         */
        string = strpbrk(token, delimiter_list);
        if (string == NULL)
            *placeholder = token + strlen(token);
        else {
            *string++ = 0;
            *placeholder = string;
        }
        return token;
    }
}

#ifdef UNIT_TEST
char test_string0[] = "This is a test. This is only a test. If it were an actual emergency, you would be dead.";
char test_string1[] = "This is a also a test. This is only a test. If it were an actual emergency, you would be dead. 12345";
char test_string2[] = "The quick brown fox jumped over the lazy dog's back 1234567890 times.";
char test_string3[] = " \t\r\n\fThe quick brown fox jumped over the lazy dog's back 1234567890 times.";
char test_string4[] = "This is a test. This is only a test. If it were an actual emergency, you would be dead.";
char test_string5[] = "This is a also a test. This is only a test. If it were an actual emergency, you would be dead. 12345";
char test_string6[] = "The quick brown fox jumped over the lazy dog's back 1234567890 times.";
char test_string7[] = " \t\r\n\fThe quick brown fox jumped over the lazy dog's back 1234567890 times.";

#include <stdio.h>

/* File-scope array: zero-initialized, so the delimiter list built below
   stays null-terminated. */
char whitespace[UCHAR_MAX + 1];

/* This test will create token separators as any whitespace or any
   punctuation marks: */
void init_whitespace(void)
{
    int i;
    int index = 0;
    for (i = 0; i < UCHAR_MAX; i++) {
        if (isspace(i)) {
            whitespace[index++] = (char) i;
        }
        if (ispunct(i)) {
            whitespace[index++] = (char) i;
        }
    }
}

/*
   TNX Gerd.
 */
void spin_test(char *test_string, char *white)
{
    char *p = NULL;
    char *token;
    token = tokenize(test_string, white, &p);
    while (token) {
        puts(token);
        token = tokenize(NULL, white, &p);
    }
}

int main(void)
{
    init_whitespace();
    puts("Whitespace is whitespace+punctuation");
    spin_test(test_string0, whitespace);
    spin_test(test_string1, whitespace);
    spin_test(test_string2, whitespace);
    spin_test(test_string3, whitespace);
    puts("Whitespace is simple whitespace");
    spin_test(test_string4, NULL);
    spin_test(test_string5, NULL);
    spin_test(test_string6, NULL);
    spin_test(test_string7, NULL);
    return 0;
}
#endif
=======================================================================

And then read the numbers one at a time using sscanf() on each
fragment, checking the return of sscanf() each time. The problem with
scanf() is that when a conversion fails you don't know where in the
input it went wrong. By splitting into pieces you can easily find out
where the trouble spots are and more quickly diagnose the changes you
will have to make.
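
As a rough illustration of the "one sscanf() per fragment" idea, here is a minimal sketch. To stay self-contained it splits the line with strspn()/strcspn() rather than the tokenize() function posted above, and the name parse_row and its return convention are made up for this example, not taken from the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: converts the whitespace-separated tokens in 'line'
 * to doubles and stores at most 'max' of them in 'out'.
 * Returns the number of values stored, or -k where k is the 1-based
 * position of the first token that is not entirely a number.
 * The buffer is modified in place (null characters are written into it). */
int parse_row(char *line, double *out, int max)
{
    static const char ws[] = " \t\r\n\f";
    int count = 0;
    char *p;

    assert(line != NULL && out != NULL);
    p = line + strspn(line, ws);              /* skip leading whitespace */
    while (*p != '\0' && count < max) {
        char *end = p + strcspn(p, ws);       /* one past the end of the token */
        char *next = end;
        char extra;

        if (*end != '\0') {                   /* terminate the token in place */
            *end = '\0';
            next = end + 1;
        }
        /* Exactly one conversion must succeed: 0 means "not a number",
         * 2 means trailing junk after the number, as in "12x". */
        if (sscanf(p, "%lf%c", &out[count], &extra) != 1)
            return -(count + 1);
        count++;
        p = next + strspn(next, ws);          /* move to the next token */
    }
    return count;
}
```

Called on a buffer filled by fgets(), a negative return value points at the offending token, which is the information a bare fscanf() loop does not give you.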

There are (of course) many other alternatives.
Nov 28 '07 #2
a wrote:
> After reading FAQ comp.lang.c section 12 and googling again, still
> there is no threads talking about reading a series of numbers. The
> input files, somehow structured, is exemplified below:
<snip description of input file structure>

Your description of the file format makes no sense at all. Do you mean
that the format of lines can change within the file?

Why not use a language like Perl which is expressly designed for such
purposes, at least to re-structure the file into a consistent format,
and then have the C program read the sanitised file?
> The unusual but deterministic behaviour is that some rows can be read
> successfully but not the others.
This indicates that your format specifiers are working for some rows,
but not for others. Reading data whose format varies on the fly with
the *scanf() family of functions is tricky. At the very least, why
don't you capture a line completely with fgets() and then try to pick
it apart with sscanf().

You should error check _every_ call to _every_ library function. This
way you can easily narrow down the input failure to a particular line
(provided you use the fgets()/sscanf() method I described). Then we can
say more about your problem.
> In this case, the first matrix is
> read successfully, but the second one, nothing can be read (I think,
> printf shows all zeros) until the 5 96 0 83 (the 7th row on 2nd
> matrix) but then the reading is then on and off again (Frustrated >.< )
Such vague information, along with statements like "I think...", is not
(unfortunately) going to be enough to help you. You must provide us with
the compilable source code of a minimal program that still exhibits your
problem, along with a sample of your input file or a clearer
description of its format.
> My code, after trying feof, fgets, sscanf and so, now is as follows:
>
> while(!feof(file) && i< SIZE * SIZE + SIZE * SIZE) {
feof() and ferror() make sense only _after_ a read operation has failed.
They are used to determine _why_ the read failed, because of
end-of-file or an error.

In this loop control expression you are trying to do too many things at
once and doing them incorrectly too.
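
To make the point about feof() and ferror() concrete, here is a minimal sketch of a loop driven by the return value of fscanf(), with the end-of-file and error indicators consulted only after reading has stopped. The names read_doubles, why_stopped and the enum are invented for this illustration; they do not come from the original program.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical classification of why a read loop ended. */
enum stop_reason { STOP_COMPLETE, STOP_EOF, STOP_IO_ERROR, STOP_BAD_INPUT };

/* Reads up to 'want' doubles from 'fp' into 'r' and returns how many were
 * actually converted. The loop is controlled by fscanf()'s return value,
 * not by feof(). */
size_t read_doubles(FILE *fp, double *r, size_t want)
{
    size_t i = 0;

    assert(fp != NULL && r != NULL);
    while (i < want && fscanf(fp, "%lf", &r[i]) == 1)
        i++;
    return i;
}

/* Call this only after read_doubles() has returned: feof() and ferror()
 * explain a failure that has already happened. */
enum stop_reason why_stopped(FILE *fp, size_t got, size_t want)
{
    if (got == want)
        return STOP_COMPLETE;
    if (ferror(fp))
        return STOP_IO_ERROR;
    if (feof(fp))
        return STOP_EOF;
    return STOP_BAD_INPUT;      /* a token that "%lf" could not convert */
}
```

With the poster's layout, 'want' would be 2 * SIZE * SIZE, and anything other than STOP_COMPLETE says the file did not contain what the program expected.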
> //SIZE, an integer obtained from reading the 1st line
> fscanf(file, "%lf", &r[i]);
> i++;
> }
This is hopeless. Please read in every line with fgets() and try to
convert it with sscanf(). Please check all library calls for failure.
Something like this:

#include <stdio.h>
#include <stdlib.h>
#define MAX_LINE 128

int main(void) {
    char line[MAX_LINE];
    int retval;
    FILE *fp = fopen("input.file", "r");

    if (!fp) return EXIT_FAILURE;

    while (fgets(line, MAX_LINE, fp) != NULL) {
        retval = sscanf(line, "%WHATEVER_FORMAT", &MATRIX_ELEMENT);
        /* If retval does not contain the number of items you expect the
           call to have successfully read and converted then something
           went wrong with your format specifier and the concerned line.
        */
        /* Other processing */
    }
    /* NOW you can check feof() and ferror() to determine why fgets()
       returned NULL
    */
    return STATUS;
}
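
One possible way to fill in the placeholders in the skeleton above, offered only as a sketch: each line is captured with fgets() and then walked with sscanf() using "%lf%n", where %n records how many characters the conversion consumed. The function name read_matrix_lines and its bad_line parameter are assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_LINE 128   /* same limit as the skeleton; longer lines get split */

/* Hypothetical helper: reads whitespace-separated doubles line by line and
 * stores at most 'max' of them in 'out'. Blank lines are skipped.
 * Returns the number of values stored, or -1 if a token cannot be
 * converted, in which case *bad_line holds the 1-based line number. */
long read_matrix_lines(FILE *fp, double *out, long max, long *bad_line)
{
    char line[MAX_LINE];
    long count = 0;
    long lineno = 0;

    assert(fp != NULL && out != NULL && bad_line != NULL);
    *bad_line = 0;
    while (count < max && fgets(line, sizeof line, fp) != NULL) {
        const char *p = line;
        int used;

        lineno++;
        while (count < max) {
            int rc = sscanf(p, "%lf%n", &out[count], &used);
            if (rc == 1) {          /* one number converted: advance past it */
                count++;
                p += used;
            } else if (rc == 0) {   /* a token that is not a number */
                *bad_line = lineno;
                return -1;
            } else {                /* EOF: only whitespace left on this line */
                break;
            }
        }
    }
    return count;
}
```

Because leading and trailing blanks, tabs and empty lines are all absorbed by "%lf", this copes with the "presence/absence of space/tab/newline" described in the question. A number that straddles the MAX_LINE boundary would be mis-read, so the buffer should be sized generously for real data.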

<snip>

Nov 28 '07 #3
a

"santosh" <sa*********@gmail.com> wrote in message
news:fi**********@aioe.org...
> a wrote:
>> After reading FAQ comp.lang.c section 12 and googling again, still
>> there is no threads talking about reading a series of numbers. The
>> input files, somehow structured, is exemplified below:
>
> <snip description of input file structure>
>
> Your description of the file format makes no sense at all. Do you mean
> that the format of lines can change within the file?
>
> Why not use a language like Perl which is expressly designed for such
> purposes, at least to re-structure the file into a consistent format,
> and then have the C program read the sanitised file?

I agree that Perl does that elegantly with regular expressions. However,
because the numbers are to be placed into a matrix that will be operated on
by C code, I need to write the whole program in C.
Nov 28 '07 #4
a wrote:
> After reading FAQ comp.lang.c section 12 and googling again, still there is
> no threads talking about reading a series of numbers. The input files,
> somehow structured, is exemplified below:
....
> The unusual but deterministic behaviour is that some rows can be read
> successfully but not the others. In this case, the first matrix is read
> successfully, but the second one, nothing can be read (I think, printf shows
> all zeros) until the 5 96 0 83 (the 7th row on 2nd matrix) but then the
> reading is then on and off again (Frustrated >.< )
I took your example data, removing what I assumed was explanatory text
that's not in your actual data, and stored it in a file. I wrapped your
code fragment in a complete program, which set up everything
appropriately. You used feof() inappropriately, and SIZE violates the
usual conventions for naming what must be a variable in this program,
but I left those things uncorrected, since they shouldn't affect the
results.

// This code was written based upon a message posted by a@a.com on the
// usenet newsgroup comp.lang.c
// Message-ID: <fi***********@justice.itsc.cuhk.edu.hk>
// Date: Wed, 28 Nov 2007 10:48:27 +0800
// The lines from that message are marked with //a. The rest of this program
// was written by James Kuyper to fill in a suitable context.
#include <stdio.h>
#include <stdint.h>   // for SIZE_MAX
#include <stdlib.h>

int main(void)
{
    int SIZE;
    int retval = EXIT_SUCCESS;
    const char filename[] = "test.dat";
    FILE *file = fopen(filename, "r");
    double *r;

    if(file == NULL)
    {
        perror(filename);
        return EXIT_FAILURE;
    }
    if(fscanf(file, "%d ", &SIZE) != 1)
    {
        perror("SIZE");
        retval = EXIT_FAILURE;
    }
    else if(SIZE <1 || SIZE_MAX/2/SIZE/SIZE < 1)
    {
        fprintf(stderr, "Unacceptable value for SIZE:%d\n", SIZE);
        retval = EXIT_FAILURE;
    }
    else if((r=malloc(2*SIZE*SIZE*sizeof(*r)))==NULL)
    {
        fprintf(stderr, "Insufficient memory");
        retval = EXIT_FAILURE;
    }
    else
    {
        int i = 0;   // must start at 0: it indexes r[] and counts elements read

        printf("Reading 2 %dX%d arrays of double.\n", SIZE, SIZE);
        while(!feof(file) && i< SIZE * SIZE + SIZE * SIZE) { //a
            int n =
                fscanf(file, "%lf", &r[i]); //a
            if(n != 1)
            {
                fprintf(stderr, "fscanf() returned %d\n", n);
                break;
            }
            i++; //a
        } //a
        if(ferror(file))
        {
            perror(filename);
            retval = EXIT_FAILURE;
        }
        printf("Elements read:%d\n", i);

        free(r);
    }

    fclose(file);
    return retval;
}

I compiled and ran my version of your program, with the following results:

~/testprog(77) make scan_array
cc -std=c99 -pedantic -Wall -Wpointer-arith -Wcast-align -Wwrite-strings
-Wstrict-prototypes -Wmissing-prototypes -c -o scan_array.o scan_array.c
cc scan_array.o -o scan_array
~/testprog(78) scan_array
Reading 2 12X12 arrays of double.
Elements read:288

Whatever the problem with your actual program is, it comes from
something that's different from what I wrote. Therefore, what you should
do is simplify your code as much as possible, while still demonstrating
the problem. Then post your ENTIRE program, not just a fragment. Post
your actual input file, not one with textual explanations stuck in the
middle, and finally include the exact output from your posted program,
when run using your posted data. Only then will we be able to help you further.
Nov 28 '07 #5
a
Thank you all. After simplifying my code, I debugged it successfully.
Nov 28 '07 #6
