frustrated by fscanf and sscanf

a
After reading section 12 of the comp.lang.c FAQ and googling again, I still
cannot find any thread about reading a series of numbers. The input file,
which is loosely structured, looks like this:

<presence/absence of n space/tab on the first n lines>
12

<presence/absence of n space/tab here>0<presence/absence of n space/tab
here>90 10 23 43 0 0 0 0 0 0 0
90 0 0 0 0 88 0 0 0 0 0 0
10 0 0 0 0 0 26 16 0 0 0 0
23 0 0 0 0 0 0 0 0 0 0 0
43 0 0 0 0 0 0 0 0 0 0 0
0 88 0 0 0 0 0 0 1 0 0 0
0 0 26 0 0 0 0 0 0 0 0 0
0 0 16 0 0 0 0 0 0 96 0 0
0 0 0 0 0 1 0 0 0 0 29 0
0 0 0 0 0 0 0 96 0 0 0 37
0 0 0 0 0 0 0 0 29 0 0 0
0 0 0 0 0 0 0 0 0 37 0 0
<presence/absence of n space/tab/newline here>
0 36 54 26 59 72 9 34 79 17 46 95
36 0 73 35 90 58 30 78 35 44 79 36
54 73 0 21 10 97 58 66 69 61 54 63
26 35 21 0 93 12 46 40 37 48 68 85
59 90 10 93 0 64 5 29 76 16 5 76
72 58 97 12 64 0 96 55 38 54 0 34
9 30 58 46 5 96 0 83 35 11 56 37
34 78 66 40 29 55 83 0 44 12 15 80
79 35 69 37 76 38 35 44 0 64 39 33
17 44 61 48 16 54 11 12 64 0 70 86
46 79 54 68 5 0 56 15 39 70 0 18
95 36 63 85 76 34 37 80 33 86 18 0
<presence/absence of n space/tab/newline here>
The unusual but deterministic behaviour is that some rows are read
successfully and others are not. In this case, the first matrix is read
successfully, but nothing is read from the second one (I think; printf shows
all zeros) until the 5 96 0 83 (the 7th row of the 2nd matrix), and after that
the reading works on and off again (Frustrated >.< )

My code, after trying feof, fgets, sscanf and so, now is as follows:

while(!feof(file) && i< SIZE * SIZE + SIZE * SIZE) { //SIZE, an integer obtained from reading the 1st line
    fscanf(file, "%lf", &r[i]);
    i++;
}

I know they are integer matrices but the double data structure is needed for
further development.
Nov 28 '07 #1
On Nov 27, 6:48 pm, "a" <a...@a.com> wrote:
<snip>
Suggestion:
The scanf() function is evil, as everyone knows.
I suggest using fgets() to read a line, and then parse the line into
individual numbers with something like this:
=======================================================================
#include <string.h>
#include <limits.h>
#include <stdlib.h>
#include <ctype.h>

/* The default delimiters are chosen as some ordinary white space
   characters: */
static const char default_delimiters[] = {' ', '\n', '\t', '\r', '\f', 0};

/*
 * The tokenize() function is similar to a reentrant version of strtok().
 * It parses tokens from 'string', where tokens are substrings separated
 * by characters from 'delimiter_list'.
 * To get the first token from 'string', tokenize() is called with
 * 'string' as its first parameter.
 * Remaining tokens from 'string' are obtained by calling tokenize()
 * with NULL for the first parameter.
 * The string of delimiters, identified by 'delimiter_list', can change
 * from call to call.
 * If the string of delimiters is NULL, then the standard list
 * 'default_delimiters' (see above) is used.
 * tokenize() modifies the memory pointed to by 'string', because it
 * writes null characters into the buffer.
 */
char *tokenize(char *string, const char *delimiter_list, char **placeholder)
{
    if (delimiter_list == NULL)
        delimiter_list = default_delimiters;

    if (delimiter_list[0] == 0)
        delimiter_list = default_delimiters;

    if (string == NULL)
        string = *placeholder;

    if (string == NULL)
        return NULL;
    /*
     * The strspn() function computes the length of the initial segment
     * of the first string that consists entirely of characters contained
     * in the second string.
     */
    string += strspn(string, delimiter_list);
    if (!string[0]) {
        *placeholder = string;
        return NULL;
    } else {
        char *token;
        token = string;
        /*
         * The strpbrk() function finds the first occurrence in the first
         * string of any character contained in the second string.
         */
        string = strpbrk(token, delimiter_list);
        if (string == NULL)
            *placeholder = token + strlen(token);
        else {
            *string++ = 0;
            *placeholder = string;
        }
        return token;
    }
}

#ifdef UNIT_TEST
char test_string0[] = "This is a test. This is only a test. If it were an actual emergency, you would be dead.";
char test_string1[] = "This is a also a test. This is only a test. If it were an actual emergency, you would be dead. 12345";
char test_string2[] = "The quick brown fox jumped over the lazy dog's back 1234567890 times.";
char test_string3[] = " \t\r\n\fThe quick brown fox jumped over the lazy dog's back 1234567890 times.";
char test_string4[] = "This is a test. This is only a test. If it were an actual emergency, you would be dead.";
char test_string5[] = "This is a also a test. This is only a test. If it were an actual emergency, you would be dead. 12345";
char test_string6[] = "The quick brown fox jumped over the lazy dog's back 1234567890 times.";
char test_string7[] = " \t\r\n\fThe quick brown fox jumped over the lazy dog's back 1234567890 times.";

#include <stdio.h>

char whitespace[UCHAR_MAX + 1];

/* This test will create token separators as any whitespace or any
   punctuation marks: */
void init_whitespace(void)
{
    int i;
    int index = 0;
    for (i = 0; i <= UCHAR_MAX; i++) {
        if (isspace(i)) {
            whitespace[index++] = (char) i;
        }
        if (ispunct(i)) {
            whitespace[index++] = (char) i;
        }
    }
    /* Terminate the delimiter list explicitly. */
    whitespace[index] = 0;
}

/*
 TNX Gerd.
 */
void spin_test(char *test_string, const char *white)
{
    char *p = NULL;
    char *token;
    token = tokenize(test_string, white, &p);
    while (token) {
        puts(token);
        token = tokenize(NULL, white, &p);
    }
}

int main(void)
{
    init_whitespace();
    puts("Whitespace is whitespace+punctuation");
    spin_test(test_string0, whitespace);
    spin_test(test_string1, whitespace);
    spin_test(test_string2, whitespace);
    spin_test(test_string3, whitespace);
    puts("Whitespace is simple whitespace");
    spin_test(test_string4, NULL);
    spin_test(test_string5, NULL);
    spin_test(test_string6, NULL);
    spin_test(test_string7, NULL);
    return 0;
}
#endif
=======================================================================

And then read the numbers one at a time using sscanf() on the
fragment, checking the return of sscanf() each time. The problem with
scanf() is that you don't know where it went off. By splitting into
pieces you can easily find out where the trouble spots are and more
quickly diagnose the changes you will have to make.
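
As a rough sketch of the "split the line, then convert each piece and check the result" idea, something like the following might do. It uses the standard strtok() rather than a custom tokenizer purely to stay self-contained, and parse_doubles() is a made-up name for illustration, not anything from the original program.

```c
#include <stdio.h>
#include <string.h>

/*
 * Split 'line' on whitespace and convert each token to a double.
 * Returns the number of values stored in 'out' (at most 'max'),
 * or -1 if a token is not a complete number.
 * Note: strtok() modifies 'line' and is not reentrant.
 */
int parse_doubles(char *line, double *out, int max)
{
    static const char delims[] = " \t\r\n\f";
    int count = 0;
    char *token;

    for (token = strtok(line, delims); token != NULL;
         token = strtok(NULL, delims)) {
        double value;
        char extra;

        if (count == max)
            break;              /* no room left in 'out' */

        /* Exactly one conversion means the whole token was a number;
           zero means no number, two means trailing junk as in "12x". */
        if (sscanf(token, "%lf%c", &value, &extra) != 1) {
            fprintf(stderr, "bad token: \"%s\"\n", token);
            return -1;
        }
        out[count++] = value;
    }
    return count;
}
```

Because every token is checked on its own, the error message points at the exact fragment that failed instead of leaving you to guess where the conversion went wrong.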

There are (of course) many other alternatives.
Nov 28 '07 #2
a wrote:
> After reading FAQ comp.lang.c section 12 and googling again, still
> there is no threads talking about reading a series of numbers. The
> input files, somehow structured, is exemplified below:
<snip description of input file structure>

Your description of the file format makes no sense at all. Do you mean
that the format of lines can change within the file?

Why not use a language like Perl which is expressly designed for such
purposes, at least to re-structure the file into a consistent format,
and then have the C program read the sanitised file?
> The unusual but deterministic behaviour is that some rows can be read
> successfully but not the others.
This indicates that your format specifiers are working for some rows,
but not for others. Reading data whose format varies on the fly with
the *scanf() family of functions is tricky. At the very least, why
don't you capture a line completely with fgets() and then try to pick
it apart with sscanf()?

You should error check _every_ call to _every_ library function. This
way you can easily narrow down the input failure to a particular line
(provided you use the fgets()/sscanf() method I described). Then we can
say more about your problem.
> In this case, the first matrix is
> read successfully, but the second one, nothing can be read (I think,
> printf shows all zeros) until the 5 96 0 83 (the 7th row on 2nd
> matrix) but then the reading is then on and off again (Frustrated >.<
> )
Such vague information along with statements like "I think..." is not
(unfortunately) going to be enough to help you. You must provide us the
compilable source code of a minimal program that still exhibits your
problem, along with a sample of your input file or a clearer
description of its format.
> My code, after trying feof, fgets, sscanf and so, now is as follows:
>
> while(!feof(file) && i< SIZE * SIZE + SIZE * SIZE) {
feof() and ferror() make sense only _after_ a read operation has failed.
They are used to determine _why_ the read failed, because of
end-of-file or an error.

In this loop control expression you are trying to do too many things at
once and doing them incorrectly too.
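
To make the point about feof() concrete, here is a minimal sketch of a loop driven by the return value of fscanf(), with feof() and ferror() consulted only afterwards. The function name read_doubles() and the messages are illustrative assumptions, not part of the original poster's program.

```c
#include <stdio.h>

/*
 * Read up to 'max' doubles from 'fp' into 'out'.
 * The loop is controlled by the return value of fscanf(), not by feof().
 * Returns the number of values actually converted.
 */
size_t read_doubles(FILE *fp, double *out, size_t max)
{
    size_t n = 0;

    while (n < max && fscanf(fp, "%lf", &out[n]) == 1)
        n++;

    if (n < max) {
        /* The read stopped early: only now do feof() and ferror()
           say something useful about why. */
        if (ferror(fp))
            fprintf(stderr, "read error after %lu values\n",
                    (unsigned long) n);
        else if (feof(fp))
            fprintf(stderr, "end of file after %lu values\n",
                    (unsigned long) n);
        else
            fprintf(stderr, "non-numeric input after %lu values\n",
                    (unsigned long) n);
    }
    return n;
}
```

A caller would compare the result against the expected 2 * SIZE * SIZE and treat anything smaller as a failed read.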
> //SIZE, an integer obtained from reading the 1st line
> fscanf(file, "%lf", &r[i]);
> i++;
> }
This is hopeless. Please read in every line with fgets() and try to
convert it with sscanf(). Please check all library calls for failure.
Something like this:

#include <stdio.h>
#include <stdlib.h>
#define MAX_LINE 128

int main(void) {
    char line[MAX_LINE];
    int retval;
    FILE *fp = fopen("input.file", "r");

    if (!fp) return EXIT_FAILURE;

    while (fgets(line, MAX_LINE, fp) != NULL) {
        retval = sscanf(line, "%WHATEVER_FORMAT", &MATRIX_ELEMENT);
        /* If retval does not contain the number of items you expect the
           call to have successfully read and converted then something
           went wrong with your format specifier and the concerned line.
         */
        /* Other processing */
    }
    /* NOW you can check feof() and ferror() to determine why fgets()
       returned NULL
     */
    return STATUS;
}
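
For a line holding several numbers, one possible way to fill in the placeholders is to step through the line with the "%n" directive, which records how many characters sscanf() has consumed. This is only a sketch under assumptions: read_row() and the 256-byte MAX_LINE are invented for illustration, and a line longer than the buffer would be split by fgets() and reported as malformed.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LINE 256   /* assumed upper bound on line length */

/*
 * Read one matrix row of 'cols' doubles from 'fp' into 'row',
 * skipping blank lines. Returns 1 on success, 0 on end of file or
 * read error, and -1 if a line is malformed.
 */
int read_row(FILE *fp, double *row, int cols)
{
    char line[MAX_LINE];

    while (fgets(line, sizeof line, fp) != NULL) {
        const char *p = line;
        int count = 0;
        int used;
        double value;

        /* %n stores the number of characters consumed so far, which
           lets us advance through the line one number at a time. */
        while (sscanf(p, "%lf%n", &value, &used) == 1) {
            if (count < cols)
                row[count] = value;
            count++;
            p += used;
        }

        /* Whatever is left must be whitespace only. */
        p += strspn(p, " \t\r\n\f");
        if (*p != '\0') {
            fprintf(stderr, "unexpected text in line: %s", line);
            return -1;
        }
        if (count == 0)
            continue;           /* blank line: try the next one */
        if (count != cols) {
            fprintf(stderr, "expected %d values, got %d\n", cols, count);
            return -1;
        }
        return 1;
    }
    return 0;   /* caller can now use feof()/ferror() to find out why */
}
```

Calling read_row() SIZE times per matrix narrows any failure down to a single line of the input file.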

<snip>

Nov 28 '07 #3
a

"santosh" <sa*********@gmail.comwrote in message
news:fi**********@aioe.org...
>a wrote:
>After reading FAQ comp.lang.c section 12 and googling again, still
> there is no threads talking about reading a series of numbers. The
> input files, somehow structured, is exemplified below:
>
> <snip description of input file structure>
>
> Your description of the file format makes no sense at all. Do you mean
> that the format of lines can change within the file.
>
> Why not use a language like Perl which is expressly designed for such
> purposes, at least to re-structure the file into a consistent format,
> and then have the C program read the sanitised file?

I agree that Perl does that elegantly with regular expressions. However,
because the numbers are to be placed into a matrix which will be operated on
by C code, I need to write the whole program in C.
Nov 28 '07 #4
a wrote:
> After reading FAQ comp.lang.c section 12 and googling again, still there is
> no threads talking about reading a series of numbers. The input files,
> somehow structured, is exemplified below:
> ....
> The unusual but deterministic behaviour is that some rows can be read
> successfully but not the others. In this case, the first matrix is read
> successfully, but the second one, nothing can be read (I think, printf shows
> all zeros) until the 5 96 0 83 (the 7th row on 2nd matrix) but then the
> reading is then on and off again (Frustrated >.< )

I took your example data, removing what I assumed was explanatory text
that's not in your actual data, and stored it in a file. I wrapped your
code fragment in a complete program, which set up everything
appropriately. You used feof() inappropriately, and SIZE violates the
usual conventions for naming what must be a variable in this program,
but I left those things uncorrected, since they shouldn't affect the
results.

// This code was written based upon a message posted by a@a.com on the
// usenet newsgroup comp.lang.c
// Message-ID: <fi***********@justice.itsc.cuhk.edu.hk>
// Date: Wed, 28 Nov 2007 10:48:27 +0800
// The lines from that message are marked with //a. The rest of this program
// was written by James Kuyper to fill in a suitable context.
#include <stdint.h>   /* for SIZE_MAX */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int SIZE;
    int retval = EXIT_SUCCESS;
    const char filename[] = "test.dat";
    FILE *file = fopen(filename, "r");
    double *r;

    if(file == NULL)
    {
        perror(filename);
        return EXIT_FAILURE;
    }
    if(fscanf(file, "%d ", &SIZE) != 1)
    {
        perror("SIZE");
        retval = EXIT_FAILURE;
    }
    else if(SIZE <1 || SIZE_MAX/2/SIZE/SIZE < 1)
    {
        fprintf(stderr, "Unacceptable value for SIZE:%d\n", SIZE);
        retval = EXIT_FAILURE;
    }
    else if((r=malloc(2*SIZE*SIZE*sizeof(*r)))==NULL)
    {
        fprintf(stderr, "Insufficient memory");
        retval = EXIT_FAILURE;
    }
    else
    {
        int i = 0;

        printf("Reading 2 %dX%d arrays of double.\n", SIZE, SIZE);
        while(!feof(file) && i< SIZE * SIZE + SIZE * SIZE) { //a
            int n =
            fscanf(file, "%lf", &r[i]); //a
            if(n != 1)
            {
                fprintf(stderr, "fscanf() returned %d\n", n);
                break;
            }
            i++; //a
        } //a
        if(ferror(file))
        {
            perror(filename);
            retval = EXIT_FAILURE;
        }
        printf("Elements read:%d\n", i);

        free(r);
    }

    fclose(file);
    return retval;
}

I compiled and ran my version of your program, with the following results:

~/testprog(77) make scan_array
cc -std=c99 -pedantic -Wall -Wpointer-arith -Wcast-align -Wwrite-strings
-Wstrict-prototypes -Wmissing-prototypes -c -o scan_array.o scan_array.c
cc scan_array.o -o scan_array
~/testprog(78) scan_array
Reading 2 12X12 arrays of double.
Elements read:288

Whatever the problem with your actual program is, it comes from
something that's different from what I wrote. Therefore, what you should
do is simplify your code as much as possible, while still demonstrating
the problem. Then post your ENTIRE program, not just a fragment. Post
your actual input file, not one with textual explanations stuck in the
middle, and finally include the exact output from your posted program,
when run using your posted data. Only then will we be able to help you further.
Nov 28 '07 #5
a
Thank you all. After simplifying my code, I debugged it successfully.
Nov 28 '07 #6
