fscanf problem

Hi,

I hope I am missing something simple, but.. here is my problem:

I need my program to check the last column of a file, as in :

a b c d target ref
0 0 0 0 1 a
1 0 0 0 1.5 b
2 0 0 0 2 c
0 0 6 0 2 g
0 0 0 4 1.5 h
0 0 0 8 2 i
3 0 0 0 1 j
1 0 0 0 1.5 k

to do this, I read up to column 5, then apply the following

    if (fscanf(ifp, "%f", &fref) == 1) {
        last_column_is_number = 1;
    } else if (fscanf(ifp, "%s", ref) != 1) {
        error = 1;
    }

which.. works fine in most problems I've tried.
It is just by chance that I built the above example file and realised
that, when it comes to 'i' or 'n', fscanf simply skips to the next
point, reading, in the example above, '3'.

I've moved 'i' around, the error remains, 'ii' is also ignored, but
anything larger works fine.

Confused !
Any help greatly appreciated,

Thomas.
Nov 14 '05 #1
Perhaps the problem has something to do with how
you "read up to column 5?" Please trim your code a
little less severely, and post a short, complete, and
compilable demonstration of the problem.

--
Er*********@sun.com

Nov 14 '05 #2
In article <ca**********@pegasus.csx.cam.ac.uk>
Thomas Sourmail <ts***@hermes.cam.ac.uk> writes:
... I read up to column 5, then apply the following

if(fscanf(ifp,"%f",&fref)==1) {
last_column_is_number=1;
} else if (fscanf(ifp,"%s",ref)!=1 ) {
error=1;
}

which.. works fine in most problems I've tried.
It is just by chance that I built the above example file and realised
that, when it comes to 'i' or 'n', fscanf simply skips to the next
point, reading, in the example above, '3'.


I will note here that 'i' is the first (case-insensitive) letter
of "Inf", and 'n' is the first letter of "NaN". I suspect this is
significant.

It is not completely clear to me whether you mean "the first scanf
call fails, returning 0, so that the second scanf call succeeds --
returning 1 -- but stores "3" in the ref[] array". If so, this
may be working the way ANSI/ISO C dictates. (I have never been quite
happy with the ISO rules for the scanf engine, and I know my stdio
does not behave according to the Standard -- input like "1.23e+whoops"
scans as 1.23, leaving "e+whoops" in the stream, while the Standard
says that at least the 'e' and '+' are eaten, and perhaps the 'w'.
I am not sure what is supposed to happen to the 1.23.)

The best approach is almost certainly the same one that is so often
best when dealing with either data files or interaction with users:
read complete lines, one at a time, and *then* pick them apart in
whatever way you like, possibly including sscanf(). Here the %n
directive may come in handy.
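
A minimal sketch of the line-at-a-time idea, assuming the line has already been read into a buffer with fgets(). The helper name parse_leading_floats and its parameters are hypothetical, not something from the original program. It uses %n to learn how far each sscanf() call got, so the next call can start from there:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: parse up to `max` leading numeric fields from one
   line. Returns how many were parsed and sets *rest to the first position
   that could not be read as a number. */
static int parse_leading_floats(const char *line, float *out, int max,
                                const char **rest)
{
    int count = 0;
    int used = 0;

    assert(line != NULL && out != NULL && rest != NULL);

    /* %n stores the number of characters consumed so far by this call;
       it does not count toward sscanf's return value. */
    while (count < max && sscanf(line, "%f%n", &out[count], &used) == 1) {
        line += used;   /* move along the line by hand */
        count++;
    }
    *rest = line;       /* a failed %f leaves the string itself untouched */
    return count;
}
```

Called on the line "0 0 0 8 2 i\n", this should report five numbers and leave rest pointing at " i", so a later %s conversion on rest still sees the 'i'. Note that tokens such as "NaN" or "Inf" are still accepted by %f as numbers.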
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 14 '05 #3
Eric Sosman wrote:
Perhaps the problem has something to do with how
you "read up to column 5?" Please trim your code a
little less severely, and post a short, complete, and
compilable demonstration of the problem.


Here is a compilable demo of the problem, with the input file below

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    FILE *ifp;
    int i, j;
    float dummy;
    char ref[10], filename[10] = "test.csv", mode[10] = "r";

    ifp = fopen(filename, mode);
    if (ifp == NULL) {
        perror(filename);
        return EXIT_FAILURE;
    }
    for (i = 0; i < 26; ++i) {
        /* read the five numeric columns */
        for (j = 0; j < 5; ++j) {
            fscanf(ifp, "%f", &dummy);
            printf("%1.1f ", dummy);
        }
        /* try the last column as a number first, then as a string */
        if (fscanf(ifp, "%f", &dummy) == 0) {
            printf(" Failed reading as number ");
            printf(" dummy is now %f ", dummy);
            fscanf(ifp, "%9s", ref);  /* width keeps ref[10] from overflowing */
            printf("%s\n", ref);
        } else {
            printf("Read as number !\n");
            printf("%f\n", dummy);
        }
    }
    fclose(ifp);
    return 0;
}

the file test.csv:

0 0 0 0 1 a
1 0 0 0 1.5 b
0 0 0 8 2 i
0 0 0 0 1 j
0 2 0 0 1.5 m
0 4 0 0 2 nn
0 0 3 0 1.5 o
0 0 0 4 1.5 NaN
0 0 0 8 2 r
0 0 0 0 1 Inf
1 0 0 0 1.5 t
and the output on my machine:

0.0 0.0 0.0 0.0 1.0 Failed reading as number dummy is now 1.000000 a
1.0 0.0 0.0 0.0 1.5 Failed reading as number dummy is now 1.500000 b
0.0 0.0 0.0 8.0 2.0 Failed reading as number dummy is now 2.000000 0
0.0 0.0 0.0 1.0 1.0 Failed reading as number dummy is now 1.000000 j
0.0 2.0 0.0 0.0 1.5 Failed reading as number dummy is now 1.500000 m
0.0 4.0 0.0 0.0 2.0 Failed reading as number dummy is now 2.000000 0
0.0 3.0 0.0 1.5 1.5 Failed reading as number dummy is now 1.500000 o
0.0 0.0 0.0 4.0 1.5 Read as number !
nan
0.0 0.0 0.0 8.0 2.0 Failed reading as number dummy is now 2.000000 r
0.0 0.0 0.0 0.0 1.0 Read as number !
inf
1.0 0.0 0.0 0.0 1.5 Failed reading as number dummy is now 1.500000 t

When I run this, there is no problem with NaN or Inf (fscanf converts ok
to float), but on 'i' and 'nn', the problem remains.
It seems that, instead of leaving the character in the input stream as
described in http://www.eskimo.com/~scs/C-faq/q12.19.html
fscanf jumps to the next one, but only on these particular characters
(i, ii, n, nn).
Interestingly, with 'iii' or 'nnn', fscanf reads 'i' and 'n' resp. and
similarly if you increase the number of 'i's and 'n's..

I am sure there are workarounds, but I am quite curious about what is
happening..

Just in case, I'm on RH9, gcc-3.2.2-5, glibc-2.3.2-27.9.7

Thomas.
Nov 14 '05 #4


Thomas Sourmail wrote:
Here is a compilable demo of the problem, with the input file below

....snip...

Your loops do not appear correct, or they may be getting
things out of sync. I would consider removing the loop and
using the assignment-suppression and scanset features of fscanf.
This might make problems easier to troubleshoot.

Example:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *ifp;
    int i, linenr;
    float dummy;
    char ref[10], filename[10] = "test.csv", mode[10] = "r";

    if ((ifp = fopen(filename, mode)) == NULL)
    {
        perror("Failed to open file test.csv");
        exit(EXIT_FAILURE);
    }
    /* skip four numbers, read the fifth, then read ref (at most 9 chars) */
    for (linenr = 1, *ref = '\0';
         (i = fscanf(ifp, "%*f%*f%*f%*f%*[^1234567890\r\n]"
                          "%f%*[' ']%9[^ \r\n]", &dummy, ref)) != EOF;
         linenr++, *ref = '\0')
    {
        if (i == 2)
            printf("line #: %d dummy = %.2f ref = \"%s\"\n",
                   linenr, dummy, ref);
        else if (i == 1 && *ref == '\0')
        {
            printf("line #: %d dummy = %.2f there is no ref\n"
                   "File Format failure: Exiting...\n",
                   linenr, dummy);
            break;
        }
        else
        {
            printf("line #: %d No data read\n"
                   "File Format failure. Exiting...\n", linenr);
            break;
        }
    }
    fclose(ifp);
    return 0;
}
--
Al Bowers
Tampa, Fl USA
mailto: xa******@myrapidsys.com (remove the x to send email)
http://www.geocities.com/abowers822/

Nov 14 '05 #5
Thomas Sourmail wrote:

I am sure there are workarounds, but I am quite curious about what is
happening..


I think Chris Torek's answer is the right one. When the
first non-white character encountered by "%f" is an 'i' or an
'n', it could be the beginning of "inf" or "nan". So fscanf()
reads the next character to try to match the remainder of the
"inf" or "nan", and if the next character is a newline the
match fails. However, the initial 'i' or 'n' has already
been read and accepted; here's what 7.19.6.2/9 has to say:

[...] An input item is defined as the longest sequence
of input characters [...] which is, or is a prefix of,
a matching input sequence. [...]

'i' and 'n' are prefixes of "inf" and "nan", so they are matched
and consumed by "%f". When the '\n' comes along the match fails,
but only the '\n' remains unconsumed: fscanf() can only push
back one character, and can't "rewind" the input to an arbitrary
position.

That explains what happens with "i\n" and "n\n", but it
doesn't explain the behavior on the "nn\n" line. I'd expect
the "%f" to consume the first 'n' as a prefix of "nan", then
choke on the second 'n' and push it back as a non-matching
character. Then your second attempt with "%s" should have
found the second 'n' again, followed by a newline, and should
have stored the one-character string "n" in `ref'. But it
looks like the second 'n' didn't get pushed back after the
matching failure, which may mean there's a bug in the fscanf()
implementation. (Or, of course, it may mean I've misread
what's supposed to happen; the possible forms of "nan" seem
to be pretty close to infinite ...)

For what it's worth, I tried your program on another
implementation and found what I think is a different incorrect
behavior: Both the "i" and the "nn" were read as strings by
the "%s" conversion. Thus, at least one of the implementations
is wrong -- and according to my (non-authoritative) reading of
the Standard, both are wrong!

So, what to do about your problem? Again, I think Chris'
suggestion is best: Don't use fscanf() to read lines of input.
Instead, use fgets() to read a line at a time and then use
other means -- possibly including sscanf() -- to pick them
apart. fscanf() doesn't always stop at a newline when you'd
want it to, but sscanf() absolutely *will* stop at a '\0',
and you won't "lose synchronization" with the input file.
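
A minimal sketch of that approach, assuming each line has been read with fgets() and follows the layout of test.csv (five numeric columns, then one more column). The names read_last_column and last_kind are hypothetical. Because the second and third sscanf() calls start from the same position in the buffer, a failed "%f" cannot eat any characters:

```c
#include <assert.h>
#include <stdio.h>

enum last_kind { LAST_ERROR, LAST_NUMBER, LAST_TEXT };

/* Hypothetical helper: `line` holds five numeric columns plus a sixth
   column that is either a number (stored in *fref) or a short word
   (stored in ref, at most 9 characters). */
static enum last_kind read_last_column(const char *line, float *fref,
                                       char ref[10])
{
    int used = -1;

    assert(line != NULL && fref != NULL && ref != NULL);

    /* Skip five numbers. Suppressed conversions are not counted in the
       return value, so success is detected by %n having been reached. */
    sscanf(line, "%*f%*f%*f%*f%*f%n", &used);
    if (used < 0)
        return LAST_ERROR;

    /* Try the last column as a number first. */
    if (sscanf(line + used, "%f", fref) == 1)
        return LAST_NUMBER;

    /* The buffer is unchanged, so retry from the same spot as a string. */
    if (sscanf(line + used, "%9s", ref) == 1)
        return LAST_TEXT;

    return LAST_ERROR;
}
```

With the line "0 0 0 8 2 i\n" this should return LAST_TEXT with ref holding "i". One caveat: anything that begins with a valid number (including "NaN" and "Inf", as in the output above) is still reported as LAST_NUMBER.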

--
Er*********@sun.com

Nov 14 '05 #6
Yes, that seems to be the right explanation. Strangely, the first
fscanf(ifp,"%f",&dummy) always seems to consume one more character than
should be necessary to distinguish whether the entry is inf or i**
something else.

For example, with 'inter' at the end, the second fscanf(ifp,"%s",ref)
returns 'er', but
'itter' -> 'ter'
'natto' -> 'to'
'nttto' -> 'tto'

Anyway, thanks a lot for all your help. I meant to use this method
because, in my real problem, the number of columns is not fixed, and
sscanf does not 'move along the line' if I simply repeat it as I've done
above. I guess I have to try vsscanf..

Thomas.
Nov 14 '05 #7
In article <WYmAc.403$b64.363@newsfe1-win>
Thomas Sourmail <ts***@hermes.cam.ac.uk> writes:
Anyway, thanks a lot for all your help, I meant to use this method
because, in my real problem, the number of columns is not fixed, and
sscanf does not 'move along the line' if I simply repeat it as I've done
above ...


This is why I suggested that the "%n" conversion might also be
helpful.

Another method, of course, is to use strtod() and other "lower-level"
functions to take apart input lines.
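
A minimal sketch of the strtod() route for a line with an unknown number of numeric columns, again assuming the line is already in a buffer. The helper name split_numeric_prefix is hypothetical. strtod() reports where it stopped through its end pointer, which gives the "move along the line" behaviour directly:

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: store the leading whitespace-separated numbers of
   `line` in out[] (at most `max`), stopping at the first token that is
   not entirely a number. *rest is set to the start of that token, or to
   the terminating '\0' if the whole line was numeric. */
static size_t split_numeric_prefix(const char *line, double *out,
                                   size_t max, const char **rest)
{
    size_t n = 0;

    assert(line != NULL && out != NULL && rest != NULL);

    while (n < max) {
        char *end;
        double v = strtod(line, &end);

        if (end == line)
            break;      /* no conversion, e.g. "i", "nn" or end of line */
        if (*end != '\0' && !isspace((unsigned char)*end))
            break;      /* token only starts like a number, e.g. "1.5x" */
        out[n++] = v;
        line = end;     /* continue right after the number just read */
    }
    while (isspace((unsigned char)*line))
        line++;
    *rest = line;
    return n;
}
```

On "0 0 0 8 2 i\n" this should return 5 with rest pointing at "i\n". Like "%f", strtod() accepts "inf" and "nan" spellings, so a reference column containing those words would need an extra check.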
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 14 '05 #8
