parsing a file..

I need to parse a file which has about 2000 lines, and I'm being told
that reading the file in ascii would be a slower way to do it and so I
need to resort to binary, reading it in large chunks. Can anyone please
explain what this is all about?
Mar 14 '08 #1
broli said:
I need to parse a file which has about 2000 lines, and I'm being told
that reading the file in ascii would be a slower way to do it and so I
need to resort to binary, reading it in large chunks. Can anyone please
explain what this is all about?
Someone's pulling your leg. 2000 lines of text is nothing. Just write the
program so that it's clear, correct, and easy to understand. Then, if and
only if it's too slow (and you should define the "fast enough"/"too slow"
boundary before you start writing the program), it's time to think about
how it might be made faster.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Mar 14 '08 #2
broli said:

<snip>
>But then I was told that "normally we don't read scientific data in
>ascii for accuracy and speed concerns", which made me wonder what was
>so wrong?

The statement!

>I could parse 2000 lines in hardly any time and there was no problem
>with ascii either.
Right. Someone's pulling your leg, or is overly concerned with efficiency
at the expense of development time and clarity. That isn't to say that
efficiency isn't important. But let's just pretend, for the sake of
argument, that you write it /both/ ways, and then you measure. You
discover that the "binary" technique takes 0.025 seconds to process the
2000 data groups, whereas the "text" version takes 0.075 seconds - three
times slower! Surely this is a triumph for binary!

Yeah, right, but who cares? You press ENTER, and then it takes you 0.1
seconds to look up at the screen, and everything's finished, no matter
which one you ran.

Write it clear, simple, and correct. Then worry about speed if and only if
you have to.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Mar 14 '08 #3
In article <4e**********************************@s19g2000prg.googlegroups.com>,
broli <Br*****@gmail.com> wrote:
>I need to parse a file which has about 2000 lines, and I'm being told
>that reading the file in ascii would be a slower way to do it and so I
>need to resort to binary, reading it in large chunks. Can anyone please
>explain what this is all about?
Reading in large chunks is unrelated to whether it's binary or
ascii. Perhaps they meant that character-at-a-time reading with
getchar() is slow, which it is on some systems. You can perfectly
well use fread() on text files.
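
For instance, a minimal sketch (the file name, the buffer size, and the
whitespace-separated-numbers layout are my assumptions, not a confirmed
format):

#include <stdio.h>
#include <stdlib.h>

/* Slurp a text file with one fread(), then convert the numbers with
   strtod(). Sketch only: assumes the whole file fits in the buffer. */
int main(void)
{
    static char buf[65536];
    char *p, *end;
    double d, total = 0.0;
    size_t n;
    FILE *fp = fopen("data.txt", "rb");

    if (fp == NULL) {
        perror("data.txt");
        return EXIT_FAILURE;
    }
    n = fread(buf, 1, sizeof buf - 1, fp);
    fclose(fp);
    buf[n] = '\0';          /* terminate so strtod can walk the buffer */

    p = buf;
    for (;;) {
        d = strtod(p, &end);
        if (end == p)
            break;          /* no more numbers */
        total += d;
        p = end;
    }
    printf("sum = %f\n", total);
    return 0;
}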

-- Richard

--
:wq
Mar 14 '08 #4
Chris Dollin said:
Richard Heathfield wrote:
<snip>
>>
Someone's pulling your leg. 2000 lines of text is nothing. Just write
the program so that it's clear, correct, and easy to understand. Then,
if and only if it's too slow (and you should define the "fast
enough"/"too slow" boundary before you start writing the program), it's
time to think about how it might be made faster.

I agree that speed is unlikely to be a factor -- but accuracy may be.
Possibly, but that comes under correctness, not performance.

<snip>
After all, if they want to read those 2000 lines 1000 times per second
...
...and that is covered by "fast enough/too slow". Again, I would emphasise
that the first priority is to make the program *clear* (because it's
easier to make a clear program correct than to make a correct program
clear). The second priority (and a sine qua non, obviously) is to make the
program *correct*. When and only when it works, it's time to worry about
speed. (This obviously does *not* mean that one should intentionally adopt
gross algorithmic inefficiencies.)

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Mar 14 '08 #5
Richard Heathfield,

There are many modules involved in my software package and this is
just one of them. My software would also involve a huge number of
calculations, searching, memory allocation, etc., but the thing is
that I have to parallelize the software code to run on different
machines anyway. Even if speed is an issue, I doubt that reading a
file in ascii or "binary" would make a huge impact overall.
Mar 14 '08 #6
broli said:

<snip>
>But when I use fgets() then wouldn't I get a string
>of characters (also many tabs, null characters, etc.)?
Yes.
>Wouldn't it be a
>difficult task to convert an array of characters into double-type
>floating-point numbers again?
I don't see that you have any choice. If what you've described is correct,
the numbers are already in text form. Converting is easy enough, though,
using strtod.
>I think using fread will make it very fast
>(considering that it allows you to read as many bytes of data at a
>time as you want) but once again I'm not very adept at file handling,
>as I'm just at the beginning stages.
It's very likely that the input stream is buffered, so it won't actually
make much, if any, difference.
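
For example, the strtod() conversion might look like this (a sketch; the
file name and the three-numbers-per-line layout are assumptions):

#include <stdio.h>
#include <stdlib.h>

/* Read lines with fgets() and convert each one's numbers with strtod(),
   which reports exactly where each conversion stopped. */
int main(void)
{
    FILE *fp = fopen("zeus.dat", "r");
    char line[256];

    if (fp == NULL) {
        perror("zeus.dat");
        return EXIT_FAILURE;
    }
    while (fgets(line, sizeof line, fp) != NULL) {
        char *p = line, *end;
        double v[3];
        int i;

        for (i = 0; i < 3; i++) {
            v[i] = strtod(p, &end);
            if (end == p)
                break;      /* fewer numbers on this line than expected */
            p = end;
        }
        if (i == 3)
            printf("%f %f %f\n", v[0], v[1], v[2]);
    }
    fclose(fp);
    return 0;
}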

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Mar 14 '08 #7
ri*****@cogsci.ed.ac.uk (Richard Tobin) writes:
In article <4e**********************************@s19g2000prg.googlegroups.com>,
broli <Br*****@gmail.com> wrote:
>>I need to parse a file which has about 2000 lines, and I'm being told
>>that reading the file in ascii would be a slower way to do it and so I
>>need to resort to binary, reading it in large chunks. Can anyone please
>>explain what this is all about?

Reading in large chunks is unrelated to whether it's binary or
ascii.
I would question that statement. Reading in binary will be a LOT faster,
if it's the same platform, for reading in the same NUMBER of readings.
Perhaps they meant that character-at-a-time reading with
getchar() is slow, which it is on some systems. You can perfectly
well use fread() on text files.
The text file will be larger. There is a need to parse the ascii text
into the destination formats.

It will be slower in the great majority of cases.
>
-- Richard
Mar 14 '08 #8
Richard wrote:
ri*****@cogsci.ed.ac.uk (Richard Tobin) writes:
>In article <4e**********************************@s19g2000prg.googlegroups.com>,
broli <Br*****@gmail.com> wrote:
>>>I need to parse a file which has about 2000 lines, and I'm being told
>>>that reading the file in ascii would be a slower way to do it and so I
>>>need to resort to binary, reading it in large chunks. Can anyone please
>>>explain what this is all about?

Reading in large chunks is unrelated to whether it's binary or
ascii.

I would question that statement. Reading in binary will be a LOT faster,
if it's the same platform, for reading in the same NUMBER of readings.
> Perhaps they meant that character-at-a-time reading with
getchar() is slow, which it is on some systems. You can perfectly
well use fread() on text files.

The text file will be larger. There is a need to parse the ascii text
into the destination formats.

It will be slower in the great majority of cases.
Quick test, one file, 2000 lines, each line with two floats (1.12345
and 7.890), about 28KB total.

One single big-enough fread:

real 0m0.002s
user 0m0.000s
sys 0m0.001s

Repeat fscanf( ... "%lf %lf" ... ) until EOF:

real 0m0.004s
user 0m0.002s
sys 0m0.002s

Yes, in this test it's twice as slow. The data file is probably
cached (it's been read several other times already as I /cough/
debugged my code). It includes program start-up time (I just did
`time ./a.out` to get the numbers) so the actual reading time will
be less.

Myself I wouldn't count that as "LOTS faster" for binary data,
but doubtless there are applications where it is so counted;
I don't think the OP's case is one of them, and it does look as
though he's reading a text file anyway.
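
The fscanf() side of the test had roughly this shape (a sketch of the
general pattern, not the exact test code):

#include <stdio.h>

/* Read pairs of doubles until fscanf() stops matching two of them. */
int main(void)
{
    FILE *fp = fopen("data.txt", "r");
    double a, b, sum = 0.0;

    if (fp == NULL)
        return 1;
    while (fscanf(fp, "%lf %lf", &a, &b) == 2)
        sum += a + b;
    fclose(fp);
    printf("%f\n", sum);
    return 0;
}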

--
"Creation began." - James Blish, /A Clash of Cymbals/

Hewlett-Packard Limited registered office: Cain Road, Bracknell,
registered no: 690597 England Berks RG12 1HN

Mar 14 '08 #9
In article <fr**********@registered.motzarella.org>,
Richard <de***@gmail.com> wrote:
>Reading in large chunks is unrelated to whether it's binary or
ascii.
>I would question that statement. Reading in binary will be a LOT faster,
>if it's the same platform, for reading in the same NUMBER of readings.
I didn't say whether it's in binary is unrelated to *speed*.

I meant: there are two separate issues; whether you read it in large
chunks, and whether it's binary. You can read each of text or binary
in small or large chunks. Each of these choices will separately affect
the speed.

-- Richard
--
:wq
Mar 14 '08 #10
ri*****@cogsci.ed.ac.uk (Richard Tobin) wrote:
In article <fr**********@registered.motzarella.org>,
Richard <de***@gmail.com> wrote:
Reading in large chunks is unrelated to whether it's binary or
ascii.
I would question that statement. Reading in binary will be a LOT faster,
if it's the same platform, for reading in the same NUMBER of readings.

I didn't say whether it's in binary is unrelated to *speed*.

I meant: there are two separate issues; whether you read it in large
chunks, and whether it's binary. You can read each of text or binary
in small or large chunks. Each of these choices will separately affect
the speed.
Besides, he _has_ a text file. Yes, it's a lot larger than a binary file
would be, and therefore slower to read. But the fact that the _file_ is
text is not the OP's doing. Reading this file as text or as binary won't
make a large difference. _Writing_ it as a binary file would have; but
that's not something the OP can do.

Richard
Mar 14 '08 #11

"Chris Dollin" <ch**********@hp.comwrote in message
news:fr**********@news-pa1.hpl.hp.com...
<snip>
Quick test, one file, 2000 lines, each line with two floats (1.12345
and 7.890), about 28KB total.

One single big-enough fread:

real 0m0.002s
user 0m0.000s
sys 0m0.001s

Repeat fscanf( ... "%lf %lf" ... ) until EOF:

real 0m0.004s
user 0m0.002s
sys 0m0.002s

Yes, in this test it's twice as slow. The data file is probably
cached (it's been read several other times already as I /cough/
My own tests:

(A) 100,000 lines of text, each with 3 doubles (2900000 bytes):

2.1 seconds to read a number at a time, using sscanf() (but I use a wrapper
or two with some extra overhead)

(B) The same data as 300,000 doubles written as binary (2400000 bytes):

0.8 seconds to read a number at a time, using fread() 8 bytes at a time

(C) Same binary data as (B)

0.004 seconds to read as a single block into memory (possibly straight into
the array or whatever data structure is used). Using fread() on 2400000
bytes.

So about 200-500 times faster in binary mode, when done properly.
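
Case (C) was essentially this (a sketch; the file name is invented, and
it assumes the file was written with fwrite() on the same platform, so
the byte representation of double matches):

#include <stdio.h>
#include <stdlib.h>

/* One fread() straight into an array of 300,000 doubles. */
int main(void)
{
    size_t n = 300000, got;
    double *a = malloc(n * sizeof *a);
    FILE *fp = fopen("data.bin", "rb");

    if (a == NULL || fp == NULL)
        return EXIT_FAILURE;
    got = fread(a, sizeof *a, n, fp);
    printf("read %lu of %lu doubles\n",
           (unsigned long)got, (unsigned long)n);
    fclose(fp);
    free(a);
    return 0;
}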

--
Bart

Mar 14 '08 #12
ri*****@cogsci.ed.ac.uk (Richard Tobin) writes:
In article <fr**********@registered.motzarella.org>,
Richard <de***@gmail.com> wrote:
>>Reading in large chunks is unrelated to whether it's binary or
ascii.
>>I would question that statement. Reading in binary will be a LOT faster,
>>if it's the same platform, for reading in the same NUMBER of readings.

I didn't say whether it's in binary is unrelated to *speed*.
I'm not sure that parses :-;
>
I meant: there are two separate issues; whether you read it in large
chunks, and whether it's binary. You can read each of text or binary
in small or large chunks. Each of these choices will separately affect
the speed.
Yes, I agree.
>
-- Richard
Mar 14 '08 #13
Bartc wrote:
) My own tests:
)
) (A) 100,000 lines of text, each with 3 doubles (2900000 bytes):
)
) 2.1 seconds to read a number at a time, using sscanf() (but I use a wrapper
) or two with some extra overhead)
)
) (B) The same data as 300,000 doubles written as binary (2400000 bytes):
)
) 0.8 seconds to read a number at a time, using fread() 8 bytes at a time
)
) (C) Same binary data as (B)
)
) 0.004 seconds to read as a single block into memory (possibly straight into
) the array or whatever datastructure is used). Using fread() on 2400000
) bytes.
)
) So about 200-500 times faster in binary mode, when done properly.

Have you tried reading the text file into memory as a single block
and then using sscanf() to parse it?
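
Something along these lines, say (a rough sketch; the %n conversion
records how many characters each sscanf() call consumed, so the scan
position can advance instead of rescanning from the start):

#include <stdio.h>

int main(void)
{
    const char buf[] = "1.12345 7.890\n2.5 3.5\n";
    const char *p = buf;
    double d;
    int used;

    while (sscanf(p, "%lf%n", &d, &used) == 1) {
        printf("got %f\n", d);
        p += used;
    }
    return 0;
}

(On a really large buffer, strtod() with an advancing pointer does the
same job without the %n bookkeeping.)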
SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
Mar 14 '08 #14
Chris Dollin <ch**********@hp.com> writes:
<snip>

>Myself I wouldn't count that as "LOTS faster" for binary data,
>but doubtless there are applications where it is so counted;
>I don't think the OP's case is one of them, and it does look as
>though he's reading a text file anyway.
Then why not take the static noise out? Make the file a lot bigger and
report back.

But even these results do indicate quite a large % difference...

And we do not know how often this data sample is written or read. It
could be thousands of times an hour, leading to considerable unnecessary
overhead if using ascii over binary.
Mar 14 '08 #15
"Bartc" <bc@freeuk.comwrites:
"Chris Dollin" <ch**********@hp.comwrote in message
news:fr**********@news-pa1.hpl.hp.com...
>Richard wrote:
>>ri*****@cogsci.ed.ac.uk (Richard Tobin) writes:

In article
<4e**********************************@s19g2000p rg.googlegroups.com>,
broli <Br*****@gmail.comwrote:

>I need to parse a file which has about 2000 lines and I'm getting
>told that reading the file in ascii would be a slower way to do it and
>so i need to resort to binary by reading it in large chunks. Can any
>one please explain what is all this about ?

Reading in large chunks is unrelated to whether it's binary or
ascii.

I would question that statement. Reading in binary will be a LOT faster
,if its the same platform. for reading in the same NUMBER of
readings.
>Quick test, one file, 2000 lines, each line with two floats (1.12345
and 7.890), about 28Kb total.

One single big-enough fread:

real 0m0.002s
user 0m0.000s
sys 0m0.001s

Repeat fscanf( ... "%lf %lf" ... ) until EOF:

real 0m0.004s
user 0m0.002s
sys 0m0.002s

Yes, in this test it's twice as slow. The data file is probably
cached (it's been read several other times already as I /cough/

My own tests:

(A) 100,000 lines of text, each with 3 doubles (2900000 bytes):

2.1 seconds to read a number at a time, using sscanf() (but I use a wrapper
or two with some extra overhead)

(B) The same data as 300,000 doubles written as binary (2400000 bytes):

0.8 seconds to read a number at a time, using fread() 8 bytes at a time

(C) Same binary data as (B)

0.004 seconds to read as a single block into memory (possibly straight into
the array or whatever datastructure is used). Using fread() on 2400000
bytes.

So about 200-500 times faster in binary mode, when done properly.
I'm surprised this is even being contested.
Mar 14 '08 #16

"Willem" <wi****@stack.nlwrote in message
news:sl*******************@snail.stack.nl...
Bartc wrote:
) <snip>
) So about 200-500 times faster in binary mode, when done properly.

>Have you tried reading the text file into memory as a single block
>and then using sscanf() to parse it?
No. I would imagine it would add a second or so to the time.

However, I left out the word 'apparently' when quoting the 200+ speed-up
for the binary block. I'm sure the disk cache has a big effect here,
unless my hard drive has a 600MB/sec transfer rate.

--
Bart
Mar 14 '08 #17
In article <fr**********@registered.motzarella.org>,
Richard <de***@gmail.com> wrote:
>I didn't say whether it's in binary is unrelated to *speed*.
>I'm not sure that parses :-;
I didn't say { { whether it's in binary } is unrelated to { *speed* } }.

-- Richard
--
:wq
Mar 14 '08 #18
Richard wrote:
Chris Dollin <ch**********@hp.com> writes:
>Myself I wouldn't count that as "LOTS faster" for binary data,
but doubtless there are applications where it is so counted;
I don't think the OP's case is one of them, and it does look as
though he's reading a text file anyway.

Then why not take the static noise out? Make the file a lot bigger and
report back.
Because 2000 lines was the OP's file size, and for that file size
and context, the difference in timing is unimportant; and because,
life being finite, I'd already spent what time I had available.
But even these results do indicate quite a large % difference...

And we do not know how often this data sample is written or read. It
could be thousands of times an hour, leading to considerable unnecessary
overhead if using ascii over binary.
Yes, and it could be once a day. Or a week. And for all we know -- hey,
if you can invent facts, so can I -- his code will be run on machines
with different floating-point formats, making binary transfer a clear
road to the Pit and text transfer more of a Dragons of Bel'kwinith thing.

--
"Creation began." - James Blish, /A Clash of Cymbals/

Hewlett-Packard Limited registered office: Cain Road, Bracknell,
registered no: 690597 England Berks RG12 1HN

Mar 14 '08 #19
Richard wrote:
"Bartc" <bc@freeuk.comwrites:
>So about 200-500 times faster in binary mode, when done properly.

I'm surprised this is even being contested.
It's not being contested; it's being /quantified/, which is part of
deciding whether it's the right thing to do.

[You can drive along the M4 at 70mph or at 120mph [1]; the latter is
certainly faster.]

[1] And, Just In Case Someone Suspects A Weasel, at a whole bunch of
other speeds as well, including at times 0; I don't /think/ I've
ever had to go negative, though.

--
"It was the dawn of the third age of mankind." /Babylon 5/

Hewlett-Packard Limited registered office: Cain Road, Bracknell,
registered no: 690597 England Berks RG12 1HN

Mar 14 '08 #20
Richard Tobin wrote:
In article <fr**********@aioe.org>,
Mark Bluemel <ma**********@pobox.com> wrote:
>This is fairly clearly a text file, so I can't see why anyone
should consider processing it as binary.

Perhaps the idea is to instead store the data in binary in the file.
As the OP goes on to state "I am using a graphics package
which always produces the .zeus file strictly in the above format", I'm
inclined to doubt that that is an option.
Mar 14 '08 #21
On 14 Mar 2008 at 14:36, Richard wrote:
"Bartc" <bc@freeuk.comwrites:
>My own tests:

<snip>
>So about 200-500 times faster in binary mode, when done properly.

>I'm surprised this is even being contested.
Are you really surprised by *anything* in clc any more?

Leave common sense at the door when you enter clc.

Mar 14 '08 #22
Chris Dollin <ch**********@hp.com> writes:
Richard wrote:
>"Bartc" <bc@freeuk.comwrites:
>>So about 200-500 times faster in binary mode, when done properly.

I'm surprised this is even being contested.

It's not being contested; it's being /quantified/, which is part of
deciding whether whatever is the right thing to do.
See other post. You cannot quantify it without all the criteria.

And faster is faster.
Mar 14 '08 #23
Bartc wrote:
My own tests:

<snip>
So about 200-500 times faster in binary mode, when done properly.
OK.

But given that the OP has the data presented to him in a specific
format, which he has explained and which he has indicated is fixed, as
far as he is aware, this option is not open to him.

We do not know whether there is a need for repeated parsing of this
file - the OP has not told us. If there is, then there _may_ be an
argument for transforming the file into a binary format. Otherwise,
I don't see where binary comes into the problem - the file is a text
file and will need to be read and parsed from the text format.
Mar 14 '08 #24
Richard wrote:

<snip about text vs. binary I/O>
Take the faster way and then the unknowns are a moot point.
The fastest method to do a task isn't always the most appropriate one.
The OP might have reasons for preferring/using text which we do not
know, as he has not specified much detail.

Mar 14 '08 #25

"Bartc" <bc@freeuk.comwrote in message
news:QP*****************@text.news.virginmedia.com ...
><snip>
>So about 200-500 times faster in binary mode, when done properly.
I've done new tests which (i) use sscanf more directly and (ii) allow for
disk caching:

300,000 doubles as text in 100,000 lines read with fgets/sscanf: 1.8 seconds
300,000 doubles as binary, read individually with fread(): 0.8 seconds
300,000 doubles as binary, read as one block with fread: 0.09 seconds

So reading binary directly into a memory array is still nearly 10 times
faster than reading number by number, and twenty times faster than
reading as text.

--
Bart
Mar 14 '08 #26
broli wrote:

<snip>

Why have you posted it again?

You need to allow at least a few hours for responses, and preferably
a couple of days.

Mar 17 '08 #27
Bill Reid wrote:
and
yes, if several people here said "use a 'state machine'", that's actually
a good reason to NOT use a "state machine"...
Each to their own. Here's a quick and hopefully not too dirty state
machine solution anyway:-

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

typedef struct {
double x, y, z;
} Vector;

typedef struct {
Vector V;
} Vertex;

typedef struct {
int v0, v1, v2;
} Triangle;

typedef struct {
int nvert, ntri;
Vertex *Vert;
Triangle *Tri;
} Object;

int main(void)
{

enum { FIRST_LINE,
SECOND_LINE,
THIRD_LINE,
FOURTH_LINE,
VERTICES,
GAP,
TRIANGLES,
LAST_LINE
} state = FIRST_LINE;

Object current_object;
int vertex_count = 0;
int triangle_count = 0;
int gap_line_count = 0;

    FILE *zeus_file = fopen("zeus.dat", "r");
    if (zeus_file == NULL) {
        perror("zeus file open failed");
        exit(EXIT_FAILURE);
    }

while (state != LAST_LINE) {
char dataLine[120]; /* or some more appropriate value */
        if (fgets(dataLine, sizeof dataLine, zeus_file) == NULL) {
perror("zeus file read failed");
exit(EXIT_FAILURE);
}
switch (state) {
case (FIRST_LINE):
/* do nothing */
state = SECOND_LINE;
break;
case (SECOND_LINE):
/* do nothing */
state = THIRD_LINE;
break;
case (THIRD_LINE):
sscanf(dataLine, "%d %*d %d", &current_object.nvert,
&current_object.ntri);
            current_object.Vert =           /* calloc is (count, size) */
                calloc(current_object.nvert, sizeof(Vertex));
            current_object.Tri =
                calloc(current_object.ntri, sizeof(Triangle));
state = FOURTH_LINE;
printf("Getting %d Vertexes and %d Triangles\n",
current_object.nvert, current_object.ntri);
break;
case (FOURTH_LINE):
/* do nothing */
state = VERTICES;
break;
case (VERTICES):
sscanf(dataLine, "%lf %lf %lf",
&(current_object.Vert[vertex_count].V.x),
&(current_object.Vert[vertex_count].V.y),
&(current_object.Vert[vertex_count].V.z));
vertex_count += 1;
if (vertex_count >= current_object.nvert) {
state = GAP;
}
break;
case (GAP):
gap_line_count += 1;
if (gap_line_count >= 2) {
state = TRIANGLES;
}
break;
case (TRIANGLES):
sscanf(dataLine, "%d %d %d",
&(current_object.Tri[triangle_count].v0),
&(current_object.Tri[triangle_count].v1),
&(current_object.Tri[triangle_count].v2));
triangle_count += 1;
if (triangle_count >= current_object.ntri) {
state = LAST_LINE;
}
break;
}
}

printf("VERTEXES :-\n");
for (vertex_count = 0; vertex_count < current_object.nvert;
vertex_count++) {
printf("%d: %f %f %f\n", vertex_count,
current_object.Vert[vertex_count].V.x,
current_object.Vert[vertex_count].V.y,
current_object.Vert[vertex_count].V.z);
}
printf("Triangles :-\n");
for (triangle_count = 0; triangle_count < current_object.ntri;
triangle_count++) {
printf("%d: %d %d %d\n", triangle_count,
current_object.Tri[triangle_count].v0,
current_object.Tri[triangle_count].v1,
current_object.Tri[triangle_count].v2);
}
    free(current_object.Vert);
    free(current_object.Tri);
    fclose(zeus_file);
    return 0;
}
Mar 17 '08 #28
broli said:
So here's my attempt using STATE MACHINE (as many people have
suggested) for reading the .zeus file in its proper format. Please
tell me what the flaws are in this program (please do not point out
obvious things
I have refrained, at your request, from pointing out the obvious problems
with the code.

There are no non-obvious problems that I can see.

If you'd like to know about the obvious problems after all, just say so.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Mar 17 '08 #29
Ben Bacarisse <be********@bsb.me.uk> writes:
broli <Br*****@gmail.com> writes:
>So here's my attempt using STATE MACHINE (as many people have
suggested)

Many people? I think, in this case, it just complicates the program.
Your states are entered in sequence. Parsing the file is just a
sequence of actions, one after the other.
[...]

As one of the people who suggested a state machine, I think you're
right. If the states are purely sequential (state 1 is always
followed by state 2, which is always followed by state 3, etc.), then
an explicit state machine is probably overkill. The state can be
implicitly represented by where you are in the program.

But if you want to use an explicit state machine anyway, *please* give
your states meaningful names.

--
Keith Thompson (The_Other_Keith) <ks***@mib.org>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Mar 17 '08 #30
broli <Br*****@gmail.com> writes:
[...]
label:while(!feof(fp))
[...]

Please read section 12 of the comp.lang.c FAQ, <http://c-faq.com/>,
particularly question 12.2.
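
The gist of that FAQ entry, as a sketch: loop on the input call itself,
not on feof(). feof() only becomes true *after* a read has failed, so
while(!feof(fp)) typically processes the last record twice:

#include <stdio.h>

int main(void)
{
    FILE *fp = fopen("zeus.dat", "r");
    char line[120];

    if (fp == NULL)
        return 1;
    while (fgets(line, sizeof line, fp) != NULL) {
        /* parse the line here */
    }
    fclose(fp);
    return 0;
}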

--
Keith Thompson (The_Other_Keith) <ks***@mib.org>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Mar 17 '08 #31
Keith Thompson wrote:
Ben Bacarisse <be********@bsb.me.uk> writes:
>broli <Br*****@gmail.com> writes:
>>So here's my attempt using STATE MACHINE (as many people have
suggested)
Many people? I think, in this case, it just complicates the program.
Your states are entered in sequence. Parsing the file is just a
sequence of actions, one after the other.
[...]

As one of the people who suggested a state machine, I think you're
right.
Yes, probably.
If the states are purely sequential (state 1 is always
followed by state 2, which is always followed by state 3, etc.), then
an explicit state machine is probably overkill. The state can be
implicitly represented by where you are in the program.
But writing a state machine is fun, generalizable and has the benefit -
if you use names, not opaque numbers - of making things very explicit.
Mar 17 '08 #32
