problem reading/writing structures from and to files

arne.muller

Hello,

I've come across some problems reading structures from binary files.
Basically I've some structures

typedef struct {
    int i;
    double x;
    int n;
    double *mz;
    short *intens;
} Data;

I've an array of these structures and their mz and intens pointers
point to arrays with n elements each.

My program can write the Data array into a binary file: after writing
the structure itself (using fwrite), it fwrites (appends) the mz and
then the intens arrays. I don't need to exchange this data file between
machines, but another program (on the same machine, compiled with the
same compiler) needs to read this file frequently.

My idea is to read this file in one go into memory using fread (I could
even use mmap, since this file is accessed by several processes read
only), and then to reconstruct the mz and intens pointers properly. An
iterator would then fetch the next Data element (returning a pointer to
it). Well, this is where I got stuck: how do I best access this chunk of
memory to reconstruct the Data structure? Something like this?

....
/* the pointer returned by mmap */
void *mem;

/* the file size in bytes */
size_t filesize;
....

Data *nextEntry(void) {

    static size_t bytes_read = 0;
    Data *d = NULL;

    /* is there still something to read? */
    if ( bytes_read < filesize ) {
        d = (Data*)&(((char*)mem)[bytes_read]);
        bytes_read += sizeof(Data); /* jump to mz part */
        d->mz = (double*)&(((char*)mem)[bytes_read]); /* re-assign mz */
        bytes_read += sizeof(double) * d->n; /* jump to intens part */
        d->intens = (short*)&(((char*)mem)[bytes_read]); /* re-assign */
        bytes_read += sizeof(short) * d->n; /* jump to next Data field */
    }

    return d;
}

This is not the actual implementation I use, but it's the same
principle (simplified). I'm not feeling good using all these casts ...
The code needs to be portable (but not the binary file itself, it
will always stay on the same machine). This works under Linux, but
gives me 'Unaligned access' (Tru64) or a Bus Error (SunOS). So I guess
this code is not really portable ... :-(

Any hints on how to get this working?

thanks a lot for your help,

Arne

Oct 6 '06 #1


Ancient_Hacker

arne.mul...@gmail.com wrote:

>Hello,
>
>I've come across some problems reading structures from binary files.
>Basically I've some structures
>
>typedef struct {
>    int i;
>    double x;
>    int n;
>    double *mz;
>    short *intens;
>} Data;

Ay, ay, ay, caramba! I doubt if the C standard lets you read or
write pointers-- they're supposed to be valid only in the time and
space of one invocation of one program.

You can fix this in two ways:

(1) Change the pointers to be integer indexes into a big array of
doubles and another of shorts.
Write your own Doublemalloc() which just returns the next free index in
the array.

(2) Keep your blasted pointers, but when it comes time to write out
the file, copy the pointed-to values to an array (same as (1)) and
replace the pointers with the array indexes.
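
A minimal sketch of option (1), not code from the thread. The names DataRec, mz_pool, intens_pool, pool_alloc, rec_mz and rec_intens are made up for illustration, and the fixed pool capacity is an assumption to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pointer-free variant of Data: the two offsets are indexes
   into shared pools, so the struct means the same thing in every process. */
typedef struct {
    int i;
    double x;
    int n;
    size_t mz_off;      /* index of the first m/z value in mz_pool */
    size_t intens_off;  /* index of the first intensity in intens_pool */
} DataRec;

enum { POOL_MAX = 1024 };            /* assumed fixed capacity, for brevity */
static double mz_pool[POOL_MAX];
static short intens_pool[POOL_MAX];
static size_t pool_used = 0;

/* "Doublemalloc" in spirit: reserve n slots in both pools and return the
   index of the first one, or (size_t)-1 if the pools are full. */
size_t pool_alloc(size_t n)
{
    size_t start = pool_used;

    if (n > POOL_MAX - pool_used)
        return (size_t)-1;
    pool_used += n;
    return start;
}

/* Turn a stored index back into a pointer that is valid in this process. */
double *rec_mz(const DataRec *r)
{
    return &mz_pool[r->mz_off];
}

short *rec_intens(const DataRec *r)
{
    return &intens_pool[r->intens_off];
}
```

With this layout the record array and the two pools can each be written with one fwrite and read back with one fread; nothing has to be patched after loading, because no address is ever stored in the file.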

Oct 6 '06 #2

arne.muller

Ancient_Hacker wrote:

>arne.mul...@gmail.com wrote:
>>Hello,
>>
>>I've come across some problems reading structures from binary files.
>>Basically I've some structures
>>
>>typedef struct {
>>    int i;
>>    double x;
>>    int n;
>>    double *mz;
>>    short *intens;
>>} Data;
>
>Ay, ay, ay, caramba! I doubt if the C standard lets you read or
>write pointers-- they're supposed to be valid only in the time and
>space of one invocation of one program.

Yes, but fwrite just takes a void pointer to an 'object' and writes it; it
shouldn't look inside the structure. The pointers stored in the file
are meaningless, and that's why I try to reconstruct the correct
addresses afterwards.

>
>You can fix this in two ways:
>
>(1) Change the pointers to be integer indexes into a big array of
>doubles and another of shorts.
>Write your own Doublemalloc() which just returns the next free index in
>the array.
>
>(2) Keep your blasted pointers, but when it comes time to write out
>the file, copy the pointed-to values to an array (same as (1)) and
>replace the pointers with the array indexes.

I think I'll try that one!

thanks,

Arne

Oct 6 '06 #3

Snis Pilbor

ar*********@gmail.com wrote:

>Hello,
>
>I've come across some problems reading structures from binary files.
>Basically I've some structures

(snip very complicated code to write the raw data in a structure verbatim into a file)

Writing structures bit-for-bit into a file is always a terrible idea.
It will make the file extremely unportable. Unportable between
systems, even unportable between compilations. Certainly any time you
alter the structure itself and recompile, all earlier files will become
garbage. And if you're writing pointers verbatim as binary data, then
the file may not even work right between different runs of the exact
same executable, making your file next to worthless.

It's more work, but it's virtually always worthwhile to instead write
individual elements of the structure to the file in a carefully
formatted way. Now, if your structure contains pointers, chances are
those pointers point to other types of structures which, assuming they
too are preserved in files or at least hardcoded into your program, you
should be able to look up by text name or identifier number or
something like that. For instance if your thingies contain linked
lists of widgets, then load the widget file first, including a name or
id number for each widget, THEN load your thingy file. Finally, in
cases where structures point to other structures from the same file, or
where different structures point to each other so neither is truly
"better to load first", you should do some acrobatics: temporarily
save just the names or id numbers while loading both files, and only
after both are fully loaded work out what the pointers should point to.
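
The "load ids first, resolve pointers afterwards" idea can be sketched roughly as follows. Widget, Thingy, find_widget and resolve_widgets are hypothetical names chosen to match the wording above; the file-loading step itself is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types: a Thingy refers to a Widget. On disk only the
   widget's id is stored; the pointer is filled in after loading. */
typedef struct {
    int id;
    double weight;
} Widget;

typedef struct {
    int widget_id;   /* what the file contains */
    Widget *widget;  /* resolved in memory, never written out */
} Thingy;

/* Linear search for a widget by id; returns NULL if it is absent. */
Widget *find_widget(Widget *ws, size_t nw, int id)
{
    size_t k;

    for (k = 0; k < nw; k++)
        if (ws[k].id == id)
            return &ws[k];
    return NULL;
}

/* Second pass: once both tables are loaded, turn ids into pointers.
   Returns the number of thingies whose id could not be resolved. */
size_t resolve_widgets(Thingy *ts, size_t nt, Widget *ws, size_t nw)
{
    size_t k, missing = 0;

    for (k = 0; k < nt; k++) {
        ts[k].widget = find_widget(ws, nw, ts[k].widget_id);
        if (ts[k].widget == NULL)
            missing++;
    }
    return missing;
}
```

A non-zero return value is a cheap way to notice a data file that refers to something which was never loaded.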

Oct 6 '06 #4

Barry Schwarz

On 6 Oct 2006 09:47:44 -0700, ar*********@gmail.com wrote:

>Hello,

>I've come across some problems reading structures from binary files.
>Basically I've some structures
>
>typedef struct {
>    int i;
>    double x;
>    int n;
>    double *mz;
>    short *intens;
>} Data;
>
>I've an array of these structures and their mz and intens pointers
>point to arrays with n elements each.
>
>My program can write the Data array into a binary file, after writing
>the structure itself (using fwrite) it fwrites (appends) the mz and
>then the intens arrays. I don't need to exchange this data file between
>machines, but another program (on the same machine, compiled with the
>same compiler) needs to read this file frequently.
>
>My idea is to read this file in one go into memory using fread (I could
>even use mmap, since this file is accessed by several processes read

mmap is not a standard function so I have no idea what it does for or
to you.

>only), and then to reconstruct the mz and intens pointers properly. An
>iterator would then fetch the next Data element (returning a pointer to

C doesn't have iterators. What does next Data element mean?

>it). Well, this is were I got stuck, how do I best access this chunk of
>memory to reconstruct the Data structure. Something like this?

If your description of the file contents is correct, you only have one
set of data in the file. While it is possible to read the entire set
of data in a single fread, be aware that this could lead to some
alignment issues if sizeof(Data) is not a multiple of sizeof(double)
or sizeof(double) is not a multiple of sizeof(short). I think they
will be but I'm not certain and I don't think the standard guarantees
the second. If Data did not contain a member of type double, I would
not depend on it at all.

If all the sizeof's are proper multiples, a single fread into a large
dynamically allocated buffer would tell you how many bytes were read
and ensure that the data was properly aligned. The "array" of doubles
would start sizeof(Data) bytes into the buffer, basically immediately
following the struct. You would set mz to this address. Something in
Data must tell you how many doubles there are (N). The remaining bytes
(number read - sizeof(Data) - N*sizeof(double)) are the shorts. The
shorts start N*sizeof(double) bytes beyond the address in mz and you
would store this address in intens.
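
The single-buffer layout described above can be sketched roughly like this. It assumes exactly one record per buffer, a buffer obtained from malloc, and a file produced by the original poster's scheme (whole struct, then the doubles, then the shorts). record_size and fixup_record are hypothetical helper names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int i;
    double x;
    int n;
    double *mz;
    short *intens;
} Data;

/* Bytes one record occupies: the struct, then n doubles, then n shorts. */
size_t record_size(int n)
{
    return sizeof(Data) + (size_t)n * (sizeof(double) + sizeof(short));
}

/* buf must come from malloc (so it is aligned for any type) and hold one
   record of len bytes. Sets mz and intens to point into the buffer.
   Returns NULL if the record does not fit or the offsets are misaligned. */
Data *fixup_record(void *buf, size_t len)
{
    Data *d = buf;
    char *base = buf;
    size_t mz_at = sizeof(Data);
    size_t intens_at;

    if (len < sizeof(Data) || d->n < 0)
        return NULL;
    if ((size_t)d->n > (len - sizeof(Data)) / (sizeof(double) + sizeof(short)))
        return NULL;                 /* n claims more data than was read */

    intens_at = mz_at + (size_t)d->n * sizeof(double);
    if (mz_at % sizeof(double) != 0 || intens_at % sizeof(short) != 0)
        return NULL;                 /* the sizeofs do not cooperate */

    d->mz = (double *)(base + mz_at);
    d->intens = (short *)(base + intens_at);
    return d;
}
```

Note that this only covers a single record. With several records packed back to back, a record following an odd number of shorts would typically start at an address that is not suitably aligned for Data, which is consistent with the unaligned-access faults reported in the original post.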

If the sizeof's do not cooperate, use fread to read the struct into a
properly aligned buffer (obviously sizeof(Data) bytes). Something in
the struct must tell you how many doubles (N) follow. Use fread to
read them into another properly aligned buffer (obviously
N*sizeof(double) bytes) and set mz to the address of this buffer. You
can then read the remaining data into a third properly aligned buffer
and set intens to its address. The number of bytes read divided by
sizeof(short) gives you the number of objects read.

snip mmap code
Remove del for email

Oct 7 '06 #5

Chris Torek

In article <11**********************@i42g2000cwa.googlegroups.com>
<ar*********@gmail.com> wrote:

>I've come across some problems reading structures from binary files.

As others have cautioned, it is often wise to use something other
than "raw binary" format for data files. Problems that were
guaranteed to run on a single machine seem often to expand as
if by magic and suddenly require a heterogeneous network. :-)

That said:

>typedef struct {
>    int i;
>    double x;
>    int n;
>    double *mz;
>    short *intens;
>} Data;
>
>I've an array of these structures and their mz and intens pointers
>point to arrays with n elements each.
>
>My program can write the Data array into a binary file, after writing
>the structure itself (using fwrite) it fwrites (appends) the mz and
>then the intens arrays.

In other words, you use fwrite() to write out the i, x, and n
fields (which you need in the file) plus also the "mz" and "intens"
fields (which you do *not* need in the file, since they have to
be replaced on subsequent "re-load-from-file" runs):

    Data *p;
    FILE *somefile;
    ... set up p, p->i, p->x, p->n, etc ...

    somefile = fopen("somename", "wb");
    if (somefile == NULL)
        ... handle error ...

    /* possible additional code here */

    if (fwrite(p, sizeof *p, 1, somefile) != 1)
        ... handle error ...
    if (fwrite(p->mz, sizeof *p->mz, p->n, somefile) != p->n)
        ... handle error ...
    if (fwrite(p->intens, sizeof *p->intens, p->n, somefile) != p->n)
        ... handle error ...

This code is OK, although the initial fwrite() -- writing bytes
from (void *)p for length sizeof *p -- writes three useful values
and two useless ones. It would be "better" (in some sense) to
write only the useful values, by replacing the first fwrite()
with three separate fwrite()s:

    if (fwrite(&p->i, sizeof p->i, 1, somefile) != 1 ||
        fwrite(&p->x, sizeof p->x, 1, somefile) != 1 ||
        fwrite(&p->n, sizeof p->n, 1, somefile) != 1)
        ... handle error ...

>My idea is to read this file in one go into memory using fread ...

The simplest way to read it back is to use as many fread()s as
fwrite()s above:

    Data *p;
    FILE *somefile;
    ...
    p = malloc(sizeof *p);
    if (p == NULL)
        ... handle error ...
    somefile = fopen("somename", "rb");
    if (somefile == NULL)
        ... handle error ...

    /* assuming three separate fwrite()s for the useful elements: */
    if (fread(&p->i, sizeof p->i, 1, somefile) != 1 ||
        fread(&p->x, sizeof p->x, 1, somefile) != 1 ||
        fread(&p->n, sizeof p->n, 1, somefile) != 1)
        ... handle error ...

    /* insert range-checking on i, x, and n here if desired,
       to validate the input data */

    if ((p->mz = malloc(p->n * sizeof *p->mz)) == NULL)
        ... handle error ...
    if ((p->intens = malloc(p->n * sizeof *p->intens)) == NULL)
        ... handle error ...
    if (fread(p->mz, sizeof *p->mz, p->n, somefile) != p->n)
        ... handle error ...
    if (fread(p->intens, sizeof *p->intens, p->n, somefile) != p->n)
        ... handle error ...
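
For reference, the write and read fragments above can be folded into a pair of functions along these lines. This is a sketch under the same assumptions (three separate fwrite()s for i, x and n, followed by the two arrays); write_data and read_data are hypothetical names, and error handling is reduced to a return code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    int i;
    double x;
    int n;
    double *mz;
    short *intens;
} Data;

/* Write i, x, n, then n doubles and n shorts. Returns 0 or -1. */
int write_data(FILE *fp, const Data *p)
{
    size_t n;

    if (p->n < 0)
        return -1;
    n = (size_t)p->n;
    if (fwrite(&p->i, sizeof p->i, 1, fp) != 1 ||
        fwrite(&p->x, sizeof p->x, 1, fp) != 1 ||
        fwrite(&p->n, sizeof p->n, 1, fp) != 1)
        return -1;
    if (n == 0)
        return 0;
    if (fwrite(p->mz, sizeof *p->mz, n, fp) != n ||
        fwrite(p->intens, sizeof *p->intens, n, fp) != n)
        return -1;
    return 0;
}

/* Read one record into *p, allocating mz and intens. Returns 0 or -1.
   On success the caller frees p->mz and p->intens. */
int read_data(FILE *fp, Data *p)
{
    size_t n;

    p->mz = NULL;
    p->intens = NULL;
    if (fread(&p->i, sizeof p->i, 1, fp) != 1 ||
        fread(&p->x, sizeof p->x, 1, fp) != 1 ||
        fread(&p->n, sizeof p->n, 1, fp) != 1 || p->n < 0)
        return -1;
    n = (size_t)p->n;
    if (n == 0)
        return 0;
    p->mz = malloc(n * sizeof *p->mz);
    p->intens = malloc(n * sizeof *p->intens);
    if (p->mz == NULL || p->intens == NULL ||
        fread(p->mz, sizeof *p->mz, n, fp) != n ||
        fread(p->intens, sizeof *p->intens, n, fp) != n) {
        free(p->mz);
        free(p->intens);
        p->mz = NULL;
        p->intens = NULL;
        return -1;
    }
    return 0;
}
```

Because every value lands in an object that malloc or the compiler aligned, this version has no alignment hazard, and calling read_data in a loop until it fails handles a file with many records.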

>(I could even use mmap, since this file is accessed by several
>processes read only),

The mmap() routines are dangerously seductive. Using them ties
your code and data to OS- and machine-dependent items, and makes
it error-prone in ways that are not always obvious on first blush.
(For instance, the really odd one is what happens if the file is
truncated after successfully mapping it.)

>and then to reconstruct the mz and intens pointers properly.

If you omit them when writing, you can omit them when reading
back, as above.

While mmap() avoids what some people call "unnecessary" copying of
the data (during I/O), that very copying is what makes the code
above so simple and reliable. Often, the simplicity is worth the
performance penalty. (If it is not, one can always complextify
the code later. :-) )
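
If the whole file is nevertheless held in memory (read in one go, or mapped), the same copying idea still applies: instead of casting into the byte image, copy each field out with memcpy. The sketch below assumes the three-fwrite layout (i, x, n, then the arrays) repeated once per record; take and next_entry are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int i;
    double x;
    int n;
    double *mz;
    short *intens;
} Data;

/* Copy len bytes from the image at *pos into dst and advance *pos.
   Requires *pos <= size. Returns 0, or -1 if the image is too short. */
static int take(const unsigned char *mem, size_t size, size_t *pos,
                void *dst, size_t len)
{
    if (len > size - *pos)
        return -1;
    memcpy(dst, mem + *pos, len);   /* memcpy has no alignment demands */
    *pos += len;
    return 0;
}

/* Decode the record starting at *pos. Returns 0 on success, -1 at the
   end of the image or on malformed input. On success the caller frees
   out->mz and out->intens. The image itself is never modified. */
int next_entry(const unsigned char *mem, size_t size, size_t *pos, Data *out)
{
    size_t p = *pos, n;

    out->mz = NULL;
    out->intens = NULL;
    if (p > size ||
        take(mem, size, &p, &out->i, sizeof out->i) != 0 ||
        take(mem, size, &p, &out->x, sizeof out->x) != 0 ||
        take(mem, size, &p, &out->n, sizeof out->n) != 0 || out->n < 0)
        return -1;
    n = (size_t)out->n;
    if (n > (size - p) / (sizeof(double) + sizeof(short)))
        return -1;                  /* n claims more data than exists */
    if (n > 0) {
        out->mz = malloc(n * sizeof *out->mz);
        out->intens = malloc(n * sizeof *out->intens);
        if (out->mz == NULL || out->intens == NULL) {
            free(out->mz);
            free(out->intens);
            out->mz = NULL;
            out->intens = NULL;
            return -1;
        }
        /* The bound checked above guarantees both copies fit. */
        (void)take(mem, size, &p, out->mz, n * sizeof *out->mz);
        (void)take(mem, size, &p, out->intens, n * sizeof *out->intens);
    }
    *pos = p;
    return 0;
}
```

This gives up the zero-copy property, but it works at any byte offset and never writes to the image, so it should also be usable with a read-only mapping.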
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

Oct 8 '06 #6

Joe Wright

Chris Torek wrote:

(snip full quote of the previous post)

Combining C structures and data and writing it to a file, then reading
that file back into memory, is non-trivial.

I invite you all to examine the .DBF file structure of dBASE or FoxPro
or Clipper. The file consists of a binary header (certainly a C-like
structure) describing record length, number of rows and such. Then
another array of structures describing the attributes of each column in
a row. The remainder of the .dbf file is text which begins at an offset
defined in the header and continues for cols * rows bytes, ending with
the ever-popular 0x1A byte.

I have worked with this structure for more than 20 years now. I like it.
Ten years ago I began writing C programs to manipulate .dbf files.
Doable of course but not 'simple' by any means.

Attempts to write structures and data to a single file and then read the
file and data in a meaningful way will prove to be non-trivial.

Simpler is better. Define your data in terms of columns per row and rows
per file, and write it in text, not binary. Beware the Endians.
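
A rough sketch of what a text version of the Data record could look like: one header line "i x n" followed by n rows of "mz intens". The row layout and the names write_text and read_text are assumptions for illustration. The %.17g conversion is used on the assumption that double is an IEEE 754 binary64 type, for which 17 significant digits are enough to read the same value back.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    int i;
    double x;
    int n;
    double *mz;
    short *intens;
} Data;

/* Write a header line "i x n", then one "mz intens" row per element. */
int write_text(FILE *fp, const Data *p)
{
    int k;

    if (fprintf(fp, "%d %.17g %d\n", p->i, p->x, p->n) < 0)
        return -1;
    for (k = 0; k < p->n; k++)
        if (fprintf(fp, "%.17g %hd\n", p->mz[k], p->intens[k]) < 0)
            return -1;
    return 0;
}

/* Read one record written by write_text. Returns 0 or -1.
   On success the caller frees p->mz and p->intens. */
int read_text(FILE *fp, Data *p)
{
    int k;

    p->mz = NULL;
    p->intens = NULL;
    if (fscanf(fp, "%d %lf %d", &p->i, &p->x, &p->n) != 3 || p->n < 0)
        return -1;
    if (p->n == 0)
        return 0;
    p->mz = malloc((size_t)p->n * sizeof *p->mz);
    p->intens = malloc((size_t)p->n * sizeof *p->intens);
    if (p->mz == NULL || p->intens == NULL)
        goto fail;
    for (k = 0; k < p->n; k++)
        if (fscanf(fp, "%lf %hd", &p->mz[k], &p->intens[k]) != 2)
            goto fail;
    return 0;

fail:
    free(p->mz);
    free(p->intens);
    p->mz = NULL;
    p->intens = NULL;
    return -1;
}
```

Such a file has no padding, alignment or byte-order dependence, at the cost of being larger and slower to parse than the binary form.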

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---

Oct 8 '06 #7
