directly serializing structs

Greetings,

When directly serializing C++ structures to a file with the standard
library functions giving the address of the data and length of
structure using the sizeof operator, do I risk portability because of
different compilers packing structures into different sizes or
components of this structure to different address boundaries (for
example placing in multiples of 4 on a 32bit system)? Once the file is
serialized, does the same code compiled by another compiler or even
the same compiler but a different version carry the risk of not
reading the contents properly?

Thank you

Jun 23 '07 #1
Cagdas Ozgenc wrote:
Greetings,

When directly serializing C++ structures to a file with the standard
library functions giving the address of the data and length of
structure using the sizeof operator, do I risk portability because of
different compilers packing structures into different sizes or
components of this structure to different address boundaries (for
example placing in multiples of 4 on a 32bit system)?
Yes. You also risk portability problems because different compilers (or
platforms) have different formats for individual data items. Byte
ordering for integers often varies across platforms, and floating-point
formats can vary even between compilers on the same platform.

Once the file is
serialized, does the same code compiled by another compiler or even
the same compiler but a different version carry the risk of not
reading the contents properly?
Yes, see above.

If this is a concern then consider using text, it's much more portable.
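
A minimal sketch of the text approach, assuming a hypothetical `Record` struct and helper names (`write_text`, `read_text`) that are not from the original question. Each member is written through the stream operators, so the file no longer depends on padding or byte order:

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical record type, used only for illustration.
struct Record {
    std::int32_t id;
    double       weight;
};

// Write each member as text, separated by a space, one record per line.
void write_text(std::ostream& out, const Record& r) {
    out << r.id << ' ' << r.weight << '\n';
}

// Read one record back; returns false if the input could not be parsed.
bool read_text(std::istream& in, Record& r) {
    return static_cast<bool>(in >> r.id >> r.weight);
}
```

With the default stream precision a double is written with six significant digits, so values that need more digits would have to be written with a higher precision to round-trip exactly.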
>
Thank you
Jun 23 '07 #2
Cagdas Ozgenc wrote:
Greetings,

When directly serializing C++ structures to a file with the standard
library functions giving the address of the data and length of
structure using the sizeof operator, do I risk portability because of
different compilers packing structures into different sizes or
components of this structure to different address boundaries (for
example placing in multiples of 4 on a 32bit system)? Once the file is
serialized, does the same code compiled by another compiler or even
the same compiler but a different version carry the risk of not
reading the contents properly?
Yes, it does. Internal layout is implementation defined.
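
One way to see this on a given compiler is to compare `sizeof` against the sum of the member sizes. The sketch below uses a hypothetical `Sample` struct; the numbers it reports are whatever the compiler at hand chooses, which is exactly the point:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical struct: one byte followed by a four-byte integer.
struct Sample {
    std::uint8_t  tag;
    std::uint32_t value;
};

// Sum of the member sizes, i.e. the size the struct would have with no padding.
constexpr std::size_t kPayloadBytes = sizeof(std::uint8_t) + sizeof(std::uint32_t);

// Number of padding bytes this particular compiler added to the struct.
constexpr std::size_t padding_bytes() {
    return sizeof(Sample) - kPayloadBytes;
}

// Offset at which `value` actually starts inside the struct.
constexpr std::size_t value_offset() {
    return offsetof(Sample, value);
}
```

On many common 32-bit and 64-bit targets `padding_bytes()` comes out as 3, but nothing in the language requires that.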

--
Ian Collins.
Jun 23 '07 #3

"Cagdas Ozgenc" <ca***********@gmail.com> wrote in message
news:11*********************@w5g2000hsg.googlegroups.com...
Greetings,

When directly serializing C++ structures to a file with the standard
library functions giving the address of the data and length of
structure using the sizeof operator, do I risk portability because of
different compilers packing structures into different sizes or
components of this structure to different address boundaries (for
example placing in multiples of 4 on a 32bit system)? Once the file is
serialized, does the same code compiled by another compiler or even
the same compiler but a different version carry the risk of not
reading the contents properly?
Does your software/application require that much portability? Why struggle
with "write once, compile anywhere" if you're only targeting one platform or
even only one machine, for instance?

I wouldn't call what you described above "serializing" though. To me,
"serializing" has the connotation that you indeed are looking into structs
and the sizes of data members, their padding etc. or using ASN.1/BER for
over the wire transmission, for another example, rather than just doing a
struct-sized write. The recommended practice of streaming everything at
every boundary (disk, wire) seems unnatural and tedious to me also. I guess
a layer at the boundaries that does the streaming on the non-primary
platform and doesn't do anything on the primary platform isn't that bad to
implement.

I can think of 3 issues that prevent the "blast struct all over" concept
from working: endianness, padding/alignment, datatype sizes. The first one is
the party spoiler. Guaranteed-width integers help with the last issue.
Byte-aligning data (no padding) is probably available on most compilers (?).
Endianness, though, well, there's not much you can do about that to make the
concept work. Luckily, the users of big-endian machines are mostly
categorizably different from little-endian machine users, so you can just
pick your target users and tailor your software to them. Or else do the
conversions:

struct on Intel going over wire to a Sparc -> no change to struct
struct coming into Sparc from Intel -> convert struct endianness
struct on Sparc going to disk -> convert struct endianness
struct on Sparc going to Intel -> convert struct endianness
struct coming into Intel from Sparc -> no change to struct
struct on Intel going to disk -> no change to struct

(The above scenario assumes platform-independent files are desired. If not,
fewer conversions required).
(Yes, before anyone quips, I do know that "network byte order" is big
endian. There's also more Windows machines than Unix).
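
The conversions in the table above can be sketched without knowing the host byte order at all, by building the external representation with shifts. The helper names `store_le32` and `load_le32` are hypothetical, and little-endian is assumed as the external format because the table treats the Intel layout as the one that needs no change:

```cpp
#include <cstdint>

// Write a 32-bit value into buf in little-endian order (least significant
// byte first), whatever the host byte order is.
void store_le32(unsigned char* buf, std::uint32_t v) {
    buf[0] = static_cast<unsigned char>(v & 0xFFu);
    buf[1] = static_cast<unsigned char>((v >> 8) & 0xFFu);
    buf[2] = static_cast<unsigned char>((v >> 16) & 0xFFu);
    buf[3] = static_cast<unsigned char>((v >> 24) & 0xFFu);
}

// Rebuild the 32-bit value from four little-endian bytes.
std::uint32_t load_le32(const unsigned char* buf) {
    return  static_cast<std::uint32_t>(buf[0])
         | (static_cast<std::uint32_t>(buf[1]) << 8)
         | (static_cast<std::uint32_t>(buf[2]) << 16)
         | (static_cast<std::uint32_t>(buf[3]) << 24);
}
```

Because the shifts operate on values rather than on memory, the same source produces the same file bytes on both kinds of machine, at the cost of handling each member individually.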

(Issue 4: size of a byte).

John

Jun 23 '07 #4
On Fri, 22 Jun 2007 22:55:38 -0700, Cagdas Ozgenc <ca***********@gmail.com>
wrote:
>When directly serializing C++ structures to a file with the standard
library functions giving the address of the data and length of
structure using the sizeof operator, do I risk portability because of
different compilers packing structures into different sizes or
components of this structure to different address boundaries (for
example placing in multiples of 4 on a 32bit system)? Once the file is
serialized, does the same code compiled by another compiler or even
the same compiler but a different version carry the risk of not
reading the contents properly?
Serialization is a complex issue that has so far eluded a truly general
solution, primarily because the needs of each developer vary so much. There
are several "classes" of techniques, though. They are described quite well in
the C++ FAQ Lite pages:
http://www.parashift.com/c++-faq-lit...alization.html

When I want to do general-purpose, cross-platform, binary-compatible
exchanges, I generally:

1. Pack data structures to the byte (using #pragmas, most times)
2. Use fixed-width integer types
3. Choose an endian representation and provide conversion facilities
4. Use IEEE representation for floating point numbers, else use fixed point
notation
5. Serialize PODs and structs only, not class hierarchies

An adaptation library with conditional compilation switches can be made for
items 1-4 that allows you to encapsulate the compiler- or platform-specific
behaviors.
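
Items 1 and 2 can be sketched as follows. `#pragma pack` is a compiler extension rather than standard C++ (MSVC, GCC and Clang accept the form shown), and `WireHeader` is a hypothetical record invented for this example. The static assertions make the build fail if the compiler does not produce the expected layout:

```cpp
#include <cstddef>
#include <cstdint>

#pragma pack(push, 1)    // compiler extension: request byte packing
struct WireHeader {       // hypothetical on-disk record
    std::uint8_t  version;
    std::uint32_t length;
    std::uint16_t flags;
};
#pragma pack(pop)         // restore the previous packing setting

// Fail the build if the layout is not the one the file format expects.
static_assert(sizeof(WireHeader) == 7, "WireHeader must have no padding");
static_assert(offsetof(WireHeader, length) == 1, "length must follow version");
static_assert(offsetof(WireHeader, flags) == 5, "flags must follow length");
```

Packing only fixes the offsets. The multi-byte members are still stored in host byte order, so item 3 has to be handled separately.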

-dr
Jun 23 '07 #5

"Dave Rahardja" <dr****************************@pobox.com> wrote in message
news:9g********************************@4ax.com...
On Fri, 22 Jun 2007 22:55:38 -0700, Cagdas Ozgenc
<ca***********@gmail.com>
wrote:
>>When directly serializing C++ structures to a file with the standard
library functions giving the address of the data and length of
structure using the sizeof operator, do I risk portability because of
different compilers packing structures into different sizes or
components of this structure to different address boundaries (for
example placing in multiples of 4 on a 32bit system)? Once the file is
serialized, does the same code compiled by another compiler or even
the same compiler but a different version carry the risk of not
reading the contents properly?

Serialization is a complex issue that has so far eluded a truly general
solution, primarily because the needs of each developer varies so much.
There
are several "classes" of techniques, though. They are described quite well
in
the C++ FAQ Lite pages:
http://www.parashift.com/c++-faq-lit...alization.html

When I want to do general-purpose, cross-platform, binary-compatible
exchanges, I generally:

1. Pack data structures to the byte (using #pragmas, most times)
I think "no padding" may indeed be a feature that a new language could
exploit.
2. Use fixed-width integer types
3. Choose an endian representation and provide conversion facilities
That's the key one. If there were one gift that the hardware vendors could
give, it would be to standardize endianness. IMO. OK, it's little endian from
now on. Let's move on! LOL! (Oh wait, can I have a standard definition of
"byte" also?).
4. Use IEEE representation for floating point numbers, else use fixed
point
notation
5. Serialize PODs and structs only, not class hierarchies
By "class hierarchies", I think you mean "derived structs". If there were
more guarantee (or I was so assured) that struct B derived from struct A
would be exactly like a struct containing the data members of A followed
immediately by data members of B, I'd be eventually OK with those
compositions.
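
Later C++ standards (C++11 onward, so after this thread) give a way to ask the compiler about this. A class with data members in both a base and a derived class is not a standard-layout type, so the language makes no promise about how the two sets of members are arranged. The sketch below uses hypothetical type names:

```cpp
#include <type_traits>

struct Base         { int a; };
struct Derived      : Base { int b; };   // data members in base and derived
struct EmptyDerived : Base { };          // derived class adds no data members
struct Flat         { int a; int b; };   // all members declared in one class

// Flat and EmptyDerived are standard-layout; Derived is not.
static_assert(std::is_standard_layout<Flat>::value, "Flat is standard-layout");
static_assert(std::is_standard_layout<EmptyDerived>::value,
              "EmptyDerived is standard-layout");
static_assert(!std::is_standard_layout<Derived>::value,
              "Derived is not standard-layout");
```

Many compilers do in practice place the base members first, but that is an implementation choice rather than a guarantee.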
>
An adaptation library with conditional compilation switches can be made
for
items 1-4 that allows you to encapsulate the compiler- or
platform-specific
behaviors.
Grouping those hides the "severity" of 3.

Even with your 1-5, all bets are still off because sizeof(char) could be
different somewhere else (right?).
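
One detail worth pinning down here: `sizeof(char)` is 1 by definition on every conforming implementation. What can differ is the number of bits in that byte, which `<climits>` exposes as `CHAR_BIT`. A sketch of a guard, with `bit_width_of` as a hypothetical helper name:

```cpp
#include <climits>
#include <cstddef>

// sizeof(char) is always 1; the size of a byte in bits is what may vary.
static_assert(sizeof(char) == 1, "guaranteed by the language");
static_assert(CHAR_BIT == 8, "this file format assumes 8-bit bytes");

// Number of bits in an object of type T on this platform.
template <typename T>
constexpr std::size_t bit_width_of() {
    return sizeof(T) * CHAR_BIT;
}
```

With the second assertion in place, code that assumes octets simply refuses to compile on a platform where the assumption does not hold.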

John

Jun 24 '07 #6
JohnQ wrote:
>
"Dave Rahardja" <dr****************************@pobox.com> wrote in
message news:9g********************************@4ax.com...
>On Fri, 22 Jun 2007 22:55:38 -0700, Cagdas Ozgenc
<ca***********@gmail.com>
wrote:
>>When directly serializing C++ structures to a file with the standard
library functions giving the address of the data and length of
structure using the sizeof operator, do I risk portability because of
different compilers packing structures into different sizes or
components of this structure to different address boundaries (for
example placing in multiples of 4 on a 32bit system)? Once the file is
serialized, does the same code compiled by another compiler or even
the same compiler but a different version carry the risk of not
reading the contents properly?

Serialization is a complex issue that has so far eluded a truly general
solution, primarily because the needs of each developer varies so
much. There
are several "classes" of techniques, though. They are described quite
well in
the C++ FAQ Lite pages:
http://www.parashift.com/c++-faq-lit...alization.html

When I want to do general-purpose, cross-platform, binary-compatible
exchanges, I generally:

1. Pack data structures to the byte (using #pragmas, most times)

I think "no padding" may indeed be a feature that a new language could
exploit.
Not if the hardware doesn't support it, or even supports it with a
significant performance hit.
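
Where data has to be pulled out of an unpadded buffer anyway, copying the bytes into a properly aligned object avoids relying on the hardware's tolerance for misaligned loads. A minimal sketch, with `read_u32_unaligned` as a hypothetical helper name:

```cpp
#include <cstdint>
#include <cstring>

// Read a 32-bit value (in host byte order) from an address that may not be
// suitably aligned for std::uint32_t. Copying byte by byte is always valid.
std::uint32_t read_u32_unaligned(const unsigned char* p) {
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}
```

Casting the pointer to `std::uint32_t*` and dereferencing it instead would be undefined behaviour at a misaligned address, and on some hardware it traps.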

--
Ian Collins.
Jun 24 '07 #7
On Jun 23, 7:55 am, Cagdas Ozgenc <cagdas.ozg...@gmail.com> wrote:
When directly serializing C++ structures to a file with the standard
library functions giving the address of the data and length of
structure using the sizeof operator, do I risk portability because of
different compilers packing structures into different sizes or
components of this structure to different address boundaries (for
example placing in multiples of 4 on a 32bit system)? Once the file is
serialized, does the same code compiled by another compiler or even
the same compiler but a different version carry the risk of not
reading the contents properly?
Very much so. Even changing the compile flags can cause
problems. About the only time this works is for temporary
files, which are read and written by the same binary image.

--
James Kanze (Gabi Software) email: ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Jun 24 '07 #8
On Jun 23, 11:58 am, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
"Cagdas Ozgenc" <cagdas.ozg...@gmail.com> wrote in message
news:11*********************@w5g2000hsg.googlegroups.com...
When directly serializing C++ structures to a file with the standard
library functions giving the address of the data and length of
structure using the sizeof operator, do I risk portability because of
different compilers packing structures into different sizes or
components of this structure to different address boundaries (for
example placing in multiples of 4 on a 32bit system)? Once the file is
serialized, does the same code compiled by another compiler or even
the same compiler but a different version carry the risk of not
reading the contents properly?

Does your software/application require that much portability? Why struggle
with "write once, compile anywhere" if you're only targeting one platform or
even only one machine, for instance?
And one version of one compiler with one set of compiler options.

I guess he's a professional.

[...]
I can think of 3 issues that prevent the "blast struct all
over" concept from working: endianness, padding/alignment,
datatype sizes.
Representation in general. For floating point, it's a real
problem, even today. For integers, there is also at least one
machine on the market which uses 36-bit ones' complement
integers, but it's not very widespread, and many people can
afford to ignore it.
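
For the floating-point side, the standard library can at least report whether the platform claims the IEEE 754 (IEC 559) representation. The sketch below refuses to compile elsewhere and then moves a double through a 64-bit integer, which can be byte-ordered like any other integer. The helper names are hypothetical:

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Refuse to build on platforms whose double is not IEEE 754 binary64.
static_assert(std::numeric_limits<double>::is_iec559,
              "this format assumes IEEE 754 doubles");
static_assert(sizeof(double) == sizeof(std::uint64_t),
              "this format assumes 64-bit doubles");

// Copy the bit pattern of a double into a 64-bit integer.
std::uint64_t double_to_bits(double d) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return bits;
}

// Inverse of double_to_bits.
double bits_to_double(std::uint64_t bits) {
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}
```

On a platform that fails the first assertion, the values would need an explicit conversion step or a text or fixed-point representation instead.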

Just be aware of the restriction, and document it, so that some
maintenance programmer in the future doesn't get bitten. And
whatever you do, document all external formats, so a maintenance
programmer has a chance of implementing them on some future
hardware.
The first one is the party spoiler.
I'd say that the different representations are even worse.
(Note too that "endianness" isn't a good word, since it suggests
two possible arrangements. At least three are widespread.)
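
A run-time probe makes the "more than two" point concrete: it inspects how a known 32-bit value is laid out in memory and reports anything that is neither of the two familiar orders. `ByteOrder` and `detect_byte_order` are hypothetical names for this sketch:

```cpp
#include <cstdint>
#include <cstring>

enum class ByteOrder { Little, Big, Other };

// Look at the in-memory bytes of 0x01020304 and classify the arrangement.
ByteOrder detect_byte_order() {
    const std::uint32_t probe = 0x01020304u;
    unsigned char bytes[sizeof probe];
    std::memcpy(bytes, &probe, sizeof probe);

    if (bytes[0] == 0x04 && bytes[1] == 0x03 &&
        bytes[2] == 0x02 && bytes[3] == 0x01) {
        return ByteOrder::Little;   // least significant byte first
    }
    if (bytes[0] == 0x01 && bytes[1] == 0x02 &&
        bytes[2] == 0x03 && bytes[3] == 0x04) {
        return ByteOrder::Big;      // most significant byte first
    }
    return ByteOrder::Other;        // mixed or otherwise unusual order
}
```

Code that builds the external format with shifts never needs this probe. It is mainly useful for diagnostics or for choosing a fast path.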

--
James Kanze (Gabi Software) email: ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Jun 24 '07 #9

"James Kanze" <ja*********@gmail.com> wrote in message
news:11*********************@p77g2000hsh.googlegroups.com...
On Jun 23, 11:58 am, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:

(Note too that "endianness" isn't a good word, since it suggests
two possible arrangements. At least three are widespread.)

But that one is called "middle ENDIAN" right? If so, that makes "endianness"
seem OK.

John

Jun 26 '07 #10

"Ian Collins" <ia******@hotmail.com> wrote in message
news:5e*************@mid.individual.net...
JohnQ wrote:
>>
"Dave Rahardja" <dr****************************@pobox.com> wrote in
message news:9g********************************@4ax.com...
>>On Fri, 22 Jun 2007 22:55:38 -0700, Cagdas Ozgenc
<ca***********@gmail.com>
wrote:

When directly serializing C++ structures to a file with the standard
library functions giving the address of the data and length of
structure using the sizeof operator, do I risk portability because of
different compilers packing structures into different sizes or
components of this structure to different address boundaries (for
example placing in multiples of 4 on a 32bit system)? Once the file is
serialized, does the same code compiled by another compiler or even
the same compiler but a different version carry the risk of not
reading the contents properly?

Serialization is a complex issue that has so far eluded a truly general
solution, primarily because the needs of each developer varies so
much. There
are several "classes" of techniques, though. They are described quite
well in
the C++ FAQ Lite pages:
http://www.parashift.com/c++-faq-lit...alization.html

When I want to do general-purpose, cross-platform, binary-compatible
exchanges, I generally:

1. Pack data structures to the byte (using #pragmas, most times)

I think "no padding" may indeed be a feature that a new language could
exploit.
Not if the hardware doesn't support it, or even supports it with a
significant performance hit.
That would be a nice table to see: CPUs and the supported compiler/language
properties. Writing code that will run on all platforms is a waste of effort
when it is known that the software will never be deployed on those other
platforms. Layering on top of C++ to abstract away what needn't be bothered
with on a daily coding basis is the way to go. Just because C++ is "close to
the hardware" doesn't mean you have to program at that low level all of the
time.

John

Jun 26 '07 #11

"James Kanze" <ja*********@gmail.com> wrote in message
news:11*********************@q69g2000hsb.googlegroups.com...
On Jun 27, 4:03 pm, Dave Rahardja
<drahardja_atsign_pobox_dot_...@pobox.com> wrote:
On Tue, 26 Jun 2007 03:30:54 -0500, "JohnQ"
<johnqREMOVETHISprogram...@yahoo.com> wrote:
"James Kanze" <james.ka...@gmail.com> wrote in message
news:11*********************@p77g2000hsh.googlegroups.com...
On Jun 23, 11:58 am, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
(Note too that "endianness" isn't a good word, since it suggests
two possible arrangements. At least three are widespread.)
But that one is called "middle ENDIAN" right? If so, that makes
"endianness"
seem OK.
"I've never heard it called anything:-). It just is. (There are
also word addressed machines, where it makes no sense to speak
of "endian".)"

Well, if saying "endian" suggests to would-be/will-be hardware designers that
there are only two, that would be a good thing. An even better thing if they
choose to deprecate the less ubiquitous perversions.

John

Jun 28 '07 #12

"James Kanze" <ja*********@gmail.com> wrote in message
news:11*********************@p77g2000hsh.googlegroups.com...
On Jun 23, 11:58 am, "JohnQ" <johnqREMOVETHISprogram...@yahoo.com>
wrote:
"Cagdas Ozgenc" <cagdas.ozg...@gmail.com> wrote in message
news:11*********************@w5g2000hsg.googlegroups.com...
When directly serializing C++ structures to a file with the standard
library functions giving the address of the data and length of
structure using the sizeof operator, do I risk portability because of
different compilers packing structures into different sizes or
components of this structure to different address boundaries (for
example placing in multiples of 4 on a 32bit system)? Once the file is
serialized, does the same code compiled by another compiler or even
the same compiler but a different version carry the risk of not
reading the contents properly?

Does your software/application require that much portability? Why struggle
with "write once, compile anywhere" if you're only targeting one platform
or even only one machine, for instance?
"And one version of one compiler with one set of compiler options.

I guess he's a professional."

Seems like masochism rather than professionalism.

[...]

"(Note too that "endianness" isn't a good word, since it suggests
two possible arrangements. At least three are widespread.)"

Perhaps you'd like to update http://en.wikipedia.org/wiki/Endianness. (Yes,
they list 3 endian arrangements).

John

Jun 29 '07 #13
