Binary data representation

Hi,

I am currently writing a serialize/unserialize architecture. The read/write
functions will read from and write to a binary file.

My question is: is there some sort of defined standard to use when
representing data types (int, int32, int64, double, string, etc.)?
Thanks,


Jul 22 '05 #1
Charles T. wrote:
> I am currently writing a serialize/unserialize architecture. The read/write
> functions will read from and write to a binary file.

Why a binary file?

> My question is: is there some sort of defined standard to use when
> representing data types (int, int32, int64, double, string, etc.)?

No. There is not even a "standard" for what order bytes go inside an int.
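
To make the byte-order point concrete, here is a minimal C++ sketch (not from the thread; `bytes_in_memory` and `is_little_endian` are names invented for illustration). It copies a 32-bit value into a byte array to show the order this particular machine uses in memory. It assumes the platform provides `std::uint32_t`.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Returns the bytes of `value` in the order this machine stores them in memory.
std::array<unsigned char, 4> bytes_in_memory(std::uint32_t value) {
    std::array<unsigned char, 4> bytes{};
    std::memcpy(bytes.data(), &value, sizeof value);
    return bytes;
}

// True when the least significant byte is stored first (little-endian).
bool is_little_endian() {
    return bytes_in_memory(0x01020304u)[0] == 0x04;
}
```

The same value produces a different byte sequence on little-endian and big-endian hardware, which is why dumping an int's memory straight to a file is not portable.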

The least heinous data format is XML. You can write very simple or very
complex data structures in it, and you can read those structures in a text
editor.

But XML can be a little obese. Some data formats are compressed XML.

--
Phlip
http://www.xpsd.org/cgi-bin/wiki?Tes...UserInterfaces
Jul 22 '05 #2
[snip]
> The least heinous data format is XML. You can write very simple or
> very complex data structures in it, and you can read those structures
> in a text editor.
>
> But XML can be a little obese. Some data formats are compressed XML.

I would recommend this, also.
The game Age of Mythology uses XML compressed with zlib-compatible
compression, and it generates very compact but easily decoded files.
You can get zlib here:
http://www.zlib.org

- Pete
Jul 22 '05 #3
Charles T. wrote:
> Hi,
>
> I am currently writing a serialize/unserialize architecture. The read/write
> functions will read from and write to a binary file.

There has been much discussion on Serialization and Persistence in
this newsgroup and news:comp.lang.c. Use a search engine and look
for some ideas.

> My question is: is there some sort of defined standard to use when
> representing data types (int, int32, int64, double, string, etc.)?

There is no standard, from platform to platform. On some platforms,
there may be no standards between OS versions or compiler versions.
For better portability, write out the data in a consistent form
(e.g. uint64 == 64 bits, little endian) and let the programs convert
the data into the native representation.
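
As a rough illustration of "a consistent form", the sketch below writes a 64-bit unsigned value as exactly 8 bytes, least significant byte first, and rebuilds it with shifts on the way back in. The function names (`write_u64_le`, `read_u64_le`, `encode_u64_le`) are made up for this example, and the code assumes the platform provides `std::uint64_t`.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Write `value` as exactly 8 bytes, least significant byte first.
void write_u64_le(std::ostream& out, std::uint64_t value) {
    for (int i = 0; i < 8; ++i) {
        out.put(static_cast<char>((value >> (8 * i)) & 0xFF));
    }
}

// Read 8 bytes written by write_u64_le. Returns false on a short read.
bool read_u64_le(std::istream& in, std::uint64_t& value) {
    std::uint64_t result = 0;
    for (int i = 0; i < 8; ++i) {
        const int c = in.get();
        if (!in) return false;
        result |= static_cast<std::uint64_t>(static_cast<unsigned char>(c)) << (8 * i);
    }
    value = result;
    return true;
}

// Convenience wrapper: the 8-byte encoding of `value` as a string of raw bytes.
std::string encode_u64_le(std::uint64_t value) {
    std::ostringstream out;
    write_u64_le(out, value);
    return out.str();
}
```

Because the shifts operate on the value rather than on its memory layout, the same code produces the same file on any host, whatever its native byte order. When using real files, open the stream with `std::ios::binary`.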

Remember, when serializing, that the size of a structure may not
be the sum of the size of its members. Compilers are allowed to
add "padding bytes" between members.
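
A small sketch of the padding issue, using a hypothetical `Record` struct invented for this example (it assumes the fixed-width types from `<cstdint>` are available). It compares the sum of the member sizes with what `sizeof` reports.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical record used only for illustration.
struct Record {
    std::uint8_t  tag;    // 1 byte
    std::uint32_t count;  // 4 bytes, usually aligned to a 4-byte boundary
    std::uint16_t flags;  // 2 bytes
};

// Sum of the member sizes: 1 + 4 + 2 = 7 bytes.
constexpr std::size_t kPayloadSize =
    sizeof(std::uint8_t) + sizeof(std::uint32_t) + sizeof(std::uint16_t);

// The struct can never be smaller than its members, but it may be larger.
static_assert(sizeof(Record) >= kPayloadSize, "members must fit in the struct");

// Number of padding bytes the compiler added on this platform.
constexpr std::size_t padding_bytes() {
    return sizeof(Record) - kPayloadSize;
}

// Where `count` actually starts; any value above 1 means padding after `tag`.
constexpr std::size_t count_offset() {
    return offsetof(Record, count);
}
```

On many mainstream desktop compilers `sizeof(Record)` comes out as 12 rather than 7, but that figure is not guaranteed. This is the reason to serialize member by member instead of writing the whole struct in one call.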

Pointers don't store well. There is a very small probability
that an OS will allocate a variable in the same place for each
execution of a program.

Since pointers don't store well, don't store strings as pointers.
Store text as <quantity, text> or <text, sentinel character>.

See section [35] of the C++ FAQ (about serialization):
http://www.parashift.com/c++-faq-lit...alization.html


> Thanks,

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.raos.demon.uk/acllc-c++/faq.html
Other sites:
http://www.josuttis.com -- C++ STL Library book

Jul 22 '05 #4
Phlip wrote:
> Charles T. wrote:
>
>> I am currently writing a serialize/unserialize architecture. The read/write
>> functions will read from and write to a binary file.
>
> Why a binary file?
>
>> My question is: is there some sort of defined standard to use when
>> representing data types (int, int32, int64, double, string, etc.)?
>
> No. There is not even a "standard" for what order bytes go inside an int.
>
> The least heinous data format is XML. You can write very simple or very
> complex data structures in it, and you can read those structures in a text
> editor.
>
> But XML can be a little obese. Some data formats are compressed XML.


If you're talking about a real-time (streaming) system, the XML overhead
may be too much of a price to pay.

In 1999 I built a binary XML format that could be "parsed" in a fraction
of the time. But for some systems, even this one was too expensive.

Jul 22 '05 #5
Gianni Mariani wrote:
> [snip]
>
> In 1999 I built a binary XML format that could be "parsed" in a
> fraction of the time. But for some systems, even this one was too
> expensive.


Would you mind posting your implementation? I would be interested in seeing
it.
Thanks!

- Pete
Jul 22 '05 #6
You might want to look (depending on your application area and on
whether you have time to learn it) at ASN.1, which is an ITU standard to
provide "a notation for defining data structures [and] a defined
(machine-independent) encoding for those data structures".

Have a glance at www-sop.inria.fr/rodeo/personnel/hoschka/asn1.html or
www.asn1.org; Google will also bring back lots of links.

Geoff Macartney

Charles T. wrote:
> Hi,
>
> I am currently writing a serialize/unserialize architecture. The read/write
> functions will read from and write to a binary file.
>
> My question is: is there some sort of defined standard to use when
> representing data types (int, int32, int64, double, string, etc.)?
> Thanks,


Jul 22 '05 #7
On Wed, 04 Feb 2004 12:21:16 -0500, Gianni Mariani wrote:
> In 1999 I built a binary XML format that could be "parsed" in a fraction
> of the time. But for some systems, even this one was too expensive.

No need to reinvent the wheel, have a look at ASN.1. Parsers abound, BTW.

M4

Jul 22 '05 #8
Martijn Lievaart wrote:
> On Wed, 04 Feb 2004 12:21:16 -0500, Gianni Mariani wrote:
>
>> In 1999 I built a binary XML format that could be "parsed" in a fraction
>> of the time. But for some systems, even this one was too expensive.
>
> No need to reinvent the wheel, have a look at ASN.1. Parsers abound, BTW.

ASN.1 is different - the binary format I'm talking about has a 1:1
correlation to XML. The format was simply more efficient to parse than
XML text - admittedly the XML parser I wrote was slower than molasses in
a blizzard ... :-)

Jul 22 '05 #9
Charles T. wrote:
> Hi,
>
> I am currently writing a serialize/unserialize architecture. The read/write
> functions will read from and write to a binary file.
>
> My question is: is there some sort of defined standard to use when
> representing data types (int, int32, int64, double, string, etc.)?


I have an application in which the compactness of binary representation
(as compared with, say, XML) is important, but where portability of that
binary file, regardless of endianness, is also important. My solution is
very simple: I just choose an endianness and stick with it, and make sure
to write/read one byte at a time to construct/reconstruct the data. It
works fine. The binary file is as compact as if I didn't care about
portability, and it works with all kinds of endianness. The reading and
the writing in principle take a little longer because of the
disassembling/assembling that takes place here, but in practice it is
not a problem at all because of buffering. I just read, say, 1k at a
time and the problem disappears. Also, there are usually layers of
buffering involved anyway, in the OS, in the disk etc.

/David
Jul 22 '05 #10
