platform independent serialization of a long

RA Scheltema

Hi all,
A small question about serializing and deserializing a long in a
platform-independent manner. Can this be done with the following code?
char buf[4];
long val = 35456;

/* serialize ... on for example intel */
buf[0] = (unsigned char) ((val & 0xff000000) >> 24);
buf[1] = (unsigned char) ((val & 0x00ff0000) >> 16);
buf[2] = (unsigned char) ((val & 0x0000ff00) >> 8);
buf[3] = (unsigned char) ((val & 0x000000ff) >> 0);

/* deserialize ... on for example mac */
val = 0;
val = val | ((unsigned long) buf[0]) << 24;
val = val | ((unsigned long) buf[1]) << 16;
val = val | ((unsigned long) buf[2]) << 8;
val = val | ((unsigned long) buf[3]) << 0;
According to a colleague of mine, the & (in the first part of the code)
ensures that the least significant and most significant bytes are always
intact on whatever platform the buffer is deserialized. I don't agree; any
suggestions?
kind regards,
richard

Nov 14 '05 #1


tom_usenet

On Fri, 23 Jan 2004 12:37:23 +0100, "RA Scheltema"
<r.a.scheltem a[viral][s][p]@[m]dacolian.nl> wrote:

> hi all,
> A small question about serializing and deserializing a long in a platform
> independent manner. Can this be done with the following code ?:
No, the code assumes that sizeof(long) == 4 (not true on some 64-bit
platforms) and that CHAR_BIT == 8 (not true on some other platforms)
and that all platforms store negative numbers in the same way (not
true on 1s complement platforms, etc.), and use all bits in the value
representation of long.
> char buf[4];
> long val = 35456;
>
> /* serialize ... on for example intel */
> buf[0] = (unsigned char) ((val & 0xff000000) >> 24);
> buf[1] = (unsigned char) ((val & 0x00ff0000) >> 16);
> buf[2] = (unsigned char) ((val & 0x0000ff00) >> 8);
> buf[3] = (unsigned char) ((val & 0x000000ff) >> 0);
>
> /* deserialize ... on for example mac */
> val = 0;
> val = val | ((unsigned long) buf[0]) << 24;
> val = val | ((unsigned long) buf[1]) << 16;
> val = val | ((unsigned long) buf[2]) << 8;
> val = val | ((unsigned long) buf[3]) << 0;
> According to a colleague of mine, the & (in the first part of the code)
> ensures that the least significant and most significant byte is always
> intact on whatever platform the buffer is deserialized. I don't agree, any
> suggestions ?

Your colleague is correct. Note that the code assumes that all
platforms use the same type of longs, barring byte order. This isn't
true - e.g. sign-magnitude, 1s-complement, 16-bit chars, 64-bit longs,
etc. It is true on most 32-bit desktop platforms though: they have
8-bit chars, 32-bit longs and use 2s-complement for negative numbers.
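
The assumptions listed above can be checked on a given platform. What follows is only a minimal sketch, not something from the original posts: the helper name long_fits_four_octets is hypothetical, and it tests nothing beyond the three conditions named here (8-bit chars, a long whose range matches 32 bits, 2s-complement).

```c
#include <limits.h>

/* Hypothetical helper (not from the thread): returns 1 when this platform
   matches the assumptions the 4-byte scheme relies on, 0 otherwise. */
int long_fits_four_octets(void)
{
    /* A char must be exactly 8 bits wide. */
    if (CHAR_BIT != 8)
        return 0;

    /* The range of long must be that of a 32-bit type, no wider. */
    if (LONG_MAX != 2147483647L)
        return 0;

    /* 2s-complement has one more negative value than positive ones,
       so LONG_MIN + LONG_MAX is -1; it is 0 for 1s-complement and
       sign-magnitude. */
    if (LONG_MIN + LONG_MAX != -1)
        return 0;

    return 1;
}
```

On a typical 64-bit Unix system with a 64-bit long this returns 0, which is exactly the case where the 4-byte buffer silently drops the upper half of the value.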

Tom

C++ FAQ: http://www.parashift.com/c++-faq-lite/
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html

Nov 14 '05 #2

Tom St Denis

"tom_usenet " <to********@hot mail.com> wrote in message
news:3d******** *************** *********@4ax.c om...

>> char buf[4];
>> long val = 35456;
>>
>> /* serialize ... on for example intel */
>> buf[0] = (unsigned char) ((val & 0xff000000) >> 24);
>> buf[1] = (unsigned char) ((val & 0x00ff0000) >> 16);
>> buf[2] = (unsigned char) ((val & 0x0000ff00) >> 8);
>> buf[3] = (unsigned char) ((val & 0x000000ff) >> 0);
>>
>> /* deserialize ... on for example mac */
>> val = 0;
>> val = val | ((unsigned long) buf[0]) << 24;
>> val = val | ((unsigned long) buf[1]) << 16;
>> val = val | ((unsigned long) buf[2]) << 8;
>> val = val | ((unsigned long) buf[3]) << 0;
>> According to a colleague of mine, the & (in the first part of the code)
>> ensures that the least significant and most significant byte is always
>> intact on whatever platform the buffer is deserialized. I don't agree, any
>> suggestions ?
>
> Your colleague is correct. Note that the code assumes that all
> platforms use the same type of longs, barring byte order. This isn't
> true - e.g. sign-magnitude, 1s-complement, 16-bit chars, 64-bit longs,
> etc. It is true on most 32-bit desktop platforms though, they have
> 8-bit chars, 32-bit longs and use 2s-complement for negative numbers.

I don't see this as something that can fail [regardless of how the actual
data is stored]. If you have a type which is at least 32-bits then
val&0xFF000000UL is always "defined". All this means is that on platforms
where they store integer types using fluxums and kawalachums instead of bits
they will have to EMULATE!

It's just like platforms with no FPU or support for 32-bit types. They have
to emulate them with stuff they do have.

So yes, you can portably store/load any integer type in an array of unsigned
chars.

Tom

Nov 14 '05 #3

Martijn Lievaart

On Fri, 23 Jan 2004 13:39:04 +0000, Tom St Denis wrote:

> I don't see this as something that can fail [regardless of how the actual
> data is stored]. If you have a type which is at least 32-bits then
> val&0xFF000000UL is always "defined". All this means is that on platforms
> where they store integer types using fluxums and kawalachums instead of bits
> they will have to EMULATE!
No, you are assuming that all computers use the same layout for binary
numbers. That assumption is not true. Computers that use ones-complement
(do these exist in reality any more?) store numbers in a different way
than computers using two's complement. If you use this method of
transporting between ones- and two's-complement machines, it will only work
for positive numbers.

Also, transporting this way when there are more than 32 bits will lose
information. Again, this will not work for negative numbers, even in the
more common two's complement. And because the OP mentioned that this was
about transporting a long, there are machines out there that have 64 bit
long.
> It's just like platforms with no FPU or support for 32-bit types. They have
> to emulate them with stuff they do have.
Not a real comparison. We're talking about systems that have the required
integer types, but happen to store them differently. A better comparison
is to portably store/load floating point types. As the underlying
representations differ from implementation to implementation, this cannot
be done.
> So yes, you can portably store/load any integer type in an array of unsigned
> chars.

No, you can at most portably store/load positive integers. This is
guaranteed by both C and C++ IIRC. The C++ standard has some vague wording
on the requirements on integer types that boils down to "unsigned integer
types must use normal binary encoding, positive integers stored in signed
integer types must have the same bit pattern as their unsigned
counterpart". I don't have the C standard, but I know it has a slightly
different wording that basically boils down to the same.

Now in practice, all computers nowadays use two's complement, so in
practice this will work
- between machines that use 32-bit longs.
- when your values are positive and have no more than 32 bits (provided
you zeroed out the extra bits beforehand).

HTH
M4

Nov 14 '05 #4

Martijn Lievaart

On Fri, 23 Jan 2004 12:37:23 +0100, RA Scheltema wrote:

> hi all,
> A small question about serializing and deserializing a long in a platform
> independent manner. Can this be done with the following code ?:
> char buf[4];
> long val = 35456;
>
> /* serialize ... on for example intel */
> buf[0] = (unsigned char) ((val & 0xff000000) >> 24);
> buf[1] = (unsigned char) ((val & 0x00ff0000) >> 16);
> buf[2] = (unsigned char) ((val & 0x0000ff00) >> 8);
> buf[3] = (unsigned char) ((val & 0x000000ff) >> 0);
>
> /* deserialize ... on for example mac */
> val = 0;
> val = val | ((unsigned long) buf[0]) << 24;
> val = val | ((unsigned long) buf[1]) << 16;
> val = val | ((unsigned long) buf[2]) << 8;
> val = val | ((unsigned long) buf[3]) << 0;
> According to a colleague of mine, the & (in the first part of the code)
> ensures that the least significant and most significant byte is always
> intact on whatever platform the buffer is deserialized. I don't agree, any
> suggestions ?

See my other reply in this thread on why this is a bad idea. It only works
in some situations.

Three other solutions come to mind.

- If your platform has htonl/ntohl (most do), it is an easy way to achieve
the same and much more portably.

- Use integer arithmetic instead of bitwise operations.

- My favorite: transport as text, not binary.
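
As an illustration of the second suggestion above (integer arithmetic instead of bitwise operations), here is a minimal sketch. The function names put_u32_arith and get_u32_arith are made up for this example, and it deliberately handles only unsigned values up to 32 bits; negative values still need a separate convention.

```c
/* Hypothetical helpers (names are not from the thread).
   Store the low 32 bits of v in buf, most significant octet first,
   using only remainder and division. */
void put_u32_arith(unsigned char buf[4], unsigned long v)
{
    int i;

    for (i = 3; i >= 0; --i) {
        buf[i] = (unsigned char)(v % 256UL); /* lowest remaining octet */
        v /= 256UL;                          /* drop that octet */
    }
}

/* Rebuild the value from four octets, most significant first. */
unsigned long get_u32_arith(const unsigned char buf[4])
{
    unsigned long v = 0;
    int i;

    for (i = 0; i < 4; ++i)
        v = v * 256UL + buf[i];
    return v;
}
```

Because only the values are involved, never the object representation, the byte order of either machine does not matter. If unsigned long is wider than 32 bits, anything above the low 32 bits is discarded by put_u32_arith.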

HTH,
M4

Nov 14 '05 #5

Tom St Denis

"Martijn Lievaart" <m@remove.this. part.rtij.nl> wrote in message
news:pa******** *************** ****@remove.thi s.part.rtij.nl. ..

> On Fri, 23 Jan 2004 13:39:04 +0000, Tom St Denis wrote:
>
>> I don't see this as something that can fail [regardless of how the actual data is stored]. If you have a type which is at least 32-bits then
>> val&0xFF000000UL is always "defined". All this means is that on platforms where they store integer types using fluxums and kawalachums instead of bits they will have to EMULATE!
> No, you are assuming that all computers use the same layout for binary
> numbers. That assumption is not true. Computers that use ones-complement
> (do these exist in reality any more?) store numbers in a different way
> than computers using two complement. If you use this method of
> transporting between one- and two-complement machines, it will only work
> for positive numbers.

I don't see that as being valid. "unsigned long" must have at least 32-bits
of precision.

By your logic

unsigned long x, y;

y = 255UL*256UL*256UL*256UL;
x = some_func();
x &= y;
x >>= 24;

Is undefined because x/y may not be a 2s complement?

WRONG. The value of x will lie in 0..255 and will be the bits 24..31 of the
return of some_func(). In reality this "might use walazaums for bits"
comes into play if you memcpy or otherwise directly copy. So on a 1s
complement machine it would have to emulate as appropriate.

For example, ARMv4 processors don't have FPUs. By your logic

float x = 4.0;

is undefined?
> Also, transporting this way when there are more than 32 bits will lose
> information. Again, this will not work for negative numbers, even in the
> more common two's complement. And because the OP mentioned that this was
> about transporting a long, there are machines out there that have 64 bit
> long.

Yeah you have to specify precision. However, many algorithms use fixed
precision (re: block ciphers).

Tom

Nov 14 '05 #6

Martijn Lievaart

On Fri, 23 Jan 2004 15:01:06 +0000, Tom St Denis wrote:

>> No, you are assuming that all computers use the same layout for binary
>> numbers. That assumption is not true. Computers that use ones-complement
>> (do these exist in reality any more?) store numbers in a different way
>> than computers using two's complement. If you use this method of
>> transporting between ones- and two's-complement machines, it will only work
>> for positive numbers.
> I don't see that as being valid. "unsigned long" must have at least 32-bits
> of precision.

Yes.

> By your logic
>
> unsigned long x, y;
Hey, where did that unsigned creep in? Maybe you want to reread what I
said.

> y = 255UL*256UL*256UL*256UL;
> x = some_func();
> x &= y;
> x >>= 24;
>
> Is undefined because x/y may not be a 2s complement?
I said no such thing.

> WRONG. The value of X will lie in 0..255 and will be the bits 24..31 of
I'm not wrong, you are reading wrong. And please lose the caps, it's
annoying.
> the return of some_func(). In reality this "might use walazaums for
> bits" comes into play if you memcpy or otherwise directly copy. So on a
> 1s complement machine it would have to emulate as appropriate.
There is nothing to emulate on a ones complement machine. It can just use
its native types, which happen to have different representations for
negative numbers than the more common twos complement. Completely valid
in both C and C++, no walazaums involved anywhere.

You might want to read up on what happens when converting negative signed
long values to unsigned long, because that is exactly what we are facing
here.
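
For readers who want that conversion spelled out: C defines the conversion of a negative long to unsigned long by value (the result is the value plus ULONG_MAX + 1), not by bit pattern. A minimal sketch of what that gives, using a hypothetical helper name long_to_u32 and assuming only the low 32 bits are to be sent:

```c
/* Hypothetical helper (not from the thread). Converting to unsigned long
   is defined on the value, so the result is the same on two's-complement,
   ones-complement and sign-magnitude machines. The mask then keeps the
   low 32 bits in case unsigned long is wider. */
unsigned long long_to_u32(long val)
{
    return (unsigned long)val & 0xFFFFFFFFUL;
}
```

For -1 this yields 0xFFFFFFFF on every conforming implementation, which happens to be the 32-bit two's-complement pattern. The conversion in the opposite direction, from an out-of-range unsigned long back to long, carries no such guarantee.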

> For example, ARMv4 processors don't have FPUs. By your logic
>
> float x = 4.0;
>
> is undefined?
What twist of logic are you trying to achieve here? I'm positively baffled
by your conclusion, I cannot follow you.

>> Also, transporting this way when there are more than 32 bits will lose
>> information. Again, this will not work for negative numbers, even in
>> the more common two's complement. And because the OP mentioned that
>> this was about transporting a long, there are machines out there that
>> have 64 bit long.
>
> Yeah you have to specify precision. However, many algorithms use fixed
> precision (re: block ciphers).

Obvious. When transporting between machines you'll always have to specify
the valid ranges.

M4

Nov 14 '05 #7

tom_usenet

On Fri, 23 Jan 2004 15:01:06 GMT, "Tom St Denis" <to********@iah u.ca>
wrote:

"Martijn Lievaart" <m@remove.this. part.rtij.nl> wrote in message
news:pa******* *************** *****@remove.th is.part.rtij.nl ...
>> On Fri, 23 Jan 2004 13:39:04 +0000, Tom St Denis wrote:
>>
>>> I don't see this as something that can fail [regardless of how the actual
>>> data is stored]. If you have a type which is at least 32-bits then
>>> val&0xFF000000UL is always "defined". All this means is that on platforms
>>> where they store integer types using fluxums and kawalachums instead of bits
>>> they will have to EMULATE!
>> No, you are assuming that all computers use the same layout for binary
>> numbers. That assumption is not true. Computers that use ones-complement
>> (do these exist in reality any more?) store numbers in a different way
>> than computers using two complement. If you use this method of
>> transporting between one- and two-complement machines, it will only work
>> for positive numbers.
>
> I don't see that as being valid. "unsigned long" must have at least 32-bits
> of precision.

He just said it is valid for positive numbers! What has "unsigned
long" got to do with negative numbers?

> By your logic
>
> unsigned long x, y;
Where did "unsigned long" come from? The OP was using "long".

> y = 255UL*256UL*256UL*256UL;
> x = some_func();
> x &= y;
> x >>= 24;
>
> Is undefined because x/y may not be a 2s complement?

2s complement doesn't apply to unsigned types. It is a convenient way
of representing negative numbers in binary.

Tom


Nov 14 '05 #8

Dan Pop

In <40************ *********@news. xs4all.nl> "RA Scheltema" <r.a.scheltem a[viral][s][p]@[m]dacolian.nl> writes:

> A small question about serializing and deserializing a long in a platform
> independent manner. Can this be done with the following code ?:
It still assumes that longs are 32-bit entities (4 bytes x 8 bits) on
both platforms. There is no easy way of eliminating this assumption,
short of using a textual representation of the value, instead of a binary
one, i.e. serialise with sprintf and deserialise with sscanf and convert
the native strings to and from BCD (to also remove the assumption that
both platforms use the same character set).
> char buf[4];
MUST be unsigned char.
> long val = 35456;
MUST be either an unsigned long or contain a positive value. Otherwise,
see below.
> /* serialize ... on for example intel */
> buf[0] = (unsigned char) ((val & 0xff000000) >> 24);
> buf[1] = (unsigned char) ((val & 0x00ff0000) >> 16);
> buf[2] = (unsigned char) ((val & 0x0000ff00) >> 8);
> buf[3] = (unsigned char) ((val & 0x000000ff) >> 0);
All the casts to unsigned char are superfluous.
> /* deserialize ... on for example mac */
> val = 0;
> val = val | ((unsigned long) buf[0]) << 24;
If the original value was negative, additional assumptions are needed:
both platforms use the same representation for negative values and the
conversion of an unsigned long value that cannot be represented by a long
preserves the bit pattern. Both assumptions are reasonable, but neither
is guaranteed by the language.
> val = val | ((unsigned long) buf[1]) << 16;
> val = val | ((unsigned long) buf[2]) << 8;
> val = val | ((unsigned long) buf[3]) << 0;
>
> According to a colleague of mine, the & (in the first part of the code)
> ensures that the least significant and most significant byte is always
> intact on whatever platform the buffer is deserialized. I don't agree, any
> suggestions ?

He is perfectly right. Because you're operating on the full
representation of the value, you can be sure that buf[0] will contain
the most significant byte of the value, regardless of the byte order.
And because the value is reconstructed using arithmetic operations,
you can also be sure that the result is correct, again regardless of the
byte order. But getting the byte order right is not enough if you need
to deal with negative values, too.

The proper handling of negative values without the additional assumptions
mentioned above is easy if the implementation also supports long long's
or some other form of integer that provides more than 32 bits. The
first step requires assigning val to uval, an unsigned long variable.
The result is independent of the way negative values are represented.
Serialise and deserialise uval.

typedef long long big_t;

/* uval is the deserialised value; ULONG_MAX comes from <limits.h> */
if ((uval & 0x80000000) != 0)
    val = (big_t)uval - (big_t)ULONG_MAX - 1;
else
    val = uval;

As you can see, doing the job right, even in a way that is not 100% platform
independent, is more complex than just taking care of the byte order.
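
Putting the points above together, a complete round trip might look like the sketch below. This is an assumption-laden illustration, not the code from the post: the names put_long32 and get_long32 are hypothetical, the wire format is taken to be 32-bit two's complement, most significant octet first, and the decoding step uses plain long arithmetic in place of the long long approach described above.

```c
/* Hypothetical helpers (names are not from the thread).
   Encode the low 32 bits of val, most significant octet first. */
void put_long32(unsigned char buf[4], long val)
{
    /* Value-based conversion: well defined for negative val as well. */
    unsigned long u = (unsigned long)val & 0xFFFFFFFFUL;

    buf[0] = (unsigned char)((u >> 24) & 0xFFUL);
    buf[1] = (unsigned char)((u >> 16) & 0xFFUL);
    buf[2] = (unsigned char)((u >> 8) & 0xFFUL);
    buf[3] = (unsigned char)(u & 0xFFUL);
}

/* Decode four octets into a long without converting an out-of-range
   unsigned long to long. */
long get_long32(const unsigned char buf[4])
{
    unsigned long u = ((unsigned long)(buf[0] & 0xFFu) << 24)
                    | ((unsigned long)(buf[1] & 0xFFu) << 16)
                    | ((unsigned long)(buf[2] & 0xFFu) << 8)
                    |  (unsigned long)(buf[3] & 0xFFu);

    if (u & 0x80000000UL) {
        /* Negative: the value is u - 2^32, computed as
           -(2^32 - 1 - u) - 1 so every intermediate fits in a long. */
        return -(long)(0xFFFFFFFFUL - u) - 1;
    }
    return (long)u;
}
```

One value still cannot be decoded everywhere: -2147483648 is not representable in a 32-bit long on ones-complement or sign-magnitude machines, so the valid range has to be agreed on in any case.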

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de

Nov 14 '05 #9

Sean Kelly

You might also want to look at the socket calls htonl() and ntohl().
Sean
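
A minimal sketch of that suggestion, assuming a POSIX system where htonl() and ntohl() are declared in <arpa/inet.h> and the uint32_t type exists. The wrapper names put_u32_net and get_u32_net are hypothetical. Note that these calls work on 32-bit unsigned values, so they fix the byte order only; the width and sign issues discussed earlier in the thread remain.

```c
#include <arpa/inet.h> /* htonl, ntohl (POSIX) */
#include <stdint.h>
#include <string.h>

/* Hypothetical wrapper: store v in network (big-endian) byte order. */
void put_u32_net(unsigned char buf[4], uint32_t v)
{
    uint32_t be = htonl(v);      /* host order -> network order */

    memcpy(buf, &be, sizeof be); /* copy the object representation */
}

/* Hypothetical wrapper: read a value stored in network byte order. */
uint32_t get_u32_net(const unsigned char buf[4])
{
    uint32_t be;

    memcpy(&be, buf, sizeof be);
    return ntohl(be);            /* network order -> host order */
}
```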

Nov 14 '05 #10
