# Anyone recognize this numeric storage format - similar to "float", but not quite

We are working on a project to decipher the record structure of an old
accounting system that dates from the late '80s to mid-'90s.
We have come across a number format that appears to be a "float" but
doesn't match any of the more standard implementations,
so we are hoping this is a recognizable number storage format with an
identifiable name AND a pre-built conversion method
similar to the "struct" module available in Python.

Here is what we have determined so far.

Example Number: 1234567890

This gets stored on disk as 8 bytes, resulting in the following HEX
characters;
00 00 00 A4 05 2c 13 9f

If we reverse the byte order (the on-disk order is already "little Endian"), we get;
9F 13 2c 05 A4 00 00 00

If the HEX is converted to binary it looks like;
10011111 00010011 00101100 00000101 10100100 00000000 00000000
00000000
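
The byte reversal and binary dump above can be reproduced in a few lines of Python. This is only a sketch; the variable names are made up for illustration.

```python
# Bytes exactly as they appear on disk (from the example above).
on_disk = bytes.fromhex("000000A4052C139F")

# Reverse the byte order so the last byte on disk comes first.
reversed_bytes = on_disk[::-1]

# Render each byte as a two-digit hex group and as an 8-bit binary group.
hex_groups = " ".join(f"{byte:02X}" for byte in reversed_bytes)
binary_groups = " ".join(f"{byte:08b}" for byte in reversed_bytes)

print(hex_groups)
print(binary_groups)
```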

If the example number 1234567890 is converted to binary it looks like;

10010011 00101100 00000101 1010010

To extract the example number, you need to do the following;
1) take the decimal value of the first byte (in the reversed order) and
subtract 128 (here 0x9F = 159, so 159 - 128 = 31)
2) This tells you how many of the following bits are significant
3) Once those bits are read, reverse the first bit of that
group (i.e. if it is a 0, make it a 1)
4) convert the result to decimal
.... and presto, the example number !
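
The four steps above can be sketched in Python as follows. The function name `decode_by_steps` is hypothetical, step 3 is implemented as "set the first bit to 1" (matching the example in parentheses), and the sketch only covers positive whole numbers with 1 to 56 significant bits, like the examples in this post.

```python
def decode_by_steps(hex_string: str) -> int:
    """Apply the four steps to an 8-byte value given as on-disk hex."""
    reversed_bytes = bytes.fromhex(hex_string)[::-1]

    # Step 1: first byte minus 128 gives the number of significant bits.
    bit_count = reversed_bytes[0] - 128
    if not 1 <= bit_count <= 56:
        raise ValueError("this sketch only handles 1 to 56 significant bits")

    # Step 2: read that many bits from the remaining 7 bytes (56 bits).
    all_bits = "".join(f"{byte:08b}" for byte in reversed_bytes[1:])
    significant = all_bits[:bit_count]

    # Step 3: force the first bit of the group to 1.
    significant = "1" + significant[1:]

    # Step 4: convert the bit string to decimal.
    return int(significant, 2)


print(decode_by_steps("00 00 00 A4 05 2c 13 9f"))  # 1234567890
```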

Using a fixed width font it is easy to see the match at the bit level;

10011111 00010011001011000000010110100100000000000000000000000000
-------- 1001001100101100000001011010010

If you are interested, the following are two other examples;

Orig Hex: 00 00 00 60 92 96 72 A0
Actual Value: 4069954144

Orig Hex: 00 00 80 22 A3 26 3C A1
Actual Value: 6313297477

So ... does anyone recognize this?
Is there a "built-in" conversion method in Python?

Aug 24 '05 #1
This appears to be a repost, perhaps not by the OP but due to a glitch.

<ge********@hotmail.com> wrote in message
We are working on a project to decipher the record structure of an old
accounting system that dates from the late '80s to mid-'90s.
We have come across a number format that appears to be a "float" but
doesn't match any of the more standard implementations,
so we are hoping this is a recognizable number storage format with an
identifiable name AND a pre-built conversion method
similar to the "struct" module available in Python.

[...]

Aug 24 '05 #2
On 23 Aug 2005 19:04:45 -0700, ge********@hotmail.com wrote:
We are working on a project to decipher the record structure of an old
accounting system that dates from the late '80s to mid-'90s.
We have come across a number format that appears to be a "float" but
doesn't match any of the more standard implementations,
so we are hoping this is a recognizable number storage format with an
identifiable name AND a pre-built conversion method
similar to the "struct" module available in Python.

[...]

Not looking too closely, but I recall something similar (although I suspect that the bit you
are "reversing" is a sign bit that shadows a known constant MSB 1 for non-zero numbers, and
shouldn't just be reversed):

Regards,
Bengt Richter
Aug 24 '05 #3
On Wed, 24 Aug 2005 04:10:07 -0400, "Terry Reedy" <tj*****@udel.edu> wrote:
<ge********@hotmail.com> wrote in message
We are working on a project to decipher the record structure of an old
accounting system that dates from the late '80s to mid-'90s.
We have come across a number format that appears to be a "float" but
doesn't match any of the more standard implementations,
so we are hoping this is a recognizable number storage format with an
identifiable name AND a pre-built conversion method
similar to the "struct" module available in Python.
[...]

<moved from top-posted position>This appears to be a repost, perhaps not by the op but due to a glitch

</moved>
UIAM the more or less recent original you are thinking of turned out to be
straight IEEE double format, and I think this is not, though I think it looks
like one that was answered (by me ;-) quite a while ago (Dec 1 2003).

Regards,
Bengt Richter
Aug 24 '05 #4
Thanks Bengt for directing me to your previous post.
I think I agree with you on the "reversing bit" and the constant MSB.
In reworking my examples I was always changing the 0 to 1.
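
A minimal sketch of that reading, assuming the "reversed" bit is a sign bit sitting in the position of an implied leading 1 of the mantissa, and assuming an exponent byte of 0 encodes zero. Neither assumption is confirmed in this thread, and the function name `decode_float8` is hypothetical. As far as I know, the `struct` module has no format code for this layout, so the arithmetic is done by hand.

```python
import math


def decode_float8(raw: bytes) -> float:
    """Decode 8 on-disk bytes: 7 mantissa bytes (low byte first), then the exponent."""
    if len(raw) != 8:
        raise ValueError("expected exactly 8 bytes")

    exponent = raw[7]
    if exponent == 0:
        # Assumption: an exponent byte of 0 means the value is zero.
        return 0.0

    # The top bit of the most significant mantissa byte is read as the sign.
    negative = bool(raw[6] & 0x80)

    # Put the implied leading 1 back where the sign bit was stored.
    mantissa = int.from_bytes(raw[:7], "little") | (1 << 55)

    # The mantissa is a 56-bit fraction, and the exponent is biased by 128.
    value = math.ldexp(mantissa, exponent - 128 - 56)
    return -value if negative else value


print(decode_float8(bytes.fromhex("000000A4052C139F")))  # 1234567890.0
```

With the sign bit set (0x13 becomes 0x93 in the seventh byte), the same bytes would decode to the negative of that value under this reading.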

Aug 24 '05 #5
I am not sure if you are still watching this thread, but I seem to have
a bit of a problem with the code sample you so graciously provided.
It seems to work in all instances except the original example I
provided (namely, 1234567890). On my system, the number 1234567890
gets converted to 1234567895.5.

I made a few changes to your original program, but it is largely the
same with different test samples. Any thoughts ??

Sample Code Below ----------------------
# Conversion of Microsoft Binary Format numbers to Python Floats

import binascii as bn
import struct as st

data = [(1234567890,'000000AF052C139F'),
(4069954144,'00000060929672A0'),
( 33333.33, 'b047e17a54350290'),
( 1500.34, '7814ae47e18a3b8b'),
( 42345.00, '0000000000692590'),
]

def msd2float(bytes):
    # take out values that don't make sense, possibly the NaN and Infinity ??
    if sum(bytes) in [0,72,127]:
        return 0.0
    b = bytes[:]
    sign = bytes[-2]&0x80
    b[-2] |= 0x80  # hidden most sig bit in place of sign
    exp = bytes[-1] - 0x80 - 56  # exponent offset
    acc = 0L
    for i,byte in enumerate(b[:-1]):
        acc |= (long(byte)<<(i*8))
    return (float(acc)*2.0**exp)*((1.,-1.)[sign!=0])

for line in data:
    val = line[0]
    binval = bn.unhexlify(line[1])
    le_bytes = list(st.unpack('BBBBBBBB',binval))
    test = msd2float(le_bytes)
    print " In:",val, "\nOut:",test,"\n"

Sample Output ------------------------
C:/Python24/pythonw.exe -u "C:/pytest/dms/Test MBF.pyw"
In: 1234567890
Out: 1234567895.5

In: 4069954144
Out: 4069954144.0

In: 999999.99
Out: 999999.99

In: 88888.88
Out: 88888.88

In: 22222.22
Out: 22222.22

In: 33333.33
Out: 33333.33

In: 1500.34
Out: 1500.34

In: 42345.0
Out: 42345.0

Aug 25 '05 #6
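
One thing worth checking, offered as a possible explanation rather than a confirmed answer: the hex string for 1234567890 in the test data reads `000000AF052C139F`, while the bytes given in the first post are `00 00 00 A4 05 2c 13 9f`. The sketch below is a Python 3 rendering of the same arithmetic as `msd2float` (without its special-case check); the name `mbf_double_to_float` is hypothetical. It suggests that this single differing byte would account for the 5.5 discrepancy.

```python
import math


def mbf_double_to_float(hex_string: str) -> float:
    """Python 3 sketch of the arithmetic used in msd2float."""
    raw = bytes.fromhex(hex_string)
    sign = -1.0 if raw[6] & 0x80 else 1.0
    # Restore the hidden most significant bit in place of the sign bit.
    mantissa = int.from_bytes(raw[:7], "little") | (1 << 55)
    # Same exponent offset as the original code: byte - 0x80 - 56.
    return sign * math.ldexp(mantissa, raw[7] - 0x80 - 56)


from_first_post = "000000A4052C139F"  # bytes quoted in the first post
from_test_data = "000000AF052C139F"   # string used in the data list

print(mbf_double_to_float(from_first_post))  # 1234567890.0
print(mbf_double_to_float(from_test_data))   # 1234567895.5
```

The two strings differ by 0x0B in the fourth byte; that byte sits 24 bits up in the mantissa and the value is scaled by 2**-25, so the difference works out to 11 / 2 = 5.5.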
