unsigned 32 bit arithmetic type?

Robin Becker

Hi, just trying to avoid wheel reinvention. I have need of an unsigned 32 bit
arithmetic type to carry out a checksum operation and wondered if anyone had
already defined such a beast.

Our current code works with 32 bit cpu's, but is failing with 64 bit
comparisons; it's clearly wrong as we are comparing a number with a negated
number; the bits might drop off in 32 bits, but not in 64.
--
Robin Becker

Oct 25 '06 #1


Martin v. Löwis

Robin Becker wrote:

> Hi, just trying to avoid wheel reinvention. I have need of an unsigned
> 32 bit arithmetic type to carry out a checksum operation and wondered if
> anyone had already defined such a beast.
>
> Our current code works with 32 bit cpu's, but is failing with 64 bit
> comparisons; it's clearly wrong as we are comparing a number with a
> negated number; the bits might drop off in 32 bits, but not in 64.

Not sure what operations you are doing: In Python, bits never drop off
(at least not in recent versions).

If you need to drop bits, you need to do so explicitly, by using the
bit mask operations. I could tell you more if you'd tell us what
the specific operations are.
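
As a minimal sketch of what explicit masking looks like for addition (the names `MASK32` and `add_u32` are made up for illustration and do not come from the reportlab code), assuming a current Python where integers have arbitrary precision:

```python
MASK32 = 0xFFFFFFFF  # keeps only the low 32 bits


def add_u32(x, y):
    """Return (x + y) modulo 2**32, like C unsigned 32-bit addition."""
    return (x + y) & MASK32


print(hex(add_u32(0xFFFFFFFF, 1)))  # wraps around to 0x0
print(hex(add_u32(0x7FFFFFFF, 1)))  # 0x80000000, stays non-negative
print(hex(-1 & MASK32))             # 0xffffffff: masking maps negatives into range
```

The result is always in the range 0 to 2**32 - 1, independent of the machine's word size.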

Regards,
Martin

Oct 25 '06 #2

Robin Becker

Martin v. Löwis wrote:

> Robin Becker wrote:
>> Hi, just trying to avoid wheel reinvention. I have need of an unsigned
>> 32 bit arithmetic type to carry out a checksum operation and wondered if
>> anyone had already defined such a beast.
>>
>> Our current code works with 32 bit cpu's, but is failing with 64 bit
>> comparisons; it's clearly wrong as we are comparing a number with a
>> negated number; the bits might drop off in 32 bits, but not in 64.
>
> Not sure what operations you are doing: In Python, bits never drop off
> (at least not in recent versions).
>
> If you need to drop bits, you need to do so explicitly, by using the
> bit mask operations. I could tell you more if you'd tell us what
> the specific operations are.

This code is in a contribution to the reportlab toolkit that handles TTF fonts.
The fonts contain checksums computed using 32-bit arithmetic. The original
C definition is as follows

ULONG CalcTableChecksum(ULONG *Table, ULONG Length)
{
    ULONG Sum = 0L;
    ULONG *EndPtr = Table + ((Length + 3) & ~3) / sizeof(ULONG);

    while (Table < EndPtr)
        Sum += *Table++;
    return Sum;
}

so effectively we're doing only additions and letting bits roll off the end.

Of course the actual semantics is dependent on what C unsigned arithmetic does
so we're relying on that being the same everywhere.

This algorithm was pretty simple in Python until 2.3 when shifts over the end of
ints started going wrong. For some reason we didn't do the obvious and just do
everything in longs and just mask off the upper bits. For some reason (probably
my fault) we seem to have accumulated code like

def _L2U32(L):
    '''convert a long to u32'''
    return unpack('l',pack('L',L))[0]

if sys.hexversion>=0x02030000:
    def add32(x, y):
        "Calculate (x + y) modulo 2**32"
        return _L2U32((long(x)+y) & 0xffffffffL)
else:
    def add32(x, y):
        "Calculate (x + y) modulo 2**32"
        lo = (x & 0xFFFF) + (y & 0xFFFF)
        hi = (x >> 16) + (y >> 16) + (lo >> 16)
        return (hi << 16) | (lo & 0xFFFF)

def calcChecksum(data):
    """Calculates TTF-style checksums"""
    if len(data)&3: data = data + (4-(len(data)&3))*"\0"
    sum = 0
    for n in unpack(">%dl" % (len(data)>>2), data):
        sum = add32(sum,n)
    return sum

and also silly stuff like

def testAdd32(self):
    "Test add32"
    self.assertEquals(add32(10, -6), 4)
    self.assertEquals(add32(6, -10), -4)
    self.assertEquals(add32(_L2U32(0x80000000L), -1), 0x7FFFFFFF)
    self.assertEquals(add32(0x7FFFFFFF, 1), _L2U32(0x80000000L))

def testChecksum(self):
    "Test calcChecksum function"
    self.assertEquals(calcChecksum(""), 0)
    self.assertEquals(calcChecksum("\1"), 0x01000000)
    self.assertEquals(calcChecksum("\x01\x02\x03\x04\x10\x20\x30\x40"), 0x11223344)
    self.assertEquals(calcChecksum("\x81"), _L2U32(0x81000000L))
        _L2U32(0x80000000L))

where, while it might be reasonable to do testing, it seems the tests aren't very
sensible, e.g. what is -6 doing in a u32 test? This stuff just about works on a 32
bit machine, but is failing miserably on a 64bit AMD. As far as I can see I just
need to use masked longs throughout.
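
One plausible contributor to the 64-bit failure (an assumption, since the thread does not show the failing output) is that `_L2U32` uses the struct codes 'l' and 'L' in native mode, where their size follows the platform's C long. The sketch below shows the difference between native and standard sizes; `to_signed32` is a hypothetical helper, not part of reportlab:

```python
import struct

# Native mode (no prefix): 'l'/'L' follow the platform's C long,
# typically 4 bytes on 32-bit builds and 8 bytes on LP64 systems.
print(struct.calcsize("l"))

# Standard mode ('=', '<', '>' or '!'): 'l'/'L' are always 4 bytes.
print(struct.calcsize("=l"))


def to_signed32(value):
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    return struct.unpack("=l", struct.pack("=L", value & 0xFFFFFFFF))[0]


print(to_signed32(0x80000000))  # -2147483648 on any platform
```

With a standard-size prefix the conversion no longer depends on the width of the machine's long.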

In a C extension I can still do the computation efficiently on a 32bit machine,
but I need to do masking for a 64 bit machine.
--
Robin Becker

Oct 25 '06 #3

Martin v. Löwis

Robin Becker wrote:

> Of course the actual semantics is dependent on what C unsigned
> arithmetic does so we're relying on that being the same everywhere.

Assuming that ULONG has the same width on all systems, the outcome
is actually mandated by the C standard: unsigned arithmetic is
defined to operate modulo (max_uint+1) (even if that is not a power
of two).

> This algorithm was pretty simple in Python until 2.3 when shifts over
> the end of ints started going wrong.

Actually, they started going *right* :-) Addition of two positive numbers
never gives a negative result, in mathematics.

> where, while it might be reasonable to do testing, it seems the tests
> aren't very sensible, e.g. what is -6 doing in a u32 test? This stuff just
> about works on a 32 bit machine, but is failing miserably on a 64bit
> AMD. As far as I can see I just need to use masked longs throughout.

Exactly.
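
A rough sketch of the "masked longs throughout" approach, written for a current Python 3 (an assumption; the thread itself is about Python 2). The name `calc_checksum` is illustrative only. It unpacks the words as unsigned, so no negative intermediate values appear, and masks after each addition:

```python
import struct

MASK32 = 0xFFFFFFFF


def calc_checksum(data):
    """Sum big-endian 32-bit words of data modulo 2**32 (TTF-style)."""
    # Pad with zero bytes to a multiple of four, as the original code does.
    if len(data) & 3:
        data = data + b"\0" * (4 - (len(data) & 3))
    total = 0
    # ">%dL" reads unsigned big-endian words in standard (4-byte) size.
    for word in struct.unpack(">%dL" % (len(data) >> 2), data):
        total = (total + word) & MASK32
    return total


print(hex(calc_checksum(b"\x01\x02\x03\x04\x10\x20\x30\x40")))  # 0x11223344
```

Because the result is always non-negative, test expectations can be written as plain unsigned constants such as 0x81000000.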

> In a C extension I can still do the computation efficiently on a 32bit
> machine, but I need to do masking for a 64 bit machine.

Well, no. You just need to find a 32-bit unsigned integer type on the
64-bit machine. Typically, "unsigned int" should work fine (with
only the Cray being a notable exception, AFAIK). IOW, replace ULONG
with uint32_t wherever you really mean an unsigned 32-bit type,
then use stdint.h where available, else define it to unsigned int
(with a build-time or run-time test whether sizeof(unsigned int)==4).
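
For anyone who wants to try the same idea from Python rather than C, here is a small sketch using the standard ctypes module (assuming a Python version that ships it). It is only an illustration of a fixed-width unsigned type and of the run-time size check, not a suggestion for how the reportlab extension should be written:

```python
import ctypes

# c_uint32 is a fixed-width type regardless of the platform's int/long sizes.
print(ctypes.sizeof(ctypes.c_uint32))  # 4

# The run-time check described above, done from Python; the answer
# depends on the platform's C compiler.
print(ctypes.sizeof(ctypes.c_uint) == 4)

# ctypes integer types do no overflow checking, so values wrap modulo 2**32.
print(ctypes.c_uint32(0xFFFFFFFF + 1).value)  # 0
print(ctypes.c_uint32(-1).value)              # 4294967295
```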

Regards,
Martin

Oct 25 '06 #4

sturlamolden

Robin Becker wrote:

> ULONG CalcTableChecksum(ULONG *Table, ULONG Length)
> {
>     ULONG Sum = 0L;
>     ULONG *EndPtr = Table + ((Length + 3) & ~3) / sizeof(ULONG);
>
>     while (Table < EndPtr)
>         Sum += *Table++;
>     return Sum;
> }

Is this what you want?

import numpy
def CalcTableChecksum(Table, Length=None):
    tmp = numpy.array(Table,dtype=numpy.uint32)
    if Length == None: Length = tmp.size
    endptr = ((Length+3) & ~3) / 4
    return (tmp[0:endptr]).sum()

import numpy as nx
type(nx.array([1,2,3],dtype=nx.uint32)[0])

Oct 25 '06 #5

sturlamolden

Robin Becker wrote:

> ULONG CalcTableChecksum(ULONG *Table, ULONG Length)
> {
>     ULONG Sum = 0L;
>     ULONG *EndPtr = Table + ((Length + 3) & ~3) / sizeof(ULONG);
>
>     while (Table < EndPtr)
>         Sum += *Table++;
>     return Sum;
> }

Oct 25 '06 #6

Robin Becker

sturlamolden wrote:

> import numpy
> def CalcTableChecksum(Table, Length=None):
>     tmp = numpy.array(Table,dtype=numpy.uint32)
>     if Length == None: Length = tmp.size
>     endptr = ((Length+3) & ~3) / 4
>     return (tmp[0:endptr]).sum()

it's probably wonderful, but I don't think I can ask people to add numpy to the
list of requirements for reportlab :)

I used to love its predecessor Numeric, but it was quite large.

--
Robin Becker

Oct 25 '06 #7

sturlamolden

Robin Becker wrote:

> it's probably wonderful, but I don't think I can ask people to add numpy to the
> list of requirements for reportlab :)

Maybe NumPy makes it into the core Python tree one day. At some point
Python users other than die-hard scientists and mathematicians will
realise that for and while loops are the root of all evil when doing
CPU bound operations in an interpreted language. Array slicing and
vectorised statements can be faster by astronomical proportions.
Here is one example: http://tinyurl.com/y79zhc

This statement that required twenty seconds to execute

dim = size(infocbcr);

image = zeros(dim(1), dim(2));

for i = 1:dim(1)
    for j = 1:dim(2)
        cb = double(infocbcr(i,j,2));
        cr = double(infocbcr(i,j,3));
        x = [(cb-media_b); (cr-media_r)];
        %this gives a mult of 1*2 * 2*2 * 2*1
        image(i,j) = exp(-0.5* x'*inv(brcov)* x);
    end
end

could be replaced with an equivalent condensed statement that only
required a fraction of a second:

image = reshape(exp(-0.5*sum(((chol(brcov)')\ ...
    ((reshape(double(infocbcr(:,:,2:3)),dim(1)*dim(2),2)')...
    -repmat([media_b;media_r],1,dim(1)*dim(2)))).^2)'),dim(1),dim(2));

This was Matlab, but the same holds for Python and NumPy. The overhead
in the first code snippet comes from calling the interpreter inside a
tight loop. That is why loops are the root of evilness when doing CPU
bound tasks in an interpreted language. I would think that 9 out of 10
tasks most Python users think require a C extension are actually more
easily solved with NumPy. This is old knowledge from the Matlab
community: even if you think you need a "MEX file" (that is, a C
extension for Matlab), you probably don't. Vectorize and it will be
fast enough.
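
The same idea can be tried on the checksum without NumPy, as a rough sketch: hand the whole tuple of words to the built-in sum() so the loop runs inside the interpreter's C code, and mask once at the end. The function names and the test data are made up for illustration, and the input length is assumed to be a multiple of four:

```python
import struct


def checksum_loop(data):
    """Add the words one at a time in a Python-level loop."""
    total = 0
    for word in struct.unpack(">%dL" % (len(data) // 4), data):
        total = (total + word) & 0xFFFFFFFF
    return total


def checksum_builtin(data):
    """Let the built-in sum() run the loop, then mask once at the end."""
    words = struct.unpack(">%dL" % (len(data) // 4), data)
    return sum(words) & 0xFFFFFFFF


data = bytes(range(256)) * 64  # hypothetical test data, length divisible by 4
assert checksum_loop(data) == checksum_builtin(data)
```

Masking once is enough because Python integers do not overflow, so reducing modulo 2**32 at the end gives the same value as reducing after every addition. How much faster the second form is would need measuring, for example with timeit.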

Oct 25 '06 #8

Robin Becker

sturlamolden wrote:

> Robin Becker wrote:
>> it's probably wonderful, but I don't think I can ask people to add numpy to the
>> list of requirements for reportlab :)

.........

> This was Matlab, but the same holds for Python and NumPy. The overhead
> in the first code snippet comes from calling the interpreter inside a
> tight loop. That is why loops are the root of evilness when doing CPU
> bound tasks in an interpreted language. I would think that 9 out of 10
> tasks most Python users think require a C extension are actually more
> easily solved with NumPy. This is old knowledge from the Matlab
> community: even if you think you need a "MEX file" (that is, a C
> extension for Matlab), you probably don't. Vectorize and it will be
> fast enough.

I think you're preaching to the converted. The very first serious thing I did in
python involved a generational accounting model calculation that was translated
from matlab into Numeric/python. It ran about 10 times faster than matlab and
about 5 times faster than a matlab compiler.
--
Robin Becker

Oct 26 '06 #9
