A salutary tale about alignment and undefined behaviour

Having just upgraded a compiler, I found that the Internet stack in one of
our systems no longer functioned - it appeared that the compiler was not
compiling it correctly.

Closer inspection showed that it was actually a really nasty example of
undefined behaviour being invoked.

#define ETHER_ADDR_LEN 6

struct ether_header {
    u_char  ether_dhost[ETHER_ADDR_LEN];
    u_char  ether_shost[ETHER_ADDR_LEN];
    u_short ether_type;
};

struct sockaddr {
    u_short sa_family;
    char    sa_data[14];
};

{
    u_char edst[ETHER_ADDR_LEN];
    struct sockaddr *dst;
    struct ether_header *eh;
    //...

    eh = (struct ether_header *)dst->sa_data;
    memcpy(edst, eh->ether_dhost, sizeof (edst));
    //...
}

This always used to work, despite there being an alignment problem - all
structures on this platform are word (32-bit) aligned, but sa_data starts
two bytes into struct sockaddr, so eh is incorrectly aligned, invoking
undefined behaviour.
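
The misalignment can be made concrete with offsetof. What follows is a minimal sketch, not code from the stack in question: the u_short typedef and the helper names sa_data_offset and sa_data_is_word_aligned are assumptions added for illustration, and it presumes a 2-byte unsigned short. It only shows where sa_data sits inside struct sockaddr; whether that matters depends on the alignment the compiler demands for struct ether_header, which is 4 bytes on the platform described here.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: BSD-style typedef, as the declarations in the post imply. */
typedef unsigned short u_short;

struct sockaddr {
    u_short sa_family;
    char    sa_data[14];
};

/* Byte offset of sa_data from the start of struct sockaddr. */
size_t sa_data_offset(void)
{
    return offsetof(struct sockaddr, sa_data);
}

/* Nonzero if sa_data lands on a 4-byte boundary whenever the
   enclosing struct sockaddr itself starts on one. */
int sa_data_is_word_aligned(void)
{
    assert(sizeof(u_short) == 2);  /* this sketch assumes a 16-bit short */
    return sa_data_offset() % 4 == 0;
}
```

With a 2-byte sa_family the offset is 2, so any pointer to sa_data in a word-aligned sockaddr is only halfword-aligned.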

Looking at it, it's not clear how it could actually go wrong - you'd
instinctively assume that memcpy will work regardless of the alignment.

But the new compiler recognised memcpy, and decided it might like to inline
it. Further, it knew that the destination was a local array, and thus was
word-aligned on the stack with two bytes of padding after it. It further
"knew" that the source was at the start of a structure, and hence was
word-aligned. It thus decided that the most efficient code to output was an
in-line two-word (8-byte) copy from eh to edst. But eh was not actually
word-aligned, so the load-word instructions failed (silently reading the
wrong data).
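
One way to avoid the problem is never to form the struct ether_header pointer at all. The following is a sketch under assumptions rather than the fix that was actually applied: copy_ether_dhost is a hypothetical helper name, and the typedefs are repeated only to make the block self-contained. Because the source pointer is derived from a char array, the compiler is entitled to assume byte alignment only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ETHER_ADDR_LEN 6

/* Assumption: BSD-style typedefs, as the declarations in the post imply. */
typedef unsigned char  u_char;
typedef unsigned short u_short;

struct ether_header {
    u_char  ether_dhost[ETHER_ADDR_LEN];
    u_char  ether_shost[ETHER_ADDR_LEN];
    u_short ether_type;
};

struct sockaddr {
    u_short sa_family;
    char    sa_data[14];
};

/* Hypothetical helper: copy the destination address out of
   dst->sa_data without casting to struct ether_header *. */
void copy_ether_dhost(u_char edst[ETHER_ADDR_LEN], const struct sockaddr *dst)
{
    assert(edst != NULL && dst != NULL);

    /* The source is a char pointer plus a byte offset, so it
       promises nothing beyond byte alignment. */
    memcpy(edst,
           dst->sa_data + offsetof(struct ether_header, ether_dhost),
           ETHER_ADDR_LEN);
}
```

A call such as copy_ether_dhost(edst, dst) would stand in for the cast and the memcpy in the fragment above.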

The whole thing just goes to show that with today's modern optimising
compilers, you never know when undefined behaviour might bite you. Even
things that instinctively "should work anyway" can go horribly wrong.

--
Kevin Bracey, Principal Software Engineer
Tematic Ltd Tel: +44 (0) 1223 503464
182-190 Newmarket Road Fax: +44 (0) 1728 727430
Cambridge, CB5 8HE, United Kingdom WWW: http://www.tematic.com/
Nov 14 '05 #1
On Fri, 18 Feb 2005 12:50:26 GMT, Kevin Bracey
<ke**********@tematic.com> wrote:

> eh = (struct ether_header *)dst->sa_data;

This is the bit which bit you, not:

> memcpy(edst, eh->ether_dhost, sizeof (edst));

> This always used to work, despite there being an alignment problem - all
> structures on this platform are word (32-bit) aligned, so eh is
> incorrectly aligned, invoking undefined behaviour.

Exactly. On some machines the assignment to eh might cause the pointer
to have bits truncated, or even cause a trap because it's an invalid
pointer. Or the dereferencing as a parameter to memcpy might cause a
trap. Or demons might fly out of your nose.

And that is in spite of it being a very common thing to do in Internet
code, which was mostly written for 'safe' machines where all that was
needed was some endian conversions (that sort of code is in several books
describing how to use sockets, for instance).

> The whole thing just goes to show that with today's modern optimising
> compilers, you never know when undefined behaviour might bite you. Even
> things that instinctively "should work anyway" can go horribly wrong.


Optimisation can cause a lot of problems. Worse than yours is when the
compiler gets its optimisation wrong, because that can be very hard to
track down or reproduce in a small code segment to show to the
developers, especially if your code is correct and similar code works
fine in other places.

Incidentally, which compiler (and version) was it which bit you? It may
well be worth letting its distributors know so that they can put out a
warning; you're probably not the only user who is going to get bitten
with network stack code.

Chris C
Nov 14 '05 #2

"Kevin Bracey" <ke**********@tematic.com> wrote in message
news:0e****************@tematic.com...

> {
>     u_char edst[ETHER_ADDR_LEN];
>     struct sockaddr *dst;
>     struct ether_header *eh;
>     //...
>
>     eh = (struct ether_header *)dst->sa_data;
>     memcpy(edst, eh->ether_dhost, sizeof (edst));
>     //...
> }
>
> This always used to work, despite there being an alignment problem - all
> structures on this platform are word (32-bit) aligned, so eh is
> incorrectly aligned, invoking undefined behaviour.


What exactly is wrong with the code then? Because ETHER_ADDR_LEN is not of
the correct size dst and eh get incorrectly aligned or something?
Nov 14 '05 #3
"Servé La" <ie****@microsoft.com> wrote in
<11*************@corp.supernews.com>:

> "Kevin Bracey" <ke**********@tematic.com> wrote in message
> news:0e****************@tematic.com...
>> {
>>     u_char edst[ETHER_ADDR_LEN];
>>     struct sockaddr *dst;
>>     struct ether_header *eh;
>>     //...
>>
>>     eh = (struct ether_header *)dst->sa_data;
>>     memcpy(edst, eh->ether_dhost, sizeof (edst));
>>     //...
>> }
>>
>> This always used to work, despite there being an alignment problem - all
>> structures on this platform are word (32-bit) aligned, so eh is
>> incorrectly aligned, invoking undefined behaviour.
>
> What exactly is wrong with the code then? Because ETHER_ADDR_LEN is not of
> the correct size dst and eh get incorrectly aligned or something?


The problem is the cast: it lies about the type of sa_data. I saw a similar
problem recently, in code that extracts float 4-vectors (16 bytes, 16-byte
alignment on this platform) from a data packet. The code inappropriately
cast a pointer into the packet buffer to a pointer to the vector type,
fooling the compiler into assuming 16-byte alignment, and "over-optimising"
a subsequent memcpy.

[I posted a similar reply earlier but it seems to have been lost in the
ether]

-- Mat.

Nov 14 '05 #4
Kevin Bracey wrote:
> Having just upgraded a compiler, I found that the Internet stack in one of
> our systems no longer functioned - it appeared that the compiler was not
> compiling it correctly.
>
> Closer inspection showed that it was actually a really nasty example of
> undefined behaviour being invoked.
>
> struct sockaddr {
>     u_short sa_family;
>     char    sa_data[14];
> };
>
> eh = (struct ether_header *)dst->sa_data;
>
> This always used to work, despite there being an alignment problem - all
> structures on this platform are word (32-bit) aligned, so eh is
> incorrectly aligned, invoking undefined behaviour.


Yes, you should have alarm bells ringing in your head whenever
you see a cast to a pointer to non-char, and fire sirens
if the cast is from a pointer to char.

Still, I would have expected a compiler to warn about that.
Investigate the available compiler warnings to see if you
can turn something on that warns about alignment problems.
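
As one concrete pointer: GCC and Clang both offer a -Wcast-align warning for casts that increase the required alignment, though the thread does not say which compiler was involved, so whether it has an equivalent is unknown. Where no such warning is available, a runtime check before the cast is one possible fallback. The sketch below is illustrative only: is_aligned is a hypothetical helper, and it assumes that converting a pointer to uintptr_t yields its address, which holds on mainstream platforms but is implementation-defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if p is a multiple of align.
   align must be a nonzero power of two. */
int is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}
```

A caller could test is_aligned(dst->sa_data, 4) before attempting the cast, and fall back to a bytewise copy when it returns zero.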

Nov 14 '05 #5
"Old Wolf" <ol*****@inspire.net.nz> writes:
> Kevin Bracey wrote:
[snip alignment problem, masked by cast]
> Still, I would have expected a compiler to warn about that.


I wouldn't. The cast says to the compiler "Don't annoy me with
warnings about this - I know exactly what I'm doing."

My ideal compiler, however, would warn about any cast (other than
members of a select group, including _some_ arguments to variadic
functions) not accompanied by a suitable explanatory comment.

mlp
Nov 14 '05 #6
