
memcmp for <

I have a struct like

struct MyStruct
{
int a;
int b;
int c;
bool d;
bool e;
};

I want to insert such a struct in a map. I understand I can declare the
< operator for such a struct for lexicographical compare like

x.a < y.a || !(y.a < x.a) && ( x.b < y.b || !(y.b < x.b) && x.c < y.c
.............

or a simple one like this

memcmp(&x,&y,sizeof(MyStruct)) < 0;

This seems to work if I memset the whole sizeof(MyStruct) to zeroes in
the constructor before I assign a, b, c, etc. This takes care of any
padding that the compiler adds.

My question is whether the second approach is portable. If so, do I
really need the memset? Does the standard say anything about
initializing the padding bits?

Raj

Jul 23 '05 #1
16 Replies


ra******@hotmail.com wrote:
I have a struct like

struct MyStruct
{
int a;
int b;
int c;
bool d;
bool e;
};

I want to insert such a struct in a map. I understand I can declare the
< operator for such a struct for lexicographical compare like

x.a < y.a || !(y.a < x.a) && ( x.b < y.b || !(y.b < x.b) && x.c < y.c
............

or a simple one like this

memcmp(&x,&y,sizeof(MyStruct)) < 0;

This seems to work if I memset and fill the sizeof(MyStruct) with
zeroes in the constructor before I assign a b and c etc. This will take
care of any padding that the compiler adds.

My question is whether the second approach is portable? If so do I
really need the memset ? Does the standard say anything about
initializing the padding bits?


1) Yes. 2) Yes. 3) They are uninitialised unless your object has static
storage duration. Beware, though, that as soon as your MyStruct ceases
being a POD (because you added a private section or a virtual function or
something of the sort), use of memset and memcpy on it becomes undefined.

V
Jul 23 '05 #2

ra******@hotmail.com wrote:
I have a struct like

struct MyStruct
{
int a;
int b;
int c;
bool d;
bool e;
};

I want to insert such a struct in a map. I understand I can declare the
< operator for such a struct for lexicographical compare like

x.a < y.a || !(y.a < x.a) && ( x.b < y.b || !(y.b < x.b) && x.c < y.c
............

or a simple one like this

memcmp(&x,&y,sizeof(MyStruct)) < 0;

This seems to work if I memset and fill the sizeof(MyStruct) with
zeroes in the constructor before I assign a b and c etc. This will take
care of any padding that the compiler adds.

My question is whether the second approach is portable? If so do I
really need the memset ? Does the standard say anything about
initializing the padding bits?


Sorry, I can't tell you about the padding bits, but it's still not
portable because of endianness issues:

struct Foo
{
int a;
};

Foo f1 = { 1 };
Foo f2 = { 256 };

On big-endian machines a memcmp() compare will work correctly. On a
little-endian machine with 32-bit ints, f1 will contain the byte
sequence 0x01 0x00 0x00 0x00 (minus padding) and f2 will contain 0x00
0x01 0x00 0x00. memcmp() will report f1 as greater than f2.
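
The effect described above can be checked with a small sketch. The helper names `is_little_endian`, `memcmp_less` and `check_foo_ordering` are made up for this illustration, and the code assumes that `int` is wider than one byte, that the machine is plainly little- or big-endian, and that `Foo` has no padding.

```cpp
#include <cassert>
#include <cstring>

struct Foo
{
    int a;
};

// Hypothetical helper: true if the lowest-addressed byte of an int
// holds its least significant bits.
bool is_little_endian()
{
    const int probe = 1;
    unsigned char first_byte = 0;
    std::memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

// Hypothetical helper: the byte-wise "less than" proposed in the question.
bool memcmp_less(const Foo& x, const Foo& y)
{
    return std::memcmp(&x, &y, sizeof(Foo)) < 0;
}

// Byte-wise order agrees with numeric order for 1 and 256 only on a
// big-endian layout.
void check_foo_ordering()
{
    Foo f1 = { 1 };
    Foo f2 = { 256 };
    assert(f1.a < f2.a);
    assert(memcmp_less(f1, f2) == !is_little_endian());
}
```

The ordering is still consistent within one run of the program, which is all a map or set needs, but it will not match the numeric order of the members on little-endian machines.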

Cheers,
Malte
Jul 23 '05 #3

You mentioned something about a private section. Could you elaborate on how
that would change things?

If the struct carried a vtable pointer or had non-POD members, could I just
overload new and memset before I call the constructor?

Raj

Jul 23 '05 #4

<ra******@hotmail.com> wrote in message
news:11**********************@f14g2000cwb.googlegr oups.com...
I have a struct like

struct MyStruct
{
int a;
int b;
int c;
bool d;
bool e;
};

I want to insert such a struct in a map. I understand I can declare the
< operator for such a struct for lexicographical compare like

x.a < y.a || !(y.a < x.a) && ( x.b < y.b || !(y.b < x.b) && x.c < y.c

Note that you can also use the (IMO better) following form:

return (x.a!=y.a) ? x.a<y.a
     : (x.b!=y.b) ? x.b<y.b
     : (x.c!=y.c) ? x.c<y.c
     : (x.d!=y.d) ? x.d<y.d : x.e < y.e;

or a simple one like this

memcmp(&x,&y,sizeof(MyStruct)) < 0;

This seems to work if I memset and fill the sizeof(MyStruct) with
zeroes in the constructor before I assign a b and c etc. This will take
care of any padding that the compiler adds.


But not of endianness and other binary representation issues.
Really, I don't think that saving a few statements is worth the
loss of portability. Plus the explicit form gives you much more
flexibility. So why bother?
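
For readers on newer compilers, the explicit memberwise comparison can also be sketched with `std::tie`. This assumes C++11 or later, which did not exist when this thread was written, and `make_example_map` is a made-up helper for the illustration only.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>

struct MyStruct
{
    int a;
    int b;
    int c;
    bool d;
    bool e;
};

// Lexicographical compare over all members, in declaration order.
// std::tie builds tuples of references; tuple's operator< does the rest.
bool operator<(const MyStruct& x, const MyStruct& y)
{
    return std::tie(x.a, x.b, x.c, x.d, x.e)
         < std::tie(y.a, y.b, y.c, y.d, y.e);
}

// Hypothetical usage: MyStruct as a map key, ordered by the operator above.
std::map<MyStruct, std::string> make_example_map()
{
    std::map<MyStruct, std::string> m;
    const MyStruct k1 = { 1, 0, 0, false, false };
    const MyStruct k2 = { 256, 0, 0, false, true };
    m[k2] = "second";
    m[k1] = "first";
    assert(m.begin()->second == "first"); // k1 sorts before k2
    return m;
}
```

A new member still has to be added to both `std::tie` calls by hand, so this does not remove the maintenance step, but it keeps that step to one short line per side.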
--
http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form
Jul 23 '05 #5

ra******@hotmail.com wrote:
You mentioned something about private section. Could you elaborate how
that would change things ?
The layout of an object is only mandated within the same access specifier
section. So, as soon as you introduce private or protected non-static
data members, the struct is not a POD any more, and I am not really sure
why that is, but the Standard makes a point of defining POD-struct that
way.
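
On a C++11 or later compiler the POD question can be asked of the compiler directly. The sketch below is an assumption-laden illustration, not something from the Standard text discussed here: `is_pod_like` and the three example structs are made-up names, and note that C++11 relaxed the rule somewhat, so a struct whose data members are all private is still standard-layout, while one that mixes public and private data members is not.

```cpp
#include <type_traits>

struct PlainData
{
    int a;
    bool d;
};

struct MixedAccess
{
    int a;
private:
    int hidden; // different access control from 'a'
};

struct WithVirtual
{
    int a;
    virtual ~WithVirtual() {}
};

// Hypothetical trait: "trivial and standard-layout", which is how
// C++11 defines a POD class.
template <class T>
struct is_pod_like
    : std::integral_constant<bool,
          std::is_trivial<T>::value && std::is_standard_layout<T>::value>
{
};

static_assert(is_pod_like<PlainData>::value, "plain struct is POD-like");
static_assert(!is_pod_like<MixedAccess>::value, "mixed access is not standard-layout");
static_assert(!is_pod_like<WithVirtual>::value, "virtual functions rule it out");
```

A check like this only guards the memset/memcpy side of the question. It says nothing about padding contents or byte order.
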
If the struct carried a vtable pointer or had NON POD could i just
overload new and memset before i call the constructor ?


I am not sure what you mean by "overload memset", but yes, essentially,
your task would be to gain control over the "padding bytes" by, for
example, eliminating them using compiler-specific means.

Let me ask a rhetorical question, though. If you are prepared to give it
overloaded 'new' and 'memset' (let's suppose it's possible somehow), why
don't you just overload the operator < ?

V
Jul 23 '05 #6

Malte Starostik wrote:
[...]
Sorry, can't tell you about the padding bits, but it's still not
portable because of endianness issues:

struct Foo
{
int a;
};

Foo f1 = { 1 };
Foo f2 = { 256 };

On big-endian machines a memcmp() compare will work correctly. On a
little-endian machine with 32-bit ints, f1 will contain the byte
sequence 0x01 0x00 0x00 0x00 (minus padding) and f2 will contain 0x00
0x01 0x00 0x00. memcmp() will report f1 as greater than f2.


But won't it report f1 consistently greater than f2? The purpose of
using memcmp (as I understood it) was to forgo the real operator < and
the memberwise comparison just to see if they were different.

V
Jul 23 '05 #7

"Victor Bazarov" <v.********@comAcast.net> wrote in message
news:N_*******************@newsread1.mlpsca01.us.t o.verio.net...
My question is whether the second approach is portable? If so do I
really need the memset ? Does the standard say anything about
initializing the padding bits?


1) Yes. 2) Yes. 3) They are uninitialised unless your object has static
storage duration.


Beg pardon? Memcmp portable? I don't see why. As a simple example, I
can't think of any place in the standard that requires all equal bool values
to have the same representation. In other words, I don't see anything wrong
with an implementation that stores a byte in a bool and considers zero to be
false and any nonzero value to be true. Under such an implementation,
memcmp might yield unequal for two values that should be considered equal.
Jul 23 '05 #8

I don't care about that, as I just want to keep them in a set. If A < B, I
just want to make sure A < B all the time.

Raj

Jul 23 '05 #9

> Let me ask a rhetorical question, though. If you are prepared to give it
> overloaded 'new' and 'memset' (let's suppose it's possible somehow), why
> don't you just overload the operator < ?


It's some legacy code. The idea is that if you add a new member, it will
work automatically. If you overload <, you will have to manually update
it for the new member.

Raj

Jul 23 '05 #10

Andrew Koenig wrote:
"Victor Bazarov" <v.********@comAcast.net> wrote in message
news:N_*******************@newsread1.mlpsca01.us.t o.verio.net...

My question is whether the second approach is portable? If so do I
really need the memset ? Does the standard say anything about
initializing the padding bits?


1) Yes. 2) Yes. 3) They are uninitialised unless your object has static
storage duration.

Beg pardon? Memcmp portable? I don't see why. As a simple example, I
can't think of any place in the standard that requires all equal bool values
to have the same representation. In other words, I don't see anything wrong
with an implementation that stores a byte in a bool and considers zero to be
false and any nonzero value to be true. Under such an implementation,
memcmp might yield unequal for two values that should be considered equal.


Beg pardon? How is the internal representations of 'true' or 'false'
relevant in this case? Whether 'true' is 0 or 1, two 'true's will
compare equal and so will two internal representations of 'false'. One
can't really expect two different internal representations from two
different architectures to compare equal, but who cares about that?
The program runs on a virtual machine that cannot have two distinctly
different representations for 'true' during the same run of the program,
can it?

V
Jul 23 '05 #11

ra******@hotmail.com wrote:
Let me ask a rhetorical question, though. If you are prepared to give it
overloaded 'new' and 'memset' (let's suppose it's possible somehow), why
don't you just overload the operator < ?

Its some legacy code. The idea being if you add a new member it will
work automatically. If you overload <
you will have to manually update it for the new member


Maintenance is maintenance. You gotta do it right or you shouldn't be
doing it at all. Doing half-a-job is not really going to buy you much.

V
Jul 23 '05 #12

Victor Bazarov wrote:
Beg pardon? Memcmp portable? I don't see why. As a simple example, I
can't think of any place in the standard that requires all equal bool values
to have the same representation. In other words, I don't see anything wrong
with an implementation that stores a byte in a bool and considers zero to be
false and any nonzero value to be true. Under such an implementation,
memcmp might yield unequal for two values that should be considered equal.

Beg pardon? How is the internal representations of 'true' or 'false'
relevant in this case? Whether 'true' is 0 or 1, two 'true's will
compare equal and so will two internal representations of 'false'.


Read Andrew's response again. His point was that this (e.g. true always
comparing equal to true in memcmp) might not be the case.

One can't really expect two different internal representations from two
different architectures to compare equal, but who cares about that?
The program runs on a virtual machine that cannot have two distinctly
different representations for 'true' during the same run of the program,
can it?


What makes you think it can't?

Jul 23 '05 #13

"Victor Bazarov" <v.********@comAcast.net> wrote in message
news:4J******************@newsread1.mlpsca01.us.to .verio.net...
Beg pardon? How is the internal representations of 'true' or 'false'
relevant in this case? Whether 'true' is 0 or 1, two 'true's will
compare equal and so will two internal representations of 'false'.
I don't think anything in the standard prohibits two values, both of which
are "true", from having different internal representations. Please read
again what I said in my previous post:
In other words, I don't see anything wrong with an implementation that
stores a byte in a bool
and considers zero to be false and any nonzero value to be true.


On such an implementation, two variables might both have the same value but
different representations. Of course the implementation would have to
change representation appropriately if the value were to be treated as an
integer, but I can see no particular difficulty in doing so.

As a historical note, I am quite certain that C and C++ implementations have
existed under which two pointers can compare equal but nevertheless have
different representations. And I am entirely certain that on most modern
computers, two floating-point values with different representations can
compare equal--namely +0 and -0.
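
The signed-zero case is easy to try out. The following is a minimal sketch, assuming `double` follows IEEE 754 (checked through `std::numeric_limits<double>::is_iec559`) and that the compiler is not run in a mode that discards signed zeros. The function names `same_bytes` and `check_signed_zero` are made up for the illustration.

```cpp
#include <cassert>
#include <cstring>
#include <limits>

// Hypothetical helper: byte-wise equality of two doubles.
bool same_bytes(double x, double y)
{
    return std::memcmp(&x, &y, sizeof(double)) == 0;
}

// +0.0 and -0.0 are equal as values but differ in the sign bit.
void check_signed_zero()
{
    if (!std::numeric_limits<double>::is_iec559)
        return; // nothing to show on a non-IEEE 754 platform

    const double plus_zero = 0.0;
    const double minus_zero = -0.0;
    assert(plus_zero == minus_zero);
    assert(!same_bytes(plus_zero, minus_zero));
}
```

So a struct with a floating-point member can hold two keys that operator== calls equal and memcmp calls different, even with all padding zeroed.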
Jul 23 '05 #14

Andrew Koenig wrote:
In other words, I don't see anything wrong with an implementation that
stores a byte in a bool
and considers zero to be false and any nonzero value to be true.


I wasn't paying attention apparently. Sorry.
On a historical note, was there ever an implementation that did that?

V
Jul 23 '05 #15

"Victor Bazarov" <v.********@comAcast.net> wrote in message
news:qy*******************@newsread1.mlpsca01.us.t o.verio.net...
Andrew Koenig wrote:
In other words, I don't see anything wrong with an implementation that
stores a byte in a bool
and considers zero to be false and any nonzero value to be true.
I wasn't paying attention apparently. Sorry.
On a historical note, was there ever an implementation that did that?


Not to my knowledge for bool. But definitely for pointers and
floating-point values.

Then there's this issue:

struct X { char a; int b; };

void foo()
{
    X x1 = { '?', 42 };
    X x2 = x1;
    // ...
}

If there's padding between X::a and X::b, I don't think that the
implementation is obligated to copy that padding. In other words, I don't
think there's any guarantee that memcmp will show x1 and x2 as being equal
if executed at the comment.
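
A sketch of the alternative: measure the padding instead of guessing, and compare members rather than bytes. The names `padding_after_a`, `same_value` and `check_copy` are made up for this illustration, and the amount of padding reported is whatever the implementation in use chooses.

```cpp
#include <cassert>
#include <cstddef>

struct X { char a; int b; };

// Hypothetical helper: number of padding bytes between X::a and X::b.
// X::a sits at offset 0, so everything between sizeof(char) and the
// offset of b is padding.
std::size_t padding_after_a()
{
    return offsetof(X, b) - sizeof(char);
}

// Memberwise equality never looks at padding bytes.
bool same_value(const X& lhs, const X& rhs)
{
    return lhs.a == rhs.a && lhs.b == rhs.b;
}

// A copy is guaranteed to have equal members, whatever happens to the
// padding.
void check_copy()
{
    X x1 = { '?', 42 };
    X x2 = x1;
    assert(same_value(x1, x2));
}
```

The memberwise check holds on any conforming implementation, whereas a memcmp of x1 and x2 depends on bytes the language makes no promises about.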
Jul 23 '05 #16

On 23 Mar 2005 07:23:13 -0800, ra******@hotmail.com wrote in
comp.lang.c++:
I have a struct like

struct MyStruct
{
int a;
int b;
int c;
bool d;
bool e;
};

I want to insert such a struct in a map. I understand I can declare the
< operator for such a struct for lexicographical compare like

x.a < y.a || !(y.a < x.a) && ( x.b < y.b || !(y.b < x.b) && x.c < y.c
............

or a simple one like this

memcmp(&x,&y,sizeof(MyStruct)) < 0;

This seems to work if I memset and fill the sizeof(MyStruct) with
zeroes in the constructor before I assign a b and c etc. This will take
care of any padding that the compiler adds.

My question is whether the second approach is portable? If so do I
really need the memset ? Does the standard say anything about
initializing the padding bits?

Raj


Actually, it is extremely non-portable, and error-prone as well. As
others have pointed out, endianness can be a killer. If int has four
octet-sized bytes and is little-endian, as on Intel and others, consider
x.a = 256 and y.a = 1. Then they begin with the byte sequences:

x 0x00 0x01 0x00 0x00 ...
y 0x01 0x00 0x00 0x00 ...

So which one will memcmp() find greater?

Also there are real widely used compilers where padding can certainly
trip you up.

GNU ports for x86, for example, use the Intel 80-bit extended
precision real format for long double, and sizeof(long double) is 12,
so they always start aligned to a 4-byte address.

You can assign two long doubles the same value, then using a union or
pointer punning change the final two bytes of one of them. They will
still compare as equal with ==, but not with memcmp().

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
Jul 23 '05 #17
