
re-setting an array

P: n/a
Is there a way to reset all elements of an array with a single instruction?
I want to set all elements to zero. Currently looping to do so.

thx,
Scott Kelley
Jul 22 '05 #1
35 Replies


P: n/a
> Is there a way to reset all elements of an array with a single instruction?
I want to set all elements to zero. Currently looping to do so.

Setting to zero can be done with memset(), declared in <cstring>:

void *memset(void *dest, int c, size_t count);

The code could look like:

memset(intArray, 0, sizeof(int) * arraySize);

Very fast, but you have to work out the correct byte count to zero
yourself!
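
For example, a minimal self-contained sketch (the array name and size are
just placeholders):

#include <cstring>   // std::memset

int main()
{
    int intArray[100];

    // Zero the whole array; sizeof(intArray) gives the byte count.
    std::memset(intArray, 0, sizeof(intArray));

    // With a pointer and an element count you need the multiplication:
    // std::memset(ptr, 0, arraySize * sizeof(*ptr));
    return 0;
}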

Jul 22 '05 #2

P: n/a
Scott Kelley wrote:
Is there a way to reset all elements of an array with a single instruction?
I want to set all elements to zero. Currently looping to do so.


memset() defined in <cstring>.

Check this: http://www.cplusplus.com/ref/cstring/index.html


Regards,

Ioannis Vranos
Jul 22 '05 #3

P: n/a

"Scott Kelley" <sc****@iccom.com> wrote in message
news:Kq********************@centurytel.net...
Is there a way to reset all elements of an array with a single instruction? I want to set all elements to zero. Currently looping to do so.


If you are talking about integers then memset.

If you are talking about floating point then std::fill or std::fill_n. The
advantage of std::fill and std::fill_n is that they work with many
different types, values and data structures, but for integer arrays set to
zero you aren't going to get any more efficient than memset.
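
For example, a small sketch of both algorithms (the array is illustrative):

#include <algorithm>   // std::fill, std::fill_n

int main()
{
    double values[50];

    std::fill(values, values + 50, 0.0);   // fill the range [first, last)
    std::fill_n(values, 50, 0.0);          // fill n elements starting at first
    return 0;
}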

john
Jul 22 '05 #4

P: n/a
John Harrison wrote:
If you are talking about integers then memset.

If you are talking about floating point then std::fill or std::fill_n. The
advantage of std::fill and std::fill_n is that they work with many
different types, values and data structures, but for integer arrays set to
zero you aren't going to get any more efficient then memset.


Actually, memset() is for values in the range of unsigned char (it fills
bytes). I had forgotten about the fill() family. fill() seems more
elegant, so I would suggest the OP use the fill() family, and fall back to
memset() only when he has to.
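
To illustrate why memset() is byte-oriented, here is a small sketch of the
classic pitfall (the printed value assumes a typical 4-byte int):

#include <cstring>
#include <iostream>

int main()
{
    int a[4];

    std::memset(a, 1, sizeof(a));   // sets every *byte* to 0x01
    std::cout << a[0] << '\n';      // prints 16843009 (0x01010101), not 1
    return 0;
}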


Regards,

Ioannis Vranos
Jul 22 '05 #5

P: n/a
Ioannis Vranos wrote:
Scott Kelley wrote:
Is there a way to reset all elements of an array with a single
instruction?
I want to set all elements to zero. Currently looping to do so.



memset() defined in <cstring>.

Check this: http://www.cplusplus.com/ref/cstring/index.html


As I mentioned in another reply, it is better to use the fill() family
instead, unless you have to use memset().
Check this: http://h30097.www3.hp.com/cplus/fill_n_3c__std.htm


Regards,

Ioannis Vranos
Jul 22 '05 #6

P: n/a

"Ioannis Vranos" <iv*@guesswh.at.grad.com> wrote in message
news:cc***********@ulysses.noc.ntua.gr...
John Harrison wrote:
If you are talking about integers then memset.

If you are talking about floating point then std::fill or std::fill_n. The advantage of std::fill and std::fill_n is that they work with many
different types, values and data structures, but for integer arrays set to zero you aren't going to get any more efficient then memset.


Actually, memset() is for values in the range of unsigned char (it fills
bytes).


True but I have never worked on a platform where integer zero wasn't all
bits zero, so I would happily use memset for integers.

john
Jul 22 '05 #7

P: n/a
John Harrison wrote:
Actually, memset() is for values in the range of unsigned char (it fills
bytes).

True but I have never worked on a platform where integer zero wasn't all
bits zero, so I would happily use memset for integers.


Actually now that you mention it, memset() is only safe to be used on
unsigned char sequences. The std::fill() family is definitely the safe
approach.


Regards,

Ioannis Vranos
Jul 22 '05 #8

P: n/a
Ioannis Vranos wrote:
Actually now that you mention it, memset() is only safe to be used in
unsigned char sequences
and to char and signed char sequences under some preconditions

only. std::fill() family is the safe approach
definitely.



Regards,

Ioannis Vranos
Jul 22 '05 #9

P: n/a
On Thu, 01 Jul 2004 11:38:25 +0300, Ioannis Vranos
<iv*@guesswh.at.grad.com> wrote in comp.lang.c++:
John Harrison wrote:
If you are talking about integers then memset.

If you are talking about floating point then std::fill or std::fill_n. The
advantage of std::fill and std::fill_n is that they work with many
different types, values and data structures, but for integer arrays set to
zero you aren't going to get any more efficient then memset.


Actually, memset() is for values in the range of unsigned char (it fills
bytes). I had forgotten about the fill() family. fill() seems more
elegant, so if not needed i would suggest the OP to use the fill()
family, and use memset() only when he has to.


No, this is quite silly, it's one of comp.lang.c's pedantic myths.

Actually it has to work for all the character types, because both C
and C++ guarantee that if a signed and unsigned variant of any integer
type contain a value that is within the range of both, the bitwise
representation must be exactly the same.

So all bits 0 must represent the value 0 in both a signed and unsigned
char, and therefore also in a plain char.

Even beyond that, the C standard committee already has accepted a DR
to require wording in the next revision of the C standard specifically
stating that all bits 0 is a valid representation of the value 0 for
_all_ integer types.

I doubt if the C++ standard, which does not allow padding bits in
signed char where the C standard does, will ever suffer this nonsense
about all bits 0 not being the value 0 in any integer type.

On the other hand, for pointers or floating point types, and of course
for non-POD types, using memset() is not a good idea at all.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
Jul 22 '05 #10

P: n/a
On Thu, 01 Jul 2004 12:24:57 +0300, Ioannis Vranos
<iv*@guesswh.at.grad.com> wrote in comp.lang.c++:
Ioannis Vranos wrote:
Actually now that you mention it, memset() is only safe to be used in
unsigned char sequences


and to char and signed char sequences under some preconditions


Under no preconditions at all.

From paragraph 3 of 3.9.1 Fundamental types:

"The range of nonnegative values of a signed integer type is a
subrange of the corresponding unsigned integer type, and the value
representation of each corresponding signed/unsigned type shall be the
same."

If there is an implementation where all bits 0 is not the value 0 in a
char or signed char type, then whatever that implementation is it is
not C or C++.
only. std::fill() family is the safe approach
definitely.


Regards,

Ioannis Vranos


--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
Jul 22 '05 #11

P: n/a
Jack Klein wrote:
Under no preconditions at all.

From paragraph 3 of 3.9.1 Fundamental types:

"The range of nonnegative values of a signed integer type is a
subrange of the corresponding unsigned integer type, and the value
representation of each corresponding signed/unsigned type shall be the
same."

If there is an implementation where all bits 0 is not the value 0 in a
char or signed char type, then whatever that implementation is it is
not C or C++.

I was talking about values larger than numeric_limits<signed
char>::max() but within the range of unsigned char.

0 is safe for char/signed char/unsigned char always.


Regards,

Ioannis Vranos
Jul 22 '05 #12

P: n/a
Jack Klein wrote:
On Thu, 01 Jul 2004 11:38:25 +0300, Ioannis Vranos
<iv*@guesswh.at.grad.com> wrote in comp.lang.c++:

John Harrison wrote:

If you are talking about integers then memset.

If you are talking about floating point then std::fill or std::fill_n. The
advantage of std::fill and std::fill_n is that they work with many
different types, values and data structures, but for integer arrays set to
zero you aren't going to get any more efficient then memset.
Actually, memset() is for values in the range of unsigned char (it fills
bytes). I had forgotten about the fill() family. fill() seems more
elegant, so if not needed i would suggest the OP to use the fill()
family, and use memset() only when he has to.

No, this is quite silly, it's one of comp.lang.c's pedantic myths.

Actually it has to work for all the character types, because both C
and C++ guarantee that if a signed and unsigned variant of any integer
type contain a value that is within the range of both, the bitwise
representation must be exactly the same.

Yes, I agree that for 0 it works for all character types, *regardless* of
the representation (it doesn't matter if 0 is all bits zero or otherwise).

So all bits 0 must represent the value 0 in both a signed and unsigned
char, and therefore also in a plain char.

Bits don't matter here.
Even beyond that, the C standard committee already has accepted a DR
to require wording in the next revision of the C standard specifically
stating that all bits 0 is a valid representation of the value 0 for
_all_ integer types.
Which is entirely off topic in here (otherwise expressed as who cares?). :-)


I doubt if the C++ standard, which does not allow padding bits in
signed char where the C standard does, will ever suffer this nonsense
about all bits 0 not being the value 0 in any integer type.
But this doesn't affect memset() in any way.


On the other hand, for pointers or floating point types, and of course
for non-POD types, using memset() is not a good idea at all

Of course. And for integrals other than char/signed char/unsigned char.
And of course for values larger than numeric_limits<signed char>::max()
it shouldn't be used on signed chars, for values larger than
numeric_limits<char>::max() it shouldn't be used on chars, and for
values larger than numeric_limits<unsigned char>::max() it shouldn't be
used on unsigned chars.
The fill() family is better suited for all purposes, unless we can't do
otherwise.


Regards,

Ioannis Vranos
Jul 22 '05 #13

P: n/a
* Jack Klein:
On Thu, 01 Jul 2004 12:24:57 +0300, Ioannis Vranos
<iv*@guesswh.at.grad.com> wrote in comp.lang.c++:
Ioannis Vranos wrote:
Actually now that you mention it, memset() is only safe to be used in
unsigned char sequences


and to char and signed char sequences under some preconditions


Under no preconditions at all.

From paragraph 3 of 3.9.1 Fundamental types:

"The range of nonnegative values of a signed integer type is a
subrange of the corresponding unsigned integer type, and the value
representation of each corresponding signed/unsigned type shall be the
same."


It is worth noting that the standard defines the term "value
representation" in a para preceding this one.

It doesn't mean the representation of a value in terms of 0's and 1's...

This is a case where possibly the intent of the standard is different
from the actual wording, but anyway, it's not a good idea to use memset
for anything if it can be avoided, since it's dangerous in many ways:
the possibility of 0 not being represented by all bits 0 for some types,
the ease with which incorrect limits can be specified, the possibility
of being applied to what is actually non-contiguous storage.

I'd advise the OP to use std::vector instead of raw arrays.

Safely & efficiently clearing a std::vector v is very very easy:

zeroAllElementsIn( v );

where zeroAllElementsIn can be defined as

template< typename T >
void zeroAllElementsIn( std::vector<T>& v )
{
    std::size_t const size = v.size();
    v.clear();
    v.resize( size );
}

or

template< typename T >
void zeroAllElementsIn( std::vector<T>& v )
{
    std::fill( v.begin(), v.end(), T() );
}

or whatever, just not memset.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Jul 22 '05 #14

P: n/a
Alf P. Steinbach wrote:
This is a case where possibly the intent of the standard is different
from the actual wording, but anyway, it's not a good idea to use memset
for anything if it can be avoided, since it's dangerous in many ways:
the possibility of 0 not being represented by all bits 0 for some types,


memset() has nothing to do with bits but with bytes. It sets them to a
value considering them as unsigned chars. For the value 0 this is safe
for all character types, but not for anything else.


Regards,

Ioannis Vranos
Jul 22 '05 #15

P: n/a
"John Harrison" <jo*************@hotmail.com> wrote:

True but I have never worked on a platform where integer zero wasn't all
bits zero, so I would happily use memset for integers.


value zero is guaranteed to be all-bits-zero for the integral types.
Jul 22 '05 #16

P: n/a
* Ioannis Vranos:
Alf P. Steinbach wrote:
This is a case where possibly the intent of the standard is different
from the actual wording, but anyway, it's not a good idea to use memset
for anything if it can be avoided, since it's dangerous in many ways:
the possibility of 0 not being represented by all bits 0 for some types,
memset() has nothing to do with bits but with bytes.


It's a good idea to understand the bit-level of things; that way you can
understand, instead of just following perhaps flawed cookbook recipes.

It so happens that the Holy Standard partially specifies the bit-level
representation of integers.

And the two implicitly allowed representations for unsigned integers
(direct binary, gray code) both have all bits 0 for the value 0.

This follows from the requirements of the shift operators and the
definition of "value representation" mentioned earlier in this thread.

Amazing, isn't it? ;-)
It sets them to a value considering them as unsigned chars.
Whatever.

For the value 0 this is safe
for all character types, but not for anything else.


That is incorrect.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Jul 22 '05 #17

P: n/a
Alf P. Steinbach wrote:
memset() has nothing to do with bits but with bytes.

It's a good idea to understand the bit-level of things; that way you can
understand instead of just follow perhaps flawed cookbook recipes.

It so happens that the Holy Standard partially specifies the bit-level
representation of integers.

And the two implicitly allowed representations for unsigned integers
(direct binary, gray code) both have all bits 0 for the value 0.


However there are padding bits on other integer types.

It sets them to a value considering them as unsigned chars.

Whatever.
For the value 0 this is safe
for all character types, but not for anything else.

That is incorrect.

Of course it is correct, writing on (bytes containing) padding bits is
undefined behaviour.


Regards,

Ioannis Vranos
Jul 22 '05 #18

P: n/a
* Ioannis Vranos:
* Alf P. Steinbach:
* Ioannis Vranos:

memset() has nothing to do with bits but with bytes.
It's a good idea to understand the bit-level of things; that way you can
understand instead of just follow perhaps flawed cookbook recipes.

It so happens that the Holy Standard partially specifies the bit-level
representation of integers.

And the two implicitly allowed representations for unsigned integers
(direct binary, gray code) both have all bits 0 for the value 0.


However there [can theoretically be] paddings bits on other integer types.


Yes. But with a difference between C and C++. C allows more bits used
as padding on signed types relative to corresponding unsigned, C++ does
not; the C++ requirements boil down to msb being used as sign bit.

For the value 0 this is safe for all character types, but not for
anything else.


That is incorrect.


Of course it is correct


Uhm, no.

writing on (bytes containing) padding bits is undefined behaviour.


Yes, that is correct, but it's not the same. For one can easily ensure
that there are no padding bits (a simple compile time assertion). And
furthermore there are AFAIK no modern machines that use integer padding
bits accessible to a program, so even the un-safety in unchecked context
is purely an academic one, not one that can be encountered in practice.

Hence it should not enter in any consideration of using memset or not.

I've listed other arguments against memset.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Jul 22 '05 #19

P: n/a
Alf P. Steinbach wrote:
Yes. But with a difference between C and C++. C allows more bits used
as padding on signed types relative to corresponding unsigned, C++ does
not; the C++ requirements boil down to msb being used as sign bit.

I did not understand what you are saying. For example, signed int and
unsigned int cannot have padding bits?

writing on (bytes containing) padding bits is undefined behaviour.

Yes, that is correct, but it's not the same. For one can easily ensure
that there are no padding bits (a simple compile time assertion).

?
And
furthermore there are AFAIK no modern machines that use integer padding
bits accessible to a program, so even the un-safety in unchecked context
is purely an academic one, not one that can be encountered in practice.


What do you mean they are not accessible in a program? All objects are
accessible, including their padding bits.


Regards,

Ioannis Vranos
Jul 22 '05 #20

P: n/a
* Ioannis Vranos:
Alf P. Steinbach wrote:
Yes. But with a difference between C and C++. C allows more bits used
as padding on signed types relative to corresponding unsigned, C++ does
not; the C++ requirements boil down to msb being used as sign bit.
I did not understand what you are saying. For example, signed int and
unsigned int cannot have padding bits?


They can. In C more of the total bits can be padding bits in the signed
int. In C++ signed and unsigned are required to have the same number of
value representation bits, per the definition of "value representation".

writing on (bytes containing) padding bits is undefined behaviour.


Yes, that is correct, but it's not the same. For one can easily ensure
that there are no padding bits (a simple compile time assertion).


?


In short it boils down to "is undefined behavior" versus "can be
undefined behavior". In the case of padding bits accessible to the
program it is undefined behavior. In the more general case of integer
types in C++ it isn't necessarily undefined behavior, and the
possibility of UB is only on antiquated machinery for which I'm not even
sure that C++ compilers exist, and furthermore that remote, purely
academic possibility can be avoided by a simple compile time assertion.

A compile time assertion is any statement that makes the program not
compile on a machine where some specified assumption does not hold.
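
For instance, a sketch of such an assertion for unsigned int (written with a
C++11 static_assert for brevity; at the time of this thread one would use an
enum or negative-array-size trick instead):

#include <climits>
#include <limits>

// Fails to compile if unsigned int has padding bits, i.e. if its value bits
// do not fill the whole object representation.
static_assert(std::numeric_limits<unsigned int>::digits
                  == sizeof(unsigned int) * CHAR_BIT,
              "unsigned int has padding bits");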

So, wrt. integers in general, as opposed to the hypothetical situation
of having padding bits, it's not "is", it is "can hypothetically be".

And
furthermore there are AFAIK no modern machines that use integer padding
bits accessible to a program, so even the un-safety in unchecked context
is purely an academic one, not one that can be encountered in practice.


What do you mean they are not accessible in a program.


What do you mean "they are not accessible in a program"?

All objects are accessible including their padding bits.


All C++ objects are accessible. You would however find it difficult to
access processor registers in pure standard C++ without any platform
specific library or language extensions. Similarly, you would find it
difficult to access any padding bits used internally by some processor
(said processor being just as hypothetical and far-fetched as one that
uses padding bits visible to C++). The context of my remark was
hypothetical _machines_ that use integer padding bits -- because it
seems you think such machines should be considered in deciding whether
to use memset or not. To understand that remark better, consider a real
machine, namely the ordinary PC, that uses extra bits in floating point
calculations (those are not padding bits, but). Whether the machine
itself uses such bits is one thing, and that is similar to the context
of the remark. Whether they're accessible to C++ is another; in old
versions of Visual C++ they were, in Visual C++ 7.1 they aren't.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Jul 22 '05 #21

P: n/a

Why does the following compile:

int BlahChar(char)
{
    char aaa = 5;
    return aaa;
}

int BlahChar(signed char)
{
    signed char bbb = 5;
    return bbb;
}

int BlahChar(unsigned char)
{
    unsigned char ccc = 5;
    return ccc;
}

int main()
{
    signed char jk;

    Blah(jk);
}

For instance:

short == short int == signed short == signed short int

int == signed int

long == long int == signed long == signed long int

Is the above correct in ALL circumstances?

I can only presume that:

char == signed char

char == unsigned char

is not necessarily true, and that it's implementation defined. Would I be
right?
-JKop
Jul 22 '05 #22

P: n/a

"JKop" <NU**@NULL.NULL> wrote in message
news:V7*****************@news.indigo.ie...

Why does the following compile:

int BlahChar(char)
{
char aaa = 5;
return aaa;
}

int BlahChar(signed char)
{
signed char bbb = 5;
return bbb;
}

int BlahChar(unsigned char)
{
unsigned char ccc = 5;
return ccc;
}
int main()
{
signed char jk;

Blah(jk);
}

For instance:

short == short int == signed short == signed short int

int == signed int

long == long int = signed long = signed long int
Is the above correct in ALL circumstances?

AFAIK there is no such type as signed short, signed int or signed long. But
short == short int and long == long int always.


I can only presume that:

char == signed char

char == unsigned char

is not neccessilary true, and that it's implementation defined. Would I be
right?


Those are never true. Even though char is either signed or unsigned it is
not the same type as unsigned char, or signed char.

This mess is part of the history of C/C++. In the original C it was left undefined
whether char is signed or unsigned. I guess signed char was introduced to
overcome this; at least you can now explicitly say you want a signed char.

john
Jul 22 '05 #23

P: n/a
* JKop:

I can only presume that:

char == signed char

char == unsigned char

is not neccessilary true, and that it's implementation defined. Would I be
right?


Yes, if you mean what I think. 3.9.1/1: "Plain 'char', 'signed char',
and 'unsigned char' are three distinct types."

'char' is either signed or unsigned in a given implementation, but not
the _same_ type as 'signed char' or 'unsigned char'.

The same situation holds for 'wchar_t', which is a distinct type but
mapped to another integer type, its "underlying type" in the HS.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Jul 22 '05 #24

P: n/a
In message <V7*****************@news.indigo.ie>, JKop <NU**@NULL.NULL>
writes

Why does the following compile:

int BlahChar(char)
{
char aaa = 5;
return aaa;
}

int BlahChar(signed char)
{
signed char bbb = 5;
return bbb;
}

int BlahChar(unsigned char)
{
unsigned char ccc = 5;
return ccc;
}
int main()
{
signed char jk;

Blah(jk);
}

Why wouldn't it? You have three distinct function overloads there.
For instance:

short == short int == signed short == signed short int

int == signed int

long == long int = signed long = signed long int

Is the above correct in ALL circumstances?

I can only presume that:

char == signed char

char == unsigned char

is not neccessilary true, and that it's implementation defined. Would I be
right?


Depends what you mean by "==". Why not buy that copy of the Standard and
turn to 3.9.1?

Plain char, signed char and unsigned char are three distinct types, but
plain char can take exactly the same values as one of the other two
types, and this choice is implementation-defined.
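
As a side note, the implementation-defined signedness of plain char is easy
to query; a minimal sketch:

#include <climits>
#include <iostream>
#include <limits>

int main()
{
    // Both lines report whether plain char behaves as a signed type here.
    std::cout << std::boolalpha
              << (CHAR_MIN < 0) << '\n'
              << std::numeric_limits<char>::is_signed << '\n';
    return 0;
}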

--
Richard Herring
Jul 22 '05 #25

P: n/a
Richard Herring posted:
In message <V7*****************@news.indigo.ie>, JKop <NU**@NULL.NULL>
writes

Why does the following compile:

int BlahChar(char)
{
char aaa = 5;
return aaa;
}

int BlahChar(signed char)
{
signed char bbb = 5;
return bbb;
}

int BlahChar(unsigned char)
{
unsigned char ccc = 5;
return ccc;
}
int main()
{
signed char jk;

Blah(jk);
}

Why wouldn't it? You have three distinct function overloads there.

If I'd known that, do you think I would've written my original post? Given
that, do you think your reply was in any way stupid, ignorant or arrogant?

For instance:

short == short int == signed short == signed short int

int == signed int

long == long int = signed long = signed long int

Is the above correct in ALL circumstances?

I can only presume that:

char == signed char

char == unsigned char

is not neccessilary true, and that it's implementation defined. Would I
be right?


Depends what you mean by "==". Why not buy that copy of the Standard
and turn to 3.9.1?

You know exactly what I mean by "==", that's just you being ignorant again.

Because it's a rip-off.
-JKop
Jul 22 '05 #26

P: n/a
Alf P. Steinbach wrote:
I did not understand what you are saying. For example, signed int and
unsigned int cannot have padding bits?

They can. In C more of the total bits can be padding bits in the signed
int. In C++ signed and unsigned are required to have the same number of
value representation bits, per the definition of "value representation".

You said it nicely. However both can have padding bits, and when we use
memset() to zero all bytes (including those with the padding bits), we
modify these parts of bytes too.

In short it boils down to "is undefined behavior" versus "can be
undefined behavior". In the case of padding bits accessible to the
program it is undefined behavior.

Ok then we agree.
In the more general case of integer
types in C++ it isn't necessarily undefined behavior, and the
possibility of UB is only on antiquated machinery for which I'm not even
sure that C++ compilers exist, and furthermore that remote, purely
academic possibility can be avoided by a simple compile time assertion.
It is not on antiquated machinery, only reading bytes containing padding
bits is guaranteed to be safe in the standard.


A compile time assertion is any statement that makes the program not
compile on a machine where some specified assumption does not hold.
Compile time assertions - if present - are system specific extensions
and their kind and use differ from system to system.

If you are saying that there are specific systems where we can
manipulate the padding bits with no side effects, the answer is that there
are. However we are talking about portability here.
All C++ objects are accessible. You would however find it difficult to
access processor registers in pure standard C++ without any platform
specific library or language extensions.

Yes; however, even with the register keyword, it is up to the implementation
to decide whether it will put a variable in a register.
Similarly, you would find it
difficult to access any padding bits used internally by some processor
(said processor being just as hypothetical and far-fetched as one that
uses padding bits visible to C++). The context of my remark was
hypothetical _machines_ that use integer padding bits -- because it
seems you think such machines should be considered in deciding whether
to use memset or not. To understand that remark better, consider a real
machine, namely the ordinary PC, that uses extra bits in floating point
calculations (those are not padding bits, but). Whether the machine
itself uses such bits is one thing, and that is similar to the context
of the remark. Whether they're accessible to C++ is another; in old
versions of Visual C++ they were, in Visual C++ 7.1 they aren't.

From the standard:
5.3.3 Sizeof

"When applied to a reference or a reference type, the result is the size
of the referenced type. When applied to a class, the result is the
number of bytes in an object of that class including any padding
required for placing objects of that type in an array."

If the object's size returned by sizeof includes padding bits, then
these padding bits have to be accessible.


Regards,

Ioannis Vranos
Jul 22 '05 #27

P: n/a
JKop wrote:
Why does the following compile:

int BlahChar(char)
{
char aaa = 5;
return aaa;
}

int BlahChar(signed char)
{
signed char bbb = 5;
return bbb;
}

int BlahChar(unsigned char)
{
unsigned char ccc = 5;
return ccc;
}
int main()
{
signed char jk;

Blah(jk);
}

For instance:

short == short int == signed short == signed short int

int == signed int

long == long int = signed long = signed long int
Is the above correct in ALL circumstances?

Of course since they are the *same type* respectively!

I can only presume that:

char == signed char

char == unsigned char

is not neccessilary true, and that it's implementation defined. Would I be
right?


The three above are different types, but the standard requires their size
to always be 1 (1 byte).


Regards,

Ioannis Vranos
Jul 22 '05 #28

P: n/a
John Harrison wrote:
AFAIK there is not such type as signed short, signed int or signed long. But
short == short int and long == long int always.



Of course there is a type signed short, it is your "short" and "short
int". Its complete name is "signed short int" and it is glad to meet
you. :-)

unsigned long int is also an existing type.


Regards,

Ioannis Vranos
Jul 22 '05 #29

P: n/a
* Ioannis Vranos:
Alf P. Steinbach wrote:
I did not understand what you are saying. For example, signed int and
unsigned int cannot have padding bits?

They can. In C more of the total bits can be padding bits in the signed
int. In C++ signed and unsigned are required to have the same number of
value representation bits, per the definition of "value representation".

You said it nicely. However both can have padding bits, and when we use
memset() to zero all bytes (including those with the padding bits), we
modify these parts of bytes too.


Well, then, you'll have to come up with at least one C++ compiler
where this is the case for some standard integer type.

But note that that just moves the potential "problem" into the realm of
possibility.

If (against expectation) such compiler is found it doesn't mean that
zeroing arrays of integers via memset is UB in general; just that the
proposition that it _can hypothetically_ be can be amended to _can_.

In the more general case of integer
types in C++ it isn't necessarily undefined behavior, and the
possibility of UB is only on antiquated machinery for which I'm not even
sure that C++ compilers exist, and furthermore that remote, purely
academic possibility can be avoided by a simple compile time assertion.


It is not on antiquated machinery


No? Name one modern such computer, then. I.e. one in use with C++
compiler.

, only reading bytes containing padding
bits is guaranteed to be safe in the standard.


?

A compile time assertion is any statement that makes the program not
compile on a machine where some specified assumption does not hold.


Compile time assertions - if present - are system specific extensions
and their kind and use differ from system to system.


No no no. What needs to be asserted is simply that the range of the
relevant integer type requires the full number of bits in the object.
That is very easy to assert in a system-independent way.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Jul 22 '05 #30

P: n/a
Alf P. Steinbach wrote:
Well, then, you'll have to come up with at least one C++ compiler
where this is the case for some standard integer type.

But note that that just moves the potential "problem" into the realm of
possibility.

If (against expectation) such compiler is found it doesn't mean that
zeroing arrays of integers via memset is UB in general; just that the
proposition that it _can hypothetically_ be can be amended to _can_.

Everything not defined in the standard is undefined behaviour. The
hypothetically stuff don't fit in this discussion, because some day for
example, I
No no no. What needs to asserted is simply that the range of the
relevant integer type requires the full number of bits in the object.
That is very easy to assert in system-independent way.


With sizeof(), maximum/minimum values and numeric_limits<unsigned
char>::digits it can be done, however "legally" speaking (we are
language lawyers after all) there is no guarantee that there are no
integers with padding bits. However if you use the above-mentioned check
and you find that there are no padding bits, then you can zero
everything on integer types. However such a check would imply that an
alternative will be used when the above check "returns false", so why
not use the fill() family in the first place?


Regards,

Ioannis Vranos
Jul 22 '05 #31

P: n/a
Ioannis Vranos wrote:
Fixed to be more comprehensible:

Alf P. Steinbach wrote:
Well, then, you'll have to come up with at least one C++ compiler
where this is the case for some standard integer type.

But note that that just moves the potential "problem" into the realm of
possibility.

If (against expectation) such compiler is found it doesn't mean that
zeroing arrays of integers via memset is UB in general; just that the
proposition that it _can hypothetically_ be can be amended to _can_.

Everything not defined in the standard is undefined behaviour. The
hypothetical stuff doesn't fit in this discussion.
No no no. What needs to asserted is simply that the range of the
relevant integer type requires the full number of bits in the object.
That is very easy to assert in system-independent way.

With sizeof(), maximum/minimum values and numeric_limits<unsigned
char>::digits it can be done portably, however "legally" speaking (we
are language lawyers after all) there is no guarantee that there are no
integers with padding bits in some system. However if you use the
above-mentioned check and you find that there are no padding bits, then you
can zero everything on integer types and this scheme is portable.
However such a check would imply that an alternative will be used
when the above check "returns false", so why not use the fill() family in
the first place?


Regards,

Ioannis Vranos
Jul 22 '05 #32

P: n/a
* Ioannis Vranos:
No no no. What needs to asserted is simply that the range of the
relevant integer type requires the full number of bits in the object.
That is very easy to assert in system-independent way.

With sizeof(), maximum/minimum values and numeric_limits<unsigned
char>::digits it can be done portably, however "legally" speaking (we
are language lawyers after all) there is no guarantee that there are not
integers with padding bits in some system. However if you use the above
mentioned check and you find that there are not padding bits, then you
can zero everything on integer types and this scheme is portable.
However such a check would imply that there will be used an alternative
when the above check "returns false"


It doesn't necessarily mean an alternative would be used; the simplest
way to guarantee no UB is to have the program not compile when padding
bits are present; then from the requirements of the standard we have
value representation = object representation = all bits 0 for value 0.

However, templating can be used to select at compile time the most
efficient method, depending on whether pad bits exist or not.
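
A rough sketch of such a selection (the names are illustrative, C++11
<type_traits> is used for brevity, and this is not code from the thread):

#include <algorithm>
#include <climits>
#include <cstddef>
#include <cstring>
#include <limits>
#include <type_traits>

// Zero an array with memset when T is an integer type whose value bits fill
// the whole object (no padding bits); otherwise fall back to std::fill.
template <typename T>
void zero_array(T* p, std::size_t n)
{
    // Constant expression, so the compiler can discard the dead branch.
    const bool no_padding =
        std::is_integral<T>::value &&
        std::numeric_limits<T>::digits + (std::numeric_limits<T>::is_signed ? 1 : 0)
            == static_cast<int>(sizeof(T) * CHAR_BIT);

    if (no_padding)
        std::memset(p, 0, n * sizeof(T));   // all-bits-zero is the value zero here
    else
        std::fill(p, p + n, T());
}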

On the third hand, I think the probability of that template selecting a
std::fill would be so near zero as to be practically equivalent to zero.

so why not use fill() family in the first place?


Well my position is that the added safety etc. of std::fill in general
far outweighs the possible (and in practice more than possible) higher
efficiency of a memset, so that std::fill would nearly always be my
choice; I think we're in agreement there.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Jul 22 '05 #33

P: n/a
Alf P. Steinbach wrote:
so why not use fill() family in the first place?

Well my position is that the added safety etc. of std::fill in general
far outweights the possible (and in practice more than possible) higher
efficiency of a memset, so that std::fill would nearly always be my
choice; I think we're in agreement there.

And memset() is not guaranteed to be more efficient than fill()
anyway, since it may itself use a loop internally to assign values to bytes
as unsigned chars.

I had said we should use the fill() family unless we cannot do otherwise.
By that I meant that when the use of fill() raises performance
concerns and the use of memset() yields significant benefits, then we
should use memset().


Regards,

Ioannis Vranos
Jul 22 '05 #34

P: n/a
Ioannis Vranos wrote:

Which memset() is not guaranteed to be more efficient than fill()
anyway, since it can also use a loop inside it to assign values on bytes
as unsigned chars.

which would be slower than assigning int types with 0 using fill(), by
the way.

I had said, we should use fill() family unless we cannot do otherwise.
With that I meant that when the use of fill() raises performance
concerns while the use of memset() yields significant benefits, then we
should use memset().



Regards,

Ioannis Vranos
Jul 22 '05 #35

P: n/a
Alf P. Steinbach posted:

In short it boils down to "is undefined behavior" versus "can be
undefined behavior". In the case of padding bits accessible to the
program it is undefined behavior. In the more general case of integer
types in C++ it isn't necessarily undefined behavior, and the
possibility of UB is only on antiquated machinery for which I'm not
even sure that C++ compilers exist, and furthermore that remote, purely
academic possibility can be avoided by a simple compile time assertion.


The only problem I myself can see with messing with padding bits is the
following:

struct Poo
{
    char a;
    char b;
    char c;
    long d;
};
If that long in there has to be on a 4-byte boundary or whatever, then in
memory it'll look like so:

__________
| |
| a |
|__________|
| |
| b |
|__________|
| |
| c |
|__________|
| |
| padding |
|__________|
| |
| d |
|__________|
From looking at that, it may seem harmless to alter the padding bits. But
consider if you had the following:

int main()
{
    Poo poo;

    char kkar;
}
The system could very well stick kkar into the vacant space:

__________
| |
| a |
|__________|
| |
| b |
|__________|
| |
| c |
|__________|
| |
| kkar |
|__________|
| |
| d |
|__________|
Can that happen?

Other than that, I see no reason for not messing with padding bits.
-JKop

Jul 22 '05 #36
