Bytes | Software Development & Data Engineering Community

Size of int (once again, sorry)

Hi, sorry about the multiple posting, technical difficulties....

-----
What exactly does the size of the int datatype depend on in C++?
Recently I've heard that it depends on the machine's type, i.e. on 16-bit
machines it's 16 bits, on 32-bit machines it's 32, etc.
Is this true? Is this true for _all_ C++ compilers?
Nov 17 '05 #1
Agoston Bejo wrote:
What exactly does the size of the int datatype depend on in C++?
Recently I've heard that it depends on the machine's type, i.e. on
16-bit machines it's 16 bits, on 32-bit machines it's 32, etc.
Is this true? Is this true for _all_ C++ compilers?


No, that's wrong. If you run a 16-bit compiler on a 32-bit computer, int
will typically be 16 bits.

In other words, it's up to the compiler to decide the size of an int.
--
Sigurd
http://utvikling.com
Nov 17 '05 #2
"Agoston Bejo" <gu***@freemail.hu> wrote in message news:<#Y**************@TK2MSFTNGP09.phx.gbl>...
Hi, sorry about the multiple posting, technical difficulties....

-----
What exactly does the size of the int datatype depend on in C++?
Recently I've heard that it depends on the machine's type, i.e. on 16-bit
machines it's 16 bits, on 32-bit machines it's 32, etc.
Is this true? Is this true for _all_ C++ compilers?


An int is defined as being the "natural size" for the target machine,
so a 32-bit compiler would have 32-bit ints, and so on.

Arnaud
MVP - VC
Nov 17 '05 #3
Agoston Bejo wrote:
Hi, sorry about the multiple posting, technical difficulties....

-----
What exactly does the size of the int datatype depend on in C++?
Recently I've heard that it depends on the machine's type, i.e. on 16-bit
machines it's 16 bits, on 32-bit machines it's 32, etc.
Is this true? Is this true for _all_ C++ compilers?


The following relationship holds for the four basic integer types:

sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long)

There are also minimum ranges:

char >= 8 bits
short >= 16 bits
int >= 16 bits
long >= 32 bits

Certain parts of the standard library become impossible to implement
correctly if sizeof(char) == sizeof(int), so on most systems, sizeof(char) <
sizeof(int). See <limits.h> for lots of interesting macros which describe
the integer types for a given system.

Now, to answer your question, int is intended to map to the "natural" word
size for the target architecture, and it's intended to be the first type you
reach for when you need an integer type. So, a compiler targeting a 16 bit
machine will typically define int as 16 bits, a compiler targeting a 32 bit
machine will define int as 32 bits, and so on. Well, almost. A 16 bit int
really is too small in many cases, but a 32 bit int is usually large enough.
So to avoid wasting space with a 64 bit int, not to mention breaking
programs that assume 32 bit int, compilers for 64-bit Windows break with
tradition and keep int 32 bits. I think that's a reasonable decision, but
one has to hope 64-bit CPU designers keep 32 bit ints nice and efficient.
(It would certainly be in their interest to do so. :)
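
For what it's worth, here is a minimal sketch, assuming nothing beyond a
hosted implementation, that prints these properties for whatever compiler you
build it with; the numbers will of course differ between compilers and targets:

#include <climits>
#include <iostream>

int main()
{
    // Everything printed here is implementation-defined; the standard only
    // guarantees the minimum ranges and the sizeof ordering given above.
    std::cout << "CHAR_BIT      = " << CHAR_BIT      << '\n'
              << "sizeof(short) = " << sizeof(short) << '\n'
              << "sizeof(int)   = " << sizeof(int)   << '\n'
              << "sizeof(long)  = " << sizeof(long)  << '\n'
              << "INT_MIN       = " << INT_MIN       << '\n'
              << "INT_MAX       = " << INT_MAX       << '\n';
    return 0;
}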

--
Doug Harrison
Microsoft MVP - Visual C++
Nov 17 '05 #4
In message <v0********************************@4ax.com> of Thu, 30 Sep
2004 09:58:59 in microsoft.public.vc.language, "Doug Harrison [MVP]"
<ds*@mvps.org> writes
[snip]
Certain parts of the standard library become impossible to implement
correctly if sizeof(char) == sizeof(int), so on most systems, sizeof(char) <
sizeof(int). See <limits.h> for lots of interesting macros which describe
the integer types for a given system.


What parts and why?
--
Walter Briscoe
Nov 17 '05 #5
Walter Briscoe wrote:
In message <v0********************************@4ax.com> of Thu, 30 Sep
2004 09:58:59 in microsoft.public.vc.language, "Doug Harrison [MVP]"
<ds*@mvps.org> writes
[snip]
Certain parts of the standard library become impossible to implement
correctly if sizeof(char) == sizeof(int), so on most systems, sizeof(char) <
sizeof(int). See <limits.h> for lots of interesting macros which describe
the integer types for a given system.


What parts and why?


Off the top of my head, fgetc and the <ctype.h> functions. They need to be
able to distinguish EOF from char values, when stored in int. IOW, there
needs to be an int value that doesn't correspond to any char value. (I think
that might also be true for the C++ char_traits<char> specialization, but
there you probably aren't restricted to just int; any standard integer type
larger than char would do. I'd have to double-check the standard to be sure
about that, though.)
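
To make that concrete, here is a minimal sketch of fgetc feeding the <ctype.h>
functions; the file name is just a placeholder. The value stays in an int the
whole time, which is exactly what lets EOF be told apart from every unsigned
char value:

#include <cctype>
#include <cstdio>

int main()
{
    std::FILE* f = std::fopen("input.txt", "r");   // placeholder name
    if (!f)
        return 1;

    int letters = 0;
    int c;                                  // int, not char
    while ((c = std::fgetc(f)) != EOF)      // EOF never matches an unsigned char value
    {
        if (std::isalpha(c))                // c is an unsigned char value here, so this is safe
            ++letters;
    }
    std::printf("letters: %d\n", letters);
    std::fclose(f);
    return 0;
}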

--
Doug Harrison
Microsoft MVP - Visual C++
Nov 17 '05 #6
Doug Harrison [MVP] wrote:
Walter Briscoe wrote:
In message <v0********************************@4ax.com> of Thu, 30 Sep
2004 09:58:59 in microsoft.public.vc.language, "Doug Harrison [MVP]"
<ds*@mvps.org> writes
[snip]
Certain parts of the standard library become impossible to implement
correctly if sizeof(char) == sizeof(int), so on most systems, sizeof(char) <
sizeof(int). See <limits.h> for lots of interesting macros which describe
the integer types for a given system.
What parts and why?


Note: Below I've replaced "char" with "unsigned char", which makes it right.
Off the top of my head, fgetc and the <ctype.h> functions. They need to be
able to distinguish EOF from unsigned char values, when stored in int. IOW, there
needs to be an int value that doesn't correspond to any unsigned char value.


An even better way to put it is this. In order for fgetc and <ctype.h> to
work right, int has to be able to faithfully hold all values of unsigned
char, plus EOF. Since all bit patterns of unsigned char are valid unsigned
char values, this means sizeof(int) has to be greater than sizeof(char).

It's even messier with char_traits<char>, so I didn't try to fix that part,
but the same consideration applies to its char_type and int_type types. See
the subthread starting here for more on the aforementioned messiness:

http://groups.google.com/groups?selm...rldnet.att.net

Someone has recently proposed at least a partial way to clean this up:

http://www.open-std.org/jtc1/sc22/wg...ctive.html#467
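
On the C++ side, here is a rough sketch of the corresponding machinery, using
nothing beyond the standard char_traits interface: char_traits<char>::int_type
plays the role that int plays for fgetc, and eof() plays the role of EOF:

#include <iostream>
#include <string>   // declares std::char_traits

int main()
{
    typedef std::char_traits<char> traits;

    traits::int_type i = std::cin.get();           // int_type, not char
    if (traits::eq_int_type(i, traits::eof()))
        std::cout << "end of input (or error)\n";
    else
        std::cout << "read: " << traits::to_char_type(i) << '\n';
    return 0;
}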

--
Doug Harrison
Microsoft MVP - Visual C++
Nov 17 '05 #7
Hi,

"Doug Harrison [MVP]" <ds*@mvps.org> wrote in message
news:r3********************************@4ax.com...
(...) Since all bit patterns of unsigned char are valid unsigned
char values, this means sizeof(int) has to be greater than sizeof(char).


I believe that is somewhat wrong, because C++ does not specify that every
value you can put into an unsigned char must correspond to a character.
In fact, I used to work with a C++ compiler where sizeof(char) == sizeof(long)
and char was 64 bits long.

Lucas/
Nov 17 '05 #8
Lucas Galfaso wrote:
Hi,

"Doug Harrison [MVP]" <ds*@mvps.org> wrote in message
news:r3********************************@4ax.com.. .
(...) Since all bit patterns of unsigned char are valid unsigned
char values, this means sizeof(int) has to be greater than sizeof(char).
I believe that is somewhat wrong, because C++ does not specify that every
value you can put into an unsigned char must correspond to a character.


I'm talking about unsigned char, not "characters", in the context of fgetc
and <ctype.h> functions.
In fact, I used to work with a C++ compiler where sizeof(char) == sizeof(long)
and char was 64 bits long.


Let me give you an example. For that compiler, we can use the relationship I
posted earlier to deduce that sizeof(int) is the same as sizeof(char) and
sizeof(long). The macro EOF is some integer constant expression, typically
an int equal to -1. For the sake of argument, let's assume the usual two's
complement signed integer representation. Then EOF has the representation
with all bits set. Now consider the definition of fgetc. It reads unsigned
chars and returns them as ints. All bit patterns of unsigned char are valid,
so unless int can represent all of them, we've got a problem. But two's
complement signed ints can do this, so that's not an issue. The problem is
that fgetc returns EOF when it reaches end of file or encounters an error,
and EOF is -1, which has all bits set in two's complement. Thus, there's no
way to distinguish an EOF return value from an unsigned char in the file
that had all bits set. So, like I said, a compiler that wants to implement
fgetc needs sizeof(int) > sizeof(char).

I'm curious how the compiler you described dealt with this. Perhaps it was a
freestanding implementation, one that doesn't supply all the standard
library, in particular the problematic <ctype.h> and <stdio.h>.
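
One way to see the constraint concretely: on a hosted implementation you can
enumerate every unsigned char value and check that none of them, converted to
int the way fgetc does it, compares equal to EOF. That only works because int
has more values than unsigned char; if the two types were the same size, some
value would have to collide. A minimal sketch:

#include <climits>
#include <cstdio>

int main()
{
    int collisions = 0;
    for (unsigned v = 0; v <= static_cast<unsigned>(UCHAR_MAX); ++v)
    {
        // This is the value fgetc would return for a byte with pattern v.
        int as_int = static_cast<unsigned char>(v);
        if (as_int == EOF)
            ++collisions;
    }
    std::printf("unsigned char values colliding with EOF: %d\n", collisions);   // prints 0
    return 0;
}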

--
Doug Harrison
Microsoft MVP - Visual C++
Nov 17 '05 #9

"Doug Harrison [MVP]" <ds*@mvps.org> wrote in message
news:s5********************************@4ax.com...
Lucas Galfaso wrote:
Hi,

"Doug Harrison [MVP]" <ds*@mvps.org> wrote in message
news:r3********************************@4ax.com. ..
(...) Since all bit patterns of unsigned char are valid unsigned
char values, this means sizeof(int) has to be greater than sizeof(char).
I believe that is somewhat wrong, because C++ does not specify that every
value you can put into an unsigned char must correspond to a character.


I'm talking about unsigned char, not "characters", in the context of fgetc
and <ctype.h> functions.
In fact, I used to work with a C++ compiler where sizeof(char) == sizeof(long)
and char was 64 bits long.


Let me give you an example. For that compiler, we can use the relationship I
posted earlier to deduce that sizeof(int) is the same as sizeof(char) and
sizeof(long). The macro EOF is some integer constant expression, typically
an int equal to -1. For the sake of argument, let's assume the usual two's
complement signed integer representation. Then EOF has the representation
with all bits set. Now consider the definition of fgetc. It reads unsigned
chars and returns them as ints. All bit patterns of unsigned char are valid,
so unless int can represent all of them, we've got a problem. But two's
complement signed ints can do this, so that's not an issue. The problem is
that fgetc returns EOF when it reaches end of file or encounters an error,
and EOF is -1, which has all bits set in two's complement. Thus, there's no
way to distinguish an EOF return value from an unsigned char in the file
that had all bits set. So, like I said, a compiler that wants to implement
fgetc needs sizeof(int) > sizeof(char).

I'm curious how the compiler you described dealt with this. Perhaps it was a
freestanding implementation, one that doesn't supply all the standard
library, in particular the problematic <ctype.h> and <stdio.h>.


I am not 100% sure (this was 5+ years ago), but I think we did not have fgetc,
just the C++ libraries. Anyway, the lack of that library does not make the
compiler non-conforming.

--
Doug Harrison
Microsoft MVP - Visual C++

Nov 17 '05 #10
Lucas Galfaso wrote:
I am not 100% sure (this was 5+ years ago), but I think we did not have fgetc,
just the C++ libraries. Anyway, the lack of that library does not make the
compiler non-conforming.


True. Like I said:

<q>
Certain parts of the standard library become impossible to implement
correctly if sizeof(char) == sizeof(int), so on most systems, sizeof(char) <
sizeof(int).
</q>

<q>
So, like I said, a compiler that wants to implement
fgetc needs sizeof(int) > sizeof(char).

I'm curious how the compiler you described dealt with this. Perhaps it was a
freestanding implementation, one that doesn't supply all the standard
library, in particular the problematic <ctype.h> and <stdio.h>.
</q>

Those headers are required by a hosted C++ implementation, i.e. what people
normally think of as Standard C++, but not by freestanding implementations,
which you might encounter in embedded environments.

--
Doug Harrison
Microsoft MVP - Visual C++
Nov 17 '05 #11

"Doug Harrison [MVP]" <ds*@mvps.org> skrev i meddelandet
news:s5********************************@4ax.com...
Lucas Galfaso wrote:
Hi,

"Doug Harrison [MVP]" <ds*@mvps.org> wrote in message
news:r3********************************@4ax.com. ..
(...) Since all bit patterns of unsigned char are valid unsigned
char values, this means sizeof(int) has to be greater than sizeof(char).


I believe that is somewhat wrong, because C++ does not specify that every
value you can put into an unsigned char must correspond to a character.


I'm talking about unsigned char, not "characters", in the context of fgetc
and <ctype.h> functions.
In fact, I used to work with a C++ compiler where sizeof(char) == sizeof(long)
and char was 64 bits long.


Let me give you an example. For that compiler, we can use the relationship I
posted earlier to deduce that sizeof(int) is the same as sizeof(char) and
sizeof(long). The macro EOF is some integer constant expression, typically
an int equal to -1. For the sake of argument, let's assume the usual two's
complement signed integer representation. Then EOF has the representation
with all bits set. Now consider the definition of fgetc. It reads unsigned
chars and returns them as ints. All bit patterns of unsigned char are valid,
so unless int can represent all of them, we've got a problem. But two's
complement signed ints can do this, so that's not an issue. The problem is
that fgetc returns EOF when it reaches end of file or encounters an error,
and EOF is -1, which has all bits set in two's complement. Thus, there's no
way to distinguish an EOF return value from an unsigned char in the file
that had all bits set. So, like I said, a compiler that wants to implement
fgetc needs sizeof(int) > sizeof(char).


No, that is not a requirement. If the char type is large enough (like
CHAR_BIT == 32), it can't possibly use all bit patterns for valid
characters. One pattern can then be reserved for EOF.

That is exactly what we generally do for wchar_t and WEOF.
Bo Persson
Nov 17 '05 #12
Bo Persson wrote:
So, like I said, a compiler that wants to
implement fgetc needs sizeof(int) > sizeof(char).
No, that is not a requirement.


Yes, it is.
If the char type is large enough (like
CHAR_BIT == 32), it can't possibly use all bit patterns for valid
characters. One pattern can then be reserved for EOF.
Please review the documentation for fgetc and <ctype.h>. They deal in
unsigned char cast to int. (And why would plain char be restricted as you
describe? What if plain char is unsigned?)
That is exactly what we generally do for wchar_t and WEOF.


I'm not familiar with those details so won't comment.

--
Doug Harrison
Microsoft MVP - Visual C++
Nov 17 '05 #13

"Doug Harrison [MVP]" <ds*@mvps.org> skrev i meddelandet
news:o5********************************@4ax.com...
Bo Persson wrote:
So, like I said, a compiler that wants to
implement fgetc needs sizeof(int) > sizeof(char).
No, that is not a requirement.


Yes, it is.
If the char type is large enough (like
CHAR_BIT == 32), it can't possibly use all bit patterns for valid
characters. One pattern can then be reserved for EOF.


Please review the documentation for fgetc and <ctype.h>. They deal in
unsigned char cast to int. (And why would plain char be restricted as you
describe? What if plain char is unsigned?)


But if char, unsigned char, and int are all the same size (like 32
bits), not all values will be used for characters so there is room for
reserving one value for EOF.

That is exactly what we generally do for wchar_t and WEOF.


I'm not familiar with those details so won't comment.


OK, that is C++, where the wide character type wchar_t uses the value
wchar_t(-1) as the end-of-file signal WEOF for wide streams.
Bo Persson
Nov 17 '05 #14
Bo Persson wrote:
"Doug Harrison [MVP]" <ds*@mvps.org> skrev i meddelandet
news:o5********************************@4ax.com.. .
Bo Persson wrote:
So, like I said, a compiler that wants to
implement fgetc needs sizeof(int) > sizeof(char).

No, that is not a requirement.
Yes, it is.
If the char type is large enough (like
CHAR_BIT == 32), it can't possibly use all bit patterns for valid
characters. One pattern can then be reserved for EOF.


Please review the documentation for fgetc and <ctype.h>. They deal in
unsigned char cast to int. (And why would plain char be restricted as you
describe? What if plain char is unsigned?)


But if char, unsigned char, and int are all the same size (like 32
bits)


Note that char and unsigned char are always the same size. For any integer
type X, signed X and unsigned X are the same size, and for the type char,
plain char is a distinct type otherwise implemented the same as signed char
or unsigned char.
not all values will be used for characters so there is room for
reserving one value for EOF.


There's no basis for saying that.

At any rate, it's irrelevant for fgetc and <ctype.h>. Did you review their
documentation? Here's a proof of what I've been saying:

1. fgetc reads unsigned chars and returns them as ints.
2. For character types, all bits in the object representation participate in
the value representation.
3. All bit patterns of unsigned char are valid numbers.
4. Thus, if unsigned char and int are the same size, unsigned char will
necessarily use up all the values of int.
5. Therefore, there is no int value left over to represent EOF.
6. One cannot implement fgetc, <ctype.h>, and a handful of other library
functions if char and int are the same size, because the int value EOF
cannot be distinguished from a valid unsigned char represented by int, and
those functions require that distinction.
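
The practical upshot of point 6 for everyday code is to keep the result of
fgetc in an int until it has been compared with EOF. A minimal sketch, with a
placeholder file name; storing the result straight into a char either turns a
0xFF byte into a false EOF (signed char) or makes the EOF test never succeed
(unsigned char):

#include <cstdio>

int main()
{
    std::FILE* f = std::fopen("example.bin", "rb");   // placeholder name
    if (!f)
        return 1;

    int c;                                  // int, not char
    while ((c = std::fgetc(f)) != EOF)      // the distinction only exists in int
    {
        unsigned char byte = static_cast<unsigned char>(c);
        std::printf("%02x ", static_cast<unsigned>(byte));
    }
    std::printf("\n");
    std::fclose(f);
    return 0;
}
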
That is exactly what we generally do for wchar_t and WEOF.


I'm not familiar with those details so won't comment.


OK, that is C++, where the wide character type wchar_t uses the value
wchar_t(-1) as the end-of-file signal WEOF for wide streams.


Actually, WEOF appears in the Standard C header <wchar.h>, and it has the
type wint_t, which is not necessarily wchar_t:

http://www.lysator.liu.se/c/na1.html
<q>
typedef ... wint_t;
An integral type unchanged by integral promotion. It must be capable of
holding every valid wide character, and also the value WEOF (described
below). It can be the same type as wchar_t.

WEOF
is an object-like macro which evaluates to a constant expression of type
wint_t. It need not be negative nor equal EOF, but it serves the same
purpose: the value, which must not be a valid wide character, is used to
represent an end of file or as an error indication.
</q>

This is explicitly spelled out compared to the situation with unsigned char,
int, EOF, fgetc, and <ctype.h>, for which you have to synthesize the
constraints from more fundamental properties.
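
For comparison, here is a minimal sketch of the wide-character analogue, again
with a placeholder file name: fgetwc returns wint_t, and the result is
compared against WEOF rather than EOF:

#include <cstdio>
#include <cwchar>

int main()
{
    std::FILE* f = std::fopen("example.txt", "r");   // placeholder name
    if (!f)
        return 1;

    std::wint_t wc;                         // wint_t, not wchar_t
    while ((wc = std::fgetwc(f)) != WEOF)
        std::putwchar(static_cast<wchar_t>(wc));
    std::fclose(f);
    return 0;
}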

--
Doug Harrison
Microsoft MVP - Visual C++
Nov 17 '05 #15

