Natural size: int


On modern 32-Bit PC's, the following setup is common:

char: 8-Bit
short: 16-Bit
int: 32-Bit
long: 32-Bit

"char" is commonly used to store text characters.
"short" is commonly used to store large arrays of numbers, or perhaps wide
text characters (via wchar_t).
"int" is commonly used to store an integer.
"long" is commonly used to store an integer greater than 65535.

Now that 64-Bit machines are coming in, how should the integer types be
distributed? It makes sense that "int" should be 64-Bit... but what should
be done with "char" and "short"? Would the following be a plausible setup?

char: 8-Bit
short: 16-Bit
int: 64-Bit
long: 64-Bit

Or perhaps should "short" be 32-Bit? Or should "char" become 16-Bit (i.e.
16 == CHAR_BIT).

Another semi-related question:

If we have a variable which shall store the quantity of elements in an
array, then should we use "size_t"? On a system where "size_t" maps to
"long unsigned" rather than "int unsigned", it would seem to be inefficient
most of the time. "int unsigned" guarantees us at least 65535 array
elements -- what percentage of the time do we have an array any bigger than
that? 2% maybe? Therefore would it not make sense to use unsigned rather
than size_t to store array lengths (or the positive result of subtracting
pointers)?
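
(As a rough illustration, here is a throwaway program that prints what a given
implementation actually chose for these types; the output is
implementation-specific, which is rather the point of the question.)

#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Prints the widths this particular implementation chose.  The output
   differs from compiler to compiler. */
int main(void)
{
    printf("CHAR_BIT       = %d\n", CHAR_BIT);
    printf("sizeof(short)  = %lu\n", (unsigned long)sizeof(short));
    printf("sizeof(int)    = %lu\n", (unsigned long)sizeof(int));
    printf("sizeof(long)   = %lu\n", (unsigned long)sizeof(long));
    printf("sizeof(size_t) = %lu\n", (unsigned long)sizeof(size_t));
    printf("sizeof(void *) = %lu\n", (unsigned long)sizeof(void *));
    return 0;
}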

--

Frederick Gotham
Aug 8 '06 #1
Frederick Gotham wrote:
On modern 32-Bit PC's, the following setup is common:

char: 8-Bit
short: 16-Bit
int: 32-Bit
long: 32-Bit

"char" is commonly used to store text characters.
"short" is commonly used to store large arrays of numbers, or perhaps wide
text characters (via wchar_t).
"int" is commonly used to store an integer.
"long" is commonly used to store an integer greater than 65535.

Now that 64-Bit machines are coming in, how should the integer types be
distributed? It makes sense that "int" should be 64-Bit... but what should
be done with "char" and "short"? Would the following be a plausible setup?

char: 8-Bit
short: 16-Bit
int: 64-Bit
long: 64-Bit

Or perhaps should "short" be 32-Bit? Or should "char" become 16-Bit (i.e.
16 == CHAR_BIT).
For windows systems, Microsoft decided that with 64 bit machines it
will be
char 8, short 16, int 32, long 32, __int64 64

For unix systems, gcc decided that
char 8, short 16, int 32, long 64, long long 64
>
Another semi-related question:

If we have a variable which shall store the quantity of elements in an
array, then should we use "size_t"? On a system where "size_t" maps to
"long unsigned" rather than "int unsigned", it would seem to be inefficient
most of the time. "int unsigned" guarantees us at least 65535 array
elements -- what percentage of the time do we have an array any bigger than
that? 2% maybe? Therefore would it not make sense to use unsigned rather
than size_t to store array lengths (or the positive result of subtracting
pointers)?
There is no difference in 32 bit machines since a register will be 32
bits. If you fill 16 bits only, the other are wasted.

If you store the index data in memory in global variables or in disk,
where space is more important you *could* have some space gains by
using a short, or even a char. But beware of alignment issues. The
compiler will align data to 32 bits in most machines so the gains
could be very well zero.
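
(A small sketch of the alignment point: on many common ABIs the two structures
below come out the same size, so the short buys nothing. The exact numbers are
implementation-specific.)

#include <stdio.h>

/* Illustrative only: whether a narrower member really saves memory depends
   on the padding the compiler inserts for alignment. */
struct with_short { short count; int value; };   /* 2 + 2 padding + 4 on many ABIs */
struct with_int   { int   count; int value; };   /* 4 + 4 */

int main(void)
{
    printf("with_short: %lu bytes\n", (unsigned long)sizeof(struct with_short));
    printf("with_int  : %lu bytes\n", (unsigned long)sizeof(struct with_int));
    return 0;
}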
Aug 8 '06 #2

"Frederick Gotham" <fg*******@SPAM.comwrote in message
news:Xs*******************@news.indigo.ie...
>
On modern 32-Bit PC's, the following setup is common:

char: 8-Bit
short: 16-Bit
int: 32-Bit
long: 32-Bit

"char" is commonly used to store text characters.
"short" is commonly used to store large arrays of numbers, or perhaps wide
text characters (via wchar_t).
"int" is commonly used to store an integer.
"long" is commonly used to store an integer greater than 65535.

Now that 64-Bit machines are coming in, how should the integer types be
distributed? It makes sense that "int" should be 64-Bit... but what should
be done with "char" and "short"? Would the following be a plausible setup?

char: 8-Bit
short: 16-Bit
int: 64-Bit
long: 64-Bit
If you use int you want an integer.
If the manufacturer has kindly provided 64 bit registers, obviously he wants
you to use 64-bit integers.
So it seems pretty obvious what to do.
>
Or perhaps should "short" be 32-Bit? Or should "char" become 16-Bit (i.e.
16 == CHAR_BIT).

Another semi-related question:

If we have a variable which shall store the quantity of elements in an
array, then should we use "size_t"? On a system where "size_t" maps to
"long unsigned" rather than "int unsigned", it would seem to be
inefficient
most of the time. "int unsigned" guarantees us at least 65535 array
elements -- what percentage of the time do we have an array any bigger
than
that? 2% maybe? Therefore would it not make sense to use unsigned rather
than size_t to store array lengths (or the positive result of subtracting
pointers)?
size_t was a nice idea - a type to hold a size of an object in memory.
Sadly the implications weren't thought through - if you can't use an int to
index an array, then the machine manufacturer has done something weird and
wonderful with his address bus.

characters for character data
integers for integral data
double precision for floating point numbers.

That's all the world really needs, except byte for chunks of 8-bit data in
the rare cases where memory size matters.
--
www.personal.leeds.ac.uk/~bgy1mm
freeware games to download.
Aug 8 '06 #3
Malcolm wrote:
"Frederick Gotham" <fg*******@SPAM.comwrote in message
news:Xs*******************@news.indigo.ie...
[size and nature of integral types]
size_t was a nice idea - a type to hold a size of an object in memory.
Sadly the implications weren't thought through - if you can't use an int to
index an array, then the machine manufacturer has done something weird and
wonderful with his address bus.
Consider a system with 64-bit pointers and 32-bit ints -- not that
far-fetched, right? On such a system size_t might be a 64-bit type as well.
You can still *use* ints to index an array, but not the huge arrays the
system might allow. (Whether that matters is another thing.)
characters for character data
You mean like Unicode?
integers for integral data
25! = 15511210043330985984000000
double precision for floating point numbers.
Which is double of what?
That's all the world really needs, except byte for chunks of 8-bit data in
the rare cases where memory size matters.
That is almost never the reason why bytes are used.

S.
Aug 8 '06 #4
jacob navia posted:
For windows systems, Microsoft decided that with 64 bit machines it
will be
char 8, short 16, int 32, long 32, __int64 64

For unix systems, gcc decided that
char 8, short 16, int 32, long 64, long long 64


What's the point in having a 64-Bit system if it's not taken advantage of? It
would be less efficient to use 32-Bit integers on a 64-Bit machine. It would
probably be more efficient to use 32-Bit integers on a 32-Bit machine rather
than on a 64-Bit machine, no?

When people use an "int", they expect it to be the most efficient integer
type.

--

Frederick Gotham
Aug 8 '06 #5
"Frederick Gotham" <fg*******@SPAM.comwrote in message
news:Xs*******************@news.indigo.ie...
>
On modern 32-Bit PC's, the following setup is common:

char: 8-Bit
short: 16-Bit
int: 32-Bit
long: 32-Bit
Or long might be 64-bit. Or int might be 16-bit. You can find just
about every allowable combination out there in the wild.

....
Now that 64-Bit machines are coming in, how should the integer types
be distributed? It makes sense that "int" should be 64-Bit... but
what
should be done with "char" and "short"? Would the following be a
plausible setup?

char: 8-Bit
short: 16-Bit
int: 64-Bit
long: 64-Bit
That's referred to as ILP64, and there are indeed systems out there
like that. However, I32LP64 and IL32LLP64 are arguably more common.
Or perhaps should "short" be 32-Bit? Or should "char" become 16-Bit
(i.e. 16 == CHAR_BIT).
A char should be the smallest addressable unit of memory; if your
system only supports 16-bit (or greater) loads, it may be reasonable
to have CHAR_BIT==16, but expect to have to hack up virtually every
program you try to port. Even Cray and the DEC Alpha had to
synthesize 8-bit loads for the char type, because not doing so was
suicide.
Another semi-related question:

If we have a variable which shall store the quantity of elements
in an
array, then should we use "size_t"? On a system where "size_t" maps
to "long unsigned" rather than "int unsigned", it would seem to be
inefficient most of the time.
You assume that shorter ints are somehow more efficient than longer
ints; many modern processors have slow shorts and int is no faster
than long or long long.

Premature optimization is the root of all evil. Avoid the temptation
unless profiling shows it matters and the change actually helps in a
significant way. Then document the heck out of it.
"int unsigned" guarantees us at least 65535 array elements -- what
percentage of the time do we have an array any bigger than that?
2% maybe? Therefore would it not make sense to use unsigned
rather than size_t to store array lengths
If you use int (or long), you always have to worry about what happens
if/when you're wrong; use size_t and you can promptly forget about
it.

I've seen all kinds of systems which crashed when one more record was
added, and it was always due to coders who assumed "we'll never have
more than 32767 employees/customers" or some such.
(or the positive result of subtracting pointers)?
The result of subtracting pointers is already ptrdiff_t, so why use
something else? ssize_t is about the only reasonable replacement, but
it's not portable. size_t is fine if you test to make sure the
difference is positive first. Do you really care so much about the
extra two or three letters it takes to use a type that is _guaranteed_
to work that you're willing to accept your program randomly breaking?
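
(A minimal sketch of those types in use; the names are just illustrative.)

#include <stddef.h>
#include <stdio.h>

/* Pointer subtraction yields ptrdiff_t, and sizes/element counts are size_t.
   A ptrdiff_t known to be non-negative converts to size_t safely. */
int main(void)
{
    int a[100];
    int *first = a;
    int *last  = a + 100;            /* one past the end: valid to point at */
    ptrdiff_t diff = last - first;   /* 100, as a signed type */
    size_t count = 0;

    if (diff >= 0)                   /* check the sign before converting */
        count = (size_t)diff;

    printf("count = %lu\n", (unsigned long)count);
    return 0;
}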

S

--
Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking

--
Posted via a free Usenet account from http://www.teranews.com

Aug 8 '06 #6
Frederick Gotham <fg*******@SPAM.com> writes:
On modern 32-Bit PC's, the following setup is common:

char: 8-Bit
short: 16-Bit
int: 32-Bit
long: 32-Bit

"char" is commonly used to store text characters.
"short" is commonly used to store large arrays of numbers, or perhaps wide
text characters (via wchar_t).
"int" is commonly used to store an integer.
"long" is commonly used to store an integer greater than 65535.

Now that 64-Bit machines are coming in, how should the integer types be
distributed? It makes sense that "int" should be 64-Bit... but what should
be done with "char" and "short"? Would the following be a plausible setup?

char: 8-Bit
short: 16-Bit
int: 64-Bit
long: 64-Bit

Or perhaps should "short" be 32-Bit? Or should "char" become 16-Bit (i.e.
16 == CHAR_BIT).
Making int 64 bits leaves a gap in the type system; unless the
implementation has C99 extended integer types, either there's no
16-bit integer type or there's no 32-bit integer type.

A common setup on 64-bit systems is:

char: 8 bits
short: 16 bits
int: 32 bits
long: 64 bits
long long: 64 bits

Of course there are other possibilities.

Most 64-bit systems, I think, can perform 32-bit operations reasonably
efficiently, so there's not much disadvantage in defining int as 32
bits.

Also, unless you're an implementer, you don't have much influence.
Compiler writers get to decide how big the fundamental types are going
to be; as users, most of us just have to deal with whatever they
provide.
Another semi-related question:

If we have a variable which shall store the quantity of elements in an
array, then should we use "size_t"? On a system where "size_t" maps to
"long unsigned" rather than "int unsigned", it would seem to be inefficient
most of the time. "int unsigned" guarantees us at least 65535 array
elements -- what percentage of the time do we have an array any bigger than
that? 2% maybe? Therefore would it not make sense to use unsigned rather
than size_t to store array lengths (or the positive result of subtracting
pointers)?
If you're wondering about percentages like that, then you're
approaching the problem from the wrong perspective.

If you're sure an array can never have more than 65535 elements, go
ahead and use unsigned int to index it. If you're not sure how big it
can be, size_t is a reasonable choice.
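
(A small sketch of the size_t route, with a count deliberately chosen to be
too big for a 16-bit unsigned int.)

#include <stddef.h>
#include <stdio.h>

/* sizeof already yields size_t, so using size_t for the count loses nothing. */
static double samples[70000];   /* more than 65535 elements on purpose */

int main(void)
{
    size_t n = sizeof samples / sizeof samples[0];
    size_t i;
    double sum = 0.0;

    for (i = 0; i < n; i++)
        sum += samples[i];

    printf("%lu elements, sum %.1f\n", (unsigned long)n, sum);
    return 0;
}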

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Aug 8 '06 #7
Frederick Gotham <fg*******@SPAM.com> writes:
What's the point in having a 64-Bit system if it's not taken advantage of? It
would be less efficient to use 32-Bit integers on a 64-Bit machine. It would
probably be more efficient to use 32-Bit integers on a 32-Bit machine rather
than on a 64-Bit machine, no?
This is not necessarily the case. It may be just as efficient to
work with 32- or 64-bit integers on a system with 64-bit
general-purpose registers. Using 32-bit integers can save a good
deal of memory (especially in arrays or structures), so it may
make sense to use a 32-bit `int' on such systems.
--
"If I've told you once, I've told you LLONG_MAX times not to
exaggerate."
--Jack Klein
Aug 8 '06 #8
Frederick Gotham wrote:
jacob navia posted:
>For windows systems, Microsoft decided that with 64 bit machines it
will be
char 8, short 16, int 32, long 32, __int64 64

For unix systems, gcc decided that
char 8, short 16, int 32, long 64, long long 64

What's the point in having a 64-Bit system if it's not taken advantage of?
Why do you think it's not taken advantage of?
It
would be less efficient to use 32-Bit integers on a 64-Bit machine.
Of course a 64-bit ALU or instruction path might be able to handle two
32-bit integers in a similar number of clock cycles as the 32-bit system
can handle one.

A 64-bit architecture might be able to do things in the instruction
decoding, pipeline, or other datapath considerations that are not
possible in a 32-bit architecture.

Or maybe, a 64-bit machine deals with 32-bit integers by sign-extending
the leftmost 32-bits. Is that less efficient? Do you have metrics for
your platform?
It would
probably be more efficient to use 32-Bit integers on a 32-Bit machine rather
than on a 64-Bit machine, no?
This is not necessarily true; certainly untrue for every situation.
When people use an "int", they expect it to be the most efficient integer
type.
When people use an "int" in C, they expect it to be a numeric data type
no larger than a "long".

Aug 8 '06 #9
Stephen Sprunk posted:

>Or perhaps should "short" be 32-Bit? Or should "char" become 16-Bit
(i.e. 16 == CHAR_BIT).

A char should be the smallest addressable unit of memory; if your
system only supports 16-bit (or greater) loads, it may be reasonable
to have CHAR_BIT==16, but expect to have to hack up virtually every
program you try to port. Even Cray and the DEC Alpha had to
synthesize 8-bit loads for the char type, because not doing so was
suicide.

I don't see why (unless you're reading/writing data to/from disk perhaps?).

>Another semi-related question:

If we have a variable which shall store the quantity of elements
in an
array, then should we use "size_t"? On a system where "size_t" maps
to "long unsigned" rather than "int unsigned", it would seem to be
inefficient most of the time.

You assume that shorter ints are somehow more efficient than longer
ints; many modern processors have slow shorts and int is no faster
than long or long long.

No, I presume that int is faster than long (or they're both as fast as each
other).

I would expect the following:

speed_of(int) >= speed_of(long)

speed_of(int) >= speed_of(short) >= speed_of(char)

--

Frederick Gotham
Aug 8 '06 #10


Frederick Gotham wrote On 08/08/06 15:52,:
jacob navia posted:

>>For windows systems, Microsoft decided that with 64 bit machines it
will be
char 8, short 16, int 32, long 32, __int64 64

For unix systems, gcc decided that
char 8, short 16, int 32, long 64, long long 64


What's the point in having a 64-Bit system if it's not taken advantage of? It
Please define precisely what you mean by "a 64-Bit system."
What specific characteristic or set of characteristics causes
you to classify system X as "64-Bit" and system Y as "48-bit?"

Register width? Memory transaction width? Virtual address
width? Physical address width? What's your chosen criterion,
and why does it dominate the others? "64-bit CPUs" have been
around for a dozen years or so, and have evolved a variety of
different traits; simply saying "64-bit" is not sufficiently
specific.
would be less efficient to use 32-Bit integers on a 64-Bit machine. It would
probably be more efficient to use 32-Bit integers on a 32-Bit machine rather
than on a 64-Bit machine, no?

When people use an "int", they expect it to be the most efficient integer
type.
Please define precisely what you mean by "most efficient,"
or at least by "efficient." Are you concerned about instruction
speed inside the ALU? Access speed between ALU, assorted caches,
RAM, and swap? Total data size, with accompanying effects on
cache misses and page fault rates? Can you formulate a definition
that captures the crucial issues for all applications?

--
Er*********@sun.com

Aug 8 '06 #11
"Frederick Gotham" <fg*******@SPAM.comwrote in message
news:UX*******************@news.indigo.ie...
jacob navia posted:
>For windows systems, Microsoft decided that with 64 bit machines it
will be
char 8, short 16, int 32, long 32, __int64 64

For unix systems, gcc decided that
char 8, short 16, int 32, long 64, long long 64

What's the point in having a 64-Bit system if it's not taken
advantage
of? It would be less efficient to use 32-Bit integers on a 64-Bit
machine.
If the programmer needed a 64-bit type, he would have used long long.
If he used int, he doesn't need more than 16 bits (or, given how bad
many coders are, 32 bits).
It would probably be more efficient to use 32-Bit integers on
a 32-Bit machine rather than on a 64-Bit machine, no?
This is not like the opposite case of 64-bit ints being slow on a
32-bit system.

It is not hard to imagine a 64-bit system that executed half-width
operations faster than full-width operations, and 32-bit ints or longs
definitely take up less memory, reducing cache misses and improving
overall memory performance and footprint. The speed hit you get
merely from using 64-bit pointers often makes systems slower in 64-bit
mode than they are in 32-bit mode; force them to 64-bit ints and it'd
get worse.
When people use an "int", they expect it to be the most efficient
integer
type.
The _largest_ integer type the system supports may not be the _most
efficient_.

If I'm working with really small numbers, using 16-bit shorts or even
8-bit chars is likely to be much faster than (or at worse, the same
speed as) 64+ bit long longs, even if I've got some whiz-bang 128-bit
processor.

S

--
Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking

--
Posted via a free Usenet account from http://www.teranews.com

Aug 8 '06 #12
"Frederick Gotham" <fg*******@SPAM.comwrote in message
news:fk*******************@news.indigo.ie...
Stephen Sprunk posted:

>>Or perhaps should "short" be 32-Bit? Or should "char" become
16-Bit
(i.e. 16 == CHAR_BIT).

A char should be the smallest addressable unit of memory; if your
system only supports 16-bit (or greater) loads, it may be
reasonable
to have CHAR_BIT==16, but expect to have to hack up virtually every
program you try to port. Even Cray and the DEC Alpha had to
synthesize 8-bit loads for the char type, because not doing so was
suicide.


I don't see why (unless you're reading/writing data to/from disk
perhaps?).

>>Another semi-related question:

If we have a variable which shall store the quantity of
elements
in an
array, then should we use "size_t"? On a system where "size_t"
maps
to "long unsigned" rather than "int unsigned", it would seem to be
inefficient most of the time.

You assume that shorter ints are somehow more efficient than longer
ints; many modern processors have slow shorts and int is no faster
than long or long long.

No, I presume that int is faster than long (or they're both as fast
as each
other).

I would expect the following:

speed_of(int) >= speed_of(long)
True in every implementation I've seen; int is typically chosen to be
the fastest size that the platform supports (of at least 16 bits).
However, there is no guarantee that the designers of a particular
implementation possess common sense.
speed_of(int) >= speed_of(short) >= speed_of(char)
Not always true. shorts are slower than chars even on some common
platforms (e.g. Intel P6-based cores), and chars may be faster than
ints and shorts due to reduced memory pressure.

I would expect ints to be faster than shorts or chars in most
situations, but it's possible that memory effects may make chars
faster for particular tasks.

S

--
Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking

--
Posted via a free Usenet account from http://www.teranews.com

Aug 8 '06 #13

Frederick Gotham wrote:
On modern 32-Bit PC's, the following setup is common:

char: 8-Bit
short: 16-Bit
int: 32-Bit
long: 32-Bit

"char" is commonly used to store text characters.
"short" is commonly used to store large arrays of numbers, or perhaps wide
text characters (via wchar_t).
"int" is commonly used to store an integer.
"long" is commonly used to store an integer greater than 65535.

Now that 64-Bit machines are coming in, how should the integer types be
distributed?
Now coming in ?????

Discussions like this were commonplace 10 or 15 years ago when
mainstream 64-bit processors were coming in, and the subject was done
to death then. You'll find lots of interesting discussion of this
subject in the archives of this group and comp.std.c, among other
places.

The main issues to be considered are memory usage and compatibility
with the huge quantities of non-portable code which makes assumptions
(often implicit) about the size of objects. These are part of why 'long
long' eventually made it into C99. The questions of speed of integers
are what led to the int_fast types in C99. It's arguable that most of
the core language changes and extensions in C99 were driven by the
64-bit processors which had become widespread since C89 was
standardized.
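
(For reference, a small C99 sketch of the <stdint.h>/<inttypes.h> families
that grew out of those discussions; each name states whether you are asking
for an exact width, a minimum width, or the fastest type of at least that
width. Note that the exact-width types are optional.)

#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

int main(void)
{
    int32_t       exact = 100000;   /* exactly 32 bits; won't compile if none exists */
    int_least32_t least = 100000;   /* smallest type with at least 32 bits */
    int_fast32_t  fast  = 100000;   /* the implementation's fastest >= 32-bit type */

    printf("%" PRId32 " %" PRIdLEAST32 " %" PRIdFAST32 "\n", exact, least, fast);
    return 0;
}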

Aug 8 '06 #14
"Malcolm" <re*******@btinternet.comwrote in message
news:S7******************************@bt.com...
"Frederick Gotham" <fg*******@SPAM.comwrote in message
news:Xs*******************@news.indigo.ie...
>Now that 64-Bit machines are coming in, how should the integer
types be distributed? It makes sense that "int" should be 64-Bit...
but what should be done with "char" and "short"? Would the
following be a plausible setup?
....
If you use int you want an integer.
You mean an integer which is not required to hold values outside the
range -32767 to 32767, and which is probably the fastest integer type.
If the manufacturer has kindly provided 64 bit registers, obviously
he wants you to use 64-bit integers.
So it seems pretty obvious what to do.
What the manufacturer wants has nothing to do with what I want or what
the compiler writers want. I want my code to (1) work, and (2) run as
fast as possible. The manufacturer wants to extract money from my
wallet. There is no shortage of cases where 32-bit ints are faster
than 64-bit types on a processor that "kindly provided 64 bit
registers."

If I need an integer with at least 64 bits, I'll use long long; if I
want a fast integer, I'll use int; if I want a fast integer with at
least 32 bits, I'll use long. They may not be the same size, and it's
up to the compiler folks to pick which combination is best for a given
platform.
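
(Those "at least" guarantees live in <limits.h>; a throwaway C99 sketch that
prints what a particular compiler actually provides.)

#include <limits.h>
#include <stdio.h>

/* The standard only promises minimum ranges: int at least 16 bits' worth,
   long at least 32, long long at least 64.  The maxima below show what a
   given implementation really gives you. */
int main(void)
{
    printf("INT_MAX   = %d\n", INT_MAX);
    printf("LONG_MAX  = %ld\n", LONG_MAX);
    printf("LLONG_MAX = %lld\n", LLONG_MAX);
    return 0;
}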
size_t was a nice idea - a type to hold a size of an object in
memory.
Sadly the implications weren't thought through - if you can't use an
int to index an array, then the machine manufacturer has done
something weird and wonderful with his address bus.
And such weird and wonderful things are allowed by the standard,
because they existed prior to its creation and the purpose of ANSI C
was, for the most part, to document what existed and not to create an
ideal language.

You can assume int or long (or long long) is good enough, and in most
cases you'll be right, but that non-portable assumption will
eventually come crashing down -- typically while you're doing a demo
for a customer, or years after the original coder left and you're
stuck with maintaining his crap. Use size_t and you'll never need to
worry about it. Thanks, ANSI.

S

--
Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking

--
Posted via a free Usenet account from http://www.teranews.com

Aug 8 '06 #15
Frederick Gotham <fg*******@SPAM.com> writes:
Stephen Sprunk posted:
>>Or perhaps should "short" be 32-Bit? Or should "char" become 16-Bit
(i.e. 16 == CHAR_BIT).

A char should be the smallest addressable unit of memory; if your
system only supports 16-bit (or greater) loads, it may be reasonable
to have CHAR_BIT==16, but expect to have to hack up virtually every
program you try to port. Even Cray and the DEC Alpha had to
synthesize 8-bit loads for the char type, because not doing so was
suicide.


I don't see why (unless you're reading/writing data to/from disk perhaps?).
Because, even though C doesn't guarantee CHAR_BIT==8, there are a lot
of advantages to having CHAR_BIT==8. In particular, both Cray and DEC
Alpha systems run Unix-like operating systems; I suspect that
implementing Unix with 64-bit characters would be a nightmare. Even
if it's possible, exchanging data with other Unix-like systems would
be difficult.

For Cray vector systems, the code that uses most of the CPU time is
doing floating-point calculations; if some of the other code happens
to be slow, it doesn't matter a whole lot.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Aug 8 '06 #16
jacob navia wrote:
>
For windows systems, Microsoft decided that with 64 bit machines it
will be
char 8, short 16, int 32, long 32, __int64 64

For unix systems, gcc decided that
char 8, short 16, int 32, long 64, long long 64
gcc decided? I don't think so.

LP64 is by far the most common model on UNIX and UNIX like systems and
the main reason is probably pragmatic - it's the most straightforward
model to port 32 bit applications to.

--
Ian Collins.
Aug 8 '06 #17
On Tue, 08 Aug 2006 19:42:19 +0200, jacob navia
<ja***@jacob.remcomp.fr> wrote:
>
For unix systems, gcc decided that
char 8, short 16, int 32, long 64, long long 64
For *some* Unix systems, perhaps.

On HP-UX, a long is still 32 bits.

--
Al Balmer
Sun City, AZ
Aug 8 '06 #18
On Tue, 08 Aug 2006 19:52:20 GMT, Frederick Gotham
<fg*******@SPAM.com> wrote:
>jacob navia posted:
>For windows systems, Microsoft decided that with 64 bit machines it
will be
char 8, short 16, int 32, long 32, __int64 64

For unix systems, gcc decided that
char 8, short 16, int 32, long 64, long long 64

What's the point in having a 64-Bit system if it's not taken advantage of? It
would be less efficient to use 32-Bit integers on a 64-Bit machine.
Not necessarily.
>It would
probably be more efficient to use 32-Bit integers on a 32-Bit machine rather
than on a 64-Bit machine, no?
Not necessarily. The same data path that can carry a 64-bit integer
can carry two 32-bit integers simultaneously.
>
When people use an "int", they expect it to be the most efficient integer
type.
--
Al Balmer
Sun City, AZ
Aug 8 '06 #19
Ian Collins wrote:
jacob navia wrote:
>>For windows systems, Microsoft decided that with 64 bit machines it
will be
char 8, short 16, int 32, long 32, __int64 64

For unix systems, gcc decided that
char 8, short 16, int 32, long 64, long long 64

gcc decided? I don't think so.

LP64 is by far the most common model on UNIX and UNIX like systems and
the main reason is probably pragmatic - it's the most straightforward
model to port 32 bit applications to.
Microsoft disagrees... :-)

I am not saying that gcc's decision is bad, I am just stating this as
a fact without any value judgement. Gcc is by far the most widely
used compiler under Unix, and they decided on LP64, which is probably a good
decision for them.

Microsoft decided otherwise because they have another code base.

And lcc-win32 did not decide anything. Under windows I compile
for long 32 bits, under Unix I compile for long 64 bits.

I have to follow the lead compiler in each system. By the way, the
lead compiler in an operating system is the compiler that compiled
the Operating System: MSVC under windows, gcc under linux, etc.

Aug 8 '06 #20
Al Balmer wrote:
On Tue, 08 Aug 2006 19:42:19 +0200, jacob navia
<ja***@jacob.remcomp.fr> wrote:

>>For unix systems, gcc decided that
char 8, short 16, int 32, long 64, long long 64


For *some* Unix systems, perhaps.

On HP-UX, a long is still 32 bits.
Interesting...

What does gcc do in those systems?

It follows HP or remains compatible with itself?
Aug 8 '06 #21
Al Balmer <al******@att.net> writes:
On Tue, 08 Aug 2006 19:42:19 +0200, jacob navia
<ja***@jacob.remcomp.fr> wrote:
>>For unix systems, gcc decided that
char 8, short 16, int 32, long 64, long long 64

For *some* Unix systems, perhaps.

On HP-UX, a long is still 32 bits.
gcc has 32-bit longs on some systems, 64-bit longs on others. For the
most part, the division is between "32-bit systems" and "64-bit
systems", though neither phrase is particularly meaningful. I know
that some versions of HP-UX are 32-bit systems, so it wouldn't
surprise me to see 32-bit longs. If there are 64-bit versions of
HP-UX, I'd expect gcc to have 64-bit long on that system, though I
wouldn't depend on it.

I *think* that gcc typically has sizeof(long) == sizeof(void*). In
fact, all the systems I currently use, perhaps even all the systems
I've ever used, have sizeof(long)==sizeof(void*) (either 32 or 64
bits), though of course there's no requirement for them to be the same
size.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Aug 8 '06 #22
Keith Thompson wrote:
Al Balmer <al******@att.net> writes:
>>On Tue, 08 Aug 2006 19:42:19 +0200, jacob navia
<ja***@jacob.remcomp.fr> wrote:
>>>For unix systems, gcc decided that
char 8, short 16, int 32, long 64, long long 64

For *some* Unix systems, perhaps.

On HP-UX, a long is still 32 bits.


gcc has 32-bit longs on some systems, 64-bit longs on others. For the
most part, the division is between "32-bit systems" and "64-bit
systems", though neither phrase is particularly meaningful. I know
that some versions of HP-UX are 32-bit systems, so it wouldn't
surprise me to see 32-bit longs. If there are 64-bit versions of
HP-UX, I'd expect gcc to have 64-bit long on that system, though I
wouldn't depend on it.

I *think* that gcc typically has sizeof(long) == sizeof(void*). In
fact, all the systems I currently use, perhaps even all the systems
I've ever used, have sizeof(long)==sizeof(void*) (either 32 or 64
bits), though of course there's no requirement for them to be the same
size.
That's a VERY wise decision.

Microsoft's decision of making sizeof(long) < sizeof(void *)
meant a LOT OF WORK at a customer's site recently. It was a basic
assumption of their code.

Aug 8 '06 #23
jacob navia wrote:
Ian Collins wrote:
>jacob navia wrote:
>>For windows systems, Microsoft decided that with 64 bit machines it
will be
char 8, short 16, int 32, long 32, __int64 64

For unix systems, gcc decided that
char 8, short 16, int 32, long 64, long long 64

gcc decided? I don't think so.

LP64 is by far the most common model on UNIX and UNIX like systems and
the main reason is probably pragmatic - it's the most straightforward
model to port 32 bit applications to.

Microsoft disagrees... :-)
My comment was based on my experience with Sparc; I'm sure LP64 was
adopted as the 64 bit ABI (mainly to support mixed 32 and 64
applications IIRC) before gcc had a 64 bit Sparc port.

A bit OT, but can you mix 32 and 64 bit applications on 64 bit windows?
>
I have to follow the lead compiler in each system. By the way, the
lead compiler in an operating system is the compiler that compiled
the Operating System: MSVC under windows, gcc under linux, etc.
You have to, if you want to link with the system libraries!

--
Ian Collins.
Aug 8 '06 #24
On Wed, 09 Aug 2006 00:45:03 +0200, jacob navia
<ja***@jacob.remcomp.fr> wrote:
>Al Balmer wrote:
>On Tue, 08 Aug 2006 19:42:19 +0200, jacob navia
<ja***@jacob.remcomp.fr> wrote:

>>>For unix systems, gcc decided that
char 8, short 16, int 32, long 64, long long 64


For *some* Unix systems, perhaps.

On HP-UX, a long is still 32 bits.
Interesting...

What does gcc do in those systems?

It follows HP or remains compatible with itself?
It follows HP, naturally. It even uses HP libraries. On HP-UX
Integrity, it's a decent compiler, but can't yet produce optimized
code for the Itanium processor.

I find the idea of a compiler remaining "compatible with itself" to be
rather odd.

--
Al Balmer
Sun City, AZ
Aug 8 '06 #25
Ian Collins wrote:
jacob navia wrote:
>>Ian Collins wrote:

>>>jacob navia wrote:
For windows systems, Microsoft decided that with 64 bit machines it
will be
char 8, short 16, int 32, long 32, __int64 64

For unix systems, gcc decided that
char 8, short 16, int 32, long 64, long long 64
gcc decided? I don't think so.

LP64 is by far the most common model on UNIX and UNIX like systems and
the main reason is probably pragmatic - it's the most straightforward
model to port 32 bit applications to.

Microsoft disagrees... :-)

My comment was based on my experience with Sparc, I'm sure LP64 was
adopted as the 64 bit ABI (mainly to support mixed 32 and 64
applications IIRC) before gcc had a 64 bit Sparc port.

A bit OT, but can you mix 32 and 64 bit applications on 64 bit windows?
No. You can't load a 32 bit DLL (shared object) from a 64 bit
process. Of course you can use system() to call a 32 bit
application but that is another process that runs in the 32 bit
emulation layer.
>
>>I have to follow the lead compiler in each system. By the way, the
lead compiler in an operating system is the compiler that compiled
the Operating System: MSVC under windows, gcc under linux, etc.

You have to, if you want to link with the system libraries!
Yes, that's the reason.
Aug 8 '06 #26
jacob navia writes:
For unix systems, gcc decided that
char 8, short 16, int 32, long 64, long long 64
gcc decided to do whatever the system was already doing to the best of
its ability, so that gcc-compiled code could be used with existing
libraries (like the C library:-) It contains a lot of code to figure
that out.

If it didn't, nobody would have used it. Remember, gcc is older than
Linux & co. When the GNU project was started, Unixes were full of
proprietary closed-source libraries - and there are still plenty of them
around, on systems where gcc is in use.

--
Hallvard
Aug 9 '06 #27
The funny thing is this issue was partly solved in 1958, 1964, and in
1971.

In 1958 Grace Hopper and Co. designed COBOL so you could actually
declare variables and their allowed range of values! IIRC something
like:

001 DECLARE MYPAY PACKED-DECIMAL PICTURE "999999999999V999"
001 DECLARE MYPAY USAGE IS COMPUTATIONAL-/1/2/3

Miracle! A variable with predictable and reliable bounds! Zounds!
But this still isn't quite the ticket: you can only specify decimal
numbers to arbitrary and dependable precision; floating-point numbers
are still a hodge-podge.

Then in 1964 PL/I (may it rest in peace) tried to combine COBOL and
Fortran and Algol, with mixed results. But they did extend
declarations to make variable declarations a bit better, so you could
specify, separately, decimal or binary math, and the number of decimal
places before and after the decimal point:

dcl J fixed binary (7);
dcl K fixed bin (8) unsigned;
dcl L fixed bin (15);
dcl M fixed bin;
dcl N fixed bin (31);
dcl O fixed bin (31,16);
dcl S fixed dec (7,2);

Still not quite the ticket.
Then in 1971 Pascal went in a slightly different direction, but not far
enough, by letting you specify explicit variable bounds, but only for
integer and scalar (enum) types:

var Year: 1900..3000;

------

What the typical user REALLY needs, and may not be doable in any simple
or portable way, is a wide choice of what one wants:

Case (1): I need an integer that can represent the part numbers in a
1996 Yugo, well known to range from 0 to 12754. Pascal leads the way
here: var PartNumber: 0..12754.

Case (2): I need a real that can represent the age of the universe, in
Planck units, about 3.4E-42 to 12.E84. The accuracy of measurement is
only plus or minus 20 percent, so two significant digits is plenty.
PL/I is the only language I know of that can adequately declare this:
DECLARE AGE FLOAT BINARY (2,85). Which hopefully the compiler will map
to whatever floating-point format can handle that exponent.

case (3): I need an integer that can do exact math with decimal prices
from 0.01 to 999,999,999.99. COBOL and PL/I can do this.

case (4): I need a 32-bit integer. Pascal and PL/I are the only
languages that can do this: var I32: -$7FFFFFFF..$7FFFFFFF; PL/I:
DECLARE I32 FIXED BINARY (32).

case (5): I need whatever integer format is fastest on this CPU and
is at least 32 bits wide. Don't know of any language that has this
capability.

------------------------

<soap>
I think it's high time a language has the ability to do the very basic
and simple things programmers need to write portable software: the
ability to specify, unambiguously, what range of values they need to
represent, preferably in decimal, binary, floating, fixed, and even
arbitrary bignum formats. Not to mention hardware-dependent perhaps
bit widths. There's no need for the compiler to be able to actually
*do* any arbitrarily difficult arithmetic, but at least give the
programmer the ability to ASK and if the compiler is capable, and get
DEPENDABLE math. I don't think this is asking too much.

The biness of C having wink-wink recognized defacto binary integer
widths is IMHO just way below contempt. The way was shown many
different times since 1958, why can't we get something usable, portable
and reasonable now quite a ways into the 21st century?
</soap>

Aug 9 '06 #28
"Ancient_Hacker" <gr**@comcast.netwrites:
>
<soap>
I think it's high time a language has the ability to do the very basic
and simple things programmers need to write portable software: the
ability to specify, unambiguously, what range of values they need to
represent, preferably in decimal, binary, floating, fixed, and even
arbitrary bignum formats. Not to mention hardware-dependent perhaps
bit widths. There's no need for the compiler to be able to actually
*do* any arbitrarily difficult arithmetic, but at least give the
programmer the ability to ASK and if the compiler is capable, and get
DEPENDABLE math. I don't think this is asking too much.
It is if you want the system to perform.

But then Ada does all that you want and more.

And it's not usually the compiler that does any arithmetic: it's normally
runtime bounds checks. They can be very time-consuming, which is one reason
why C doesn't do it. C isn't about things like that.
Aug 9 '06 #29
"Ancient_Hacker" <gr**@comcast.netwrote:
The biness of C having wink-wink recognized defacto binary integer
widths is IMHO just way below contempt. The way was shown many
different times since 1958, why can't we get something usable, portable
and reasonable now quite a ways into the 21st century?
Darling, if you want Ada, you know where to find her.

Richard
Aug 9 '06 #30
On 2006-08-08 18:56:53 -0400, jacob navia <ja***@jacob.remcomp.fr> said:
Keith Thompson wrote:
>Al Balmer <al******@att.netwrites:
>>On Tue, 08 Aug 2006 19:42:19 +0200, jacob navia
<ja***@jacob.remcomp.fr> wrote:

For unix systems, gcc decided that
char 8, short 16, int 32, long 64, long long 64

For *some* Unix systems, perhaps.

On HP-UX, a long is still 32 bits.

gcc has 32-bit longs on some systems, 64-bit longs on others. For the
most part, the division is between "32-bit systems" and "64-bit
systems", though neither phrase is particularly meaningful. I know
that some versions of HP-UX are 32-bit systems, so it wouldn't
surprise me to see 32-bit longs. If there are 64-bit versions of
HP-UX, I'd expect gcc to have 64-bit long on that system, though I
wouldn't depend on it.

I *think* that gcc typically has sizeof(long) == sizeof(void*). In
fact, all the systems I currently use, perhaps even all the systems
I've ever used, have sizeof(long)==sizeof(void*) (either 32 or 64
bits), though of course there's no requirement for them to be the same
size.

That's a VERY wise decision.

Microsoft's decision of making sizeof(long) < sizeof(void *)
meant a LOT OF WORK at a customer's site recently. It was a basic
assumption of their code.
Well, it could be argued that it was a good decision as it helped to
expose flaws in their code.
--
Clark S. Cox, III
cl*******@gmail.com

Aug 9 '06 #31
In article <11**********************@m73g2000cwd.googlegroups.com> "Ancient_Hacker" <gr**@comcast.net> writes:
....
Case (1): I need an integer that can represent the part numbers in a
1996 Yugo, well known to range from 0 to 12754. Pascal leads the way
here: var PartNumber: 0..12754.
type PartNumber is range 0 .. 12754;
Case (2): I need a real that can represent the age of the universe, in
Planck units, about 3.4E-42 to 12.E84. The accuracy of measurement is
only plus or minus 20 percent, so two significant digits is plenty.
PL/I is the only language I know of that can adequately declare this:
DECLARE AGE FLOAT BINARY (2,85). Which hopefully the compiler will map
to whatever floating-point format can handle that exponent.
type Age is digits 2 range 3.4E-42 .. 12.0E84;
case (3): I need an integer that can do exact math with decimal prices
from 0.01 to 999,999,999.99. COBOL and PL/I can do this.
type Amount is delta 0.01 range 0.01 .. 999_999_999.99;
case (4): I need a 32-bit integer. Pascal and PL/I are the only
languages that can do this: var I32: -$7FFFFFFF..$7FFFFFFF; PL/I:
DECLARE I32 FIXED BINARY (32).
type I32 is range -16#7FFFFFFF# .. 16#7FFFFFFF#;
case (5): I need whatever integer format is fastest on this CPU and
is at least 32 bits wide. Don't know of any language that has this
capability.
Long_Integer

(If it is defined of course, if not, there probably are no 32 bit wide
integers available. But if it is defined, it is at least 32 bits wide.)

This all in Ada.
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/
Aug 9 '06 #32
jacob navia <ja***@jacob.remcomp.fr> wrote:
Keith Thompson wrote:
I *think* that gcc typically has sizeof(long) == sizeof(void*). In
fact, all the systems I currently use, perhaps even all the systems
I've ever used, have sizeof(long)==sizeof(void*) (either 32 or 64
bits), though of course there's no requirement for them to be the same
size.

That's a VERY wise decision.
Why? Whatever good does it do? Don't tell me you have the beginner's
habit of casting void *s back and forth to integers...
Microsoft's decision of making sizeof(long) < sizeof(void *)
meant a LOT OF WORK at a customer's site recently. It was a basic
assumption of thir code.
It is a very, very unwise assumption - as you experienced.
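
(If a pointer value genuinely has to travel in an integer, C99 names a type
for the job; a minimal sketch, noting that uintptr_t is optional but is
guaranteed wide enough wherever it exists, which long is not.)

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int x = 42;
    void *p = &x;
    uintptr_t bits = (uintptr_t)p;   /* guaranteed to round-trip, unlike long on Win64 */
    void *q = (void *)bits;

    printf("round trip %s\n", q == p ? "ok" : "broken");
    return 0;
}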

Richard
Aug 9 '06 #33

Richard Bos wrote:
"Ancient_Hacker" <gr**@comcast.netwrote:
The biness of C having wink-wink recognized defacto binary integer
widths is IMHO just way below contempt. The way was shown many
different times since 1958, why can't we get something usable, portable
and reasonable now quite a ways into the 21st century?

Darling, if you want Ada, you know where to find her.

Richard
!! ewwww. yuck. Somehow after I took one look at Ada, I just put it
out of my mind. I think a lot of people did that. Maybe when there's
an open source Ada compiler that generates good code and doesn't give me
the feeling I'm writing a cruise-missile guidance program...

Aug 9 '06 #34
jacob navia wrote:
Al Balmer wrote:
>On Tue, 08 Aug 2006 19:42:19 +0200, jacob navia
<ja***@jacob.remcomp.fr> wrote:

>>For unix systems, gcc decided that
char 8, short 16, int 32, long 64, long long 64


For *some* Unix systems, perhaps.

On HP-UX, a long is still 32 bits.
Interesting...

What does gcc do in those systems?

It follows HP or remains compatible with itself?
gcc is retargetable; it has multiple backends. Your question makes no sense.
gcc doesn't make specific assumptions and then look for platforms with
which it's compatible; gcc is ported by adapting the backend to the platform.

If it didn't, it would be a whole lot less portable, and a whole lot less
useful.

S.
Aug 9 '06 #35
"Ancient_Hacker" <gr**@comcast.netwrites:
Richard Bos wrote:
>"Ancient_Hacker" <gr**@comcast.netwrote:
The biness of C having wink-wink recognized defacto binary integer
widths is IMHO just way below contempt. The way was shown many
different times since 1958, why can't we get something usable, portable
and reasonable now quite a ways into the 21st century?

Darling, if you want Ada, you know where to find her.

Richard

!! ewwww. yuck. Somehow after I took one look at Ada, I just put it
out of my mind. I think a lot of people did that. Maybe when there's
an open source Ada compiler that generates good code and doesn't give me
the feeling I'm writing a cruise-missile guidance program...
<OT>
There is an open source Ada compiler; it's called GNAT, and it's part
of gcc. Nobody can do anything about how you feel when you're writing
code, though.

You complain that no one language provides a set of features. When
someone mentions one that does, your reaction is "ewwww. yuck." Hmm.
</OT>

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Aug 9 '06 #36
Ancient_Hacker wrote:
The funny thing is this issue was partly solved in 1958, 1964, and in
1971.

In 1958 Grace Hopper and Co. designed COBOL so you could actually
declare variables and their allowed range of values! IIRC something
like:

001 DECLARE MYPAY PACKED-DECIMAL PICTURE "999999999999V999"
001 DECLARE MYPAY USAGE IS COMPUTATIONAL-/1/2/3

Miracle! A variable with predictable and reliable bounds! Zounds!
COBOL more than made up for its innovative and genuinely useful approach to
type declarations with its innovative and genuinely horrendous approaches to
syntax, control flow and type conversions. That said, it's good to remember
the ways in which COBOL didn't suck.

<snip>
What the typical user REALLY needs, and may not be doable in any simple
or portable way, is a wide choice of what one wants:
Well, it's still a C group, so let's look at how far our language of
topicality gets us.
Case (1): I need an integer that can represent the part numbers in a
1996 Yugo, well known to range from 0 to 12754. Pascal leads the way
here: var PartNumber: 0..12754.
Although C offers less flexibility and safety in this regard, an adequate
approximation is to use the smallest type guaranteed to contain your range.
In this case, "short" (or "int" if time is of more concern than space, since
"int" is intended to be more "natural" to the platform).

You can use a typedef to abstract away from the actual type and convey the
purpose to humans, but C does not allow true subtyping, so you'd have to do
any range checks yourself. This obviously encourages a style where these
checks are done as little as possible, or possibly never, which is a clear
drawback.
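
(A rough sketch of that style, using the Yugo part-number range from case (1);
the helper function is made up purely for illustration.)

#include <stdio.h>

/* The typedef only documents intent; nothing stops code that forgets to
   call the range check. */
typedef short part_number;              /* intended range: 0 .. 12754 */

static int part_number_ok(long n)
{
    return n >= 0 && n <= 12754;
}

int main(void)
{
    long input = 13000;                 /* e.g. read from a file */

    if (!part_number_ok(input)) {
        fprintf(stderr, "bad part number: %ld\n", input);
        return 1;
    }
    printf("part %d accepted\n", (int)(part_number)input);
    return 0;
}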
Case (2): I need a real that can represent the age of the universe, in
Planck units, about 3.4E-42 to 12.E84. The accuracy of measurement is
only plus or minus 20 percent, so two significant digits is plenty.
PL/I is the only language I know of that can adequately declare this:
DECLARE AGE FLOAT BINARY (2,85). Which hopefully the compiler will map
to whatever floating-point format can handle that exponent.
And which will hopefully not introduce any glaring rounding or accuracy
errors when involved in calculations with other floating-point types. That
the language offers this is neat, but in order to use these types
effectively special attention is required, which reduces the advantage over
not having these types natively and having to think about the calculations
from the beginning.
case (3): I need an integer that can do exact math with decimal prices
from 0.01 to 999,999,999.99. COBOL and PL/I can do this.
C cannot do this natively, so you'll need libraries. Luckily, C also makes
it possible to implement such libraries efficiently. This is a good way of
highlighting the differences in philosophy.
case (4): I need a 32-bit integer. Pascal and PL/I are the only
languages that can do this: var I32: -$7FFFFFFF..$7FFFFFFF; PL/I:
DECLARE I32 FIXED BINARY (32).
You can do this in C, but only if your platform actually provides a native
integer type of exactly 32 bits. Then #include <stdint.h> and use int32_t.
If your platform doesn't have <stdint.h>, porting won't be free, but it
won't be very involved either.

This boils down to the question of *why* you need a 32-bit integer. Do you
absolutely, positively, have to have an integral type that takes up exactly
32 bits, or else your algorithm will simply fail or be inapplicable? Then
int32_t is exactly what you mean.

If you'd merely like an integer type with range
[-2,147,483,647;2,147,483,647] which may use more bits if this is
convenient, then you're back to case (1), and for C "long" will do the
trick. Again, range checking you will have to handle yourself.
case (5): I need whatever integer format is fastest on this CPU and
is at least 32 bits wide. Don't know of any language that has this
capability.
That's odd, because this is just one of those things C was practically made
to do. I should think C's "long" will get you as close as possible. Or
#include <stdint.h> and use int_fast32_t to emphasize "fastest". In any
case, if such an integer format exists at all, you will surely find it
implemented in the platform's C compilers.

<snip>
The biness of C having wink-wink recognized defacto binary integer
widths is IMHO just way below contempt.
You're doing the language a disservice by describing it this way. C has
well-defined constraints on its integer types, designed to allow a broad
range of platforms to provide integer calculations in a way most suited to
that platform. It's true that this shifts the burden to the programmer in a
way that may be considered inappropriate for many applications, but it's
certainly not "wink-wink" or "de facto" -- and who cares if it started out
this way?

"De facto" assumptions are the reason programmers get the worst of both
worlds: C leaves the types relatively unconstrained to promote portability,
yet (poor) programmers will assume that the integer types will be identical
across platforms. You can prefer one or the other, but shouldn't be
expecting the opposite of what you actually have and be dismayed by the results.
<soap>
I think it's high time a language has the ability to do the very basic
and simple things programmers need to write portable software:
The one thing programmers need to write portable software is a good grasp of
the ways in which different computer systems are equal, and the ways in
which they are different, and the ways in which this might matter to your
specific program. No individual language will confer such insight.
the ability to specify, unambiguously, what range of values they need to
represent, preferably in decimal, binary, floating, fixed, and even
arbitrary bignum formats. Not to mention hardware-dependent perhaps bit
widths. There's no need for the compiler to be able to actually *do*
any arbitrarily difficult arithmetic, but at least give the programmer
the ability to ASK and if the compiler is capable, and get DEPENDABLE
math. I don't think this is asking too much.
If there's no need for the compiler to be able to do what you ask, then how
is your program portable? You're asking for a system that implements what
you want it to implement, or else your program will not run on it. This is
portability by demanding the system be suitable for your program, rather
than the other way around.

All programming languages are a compromise between allowing the programmer
to write down exactly what they want and allowing the programmer to write
down that which is effectively translatable and executable. You can take
individual languages to task for making the wrong compromises for your
purposes, but taking them to task for compromising at all is slightly
disingenuous.

As others have mentioned, Ada goes a long way towards implementing the
numerical paradise you envision. Of course, Ada has its own problems.
The way was shown many different times since 1958, why can't we get
something usable, portable and reasonable now quite a ways into the 21st
century?
Reasonability is in the eye of the beholder, but anyone who would deny C's
usability or portability, even when only restricted to numerical types, is
being deliberately obtuse.

If you're asking in general how languages are created and adopted, and why
the Right Language always seems to lose to the Right Now Language, that's
quite another topic.

S.
Aug 9 '06 #37
On Wed, 09 Aug 2006 20:40:52 +0200, Skarmander
<in*****@dontmailme.com> wrote:
>You can use a typedef to abstract away from the actual type and convey the
purpose to humans, but C does not allow true subtyping, so you'd have to do
any range checks yourself. This obviously encourages a style where these
checks are done as little as possible, or possibly never, which is a clear
drawback.
Perhaps, but it also allows a style where such checks are done only
when necessary.

--
Al Balmer
Sun City, AZ
Aug 9 '06 #38
"Stephen Sprunk" <st*****@sprunk.orgwrote
>
Do you really care so much about the extra two or three letters it takes
to use a type that is _guaranteed_ to work that you're willing to accept
your program randomly breaking?
Yes. If we just have one variable in scope, a bit of gibberish at the start
and end of the type is neither here nor there.
But if we've got several in scope, and we need another gibberish prefix like
xlu_ to represent the library we're using, and another gibberish suffix like
U8 because someone has insisted on that notation, and the code is littered
with gibberish casts and gibberish comments, and gibberish constructs like
for(;;) and "123 = xlu_i_U8;" then the gibberish accumulates, and suddenly
you find yourself in the situation where the code is no longer readable.
--
www.personal.leeds.ac.uk/~bgy1mm
freeware games to download.

Aug 9 '06 #39


--
www.personal.leeds.ac.uk/~bgy1mm
freeware games to download.

"Stephen Sprunk" <st*****@sprunk.orgwrote
"Malcolm" <re*******@btinternet.comwrote in message
>"Frederick Gotham" <fg*******@SPAM.comwrote in message
news:Xs*******************@news.indigo.ie...
>>Now that 64-Bit machines are coming in, how should the integer
types be distributed? It makes sense that "int" should be 64-Bit...
but what should be done with "char" and "short"? Would the
following be a plausible setup?
...
>If you use int you want an integer.

You mean an integer which is not required to hold values outside the
range -32767 to 32767, and which is probably the fastest integer type.
My programmers are bunny rabbits. They only count to five. Anything bigger
than that is "haraba".
>
What the manufacturer wants has nothing to do with what I want or what the
compiler writers want. I want my code to (1) work, and (2) run as fast as
possible. The manufacturer wants to extract money from my wallet. There
is no shortage of cases where 32-bit ints are faster than 64-bit types on
a processor that "kindly provided 64 bit registers."
The nice engineer, believe me, wants you to have a machine that will run
your programs efficiently. Of course the nasty money men who run his company
are trying to screw money out of the nasty money men who run yours, but that
hardly affects people like us.
>
If I need an integer with at least 64 bits, I'll use long long; if I want
a fast integer, I'll use int; if I want a fast integer with at least 32
bits, I'll use long. They may not be the same size, and it's up to the
compiler folks to pick which combination is best for a given platform.
You see I just want an integer. Integers count things. If it can't count the
bytes in the computer, then I would be annoyed.
>
And such weird and wonderful things are allowed by the standard, because
they existed prior to its creation and the purpose of ANSI C was, for the
most part, to document what existed and not to create an ideal language.

You can assume int or long (or long long) is good enough, and in most
cases you'll be right, but that non-portable assumption will eventually
come crashing down -- typically while you're doing a demo for a customer,
or years after the original coder left and you're stuck with maintaining
his crap. Use size_t and you'll never need to worry about it. Thanks,
ANSI.
As long as your assumption that 32 bits is enough is accurate. If you assume
that an integer can count anything, then it is the responsibility of the
compiler maker to make sure it can count anything, or at least anything that
is likely to crop up on that machine - you may have a company with over
32000 employees, for example, but it is not going to try to run its payroll
on a machine with 16-bit registers.
Aug 9 '06 #40
"jacob navia" <ja***@jacob.remcomp.frwrote in message
news:44*********************@news.orange.fr...
Ian Collins wrote:
>LP64 is by far the most common model on UNIX and UNIX like
systems and the main reason is probably pragmatic - it's the most
straightforward model to port 32 bit applications to.

Microsoft disagrees... :-)
MS also disagrees with C standard (and hundreds of other standards)
compliance; they're hardly an authority unless your only goal is to be
compatible with their stuff.

IL32LLP64 makes sense when either (a) 32-bit ops are faster than
64-bit ones, or (b) you know the vast majority of the codebase assumes
longs are only 32-bit. Both apply to MS's target market for their
compiler and OS; they don't apply to the general UNIX world.
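For what it's worth, a trivial probe shows which model a given compiler and OS
actually picked; this is only a sketch, and the output will of course differ
between ILP32, LP64 and LLP64 systems:

/* Print the widths of the basic types to see whether this platform
 * is ILP32, LP64 or LLP64.  Requires a C99 compiler for %zu. */
#include <stdio.h>

int main(void)
{
    printf("int: %zu, long: %zu, long long: %zu, void *: %zu\n",
           sizeof(int), sizeof(long), sizeof(long long), sizeof(void *));
    return 0;
}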
I am not saying that gcc's decision is bad, I am just stating this
as a fact without any value judgement. Gcc is by far the most widely
used compiler under Unix, and they decided on LP64, which is probably
a good decision for them.
I'm sure whoever came up with the Linux ABI consulted the GCC folks,
but such things are primarily determined by the OS developers, not the
compiler developers. If the compiler doesn't follow the OS API, it's
close to useless.
I have to follow the lead compiler in each system. By the way, the
lead compiler in an operating system is the compiler that compiled
the Operating System: MSVC under windows, gcc under linux, etc.
You're making distinctions where they aren't needed. Each platform
has an official ABI (or two), and that ABI is determined jointly by
the compiler folks and the OS folks. If you weren't included in that
discussion, all you can do is follow the ABI that they specified.

Heck, that's the position GCC is in on most platforms, and they have a
hell of a lot more influence than you do.

S

--
Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking

--
Posted via a free Usenet account from http://www.teranews.com

Aug 9 '06 #41

Skarmander wrote:

If there's no need for the compiler to be able to do what you ask, then how
is your program portable? You're asking for a system that implements what
you want it to implement, or else your program will not run on it.
Well, at least the compiler will print at compile time: "Sorry, I
can't do 35 decimal digits" instead of "TRAP-INTOVFLO" at inconvenient
moments.
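Something in that spirit can at least be faked for the built-in types with the
preprocessor; a rough sketch, assuming a C99 <limits.h> and using 18 digits
purely as an example:

/* Refuse to compile rather than overflow at run time: if the widest
 * unsigned type cannot hold 18 decimal digits, stop with a message. */
#include <limits.h>

#if ULLONG_MAX < 999999999999999999ULL
#error "Sorry, this implementation cannot do 18 decimal digits"
#endif

int main(void) { return 0; }    /* nothing else to do in this sketch */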

Reasonability is in the eye of the beholder, but anyone who would deny C's
usability or portability, even when only restricted to numerical types, is
being deliberately obtuse.
Most of the C programs I see have portability grafted on thru very
clever sets of

#ifdef HP_UX || AIX_1.0
#endif

I may be way iconoclastic, but I don't consider that "portability".

If you're asking in general how languages are created and adopted, and why
the Right Language always seems to lose to the Right Now Language, that's
quite another topic.
Yep, it happens almost every time.

Thanks for your thoughtful comments.

Aug 9 '06 #42
"Ancient_Hacker" <gr**@comcast.netwrites:
[...]
Most of the C programs I see have portability grafted on thru very
clever sets of

#ifdef HP_UX || AIX_1.0
#endif

I may be way iconoclastic, but I don't consider that "portability".
(Or something like that; AIX_1.0 isn't a valid preprocessor symbol,
but the point is clear enough.)

Ideally, a large portion of a well-written C application is portable,
and doesn't require such preprocessor tricks. Some things,
particularly involving interaction with the operating system, are
going to require non-portable solutions. The various "#ifdef" tricks
are a way to manage that non-portability -- which should still be kept
as isolated as practical.

It's entirely possible to write useful C code that's entirely portable
to any conforming hosted implementation. It's also entirely possible
to write C code that's 90% portable, with only carefully controlled
portions of it using non-portable constructs.

And, of course, it's also entirely possible to write a non-portable
mess in *any* language. Arguably C makes this easier than some other
languages, but it doesn't make it necessary.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Aug 9 '06 #43
In article <11**********************@m73g2000cwd.googlegroups .com>
Ancient_Hacker <gr**@comcast.netwrote:
>I think it's high time a language has the ability to do the very basic
and simple things programmers need to write portable software: the
ability to specify, unambiguously, what range of values they need to
represent, preferably in decimal, binary, floating, fixed, and even
arbitrary bignum formats. Not to mention hardware-dependent perhaps
bit widths.
You are not the first to observe this; and it certainly seems like
a reasonable request.
>There's no need for the compiler to be able to actually
*do* any arbitrarily difficult arithmetic ...
I think that if one drops this requirement, one winds up with
compilers that promise the moon, but deliver only cheese.
>The biness of C having wink-wink recognized defacto binary integer
widths is IMHO just way below contempt.
And yet, oddly, it sort-of mostly works. Kinda, sorta. :-)
>why can't we get something usable, portable
and reasonable now quite a ways into the 21st century?
The problem is, apparently it is quite difficult. We keep ending
up with PL/I. :-)

(Pascal's, or Ada's, integer ranges seem to mostly work OK. Lisp's
handling of integers works even better. The real problems seem to
lie in floating-point. Well, that, and language design is just
plain hard: unless one limits the problem domain quite strictly,
the usual result is something horribly complicated, and yet never
complete.)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Aug 9 '06 #44
Ancient_Hacker wrote:
Skarmander wrote:

>If there's no need for the compiler to be able to do what you ask, then how
is your program portable? You're asking for a system that implements what
you want it to implement, or else your program will not run on it.

Well, at least the compiler will print at compile time: "Sorry, I
can't do 35 decimal digits" instead of "TRAP-INTOVFLO" at inconvenient
moments.
Overflow detection for integers is a sticky issue, especially with C, which
essentially offers no portable way to detect overflow (so the only way to
deal with it is to make sure it never happens, which is a pain). C's excuse
is that not all platforms have overflow detection in the first place, and
for those that do the cost of overflow checking may be unacceptable. Still,
a maybe-not-available-but-standardized-if-it-is language mechanism might
have been preferable.
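For what it's worth, the usual "make sure it never happens" idiom for signed
addition looks something like the following; a sketch only, and add_checked is
just an illustrative name:

/* Check before adding: signed overflow itself is undefined behaviour,
 * so the test has to be phrased in terms of the operands, not the result. */
#include <limits.h>
#include <stdio.h>

/* Stores a + b in *res and returns 1 if it fits in an int, else returns 0. */
static int add_checked(int a, int b, int *res)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return 0;
    *res = a + b;
    return 1;
}

int main(void)
{
    int sum;
    if (add_checked(INT_MAX, 1, &sum))
        printf("%d\n", sum);
    else
        puts("overflow avoided");
    return 0;
}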

But aside from that, a compiler couldn't detect overflow at compile time,
except for very obvious cases that even a C compiler or static checking tool
might diagnose. As I understood it, you wanted the compiler to refuse things
like a request for a floating-point type with 4 significant digits in base
12 if it can't reasonably calculate with such a type.

The best a language (and a compiler) could offer you is a single definition
of overflow and a promise to consistently detect it, but avoiding or
handling it would still be up to the programmer. This is quite reasonable,
though, and a far cry from your original "I'd like to be able to use any
well-defined arithmetic type and have the compiler tell me if it can do it
or not".
>Reasonability is in the eye of the beholder, but anyone who would deny C's
usability or portability, even when only restricted to numerical types, is
being deliberately obtuse.

Most of the C programs I see have portability grafted on thru very
clever sets of

#ifdef HP_UX || AIX_1.0
#endif

I may be way iconoclastic, but I don't consider that "portability".
Three issues usually are at play here:

- Programmers who wouldn't know portability if, to paraphrase Blackadder,
"it painted itself purple and danced naked on a harpsichord singing
'portability is here again'". That is, gratuitously assuming things the
standard doesn't guarantee because it makes things "easier", when perfectly
portable alternatives exist; or refusing to write a single portable
implementation but instead focusing on individual platforms because it's
"faster". This is especially true in the case of numerical types; having
platform tests for those is just plain stupid.

- Needing functionality that is not defined by the standard, but available
on many or most target platforms; or being able to use optional
functionality that is available on some platforms, but not all. For
UNIX-like systems, you can hope the POSIX standard has what you need, but if
not, you may need to interface with different and incompatible interfaces to
system routines.

- Workarounds. What you're doing ought to work according to the standard(s),
but it doesn't on some platform, and you do not have the option of telling
your customers to get their platform fixed and accept a broken program in
the meantime.

We can dismiss case #1. For case #2, detecting what is possible at runtime
is usually not an option, nor is roll-your-own. The proper approach is to
sequester all platform-dependent functionality behind unified interfaces and
use those. This is the radical concept of a *library*, often eschewed out of
sheer laziness or out of a mistaken conviction that directly using a
specific interface is faster than defining and translating to a generic
interface, if such is required. (It may be, but whether the runtime cost of
this actually matters compared to the maintenance cost is almost never
considered.)

Whatever mechanisms are used to detect and enable the proper implementation
at compile time should be centralized in one place, preferably not in the
source at all, but the build process (this has its own drawbacks, but I
consider it a lot better than uglifying the source with "just one more
#ifdef"). If you must use #ifdef...#endif, though, it should ideally be
restricted to a few files that do nothing more than #include the proper
alternative.
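A deliberately tiny sketch of that idea, with invented file and function names:
the rest of the program sees only hires_seconds(), and the one platform test
lives in a single wrapper file.

/* hires_time.h -- the portable interface the rest of the program sees */
double hires_seconds(void);

/* hires_time.c -- the only file that knows about platforms; everything
 * else just calls hires_seconds() and never mentions WIN32 or POSIX. */
#if defined(_WIN32)
#include <windows.h>
double hires_seconds(void)
{
    LARGE_INTEGER freq, count;
    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&count);
    return (double)count.QuadPart / (double)freq.QuadPart;
}
#else
#include <time.h>
double hires_seconds(void)
{
    return (double)clock() / CLOCKS_PER_SEC;  /* CPU time; good enough for a sketch */
}
#endif

The same selection could equally be made in the build system (compile a
hypothetical hires_time_win32.c or hires_time_posix.c), which matches the
preference above for keeping the test out of the source entirely.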

Unfortunately, a combination of laziness, infatuation with the preprocessor
and being restricted to a spartan build system often wins, and people will
squeeze 14 essentially different functions in one file, each contained in
its own #ifdef...#endif, or worse, there's just one function which has code
in a maze of #ifs, rendering it practically unreadable. The people who do
this will usually defend it with the observation that code is being shared
between the alternatives, which usually just means that they're not dividing
up their functions properly, or overestimating the usefulness of sharing
code between incompatible implementations in the first place.

Finally, for case #3, I'd consider an incidental #ifdef acceptable, if the
#ifdef focused on what workaround is being enabled, not what platform is
being compiled on. E.g. "#ifdef USE_MEMFROB_RETURNS_LONG_WORKAROUND", not
"#if defined(WIN32) && WIN32_VERSION 3 && PHASE_OF_MOON != WANING".

S.
Aug 9 '06 #45
Al Balmer wrote:
On Wed, 09 Aug 2006 20:40:52 +0200, Skarmander
<in*****@dontmailme.comwrote:
>You can use a typedef to abstract away from the actual type and convey the
purpose to humans, but C does not allow true subtyping, so you'd have to do
any range checks yourself. This obviously encourages a style where these
checks are done as little as possible, or possibly never, which is a clear
drawback.

Perhaps, but it also allows a style where such checks are done only
when necessary.
I considered this, but we're talking an optimization for which a compiler
may be very proficient at eliminating unnecessary checks, while the
programmer may be very proficient at forgetting a necessary check. If the
range is truly part of the type's semantics, and not just something to be
checked at module entry/exit, then not checking should be the exception
rather than the rule.

In the C spirit, ranged types could be toggled as entirely unchecked if
desired, leaving the range only as an aid to humans (and a way for static
checking tools and compilers to flag obviously incorrect statements).
Unfortunately, also in the C spirit, this would mean that ranged types would
almost never be checked in the first place.

Ultimately, though, it's probably not worth it adding such types to C
retroactively, so the point is moot. "If you want Pascal..."

S.
Aug 9 '06 #46
Ancient_Hacker wrote:
>
<soap>
I think it's high time a language has the ability to do the very basic
and simple things programmers need to write portable software: the
ability to specify, unambiguously, what range of values they need to
represent, preferably in decimal, binary, floating, fixed, and even
arbitrary bignum formats. Not to mention hardware-dependent perhaps
bit widths. There's no need for the compiler to be able to actually
*do* any arbitrarily difficult arithmetic, but at least give the
programmer the ability to ASK and if the compiler is capable, and get
DEPENDABLE math. I don't think this is asking too much.
Isn't this a case of if the cap doesn't fit, use another one?

You could achieve what you describe in a language that supports operator
overloading on user defined types.
The biness of C having wink-wink recognized defacto binary integer
widths is IMHO just way below contempt. The way was shown many
different times since 1958, why can't we get something usable, portable
and reasonable now quite a ways into the 21st century?
Probably because the native platform ABIs are inconsistent in their types?

--
Ian Collins.
Aug 9 '06 #47
Skarmander <in*****@dontmailme.comwrites:
Al Balmer wrote:
>On Wed, 09 Aug 2006 20:40:52 +0200, Skarmander
<in*****@dontmailme.comwrote:
>>You can use a typedef to abstract away from the actual type and
convey the purpose to humans, but C does not allow true subtyping,
so you'd have to do any range checks yourself. This obviously
encourages a style where these checks are done as little as
possible, or possibly never, which is a clear drawback.
Perhaps, but it also allows a style where such checks are done only
when necessary.
I considered this, but we're talking an optimization for which a
compiler may be very proficient at eliminating unnecessary checks,
while the programmer may be very proficient at forgetting a necessary
check. If the range is truly part of the type's semantics, and not
just something to be checked at module entry/exit, then not checking
should be the exception rather than the rule.

In the C spirit, ranged types could be toggled as entirely unchecked
if desired, leaving the range only as an aid to humans (and a way for
static checking tools and compilers to flag obviously incorrect
statements). Unfortunately, also in the C spirit, this would mean that
ranged types would almost never be checked in the first place.
Here's a thought.

Define a new type declaration syntax:

signed(expr1, expr2) is a signed integer type that can hold values
in the range expr1 to expr2, both of which are integer constant
expressions. This will simply refer to some existing predefined
integer type; for example, signed(-32768, 32767) might just mean
short. The resulting type isn't necessarily distinct from any
other type.

unsigned(expr1, expr2): as above, but unsigned.

If an expression of one of these types yields a result outside the
declared bounds, the behavior is undefined. (Or it's
implementation-defined, chosen from some set of possibilities.)

All an implementation *has* to do is map each of these declarations to
some predefined type that meets the requirements. For example, if I
do this:

signed(-10, 10) x;
x = 7;
x *= 2;

I might get the same code as if I had written:

int x;
x = 7;
x *= 2;

But a compiler is *allowed* to perform range-checking.

The effort to implement this correctly would be minimal, but it would
allow for the possibility of full range-checking for integer
operations.

Any thoughts?
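Not the proposal itself, of course, but the "map it to an existing type,
optionally check" half can already be faked in C99; a rough sketch, with
invented names (rng10_t, RANGE_CHECK):

/* A stand-in for "signed(-10, 10) x": pick any predefined type wide
 * enough for the range, and optionally check operations against it. */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

typedef int_least8_t rng10_t;   /* holds -10..10 with room to spare */

/* Evaluates to x after asserting it lies in [lo, hi]; the argument is
 * evaluated more than once, so keep it side-effect free.  Compiling
 * with -DNDEBUG removes the check, leaving only documentation value. */
#define RANGE_CHECK(x, lo, hi) (assert((x) >= (lo) && (x) <= (hi)), (x))

int main(void)
{
    rng10_t x = 4;
    x = (rng10_t)RANGE_CHECK(x * 2, -10, 10);   /* 8: still in range */
    printf("%d\n", (int)x);
    return 0;
}

A real implementation would have the compiler pick the width from the bounds
and decide for itself whether to emit the checks, exactly as described above.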

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Aug 9 '06 #48
"Ancient_Hacker" <gr**@comcast.netwrote:
Richard Bos wrote:
"Ancient_Hacker" <gr**@comcast.netwrote:
The biness of C having wink-wink recognized defacto binary integer
widths is IMHO just way below contempt. The way was shown many
different times since 1958, why can't we get something usable, portable
and reasonable now quite a ways into the 21st century?
Darling, if you want Ada, you know where to find her.

!! ewwww. yuck. Somehow after I took one look at Ada, I just put it
out of my mind.
And yet, as Dik demonstrated, Ada is exactly what you asked for, in your
usual high-horsey C-is-too-primitive-to-be-believed way. If you
preachers won't accept your own paradise, how are we poor doomed, evil C
programmers supposed to trust you?

Richard
Aug 10 '06 #49
Skarmander wrote:
Ancient_Hacker wrote:
>case (3): I need an integer that can do exact math with decimal prices
from 0.01 to 999,999,999.99. COBOL and PL/I can do this.
C cannot do this natively, so you'll need libraries. Luckily, C also
makes it possible to implement such libraries efficiently. This is a
good way of highlighting the differences in philosophy.
using a signed 64 bit integer type and working in cents
should be able to handle money quantities up to

92 233 720 368 547 758 US$ and 7 cents.

Enough to accommodate the now considerable total US
debt... :-)

Using native types is quite easy considering the progress in
hardware in recent years. Besides, accounting people who
need those extreme money amounts will not shudder to buy the
latest model of PC for a few thousand dollars.

You can work in decimals of cents if you need sensible rounding.
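A bare-bones sketch of "work in cents" (money_t and print_money are
illustrative names, nothing more):

/* Exact decimal money arithmetic by storing cents in a 64-bit integer. */
#include <inttypes.h>
#include <stdio.h>

typedef int64_t money_t;            /* amount in cents */

/* Print a cent amount as dollars.cents. */
static void print_money(money_t cents)
{
    money_t abs_c = cents < 0 ? -cents : cents;
    printf("%s%" PRId64 ".%02d\n",
           cents < 0 ? "-" : "", abs_c / 100, (int)(abs_c % 100));
}

int main(void)
{
    money_t price = 19999;          /* $199.99, represented exactly */
    money_t total = 3 * price;      /* 59997 cents, no rounding error */
    print_money(total);             /* prints 599.97 */
    return 0;
}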
Aug 10 '06 #50

This thread has been closed and replies have been disabled. Please start a new discussion.
