
What's the deal with size_t?

Hello, World!

Reading this group for some time I came to the conclusion that
people here are split into several factions regarding size_t,
including, but not limited to,

* size_t is the right thing to use for every var that holds the
number of or size in bytes of things.

* size_t should only be used when dealing with library functions.

* size_t should really be a signed type (fewer warnings)

* size_t is unnecessary (size of object in memory never exceeds
what can be held in an integer).

* size_t is visually unpleasant.

* size_t clutters up / is an uglification of the language
(solving only a theoretical problem).

* size_t usage may be non-portable because it won't be around
anymore in 100 years.

Sooo... what's the real deal with size_t? Where should it be
used/avoided (examples?)

Nov 6 '07
Malcolm McLean said:

<snip>
> The claims [about size_t] have been justified time after time on this ng.

I have yet to see a justification that has any merit.

> Obviously not
> at sufficient length to persuade those who are rather slow.

You don't need lengthy arguments, merely good ones. You haven't presented
any so far that I can recall.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Nov 11 '07 #51
"Richard Heathfield" <rj*@see.sig.invalid> wrote in message
> Malcolm McLean said:
>
> <snip>
>> The claims [about size_t] have been justified time after time on this ng.
>
> I have yet to see a justification that has any merit.
>
>> Obviously not
>> at sufficient length to persuade those who are rather slow.
>
> You don't need lengthy arguments, merely good ones. You haven't
> presented any so far that I can recall.

The claim has been made that there is "no basis" to my claims. Now I don't
think any fair person would say that I make pure assertions - that size_t is
bad because I say so, and for no other reason. But obviously there is a
demand for more material.

If you disagree and think that size_t ought to be retained that's quite a
different matter from saying that "no justification has been advanced". It
is also quite different from saying that "no justification with any merit
for abolishing size_t has been advanced".

To write

size_t i;

for(i=0;i<N;i++)
{
    ptr[i]++;
}

is misleading, because it implies that i is a "size type" when it is nothing
of the sort. It is an index.

Don't you think that is a justification that has some merit? Or maybe you
are making the stronger claim that, if you personally disagree with a
proposal, all arguments for it are not just weaker than the arguments
against (which you must believe), but have "no merit". I suspect this is the
case. "Has no merit" is Heathfieldese for "I disagree".

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Nov 11 '07 #52
"Malcolm McLean" <re*******@btinternet.com> writes:
> "Ben Bacarisse" <be********@bsb.me.uk> wrote in message
>> However, someone needs to cry "foul" (or at least "not agreed") if you
>> keep telling beginners that size_t will mess up their programs.
>
> I don't jump on every use of size_t in newbie code and say "aha, that
> will be deprecated within ten years. Better take it out now".
>
> However if someone starts a thread "what's the deal with size_t"
> naturally I'll give my opinion, without claiming that it is the only
> one that can be held.

The difference between claiming that your opinion is the only one that
can be held and claiming that those who disagree are "rather slow",
lacking any sensitivity or insight, and arrogant is a rather subtle
one, don't you think?

Please take just a moment and consider the possibility that you might
be wrong.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Looking for software development work in the San Diego area.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Nov 11 '07 #53
"Malcolm McLean" <re*******@btinternet.com> writes:
[...]
> The claim has been made that there is "no basis" to my claims. Now I don't
> think any fair person would say that I make pure assertions - that size_t is
> bad because I say so, and for no other reason. But obviously there is a
> demand for more material.

I think fair people *are* saying that you make pure assertions, or
nearly so.

> If you disagree and think that size_t ought to be retained that's quite a
> different matter from saying that "no justification has been advanced". It
> is also quite different from saying that "no justification with any merit
> for abolishing size_t has been advanced".

Yes, it's quite different.

> To write
>
> size_t i;
>
> for(i=0;i<N;i++)
> {
>     ptr[i]++;
> }
>
> is misleading, because it implies that i is a "size type" when it is nothing
> of the sort. It is an index.
>
> Don't you think that is a justification that has some merit?

No.

> Or maybe you
> are making the stronger claim that, if you personally disagree with a
> proposal, all arguments for it are not just weaker than the arguments
> against (which you must believe), but have "no merit". I suspect this is the
> case. "Has no merit" is Heathfieldese for "I disagree".

Richard Heathfield (who generally writes in English, not
Heathfieldese) is entirely capable of expressing disagreement without
saying that any arguments on the other side have no merit. I don't
presume to speak for him, but I believe he reserves claims that an
argument has no merit for cases where he believes that an argument has
no merit.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Looking for software development work in the San Diego area.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Nov 11 '07 #54

"Keith Thompson" <ks***@mib.org> wrote in message
> The difference between claiming that your opinion is the only one that
> can be held and claiming that those who disagree are "rather slow",
> lacking any sensitivity or insight, and arrogant is a rather subtle
> one, don't you think?
>
> Please take just a moment and consider the possibility that you might
> be wrong.

The "rather slow" ones think that no justification has been offered. As I
said, it is a demand for more material, which is rather irritating since I
consider plenty to have been provided.
Those who lack "sensitivity and insight" are those who firstly don't simply
accept that most integers are ultimately used for memory index operations,
or in intermediate calculations to derive such indices, and secondly don't
see the force of a statistical proof when it is provided. This is no
slowness, but it is something that someone with no real feel for computer
programming would think.

The "arrogance" refers to a philosophical fallacy, which I will call the
"debunking fallacy". This states that if some objection can be advanced to a
piece of evidence, it disappears; it may no longer be used.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm


Nov 11 '07 #55
Keith Thompson <ks***@mib.org> writes:
> "Malcolm McLean" <re*******@btinternet.com> writes:
> [...]
>> The claim has been made that there is "no basis" to my claims. Now I don't
>> think any fair person would say that I make pure assertions - that size_t is
>> bad because I say so, and for no other reason. But obviously there is a
>> demand for more material.
>
> I think fair people *are* saying that you make pure assertions, or
> nearly so.
>
>> If you disagree and think that size_t ought to be retained that's quite a
>> different matter from saying that "no justification has been advanced". It
>> is also quite different from saying that "no justification with any merit
>> for abolishing size_t has been advanced".
>
> Yes, it's quite different.
>
>> To write
>>
>> size_t i;
>>
>> for(i=0;i<N;i++)
>> {
>>     ptr[i]++;
>> }
>>
>> is misleading, because it implies that i is a "size type" when it is nothing
>> of the sort. It is an index.
>>
>> Don't you think that is a justification that has some merit?
>
> No.

I do. A size means something totally different to a number of elements.
number of elements * size of elements = total size.

This is fairly basic nomenclature and difficult to disagree with.

>> Or maybe you
>> are making the stronger claim that, if you personally disagree with a
>> proposal, all arguments for it are not just weaker than the arguments
>> against (which you must believe), but have "no merit". I suspect this is the
>> case. "Has no merit" is Heathfieldese for "I disagree".
>
> Richard Heathfield (who generally writes in English, not
> Heathfieldese) is entirely capable of expressing disagreement without

Richard Heathfield writes in flowery prose that sometimes appears to be
designed to confuse non-native speakers, from what I can gather.

> saying that any arguments on the other side have no merit. I don't
> presume to speak for him, but I believe he reserves claims that an
> argument has no merit for cases where he believes that an argument has
> no merit.

Isn't that like saying "when he thinks he's right he thinks he's right"?
Or is my parser now broken?
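
The relation stated in this post (number of elements * size of elements = total size) can be sketched in C roughly as follows. The helper name total_size is hypothetical, not something from the thread, and the overflow guard is an addition of this sketch rather than part of the poster's argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>   /* SIZE_MAX (C99) */

/* Hypothetical helper: stores count * elem_size in *out.
   Returns 1 on success, 0 if the product would not fit in size_t. */
int total_size(size_t count, size_t elem_size, size_t *out)
{
    assert(out != NULL);

    /* Unsigned multiplication wraps silently, so check before multiplying. */
    if (elem_size != 0 && count > SIZE_MAX / elem_size)
        return 0;

    *out = count * elem_size;   /* element count times element size, in bytes */
    return 1;
}
```

Here count is a number of elements and *out is a size in bytes; both happen to share the type size_t, which is the distinction the post is drawing.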
Nov 11 '07 #56
Malcolm McLean said:

<snip>
> The claim has been made that there is "no basis" to my claims.

As I understand it, the basis you advance for your claims is two-fold:

(a) the size_t name is ugly;
(b) the size_t type is unsigned.

I have already pointed out that I agree, pretty much, with (a), but that I
don't consider it to be a particularly persuasive or meritorious argument
for abandoning or deprecating size_t. It might make a reasonable argument
for suggesting a name change, although of course the chance of getting ISO
to agree on a name change is, in reality, zero.

The fact that size_t is unsigned fits naturally with its role (as
demonstrated by the cases in which the standard library uses it) as a way
for storing object sizes and object counts. A negative size for an object
is meaningless, as is a negative count of the number of objects (in the C
sense of the word). So I don't consider this argument to be particularly
persuasive or meritorious either, because the unsignedness of size_t is
perfectly natural and sensible, given the nature of its intended purpose.

<snip>
> To write
>
> size_t i;
>
> for(i=0;i<N;i++)
> {
>     ptr[i]++;
> }
>
> is misleading, because it implies that i is a "size type" when it is
> nothing of the sort. It is an index.

It's an object count. It measures the distance, expressed in object units,
between the start of the array and the point in that array where can be
found the object that we care about. This is entirely consistent with the
usage of size_t in functions such as fread, fwrite, and calloc.
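
A minimal sketch of that usage, assuming a temporary file can be created: calloc, fwrite and fread each take an object count and an object size as separate size_t arguments, and fwrite and fread return a count of objects. The function name round_trip_count is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: writes `count` ints to a temporary file, reads them
   back, and returns the number of objects fread reports (0 on any failure). */
size_t round_trip_count(size_t count)
{
    size_t i, written, got = 0;
    FILE *fp;
    int *buf = calloc(count, sizeof *buf);  /* count objects of sizeof *buf bytes */

    if (buf == NULL)
        return 0;

    fp = tmpfile();
    if (fp == NULL) {
        free(buf);
        return 0;
    }

    for (i = 0; i < count; i++)   /* i counts objects from the start of buf */
        buf[i] = (int)i;

    written = fwrite(buf, sizeof *buf, count, fp);  /* objects written, not bytes */
    rewind(fp);
    if (written == count)
        got = fread(buf, sizeof *buf, count, fp);   /* objects read, not bytes */

    fclose(fp);
    free(buf);
    return got;
}
```

In each call the size_t arguments are either a size in bytes or a count of objects, and the loop variable i is used in the second sense.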

> Don't you think that is a justification that has some merit?

No.

> Or maybe you
> are making the stronger claim that, if you personally disagree with a
> proposal, all arguments for it are not just weaker than the arguments
> against (which you must believe), but have "no merit".

No. For example, I personally disagree with ISO's decision massively to
extend the standard library to include many types that are intended to
increase portability but whose successful use, as far as I understand it,
depends on the whims of implementors. Nevertheless, I would not say that
the arguments put forward for this change have no merit. I have seen
several defences of the decision, and it seemed to me that these arguments
did indeed have some merit. They did not persuade me, but they did at
least make me think hard about the issues, and that in itself is an
indication that they are not empty arguments.

> I suspect this is
> the case. "Has no merit" is Heathfieldese for "I disagree".

In discussions in this newsgroup, I have disagreed with a great many
people, including Chris Torek, Steve Summit, Lawrence Kirby, Dan Pop, Dann
Corbit... names that will, I trust, be familiar to you. I think you will
agree that I have, on at the very least /some/ such occasions, managed to
disagree with them /without/ claiming that their arguments have no merit.
Therefore, your argument that "has no merit" is my way of saying "I
disagree" has no merit.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Nov 11 '07 #57
"Malcolm McLean" <re*******@btinternet.com> wrote in message news:
4u******************************@bt.com...
>
> "Keith Thompson" <ks***@mib.org> wrote in message
>> The difference between claiming that your opinion is the only one that
>> can be held and claiming that those who disagree are "rather slow",
>> lacking any sensitivity or insight, and arrogant is a rather subtle
>> one, don't you think?
>>
>> Please take just a moment and consider the possibility that you might
>> be wrong.

What is wrong with Keith's news server? I don't receive any of his posts
(neither free.fr nor aioe.org carries them). Is this some sort of
agency-mandated embargo, as the name suggests?

--
Chqrlie.
Nov 11 '07 #58
"Richard Heathfield" <rj*@see.sig.invalid> wrote in message
> Malcolm McLean said:
>
> <snip>
>> The claim has been made that there is "no basis" to my claims.
>
> As I understand it, the basis you advance for your claims is two-fold:
>
> (a) the size_t name is ugly;
> (b) the size_t type is unsigned.

(c) is confusing - a bit like

typedef struct
{
    double r;
    double i;
} imaginary;

(d) is in fact a major change to the language, which wasn't appreciated at
the time it was implemented.

(e) has more theoretical than practical value - an unsigned type is only
needed to represent sizes of memory greater than half the address space,
which is a rather unusual need.

(f) won't in fact be adopted uniformly, leading to a legacy of broken code
when programs are ported from one platform to another.

(g) increases the type matrix by one.

However (a) is the major objection.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm


Nov 11 '07 #59
Malcolm McLean wrote, On 11/11/07 21:38:
>
> "Keith Thompson" <ks***@mib.org> wrote in message
>> The difference between claiming that your opinion is the only one that
>> can be held and claiming that those who disagree are "rather slow",
>> lacking any sensitivity or insight, and arrogant is a rather subtle
>> one, don't you think?
>>
>> Please take just a moment and consider the possibility that you might
>> be wrong.
>
> The "rather slow" ones think that no justification has been offered. As
> I said, it is a demand for more material, which is rather irritating since
> I consider plenty to have been provided.

Many seem to disagree.

> Those who lack "sensitivity and insight" are those who firstly don't
> simply accept that most integers are ultimately used for memory index
> operations, or in intermediate calculations to derive such indices, and

Well, all of my managers and colleagues over the past 20+ years who have
expressed an opinion would disagree with any suggestion that I lack
"sensitivity and insight" in programming.

> secondly don't see the force of a statistical proof when it is provided.

Well, you have not provided proof. One study designed for something
completely different with respect to a different language and that
ignores major application areas that C is used for does not prove
anything about how C is used. If you really understood statistics you
would understand why you have not proved your point.

> This is no slowness, but it is something that someone with no real feel for
> computer programming would think.

Ah, this lack of feel for computer programming must explain me making a
successful career in SW development.

> The "arrogance" refers to a philosophical fallacy, which I will call the
> "debunking fallacy". This states that if some objection can be advanced
> to a piece of evidence, it disappears; it may no longer be used.

So what do you call your fallacy of assuming that any evidence against
your position is irrelevant?
--
Flash Gordon
Nov 11 '07 #60
"Charlie Gordon" <ne**@chqrlie.org> writes:
[...]
> What is wrong with Keith's news server, I don't receive any of his posts
> (neither free.fr, nor aioe.org carry them). Is this some sort of agency
> mandated embargo, as the name suggests ?

I don't think there's anything wrong with my news server (which is
"news-server.san.rr.com", provided by my ISP, Time Warner Cable's
Roadrunner service). At least some people are obviously seeing my
messages, and Google Groups shows four of my articles in this thread.

Perhaps someone or something has decided to block messages from
news-server.san.rr.com for some reason. Charlie, perhaps you could
check with the administrators of free.fr and aioe.org? If I'm being
blocked, it's always possible that somebody important is being blocked
too. 8-)}

It also seemed odd that I didn't show up on the latest "Stats for
comp.lang.c (last 7 days)" post, for Nov 3-10. I don't think I've
been *that* quiet.

I'll e-mail this to Charlie as well as posting here.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Looking for software development work in the San Diego area.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Nov 11 '07 #61
"Malcolm McLean" <re*******@btinternet.com> writes:
> "Richard Heathfield" <rj*@see.sig.invalid> wrote in message
>> Malcolm McLean said:
>>
>> <snip>
>>> The claim has been made that there is "no basis" to my claims.
>>
>> As I understand it, the basis you advance for your claims is two-fold:
>>
>> (a) the size_t name is ugly;
>> (b) the size_t type is unsigned.
>
> (c) is confusing - a bit like
>
> typedef struct
> {
>     double r;
>     double i;
> } imaginary;

"imaginary" would clearly be a bad name for such a type. "complex"
would clearly be a better name.

Can you suggest a better name for "size_t"? (I've suggested "count_t"
myself.)

> (d) is in fact a major change to the language, which wasn't
> appreciated at the time it was implemented.

And removing size_t would be *another* major change to the language.

> (e) has more theoretical than practical value - an unsigned type is
> only needed to represent sizes of memory greater than half the address
> space, which is a rather unusual need.

The fact that it's unsigned is a disadvantage only if there's some
clear advantage to making it signed. I don't believe there is. You
do have to be a bit more careful with unsigned computations than with
signed computations, because the lower bound of the type is so close
to the set of reasonable values, but I don't see that as a large
problem. (Yes, you can provide examples where unsigned types can
cause problems; you just have to be aware of such problems when you're
using the type.)
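
One well-known instance of the care needed near the lower bound is a loop that counts down with an unsigned index. The following is only a sketch of that pitfall and one common way around it; the function name sum_reverse is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sums a[n-1] down to a[0] using an unsigned index. */
long sum_reverse(const int *a, size_t n)
{
    long total = 0;
    size_t i;

    assert(a != NULL || n == 0);

    /* The tempting form `for (i = n - 1; i >= 0; i--)` never terminates:
       i >= 0 is always true for an unsigned type, and n - 1 wraps around
       to a huge value when n is 0. Testing before decrementing avoids both. */
    for (i = n; i-- > 0; )
        total += a[i];

    return total;
}
```

With a signed index the tempting form would have worked, which is the kind of trade-off the paragraph above describes.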

> (f) won't in fact be adopted uniformly, leading to a legacy of broken
> code when programs are ported from one platform to another.

Do you imagine that removing or deprecating size_t will lead to more
uniform code? Oh, yes, you want to require int to be 64 bits and
deprecate other integer types. That would probably result in greater
uniformity, but at a far greater cost than most of us are willing to
pay.

> (g) increases the type matrix by one.
>
> However (a) is the major objection.

Your major objection is that you don't like the name? Sorry, but I
see that as an entirely trivial concern.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Looking for software development work in the San Diego area.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Nov 12 '07 #62
"Malcolm McLean" <re*******@btinternet.com> writes:
[...]
> For instance one of my cases against size_t is that it is a major
> change to the language, because most integers are ultimately used as
> indices. When challenged on that I give a little bit of statistical
> evidence, but when challenged on that evidence, I give up. There's no
> helping some people. The assertion about indices is fairly easily
> verified - as long as you know what you are talking about.

If I recall correctly, you have a single example of a study that
involved Java. I'll grant you that your evidence is not nonexistent,
but it certainly seems weak. (I have neither the time nor the
expertise to judge the validity of the study, but others here have
certainly disputed it.)

You continue to assert that most integers are ultimately used as
indices. I don't believe you have sufficient evidence.

[snip]

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Looking for software development work in the San Diego area.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Nov 12 '07 #63
"Keith Thompson" <ks***@mib.org> wrote in message news:
ln************@nuthaus.mib.org...
> Eric Sosman <Er*********@sun.com> writes:
>> Tubular Technician wrote On 11/05/07 20:16,:
> [...]
>>> * size_t is unnecessary (size of object in memory never exceeds
>>> what can be held in an integer).
>>
>> This claim has been made, and also refuted with actual
>> examples of real contemporary machines.
> [...]
>
> The claim *as stated* is correct. The size of any object in memory
> can never exceed what can be held in an integer (ignoring a nitpicking
> controversy about how big an object calloc() can create); size_t is,
> after all, an integer type.
>
> Blurring the distinction between "int" and "integer" is one of the
> worst errors I see here. There are a number of integer types in C,
> ranging from char to long long (and perhaps more if the implementation
> provides one or more extended integer types). "int" is just one of
> those types (it's also a keyword that can be used in the names of
> several other integer types).
>
> (The standard might use the term "integral types" rather than "integer
> types"; I'm not sure, it doesn't make much difference to the point,
> and my copy of the standard isn't handy at the moment.)

No, the Standard only uses "integral" to refer to the integral part of
floating point values as opposed to the fractional part, with one exception
in note 40 of 6.2.6.1p3 where "successive integral powers of 2" is somewhat
redundant anyway.

Also this is a good time to remind our non native speakers that unlike
"integral", "integer" is pronounced with a soft g as in "just" and
"general". Most French speakers make this mistake ;-)

--
Chqrlie.
Nov 12 '07 #64
In article <47**********************@news.free.fr>
Charlie Gordon <ne**@chqrlie.org> wrote:
> I for one do not particularly like to type or see size_t variables.
> I much prefer ssize_t as defined in Posix, to have the ability to
> represent non size values such as -1. ...

Given that signed integers behave badly (trap on overflow, and
have weird values like negative zero), but unsigned integers
always "work right" -- so that (unsigned int)x + (unsigned int)-1
is always the same as x-1 -- I could argue that C would be better
if it had nothing but *un*signed types. :-)

Seriously, you can use unsigned types throughout, and ssize_t is
generally unnecessary. POSIX probably should not have defined it,
and should instead have said that read() and write() take a size_t
(as they do) and return a size_t, with the value (size_t)-1 returned
on error. Unfortunately, this would have broken the common idiom:

if (write(fd, buf, len) < 0) ... handle error ...

Many programmers like to assume that write() only ever returns
either its third argument or -1, and think that this test is
somehow "more efficient" than:

if (write(fd, buf, len) == -1)

(which would still work if write() returned a size_t, provided that
size_t was not a "narrow" type, and would have worked even with a
"narrow" size_t if the X3J11 committee had used the correct widening
rules back in the 1980s).
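
A small sketch of the two properties relied on above: unsigned arithmetic wraps, so adding (unsigned)-1 is the same as subtracting 1, and (size_t)-1 can serve as an in-band error value. The function fake_write is a hypothetical stand-in for the size_t-returning write() imagined in this post, not a real POSIX interface.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for a write()-like call that returns size_t and
   reports failure as (size_t)-1. It performs no I/O. */
size_t fake_write(size_t len, int fail)
{
    return fail ? (size_t)-1 : len;
}

/* Returns 1 if `result` is the error value. A plain `result == -1` converts
   -1 to size_t and gives the same answer as long as size_t is at least as
   wide as unsigned int (the "not narrow" condition); the explicit cast just
   keeps compilers from warning. A test such as `result < 0` could never be
   true for an unsigned type. */
int is_error(size_t result)
{
    return result == (size_t)-1;
}

/* Unsigned arithmetic is modulo 2^N, so x + (unsigned)-1 equals x - 1,
   wrapping to UINT_MAX when x is 0. */
unsigned minus_one(unsigned x)
{
    return x + (unsigned)-1;
}
```

The cost of this scheme is that the largest size_t value can no longer be a legitimate byte count, which is rarely a practical restriction.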

As always, of course, we just have to work within the flawed systems
we have, most of the time anyway. (This applies to both C, which
has a series of minor flaws, and POSIX, which has a number of large
ugly ones, in my opinion anyway.)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 12 '07 #65
"Charlie Gordon" <ne**@chqrlie.org> writes:
> "Richard Heathfield" <rj*@see.sig.invalid> wrote in message news:
> zJ******************************@bt.com...
>> Malcolm McLean said:
> [...]
>>> because most integers are ultimately used as indices.
>>
>> Are they? Let's find out.
> [big snip]
> [...]
>> So your claim that "most integers are ultimately used as indices" doesn't
>> seem to hold water for this program, either.
> [...]
>
> You are playing on words: most integers in your programs are used to index
> into arrays or measure array sizes. Your examples side with Malcolm's
> point, just not with the exact terms of his assertion. He should rephrase
> it as "because most integers are ultimately used as indices or sizes".

Then surely it's Malcolm's job to phrase his challenge correctly.

> I for one do not particularly like to type or see size_t variables. I much
> prefer ssize_t as defined in Posix, to have the ability to represent non
> size values such as -1. I think C would be much simpler if object sizes
> were always less than or equal to INT_MAX. The reality is that a lot of
> platforms see it as necessary to support larger ones, if only by one bit
> (SIZE_MAX == UINT_MAX as in DOS/WIN16 and most 32 bit architectures), or
> even much larger as in most 64 bit systems. It is impossible to reset the
> clock and prevent that. We just have to live with it.

With modern 64-bit systems, it's becoming more and more common for
object sizes to exceed what can be represented in 32 bits. Making int
bigger than 32 bits causes problems; it leaves a gap in the type
system (if char is 8 bits and int is 64 bits, then short can be either
16 or 32 bits; either there's no 32-bit type or there's no 16-bit
type). C99's extended integer types might solve this, but I don't
know of any implementations that actually use them.
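
As a rough illustration of why a plain int cannot always stand in for an object size, the following sketch checks whether a given size is representable as an int. The name fits_in_int is hypothetical, and the cast assumes size_t is at least as wide as int, which holds on mainstream implementations but is not required by the standard.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>   /* SIZE_MAX (C99) */

/* Hypothetical helper: returns 1 if n can be stored in an int without loss.
   Assumes size_t is at least as wide as int, so the cast loses nothing. */
int fits_in_int(size_t n)
{
    return n <= (size_t)INT_MAX;
}
```

On a typical system with 32-bit int and 64-bit size_t, fits_in_int(SIZE_MAX) is 0: the largest size the implementation can describe is far beyond INT_MAX.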

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Looking for software development work in the San Diego area.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Nov 12 '07 #66
In article <ln************@nuthaus.mib.org>,
Keith Thompson <ks***@mib.org> wrote:
> "Charlie Gordon" <ne**@chqrlie.org> writes:
> [...]
>> What is wrong with Keith's news server, I don't receive any of his posts
>> (neither free.fr, nor aioe.org carry them). Is this some sort of agency
>> mandated embargo, as the name suggests ?
>
> I don't think there's anything wrong with my news server (which is
> "news-server.san.rr.com", provided by my ISP, Time Warner Cable's
> Roadrunner service). At least some people are obviously seeing my
> messages, and Google Groups shows four of my articles in this thread.
>
> Perhaps someone or something has decided to block messages from
> news-server.san.rr.com for some reason. Charlie, perhaps you could
> check with the administrators of free.fr and aioe.org? If I'm being
> blocked, it's always possible that somebody important is being blocked
> too. 8-)}

The Path header on the copy of the post to which I'm replying that
arrived at news.uwaterloo.ca was:
Path: news.uwaterloo.ca!meganewsservers.com!feeder2.on.meganewsservers.com!nx01.iad01.newshosting.com!newshosting.com!post01.iad01!roadrunner.com!not-for-mail

Comparing paths from servers where it has arrived may (or may not) give
some clues about which news server along the way is dropping them.

Keith's Message-ID is not broken, which eliminates one possible reason
for losing posts that we've seen in the past.
dave

Nov 12 '07 #67
Charlie Gordon said:

<test info snipped - see upthread>

Test datum #1: index variables: 0% (sample size: 255 lines)
Test datum #2: index variables < 35% (sample size: around 600 lines)

<snip>
> You are playing on words: most integers in your programs are used to
> index into arrays or measure array sizes.

No, they aren't. Colloquially, "most" means "nearly all", which is clearly
not true, and strictly speaking, "most" means "more than half", and not
even /that/'s true. In one of the samples, the number was a big fat zero,
and in the second, it was considerably less than half, indeed only
slightly over one third.

Unless of course *you* are playing with words, and claiming that, if less
is more, then fewest is most?

> Your examples side with Malcolm's
> point, just not with the exact terms of his assertion.

I don't see how.

<snip>

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Nov 12 '07 #68
"Richard Heathfield" <rj*@see.sig.invalid> wrote in message
Malcolm McLean said:

<snip>
>For instance one of my cases against size_t is that it is a major change
to the language,

That isn't an argument against size_t. Prototypes were a fairly major
change to the language, but I don't see you arguing against them.
>because most integers are ultimately used as indices.

Are they? Let's find out.

<snip>
>Go through a sample of C code, and count every instance of variables
declared as int, long, short, long long, arguably unsigned or signed
char, and derivatives or aliases of these types. Also count pointers such
as int *, int **, and the like. Then see how many times the variable, or
the variable pointed to, is used to ultimately derive index calculations.

Okay, let's see. Time is short (or is it time_t?), so I just picked one C
file on my local disk, and went through it line by line. (My choice wasn't
random, but neither was it 'cooked'. I just listed my projects, and one of
them happened to catch my eye. So it was kinda-sorta random.)

Funnily enough, the code was 255 lines long - just enough to be indexed by
an unsigned char on my system - but I'll put that one down to coincidence.

Const qualifiers have been ignored. So have strings.

int rc: return code
size_t len: string length (i.e. count of char objects)
size_t longest: measure of longest string constructed (i.e. count of char
objects)
size_t maxlinelen: measure of longest line encountered (i.e. count of char
objects)
size_t n: line count
int first: flag

NONE of these objects is used as an index into an array.
Or used to derive index calculations. Which almost certainly you are doing
with the string lengths. That's what I mean by "ultimately".
>
int argc: argument count (should be size_t really, but ISO seems to
disagree)
int rc: return code

So - somewhat to my surprise, actually - *none* of the integer objects in
that program were used for array indexing.
argc - used to derive indices for argv. This becomes obvious if you write
for(i=0;i<argc;i++)
    printf("%s\n", argv[i]);

if you say

if(argc == 3)
    printf("%s %s\n", argv[1], argv[2]);
else
    printf("must have 2 arguments\n");

you might want to argue that it isn't being used to derive the 1 and the 2
indices.
Naturally, we must consider the possibility of a statistical blip. So I
guess I'd better do the whole darn exercise again with a different
program.

Okay, this one is 1156 lines long. I'm going to ignore all the struct
definitions (which contain many integer types, few of which are likely to
be used as indexes) because I haven't got all night.

*int Status: return code
size_t ThisPattern: used as index
size_t len: line length (i.e. object count)
*int Found: flag
size_t SpinnerControl: used as index
int LineCount: line count
size_t len: used as index (this is in a different function to the other
len)
size_t pattern: used as pointer offset, which we'll count as an index
*int Status: return code
size_t ThisPattern: used as index
size_t len: used as pointer offset, i.e. index
*int Found: flag
size_t wcount: word count
size_t width: keeps track of how much horizontal space an output line takes up
*int Hit: flag
*int Status: return code
size_t ThisPattern: used as index
size_t len: current line length
size_t wcount: word count
? size_t width: horizontal space tracker
size_t idx: used as index
size_t j: used as index
* int done: flag
size_t curr: used as index
size_t i: used as index
size_t Size: tracks current buffer size
size_t BytesRead: tracks number of input bytes
* int Status: return code
size_t pos: records position of a letter in the alphabet
size_t ThisEntry: used as index
size_t ThisByte: used as index
* (?) size_t Count: counter
size_t pos: records position of a letter in the alphabet
size_t Freq: used as index
size_t ch: used as index
size_t Start: tracks starting position
size_t End: tracks ending position
size_t RangeStart: tracks start of range
size_t RangeEnd: tracks end of range
size_t LineLength: tracks line length
size_t ch: used as index
size_t Start: tracks starting position
size_t End: tracks ending position
size_t RangeStart: tracks start of range
size_t RangeEnd: tracks end of range
size_t LineLength: tracks line length

Well, I'm bored silly, at considerably more than halfway through the source.
Enough data there, I think, so I will look no further. Let me just count
them up:

non-index: 30
index: 16

So your claim that "most integers are ultimately used as indices" doesn't
seem to hold water for this program, either.
Counts of things in memory are almost always used ultimately to index
things.
>
Of course, I only checked around a thousand lines of source, maybe even
less, so the statistical significance of this result should not be
overplayed.
No, that should give you enough to go on. I've marked with an asterisk
everything I would say is not an index. Obviously I can't see your code, so
I've put in a few question marks, and the range trackers may not be used to
derive indices, but I find this very difficult to believe. So that makes
about 10 non-index variables.

Maybe a better way of saying what I am getting at would be "given that every
array in this program could potentially take up all available memory, how
many of my variables need to be size_t's?" The answer, as suggested by your
code, which has been well-written, is the vast majority. Now ask, "how many
are size_t's?" In your case, the vast majority. But what of lesser
programmers?

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Nov 12 '07 #69
"Chris Torek" <no****@torek.net> wrote in message
In article <47**********************@news.free.fr>
Charlie Gordon <ne**@chqrlie.org> wrote:
>>I for one do not particularly like to type or see size_t variables.
I much prefer ssize_t as defined in Posix, to have the ability to
represent non size values such as -1. ...

Given that signed integers behave badly (trap on overflow, and
have weird values like negative zero), but unsigned integers
always "work right" -- so that (unsigned int)x + (unsigned int)-1
is always the same as x-1 -- I could argue that C would be better
if it had nothing but *un*signed types. :-)

Seriously, you can use unsigned types throughout, and ssize_t is
generally unnecessary.
All indices must ultimately be positive. The problem is that intermediate
values can be negative, which doesn't happen often, but not so infrequently
as not to be a problem.

For instance my current program takes a window of 13 residues round a target
in a protein. If I'm taking a residue near the N-terminus, conventionally
regarded as the start, the window might overlap into negative values. Whilst
you can code round the problem of not having negatives, it is much better to
be able to say clearly if(residuei < 0).
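
To illustrate the point about negative intermediates, here is a minimal sketch, not the actual program described above. The function name residues_in_window and the WINDOW and HALF macros are made up for the example; only the 13-residue window and the residuei < 0 test come from the post. It assumes residues are numbered from 0.

```c
#define WINDOW 13            /* window width, from the post */
#define HALF   (WINDOW / 2)  /* 6 residues either side of the target */

/* Hypothetical helper: count how many positions of the window centred
   on `target` fall inside a sequence of `length` residues. Using a
   signed type lets the start of the window go below zero. */
int residues_in_window(long target, long length)
{
    int count = 0;
    long residuei;

    for (residuei = target - HALF; residuei <= target + HALF; residuei++) {
        if (residuei < 0)        /* before the start of the sequence */
            continue;
        if (residuei >= length)  /* past the end of the sequence */
            continue;
        count++;
    }
    return count;
}
```

For a target at position 0 in a long sequence this counts 7 residues (positions 0 to 6), because the six positions before the start are skipped by the residuei < 0 test.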

A trap on overflow is not bad behaviour incidentally. It is good behaviour.
I wish all my C programs would exit with an error message whenever their
capabilities are exceeded.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Nov 12 '07 #70
Malcolm McLean said:
"Richard Heathfield" <rj*@see.sig.invalid> wrote in message
<snip>
>int rc: return code
size_t len: string length (i.e. count of char objects)
size_t longest: measure of longest string constructed (i.e. count of
char objects)
size_t maxlinelen: measure of longest line encountered (i.e. count of
char objects)
size_t n: line count
int first: flag

NONE of these objects is used as an index into an array.
Or used to derive index calculations. Which almost certainly you are
doing with the string lengths. That's what I mean by "ultimately".
Then we have different meanings for the word "derive" (as is further
demonstrated by your later claim that argc is used to derive index
calculations).

<snip>
if you say
if(argc == 3)
printf("%s %s\n", argv[1], argv[2]);
else
printf("must have 2 arguments\n");

you might want to argue that it isn't being used to derive the 1 and the
2 indices.
Indeed, and that's what I mean about us using the word "derive" in
different ways.

<snip>
>Enough data there, I think, so I will look no further. Let me just count
them up:

non-index: 30
index: 16

So your claim that "most integers are ultimately used as indices"
doesn't seem to hold water for this program, either.
Counts of things in memory are almost always used ultimately to index
things.
Again, I don't agree. For example, in my longer example, wcount was used to
count hits in a search. The results were displayed as they were found, and
wcount was just used to count them, so that I could write this:

printf("\n%lu word%s matched.\n",
(unsigned long)wcount,
wcount == 1 ? "" : "s");

at the end of the report. At no time was wcount used to index anything.
Now, *sometimes*, yes, counts are used to limit indexing loops, and I
don't think anyone disputes that, but I think you are overstating the case
when you say "almost always".
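
As an aside on the cast in that printf call: C90 has no conversion specifier for size_t, hence the cast to unsigned long. C99 added the z length modifier. A minimal sketch of both styles follows; the function names report_c90 and report_c99 are hypothetical, and the output goes to a caller-supplied buffer (without the newlines) purely so the result is easy to inspect.

```c
#include <stdio.h>
#include <stddef.h>

/* C90 style, as in the post: cast to unsigned long and use %lu.
   The caller must supply a buffer that is large enough. */
int report_c90(char *buf, size_t wcount)
{
    return sprintf(buf, "%lu word%s matched.",
                   (unsigned long)wcount, wcount == 1 ? "" : "s");
}

/* C99 and later: %zu takes a size_t directly, and snprintf
   limits the write to bufsize bytes. */
int report_c99(char *buf, size_t bufsize, size_t wcount)
{
    return snprintf(buf, bufsize, "%zu word%s matched.",
                    wcount, wcount == 1 ? "" : "s");
}
```

Either way, wcount is only counted and printed here; nothing in the sketch indexes with it.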
>Of course, I only checked around a thousand lines of source, maybe even
less, so the statistical significance of this result should not be
overplayed.
No, that should give you enough to go on. I've marked with an asterisk
everything I would say is not an index. Obviously I can't see your code,
so I've put in a few question marks, and the range trackers may not be
used to derive indices, but I find this very difficult to believe.
Well, I can't do much about your beliefs, but I did pick out your question
marks (I spotted two), and went back to check them. One, 'Count', is
used to calculate a percentage. No indexing involved. The other, 'width'
(horizontal space tracker), is used like this:

width = 0;
while(q != NULL)
{
    printf("%s ", q->Word);
    ++wcount;
    width += len + 1;
    if(width > ScreenWidth - 2 * len)
    {
        width = 0;
        printf("\n");
    }
    q = q->Next;
}

Again, I see no indexing calculations there.

<snip>
Maybe a better way of saying what I am getting at would be "given that every
array in this program could potentially take up all available memory, how
many of my variables need to be size_t's?" The answer, as suggested by
your code, which has been well-written, is the vast majority. Now ask,
"how many are size_t's?" In your case, the vast majority. But what of
lesser programmers?
Your words are kind, indeed overly kind, but I think we need to be careful
about catering for the lowest common denominator. C isn't about pandering
to "lesser programmers", but about providing power, control, and
portability to those who can use it.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Nov 12 '07 #71
Malcolm McLean said:
"Richard Heathfield" <rj*@see.sig.invalid> wrote in message
>>All indices must ultimately be positive.

ITYM non-negative.
Only if you count from zero, which computers do but mathematicians don't.
In comp.lang.c, we can reasonably assume that we're talking about C, not
mathematics, unless a clear statement to the contrary is made.

<snip>
You shouldn't use the overflow trap as the intended error-handling
mechanism of your program.
I agree.
However if you make a mistake, it's there as a
last line of defence.
I disagree. It is not guaranteed to be there.
Except in a few cases, like video games, the worst thing you can do is
return wrong but plausible-seeming results, not to terminate.
The *best* thing you can do is to present information to the user that is
adequate for him or her to work out what's wrong and take appropriate
corrective action. This might involve re-entering data, or checking and
fixing file inputs, or even contacting the program supplier (i.e. you!)
with a bug report that is sufficiently detailed to enable the supplier to
find the precise cause of the fault.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Nov 12 '07 #72
"Richard Heathfield" <rj*@see.sig.invalid> wrote in message news:
TI******************************@bt.com...
Charlie Gordon said:

<test info snipped - see upthread>

Test datum #1: index variables: 0% (sample size: 255 lines)
Test datum #2: index variables < 35% (sample size: around 600 lines)

<snip>
>You are playing on words: most integers in your programs are used to
index into arrays or measure array sizes.

No, they aren't. Colloquially, "most" means "nearly all", which is clearly
not true, and strictly speaking, "most" means "more than half", and not
even /that/'s true. In one of the samples, the number was a big fat zero,
and in the second, it was considerably less than half, indeed only
slightly over one third.
You conveniently snipped all the evidence, so here it is:

First example:
int rc: return code
size_t len: string length (i.e. count of char objects)
size_t longest: measure of longest string constructed (i.e. count of char
objects)
size_t maxlinelen: measure of longest line encountered (i.e. count of char
objects)
size_t n: line count
int first: flag

NONE of these objects is used as an index into an array.
len, longest, maxlinelen are string lengths: they measure a count of objects
in an array, which is why they were made size_t. Unless n is intended to
count lines for allocating an array, size_t is not the proper type for it:
either you know that there are never more than 65535 lines in the file and
unsigned int is sufficient, or you don't and long or unsigned long should be
your choice.

Total: at least 50% of integers are size_t because they measure an array or
index into one.

Second example:
int Status: return code
size_t ThisPattern: used as index
size_t len: line length (i.e. object count)
int Found: flag
size_t SpinnerControl: used as index
int LineCount: line count
size_t len: used as index (this is in a different function to the other
len)
size_t pattern: used as pointer offset, which we'll count as an index
int Status: return code
size_t ThisPattern: used as index
size_t len: used as pointer offset, i.e. index
int Found: flag
size_t wcount: word count
size_t width: keeps track of how much horizontal space an output line takes up
int Hit: flag
int Status: return code
size_t ThisPattern: used as index
size_t len: current line length
size_t wcount: word count
size_t width: horizontal space tracker
size_t idx: used as index
size_t j: used as index
int done: flag
size_t curr: used as index
size_t i: used as index
size_t Size: tracks current buffer size
size_t BytesRead: tracks number of input bytes
int Status: return code
size_t pos: records position of a letter in the alphabet
size_t ThisEntry: used as index
size_t ThisByte: used as index
size_t Count: counter
size_t pos: records position of a letter in the alphabet
size_t Freq: used as index
size_t ch: used as index
size_t Start: tracks starting position
size_t End: tracks ending position
size_t RangeStart: tracks start of range
size_t RangeEnd: tracks end of range
size_t LineLength: tracks line length
size_t ch: used as index
size_t Start: tracks starting position
size_t End: tracks ending position
size_t RangeStart: tracks start of range
size_t RangeEnd: tracks end of range
size_t LineLength: tracks line length
46 total variables.
16 variables used as index per your own admission.
7 variables used to measure strings or parts of strings, i.e. counts of objects
in arrays.
wcount and pos are counts of objects for which size_t is not the proper
type.
RangeStart and RangeEnd track ranges: cannot tell from context if they
qualify as offsets in an array or as counts of objects unrelated to memory,
such as lines, for which you chose the type int anyway.
Of course argc also qualifies as the count of objects in the argv array.

Again at least 50% of integer variables are size_t because they measure an
array or index into one.
Unless of course *you* are playing with words, and claiming that, if less
is more, then fewest is most?
I'm just trying to use common sense to explain what I think Malcolm means in
his assertion: type size_t is only sensible for variables that represent
sizes, indices, and counts of objects in arrays in memory. A substantial
amount of variables in C programs are used for such a purpose and should be
made size_t instead of int or unsigned int if the arrays they refer to can
be of arbitrary size. Aside from those uses, variables used for counts of
objects with no relation to memory should use a type appropriate for their
assumed maximum value, not necessarily size_t.
>Your examples side with Malcolm's
point, just not with the exact terms of his assertion.

I don't see how.
I hope to have made this clearer, not precisely with his assertion, but with
my interpretation of what he means. "Most" is probably too much, but to say
that integer variables are "primarily" used for such purposes as indexing or
measuring object counts in memory arrays seems right to me.

--
Chqrlie.
Nov 12 '07 #73
Charlie Gordon said:
"Richard Heathfield" <rj*@see.sig.invalid> wrote in message
news: TI******************************@bt.com...
>Charlie Gordon said:
<snip>
>>
>>You are playing on words: most integers in your programs are used to
index into arrays or measure array sizes.

No, they aren't. Colloquially, "most" means "nearly all", which is
clearly not true, and strictly speaking, "most" means "more than half",
and not even /that/'s true. In one of the samples, the number was a big
fat zero, and in the second, it was considerably less than half, indeed
only slightly over one third.

You conveniently snipped all the evidence
It is appropriate to snip material unless I am directly commenting on it.
In this case, I was commenting not on the material I snipped but on your
claim that I was playing with words. If you meant "conveniently"
literally, fine, and I'm glad I was able to present my reply in a way that
you found convenient. But if you meant it ironically, it's a baseless
slur.
, so here it is:

First example:
>int rc: return code
size_t len: string length (i.e. count of char objects)
size_t longest: measure of longest string constructed (i.e. count of
char objects)
size_t maxlinelen: measure of longest line encountered (i.e. count of
char objects)
size_t n: line count
int first: flag

NONE of these objects is used as an index into an array.

len, longest, maxlinelen are string lengths: they measure a count of
objects
in an array, which is why they were made size_t.
len is populated like this: len = strlen(rest) + 1;

and used like this:

if(len > longest)
{
    size_t prev;
    longest = len;
    while(prev = longest, longest &= (longest - 1))
    {
        continue;
    }
    longest = prev * 2;
}

None of them is used for indexing into an array.
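
For readers puzzling over that loop: it appears to round longest up to the next power of two above len, presumably for sizing a buffer. A minimal sketch that isolates the calculation follows. The function name next_capacity is hypothetical, and the do/while form is a rewrite that should behave the same as the comma-operator loop quoted above.

```c
#include <stddef.h>

/* Hypothetical helper: returns the smallest power of two strictly
   greater than len, assuming len > 0 and the result fits in size_t.
   For len == 0 it returns 0, like the original loop would. */
size_t next_capacity(size_t len)
{
    size_t longest = len;
    size_t prev;

    /* x & (x - 1) clears the lowest set bit of x. Repeat until
       nothing is left; prev then holds only the highest set bit. */
    do {
        prev = longest;
        longest &= longest - 1;
    } while (longest != 0);

    return prev * 2;
}
```

So a len of 5 gives 8, and a len of 8 gives 16. The values are sizes throughout; the calculation itself subscripts nothing.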
Unless n is intended to
count lines for allocating an array, size_t is not the proper type for
it: either you know that there are never more than 65535 lines in the
file and unsigned int is sufficient, or you don't and long or unsigned
long should be your choice.
Agreed. Amazing what turns up in these discussions, isn't it? It should be
unsigned long (and now is).
Total: at least 50% of integers are size_t because they measure an array
or index into one.
Sure, but that's not what Malcolm is saying. He's saying that most integers
are used as indices, or are ultimately used to derive indices. What you
are trying to show is that those integers that I have deliberately chosen
to be of type size_t are used for object counts or sizes. Yes, *some* of
those size_t are used for indexing into an array, but his claim was that
most (i.e. at the very least, more than half) integral type objects are
used for indexing into arrays. My data showed that this claim was not true
in the given arbitrary sample. If you want to argue that more than half my
size_t objects were used for indexing into an array, well, I would not be
surprised if that were true, but it turns out that it isn't true in that
example.
Second example:
>int Status: return code
size_t ThisPattern: used as index
size_t len: line length (i.e. object count)
int Found: flag
size_t SpinnerControl: used as index
int LineCount: line count
size_t len: used as index (this is in a different function to the other
len)
size_t pattern: used as pointer offset, which we'll count as an index
int Status: return code
size_t ThisPattern: used as index
size_t len: used as pointer offset, i.e. index
int Found: flag
size_t wcount: word count
size_t width: keeps track of how much horizontal space an output line takes up
int Hit: flag
int Status: return code
size_t ThisPattern: used as index
size_t len: current line length
size_t wcount: word count
size_t width: horizontal space tracker
size_t idx: used as index
size_t j: used as index
int done: flag
size_t curr: used as index
size_t i: used as index
size_t Size: tracks current buffer size
size_t BytesRead: tracks number of input bytes
int Status: return code
size_t pos: records position of a letter in the alphabet
size_t ThisEntry: used as index
size_t ThisByte: used as index
size_t Count: counter
size_t pos: records position of a letter in the alphabet
size_t Freq: used as index
size_t ch: used as index
size_t Start: tracks starting position
size_t End: tracks ending position
size_t RangeStart: tracks start of range
size_t RangeEnd: tracks end of range
size_t LineLength: tracks line length
size_t ch: used as index
size_t Start: tracks starting position
size_t End: tracks ending position
size_t RangeStart: tracks start of range
size_t RangeEnd: tracks end of range
size_t LineLength: tracks line length

46 total variables.
16 variables used as index per your own admission.
Oh, please don't be so dramatic. It's not a question of admitting this and
confessing that, but a question of examining Malcolm's claim that most
integers are used as indices into arrays.
7 variables used to measure strings or parts of strings, i.e. counts of
objects in arrays.
Yes. Counts are not indices, however.
wcount and pos are counts of objects for which size_t is not the proper
type.
I agree about wcount (which I've now fixed to be unsigned long).
RangeStart and RangeEnd track ranges: cannot tell from context if they
qualify as offsets in an array or as counts of objects unrelated to
memory, such as lines, for which you chose the type int anyway.
They mark the lower and upper limits of a loop whose counter is used for
indexing into an array. I agree that the loop counter is used for indexing
(and I recorded it as such), but I do not agree that the limits are used
for indexing.
Of course argc also qualifies as the count of objects in the argv array.
Yes. It's a count. It ought to be size_t.
Again at least 50% of integer variables are size_t because they measure
an array or index into one.
Again, that's not Malcolm's claim. He says that most integers are used for
indexing arrays. I will agree that size_t is often used for indexing
(which is really another way of saying "counting objects"), but there are
many more uses for integers than mere indexing, important as that use
undoubtedly is.
>Unless of course *you* are playing with words, and claiming that, if
less is more, then fewest is most?

I'm just trying to use common sense to explain what I think Malcolm means
in his assertion: type size_t is only sensible for variables that
represent
sizes, indices, and counts of objects in arrays in memory.
I agree that those are the proper uses of size_t - and they describe how I
use it myself (except when I mistakenly use it for other things, and
you've spotted a couple of those yourself). But I don't think that's what
Malcolm means at all.

<snip>
"Most" is probably too much,
Right.
but to
say that integer variables are "primarily" used for such purposes as
indexing or measuring object counts in memory arrays seems right to me.
Fine, but it doesn't seem right to me. It seems to me that it would be
closer to the mark to say that indexing is one of the very many important
uses to which integers are put. If he means that integers are used more
frequently for indexing than for any other single purpose, then I might
even agree (or at least not bother to disagree), but to claim that this
frequency exceeds 50% seems to me to be an exaggeration.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Nov 12 '07 #74
Charlie Gordon said:
"Richard Heathfield" <rj*@see.sig.invalid> wrote in message
news: y7******************************@bt.com...
<snip>
>[An array index is] an object count. It measures the distance,
expressed in object units, between the start of the array and
the point in that array where can be found the object that we
care about. This is entirely consistent with the usage of
size_t in functions such as fread, fwrite, and calloc.

It is very misleading to describe size_t as measuring distances:
I was actually describing an ***array index***, and that's precisely what
it does - it gives you the number of objects between 0 and 'this' object.
Thus, if "object units" can be considered a unit of measurement, an array
index measures an offset in object units.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Nov 12 '07 #75
Richard wrote:
Keith Thompson <ks***@mib.org> writes:
....
>Richard Heathfield (who generally writes in English, not
Heathfieldese) is entirely capable of expressing disagreement without

Richard Heathfield writes in flowery prose that sometimes appears to be
designed to confuse non native speakers from what I can gather.
I cannot address the question of his intent; only he knows for sure what
he intended. But I know from personal experience that confusing
non-native speakers is not especially difficult, and needn't be ascribed
to deliberate intent. I'm married to one, and work with several others,
and avoiding confusion requires constant effort; preventing confusion
completely seems impossible. My own prose is sufficiently complicated
that I frequently unintentionally confuse native English speakers.
>saying that any arguments on the other side have no merit. I don't
presume to speak for him, but I believe he reserves claims that an
argument has no merit for cases where he believes that an argument has
no merit.

Isn't that like saying "when he thinks he's right he thinks he's right?
Or is my parser now broekn?
No. Saying "I disagree" in this kind of context is basically equivalent
to saying "Your argument has insufficient merit". Saying that "Your
argument has no merit" is a much stronger assessment. IMO, it was also a
correct assessment in this case.
Nov 12 '07 #76
In article <8p******************************@bt.com>,
Richard Heathfield <rj*@see.sig.invalid> wrote:
>>[An array index is] an object count. It measures the distance,
expressed in object units, between the start of the array and
the point in that array where can be found the object that we
care about. This is entirely consistent with the usage of
size_t in functions such as fread, fwrite, and calloc.
>It is very misleading to describe size_t as measuring distances:
>I was actually describing an ***array index***, and that's precisely what
it does - it gives you the number of objects between 0 and 'this' object.
C doesn't have array indexes. In an expression like ptr[i], i is the
second operand of the subscripting operator, whose other operand is a
pointer, not an array. It measures the signed displacement from ptr
in object units.

size_t is only suitable for this when the displacement happens to be
always positive (which of course must be the case when the first
operand results from conversion of an array).
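
A minimal sketch of a subscript used as a signed displacement from a pointer into the middle of an array. The function name sum_neighbours is hypothetical; ptrdiff_t is the standard signed type for differences between pointers.

```c
#include <stddef.h>

/* Hypothetical helper: sum the element p points at and its two
   neighbours. p must point at an element that has a valid element
   on each side of it. */
int sum_neighbours(const int *p)
{
    ptrdiff_t i;
    int sum = 0;

    for (i = -1; i <= 1; i++)
        sum += p[i];   /* p[-1] means *(p - 1): a negative displacement */
    return sum;
}
```

Given int a[5] = {1, 2, 3, 4, 5}, the call sum_neighbours(&a[2]) adds a[1], a[2] and a[3]. The subscripts -1, 0 and 1 are relative to the pointer, not to the start of the array.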
>Thus, if "object units" can be considered a unit of measurement, an array
index measures an offset in object units.
"offset" is better, because it doesn't have the implication of
unsignedness that "distance" and "number of objects" have.

-- Richard

--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.
Nov 12 '07 #77
ri*****@cogsci.ed.ac.uk (Richard Tobin) writes:
In article <8p******************************@bt.com>,
Richard Heathfield <rj*@see.sig.invalid> wrote:
>>>[An array index is] an object count. It measures the distance,
expressed in object units, between the start of the array and
the point in that array where can be found the object that we
care about. This is entirely consistent with the usage of
size_t in functions such as fread, fwrite, and calloc.
>>It is very misleading to describe size_t as measuring distances:
A length of a string is a distance. So is the size of a malloc.
>
>>I was actually describing an ***array index***, and that's precisely what
it does - it gives you the number of objects between 0 and 'this'
object.
But in most API functions that return size_t we are always talking about
number of bytes. Not number of elements.
>
C doesn't have array indexes. In an expression like ptr[i], i is the
Of course C has array indices. They do not have a specific type, but
it does feature indexing into arrays.

Nov 12 '07 #78
In article <03************@news.individual.net>,
Richard <rg****@gmail.com> wrote:
>C doesn't have array indexes. In an expression like ptr[i], i is the
>Of course C has array indices. They do not have a specific type, but
it does feature indexing into arrays.
Of course you can *do* array indexing, but you do it with a more
general mechanism in which the subscript is not inherently unsigned.

-- Richard

--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.
Nov 12 '07 #79
ri*****@cogsci.ed.ac.uk (Richard Tobin) writes:
In article <03************@news.individual.net>,
Richard <rg****@gmail.com> wrote:
>>C doesn't have array indexes. In an expression like ptr[i], i is the
>>Of course C has array indices. They do not have a specific type, but
it does feature indexing into arrays.

Of course you can *do* array indexing, but you do it with a more
general mechanism in which the subscript is not inherently unsigned.

-- Richard
I was pointing out that the statement "C doesn't have array indexes" is
somewhat misleading.
Nov 12 '07 #80
Richard Tobin said:
Richard Heathfield <rj*@see.sig.invalid> wrote:
<snip>
>
>>I was actually describing an ***array index***, and that's precisely what
it does - it gives you the number of objects between 0 and 'this' object.

C doesn't have array indexes.
Very good. :-) But of course you know what I mean (as your followup
clearly shows).

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Nov 12 '07 #81
>"Chris Torek" <no****@torek.netwrote in message
>... I could argue that C would be better
if it had nothing but *un*signed types. :-)

Seriously, you can use unsigned types throughout, and ssize_t is
generally unnecessary.
In article <Do******************************@bt.com>,
Malcolm McLean <re*******@btinternet.com> wrote:
>All indices must ultimately be positive.
You mean "nonnegative" (as I think someone else has already noted).
>The problem is that intermediate values can be negative, which
doesn't happen often, but not so infrequently as not to be a problem.
So just use an unsigned type. The "negative" numbers will be large
positive numbers, but because you are working in a ring mod 2-sup-K
for some K, they work EXACTLY THE SAME AS "negative" numbers (at
least for everything you will do with them).

The test for "is some variable notionally negative" is simply "x
greater than LIM" for some constant LIM. If you set this limit to
cause approximately half of the numbers to be "negative", you merely
need to squint a bit to realize you have achieved "two's complement"
arithmetic.

For a concrete example, if you are using "unsigned int" and UINT_MAX
is 65535, the "negative half" of the space is all those values in
the range [32768 .. 65535]. If UINT_MAX is 4294967295, the "negative
half" of the space is those values in [2147483648 .. 4294967295].

(This works fine for addition and subtraction, but requires a fixup
step for multiplication and division, when thinking of half of the
space as "negative". Anyone who has ever coded multiply and divide
routines for CPUs that lack the instructions should be familiar
with this.)
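
The arithmetic described above can be sketched in a few lines. LIM is the name used in the post; the function names are hypothetical, and setting LIM to UINT_MAX / 2 is the "approximately half" choice.

```c
#include <limits.h>

/* Values above this limit are treated as notionally negative. */
#define LIM (UINT_MAX / 2)

/* Nonzero if x lies in the "negative half" of the unsigned range. */
int is_notionally_negative(unsigned int x)
{
    return x > LIM;
}

/* x + (unsigned int)-1 wraps modulo UINT_MAX + 1, giving x - 1. */
unsigned int minus_one(unsigned int x)
{
    return x + (unsigned int)-1;
}

/* Magnitude of a notionally negative value: 0 - x, again modulo
   UINT_MAX + 1. Other values are returned unchanged. */
unsigned int magnitude(unsigned int x)
{
    return is_notionally_negative(x) ? 0u - x : x;
}
```

For example, 3u - 5u wraps to a large value above LIM, so it tests as notionally negative and its magnitude comes back as 2. With a 16-bit unsigned int, LIM is 32767 and the negative half is [32768 .. 65535], matching the figures above.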
>A trap on overflow is not bad behaviour incidentally. It is good behaviour.
So you have an implementation in which:

int i;
...
i = INT_MAX - 4;
...
i += 32; /* result should be INT_MAX + 31, ie, "overflow" */

causes a runtime trap? Those are, alas, all too rare. Can you
name your implementation?
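
Since signed overflow is undefined behaviour in C, a trap is permitted but not required. A program that wants a definite response has to test before it adds. A minimal sketch follows; checked_add is a hypothetical name, not a standard function.

```c
#include <limits.h>

/* Hypothetical helper: add b to *a only if the result is
   representable as an int. Returns 1 on success; returns 0 and
   leaves *a untouched if the addition would overflow. */
int checked_add(int *a, int b)
{
    if (b > 0 && *a > INT_MAX - b)
        return 0;   /* would exceed INT_MAX */
    if (b < 0 && *a < INT_MIN - b)
        return 0;   /* would go below INT_MIN */
    *a += b;
    return 1;
}
```

With i set to INT_MAX - 4, checked_add(&i, 32) returns 0 and leaves i alone, so the caller can report the problem instead of relying on whatever the implementation happens to do.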
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 12 '07 #82
Richard Heathfield wrote, On 12/11/07 09:42:
Malcolm McLean said:
>"Richard Heathfield" <rj*@see.sig.invalid> wrote in message
>>>All indices must ultimately be positive.
ITYM non-negative.
Only if you count from zero, which computers do but mathematicians don't.

In comp.lang.c, we can reasonably assume that we're talking about C, not
mathematics, unless a clear statement to the contrary is made.
<snip>

Actually, positive means greater than zero in mathematics, so if as
Malcolm suggested one is using the maths definition you (Richard) were
still correct in your correction. The same applies in English according
to a brief check of a few dictionaries.
--
Flash Gordon
Nov 12 '07 #83
Flash Gordon <sp**@flash-gordon.me.uk> writes:
Richard Heathfield wrote, On 12/11/07 09:42:
>Malcolm McLean said:
>>"Richard Heathfield" <rj*@see.sig.invalid> wrote in message
All indices must ultimately be positive.
ITYM non-negative.

Only if you count from zero, which computers do but mathematicians don't.
In comp.lang.c, we can reasonably assume that we're talking about C,
not mathematics, unless a clear statement to the contrary is made.

<snip>

Actually, positive means greater than zero in mathematics, so if as
Malcolm suggested one is using the maths definition you (Richard) were
still correct in your correction. The same applies in English
according to a brief check of a few dictionaries.
I don't think anyone was disagreeing over the meanings of the terms
"positive" and "non-negative". (Positive values are greater than
zero; non-negative values are greater than or equal to zero.)

Somebody (attribution lost) wrote that "All indices must ultimately be
positive". RH corrected that to "non-negative", since 0 is a valid
index. Malcolm IMHO muddied the waters by bringing a non-C context
into the discussion.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Looking for software development work in the San Diego area.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Nov 12 '07 #84
Flash Gordon said:

<snip>
Actually, positive means greater than zero in mathematics, so if as
Malcolm suggested one is using the maths definition you (Richard) were
still correct in your correction.
Well, his original claim was not that 0 is positive but that all indices
are positive, which is true from many mathematicians' point of view, since
they tend to label indices starting from 1. Having said that, I've seen a
good few mathematical works in which indices have been labeled from 0.
--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Nov 12 '07 #85

"Flash Gordon" <sp**@flash-gordon.me.uk> wrote in message
Looking at a small amount of the code for clamav I saw closer to 50% of
integer variables having something to do with indexing/size (including
variables relating to indexing/size of files), but it did not appear to be
quite up to 50%.

However, it is normally the person making the claim that is expected to
provide the evidence. You are still asserting what we will find if we do
the test, not providing the evidence to back up your claim.
People can't count them right. Maybe because the idea of "ultimately used to
derive indices" is a bit woolly.
If we say

for(i=start;i<=end;i++)
  array[i] = 0;

start and end hold index values, though they are not actually used as the
indexing variable themselves. They are intermediates.

On the other hand, if we write

cmp = strcmp(argv[1], "-x");
if(cmp == 0)
  outfile = argv[2];
else
  outfile = argv[1];

we wouldn't say that argv[1] is really an "index string" and cmp an index
intermediate. The difference is obvious but takes a little bit of sensitivity
to apply.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm
Nov 12 '07 #86
Malcolm McLean wrote, On 12/11/07 21:57:
>
"Flash Gordon" <sp**@flash-gordon.me.uk> wrote in message
>Looking at a small amount of the code for clamav I saw closer to 50%
of integer variables having something to do with indexing/size
(including variables relating to indexing/size of files), but it did
not appear to be quite up to 50%.

However, it is normally the person making the claim that is expected
to provide the evidence. You are still asserting what we will find if
we do the test, not providing the evidence to back up your claim.
People can't count them right.
That comment and the rest of your post completely ignores the points
that I made. Ignoring points does not invalidate them.

Why should we do your work for you? If you want to convince people, *you*
provide the evidence to convince them.

In the part of the post you snipped (without marking the cut which is
potentially misleading) I provided a count from one file and it suggests
that your assertion is wrong for the type of code I am currently most
involved in.

I scanned but did not count another; in the part you left in, I gave my
impression from a quick look at that second file, which also suggests you
are wrong.
Maybe because the idea of "ultimately
used to derive indices" is a bit woolly.
If we say

for(i=start;i<=end;i++)
  array[i] = 0;

start and end hold index values, though they are not actually used as
the indexing variable themselves. They are intermediates.
<snip>

I used the most liberal interpretation I could which would have included
start and end in your example above and much more. It still suggests you
are wrong.

Now, since RH's checks and mine both disagree with your assertion, and,
unlike the unrelated study you pointed out, both Richard and I used
real C code developed for real applications, will you accept the
possibility that you could be wrong? I'm not asking you to say you are
definitely wrong, just that you admit that it is not definite that you
are correct.
--
Flash Gordon
Nov 13 '07 #87
Chris Torek <no****@torek.net> writes:
<snip>
So you have an implementation in which:

int i;
...
i = INT_MAX - 4;
...
i += 32; /* result should be INT_MAX + 31, ie, "overflow" */

causes a runtime trap? Those are, alas, all too rare. Can you
name your implementation?
I discovered only recently that gcc can make signed integer arithmetic
overflow trap (on some platforms at least) with the -ftrapv flag. It can
be handy to have that assurance.
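
For implementations that do not trap, the overflow in the quoted example can be detected before it happens. The following is only a minimal sketch, not code from this thread; the function name add_would_overflow is made up for illustration. It compares against INT_MAX and INT_MIN so that the overflowing addition itself is never evaluated.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper (not from the thread): reports whether a + b
   would overflow int, without performing the overflowing addition. */
bool add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;   /* a + b would exceed INT_MAX */
    if (b < 0)
        return a < INT_MIN - b;   /* a + b would fall below INT_MIN */
    return false;                 /* adding zero never overflows */
}
```

With i = INT_MAX - 4, a call such as add_would_overflow(i, 32) reports the problem, so the caller can decide what to do instead of relying on a trap.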

--
Ben.
Nov 13 '07 #88
Tubular Technician wrote:
Sooo... what's the real deal with size_t? Where should it be
used/avoided (examples?)
size_t is an implementation defined unsigned integer type.

The biggest problem with size_t
is that it may or may not be lower ranking than int,
and so it might or might not
promote from an unsigned type to a signed type,
which matters a great deal in comparisons.

And so expressions that have operands of size_t
may require some casts for portability reasons.

Consider the necessity of the casts in the case
where these expressions are defined as being equal to (1):
(UINT_MAX > (size_t)-1)
(UINT_MAX > INT_MAX)
( INT_MAX > (size_t)-1)
( count == (size_t)-2)
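
To make the cast idiom concrete, here is a minimal sketch. It is not taken from get_line.c, and both function names are made up for illustration. It relies only on the rule that converting -1 to an unsigned type yields that type's maximum value, so (size_t)-1 is SIZE_MAX and (size_t)-2 is one less.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1: conversion of -1 to size_t wraps to the largest size_t value. */
int size_t_minus_one_is_max(void)
{
    return (size_t)-1 == SIZE_MAX;
}

/* Hypothetical saturating counter in the style of the loop quoted in this
   post: it stops incrementing once count reaches (size_t)-2. The cast keeps
   both operands of != in the range of size_t, whatever its rank is. */
size_t bump_count(size_t count)
{
    if (count != (size_t)-2) {
        ++count;
    }
    return count;
}
```

On an implementation where size_t ranks below int, an uncast -2 would stay a negative int in the comparison and could never equal count, which is why the cast is written out.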

For this code:
int get_line(char **lineptr, size_t *n, FILE *stream)
{
    int rc;
    void *p;
    size_t count;

    count = 0;
    while ((rc = getc(stream)) != EOF) {
        if (count != (size_t)-2) {
            ++count;
        }
        if ((size_t)(count + 2u) > *n) {
http://www.mindspring.com/~pfilandr/...ine/get_line.c

--
pete
Nov 13 '07 #89
"Malcolm McLean" <re*******@btinternet.com> wrote in message
news:AN******************************@bt.com...
"Richard Heathfield" <rj*@see.sig.invalid> wrote in message
>>All indices must ultimately be positive.

ITYM non-negative.
Only if you count from zero, which computers do but mathematicians
don't.
Ah, but computers have all kinds of funny things when it comes to math, like
positive and negative zero on some systems.

For that matter, are negative indices necessarily incorrect?

char a[2], *b, c;
b = &a[1];
c = b[-1];

ISTM that is defined according to C's rules, since the last line is
equivalent to "c = *(b + -1);", which is of course defined.
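
A minimal sketch of the same point, with the array initialized so that the value read is well defined. The helper name previous_char is made up for illustration; the only assumption is that the pointer passed in does not point at the first element of its array.

```c
/* Hypothetical helper: returns the element just before the one p points at.
   Valid only when p points past the first element of an array, so that
   p - 1 still points into the same array. */
char previous_char(const char *p)
{
    return p[-1];   /* same as *(p + -1) */
}
```

Given char a[2] = {'x', 'y'}, the call previous_char(&a[1]) reads a[0]. Passing &a[0] would be undefined, so a negative index is only as valid as the pointer it is applied to.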

S

--
Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking
--
Posted via a free Usenet account from http://www.teranews.com

Nov 18 '07 #90
