size_t problems

jacob navia

I am trying to compile as much code in 64 bit mode as
possible to test the 64 bit version of lcc-win.

The problem appears now that size_t is now 64 bits.

Fine. It has to be since there are objects that are more than 4GB
long.

The problem is, when you have in thousands of places

int s;

// ...
s = strlen(str) ;

Since strlen returns a size_t, we have a 64 bit result being
assigned to a 32 bit int.

This can be correct, and in 99.9999999999999999999999999%
of the cases the string will be smaller than 2GB...
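To make the narrowing concrete, here is a minimal sketch of a checked conversion from size_t to int. The helper name size_to_int and its return convention are assumptions for illustration; nothing like it appears in the original post.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper (not from the thread): stores n in *out and returns 1
   when n fits in an int; returns 0 and leaves *out untouched otherwise. */
static int size_to_int(size_t n, int *out)
{
    assert(out != NULL);
    if (n > (size_t)INT_MAX)
        return 0;              /* value would not survive the narrowing */
    *out = (int)n;             /* explicit cast: the narrowing is intentional */
    return 1;
}
```

A caller would write something like size_to_int(strlen(str), &s) and handle the failure case, instead of assigning the size_t result to an int directly.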

Now the problem:

Since I warn each time a narrowing conversion is done (since
that could lose data) I end up with hundreds of warnings, one each time
a construct like int a = strlen(...) appears. This clutters
everything, and important warnings get lost.
I do not know how to get out of this problem. Maybe one of you has
a good idea? How do you solve this when porting to 64 bits?

jacob

Aug 29 '07


CBFalconer

Keith Thompson wrote:

>

.... snip ...

>
"const" in a parameter declaration doesn't do anything useful for
the caller, since (as I'm sure you know) a function can't modify
an argument anyway. It does prevent the function from (directly)
modifying its own parameter (a local object), but that's of no
concern to the caller.

It does if you are passing a pointer to a const item. That way you
can protect the parameter and avoid copying large objects. Such
as, but not limited to, strings.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Aug 31 '07 #101

Ian Collins

CBFalconer wrote:

Keith Thompson wrote:
.... snip ...
>"const" in a parameter declaration doesn't do anything useful for
the caller, since (as I'm sure you know) a function can't modify
an argument anyway. It does prevent the function from (directly)
modifying its own parameter (a local object), but that's of no
concern to the caller.

It does if you are passing a pointer to a const item. That way you
can protect the parameter and avoid copying large objects. Such
as, but not limited to, strings.

Why would you want to? That implies writing something like

void f( const int* const );

which is rather pointless.

simply writing

void f( const int* );

protects the pointed to item. You can change the value of the pointer,
but it still points to constant data.
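The distinction can be sketched as follows. The function name sum_ints is made up for illustration; the point is only that a pointer-to-const parameter may itself be changed inside the function while the data it points to may not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: p is a (non-const) pointer to const int.
   The function may advance its own copy of the pointer, but it may
   not write through it. */
static int sum_ints(const int *p, size_t n)
{
    int total = 0;
    assert(p != NULL || n == 0);
    while (n-- > 0)
        total += *p++;   /* modifying the local pointer is allowed */
    /* *p = 0;  would be rejected: the pointed-to int is const */
    return total;
}
```

Declaring the parameter as const int *const p would additionally forbid the p++ above, which only constrains the function body and changes nothing for the caller.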

I'm not sure where "avoid copying large objects" comes from, care to
elaborate?

--
Ian Collins.

Aug 31 '07 #102

Richard Bos

Ben Bacarisse <be********@bsb.me.uk> wrote:

ga*****@xmission.xmission.com (Kenny McCormack) writes:

both of you just post carp

Fishing for compliments?

No, Kenny's just trolling again.

Richard

Aug 31 '07 #103

Richard

rl*@hoekstra-uitgeverij.nl (Richard Bos) writes:

Ben Bacarisse <be********@bsb.me.uk> wrote:

>ga*****@xmission.xmission.com (Kenny McCormack) writes:

both of you just post carp

Fishing for compliments?

No, Kenny's just trolling again.

Richard

I think someone missed the joke.

Aug 31 '07 #104

Malcolm McLean

"Ben Pfaff" <bl*@cs.stanford.edu> wrote in message
news:87************@blp.benpfaff.org...

"Malcolm McLean" <re*******@btinternet.com> writes:

>In fact if you use size_t safely and consistently, virtually all ints
need to be size_t's. The committee have managed to produce a very
far-reaching change to the C language, simply though fixing up a
slight problem in the interface to malloc().

My code does in fact end up using size_t quite often. If I were
perfectly consistent, it would probably use size_t even more
often. Why is that a problem?

If you are indexing an arbitrary-length array, effectively now it is an
error to use int. That's a big change from what most people would recognise
as "C". It is also very undesirable that i, which holds the index, is
described as a "size_t" when it certainly doesn't hold a size. N, the count,
doesn't hold an amount of memory either, but is also a size_t.

Then you've got the problem of two standards, in fact 14 standards at last
count, for holding integers. That makes it harder and harder to make
functions fit together. Code is littered with casts because cursorxy takes
two int *s, whilst drawxy takes two size_ts.

The best solution is to say "int is a type which can index any array"
which means that int must have the same number of bits as a char pointer,
which on 99% of platforms is no problem at all. By making the standard
slightly loose the weirdos can break this rule - if char *'s have an extra
four bits, because underlying bytes are 32 bits, it might be unacceptably
inefficient to have ints large enough to hold them, but the loss of the
ability to index strings taking up an eighth of the address space or more,
without special code, isn't too bad a loss, and the problem won't be solved
by size_t.

There is also the issue of signedness. Again, this is more theoretical than
practical. In practice you can live without the extra bit, because it is only a
problem handling

The other problem is backwards compatibility with legacy libraries. However,
as I pointed out, 64 bits of memory will be the maximum for a long time to
come. We mustn't damage C now purely to call a few APIs left over from
32-bit days.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Aug 31 '07 #105

Richard Bos

Richard <rg****@gmail.com> wrote:

rl*@hoekstra-uitgeverij.nl (Richard Bos) writes:

Ben Bacarisse <be********@bsb.me.uk> wrote:

ga*****@xmission.xmission.com (Kenny McCormack) writes:

both of you just post carp

Fishing for compliments?
No, Kenny's just trolling again.

I think someone missed the joke.

Yes, you. Look up the (supposed) etymology of "trolling" in the Jargon
File.

Richard

Aug 31 '07 #106

Malcolm McLean

"Richard Heathfield" <rj*@see.sig.invalid> wrote in message
news:J5******************************@bt.com...

Charlton Wilbur said:

But you don't have /any/ dollars in your bank account. What you have is
a number which represents a dollar /balance/ - it is, if you like, the
difference between the number of dollars the bank owes you and the
number of dollars you owe the bank. It can, however, reasonably be
regarded as a monetary amount, and as such can of course be negative.
The number of dollars in your wallet is /also/ a monetary amount, and
monetary amounts can be negative. Of course, the number of dollars in
your wallet cannot be negative, any more than 6 can be negative, even
though 6 is an int and ints can be negative.

It is also possible to have imaginary money, in your paycheck. Happened to
me once.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Aug 31 '07 #107

Richard

rl*@hoekstra-uitgeverij.nl (Richard Bos) writes:

Richard <rg****@gmail.com> wrote:

>rl*@hoekstra-uitgeverij.nl (Richard Bos) writes:

Ben Bacarisse <be********@bsb.me.uk> wrote:

ga*****@xmission.xmission.com (Kenny McCormack) writes:

both of you just post carp

Fishing for compliments?

No, Kenny's just trolling again.

I think someone missed the joke.

Yes, you. Look up the (supposed) etymology of "trolling" in the Jargon
File.

Richard

But that's not funny.

Aug 31 '07 #108

jacob navia

spacecriter (Bill C) wrote:

jacob navia wrote:
>I am trying to compile as much code in 64 bit mode as
possible to test the 64 bit version of lcc-win.

The problem appears now that size_t is now 64 bits.

Fine. It has to be since there are objects that are more than 4GB
long.

The problem is, when you have in thousands of places

int s;

// ...
s = strlen(str) ;

Since strlen returns a size_t, we have a 64 bit result being
assigned to a 32 bit int.

This can be correct, and in 99.9999999999999999999999999%
of the cases the string will be smaller than 2GB...

Now the problem:

Since I warn each time a narrowing conversion is done (since
that could lose data) I end up with hundreds of warnings, one each time
a construct like int a = strlen(...) appears. This clutters
everything, and important warnings get lost.
I do not know how to get out of this problem. Maybe one of you has
a good idea? How do you solve this when porting to 64 bits?

jacob

I assume that you don't want to redefine s as a size_t because it may be
used elsewhere as an int, and you would rather not track down everywhere it
may be used.

So why not replace all the strlen() calls with your own function (maybe
call it i_strlen(), or somesuch name) that returns an int?

That would be a good solution!

THANKS!

jacob

Aug 31 '07 #109

Richard

jacob navia <ja***@jacob.remcomp.fr> writes:

spacecriter (Bill C) wrote:
>So why not replace all the strlen() calls with your own function (maybe
call it i_strlen(), or somesuch name) that returns an int?

That would be a good solution!

THANKS!

jacob

You can call it "littlestrlen()" ....

Aug 31 '07 #110

CBFalconer

Ian Collins wrote:

CBFalconer wrote:
>Keith Thompson wrote:

>.... snip ...

>>"const" in a parameter declaration doesn't do anything useful for
the caller, since (as I'm sure you know) a function can't modify
an argument anyway. It does prevent the function from (directly)
modifying its own parameter (a local object), but that's of no
concern to the caller.

It does if you are passing a pointer to a const item. That way you
can protect the parameter and avoid copying large objects. Such
as, but not limited to, strings.

Why would you want to? That implies writing something like

void f( const int* const );

I think you are confused. "void f(const int* param)" declares param
to be a pointer pointing to a const int (or the first item in a
const array of ints). There is no second const.

However "void f(const struct blah param)" declares param to be an
initialized (and const) copy of something that originated as a
struct blah. The entire struct has been copied into the parameter
space of the function f. This copying is what can be avoided by
using a pointer, as in "void f(const struct blah *param)".

You just can't pass an array by value in C without embedding it in
a suitable struct.
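The difference between the two parameter forms can be sketched like this. The struct layout and the function names first_by_value and first_by_pointer are assumptions for illustration; only struct blah as a name comes from the post.

```c
#include <assert.h>
#include <stddef.h>

/* An array embedded in a struct, so that it can be passed by value at all. */
struct blah {
    int data[1024];
};

/* By value: the whole struct (1024 ints here) is copied into the parameter. */
static int first_by_value(const struct blah param)
{
    return param.data[0];
}

/* By pointer to const: only a pointer is passed, and the function
   cannot modify the caller's object through it. */
static int first_by_pointer(const struct blah *param)
{
    assert(param != NULL);
    return param->data[0];
}
```

Both functions return the same value; the second simply avoids copying sizeof(struct blah) bytes on each call.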

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>


Aug 31 '07 #111

Keith Thompson

CBFalconer <cb********@yahoo.com> writes:

Keith Thompson wrote:
... snip ...
>>
"const" in a parameter declaration doesn't do anything useful for
the caller, since (as I'm sure you know) a function can't modify
an argument anyway. It does prevent the function from (directly)
modifying its own parameter (a local object), but that's of no
concern to the caller.

It does if you are passing a pointer to a const item. That way you
can protect the parameter and avoid copying large objects. Such
as, but not limited to, strings.

I was referring only to applying 'const' to the parameter itself. (I
could have been clearer.)

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Aug 31 '07 #112

Ian Collins

CBFalconer wrote:

Ian Collins wrote:
>CBFalconer wrote:
>>Keith Thompson wrote:
.... snip ...

"const" in a parameter declaration doesn't do anything useful for
the caller, since (as I'm sure you know) a function can't modify
an argument anyway. It does prevent the function from (directly)
modifying its own parameter (a local object), but that's of no
concern to the caller.
It does if you are passing a pointer to a const item. That way you
can protect the parameter and avoid copying large objects. Such
as, but not limited to, strings.
Why would you want to? That implies writing something like

void f( const int* const );

I think you are confused. "void f(const int* param)" declares param
to be a pointer pointing to a const int (or the first item in a
const array of ints). There is no second const.

Confused by what you wrote, maybe? You said "That way you can protect the
parameter", but the only way you can prevent the parameter being modified is
to make the parameter const. I (and I think Keith) was pointing out
that making the parameter type const is seldom, if ever, useful.

--
Ian Collins.

Aug 31 '07 #113

Richard Heathfield

Malcolm McLean said:

<snip>

If you are indexing an arbitrary-length array, effectively now it is
an error to use int.

Yes.

That's a big change from what most people would recognise as "C".

No. It's the way I've been using C ever since I learned how to do it
properly, rather than follow the Schildt-style advice I had received up
to that point.

It is also very undesirable that i, which holds the
index, is described as a "size_t" when it certainly doesn't hold a
size.

I'll agree that an index doesn't hold a size...

N, the count, doesn't hold an amount of memory either, but is
also a size_t.

....but I can't agree that it doesn't hold a count.

Then you've got the problem of two standards, in fact 14 standards at
last count, for holding integers.

The integer types that C90 must support (and which are required to be
integer types) are char, signed char, unsigned char, short, unsigned
short, int, unsigned int, long, unsigned long, wchar_t, size_t,
ptrdiff_t, sig_atomic_t - which is thirteen.

C99 adds long long and unsigned long long, making at least fifteen, and
then there are an indeterminate number of intsuch-and-such_t types,
making a count impractical.

Either way, your count, like your argument, is wrong.

That makes it harder and harder to
make functions fit together.

No, it doesn't.

Code is littered with casts because
cursorxy takes two int *s, whilst drawxy takes two size_ts.

The only code that is littered with casts is broken code.

The best solution is to say "int is a type which can index any
array" which means that int must have the same number of bits as a
char pointer, which on 99% of platforms is no problem at all.

No, the best solution is to use size_t where appropriate, and int where
appropriate.

<snip>

The other problem is backwards compatibility with legacy libraries.

Not a problem at all. All you have to do is recompile the library with
the new compiler. If that breaks the library, it probably wasn't a very
good library anyway.

However, as I pointed out, 64 bits of memory will be the maximum for a
long time to come.

Ha. And perhaps ha.

We mustn't damage C now purely to call a few APIs left over from
32-bit days.

C has never /had/ 32-bit days. C doesn't care how many bits a byte or an
int or a size_t or an address bus has, subject to certain basic minima
which are considerably lower than 32. Write your code to depend on
64-bit ints, and you can bet your bottom dollar it'll break one day,
and you'll be resisting the change to 128 or 256 or whatever it is out
of sheer fear of breakage. That's your problem. The solution? Stop
depending on particular sizes, and work out how to program in the
large.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Aug 31 '07 #114

CBFalconer

jacob navia wrote:

spacecriter (Bill C) wrote:
>jacob navia wrote:

.... snip ...

>>
I assume that you don't want to redefine s as a size_t because it
may be used elsewhere as an int, and you would rather not track
down everywhere it may be used.

So why not replace all the strlen() calls with your own function
(maybe call it i_strlen(), or somesuch name) that returns an int?

That would be a good solution!

No, that would be a temporary glossing over, avoiding fixing the
fundamental problem, and postponing the fix (or abandonment) until
later, with attendant confusion of the code. Not wise in the long
term. Some things are better done right.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Aug 31 '07 #115

Keith Thompson

jacob navia <ja***@jacob.remcomp.fr> writes:

spacecriter (Bill C) wrote:
>I assume that you don't want to redefine s as a size_t because it
may be used elsewhere as an int, and you would rather not track
down everywhere it may be used. So why not replace all the
strlen() calls with your own function (maybe call it i_strlen(), or
somesuch name) that returns an int?

That would be a good solution!

THANKS!

Hmm, sounds familiar.

| I suppose you could write a strlen wrapper that calls the real strlen,
| checks whether the result exceeds INT_MAX (if you think that check is
| worth doing), and then returns the result as an int. That's assuming
| strlen calls are the only things triggering the warnings. And you'd
| still have to make hundreds of changes in the code.

<http://groups.google.com/group/comp.lang.c/msg/3ef33439c43be6ac>
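The wrapper described in the quoted paragraph could look roughly like the sketch below. The name i_strlen is the one suggested earlier in the thread; returning -1 when the length exceeds INT_MAX is an assumption made here, since the thread does not say what the wrapper should do in that case.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Sketch of a checked wrapper: calls the real strlen and returns the
   length as an int, or -1 (an assumed convention) if it does not fit. */
static int i_strlen(const char *s)
{
    size_t n;
    assert(s != NULL);
    n = strlen(s);
    if (n > (size_t)INT_MAX)
        return -1;          /* length cannot be represented as an int */
    return (int)n;          /* explicit, checked narrowing */
}
```

Each call site still has to be changed from strlen to i_strlen, which is the cost the quoted paragraph points out.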

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Aug 31 '07 #116

Keith Thompson

"Malcolm McLean" <re*******@btinternet.com> writes:
[...]

If you are indexing an arbitrary-length array, effectively now it is
an error to use int. That's a big change from what most people would
recognise as "C".

The "big change" was made in 1989.

It is also very undesirable that i, which holds the
index, is described as a "size_t" when it certainly doesn't hold a
size. N, the count, doesn't hold an amount of memory either, but is
also a size_t.

Then you should add this to your code:

typedef size_t size_or_count_or_index_t;

Or just think of size_t as something more general than its name
implies.

Then you've got the problem of two standards, in fact 14 standards at
last count, for holding integers. That makes it harder and harder to
make functions fit together. Code is littered with casts because
cursorxy takes two int *s, whilst drawxy takes two size_ts.

They're called "types", not "standards".

I actually tend to think that the number of standard types in C has
gotten to be a bit much. It's probably inevitable given the way C has
evolved, but it's something I'd do differently if I were designing a
new language from scratch. Rather than having a dozen or so
predefined integer types, I think I'd prefer a general method to
define types.

We wouldn't tolerate a language with just a dozen or so predefined
array types, each with a fixed length that varies from one
implementation to another, but we accept it for integer types. (And
yes, there are reasons for the difference.)

But this is just idle speculation. C is what it is.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Aug 31 '07 #117

Keith Thompson

CBFalconer <cb********@yahoo.com> writes:

jacob navia wrote:
>spacecriter (Bill C) wrote:
>>jacob navia wrote:

... snip ...

>>>
I assume that you don't want to redefine s as a size_t because it
may be used elsewhere as an int, and you would rather not track
down everywhere it may be used.

So why not replace all the strlen() calls with your own function
(maybe call it i_strlen(), or somesuch name) that returns an int?

That would be a good solution!

No, that would be a temporary glossing over, avoiding fixing the
fundamental problem, and postponing the fix (or abandonment) until
later, with attendant confusion of the code. Not wise in the long
term. Some things are better done right.

If he can be certain that none of the strings he's dealing with will
ever exceed 32767 bytes (say, they're people's names), then it's not a
horribly bad solution, especially if his wrapper invokes the real
strlen() and checks the result before returning it as an int.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Aug 31 '07 #118

jacob navia

Keith Thompson wrote:

jacob navia <ja***@jacob.remcomp.fr> writes:
>spacecriter (Bill C) wrote:
>>I assume that you don't want to redefine s as a size_t because it
may be used elsewhere as an int, and you would rather not track
down everywhere it may be used. So why not replace all the
strlen() calls with your own function (maybe call it i_strlen(), or
somesuch name) that returns an int?

That would be a good solution!

THANKS!

Hmm, sounds familiar.

| I suppose you could write a strlen wrapper that calls the real strlen,
| checks whether the result exceeds INT_MAX (if you think that check is
| worth doing), and then returns the result as an int. That's assuming
| strlen calls are the only things triggering the warnings. And you'd
| still have to make hundreds of changes in the code.

NO!

Just

int Strlen_i(char *s)
{
    char *start=s;
    while (*s)
        s++;
    return s-start;
}
#define strlen Strlen_i

<http://groups.google.com/group/comp.lang.c/msg/3ef33439c43be6ac>

Aug 31 '07 #119

Richard Tobin

In article <46***********************@news.orange.fr>,
jacob navia <ja***@jacob.remcomp.fr> wrote:

>#define strlen Strlen_i

This is probably not legal, at least in theory. Doing it after all
includes will probably work, though.

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.

Aug 31 '07 #120

Richard

jacob navia <ja***@jacob.remcomp.fr> writes:

Keith Thompson wrote:
>jacob navia <ja***@jacob.remcomp.fr> writes:
>>spacecriter (Bill C) wrote:
I assume that you don't want to redefine s as a size_t because it
may be used elsewhere as an int, and you would rather not track
down everywhere it may be used. So why not replace all the
strlen() calls with your own function (maybe call it i_strlen(), or
somesuch name) that returns an int?

That would be a good solution!

THANKS!

Hmm, sounds familiar.

| I suppose you could write a strlen wrapper that calls the real strlen,
| checks whether the result exceeds INT_MAX (if you think that check is
| worth doing), and then returns the result as an int. That's assuming
| strlen calls are the only things triggering the warnings. And you'd
| still have to make hundreds of changes in the code.

NO!

Just

int Strlen_i(char *s)
{
    char *start=s;
    while (*s)
        s++;
    return s-start;
}
#define strlen Strlen_i

int Strlen_i(char *s)
{
    int i=0;
    while (*s++ && ++i);
    return i;
}

Why? It returns in the case of a mad string (i.e. longer than an int can
count) when i wraps to 0, assuming i does that in the standard.

Aug 31 '07 #121

Richard Tobin

In article <j3************@homelinux.net>, Richard <rg****@gmail.com> wrote:

>why? It returns in the case of a mad string (ie bigger than int) when i
wraps to 0. Assuming i does that in the standard.

Integer overflow is allowed to be an error. But on most systems, huge
positive integers wrap around to huge negative ones and only get to
zero again when they are doubly huge.
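Since signed overflow is undefined behaviour in C, a counting loop cannot rely on the counter wrapping to zero. A minimal sketch of one alternative is to stop counting at INT_MAX; the name count_chars_capped and the saturating behaviour are assumptions for illustration, not something proposed in the thread.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical variant: counts characters but stops at INT_MAX,
   so the signed counter is never incremented past its maximum. */
static int count_chars_capped(const char *s)
{
    int i = 0;
    assert(s != NULL);
    while (*s != '\0' && i < INT_MAX) {
        ++s;
        ++i;
    }
    return i;   /* INT_MAX means "at least this long" */
}
```

By contrast, unsigned arithmetic is defined to wrap, so UINT_MAX + 1u is guaranteed to be 0u.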

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.

Aug 31 '07 #122

Kenny McCormack

In article <46***************@news.xs4all.nl>,
Richard Bos <rl*@hoekstra-uitgeverij.nl> wrote:

>Ben Bacarisse <be********@bsb.me.uk> wrote:

>ga*****@xmission.xmission.com (Kenny McCormack) writes:

both of you just post carp

Fishing for compliments?

No, Ben's just trolling again.

Richard

It was an intentional play on words (double entendre).

Aug 31 '07 #123

Kenny McCormack

In article <7a************@homelinux.net>, Richard <rg****@gmail.com> wrote:

>rl*@hoekstra-uitgeverij.nl (Richard Bos) writes:

>Richard <rg****@gmail.com> wrote:

>>rl*@hoekstra-uitgeverij.nl (Richard Bos) writes:

Ben Bacarisse <be********@bsb.me.uk> wrote:

ga*****@xmission.xmission.com (Kenny McCormack) writes:

both of you just post carp

Fishing for compliments?

No, Kenny's just trolling again.

I think someone missed the joke.

Yes, you. Look up the (supposed) etymology of "trolling" in the Jargon
File.

Richard

But that's not funny.

Indeed. Plus, it sounds very much like an ex post facto construction.

Aug 31 '07 #124

Martin Wells

Keith Thompson:

"const" in a parameter declaration doesn't do anything useful for the
caller, since (as I'm sure you know) a function can't modify an
argument anyway.

Agreed, it's just a waste of letters.

It does prevent the function from (directly)
modifying its own parameter (a local object), but that's of no concern
to the caller.

If I don't plan on changing a variable's value, then I make it const,
including in the parameter list of a function.

It would make more sense to be able to specify "const" in the
*definition* of a function but not in the *declaration*. And gcc
seems to allow this:

int foo(int x);

int main(void)
{
    return foo(0);
}

int foo(const int x)
{
    return x;
}

but I'm not sure whether it's actually legal. In any case, it's not a
style that seems to be common.

I haven't written much C in a while, but I think I used to do that and
have no problem.

Martin

Aug 31 '07 #125

Martin Wells

Ian Collins:

If you use casts frequently in C, you are doing something wrong.

Depends entirely on the nature of the code. I've written portable code
before that is littered with casts for very good reasons.

If you use naked casts at all in C++, you are doing something very wrong.

No, this is a phobia. If a C++ programmer had any sense, they'd
realise that the following two expressions are identical in every way:

MyType(x)

(MyType)x

Try it if you don't believe me.

I only use the more flowery casts when I'm actually dealing with user-
defined class types and so forth.

There's nothing at all wrong with writing the following in C++:

int x;

char *p = (char*)&x;

Going to the effort of writing "static_cast" just exposes a phobia.

Anyway, back to C.

In my shops we always have a rule that all casts require a comment, a
good way to make developers think twice before using them.

In the little snippet I wrote just above, I'd only write a comment
with it if my target audience only started programming yesterday at 3
O'Clock.

I can't find a compiler that issues a warning without the cast. Just out
of interest, which one does?

IMO, any decent compiler should issue truncation warnings.

Martin

Aug 31 '07 #126

Martin Wells

jacob navia:

int Strlen_i(char *s)
{
    char *start=s;
    while (*s)
        s++;
    return s-start;
}

If you want an ounce of efficiency then try:

int Strlen_i(char const *const p)
{
    return (int)strlen(p);
}

That is to say, the platform's built-in strlen function is extremely
likely to be more efficient than anything you write.

Martin

Aug 31 '07 #127

Ben Pfaff

"Malcolm McLean" <re*******@btinternet.com> writes:

If you are indexing an arbitrary-length array, effectively now it is
an error to use int. That's a big change from what most people would
recognise as "C". It is also very undesirable that i, which holds the
index, is described as a "size_t" when it certainly doesn't hold a
size. N, the count, doesn't hold an amount of memory either, but is
also a size_t.

An array of char can potentially have an index range of
0...SIZE_MAX. An array of any larger object type has a more
limited index range. Therefore, size_t is always a suitable type
for representing an array index.
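As an illustration of size_t used purely as an index type, here is a minimal sketch that reverses a char array in place. The function name reverse_chars is hypothetical and not from the post.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: reverses the first n chars of a in place.
   Both indices are size_t, so any valid array length can be handled. */
static void reverse_chars(char *a, size_t n)
{
    size_t i, j;
    assert(a != NULL || n == 0);
    if (n < 2)
        return;                 /* also avoids n - 1 wrapping when n == 0 */
    for (i = 0, j = n - 1; i < j; ++i, --j) {
        char t = a[i];
        a[i] = a[j];
        a[j] = t;
    }
}
```

The early return matters because size_t is unsigned: with n equal to 0, the expression n - 1 would wrap to SIZE_MAX rather than become negative.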
--
char a[]="\n .CJacehknorstu";int putchar(int);int main(void){unsigned long b[]
={0x67dffdff,0x9aa9aa6a,0xa77ffda9,0x7da6aa6a,0xa67f6aaa,0xaa9aa9f6,0x11f6},*p
=b,i=24;for(;p+=!*p;*p/=4)switch(0[p]&3)case 0:{return 0;for(p--;i--;i--)case+
2:{i++;if(i)break;else default:continue;if(0)case 1:putchar(a[i&15]);break;}}}

Aug 31 '07 #128

Malcolm McLean

"Ben Pfaff" <bl*@cs.stanford.edu> wrote in message
news:87************@blp.benpfaff.org...

"Malcolm McLean" <re*******@btinternet.com> writes:
>If you are indexing an arbitrary-length array, effectively now it is
an error to use int. That's a big change from what most people would
recognise as "C". It is also very undesirable that i, which holds the
index, is described as a "size_t" when it certainly doesn't hold a
size. N, the count, doesn't hold an amount of memory either, but is
also a size_t.

An array of char can potentially have an index range of
0...SIZE_MAX. An array of any larger object type has a more
limited index range. Therefore, size_t is always a suitable type
for representing an array index.

An arbitrary function, let's call it mean(), ought to be able to take any
array.

so
double mean(double *x, size_t N)

is correct. int will work, but might be a nuisance to caller.

However if we are to have a really whizzy mean, we will sort the numbers
before adding them.

So let's call qsort

void qsort(void *x, size_t N, size_t sz, int (*comp)(const void *, const void *)).

Yes qsort() takes two size_t's as well. So we are OK. The system does work,
but only so long as we are absolutely consistent in using size_t everywhere.
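A minimal sketch of such a mean(), assuming the sort is in ascending order of value and is done in place on the caller's array (the post specifies neither). The comparison function compare_doubles is a name introduced here for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Comparison callback for qsort: orders doubles ascending. */
static int compare_doubles(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Sorts x in place, then sums and divides. Note that this
   reorders the caller's array, which is an assumption of this sketch. */
static double mean(double *x, size_t N)
{
    double sum = 0.0;
    size_t i;
    assert(x != NULL && N > 0);
    qsort(x, N, sizeof *x, compare_doubles);
    for (i = 0; i < N; ++i)
        sum += x[i];
    return sum / (double)N;
}
```

Because N, the loop index and both size arguments to qsort are all size_t, no conversions or casts between integer types are needed anywhere in the function.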

My proposal is to 1) make size_t signed, 2) rename it int.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Aug 31 '07 #129

Ed Jensen

Ben Pfaff <bl*@cs.stanford.edu> wrote:

>And yet, other programming languages get by -- somehow -- even though
they don't even have unsigned integer types.

What programming languages are you thinking of here?

One example would be Java.

Aug 31 '07 #130

Richard

Ed Jensen <ej*****@visi.com> writes:

Ben Pfaff <bl*@cs.stanford.edu> wrote:

>>And yet, other programming languages get by -- somehow -- even though
they don't even have unsigned integer types.

What programming languages are you thinking of here?

One example would be Java.

Lisp. Or Lithp.

Aug 31 '07 #131

Ed Jensen

user923005 <dc*****@connx.com> wrote:

Can those same languages create objects with a size too large to be
held in an integer?

Consider this Java code:

byte[] foo = new byte[N];

N must be type int. In Java, an int is a 32 bit signed value.
Therefore, you can't create a byte array with more than 2^31-1
elements.

Now consider this Java code:

short[] foo = new short[N];

Presumably, this could work on a 64 bit JVM, where N = 2^31-1.

The size of the resulting object, in bytes, is larger than the maximum
value a Java int can hold.

Full disclosure: I do not have access to a system capable of testing
this. These conclusions are based on my understanding of the Java
language.

If 'yes', then those languages are defective. If 'no', then integer
is the correct return.

A pointless observation. All programming languages are defective in
at least one way or another. ALL of them.

My point stands: Somehow, other programming languages get by just fine
returning an int when asked for the length of a string.

An unsigned is a defective return from anything that describes the
size of an object.

And yet, other programming languages get by -- somehow -- even though
they don't even have unsigned integer types.

I can create a language with a single type. Somehow, I think it will
be less effective than C for programming tasks.

You may decide a programming language with only signed integer types
is less effective than C for programming tasks if you like; however,
it doesn't diminish the success or usefulness of those other languages.
Nor is that the only thing that should be considered when choosing a
programming language.

>I recognize and understand why the range of C types are defined the
way they're defined, but that doesn't minimize the pain when trying to
write 100% portable code.

The way to minimize the pain of writing 100% portable code is to write
it correctly, according to the language standard. For instance, that
would include using size_t for object sizes. Now, pre-ANSI C did not
have size_t. So that code will require effort to repair.

Writing 100% portable C code is extremely non-trivial and when taken
to an extreme can interfere with the progress of a project.

I understand why size_t was invented, but I have some suspicions a
more pragmatic approach may have been superior, such as returning int
from strlen() instead of size_t.

Aug 31 '07 #132

Charlton Wilbur

>>>>> "BP" == Ben Pfaff <bl*@cs.stanford.edu> writes:

BP> Martin Wells <wa****@eircom.net> writes:

>While I admire your sentiment as regards following the C89
Standard, I still must condemn any compiler that allows the
"implicit function declaration" feature, at least not without
having to explicitly request it.

BP> Implicit function declarations are part of C89. A compiler
BP> that rejects programs that use this feature is not an
BP> implementation of C89.

Yes, but --

a conforming compiler may issue any diagnostics it wishes, which means
it may certainly say "WARNING: implicitly declared function" or
something to that effect; and

most compilers need to be instructed to compile in strict ANSI/ISO
mode anyway, and so making the default behavior for implicitly
declared functions an error and only accepting them in strict mode
would be nicely consonant with that.

Charlton

--
Charlton Wilbur
cw*****@chromatico.net

Aug 31 '07 #133

Richard Tobin

In article <64************@homelinux.net>, Richard <rg****@gmail.com> wrote:

>>>And yet, other programming languages get by -- somehow -- even though
they don't even have unsigned integer types.

>>What programming languages are you thinking of here?

>One example would be Java.

>Lisp. Or Lithp.

Most modern Lisps have bignums, which removes the problem of choosing
a size.

-- Richard

--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.

Aug 31 '07 #134

Richard Tobin

In article <87************@blp.benpfaff.org>,
Ben Pfaff <bl*@cs.stanford.edu> wrote:

>An array of char can potentially have an index range of
0...SIZE_MAX. An array of any larger object type has a more
limited index range. Therefore, size_t is always a suitable type
for representing an array index.

For a sufficiently restricted interpretation of array index. p[-3]
can be perfectly legal.
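
A minimal sketch of that point, assuming a pointer into the middle (or one past the end) of an array. The function names three_before and sum_before are hypothetical; the only requirement is that the pointer plus the negative offset still lands inside the same array.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the element three positions before `p`.
   The caller must guarantee that p - 3 points into the same array. */
int three_before(const int *p)
{
    return p[-3];   /* same as *(p - 3) */
}

/* Sums the `n` elements that precede `end`, using a signed index.
   ptrdiff_t is used because the index runs through negative values,
   which an unsigned size_t cannot represent. */
long sum_before(const int *end, ptrdiff_t n)
{
    long total = 0;
    ptrdiff_t i;

    for (i = -n; i < 0; i++)
        total += end[i];
    return total;
}
```

Given int a[5] = {10, 20, 30, 40, 50}, the call three_before(a + 4) reads a[1], and sum_before(a + 5, 5) adds up all five elements.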

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.

Aug 31 '07 #135

Keith Thompson

jacob navia <ja***@jacob.remcomp.fr> writes:

Keith Thompson wrote:
>jacob navia <ja***@jacob.remcomp.fr> writes:
>>spacecriter (Bill C) wrote:
I assume that you don't want to redefine s as a size_t because it
may be used elsewhere as an int, and you would rather not track
down everywhere it may be used. So why not replace all the
strlen() calls with your own function (maybe call it i_strlen(), or
somesuch name) that returns an int?

That would be a good solution!

THANKS!
Hmm, sounds familiar.
| I suppose you could write a strlen wrapper that calls the real
| strlen, checks whether the result exceeds INT_MAX (if you think
| that check is worth doing), and then returns the result as an
| int. That's assuming strlen calls are the only things triggering
| the warnings. And you'd still have to make hundreds of changes
| in the code.

NO!

There's no need to shout.

Just

int Strlen_i(char *s)
{
char *start=s;
while (*s)
s++;
return s-start;
}
#define strlen Strlen_i;

I think redefining strlen invokes undefined behavior; you're likely to
get away with it, but it might break when your code is compiled by
some other compiler. And if 'strlen' is already defined as a
function-like macro, redefining it as an object-like macro (without
first '#undef'ing it) is a constraint violation. (I'm assuming you
have a '#include <string.h>'.)

Why reimplement strlen rather than just calling it? It's a simple
enough function, but the implementation's strlen could well be faster
than your re-write. And by calling strlen and converting the result
to int, you make it easier to add a range check later.
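
A minimal sketch of such a wrapper, for illustration only. The name strlen_int and the convention of returning -1 for a string too long to fit in an int are assumptions, not something anyone in the thread specified. It is a separately named function, so it does not redefine strlen.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Hypothetical wrapper: the name and the -1 error convention are
   assumptions made for this example, not part of any library. */
int strlen_int(const char *s)
{
    size_t len = strlen(s);      /* let the implementation's strlen do the work */

    if (len > (size_t)INT_MAX)   /* range check before narrowing */
        return -1;
    return (int)len;             /* the value is known to fit in an int */
}
```

Call sites then change from s = strlen(str) to s = strlen_int(str), which still means touching every call, but the explicit cast lives in one place and the range check can be tightened or removed later.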

><http://groups.google.com/group/comp.lang.c/msg/3ef33439c43be6ac>

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Aug 31 '07 #136

Flash Gordon

Malcolm McLean wrote, On 31/08/07 16:18:

>
"Ben Pfaff" <bl*@cs.stanford.edu> wrote in message
news:87************@blp.benpfaff.org...
>"Malcolm McLean" <re*******@btinternet.com> writes:
>>If you are indexing an arbitrary-length array, effectively now it is
an error to use int. That's a big change from what most people would
recognise as "C". It is also very undesirable that i, which holds the
index, is described as a "size_t" when it certainly doesn't hold a
size. N, the count, doesn't hold an amount of memory either, but is
also a size_t.

An array of char can potentially have an index range of
0...SIZE_MAX. An array of any larger object type has a more
limited index range. Therefore, size_t is always a suitable type
for representing an array index.

An arbitrary function, let's call it mean(), ought to be able to take
any array.

so
double mean(double *x, size_t N)

is correct. int will work, but might be a nuisance to caller.

Only if the caller does not write correct code.
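
For concreteness, a minimal sketch of a mean() with that signature. The body is an assumption (nobody in the thread posted one), as is the choice to return 0.0 for an empty array.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of the n values in x.
   Returning 0.0 when n == 0 is an arbitrary choice made for this sketch. */
double mean(const double *x, size_t n)
{
    double sum = 0.0;
    size_t i;                /* index has the same type as the count */

    if (n == 0)
        return 0.0;
    for (i = 0; i < n; i++)
        sum += x[i];
    return sum / (double)n;
}
```

Because the count and the index are both size_t, the loop needs no casts and no signed/unsigned comparison.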

However if we are to have a really whizzy mean, we will sort the numbers
before adding them.

So let's call qsort

void qsort(void *x, size_t N, size_t sz, int (*comp)(const void *, const
void *)).

Yes qsort() takes two size_t's as well. So we are OK. The system does
work, but only so long as we are absolutely consistent in using size_t
everywhere.

Ah, he sees the light.
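
A minimal sketch of the qsort() call being discussed, assuming an array of double. The names compare_doubles and sort_doubles are hypothetical. Both the element count and the element size are passed as size_t, so nothing needs converting on the way in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Comparison callback for qsort: orders doubles ascending.
   Returns a negative, zero, or positive value as qsort requires. */
int compare_doubles(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;

    return (x > y) - (x < y);
}

/* Sorts the n doubles in x in place. */
void sort_doubles(double *x, size_t n)
{
    qsort(x, n, sizeof x[0], compare_doubles);
}
```

The (x > y) - (x < y) idiom avoids returning a subtraction of two doubles converted to int, which could overflow or round to zero.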

My proposal is to 1) make size_t signed, 2) rename it int.

Or perhaps not. Almost 20 years after a language is standardised is a
bit late to start trying to change it. Especially when it has proved
extremely successful.
--
Flash Gordon

Aug 31 '07 #137

jacob navia

Martin Wells wrote:

jacob navia:

>int Strlen_i(char *s)
{
char *start=s;
while (*s)
s++;
return s-start;
}

If you want an ounce of efficiency then try:

int Strlen_i(char const *const p)
{
return (int)strlen(p);
}

That is to say, the platform's built-in strlen function is extremely
likely to be more efficient than anything you write.

Martin

I just did it that way so that the function was defined before the macro.
But you are right. I should do that.

Aug 31 '07 #138

Malcolm McLean

"Flash Gordon" <sp**@flash-gordon.me.uk> wrote in message
news:sn************@news.flash-gordon.me.uk...

Malcolm McLean wrote, On 31/08/07 16:18:
>Yes qsort() takes two size_t's as well. So we are OK. The system does
work, but only so long as we are absolutely consistent in using size_t
everywhere.

Ah, he sees the light.

That's why Basic Algorithms is absolutely consistent in using int. Otherwise
I would either have to translate everything to size_t, or you would rapidly
risk a mess.

>
Or perhaps not. Almost 20 years after a language is standardised is a bit
late to start trying to change it. Especially when it has proved extremely
successful.

Effectively we are in a hiatus between standards. It looks like C99 will
never be widely implemented. So now is the time to get those nasty size_t's
out of our code.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Aug 31 '07 #139

CBFalconer

jacob navia wrote:

>

.... snip ...

>
Just

int Strlen_i(char *s)
{
char *start=s;
while (*s)
s++;
return s-start;
}
#define strlen Strlen_i;

At which point your code has undefined behaviour. Please read the
standard some day.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>


Aug 31 '07 #140

CBFalconer

Martin Wells wrote:

Ian Collins:

>If you use casts frequently in C, you are doing something wrong.

Depends entirely on the nature of the code. I've written portable
code before that is littered with casts for very good reasons.

>If you use naked casts at all in C++, you are doing something
very wrong.

No, this is a phobia. If a C++ programmer had any sense, they'd
realise that the following two expressions are identical in
every way:

MyType(x)

(MyType)x

Try it if you don't believe me.

Please don't confuse this newsgroup with C++. There is a separate
newsgroup where that (different) language is on topic.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>


Aug 31 '07 #141

Ian Collins

Martin Wells wrote:

Ian Collins:

>If you use casts frequently in C, you are doing something wrong.

Depends entirely on the nature of the code. I've written portable code
before that is littered with casts for very good reasons.

I'd wager it could be written without most, or even all of them.

>In my shops we always have a rule that all casts require a comment, a
good way to make developers think twice before using them.

In the little snippet I wrote just above, I'd only write a comment
with it if my target audience only started programming yesterday at 3
O'Clock.

That's because it can be written without the cast.

>
>I can't find a compiler that issues a warning without the cast. Just out
of interest, which one does?

IMO, any decent compiler should issue truncation warnings.

Do you know of a "decent compiler" that does?

--
Ian Collins.

Aug 31 '07 #142

Ian Collins

Malcolm McLean wrote:

>
Yes qsort() takes two size_t's as well. So we are OK. The system does
work, but only so long as we are absolutely consistent in using size_t
everywhere.

Isn't consistency one of the foundations of our art?

--
Ian Collins.

Aug 31 '07 #143

Ian Collins

jacob navia wrote:

>
int Strlen_i(char *s)
{
char *start=s;
while (*s)
s++;
return s-start;
}
#define strlen Strlen_i;

You really should bite the bullet and fix the code.

The transition from 32 to 64 bits isn't without pain; whether the return
is worth the pain is a choice of the developer.

There are some good papers on transitioning from 32 to 64 bit environments
to be found on the web. I'm sure the number (if not the quality) will
increase as the windows world catches up with the rest of us.

--
Ian Collins.

Aug 31 '07 #144

jacob navia

Ian Collins wrote:

jacob navia wrote:
>int Strlen_i(char *s)
{
char *start=s;
while (*s)
s++;
return s-start;
}
#define strlen Strlen_i;

You really should bite the bullet and fix the code.

If it ain't broken, do not fix it.

There is no simple solution. It means go over the
code and put casts everywhere, fix the new bugs
as you discover them, etc.

Don't feel like it. There are more interesting things to do.

The transition from 32 to 64 bits isn't without pain; whether the return
is worth the pain is a choice of the developer.

Well, my compiler system is up and running in 64 bits...
I have to fix the debugger though, and many other things...

There are some good papers on transitioning from 32 to 64 bit environments
to be found on the web. I'm sure the number (if not the quality) will
increase as the windows world catches up with the rest of us.

Yes, I know. I have read most of them.

Aug 31 '07 #145

Keith Thompson

"Malcolm McLean" <re*******@btinternet.com> writes:

"Ian Collins" <ia******@hotmail.com> wrote in message

[...]

>Isn't consistency one of the foundations of our art?

Yes. But psychological factors are also important. If an index
variable is called "size" then of course the compiler will happily
chug through and index the array by variable "size". However to anyone
reading the program it is intensely irritating.
The same goes for variables which don't hold amounts of memory being
called by a name that suggests that this is their function.

[...]

Get over it. Learn what "size_t" means; there's more to it than how
it's spelled.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Aug 31 '07 #146

Craig Gullixson

In article <46**********************@news.orange.fr>,
jacob navia <ja***@jacob.remcomp.fr> writes:

Ian Collins wrote:
>jacob navia wrote:
>>int Strlen_i(char *s)
{
char *start=s;
while (*s)
s++;
return s-start;
}
#define strlen Strlen_i;

You really should bite the bullet and fix the code.

If it ain't broken, do not fix it.

But it *is* broken as far as the C standard is concerned.

There is no simple solution. It means go over the
code and put casts everywhere, fix the new bugs
as you discover them, etc.

Don't feel like it. There are more interesting things to do.

If one believes in the engineering aspect of software development,
then maintenance is part of the deal. As pointed out elsewhere in
this thread, size_t has been around for 18 years so having to deal
with it should not exactly be a surprise.

That being said, you are free to either deal with updating your code
or to ignore the compiler warnings. It all depends on how much you
and those others who use the code care about it working correctly and
how difficult it is to port to other compilers, platforms, operating
systems, etc., when needed.

As an example of consequences of not keeping code up to date, I've
spent something in excess of a week getting a network communications
package for a little I/O box embedded in one of our systems to compile
and work correctly after an OS/compiler upgrade of the system needing
to use the I/O box. It turns out that the latest version of the
software package supplied by the vendor is *full* of pre-C89 crud.
I now have the system working again to the point that it is useful,
however the porting hassles serve as a disincentive for purchasing any
more of the company's products.

>The transition from 32 to 64 bits isn't without pain; whether the return
is worth the pain is a choice of the developer.

Well, my compiler system is up and running in 64 bits...
I have to fix the debugger though, and many other things...

>There are some good papers on transitioning from 32 to 64 bit environments
to be found on the web. I'm sure the number (if not the quality) will
increase as the windows world catches up with the rest of us.

Yes, I know. I have read most of them.

--
__________________________________________________ ______________________
Craig A. Gullixson
Instrument Engineer INTERNET: cg********@nso.edu
National Solar Observatory/Sac. Peak PHONE: (505) 434-7065
Sunspot, NM 88349 USA FAX: (505) 434-7029

Aug 31 '07 #147

Ian Collins

jacob navia wrote:

Ian Collins wrote:

>You really should bite the bullet and fix the code.

If it ain't broken, do not fix it.

But it is. Quarts don't fit into pint pots.

There is no simple solution. It means go over the
code and put casts everywhere, fix the new bugs
as you discover them, etc.

Casts that hide turds are a good reason to have strict (process) rules
regarding their use. I've even worked at a shop where casts had to pass
a design review, which I thought was overkill at the time. I now know
better.

Don't feel like it. There are more interesting things to do.

I don't feel like paying my bills, there are more interesting things to
do. Unfortunately they will get me in the end.

>There are some good papers on transitioning from 32 to 64 bit environments
to be found on the web. I'm sure the number (if not the quality) will
increase as the windows world catches up with the rest of us.

Yes, I know. I have read most of them.

But did you learn from them?

--
Ian Collins.

Aug 31 '07 #148

Keith Thompson

Ian Collins <ia******@hotmail.com> writes:

jacob navia wrote:
>Ian Collins wrote:

>>You really should bite the bullet and fix the code.

If it ain't broken, do not fix it.

But it is. Quarts don't fit into pint pots.

The counterargument is that the contents of a quart pot can be poured
into a pint pot if the quart pot is less than half full. That appears
to be the case here (the code invokes strlen on strings that,
apparently, are known with some confidence to be shorter than INT_MAX
bytes).

Nevertheless, strlen() returns size_t for a good reason, and the
result should be stored in a size_t object unless there's a good
reason not to do so. (One possible reason is that fixing legacy code
would be more effort than it's worth.)
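
To make the quart-and-pint picture concrete, here is a minimal sketch using fixed-width unsigned types, where narrowing is fully defined by the language. (Converting an out-of-range value to a signed int, as in int s = strlen(str), is implementation-defined instead, which is why the sketch avoids it.) The names narrow_to_32 and fits_in_32 are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Narrows a 64-bit value to 32 bits. For unsigned types the result is
   the value modulo 2^32, so the high bits are silently discarded. */
uint32_t narrow_to_32(uint64_t value)
{
    return (uint32_t)value;
}

/* Nonzero if `value` survives narrowing to 32 bits unchanged. */
int fits_in_32(uint64_t value)
{
    return value <= UINT32_MAX;
}
```

A value of 5 comes through unchanged (the half-full quart pot), while 2^32 + 5 also comes out as 5, with no indication that anything was lost.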

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Aug 31 '07 #149

Ian Collins

Keith Thompson wrote:

Ian Collins <ia******@hotmail.com> writes:
>jacob navia wrote:
>>Ian Collins wrote:
You really should bite the bullet and fix the code.

If it ain't broken, do not fix it.

But it is. Quarts don't fit into pint pots.

Nevertheless, strlen() returns size_t for a good reason, and the
result should be stored in a size_t object unless there's a good
reason not to do so. (One possible reason is that fixing legacy code
would be more effort than it's worth.)

Unfortunately the fixes take on a whole new significance when porting
from 32 to 64 bit.

--
Ian Collins.

Aug 31 '07 #150
