size_t problems

jacob navia

I am trying to compile as much code in 64 bit mode as
possible to test the 64 bit version of lcc-win.

The problem appears now that size_t is now 64 bits.

Fine. It has to be since there are objects that are more than 4GB
long.

The problem is, when you have in thousands of places

int s;

// ...
s = strlen(str) ;

Since strlen returns a size_t, we have a 64 bit result being
assigned to a 32 bit int.

This can be correct, and in 99.9999999999999999999999999%
of the cases the string will be smaller than 2GB...

Now the problem:

Since I warn each time a narrowing conversion is done (since
that could lose data) I end up with hundreds of warnings each time
a construct like int a = strlen(...) appears. This clutters
everything, and important warnings get lost.
I do not know how to get out of this problem. Maybe any of you has
a good idea? How do you solve this when porting to 64 bits?

jacob

Aug 29 '07


jacob navia

Craig Gullixson wrote:

In article <46**********************@news.orange.fr>,
jacob navia <ja***@jacob.remcomp.fr> writes:
>Ian Collins wrote:
>>jacob navia wrote:
int Strlen_i(char *s)
{
    char *start=s;
    while (*s)
        s++;
    return s-start;
}
#define strlen Strlen_i;

You really should bite the bullet and fix the code.

If aint'broken do not fix it

But it *is* broken as far as the C standard is concerned.

Look, it is not the C standard that runs my code.

It is a mindless processor, churning instruction after instruction, no
mind no standards, no nothing.

I have an aesthetic view of code. What is important in it, from my
viewpoint, is clarity of design and above all, that
IT WORKS.

Code can be written up to the best standards, but if it doesn't work or if
it doesn't perform very well I do not care. It is ugly.

The code I am porting is the code of the IDE of lcc-win, and the code of
the debugger. I started writing it around 1992.

The IDE was one of the few pieces of code I salvaged from my failed
lisp interpreter project, that was a technical success but a commercial
failure.

It has been ported to windows 16 bits (from 32 bit DOS emulator with
Delorie), then ported to windows 32 bits in 1996 (windows 95), then
ported to linux 32 bits , under GTK, and then to windows 64 bits.

Believe me, I know what porting means, and what is important in code
and what is not.

>
>There is no simple solution. It means go over the
code and put casts everywhere, fix the new bugs
as you discover them, etc.

Don't feel like it. There are more interesting things to do.

If one believes in the engineering aspect of software development,
then maintenance is part of the deal. As pointed out elsewhere in
this thread, size_t has been around for 18 years so having to deal
with it should not exactly be a surprise.

Yeah. I have to cope with the possibility of strings larger than 2GB.
Gosh!

That being said, you are free to either deal with updating your code
or to ignore the compiler warnings. It all depends on how much you
and those others who use the code care about it working correctly and
how difficult it is to port to other compilers, platforms, operating
systems, etc., when needed.

I think that the fix proposed will fit the bill.

As an example of consequences of not keeping code up to date, I've
spent something in excess of a week getting a network communications
package for a little I/O box embedded in one of our systems to compile
and work correctly after an OS/compiler upgrade of the system needing
to use the I/O box. It turns out that the latest version of the
software package supplied by the vendor is *full* of pre C89 crud.

You will agree with me that THAT is much more serious than a few compiler
warnings because of size_t I suppose...

I adopted C89 immediately when it came out, because of the prototypes.
It was an INCREDIBLY relaxing thing that I did NOT have to worry
anymore whether I always passed the right number of parameters to my
functions. The compiler would yell at me. What a fantastic feeling.
I still remember it!

I now have the system working again to the point that it is useful,
however the porting hassles serve as a disincentive for purchasing any
more of the company's products.

Sorry but did you contact the vendor? If they still exist and
sell that package they have surely upgraded it...

Aug 31 '07 #151

Flash Gordon

Malcolm McLean wrote, On 31/08/07 19:27:

>
"Flash Gordon" <sp**@flash-gordon.me.ukwrote in message
news:sn************@news.flash-gordon.me.uk...
>Malcolm McLean wrote, On 31/08/07 16:18:
>>Yes qsort() takes two size_t's as well. So we are OK. The system does
work, but only so long as we are absolutely consistent in using
size_t everywhere.

Ah, he sees the light.

That's why Basic Algorithms is absolutely consistent in using int.

Therefore being inconsistent with the standard for the language you
claim to want to use.

Otherwise I would either have to translate everything to size_t, or you
would rapidly risk a mess.

Shock horror, if you do only part of your code correctly you get a mess!
The obvious solution is to write all of it correctly!

>Or perhaps not. Almost 20 years after a language is standardised is a
bit late to start trying to change it. Especially when it has proved
extremely successful.

You have not addressed the points above. A reasonable assumption is that
this is because you realise you don't have a good argument against them.

Effectively we are in a hiatus between standards. It looks like C99 will
never be widely implemented. So now is the time to get those nasty
size_t's out of our code.

Not everyone thinks they are nasty. In any case, comp.std.c is the place
to propose changes to the standard.
--
Flash Gordon

Sep 1 '07 #152

spacecriter (Bill C)

Keith Thompson wrote:

jacob navia <ja***@jacob.remcomp.fr> writes:
>spacecriter (Bill C) wrote:
>>I assume that you don't want to redefine s as a size_t because it
may be used elsewhere as an int, and you would rather not track
down everywhere it may be used. So why not replace all the
strlen() calls with your own function (maybe call it i_strlen(), or
somesuch name) that returns an int?

That would be a good solution!

THANKS!

Hmm, sounds familiar.

>I suppose you could write a strlen wrapper that calls the real
strlen, checks whether the result exceeds INT_MAX (if you think that
check is worth doing), and then returns the result as an int.
That's assuming strlen calls are the only things triggering the
warnings. And you'd still have to make hundreds of changes in the
code.

<http://groups.google.com/group/comp.lang.c/msg/3ef33439c43be6ac>

Precisely what I had in mind with the suggestion. I guess I missed your
previous post.

Presumably, his code worked in 32-bit mode, so his new function should
emulate the behavior of the 32-bit version of strlen. That *should* keep it
from breaking anything downstream. That should include casting the unsigned
result into an int.
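
The wrapper described above could look roughly like this. It is only a sketch: the name i_strlen comes from the suggestion quoted earlier in this post, while calling abort() on an over-long string is an assumed policy chosen for illustration, not something anyone in the thread specified.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Sketch of a strlen wrapper that returns int.
 * Aborting on overflow is an assumed policy, used only for illustration. */
int i_strlen(const char *s)
{
    size_t n = strlen(s);

    if (n > (size_t)INT_MAX)
        abort();        /* length does not fit in an int */
    return (int)n;      /* safe: n <= INT_MAX at this point */
}
```

Old call sites would then read s = i_strlen(str); and the narrowing conversion happens in exactly one place.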
--
Bill C.

Sep 1 '07 #153

Ben Bacarisse

"Malcolm McLean" <re*******@btinternet.comwrites:

"Flash Gordon" <sp**@flash-gordon.me.ukwrote in message
news:sn************@news.flash-gordon.me.uk...
>Malcolm McLean wrote, On 31/08/07 16:18:
>>Yes qsort() takes two size_t's as well. So we are OK. The system
does work, but only so long as we are absolutely consistent in
using size_t everywhere.

Ah, he sees the light.

That's why Basic Algorithms is absolutely consistent in using
int. Otherwise I would either have to translate everything to size_t,
or you would rapidly risk a mess.

I can't understand why, since you acknowledge that part of the problem
is old code that uses int[1], you choose to perpetuate the problem in a
new book.

If you don't like the underscore or the name for pedagogic reasons
just use a typedef in all the code (yes, someone else suggested this
already, my apologies for not looking up and giving credit -- it is
late). How about

typedef size_t cardinality;

? That suggests counting, indexing and size all in one.
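
As a sketch of how that might read in book-style code (the function sum_ints is made up for illustration; only the typedef is the proposal itself):

```c
#include <stddef.h>

typedef size_t cardinality;   /* the alias proposed above */

/* Hypothetical helper: sums the first n elements of a. */
long sum_ints(const int *a, cardinality n)
{
    cardinality i;
    long total = 0;

    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}
```

The count and the index share one type, so nothing is converted between int and size_t inside the loop.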

[1] Elsewhere. It is not in the quoted text.

--
Ben.

Sep 1 '07 #154

pete

Ben Pfaff wrote:

An array of char can potentially have an index range of
0...SIZE_MAX.

Almost.
For

char array[SIZE_MAX];

the lvalue of the last element is (array[SIZE_MAX - 1]).

--
pete

Sep 1 '07 #155

pete

Richard Tobin wrote:

>
In article <87************@blp.benpfaff.org>,
Ben Pfaff <bl*@cs.stanford.edu> wrote:

An array of char can potentially have an index range of
0...SIZE_MAX. An array of any larger object type has a more
limited index range. Therefore, size_t is always a suitable type
for representing an array index.

For a sufficiently restricted interpretation of array index. p[-3]
can be perfectly legal.

If (&p) is the address of an object of an array type,
then p[-3] isn't defined.

--
pete

Sep 1 '07 #156

pete

jacob navia wrote:

If aint'broken do not fix it

There is no simple solution.

There can't be any solution of any kind, if it aint'broken.

--
pete

Sep 1 '07 #157

Richard Heathfield

CBFalconer said:

jacob navia wrote:
>>
... snip ...
>>
Just

int Strlen_i(char *s)
{
char *start=s;
while (*s)
s++;
return s-start;
}
#define strlen Strlen_i;

At which point your code has undefined behaviour.

No, at which point his code won't even compile.

Please read the standard some day.

I think he should start with something a little easier to understand.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Sep 1 '07 #158

CBFalconer

pete wrote:

Richard Tobin wrote:
>Ben Pfaff <bl*@cs.stanford.eduwrote:

>>An array of char can potentially have an index range of
0...SIZE_MAX. An array of any larger object type has a more
limited index range. Therefore, size_t is always a suitable
type for representing an array index.

For a sufficiently restricted interpretation of array index.
p[-3] can be perfectly legal.

If (&p) is the address of an object of an array type,
then p[-3] isn't defined.

Disproof:

int aone[10];
int *const atwo = &aone[3];
/* atwo is now effectively an array of indices -3 thru 6 */
...
int i;
for (i = -3; i < 7; i++) atwo[i] = i; /* legal */
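
A self-contained version of the same idea, as a sketch. The function name fill_by_offset is made up for illustration; the offsets are held in ptrdiff_t, the standard signed type for pointer differences, rather than in size_t. It is only valid when every mid[i] lands inside the same underlying array.

```c
#include <stddef.h>

/* Hypothetical helper: stores i into mid[i] for lo <= i < hi.
 * The caller must ensure mid + lo .. mid + hi - 1 all lie in one array. */
void fill_by_offset(int *mid, ptrdiff_t lo, ptrdiff_t hi)
{
    ptrdiff_t i;

    for (i = lo; i < hi; i++)
        mid[i] = (int)i;
}
```

With int aone[10], calling fill_by_offset(&aone[3], -3, 7) writes all ten elements, from aone[0] (offset -3) to aone[9] (offset 6).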

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Sep 1 '07 #159

Malcolm McLean

"Ben Bacarisse" <be********@bsb.me.ukwrote in message
news:87************@bsb.me.uk...

"Malcolm McLean" <re*******@btinternet.comwrites:

>That's why Basic Algorithms is absolutely consistent in using
int. Otherwise I would either have to translate everything to size_t,
or you would rapidly risk a mess.

I can't understand why, since you acknowledge that part of the problem
is old code that uses int[1], you choose to perpetuate the problem in a
new book.

Two things will happen.
Probably there will be a howl of protest as desktop programs move from 32 to
64 bits, and the implications of size_t being no longer the same size as an
int (give or take a sign bit) become obvious. So something will be done, and
people will look at code saying size_t i and say "Oh, that garbage the
committee insisted on back in 2007? What obsolete code."

The other possibility is that the committee will have its way, and we've all
got to write size_t for practically every array index. This makes C a
difficult language, OK for the specialist, but not very good for beginner
use. So it is no longer a good choice for a beginning book. Either use a
different language, or use a cut down, simplified version of the existing
language, with a note to say what you've done.

Either way, it is a bad idea to always follow the latest fashion in
programming. That way you've got to keep on rewriting things.
--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Sep 1 '07 #160

Richard Heathfield

Malcolm McLean said:

<snip>

Probably there will be a howl of protest as desktop programs move from
32 to 64 bits,

Why? Surely everyone has learned their lesson from the early 1990s -
"don't rely on exact-size types, or your code will break one day" -
haven't they?

and the implications of size_t being no longer the same
size as an int (give or take a sign bit) become obvious.

It has never been the case that size_t is the same size as an int,
except by coincidence. ints are sizeof(int) bytes big, whereas size_ts
are sizeof(size_t) bytes big. If these values are the same, that's an
interesting coincidence, but nothing more.

So something
will be done, and people will look at code saying size_t i and say
"Oh, that garbage the committee insisted on back in 2007? What obsolete
code."

(a) far from being garbage, size_t is a useful type;
(b) the committee codified size_t in 1989, not 2007;
(c) far from being obsolete, code that uses proper types in the proper
way is more likely to survive and flourish than code that does not.

<snip>

Either way, it is a bad idea to always follow the latest fashion in
programming.

Such as, say, 64-bit ints.

That way you've got to keep on rewriting things.

Precisely. Whereas, if you use the proper types in the right way, you
are less likely to have to do that.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Sep 1 '07 #161

Keith Thompson

CBFalconer <cb********@yahoo.com> writes:

pete wrote:
>Richard Tobin wrote:
>>Ben Pfaff <bl*@cs.stanford.eduwrote:

An array of char can potentially have an index range of
0...SIZE_MAX. An array of any larger object type has a more
limited index range. Therefore, size_t is always a suitable
type for representing an array index.

For a sufficiently restricted interpretation of array index.
p[-3] can be perfectly legal.

If (&p) is the address of an object of an array type,
then p[-3] isn't defined.

Disproof:

int aone[10];
int *const atwo = &aone[3];
/* atwo is now effectively an array of indices -3 thru 6 */
...
int i;
for (i = -3; i < 7; i++) atwo[i] = i; /* legal */

No, there's no such thing as an array with indices -3 through 6 -- and
atwo is a pointer, not an array. But a good case could be made that
atwo points to the first element of an array of length 7 (that happens
to overlap the last 7 elements of aone). I'm not sure just how good a
case can be made; it depends on the exact wording of the standard
(which I don't have handy at the moment).

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Sep 1 '07 #162

Ian Collins

Malcolm McLean wrote:

"Ben Bacarisse" <be********@bsb.me.ukwrote:
>>

I can't understand why, since you acknowledge that part of the problem
is old code that uses int[1], you choose to perpetuate the problem in a
new book.

Two things will happen.
Probably there will be a howl of protest as desktop programs move from
32 to 64 bits, and the implications of size_t being no longer the same
size as an int (give or take a sign bit) become obvious. So something
will be done, and people will look at code saying size_t i and say "Oh,
that garbage the committee insisted on back in 2007? What obsolete code."

Those of us with decent desktops have been in the 64 bit world for well
over a decade and I haven't heard any howls yet.

The other possibility is that the committee will have its way, and we've
all got to write size_t for practically every array index.

They've had their way since 1989, where have you been? 64 bit desktops
started to appear shortly after.

This makes C
a difficult language, OK for the specialist, but not very good for
beginner use. So it is no longer a good choice for a beginning book.

Are you really saying it is too hard for Windows programmers?

>
Either way, it is a bad idea to always follow the latest fashion in
programming. That way you've got to keep on rewriting things.

You must be behind the times Malcolm, there have been plenty of fashions
that have been and gone in the past 18 years.
--
Ian Collins.

Sep 1 '07 #163

pete

CBFalconer wrote:

>
pete wrote:

If (&p) is the address of an object of an array type,
then p[-3] isn't defined.

Disproof:

int aone[10];
int *const atwo = &aone[3];
/* atwo is now effectively an array of indices -3 thru 6 */
...
int i;
for (i = -3; i < 7; i++) atwo[i] = i; /* legal */

(&aone[3]) is the address of an object of type int.
Your disproof is irrelevant to my statement.

--
pete

Sep 1 '07 #164

Richard Tobin

In article <ln************@nuthaus.mib.org>,
Keith Thompson <ks***@mib.org> wrote:

>>>For a sufficiently restricted interpretation of array index.
p[-3] can be perfectly legal.

>No, there's no such thing as an array with indices -3 through 6 -- and
atwo is a pointer, not an array.

That's why I said "for a sufficiently restricted interpretation of
array index". How the standard defines array is unimportant; the
point is that in indexing, both sizes (which "should" be unsigned
size_ts) and offsets (both negative and positive) are used and
combined and compared. So I find the fact that sizes are inherently
positive unconvincing as an argument for their being unsigned.

The *real* reason for their being unsigned is that the good sizes for
signed ints have in the past been inadequate for addressing all
objects. At the risk of sounding like Mr Gates, I suggest that 63
bits will be quite adequate for object sizes throughout the future
life of C.

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.

Sep 1 '07 #165

Richard Tobin

In article <46***********@mindspring.com>,
pete <pf*****@mindspring.com> wrote:

> int aone[10];
int *const atwo = &aone[3];

>(&aone[3]) is the address of an object of type int.

I realise this is just pedantry, but who can complain?

Is the following legal:

typedef int array_type[7];
array_type *atwo = (array_type *)&aone[3];

and if so, what is the type of *atwo? And is not (*atwo)[-1] legal?

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.

Sep 1 '07 #166

CBFalconer

Malcolm McLean wrote:

>

.... snip ...

>
The other possibility is that the committee will have its way, and
we've all got to write size_t for practically every array index.
This makes C a difficult language, OK for the specialist, but not
very good for beginner use. So it is no longer a good choice for a
beginning book. Either use a different language, or use a cut down,
simplified version of the existing language, with a note to say
what you've done.

You don't type an array index because it's indexing an array. You
type it according to the values it has to hold. Similarly for
anything else. If an index has to hold any value returned from
strlen (which is a size_t) then it must be a size_t. If it has to
hold sizeof(double) it can be a char, a short, an int, a long, or
a size_t, and unsigned versions of all. I don't think anyone will
take you to task for assuming sizeof(double) is no larger than
127.
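
As a small sketch of that rule: the index below can take any value up to the result of strlen, so it is declared as size_t. The function count_char is a made-up example, not code from this thread.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts how often c occurs in str.
 * len and i can hold any strlen result, so both are size_t. */
size_t count_char(const char *str, char c)
{
    size_t len = strlen(str);
    size_t i;
    size_t count = 0;

    for (i = 0; i < len; i++)
        if (str[i] == c)
            count++;
    return count;
}
```

No narrowing conversion occurs anywhere in it, so a compiler that warns about narrowing has nothing to report.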

If you had ever had the training of using Pascal correctly, you
would be aware of this. There you first type the variable that
indexes an array (lower and upper bounds). Then you build an array
indexed by that type. Now the error detection will catch you
anytime you exceed the preset bounds in the index, and use of the
index involves no checks.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>
--
Posted via a free Usenet account from http://www.teranews.com

Sep 1 '07 #167

Joe Wright

Richard Heathfield wrote:

CBFalconer said:

>jacob navia wrote:
... snip ...
>>Just

int Strlen_i(char *s)
{
char *start=s;
while (*s)
s++;
return s-start;
}
#define strlen Strlen_i;
At which point your code has undefined behaviour.

No, at which point his code won't even compile.

>Please read the standard some day.

I think he should start with something a little easier to understand.

This compiles just fine for me.

#include <stdio.h>

size_t Strlen(char *s) {
    char *p = s;
    if (p) while (*p) p++;
    return p - s;
}

#define strlen Strlen

int main(void) {
    char line[80] = "Are you kidding me?";
    printf("The length of string \"%s\" is %d bytes.\n",
           line, (int)strlen(line));
    return 0;
}

Is there anything wrong with it?

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---

Sep 1 '07 #168

Harald van Dĳk

Joe Wright wrote:

This compiles just fine for me.

#include <stdio.h>

size_t Strlen(char *s) {
char *p = s;
if (p) while (*p) p++;
return p - s;
}

#define strlen Strlen

int main(void) {
char line[80] = "Are you kidding me?";
printf("The length of string \"%s\" is %d bytes.\n",
line, (int)strlen(line));
return 0;
}

Is there anything wrong with it?

No, ignoring style, there is nothing wrong with it, as long as <string.h> is
not included.

Sep 1 '07 #169

Richard Heathfield

<much snippage>

Joe Wright said:

Richard Heathfield wrote:
>CBFalconer said:
>>jacob navia wrote:
#define strlen Strlen_i;
At which point your code has undefined behaviour.

No, at which point his code won't even compile.

>>Please read the standard some day.

I think he should start with something a little easier to understand.

This compiles just fine for me.

Look at his code more closely. Much more closely. Vewy vewy cwosewy, in
fact. I have re-quoted the relevant line.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Sep 1 '07 #170

Joe Wright

Richard Heathfield wrote:

<much snippage>

Joe Wright said:
>Richard Heathfield wrote:
>>CBFalconer said:
jacob navia wrote:
#define strlen Strlen_i;
At which point your code has undefined behaviour.
No, at which point his code won't even compile.

Please read the standard some day.
I think he should start with something a little easier to understand.

This compiles just fine for me.

Look at his code more closely. Much more closely. Vewy vewy cwosewy, in
fact. I have re-quoted the relevant line.

I see it (;) now. The admonishment to compile even your snippets before
posting is valid.
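
For anyone else who missed it, here is a sketch of why the trailing semicolon matters. The names str_len_i and STRLEN_I are made up for this example, so that no standard library name is redefined, and lengths above INT_MAX are simply not handled.

```c
#include <stddef.h>

/* Hypothetical helper: counts chars up to the terminating null.
 * Assumes the length fits in an int. */
int str_len_i(const char *s)
{
    const char *start = s;

    while (*s)
        s++;
    return (int)(s - start);
}

/* No trailing semicolon: the macro expands to just the function name. */
#define STRLEN_I str_len_i

/* Had the macro been written with a trailing semicolon,
 *     #define STRLEN_I str_len_i;
 * then the expression   f(STRLEN_I(s))
 * would expand to       f(str_len_i;(s))
 * which is a syntax error. */
```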

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---

Sep 1 '07 #171

Joe Wright

Harald van Dijk wrote:

Joe Wright wrote:
>This compiles just fine for me.

#include <stdio.h>

size_t Strlen(char *s) {
char *p = s;
if (p) while (*p) p++;
return p - s;
}

#define strlen Strlen

int main(void) {
char line[80] = "Are you kidding me?";
printf("The length of string \"%s\" is %d bytes.\n",
line, (int)strlen(line));
return 0;
}

Is there anything wrong with it?

No, ignoring style, there is nothing wrong with it, as long as <string.h> is
not included.

Style? Anyway, what changes if I include <string.h> after <stdio.h> and
before #define strlen Strlen ?

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---

Sep 1 '07 #172

Harald van Dĳk

Joe Wright wrote:

Harald van Dijk wrote:
>Joe Wright wrote:
>>This compiles just fine for me.

#include <stdio.h>

size_t Strlen(char *s) {
char *p = s;
if (p) while (*p) p++;
return p - s;
}

#define strlen Strlen

int main(void) {
char line[80] = "Are you kidding me?";
printf("The length of string \"%s\" is %d bytes.\n",
line, (int)strlen(line));
return 0;
}

Is there anything wrong with it?

No, ignoring style, there is nothing wrong with it, as long as <string.h>
is not included.

Style?

Defining your own functions with the same name as standard library functions
or macros is not something I would ever consider good style. Not even the
times when it's allowed and actually useful.

Anyway, what changes if I include <string.h> after <stdio.h> and
before #define strlen Strlen ?

If <string.h> already defines strlen as a macro, you will get a complaint
from your compiler that you're redefining the macro. If you make sure to
use #undef first, the behaviour is undefined.

Sep 1 '07 #173

jacob navia

Joe Wright wrote:

Richard Heathfield wrote:
><much snippage>

Joe Wright said:
>>Richard Heathfield wrote:
CBFalconer said:
jacob navia wrote:
>#define strlen Strlen_i;
At which point your code has undefined behaviour.
No, at which point his code won't even compile.

Please read the standard some day.
I think he should start with something a little easier to understand.

This compiles just fine for me.

Look at his code more closely. Much more closely. Vewy vewy cwosewy,
in fact. I have re-quoted the relevant line.

I see it (;) now. The admonishment to compile even your snippets before
posting is valid.

I do not think so.

Snippets are intended for people, not machines. Besides this, that guy
can only be satisfied with things like that:
missing semicolons, missing this or that.

Sep 1 '07 #174

Martin Wells

Malcolm McLean:

Yes. But psychological factors are also important. If an index variable is
called "size" then of course the compiler will happily chug through and
index the array by variable "size". However to anyone reading the program it
is intensely irritating.

typedef size_t index_t;

Martin

Sep 1 '07 #175

Martin Wells

Ian Collins:

In the little snippet I wrote just above, I'd only write a comment
with it if my target audience only started programming yesterday at 3
O'Clock.

That's because it can be written without the cast.

The cast is there to suppress a compiler warning. Anyway it seems we
disagree fundamentally on this so I don't think there's much point in
discussing it, other than a constant "I like cast" reply to "I don't
like cast".

>IMO, any decent compiler should issue truncation warnings.

Do you know of a "decent compiler" that does?

gcc.

Martin

Sep 1 '07 #176

Martin Wells

jacob navia:

Look, it is not the C standard that runs my code.

It is a mindless processor, churning instruction after instruction, no
mind no standards, no nothing.

I have an aesthetic view of code. What is important in it, from my
viewpoint, is clarity of design and above all, that
IT WORKS.

There is a price to be paid for this: Lack of portability.

You are now paying that price, and the headache you now suffer is the
fruit of your own actions.

Portability seems to be a key issue on this newsgroup, which is why
you aren't getting the replies you desire.

Martin

Sep 1 '07 #177

Martin Wells

Malcolm McLean:

Two things will happen.
Probably there will be a howl of protest as desktop programs move from 32 to
64 bits, and the implications of size_t being no longer the same size as an
int (give or take a sign bit) become obvious. So something will be done, and
people will look at code saying size_t i and say "Oh, that garbage the
committee insisted on back in 2007? What obsolete code."

The other possibility is that the committee will have its way, and we've all
got to write size_t for practically every array index. This makes C a
difficult language, OK for the specialist, but not very good for beginner
use. So it is no longer a good choice for a beginning book. Either use a
different language, or use a cut down, simplified version of the existing
language, with a note to say what you've done.

Am I the only one who doesn't acknowledge any problems when moving
from 32-Bit to 64-Bit? That is to say, am I the only one who's been
using size_t properly?

Someone please tell me why it's so difficult to use size_t in the
following fashion:

#include <stddef.h>

void AddFiveToEachElement(int *p, size_t len)
{
    while (len--) *p++ += 5;
}

If you throw portability out the window, as jacob navia has done, then
you are ASKING FOR THESE PROBLEMS. You're lighting a fuse... it may be
a very long fuse, but it eventually will go off.

Martin

Sep 1 '07 #178

Martin Wells

Ed Jensen:

1. Writing 100% portable code. This can be non-trivial and really
slow down your development. (However, I'm sure writing 100% portable
code doesn't slow down any of the geniuses HERE. I'm talking strictly
about MORTAL developers.)

I don't consider myself to be an Einstein by any stretch of the
imagination, but still I've no problem keeping my code portable.
Likely reason being that I focused on that fashion of programming
rather than played around with int's all the time.

"Well, just write your C code so it's 100% portable in the first
place. Easy! Problem solved! Only dummies don't do that!"

Writing portable code is really a hell of a lot easier, and even a
hell of a lot more satisfying, than you make it sound.

Martin

Sep 1 '07 #179

CBFalconer

Joe Wright wrote:

>
Richard Heathfield wrote:
CBFalconer said:

jacob navia wrote:
... snip ...
Just

int Strlen_i(char *s)
{
char *start=s;
while (*s)
s++;
return s-start;
}
#define strlen Strlen_i;
At which point your code has undefined behaviour.
No, at which point his code won't even compile.

Please read the standard some day.
I think he should start with something a little easier to understand.
This compiles just fine for me.

#include <stdio.h>

size_t Strlen(char *s) {
char *p = s;
if (p) while (*p) p++;
return p - s;
}

AFAICS this has the same action as strlen.

>
#define strlen Strlen

This leads to undefined behaviour.

int main(void) {
char line[80] = "Are you kidding me?";
printf("The length of string \"%s\" is %d bytes.\n",
line, (int)strlen(line));
return 0;
}

Is there anything wrong with it?

Yes. See above.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Sep 1 '07 #180

Martin Wells

CBFalconer:

size_t Strlen(char *s) {
char *p = s;
if (p) while (*p) p++;
return p - s;
}

AFAICS this has the same action as strlen.

Just as an example, the strlen on Microsoft Windows compilers tests
entire 4-byte chunks at a time, looking for a byte which is all zeros.
It's a hell of a lot faster than using a canonical loop.
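
The general word-at-a-time idea can be sketched as follows. This is not Microsoft's implementation, which is not shown anywhere in this thread; it is a well-known bit trick applied to a buffer of known length, so that no byte outside the buffer is ever read. The names has_zero_byte and find_zero are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if at least one of the four bytes in w is zero. */
static int has_zero_byte(uint32_t w)
{
    return ((w - 0x01010101u) & ~w & 0x80808080u) != 0;
}

/* Returns the index of the first zero byte in buf[0..len-1],
 * or len if there is none. */
size_t find_zero(const unsigned char *buf, size_t len)
{
    size_t i = 0;
    uint32_t w;

    /* Test four bytes at a time; memcpy avoids alignment problems. */
    while (len - i >= 4) {
        memcpy(&w, buf + i, 4);
        if (has_zero_byte(w))
            break;              /* the zero is somewhere in this chunk */
        i += 4;
    }
    /* Finish byte by byte: pins down the exact position, handles the tail. */
    while (i < len && buf[i] != 0)
        i++;
    return i;
}
```

Whether this is actually faster than a plain loop depends on the compiler and the machine, so it would need measuring.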

Martin

Sep 1 '07 #181

Ed Jensen

Martin Wells <wa****@eircom.net> wrote:

>1. Writing 100% portable code. This can be non-trivial and really
slow down your development. (However, I'm sure writing 100% portable
code doesn't slow down any of the geniuses HERE. I'm talking strictly
about MORTAL developers.)

I don't consider myself to be an Einstein by any stretch of the
imagination, but still I've no problem keeping my code portable.
Likely reason being that I focused on that fashion of programming
rather than played around with int's all the time.

Choose all that apply:

1. You're mistaken.

2. You're a liar.

3. You don't get very much done.

>"Well, just write your C code so it's 100% portable in the first
place. Easy! Problem solved! Only dummies don't do that!"

Writing portable code is really a hell of a lot easier, and even a
hell of a lot more satisfying, than you make it sound.

Writing extremely portable code IS easy, just not in C.

Sep 1 '07 #182

Martin Wells

Ed Jensen:

I don't consider myself to be an Einstein by any stretch of the
imagination, but still I've no problem keeping my code portable.
Likely reason being that I focused on that fashion of programming
rather than played around with int's all the time.

Choose all that apply:

1. You're mistaken.

2. You're a liar.

3. You don't get very much done.

I'll have to go with number 1. Sorry I'll try again:

Writing portable code in C is VERY easy.

Yeah that sounds about right.

Writing portable code is really a hell of a lot easier, and even a
hell of a lot more satisfying, than you make it sound.

Writing extremely portable code IS easy, just not in C.

Now you're just preaching about your own incompetence. Sorry to sound
hostile, but it's the truth.

Martin

Sep 1 '07 #183

Keith Thompson

jacob navia <ja***@jacob.remcomp.fr> writes:

Ed Jensen wrote:

[...]

>"Well, just write your C code so it's 100% portable in the first
place. Easy! Problem solved! Only dummies don't do that!"

And then, like heathfield, they discover that they published a book
(c unleashed) with in one page the assumption that
sizeof(int) == sizeof(int *).

[...]

Yes, well, that's quite an effective refutation of Richard's claim
that he's infallible.

Except that he's never made such a claim.

He (or one of his co-authors) made a mistake. So what? That doesn't
affect his ability to offer good advice (which is checked for accuracy
by other readers here).

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Sep 1 '07 #184

Malcolm McLean

"jacob navia" <ja***@jacob.remcomp.fr> wrote in message
news:46***********************@news.orange.fr...

Ed Jensen wrote:
And then, like heathfield, they discover that they published a book
(c unleashed) with in one page the assumption that
sizeof(int) == sizeof(int *).

It is easy to play the guru here. More difficult in reality.

It is also a lot easier to find errors in books than to write one. Having
been through the same process I won't criticise Heathfield too much. They
can creep in during formatting as well as in development and testing. My
book had some errors as well.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Sep 1 '07 #185

Richard Heathfield

Keith Thompson said:

jacob navia <ja***@jacob.remcomp.fr> writes:
And then, like heathfield, they discover that they published a book
(c unleashed) with in one page the assumption that
sizeof(int) == sizeof(int *).
[...]

Yes, well, that's quite an effective refutation of Richard's claim
that he's infallible.

Except that he's never made such a claim.

Right (except in jest, of course). Nevertheless, don't assume that Mr
Navia's claim is correct without checking. It might be, of course, but
then again, it might not be.

He (or one of his co-authors) made a mistake.

Quite a few, alas. I don't recall any instances of assuming sizeof(int)
to be equal to sizeof(int *), but it's certainly possible. In the
absence of a more specific reference, however, I will assume that his
bug report has as much substance behind it as everything else he posts.
If I'm wrong to assume this, doubtless I'll find out in due course.
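
Whatever the book does or does not contain, the assumption itself is easy to avoid. A minimal sketch, assuming the implementation provides uintptr_t (it is optional in C99); the function names ptr_to_int and int_to_ptr are hypothetical:

```c
#include <stdint.h>

/* Round-trip a pointer through an integer type that is wide enough
   by definition, instead of assuming sizeof(int) == sizeof(int *). */
uintptr_t ptr_to_int(int *p)
{
    return (uintptr_t)(void *)p;
}

int *int_to_ptr(uintptr_t u)
{
    return (int *)(void *)u;
}
```

The standard only promises that a void pointer converted to uintptr_t and back compares equal to the original, which is why both conversions go through void *.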

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Sep 1 '07 #186

Malcolm McLean

"Martin Wells" <wa****@eircom.net> wrote in message
news:11**********************@d55g2000hsg.googlegroups.com...

Ed Jensen

>Choose all that apply:

1. You're mistaken.

2. You're a liar.

3. You don't get very much done.

Now you're just preaching about your own incompetence. Sorry to sound
hostile, but it's the truth.

No, limited experience. Not the same thing as incompetence at all.

If you write say, mainly code to drive GUIs under Windows, you will find
that there's little point making much portable. Everything has to be ripped
up and rewritten whenever the denizens of Redmond decide to release a new
compiler anyway.

However if you are writing mostly scientific programs, as I am doing at
present, everything has got to be portable. I've no business writing code
that can't be shifted to a mainframe or PC or whatever, as need arises. And
the graphical routines are in separate programs; the Beowulf cluster has a
simple teletype-style terminal as its communication with the outside world.

Even slash slash comments, which I thought were surely as good as standard
by now, are not accepted by the parallel compiler.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Sep 1 '07 #187

Malcolm McLean

"Martin Wells" <wa****@eircom.net> wrote in message
news:11**********************@57g2000hsv.googlegroups.com...

Malcolm McLean
#include <stddef.h>

void AddFiveToEachElement(int *p,size_t len)
{
while (len--) *p++ += 5;
}

If you throw portability out the window, as jacob navia has done, then
you are ASKING FOR THESE PROBLEMS. You're lighting a fuse... it
may be a very long fuse, but it eventually will go off.

What are those ints going to be used for? We don't know, but such a useful
function would surely find a place in calculating array indices, or
intermediate values, such as counts, to calculating array indices.

So we need another version

void AddFiveToEachElementsz(size_t *p, size_t len)

The fuse has gone off. That's what the admission of size_t does to your
code.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Sep 1 '07 #188

Malcolm McLean

"Martin Wells" <wa****@eircom.net> wrote in message
news:11*********************@50g2000hsm.googlegroups.com...

CBFalconer:
Just as an example, the strlen on Microsoft Windows compilers test
entire 4-byte chunks at a time looking for a byte which is all zeros.
It's a hell of a lot faster than using a canonical loop.

No it's not. It's 4 times faster, which makes it O(N), which means it is
about as fast as the canonical loop.
Every man and his dog invents a new C string library which performs the
length operation in O(constant) time.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Sep 1 '07 #189

Ben Pfaff

"Malcolm McLean" <re*******@btinternet.com> writes:

"Martin Wells" <wa****@eircom.net> wrote in message
news:11**********************@57g2000hsv.googlegroups.com...
>#include <stddef.h>

void AddFiveToEachElement(int *p,size_t len)
{
while (len--) *p++ += 5;
}

If you throw portability out the window, as jacob navia has done, then
you are ASKING FOR THESE PROBLEMS. You're lighting a fuse... it
may be a very long fuse, but it eventually will go off.

What are those ints going to be used for? We don't know, but such a
useful function would surely find a place in calculating array
indices, or intermediate values, such as counts, to calculating array
indices.

So we need another version

void AddFiveToEachElementsz(size_t *p, size_t len)

The fuse has gone off. That's what the admission of size_t does to
your code.

You need a version of the function for every single type that
might need to have 5 added to it. Adding size_t to the mix
doesn't change that very much.
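
One common way to keep that family of functions manageable is to stamp them out from a single macro. The sketch below is an illustration only; the macro name DEFINE_ADD_FIVE and the Int, Long and Double suffixes are invented here, and only the sz suffix follows the name used above.

```c
#include <stddef.h>

/* Hypothetical helper macro: defines one function per element type. */
#define DEFINE_ADD_FIVE(suffix, type)                        \
    void AddFiveToEachElement##suffix(type *p, size_t len)   \
    {                                                        \
        while (len--)                                        \
            *p++ += 5;                                       \
    }

DEFINE_ADD_FIVE(Int, int)
DEFINE_ADD_FIVE(sz, size_t)
DEFINE_ADD_FIVE(Long, long)
DEFINE_ADD_FIVE(Double, double)
```

Each expansion is an ordinary function, so AddFiveToEachElementsz(counts, n) type-checks against size_t * exactly as a hand-written version would.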
--
int main(void){char p[]="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz.\
\n",*q="kl BIcNBFr.NKEzjwCIxNJC";int i=sizeof p/2;char *strchr();int putchar(\
);while(*q){i+=strchr(p,*q++)-p;if(i>=(int)sizeof p)i-=sizeof p-1;putchar(p[i]\
);}return 0;}

Sep 1 '07 #190

Peter J. Holzer

On 2007-09-01 19:21, Malcolm McLean <re*******@btinternet.com> wrote:

"Martin Wells" <wa****@eircom.net> wrote in message
news:11**********************@57g2000hsv.googlegroups.com...
>void AddFiveToEachElement(int *p,size_t len)
{
while (len--) *p++ += 5;
}

What are those ints going to be used for? We don't know, but such a useful
function would surely find a place in calculating array indices, or
intermediate values, such as counts, to calculating array indices.

No doubt it would also be very useful to calculating file offsets, which
are long

.... and trade flows which are doubles

.... and bank balances which are long long

So we need another version

void AddFiveToEachElementsz(size_t *p, size_t len)

so we need another version

void AddFiveToEachElementLong(long *p, size_t len)

and another one

void AddFiveToEachElementDouble(double *p, size_t len)

and another one

void AddFiveToEachElementLongLong(long long *p, size_t len)

The fuse has gone off. That's what the admission of size_t does to your
code.

Sep 1 '07 #191

Ben Pfaff

"Malcolm McLean" <re*******@btinternet.com> writes:

"Martin Wells" <wa****@eircom.net> wrote in message
news:11*********************@50g2000hsm.googlegroups.com...
>CBFalconer:
Just as an example, the strlen on Microsoft Windows compilers test
entire 4-byte chunks at a time looking for a byte which is all zeros.
It's a hell of a lot faster than using a canonical loop.

No it's not. It's 4 times faster, which makes it O(N), which means it
is about as fast as the canonical loop.

4 times faster *is* a hell of a lot faster. Asymptotic
performance is not what the world is all about. In the end it's
all about how fast you can finish a particular task. The
asymptotic complexity of me adding numbers by hand is the same as
if the computer does it, but I tend to let the computer do it.
It's faster.
--
"The expression isn't unclear *at all* and only an expert could actually
have doubts about it"
--Dan Pop

Sep 1 '07 #192

Peter J. Holzer

On 2007-09-01 19:25, Malcolm McLean <re*******@btinternet.com> wrote:

"Martin Wells" <wa****@eircom.net> wrote in message
news:11*********************@50g2000hsm.googlegroups.com...
>CBFalconer:
Just as an example, the strlen on Microsoft Windows compilers test
entire 4-byte chunks at a time looking for a byte which is all zeros.
It's a hell of a lot faster than using a canonical loop.

No it's not. It's 4 times faster,

Probably less.

which makes it O(N), which means it is about as fast as the canonical
loop.

Sep 1 '07 #193

jacob navia

Peter J. Holzer wrote:

On 2007-09-01 19:25, Malcolm McLean <re*******@btinternet.com> wrote:
>"Martin Wells" <wa****@eircom.net> wrote in message
news:11*********************@50g2000hsm.googlegroups.com...
>>CBFalconer:
Just as an example, the strlen on Microsoft Windows compilers test
entire 4-byte chunks at a time looking for a byte which is all zeros.
It's a hell of a lot faster than using a canonical loop.

No it's not. It's 4 times faster,

Probably less.

>which makes it O(N), which means it is about as fast as the canonical
loop.

By that kind of reasoning a snail is about as fast as a jet.

hp

Most of the strings in this application are less than 80 bytes long.

The difference is zero!

It is all swamped in the overhead of function call, and loop setup!

jacob

Sep 1 '07 #194

Mark McIntyre

On Sat, 01 Sep 2007 20:51:34 +0200, in comp.lang.c, jacob navia
<ja***@jacob.remcomp.fr> wrote:

>Standard C doesn't have

1) Any serious i/o. To do anything fast you need system specific stuff.
2) Any notion of the keyboard. To handle the keyboard you need system
specific stuff.
3) Any graphics. Ditto.
4) No network.
5) Not any timers with reasonable accuracy.

So? In any typical application, all the above interface-specific stuff
can (and should) be separated from the meat of the programme.

>It would be possible to at least do something reasonable portable if the
standard would specify a reasonable string library, a common container
library, a common base for using in day to day programming.

Hey, didn't someone invent a new language cos they had similar issues,
remind us what it's called?

>Or they do not use the network, nor do they do any graphics, nor do they
use any i/o, etc etc.

Or they practice good programming technique and isolate interface code
into different (and replaceable) libraries.

--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan

Sep 1 '07 #195

Joe Wright

CBFalconer wrote:

Joe Wright wrote:
>Richard Heathfield wrote:
>>CBFalconer said:

jacob navia wrote:
... snip ...
Just
>
int Strlen_i(char *s)
{
char *start=s;
while (*s)
s++;
return s-start;
}
#define strlen Strlen_i;
At which point your code has undefined behaviour.
No, at which point his code won't even compile.

Please read the standard some day.
I think he should start with something a little easier to understand.

This compiles just fine for me.

#include <stdio.h>

size_t Strlen(char *s) {
char *p = s;
if (p) while (*p) p++;
return p - s;
}

AFAICS this has the same action as strlen.

>#define strlen Strlen

This leads to undefined behaviour.

>int main(void) {
char line[80] = "Are you kidding me?";
printf("The length of string \"%s\" is %d bytes.\n",
line, (int)strlen(line));
return 0;
}

Is there anything wrong with it?

Yes. See above.

Not quite the same. See 'if (p)' checking for NULL.

Saying it doesn't make it so. The preprocessor does its thing early on
and by the time anything gets to the compiler, there is no reference to
strlen to be found, only to Strlen.

I suppose you don't like '#define strlen Strlen'. It has the effect of
removing a reference to a standard library function and replacing it
with the name of a local function before compilation. Harmless.
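
Whichever side of that argument one takes, a wrapper with its own name sidesteps the question entirely. A minimal sketch, assuming a debug-build check is acceptable; the name length_as_int is hypothetical (and deliberately does not start with "str", since such names are reserved for the library):

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: returns the length as an int and asserts,
   in debug builds, that the value actually fits. */
int length_as_int(const char *s)
{
    size_t n = strlen(s);
    assert(n <= (size_t)INT_MAX);
    return (int)n;
}
```

Call sites then read s = length_as_int(str); so the narrowing happens in one audited place, behind an explicit cast, rather than implicitly at every assignment.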

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---

Sep 1 '07 #196

Malcolm McLean

"Peter J. Holzer" <hj*********@hjp.at> wrote in message
news:sl************************@zeno.hjp.at...

and the admission of long, double, long long or any other type.

Let's face it, admitting types to C was a mistake.
We should go back to B.

The campaign for 64 bit ints wants int to be 64 bits. Then basically it's
ints for everything - no need for unsigned, 63 bits hold a number large
enough to count most things. Other types will be kept for special purposes.
Audio samples will be 16 bits for the foreseeable future, and you might need
a 32 bit type for interfacing with legacy libraries, and 128 bit longs for
cryptography. But basically everything non-special can be an int, and the
problems disappear.

You've still got the problem of real numbers of course. The existence of two
and now three formats creates inefficiencies enough. But at least we'll have
the integers sorted out.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Sep 1 '07 #197

Keith Thompson

jacob navia <ja***@jacob.remcomp.fr> writes:

Peter J. Holzer wrote:
>On 2007-09-01 19:25, Malcolm McLean <re*******@btinternet.com> wrote:
>>"Martin Wells" <wa****@eircom.net> wrote in message
news:11*********************@50g2000hsm.googlegroups.com...
CBFalconer:
Just as an example, the strlen on Microsoft Windows compilers test
entire 4-byte chunks at a time looking for a byte which is all zeros.
It's a hell of a lot faster than using a canonical loop.

No it's not. It's 4 times faster,
Probably less.

>>which makes it O(N), which means it is about as fast as the canonical
loop.
By that kind of reasoning a snail is about as fast as a jet.

Most of the strings in this application are less than 80 bytes long.

The difference is zero!

It is all swamped in the overhead of function call, and loop setup!

Oh? Have you measured it?

Even if you have, your measurements apply only to your application.

strlen() is simple enough that re-inventing it isn't a huge deal; if
that's what you want to do, go ahead. But in general, predefined
functions are likely to be at least as fast as anything you can write
in portable C. (qsort() probably imposes significant overhead because
it uses an indirect function call for each comparison, so a
custom-written sorting routine may be faster. But a custom-written
routine that does what qsort() does is unlikely to be faster than your
implementation's qsort().)

Even with small strings, a word-at-a-time version of strlen() might be
significantly faster if you invoke it enough times.

Note that I'm not advocating micro-optimization, i.e., obfuscating
your source code for the sake of some small performance increase. In
this case, the simplest code (calling the predefined strlen()) is both
simpler and likely faster than any replacement.

Of course, you could always re-write the application to use some other
representation for strings, so you don't have to call strlen() at all.
It might (or might not) give you a significant improvement in
performance and/or reliability if strlen() calls are a bottleneck, and
it's doable in purely standard C.
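
As a rough illustration of what such a representation might look like, here is a sketch of a counted string in standard C. The type and function names (struct cstring, cstring_init, cstring_free) are invented for this example and do not come from any particular library.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical counted-string type: the length is stored, not recomputed. */
struct cstring {
    size_t len;
    char *data;   /* kept NUL-terminated so it can still go to C APIs */
};

/* Returns 0 on success, -1 if allocation fails. */
int cstring_init(struct cstring *cs, const char *src)
{
    size_t n = strlen(src);   /* the only O(N) scan of the string */

    cs->data = malloc(n + 1);
    if (cs->data == NULL)
        return -1;
    memcpy(cs->data, src, n + 1);
    cs->len = n;
    return 0;
}

void cstring_free(struct cstring *cs)
{
    free(cs->data);
    cs->data = NULL;
    cs->len = 0;
}
```

After initialization, reading cs.len costs a single load regardless of how long the string is; the price is that every routine that modifies the string has to keep len in sync.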

The performance difference between the predefined strlen() and your
re-implementation of it may not be significant, but you seem to be
offended by the idea of calling strlen(), and I have no idea why.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Sep 1 '07 #198

Malcolm McLean

"Peter J. Holzer" <hj*********@hjp.at> wrote in message
news:sl************************@zeno.hjp.at...

On 2007-09-01 19:25, Malcolm McLean <re*******@btinternet.com> wrote:

By that kind of reasoning a snail is about as fast as a jet.

The snail, going West, is moving towards the Andromeda galaxy at 50.000001
km/s. The jet, going East, is moving towards Andromeda at about 49.660 km/s,
assuming it's a Concorde.

So to two decimal places, the snail is about as fast as the jet.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Sep 1 '07 #199

Richard Heathfield

Malcolm McLean said:

"Peter J. Holzer" <hj*********@hjp.at> wrote in message
news:sl************************@zeno.hjp.at...
>On 2007-09-01 19:25, Malcolm McLean <re*******@btinternet.com> wrote:

By that kind of reasoning a snail is about as fast as a jet.

The snail, going West, is moving towards the Andromeda galaxy at
50.000001 km/s. The jet, going East, is moving towards Andromeda at
about 49.660 km/s, assuming it's a Concorde.

If it's a Concorde, it isn't going East, and it's travelling rather
slower than the snail.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Sep 1 '07 #200
