Why doesn't strrstr() exist?

(Followups set to comp.std.c. Apologies if the crosspost is unwelcome.)

strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
isn't part of the standard. Why not?

--
Christopher Benson-Manica | I *should* know what I'm talking about - if I
ataru(at)cyberspace.org | don't, I need to know. Flames welcome.
Nov 15 '05 #1
Christopher Benson-Manica wrote:
(Followups set to comp.std.c. Apologies if the crosspost is
unwelcome.)


Followup UNSET.

I think it's dumb to do this. It annoys the crap out of me to have a
post in a group I read, with a followup set to one I DON'T read. If the
post was appropriate for comp.lang.c, then so are replies. If replies
aren't, then the post should never have been made here.

I find it rude and obnoxious.


Brian
Nov 15 '05 #2
Default User <de***********@yahoo.com> wrote:
I think it's dumb to do this. It annoys the crap out of me to have a
post in a group I read, with a followup set to one I DON'T read. If the
post was appropriate for comp.lang.c, then so are replies. If replies
aren't, then the post should never have been made here.
The fact that many comp.lang.c regulars do not (judging from the
paucity of posts) follow comp.std.c was my primary motivation for
crossposting it to this group as well.

I set the followups to c.s.c only because tin seems to think it is bad
netiquette to set followups to more than one newsgroup...
I find it rude and obnoxious.


For that I humbly apologize; consider the lesson learned.

--
Christopher Benson-Manica | I *should* know what I'm talking about - if I
ataru(at)cyberspace.org | don't, I need to know. Flames welcome.
Nov 15 '05 #3
Default User wrote:
I find it rude and obnoxious.


....says Usenet expert "Default User" :-)

Steve
--
Stephen Hildrey
Mail: st***@uptime.org.uk / Tel: +442071931337
Jabber: st***@jabber.earth.li / MSN: fo*@hotmail.co.uk
Nov 15 '05 #4
Christopher Benson-Manica <at***@nospam.cyberspace.org> wrote:
# (Followups set to comp.std.c. Apologies if the crosspost is unwelcome.)
#
# strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
# isn't part of the standard. Why not?

char *strrstr(char *x, char *y) {
    int m = strlen(x);
    int n = strlen(y);
    char *X = malloc(m+1);
    char *Y = malloc(n+1);
    int i;
    for (i=0; i<m; i++) X[m-1-i] = x[i]; X[m] = 0;
    for (i=0; i<n; i++) Y[n-1-i] = y[i]; Y[n] = 0;
    char *Z = strstr(X,Y);
    if (Z) {
        int ro = Z-X;
        int lo = ro+n-1;
        int ol = m-1-lo;
        Z = x+ol;
    }
    free(X); free(Y);
    return Z;
}

--
SM Ryan http://www.rawbw.com/~wyrmwif/
If your job was as meaningless as theirs, wouldn't you go crazy too?
Nov 15 '05 #5
On Thu, 25 Aug 2005 17:24:41 +0100, Stephen Hildrey
<st***@uptime.org.uk> wrote:
Default User wrote:
I find it rude and obnoxious.


...says Usenet expert "Default User" :-)

Steve


Don't have to be an expert to find it rude and obnoxious, for the
reasons "Default User" gives.
--
Al Balmer
Balmer Consulting
re************************@att.net
Nov 15 '05 #6
In article <11*************@corp.supernews.com>,
SM Ryan <wy*****@tango-sierra-oscar-foxtrot-tango.fake.org> wrote:
char *strrstr(char *x,char *y) {
int m = strlen(x);
int n = strlen(y);
char *X = malloc(m+1);
char *Y = malloc(n+1);
Small changes: strlen has a result type of size_t, not int, and
malloc() takes a parameter of type size_t, not int. A small change to
the declarations of m and n fixes both issues.
int i;
for (i=0; i<m; i++) X[m-1-i] = x[i]; X[m] = 0;
for (i=0; i<n; i++) Y[n-1-i] = y[i]; Y[n] = 0;
As per the above, m and n are size_t not int, so i needs to be size_t
as well.

Also, you don't check to see whether the malloc() returned NULL.
char *Z = strstr(X,Y);
if (Z) {
int ro = Z-X;
int lo = ro+n-1;
int ol = m-1-lo;
Z = x+ol;
This starts to get into murky waters. Z-X is a subtraction
of pointers, the result of which is ptrdiff_t, which is a signed
integral type. Logically, though, Z-X could be of size_t, which
is unsigned. This difference has probably been discussed in the past,
but I have not happened to see the discussion of what happens with
pointer subtraction if the object size would fit in the unsigned
type but not in the signed type. Anyhow, ro, lo, ol should not be int.
}
free(X); free(Y);
return Z;
}
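
Putting those changes together, a corrected version might look roughly
like this (an untested sketch, keeping the reverse-and-search approach;
note the caller still cannot tell a failed malloc() from "not found"):

#include <stdlib.h>
#include <string.h>

char *strrstr(char *x, char *y) {
    size_t m = strlen(x);
    size_t n = strlen(y);
    size_t i;
    char *Z = NULL;
    char *X = malloc(m + 1);
    char *Y = malloc(n + 1);

    if (X != NULL && Y != NULL) {
        for (i = 0; i < m; i++) X[m-1-i] = x[i];
        X[m] = 0;
        for (i = 0; i < n; i++) Y[n-1-i] = y[i];
        Y[n] = 0;
        Z = strstr(X, Y);
        if (Z != NULL) {
            size_t ro = (size_t)(Z - X);   /* offset of the match in the reversed string */
            Z = x + (m - n - ro);          /* map back into the original string */
        }
    }
    free(X);   /* free(NULL) is harmless */
    free(Y);
    return Z;
}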

--
Look out, there are llamas!
Nov 15 '05 #7
Christopher Benson-Manica wrote:
Default User <de***********@yahoo.com> wrote:
I think it's dumb to do this. It annoys the crap out of me to have a
post in a group I read, with a followup set to one I DON'T read. If
the post was appropriate for comp.lang.c, then so are replies. If
replies aren't, then the post should never have been made here.
The fact that many comp.lang.c regulars do not (judging from the
paucity of posts) follow comp.std.c was my primary motivation for
crossposting it to this group as well.


Right. If I don't read c.s.c, I sure don't want to have to subscribe to
follow one thread.
I set the followups to c.s.c only because tin seems to think it is bad
netiquette to set followups to more than one newsgroup...
I find it rude and obnoxious.


For that I humbly apologize; consider the lesson learned.


I was harsher than I needed to be there. Sorry for going a bit over the
top.


Brian
Nov 15 '05 #8
Stephen Hildrey wrote:
Default User wrote:
I find it rude and obnoxious.


...says Usenet expert "Default User" :-)

You feel that my choice of moniker reflects something about my level
of expertise? Note that "Default User" is NOT the default name in
XanaNews, my current newsreader.

Brian
Nov 15 '05 #9


SM Ryan wrote:
Christopher Benson-Manica <at***@nospam.cyberspace.org> wrote:
# (Followups set to comp.std.c. Apologies if the crosspost is unwelcome.)
#
# strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
# isn't part of the standard. Why not?

char *strrstr(char *x,char *y) {
int m = strlen(x);
int n = strlen(y);
ITYM size_t, here and throughout.
char *X = malloc(m+1);
char *Y = malloc(n+1);
if (X == NULL || Y == NULL) ...?
int i;
for (i=0; i<m; i++) X[m-1-i] = x[i]; X[m] = 0;
for (i=0; i<n; i++) Y[n-1-i] = y[i]; Y[n] = 0;
char *Z = strstr(X,Y);
if (Z) {
int ro = Z-X;
int lo = ro+n-1;
int ol = m-1-lo;
Z = x+ol;
}
free(X); free(Y);
return Z;
}


Untested:

#include <string.h>
/* @NOPEDANTRY: ignore use of reserved identifier */
char *strrstr(const char *x, const char *y) {
    char *prev = NULL;
    char *next;
    if (*y == '\0')
        return strchr(x, '\0');
    while ((next = strstr(x, y)) != NULL) {
        prev = next;
        x = next + 1;
    }
    return prev;
}

The behavior when y is empty is a matter of taste
and/or debate. The code above takes the view that the
rightmost occurrence in x of the empty string is the
one that appears (if that's the right word) just prior
to x's terminating zero; other conventions are surely
possible and might turn out to be better.

Note that simply omitting the test on y would be
an error: an empty y would then cause the while loop
to run off the end of x.
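
For instance, a quick (hypothetical, untested) check of that
convention, assuming the strrstr() above is in scope:

#include <stdio.h>

int main(void) {
    const char *s = "abcabcabc";
    printf("%s\n", strrstr(s, "abc"));                     /* expect "abc": the match at offset 6 */
    printf("%lu\n", (unsigned long)(strrstr(s, "") - s));  /* expect 9: the terminating '\0' */
    return 0;
}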

--
Er*********@sun.com

Nov 15 '05 #10
SM Ryan wrote:
Christopher Benson-Manica <at***@nospam.cyberspace.org> wrote:
# strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
# isn't part of the standard. Why not?
char *strrstr(char *x,char *y) {
int m = strlen(x);
int n = strlen(y);
char *X = malloc(m+1);
char *Y = malloc(n+1);
...


If one really wanted to use the function, that implementation
would be problematic.

I think the real answer is that there were lots of uses for
strstr() and few if any requests for strrstr() functionality.
Why specify/require it if it won't be used?

Also note that if you want to implement such a function you
might benefit from reading my chapter on string searching in
"Software Solutions in C" (ed. Dale Schumacher).
Nov 15 '05 #11
Walter Roberson wrote:
This starts to get into murky waters. Z-X is a subtraction
of pointers, the result of which is ptrdiff_t, which is a signed
integral type. Logically, though, Z-X could be of size_t, which
is unsigned. This difference has probably been discussed in the past,
but I have not happened to see the discussion of what happens with
pointer subtraction if the object size would fit in the unsigned
type but not in the signed type. Anyhow, ro, lo, ol should not be int.


ptrdiff_t is supposed to be defined as a type wide enough to
accommodate *any* possible result of a valid subtraction of
pointers to objects. If an implementation doesn't *have* a
suitable integer type, that is a deficiency.

Anyway, when you know which pointer is less than the other,
you can always subtract the lesser from the greater and the
result will then always be appropriately represented using
size_t. If you really had to worry about these limits in
some situation, you could first test which is lesser, then
use two branches in the code with size_t in each one.
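
A minimal sketch of that two-branch idea (illustrative only;
ptr_distance is a made-up name):

#include <stddef.h>

/* Magnitude of the difference between two pointers into the same
   object, stored in a size_t: compare first, then subtract the
   lesser from the greater. */
size_t ptr_distance(const char *p, const char *q) {
    if (p < q)
        return (size_t)(q - p);
    else
        return (size_t)(p - q);
}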
Nov 15 '05 #12
In article <43***************@null.net>,
Douglas A. Gwyn <DA****@null.net> wrote:
Anyway, when you know which pointer is less than the other,
you can always subtract the lesser from the greater and the
result will then always be appropriately represented using
size_t. If you really had to worry about these limits in
some situation, you could first test which is lesser, then
use two branches in the code with size_t in each one.


It seems to me that you are implying that the maximum
object size that a C implementation may support is only
half of the memory addressable in that address mode --
e.g., a maximum 2 Gb object on a 32 bit (4 Gb span)
pointer machine. This limitation would be necessary so that
the maximum object size would fit in a signed storage
location, just in case you wanted to do something like

(object + sizeof object) - object

"logically" the result would be sizeof object, an
unsigned type, but the pointer subtraction is defined
as returning a signed value, so the maximum
magnitude of the signed value would have to be at least
as great as the maximum magnitude of the unsigned value...

number_of_usable_bits(size_t) < number_of_usable_bits(ptrdiff_t)

[provided, that is, that one is not using a separate-sign-bit
machine.]
The machines I use most often -happen- to have that property
anyhow, because the high-bit on a pointer is reserved for
indicating kernel memory space, but I wonder about the extent
to which this is true on other machines?
--
This is not an idea.
Nov 15 '05 #13
Douglas A. Gwyn wrote:
ptrdiff_t is supposed to be defined as a type wide enough to
accommodate *any* possible result of a valid subtraction of
pointers to objects.


What are you talking about?

Is your point that ptrdiff_t is actually defined
opposite of the way that it's supposed to be?

"If the result is not representable in an object of that type,
the behavior is undefined.
In other words, if the expressions P and Q point to,
respectively, the i-th and j-th elements of an array object,
the expression (P)-(Q) has the value i-j
provided the value fits in an object of type ptrdiff_t."

--
pete
Nov 15 '05 #14
SM Ryan wrote:
#
# strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
# isn't part of the standard. Why not?

char *strrstr(char *x,char *y) {
int m = strlen(x);
int n = strlen(y);
char *X = malloc(m+1);
char *Y = malloc(n+1);
Using dynamic allocation for this function? You have got
to be kidding.
int i;
for (i=0; i<m; i++) X[m-1-i] = x[i]; X[m] = 0;
for (i=0; i<n; i++) Y[n-1-i] = y[i]; Y[n] = 0;
char *Z = strstr(X,Y);
if (Z) {
int ro = Z-X;
int lo = ro+n-1;
int ol = m-1-lo;
Z = x+ol;
I don't know which is more obfuscated -- your code, or your
quote marker.
}
free(X); free(Y);
return Z;
}


Nov 15 '05 #15
Christopher Benson-Manica <at***@nospam.cyberspace.org> writes:
strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
isn't part of the standard. Why not?


I don't think anyone has posted the real reason: it's arbitrary. The
C standard library isn't a coherently designed entity. It's a
collection of functionality from historical implementations,
consisting largely of whatever seemed like a good idea at the time,
filtered through the standards committee. Just look at the continuing
existence of gets(), or the design of <time.h>.

It's remarkable (and a tribute to the original authors and to the
committee) that the whole thing works as well as it does.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 15 '05 #16
Eric Sosman <er*********@sun.com> wrote:
#
#
# SM Ryan wrote:
# > Christopher Benson-Manica <at***@nospam.cyberspace.org> wrote:
# > # (Followups set to comp.std.c. Apologies if the crosspost is unwelcome.)
# > #
# > # strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
# > # isn't part of the standard. Why not?
# >
# > char *strrstr(char *x,char *y) {
# > int m = strlen(x);
# > int n = strlen(y);

Time complexity can be O(m+n), since strstr can be O(m+n)
and O(2m+2n) = O(m+n).

# char *strrstr(const char *x, const char *y) {
# char *prev = NULL;
# char *next;
# if (*y == '\0')
# return strchr(x, '\0');
# while ((next = strstr(x, y)) != NULL) {
# prev = next;
# x = next + 1;
# }
# return prev;
# }

Potentially O(m*n), depending on how often characters repeat in y.

--
SM Ryan http://www.rawbw.com/~wyrmwif/
No pleasure, no rapture, no exquisite sin greater than central air.
Nov 15 '05 #17
In <news:43***************@null.net>, Douglas A. Gwyn wrote:
Also note that if you want to implement such a function you
might benefit from reading my chapter on string searching in
"Software Solutions in C" (ed. Dale Schumacher).


The straightforward idea (using strstr() in a loop and returning the last
not-NULL answer, as strrchr() usually does) won't be a good one?
At least it would benefit from the optimized form of strstr() often
found (several people reported here that the shipped strstr()'s regularly
outperform crafted algorithms like Boyer-Moore.)

Not that I see any use for strrstr(), except perhaps to do the same as
strrchr() when c happens to be a multibyte character in a stateless
encoding.
Antoine

Nov 15 '05 #18

In article <de**********@chessie.cirr.com>, Christopher Benson-Manica <at***@nospam.cyberspace.org> writes:
Default User <de***********@yahoo.com> wrote:
I think it's dumb to [set followups].
I find it rude and obnoxious.


For that I humbly apologize; consider the lesson learned.


What lesson? That Brian doesn't like the Followup-To header? I
wouldn't recommend tailoring your posting habits solely to his
preferences. Setting Followup-To on crossposted messages is
recommended by a number of netiquette guides and Son-of-1036. Some
people dislike it; other people - some of whom felt sufficiently
animated by the subject to formalize their thoughts in usage guides -
do not.

My inclination, frankly, is to follow the recommendations of the
group which can be bothered to promulgate guidelines, over the
complaints of those who can't be bothered to do more than complain.
Sometimes there are good reasons (a clear majority of opinion or
well-established practice in a given group, for example) for
observing other conventions, but I don't see any of those here. What
I see is one poster (well, two, since I've seen Alan chime in as well)
complaining about a widely-recommended practice.

--
Michael Wojcik mi************@microfocus.com

I will shoue the world one of the grate Wonders of the world in 15
months if Now man mourders me in Dors or out Dors
-- "Lord" Timothy Dexter, _A Pickle for the Knowing Ones_
Nov 15 '05 #19
Michael Wojcik <mw*****@newsguy.com> wrote:
Sometimes there are good reasons (a clear majority of opinion or
well-established practice in a given group, for example) for
observing other conventions, but I don't see any of those here. What
I see is one poster (well, two, since I've seen Alan chime in as well)
complaining about a widely-recommended practice.


Be that as it may, I would not want to deprive myself of the
opportunity of receiving a response from Brian or Alan over a
netiquette nitpick; perhaps the safe thing to do is to avoid including
comp.lang.c in crossposts altogether. There aren't many cases where
doing so is appropriate anyway...

--
Christopher Benson-Manica | I *should* know what I'm talking about - if I
ataru(at)cyberspace.org | don't, I need to know. Flames welcome.
Nov 15 '05 #20
On 26 Aug 2005 12:59:15 GMT, mw*****@newsguy.com (Michael Wojcik)
wrote:

In article <de**********@chessie.cirr.com>, Christopher Benson-Manica <at***@nospam.cyberspace.org> writes:
Default User <de***********@yahoo.com> wrote:
> I think it's dumb to [set followups].
> I find it rude and obnoxious.


For that I humbly apologize; consider the lesson learned.


What lesson? That Brian doesn't like the Followup-To header? I
wouldn't recommend tailoring your posting habits solely to his
preferences. Setting Followup-To on crossposted messages is
recommended by a number of netiquette guides and Son-of-1036. Some
people dislike it; other people - some of whom felt sufficiently
animated by the subject to formalize their thoughts in usage guides -
do not.

My inclination, frankly, is to follow the recommendations of the
group which can be bothered to promulgate guidelines, over the
complaints of those who can't be bothered to do more than complain.
Sometimes there are good reasons (a clear majority of opinion or
well-established practice in a given group, for example) for
observing other conventions, but I don't see any of those here. What
I see is one poster (well, two, since I've seen Alan chime in as well)
complaining about a widely-recommended practice.


It's really very simple. If one doesn't want discussion in a
newsgroup, don't post to it.

That's my guideline, hereby promulgated. Complain if you like.
--
Al Balmer
Balmer Consulting
re************************@att.net
Nov 15 '05 #21

In article <fh********************************@4ax.com>, Alan Balmer <al******@att.net> writes:
On 26 Aug 2005 12:59:15 GMT, mw*****@newsguy.com (Michael Wojcik)
wrote:
What
I see is one poster (well, two, since I've seen Alan chime in as well)
complaining about a widely-recommended practice.
It's really very simple. If one doesn't want discussion in a
newsgroup, don't post to it.

That's my guideline, hereby promulgated.


Posting it hardly constitutes promulgation. When you include it in a
serious, substantial discussion of netiquette, made available as a
separate document, and preferably submitted to a standards mechanism
(say, as an Internet-Draft), then I'll consider it promulgated.
Promulgation must mean something other than "writing in public", or
why have the term at all?

There's a difference between, on the one hand, taking the time to
consider the nature of discourse in a medium, developing from that
theories of how best to use that medium, formulating those theories
as claims about best practices, constructing arguments in favor of
those practices, setting the lot down in a durable public document,
and submitting it for review; and on the other tossing out some
statement of preference in a note dashed off in a few seconds in some
conversation on a newsgroup where the question isn't even topical.

For me, that's a significant difference. For others, no doubt, it
is not; but it suffices for me to justify, to myself, disregarding
complaints about, say, cross-posting and followup-to when those
features are used in a manner that accords with most promulgated
guidelines.
Complain if you like.


I don't, particularly, since I don't really care what guidelines
people toss out in Usenet postings. What I do care about are the
ones that are arrived at by serious consideration and presented
with substantial justification.

Of course, that doesn't mean that there should be no discussion
of the question - quite the opposite, since it informs those who
might go on to produce the latter sort of guideline.

Tangentially, I might note that the reason I originally replied to
Christopher's post was that I feared he might believe that Brian's
opinion represented a consensus. It does not. (Should Christopher
choose to shape his behavior to it anyway, that's his business.)

--
Michael Wojcik mi************@microfocus.com

How can I sing with love in my bosom?
Unclean, immature and unseasonable salmon. -- Basil Bunting
Nov 15 '05 #22
Walter Roberson wrote:
It seems to me that you are implying that the maximum
object size that a C implementation may support, is only
half of the memory addressible in that address mode --
No, I was saying that *if* a C implementation doesn't
support some integer type with more bits than are needed
to represent an address, *and if* the compiler supports
objects larger than half the available address space,
*then* the definition of ptrdiff_t becomes
problematic. Note all the conditions.
The machines I use most often -happen- to have that property
anyhow, because the high-bit on a pointer is reserved for
indicating kernel memory space, but I wonder about the extent
to which this is true on other machines?


Now that 64-bit integer support is required for C
conformance, there should be a suitable ptrdiff_t type
available except on systems that support processes with
data sizes greater than 2^63 bytes. I don't know of
many systems like that.
Nov 15 '05 #23
Antoine Leca wrote:
The straightforward idea (using strstr() in a loop and returning the last
not-NULL answer, as strrchr() usually does) won't be a good one?
Well, it won't be optimal, since it searches the entire string
even when a match could have been found immediately if the
scan progressed from the end of the string. Finding the end
of the string initially has relatively high overhead, alas,
due to the representation of C strings. It isn't immediately
obvious just what the trade-off is between starting at the end
and scanning backward vs. the algorithm you suggested. Probably,
unless strrstr() is a bottleneck in the app, what you suggested
will be good enough.
At least it would benefit from the optimized form of strstr()
Yes, that is useful.

What I was actually concerned about was that people might
implement the naive "brute-force" method of attempting matches
at each incremental (decremental?) position, which is okay for
occasional use but certainly not nearly the fastest method.
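
For illustration, the brute-force decremental scan would look something
like this (an untested sketch; strrstr_naive is a made-up name, and it
re-examines characters rather than doing anything clever):

#include <string.h>

char *strrstr_naive(const char *x, const char *y) {
    size_t m = strlen(x);
    size_t n = strlen(y);
    size_t i;

    if (n > m)
        return NULL;
    /* Try a match at each position, starting from the rightmost place
       the pattern could fit and walking back toward the start. */
    for (i = m - n + 1; i-- > 0; )
        if (memcmp(x + i, y, n) == 0)
            return (char *)x + i;
    return NULL;
}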
(several people reported here that the shipped strstr()'s
regularly outperform crafted algorithms like Boyer-Moore.)
I compared various algorithms in the book to which I referred.
Not that I see any use for strrstr(), except perhaps to do the same as
strrchr() when c happens to be a multibyte character in a stateless
encoding.


Even then it's problematic, because the search would not respect
alignment with boundaries between character encodings.
Nov 15 '05 #24
Keith Thompson wrote:
I don't think anyone has posted the real reason: it's arbitrary. The
C standard library isn't a coherently designed entity. It's a
collection of functionality from historical implementations,
consisting largely of whatever seemed like a good idea at the time,
filtered through the standards committee. ...


That is far from arbitrary. The evolution of C library
functions was substantially influenced by the demands of
practical programming, and many of the interfaces went
through several iterations in the early years of C, as
deficiencies in earlier versions were identified. The C
standards committee quite reasonably chose to standardize
existing interfaces rather than try to design totally new
ones. Many of the standard interfaces are not at all
what we would come up with in a new design.
Nov 15 '05 #25
Michael Wojcik wrote:

Tangentially, I might note that the reason I originally replied to
Christopher's post was that I feared he might believe that Brian's
opinion represented a consensus. It does not. (Should Christopher
choose to shape his behavior to it anyway, that's his business.)


You have no idea whether it represents a consensus or not. A
"consensus" is not necessarily complete unanimity.

Brian
Nov 15 '05 #26
Keith Thompson wrote:
Christopher Benson-Manica <at***@nospam.cyberspace.org> writes:
strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
isn't part of the standard. Why not?


I don't think anyone has posted the real reason: it's arbitrary. The
C standard library isn't a coherently designed entity. It's a
collection of functionality from historical implementations,
consisting largely of whatever seemed like a good idea at the time,
filtered through the standards committee. Just look at the continuing
existence of gets(), or the design of <time.h>.

It's remarkable (and a tribute to the original authors and to the
committee) that the whole thing works as well as it does.


When you look at the world through rose color glasses ...

Remember that almost every virus, buffer overflow exploit, core
dump/GPF/etc is basically due to some undefined situation in the ANSI C
standard. I consider the ANSI C standard committee basically coauthors
of every one of these problems.

---
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Nov 15 '05 #27
we******@gmail.com wrote
(in article
<11********************@g49g2000cwa.googlegroups.c om>):
Keith Thompson wrote:
Christopher Benson-Manica <at***@nospam.cyberspace.org> writes:
strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
isn't part of the standard. Why not?
I don't think anyone has posted the real reason: it's arbitrary. The
C standard library isn't a coherently designed entity. It's a
collection of functionality from historical implementations,
consisting largely of whatever seemed like a good idea at the time,
filtered through the standards committee. Just look at the continuing
existence of gets(), or the design of <time.h>.

It's remarkable (and a tribute to the original authors and to the
committee) that the whole thing works as well as it does.


When you look at the world through rose color glasses ...


Well, at least some seem to have their eyes fully open.
Remember that almost every virus, buffer overflow exploit, core
dump/GPF/etc is basically due to some undefined situation in the ANSI C
standard.
Not really. Those that defined early C, and later standard C
are not responsible for bad programming. If a programmer has
access to the standard (which they do), and they decide to do
something which 'invokes undefined behavior', then it is their
fault. The standard says do not do that, and they did it
anyway.
I consider the ANSI C standard committee basically coauthors
of every one of these problems.


I couldn't disagree more. If programmers themselves were held
responsible for their mistakes, instead of trying to blame it on
loopholes or missing words in a huge document, we would be much
better off. If you could be fined or perhaps even jailed for
gross negligence in software development the way doctors can be
today, I suspect the problem would be all but nonexistent.
--
Randy Howard (2reply remove FOOBAR)

Nov 15 '05 #28
Randy Howard wrote:
we******@gmail.com wrote:
Keith Thompson wrote:
Christopher Benson-Manica <at***@nospam.cyberspace.org> writes:
strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
isn't part of the standard. Why not?

I don't think anyone has posted the real reason: it's arbitrary. The
C standard library isn't a coherently designed entity. It's a
collection of functionality from historical implementations,
consisting largely of whatever seemed like a good idea at the time,
filtered through the standards committee. Just look at the continuing
existence of gets(), or the design of <time.h>.

It's remarkable (and a tribute to the original authors and to the
committee) that the whole thing works as well as it does.
When you look at the world through rose color glasses ...


Well, at least some seem to have their eyes fully open.
Remember that almost every virus, buffer overflow exploit, core
dump/GPF/etc is basically due to some undefined situation in the ANSI C
standard.


Not really. Those that defined early C, and later standard C
are not responsible for bad programming.


Bad programming + good programming language does not allow for buffer
overflow exploits. You still need a bad programming language to
facilitate the manifestation of these worst case scenarios.
[...] If a programmer has access to the standard (which they
do), and they decide to do something which 'invokes undefined
behavior', then it is their fault. The standard says do not
do that, and they did it anyway.
Ok, this is what I was talking about when I mentioned rose colored
glasses. If programmers are perfect, then what you are saying is fine,
because you can expect perfection. But real people are not. And I
think expectations of perfection in programming is really nonsensical.

Remember NASA put a priority inversion (a truly nasty bug to deal with)
in the Mars Pathfinder. The Ariane rocket blew up because of an
overflow triggering an interrupt handler that was faulty. You think
the programmers for these projects were not trying their best to do a
good job? Perfect programmers/programming is a pipedream. There is a
reason we paint lines on the roads, wear seatbelts, put guardrails on
stairs and bridges.

The problem of programmer safety can be attacked quite successfully at
the level of the programming language itself. There isn't actually a
downside to removing gets() and deprecating strtok and strnc??. (Hint:
Legacy code uses legacy compilers.)
I consider the ANSI C standard committee basically coauthors
of every one of these problems.


I couldn't disagree more. If programmers themselves were held
responsible for their mistakes, instead of trying to blame it on
loopholes or missing words in a huge document, we would be much
better off.


And what if it's not the programmer's fault? What if the programmer is
being worked to death? What if he's in a dispute with someone else
about how something should be done and lost the argument and was forced
to do things badly?
[...] If you could be fined or perhaps even jailed for
gross neglicence in software development the way doctors can be
today, I suspect the problem would be all but nonexistent.


Ok, that's just vindictive nonsense. Programmers are generally not
aware of the liability of their mistakes. And mistakes are not
completely removable -- and there's a real question as to whether the
rate can even be reduced.

But if you were to truly enforce such an idea, I believe both C and C++
as programming languages would instantly disappear. Nobody in their
right mind, other than the most irresponsible daredevils, would program
in these languages if they were held liable for their mistakes.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Nov 15 '05 #29
we******@gmail.com wrote
(in article
<11*********************@o13g2000cwo.googlegroups. com>):
Remember that almost every virus, buffer overflow exploit, core
dump/GPF/etc is basically due to some undefined situation in the ANSI C
standard.
Not really. Those that defined early C, and later standard C
are not responsible for bad programming.


Bad programming + good programming language does not allow for buffer
overflow exploits.


For suitably high-level languages that might be true (and
provable). Let us not forget that C is *not* a high-level
language. It's not an accident that it is called high-level
assembler.

I'd love for you to explain to us, by way of example, how you
could guarantee that assembly programmers can not be allowed to
code in a way that allows buffer overflows.
You still need a bad programming language to
facilitate the manifestation of these worst case scenarios.
If you wish to argue that low-level languages are 'bad', I will
have to disagree. If you want to argue that too many people
write code in C when their skill level is more appropriate to a
language with more seatbelts, I won't disagree. The trick is
deciding who gets to make the rules.
[...] If a programmer has access to the standard (which they
do), and they decide to do something which 'invokes undefined
behavior', then it is their fault. The standard says do not
do that, and they did it anyway.


Ok, this is what I was talking about when I mentioned rose colored
glasses. If programmers are perfect, then what you are saying is fine,
because you can expect perfection. But real people are not. And I
think expectations of perfection in programming is really nonsensical.


/Exactly/ Expecting zero buffer overruns is nonsensical.
Remember NASA put a priority inversion (a truly nasty bug to deal with)
in the Mars Pathfinder. The Ariane rocket blew up because of an
overflow triggering an interrupt handler that was faulty. You think
the programmers for these projects were not trying their best to do a
good job?
No, I do not. I expect things to go wrong, because humans are
not infallible. Especially in something as inherently difficult
as space travel. It's not like you can test it (for real)
before you try it for all the marbles. You can't just hire an
army of monkeys to sit in a lab beating on the keyboard all day
like an application company.

Anyway, a language so restrictive as to guarantee that nothing
can go wrong will probably never be used for any real-world
project.
Perfect programmers/programming is a pipedream.
So is the idea of a 'perfect language'.
There is a
reason we paint lines on the roads, wear seatbelts, put guardrails on
stairs and bridges.
Yes. And we require licenses for dangerous activities
elsewhere, but anyone can pick up a compiler and start playing
around.
The problem of programmer safety can be attacked quite successfully at
the level of the programming language itself.
It's quite easy to simply make the use of gets() and friends
illegal for your code development. Most of us have already done
so, without a standard body telling us to do it.
There isn't actually a downside to removing gets() and deprecating
strtok and strnc??. (Hint: Legacy code uses legacy compilers.)
Hint: Legacy code doesn't have to stay on the original platform.
Even so, anyone dusting off an old program that doesn't go
sifting through looking for the usual suspects is a fool.

I don't have a problem with taking gets() out of modern
compilers, but as you already pointed out, this doesn't
guarantee anything. People can still fire up an old compiler
and use it. I don't see a realistic way for the C standard to
enforce such things.
I consider the ANSI C standard committee basically coauthors
of every one of these problems.


I couldn't disagree more. If programmers themselves were held
responsible for their mistakes, instead of trying to blame it on
loopholes or missing words in a huge document, we would be much
better off.


And what if it's not the programmer's fault?


It is the fault of the development team, comprised of whoever
is involved in a given project. If the programmer feels like
his boss screwed him over, let him refuse to continue, swear out
an affidavit and have it notarized that the bad software was
knowingly shipped, and that he refuses to endorse it.
What if the programmer is being worked to death?
That would be interesting, because although I have worked way
more than my fair share of 120 hour weeks, I never died, and
never heard of anyone dying. I have heard of a few losing it
and checking themselves into psycho wards, but still. If you
are being overworked, you can either keep doing it, or you can
quit, or you can convince your boss to lighten up. ESPECIALLY
in this case, the C standard folks are not to blame.
What if he's in a dispute with someone else
about how something should be done and lost the argument and
was forced to do things badly?
Try and force me to write something in a way that I know is
wrong. Go ahead, it'll be a short argument, because I will
resign first.

Try and force a brain surgeon to operate on your head with a
chainsaw. good luck.
[...] If you could be fined or perhaps even jailed for
gross negligence in software development the way doctors can be
today, I suspect the problem would be all but nonexistent.


Ok, that's just vindictive nonsense.


Why? We expect architects, doctors, lawyers, pretty much all
other real 'professions' to meet and typically exceed a higher
standard, and those that do not are punished, fined, or stripped
of their license to practice in the field. Why should
programmers get a pass? Is it because you do not feel it is a
professional position?

We don't let just anyone who wants to prescribe medicine do so, so why
should we let anyone who wants to put software up for download which
could compromise system security?
Programmers are generally not aware of the liability of
their mistakes.
Then those you refer to must be generally incompetent. Those
that are good certainly are aware, especially when the software
is of a critical nature.
And mistakes are not completely removable --
Correct. It's also not possible to completely remove medical
malpractice, but it gets punished anyway. It's called a
deterrent.
and there's a real question as to whether the rate can even be reduced.
As long as there is no risk of failure, it almost certainly will
not be reduced by magic or wishing.
But if you were to truly enforce such an idea, I believe both C and C++
as programming languages would instantly disappear.
I highly doubt that. Low-level language programmers would be
the cream of the crop, not 'the lowest bidder' as is the case
today. You would not be hired to work based upon price, but on
skill. Much as I would go look for the most expensive attorney
I could find if I was on trial, I would look for the most highly
skilled programmers I could find to work on a nuclear reactor.

Taking bids and outsourcing to some sweatshop in a jungle
somewhere would not be on the list of options.
Nobody in their right mind, other than the most irresponsible
daredevils, would program in these languages if they were held
liable for their mistakes.


I guess all the professionals in other fields where they are
held up to scrutiny must be irresponsible daredevils too. For
example, there are operations that have very low success rates,
yet there are doctors that specialize in them anyway, despite
the low odds.

If you don't want to take the risk, then go write in visual
whatever#.net and leave it to those that are.
--
Randy Howard (2reply remove FOOBAR)

Nov 15 '05 #30
Randy Howard <ra*********@FOOverizonBAR.net> writes:

<getting-way-OT>
I'd love for you to explain to us, by way of example, how you
could guarantee that assembly programmers can not be allowed to
code in a way that allows buffer overflows.

......

/Exactly/ Expecting zero buffer overruns is nonsensical.

......

Anyway, a language so restrictive as to guarantee that nothing
can go wrong will probably never be used for any real-world
project.

I struggle to parse your first sentence, but what if assembly language
programmers were "required" to program in an assembly language whose
program structure could be strongly verified at runtime (aka JVM bytecodes)?

Or would that be against the spirit of an assembly language, and the
discussion?

</getting-way-OT>
--
Chris.
Nov 15 '05 #31
Randy Howard wrote:
we******@gmail.com wrote:
Remember that almost every virus, buffer overflow exploit, core
dump/GPF/etc is basically due to some undefined situation in the ANSI C
standard.

Not really. Those that defined early C, and later standard C
are not responsible for bad programming.
Bad programming + good programming language does not allow for buffer
overflow exploits.


For suitably high-level languages that might be true (and
provable). Let us not forget that C is *not* a high-level
language. It's not an accident that it is called high-level
assembler.


Right. If you're not with us, you are with the terrorists.

Why does being a low-level language mean you have to present a programming
interface surrounded by landmines? Exposing a sufficiently low-level
interface may require that you expose some dangerous semantics, but why
expose them up front right in the most natural paths of usage?
I'd love for you to explain to us, by way of example, how you
could guarantee that assembly programmers can not be allowed to
code in a way that allows buffer overflows.
Ok, the halting problem means basically nobody guarantees anything
about computer programming.

But it's interesting that you bring up the question of assembly
language. If you peruse the x86 assembly USENET newsgroups, you will
see that many people are very interested in expanding the power and
syntax for assembly language (examples include HLA, RosAsm, and
others). A recent post talked about writing a good string library for
assembly, and there was a strong endorsement for the length prefixed
style of strings, including one direct reference to Bstrlib as a design
worth following (not posted by me!).

So, while assembly clearly isn't an inherently safe language, it seems
quite possible that some assembly efforts will have a much safer (and
much faster) string interface than C does.
You still need a bad programming language to facilitate the
manifestation of these worst case scenarios.


If you wish to argue that low-level languages are 'bad', I will
have to disagree.


So why put those words in my mouth?
[...] If you want to argue that too many people
write code in C when their skill level is more appropriate to a
language with more seatbelts, I won't disagree. The trick is
deciding who gets to make the rules.
But I'm not arguing that either. I am saying C is to a large degree
just capriciously and unnecessarily unsafe (and slow, and powerless,
and unportable etc., etc).
[...] If a programmer has access to the standard (which they
do), and they decide to do something which 'invokes undefined
behavior', then it is their fault. The standard says do not
do that, and they did it anyway.


Ok, this is what I was talking about when I mentioned rose colored
glasses. If programmers are perfect, then what you are saying is fine,
because you can expect perfection. But real people are not. And I
think expectations of perfection in programming is really nonsensical.


/Exactly/ Expecting zero buffer overruns is nonsensical.


Well, not exactly. If you're not using C or C++, then buffer overflows
usually at worst lead to a runtime exception; in C or C++, exploits are
typically designed to gain shell access in the context of the erroneous
program. It's like honey for bees -- people attack C/C++ programs
because they have this weakness. In other safer programming languages,
even if you had a buffer overflow, allowing a control flow
zombification of the program is typically not going to be possible.
Remember NASA put a priority inversion (a truly nasty bug to deal with)
in the Mars Pathfinder. The Ariane rocket blew up because of an
overflow triggering an interrupt handler that was faulty. You think
the programmers for these projects were not trying their best to do a
good job?


No, I do not. I expect things to go wrong, because humans are
not infallible. Especially in something as inherently difficult
as space travel.


Space travel itself was not the issue, and it wasn't any more
complicated than any kind of industrial device manager (as you might
find in an automated assembly line). The real problem is that priority
inversions are *nasty*. Each component can be unit tested and
validated to work properly in isolation -- the problem appears when you
put them together and they encounter a specific scenario. It's just a
very sophisticated deadlock.
[...] It's not like you can test it (for real)
before you try it for all the marbles. You can't just hire an
army of monkeys to sit in a lab beating on the keyboard all day
like an application company.
Hmm ... I don't think that's quite it. The problem is that the
scenario, which I don't recall all the details of, was something that
was simply unaccounted for in their testing. This is a problem in
testing in general. Line by line coverage, unit testing, and other
forms of typical testing really only find the most obvious bugs.

They were able to save the Pathfinder, because VxWorks allows you to
reboot into a shell or debug mode, and they were able to patch the code
remotely. The point of this being that in the end they were lucky to
have very sophisticated 3rd party support that is well beyond anything
that the C standard delivers.
Anyway, a language so restrictive as to guarantee that nothing
can go wrong will probably never be used for any real-world
project.
How about a simpler language that is more powerful, demonstrably faster,
more portable (dictionary definition), obviously safer and still just
as low level? Just take the C standard, deprecate the garbage, replace
a few things, genericize some of the APIs, well define some of the
scenarios which are currently described as undefined, make some of the
ambiguous syntaxes that lead to undefined behavior illegal, and you're
immediately there. If these steps seem too radical, just draw a line
from where you are and where you need to go, and pick an acceptable
point in between.

Your problem is that you assume making C safer (or faster, or more
portable, or whatever) will take something useful away from C that it
currently has. Think about that for a minute. How is it possible that
your mind can be in that state?
Perfect programmers/programming is a pipedream.


So is the idea of a 'perfect language'.


But I was not advocating that. You want punishment -- so you
implicitly are *demanding* programmer perfection.
There is a
reason we paint lines on the roads, wear seatbelts, put guardrails on
stairs and bridges.


Yes. And we require licenses for dangerous activities
elsewhere, but anyone can pick up a compiler and start playing
around.
The problem of programmer safety can be attacked quite successfully at
the level of the programming language itself.


It's quite easy to simply make the use of gets() and friends
illegal for your code development. Most of us have already done
so, without a standard body telling us to do it.


So, estimate the time taken to absorb this information per programmer,
multiply it by the average wage of that programmer, multiply that by
the number of programmers that follow it, and there you get the cost
of doing it correctly. Add to that the cost of downtime for those that
get it wrong. (These are costs per year, of course -- since it's an
ongoing problem, the total cost would really be infinite.)

The standards body just needs to remove it and those costs go away.
Vendors and legacy defenders and pure idiot programmers might get their
panties in a bunch, but no matter how you slice it, the cost of doing
this is clearly finite.
There isn't actually a downside to removing gets() and deprecating
strtok and strnc??. (Hint: Legacy code uses legacy compilers.)


Hint: Legacy code doesn't have to stay on the original platform.


Hint: moving code *ALWAYS* incurs costs. As I said above, it's a
*finite* cost. You don't think people who move code around with calls
to gets() in it should remove them?
Even so, anyone dusting off an old program that doesn't go
sifting through looking for the usual suspects is a fool.
And an old million line program? I think this process should be
automated. In fact, I think it should be automated in your compiler.
In fact I think your compiler should just reject these nonsensical
functions out of hand and issue errors complaining about them. Hey! I
have an idea! Why not remove them from the standard?
I don't have a problem with taking gets() out of modern
compilers, but as you already pointed out, this doesn't
guarantee anything. People can still fire up an old compiler
and use it. I don't see a realistic way for the C standard to
enforce such things.
Interesting -- because I do. You make gets a reserved word, not
redefinable by the preprocessor, and have it always lead to a syntax
error. This forces legacy code owners to either remove it, or stay
away from new compilers.

This has value because developers can claim to be "C 2010 compliant"
or whatever, and this can tell you that you know it doesn't have gets()
or any other wart that you decided to get rid of. This would in turn
put pressure on the legacy code owners to remove the offending calls,
in an effort that's certainly no worse than the Y2K issue (without the
looming deadline hanging over their heads).
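
As an aside, this is not standard C, but GCC-compatible compilers
already let you get something close to that today with the poison
pragma (a small illustrative sketch):

#include <stdio.h>
#pragma GCC poison gets   /* any reference to gets() after this point is a hard error */

int main(void) {
    char buf[80];
    /* gets(buf);  -- would now fail to compile */
    if (fgets(buf, sizeof buf, stdin) != NULL)
        fputs(buf, stdout);
    return 0;
}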
I consider the ANSI C standard committee basically coauthors
of every one of these problems.

I couldn't disagree more. If programmers themselves were held
responsible for their mistakes, instead of trying to blame it on
loopholes or missing words in a huge document, we would be much
better off.


And what if it's not the programmer's fault?


It is the fault of the development team, comprised of whoever
is involved in a given project. If the programmer feels like
his boss screwed him over, let him refuse to continue, swear out
an affidavit and have it notarized that the bad software was
knowingly shipped, and that he refuses to endorse it.


Oh I see. So, which socialist totally unionized company do you work as
a programmer for? I'd like to apply!
What if the programmer is being worked to death?


That would be interesting, because although I have worked way
more than my fair share of 120 hour weeks, I never died, and
never heard of anyone dying. I have heard of a few losing it
and checking themselves into psycho wards, but still.


Well ... they usually put in buffer overflows, backdoors, or otherwise
sloppy code before they check into these places.
[...] If you
are being overworked, you can either keep doing it, or you can
quit, or you can convince your boss to lighten up.
Hmmm ... so you live in India? I'm trying to guess where it is in this
day and age that you can just quit your job solely because you don't
like the pressures coming from management.
[...] ESPECIALLY in this case, the C standard folks are not to blame.
But if the same issue happens and you are using a safer language, the
same kinds of issues don't come up. Your code might be wrong, but it
won't allow buffer overflow exploits.
What if he's in a dispute with someone else
about how something should be done and lost the argument and
was forced to do things badly?


Try and force me to write something in a way that I know is
wrong. Go ahead, it'll be a short argument, because I will
resign first.


That's a nice bubble you live in. Or is it just in your mind?
Try and force a brain surgeon to operate on your head with a
chainsaw. good luck.
[...] If you could be fined or perhaps even jailed for
gross negligence in software development the way doctors can be
today, I suspect the problem would be all but nonexistent.
Ok, that's just vindictive nonsense.


Why? We expect architects, doctors, lawyers, pretty much all
other real 'professions' to meet and typically exceed a higher
standard, and those that do not are punished, fined, or stripped
of their license to practice in the field. Why should
programmers get a pass? Is it because you do not feel it is a
professional position?


Because it's not as structured, and that's simply not practical.
Doctors have training, internships, etc. Lawyers have to pass a bar
exam, etc. There's no such analogue for computer programmers. The
most successful programmers are always the ones able to think
outside the box, while the bar for average programmers is pretty low --
but both can make a contribution, and neither can guarantee perfect
code.
We don't let just anyone who wants to prescribe medicine do so, so why
should we let anyone who wants to put software up for download which
could compromise system security?
Programmers are generally not aware of the liability of
their mistakes.
Then those you refer to must be generally incompetent.


Dennis Ritchie had no idea that NASA would put a priority inversion in
their Pathfinder code. Linus Torvalds had no idea that the NSA would
take his code and use it for a security-based platform. My point is
that programmers don't know what the liability of their code is,
because they are not always in control of when or where or for what it
might be used.

The recent JPEG parsing buffer overflow exploit, for example, came from
failed sample code from the JPEG website itself. You think we should
hunt down Tom Lane and lynch him?
[...] Those that are good certainly are aware, especially when
the software is of a critical nature.
And mistakes are not completely removable --
Correct. It's also not possible to completely remove medical
malpractice, but it gets punished anyway. It's called a
deterrent.


You don't think medical practitioners use the latest and safest
technology available to practice their medicine?
and there's a real question as to whether the rate can even be reduced.


As long as there is no risk of failure, it almost certainly will
not be reduced by magic or wishing.


This is utter nonsense. The reason for the success of languages like
Java and Python is not because of their speed, you know.
But if you were to truly enforce such an idea, I believe both C and C++
as programming languages would instantly disappear.


I highly doubt that. Low-level language programmers would be
the cream of the crop, not 'the lowest bidder' as is the case
today.


You still don't get it. You, I, or anyone you know will produce errors
if pushed. There's no such thing as a 0 error rate for programming.
Just measuring first-time compile error rates myself, I score roughly
one syntax error per 300 lines of code. I take this as an indicator
for the likely number of hidden bugs I just don't know about in my
code. Unless my first-compile error rate were 0, I just can't have any
confidence that my hidden bug rate is 0. I know that
since using my own Bstrlib library and other similar mechanisms, my
rate is probably far less now than it's ever been. But it's still not 0.

Go measure your own first-compile error rate and tell me you are
confident in your own ability to avoid hidden bugs. If you still think
you can achieve a 0 or near 0 hidden bug rate, go look up "priority
inversion". No syntax checker and no run time debugger can tell you
about this sort of error. Your only chance of avoiding these sorts of
errors is having a very thoroughly vetted high-level design.
[...] You would not be hired to work based upon price, but on
skill. Much as I would go look for the most expensive attorney
I could find if I was on trial, I would look for the most highly
skilled programmers I could find to work on a nuclear reactor.

Taking bids and outsourcing to some sweatshop in a jungle
somewhere would not be on the list of options.
For a nuclear reactor, I would also include the requirement that they
use a safer programming language like Ada. Personally I would be
shocked to know that *ANY* nuclear reactor control mechanism was
written in C. Maybe a low-level I/O driver library that was
thoroughly vetted (because you probably can't do that in Ada), but
that's it.
Nobody in their right mind, other than the most irresponsible
daredevils would program in these langauges if they were held
liable for their mistakes.


I guess all the professionals in other fields where they are
held up to scrutiny must be irresponsible daredevils too.


No -- they have great assistance and controlled environments that allow
them to perform under such conditions. Something akin to using a
better programming language.
[...] For
example, there are operations that have very low success rates,
yet there are doctors that specialize in them anyway, despite
the low odds.
Well, your analogy only makes some sense if you are talking about
surgeons in developing countries who simply don't have access to the
necessary anesthetic, support staff or even the proper education to do
the operation correctly. In those cases, there is little choice, so
you make do with what you have. But obviously it's a situation you just
want to move away from -- the way you solve it is to give them
access to safer and better ways to practice medicine.
If you don't want to take the risk, then go write in visual
whatever#.net and leave it to those that are.


So you want some people to stay away from C because the language is too
dangerous, while I want the language to be fixed so that most people
don't trigger the landmines in the language so easily. If you think
about it, my solution actually *costs* less.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Nov 15 '05 #32
we******@gmail.com wrote:
The point of this being that in the end they were lucky to
have very sophisticated 3rd party support that is well beyond anything
that the C standard delivers.
You surely cannot be comparing "3rd party support" from a commercial
company to a language standard? They have totally different purposes.
That's like comparing a specification of a car to a taxi company,
and complaining that if you sit on the specification it doesn't get you
anywhere, but if you call the taxi company they get you where you tell them to.
For a nuclear reactor, I would also include the requirement that they
use a safer programming language like Ada.


The Ariane software module that caused the problem was written in Ada.
http://sunnyday.mit.edu/accidents/Ar...entreport.html
Had it been written in C, the actual cause (integer overflow) probably would not
have caused an exception. I'm not saying that it would have been better in
C, but you *cannot* blame the C standard for what happened there.
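For illustration only, a minimal sketch of what such an out-of-range conversion looks like in C (the variable name and value below are made up, not taken from the Ariane code):

#include <stdio.h>

/* Converting an out-of-range value to a 16-bit signed integer raises no
   exception in C: for an integer source the result is implementation-defined
   (or an implementation-defined signal is raised); for a floating-point
   source the behaviour is undefined.  Either way, there is no Ada-style
   Constraint_Error. */
int main(void)
{
    long horizontal_bias = 40000;              /* hypothetical out-of-range value */
    short converted = (short)horizontal_bias;  /* silently "wrong", no exception */

    printf("%ld became %d\n", horizontal_bias, (int)converted);
    return 0;
}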

Also, this "priority inversion" you speak of - doesn't that imply processes
or threads? C does not have that AFAIK. So you cannot blame the C standard
for allowing priority inversion bugs to occur. It neither allows nor disallows
them, because C has no notion of priorities.

/Magnus
Nov 15 '05 #33
we******@gmail.com writes:
Randy Howard wrote:
we******@gmail.com wrote:
Remember that almost every virus, buffer overflow exploit, core
dump/GPF/etc is basically due to some undefined situation in the
ANSI C standard.


Not really. Those that defined early C, and later standard C are
not responsible for bad programming.


Bad programming + good programming language does not allow for
buffer overflow exploits. You still need a bad programming language
to facilitate the manifestation of these worst case scenarios.


Exploits that rely on C undefined behaviour are not the only kind of
problem in reality. Programs not written in C sometimes have serious
security problems too.

For example lots of software has had various kinds of quoting and
validation bugs - SQL injection, cross-site scripting, inadequate
shell quoting - for many years, and this is a consequence purely of
the program, and cannot be pinned on the language it is written in.

You won't spot these bugs with tools such as Valgrind or Purify,
either.

--
http://www.greenend.org.uk/rjk/
Nov 15 '05 #34
Magnus Wibeck wrote:
we******@gmail.com wrote:
> The point of this being that in the end they were lucky to
> have very sophisticated 3rd party support that is well beyond anything
> that the C standard delivers.
You surely cannot be comparing "3rd party support" from a commercial
company to a language standard?


Originally I was making a point about the mistake rate of programmers.
But more generally, the C language probably has more "problem support
tools" than any language in existence, and this will probably continue
to be true for the future regardless of language mindshare.
[...] They have totally different purposes.
That's like comparing a specification of a car to a taxi company,
and complaining that if you sit on the specification it doesn't get you
anywhere, but if you call the taxi company they get you where you tell them
to.
Hmmm ... I'm not sure it's the same thing. For example, let's say C
added a function, numallocs(), that counted the number of memory
allocations that are outstanding (or the maximum number that could
legally be freed, or whatever). Similarly, the Boehm garbage collector
could be adopted as part of the C standard (not that I'm advocating that).
And if the C library were to basically abandon its string functions and use
something like Bstrlib, for example, then David Wagner's (and many
other) buffer overflow security analysis tools would be obsolete.
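For illustration, a counting facility of that sort can already be layered over malloc()/free() by any program today; numallocs() below is hypothetical, not a standard function, and it only counts blocks that go through these wrappers:

#include <stdlib.h>

static size_t outstanding_allocs = 0;

void *counted_malloc(size_t n)
{
    void *p = malloc(n);
    if (p != NULL)
        outstanding_allocs++;       /* one more block not yet freed */
    return p;
}

void counted_free(void *p)
{
    if (p != NULL)
        outstanding_allocs--;       /* assumes p came from counted_malloc() */
    free(p);
}

size_t numallocs(void)              /* hypothetical query: blocks still outstanding */
{
    return outstanding_allocs;
}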
> For a nuclear reactor, I would also include the requirement that they
> use a safer programming language like Ada.


The Ariane software module that caused the problem was written in Ada.
http://sunnyday.mit.edu/accidents/Ar...entreport.html
Had it been written in C, the actual cause (integer overflow) probably would
not have caused an exception. I'm not saying that it would have been better
in C, but you *cannot* blame the C standard for what happened there.


You are right, I cannot blame C for bugs that happen in other
languages. This is the most famous one from Ada. If you would like a
short list of infamous bugs for C just go through the CERT advisories
-- they are basically almost entirely C related.

See, the thing is, with Ada bugs, you can clearly blame the programmer
for most kinds of failures. With C you can go either way. But nearly
every software design house that writes lots of software in C just gets
bit by bugs from all sorts of edges of the language.
Also, this "priority inversion" you speak of - doesn't that imply processes
or threads? C does not have that AFAIK. So you cannot blame the C standard
for allowing priority inversion bugs to occurr. It neither allows or
disallows them, because C has no notion of priorities.


The programmer used priority based threading because that's what he had
available to him. Suppose, however, that C had implemented co-routines
(they require only barely more support than setjmp()/longjmp()). It
turns out that using coroutines alone, you can solve a lot of
multitasking problems. Maybe the Pathfinder code would have had more
coroutines, and fewer threads, and may have avoided the problem
altogether (I am not privy to their source, so I really don't know).
This isn't just some weird snake-oil style solution -- by their very
nature, coroutines do not have priorities, do not in and of themselves make
race conditions possible, and generally consume less in resources than
threads.

Coroutines are one of those "perfect compromises", because you can
easily specify a portable interface, that is very likely to be widely
supportable, they are actually tremendously faster than threading in
many cases, and all without adding *any* undefined behavior or
implementation defined behavior scenarios (other than a potential
inability to allocate new stacks.) Full-blown multithreading, such as
in POSIX, is notoriously platform-specific, and it should not surprise
anyone that only a few non-UNIX platforms support full-blown POSIX
threads. This fact has been noticed and adopted by those languages
where serious development is happening (Lua, Perl, Python). I don't
know if the C standards committee would be open to this -- I highly
doubt it.
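For illustration only, here is a minimal sketch of the stackless, switch-based flavour of coroutine that can be written in strictly standard C today (the well-known protothreads trick); the real stack-switching coroutines being argued for here would need the extra runtime support mentioned above:

#include <stdio.h>

/* Each coroutine keeps an int "state" holding the line to resume at. */
#define CO_BEGIN(state)        switch (*(state)) { case 0:
#define CO_YIELD(state, value) \
    do { *(state) = __LINE__; return (value); case __LINE__:; } while (0)
#define CO_END(state)          } *(state) = 0; return -1

/* Yields 0, 1, 2 on successive calls, then -1. */
static int counter(int *state)
{
    CO_BEGIN(state);
    CO_YIELD(state, 0);
    CO_YIELD(state, 1);
    CO_YIELD(state, 2);
    CO_END(state);
}

int main(void)
{
    int state = 0;
    int v;

    for (v = counter(&state); v != -1; v = counter(&state))
        printf("%d\n", v);
    return 0;
}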

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Nov 15 '05 #35
In article <11**********************@g47g2000cwa.googlegroups .com>,
we******@gmail.com writes
Magnus Wibeck wrote:
we******@gmail.com wrote:
> The point of this being that in the end they were lucky to
> have very sophisticated 3rd party support that is well beyond anything
> that the C standard delivers.
You surely cannot be comparing "3rd party support" from a commercial
company to a language standard?
why not?
> For a nuclear reactor, I would also include the requirement that they
> use a safer programming language like Ada.
The Ariane software module that caused the problem was written in Ada.
http://sunnyday.mit.edu/accidents/Ar...entreport.html
Had it been written in C, the actual cause (integer overflow) probably would
not have caused an exception. I'm not saying that it would have been better
in C, but you *cannot* blame the C standard for what happened there.


You are right, I cannot blame C for bugs that happen in other
languages. This is the most famous one from Ada. If you would like a
short list of infamous bugs for C just go through the CERT advisories
-- they are basically almost entirely C related.

Possibly because C is more widely and less rigorously used? I would
expect that most Ada projects are high integrity and developed as such.
C is often not used (and certainly not taught) in a high-integrity
environment.
See, the thing is, with Ada bugs, you can clearly blame the programmer
for most kinds of failures.
AFAIK the Ariane problem was one of project management.
With C you can go either way. But nearly
every software design house that writes lots of software in C just gets
bit by bugs from all sorts of edges of the language.


So use a subset? Many industries do.
--
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
\/\/\/\/\ Chris Hills Staffs England /\/\/\/\/
/\/\/ ch***@phaedsys.org www.phaedsys.org \/\/\
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/

Nov 15 '05 #36
we******@gmail.com wrote:
Randy Howard wrote:
we******@gmail.com wrote: ....
You still need a bad programming language to facilitate the
manifestation of these worst case scenarios.
If you wish to argue that low-level languages are 'bad', I will
have to disagree.


So why put those words in my mouth?


He didn't - he's just pointing out that the characteristics you deplore
in C are inherent in C being a low-level language. Therefore, any
criticism of C for possessing those characteristics implies a criticism
of all low-level languages. You didn't actually make such a criticism,
but it was implied by the criticism you did make.

.... Your problem is that you assume making C safer (or faster, or more
portable, or whatever) will take something useful away from C that it
currently has. Think about that for a minute. How is it possible that
your mind can be in that state?
Possibly, possession of a certain minimal state of awareness of
reality? No one wants C to be unsafe, slow, or unportable. As a general
rule, the cost-free ways of making it safer, faster, and more portable
have already been fully exploited. Therefore, the remaining ways are
disproportionately likely to carry a significant cost.

This is simple economics: cost-free or negative-cost ways of improving
anything are usually implemented quickly. With any reasonably mature
system, the ways of improving the system that haven't been implemented
yet are disproportionately likely to carry a significant cost.

....
So is the idea of a 'perfect language'.


But I was not advocating that. You want punishment -- so you
implicitly are *demanding* programmer perfection.


By that logic, requiring punishment for theft implicitly demands human
perfection?

get it wrong. (These are costs per year, of course -- since it's an
ongoing problem, the total cost would really be infinite.)
You're failing to take into consideration the cost of capital. Costs
that take place in the future are less expensive in present-day dollars
than costs that take place in the present. The net present value of a
steady annual cost is finite, so long as the cost of capital is
positive.

The standards body just needs to remove it and those costs go away.
Vendors and legacy defenders and pure idiot programmers might get their
panties in a bunch, but no matter how you slice it, the cost of doing
this is clearly finite.
You're assuming that those programmers are idiots, instead of being
intelligent people who are actually aware of what the ongoing (i.e. by
your way of calculating things, infinite) costs of such a change will
be.
I don't have a problem with taking gets() out of modern
compilers, but as you already pointed out, this doesn't
guarantee anything. People can still _fire up an old compiler
and use it_. ...
... I don't see a realistic way for the C standard to
enforce such things.


Interesting -- because I do. You make gets a reserved word, not
redefinable by the preprocessor, and have it always lead to a syntax
error. This forces legacy code owners to either remove it, or stay
away from new compilers.


How in the world does changing new compilers have any effect on people
who "fire up an old compiler and use it"?

....
It is the fault of the development team, comprised of whoever
that involves for a given project. If the programmer feels like
his boss screwed him over, let him refuse to continue, swear out
an affidavit and have it notarized the bad software was
knowingly shipped, and that you refuse to endorse it.


Oh I see. So, which socialist totally unionized company do you work as
a programmer for? I'd like to apply!


What does socialism and unionism have to do with workers accepting full
responsibility for the quality of their product?
That would be interesting, because although I have worked way
more than my fair share of 120 hour weeks, I never died, and
never heard of anyone dying. I have heard of a few losing it
and checking themselves into psycho wards, but still.


Well ... they usually put in buffer overflows, backdoors, or otherwise
sloppy code before they check into these places.


Backdoors are, by definition, installed deliberately. I suppose you
might have intended to imply that overworked programmers would install
backdoors as a way of getting revenge for being overworked, but if so,
you didn't express that idea properly.
[...] If you
are being overworked, you can either keep doing it, or you can
quit, or you can convince your boss to lighten up.


Hmmm ... so you live in India? I'm trying to guess where it is in this
day and age that you can just quit your job solely because you don't
like the pressures coming from management.


I'm curious - what part of the world do you live in where you are
prohibited from quitting your job? I don't understand your reference to
India - are you suggesting that it is the only place in the world where
workers aren't slaves?

....
Try and force me to write something in a way that I know is
wrong. Go ahead, it'll be a short argument, because I will
resign first.


That's a nice bubble you live in. Or is it just in your mind?


I live in that same bubble. I'm free to quit my job for any reasons I
want to, at any time I want to. I would stop being paid, I'd have to
start searching for a new job at a better employer, and I'd have to pay
full price if I decided to use the COBRA option to continue receiving
the insurance benefits that my employer currently subsidizes, but those
are just consequences of my decision, not things that would prevent me
from making it. If I decide to obey orders to produce defective code, I
have to accept the consequences of being responsible for bad code. If I
prefer the consequences of having to look for a new job at a better
employer, that's precisely what I'll do. Wouldn't you?

.... Dennis Ritchie had no idea that NASA would put a priority inversion in
their pathfinder code. Linus Torvalds had no idea that the NSA would
take his code and use it for a security based platform. My point is
that programmers don't know what the liability of their code is,
because they are not always in control of when or where or for what it
might be used.
When you take someone else's code and use it in a context that it
wasn't designed for, the responsibility for adapting it to be suitable
for use in the new context is yours, not the original author's.
But if you were to truly enforce such an idea, I believe both C and C++

[...] For
example, there are operations that have very low success rates,
yet there are doctors that specialize in them anyway, despite
the low odds.


Well, your analogy only makes some sense if you are talking about
surgeons in developing countries who simply don't have access to the
necessary anesthetic, support staff or even the proper education to do
the operation correctly.


Which would you prefer: a life expectancy of three months, or a 30%
chance of increasing your life expectancy to 20 years, inextricably
linked with a 70% chance of dying in the operating room tomorrow? There
are real-life situations where the best doctors in the world, with the
best equipment in the world, can't offer you a choice that's any more
attractive than that one.
... In those cases, there is little choice, so
you make do with what you have. But obviously it's a situation you just
want to move away from -- the way you solve it is you give them
access to safer and better ways to practice medicine.


I suspect that no matter how advanced our medicine gets, there will
always be conditions that it's just barely able to deal with. The
longer we live, the harder it is to keep us living; that's pretty much
unavoidable.

Nov 15 '05 #37
Paul Hsieh writes:
Remember that almost every virus, buffer overflow exploit, core
dump/GPF/etc is basically due to some undefined situation in the
ANSI C standard. I consider the ANSI C standard committee
basically coauthors of every one of these problems.


So it's partly their fault? What should they have done -
refrained from standardizing the already existing C language?
That would not have helped: K&R C was already widely used, and
people were cooperating anyway to get some sort of portability out
of it.

Or should they have removed every undefined situation from the
language? Bye bye free() and realloc() - require a garbage
collector instead. To catch all bad pointer usage, insert
type/range information in both pointers and data. Those two
changes alone in the standard would change the C runtime
implementation so much that it's practically another language.

An _implementation_ which catches such things can be nice when you
already have a C program which you want to run safely. But if the
language standard itself made such requirements, a number of the
reasons that exist to choose C for a project would not be there.

If one is going to use another language than C, it's better to use
a language which takes advantage of not being C, instead of a
language which pretends to be C but isn't.

--
Hallvard
Nov 15 '05 #38
[off-topic drift, but I cannot resist...]

In article <3n***********@individual.net>
Default User <de***********@yahoo.com> wrote:
You feel that my choice of moniker reflects something about my level
of expertise? Note that "Default User" is NOT the default name in
XanaNews, my current newsreader.


I always figured it meant that you are known for requiring a
"default:" label in every switch(). :-)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 15 '05 #39
(Again, quite off-topic, but ...)

[Ariane rocket example]

In article <11**********************@g47g2000cwa.googlegroups .com>
<we******@gmail.com> wrote:
You are right, I cannot blame C for bugs that happen in other
languages. This is the most famous one from Ada. ...
See, the thing is, with Ada bugs, you can clearly blame the programmer
for most kinds of failures.
I am reminded of a line from a novel and movie:

"*We* fix the blame. *They* fix the problem. Their way's better."

[Pathfinder example]
The programmer used priority based threading because that's what he had
available to him.
Actually, the Pathfinder used vxWorks, a system with which I am
now somewhat familiar. (Not that I know much about versions
predating 6.0, but this particular item has been this way "forever",
or long enough anyway.)

The vxWorks system offers "mutex semaphores" as one of its several
flavors of data-protection between threads. The mutex creation
call, semMCreate(), takes several flag parameters. One of these
flags controls "task" (thread, process, whatever moniker you prefer)
priority behavior when the task blocks on the mutex.

The programmer *chose* this behavior, because vxWorks does offer
priority inheritance. (Admittedly, vxWorks priority inheritance
has a flaw, but that is a different problem.)

Thus, your premise -- that the programmer used priority based
scheduling (without inheritance) that led to the priority inversion
problem "because that's what he had available" is incorrect: he
could have chosen to make all the threads the same priority, and/or
used priority inheritance, all with simple parameters to the various
calls (taskSpawn(), semMCreate(), and so on).
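For illustration, the choice Chris describes is just a flag at mutex-creation time; a sketch in vxWorks terms (not standard C; header and flag names as in the usual vxWorks API, quoted from memory) might be:

#include <vxWorks.h>
#include <semLib.h>

SEM_ID make_shared_mutex(void)
{
    /* SEM_Q_PRIORITY queues blocked tasks by priority;
       SEM_INVERSION_SAFE selects priority inheritance, the option that
       avoids the Pathfinder-style priority inversion. */
    return semMCreate(SEM_Q_PRIORITY | SEM_INVERSION_SAFE);
}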
Coroutines are one of those "perfect compromises" ...


Coroutines are hardly perfect. However, if you like them, I suggest
you investigate the Icon programming language, for instance.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 15 '05 #40
Magnus Wibeck <ma******************@telia.com> writes:
[...]
The Ariane software module that caused the problem was written in
Ada. http://sunnyday.mit.edu/accidents/Ar...entreport.html
Had it been written in C, the actual cause (integer overflow)
probably would not have caused an exception. I'm not saying that it
would have been better in C, but you *cannot* blame the C standard
for what happened there.


Nor can it be blamed on Ada. (Not that you did so, I just wanted to
clarify that point.)

The details are off-topic but easily Googlable.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 15 '05 #41
Chris Torek <no****@torek.net> writes:
[off-topic drift, but I cannot resist...]

In article <3n***********@individual.net>
Default User <de***********@yahoo.com> wrote:
You feel that my choice of moniker reflects something about my level
of expertise? Note that "Default User" is NOT the default name in
XanaNews, my current newsreader.


I always figured it meant that you are known for requiring a
"default:" label in every switch(). :-)


Or for using a "Default:" label, which is perfectly legal but not
terribly useful.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 15 '05 #42
Chris Hills wrote:
[..]
Magnus Wibeck wrote:
You surely cannot be comparing "3rd party support" from a commercial
company to a language standard?


why not?


I find such a comparison void of meaning. Like comparing apples and anxiety.
A language standard is a passive item that describes one specific thing.
You cannot pay it money to do what you want.
A support department in a commercial company (should) bend over backwards
to help its customers getting issues with the company's products sorted out.

I missed the point websnarf was making, which, if I understand it correctly,
is that the fact that there is (lots of) support for products that use C
somehow implies that C is an unsafe language.

If that is the point websnarf was trying to make, the comparison should
be the frequency of 3rd party support contacts made regarding C-caused
problems compared to other languages. Obviously those are numbers that are
darn near impossible to gather.

I'm not getting into the "C is unsafe" discussion, I just saw a few,
as I see it, flawed, deductions about C and the "unsafeness" of it,
and tried to address them.

/Magnus
Nov 15 '05 #43
Default User wrote:
Stephen Hildrey wrote:

Default User wrote:
I find it rude and obnoxious.


...says Usenet expert "Default User" :-)


You feel that my choice of moniker reflects something about my level
of expertise? Note that "Default User" is NOT the default name in
XanaNews, my current newsreader.

Brian


Nonsense. Among the several header lines of your message are..

From: "Default User" <de***********@yahoo.com>
Newsgroups: comp.lang.c
Subject: Re: Why doesn't strrstr() exist?
Date: 25 Aug 2005 18:11:29 GMT
Lines: 15
Message-ID: <3n***********@individual.net>
User-Agent: XanaNews/1.16.3.1

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---
Nov 15 '05 #44
Hallvard B Furuseth wrote:
Paul Hsieh writes:
Remember that almost every virus, buffer overflow exploit, core
dump/GPF/etc is basically due to some undefined situation in the
ANSI C standard. I consider the ANSI C standard committee
basically coauthors of every one of these problems.
So it's partly their fault? What should they have done -
refrained from standardizing the already existing C language?


The ANSI C standard was good enough for 1989, when the computing
industry was still in its growing stages. It served the basic purpose
of standardizing everyone behind a common standard.

It's the standards that came *AFTER* that where the problem is. The
problem of "buffer overflows" and similar issues was well documented
and even then was making the news. And look at the near universal
ambivalence to the C99 standard by compiler vendors. The point is that
the "coming together" has already been achieved -- the vendors have
already gotten the value out of a unified standard from the 1989
standard. The C99 standard doesn't solve any crucial problems of a
similar nature.

But suppose the C99 standard (or C94, or some future standard) included
numerous changes for the purposes of security, that broke backwards
compatibility. If there are vendors who are concerned about backward
compatibility they would just stick with the older standard (which is
what they are doing right now anyways) and if they felt security was
more important then they would move towards the new standard.

The point being that the real reason there has been so little C99
adoption is that there is little *value* in it. The foremost thing
it delivers is backwards compatibility -- but it's something the
compiler vendors *ALREADY HAVE* by sticking with the previous
standards.

Because C99 has so little value add over C89, there is no demand for
it. And it fundamentally means that the language really only solves the
same problems that it did in 1989. Even for me, restrict was really
the only language feature I was remotely interested in, and <stdint.h>
the only other thing in the standard that has any real value in it.
But since the vendors I use are not interested in implementing C99, I
have lived with "assume no aliasing" compiler switches and I have
fashioned my very own stdint.h. It turns out that, in practice, this
completely covers my C99 needs -- and I'm sure it solves most people's
C99 needs. And this is just using 1989 C compiler technology.
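For illustration, such a hand-rolled subset might look like the sketch below; the typedefs are assumptions about one particular target (8-bit char, 16-bit short, 32-bit int), not portable guarantees, and would have to be adjusted per platform:

/* my_stdint.h -- a hand-rolled subset of C99 <stdint.h> for a C89 compiler */
#ifndef MY_STDINT_H
#define MY_STDINT_H

typedef signed char    int8_t;
typedef unsigned char  uint8_t;
typedef short          int16_t;
typedef unsigned short uint16_t;
typedef int            int32_t;
typedef unsigned int   uint32_t;

#endif /* MY_STDINT_H */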

If the C99 standard had solved *important* problems that are plaguing
programmers today, then I think there would be more demand. You might
cause a fracture in the C community, but at least there would be some
degree of keeping up with the needs of the C community. And the reason
*WHY* such things should be solved in the C standard and not just in other
languages is because this is where the largest problems are, and where
the effect can be leveraged to the greatest degree.
That would not have helped: K&R C was already widely used, and
people were cooperating anyway to get some sort of portability out
of it.
Right. It would not have helped in 1989. But C99 doesn't help anyone
today. That's the key point.
Or should they have removed every undefined situation from the
language?
No, just the worst offenders.
[...] Bye bye free() and realloc() - require a garbage
collector instead. To catch all bad pointer usage, insert
type/range information in both pointers and data. Those two
changes alone in the standard would change the C runtime
implementation so much that it's practically another language.


Right. That's not what I am advocating.

Tell me, if you removed gets, strtok, and strn??, would you also have
practically another language?
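For illustration, a bounded stand-in for gets() is easy to sketch with fgets(); the name get_line() and its exact behaviour are made up for this example, not taken from any standard:

#include <stdio.h>
#include <string.h>

char *get_line(char *buf, size_t size, FILE *fp)
{
    if (size == 0 || fgets(buf, (int)size, fp) == NULL)
        return NULL;
    buf[strcspn(buf, "\n")] = '\0';   /* drop the trailing newline, if any */
    return buf;
}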

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Nov 15 '05 #45
Joe Wright <jw*****@comcast.net> writes:
Default User wrote:

[...]
You feel that my choice of moniker reflects something about my
level
of expertise? Note that "Default User" is NOT the default name in
XanaNews, my current newsreader.
Brian


Nonsense. Among the several header lines of your message are..

From: "Default User" <de***********@yahoo.com>
Newsgroups: comp.lang.c
Subject: Re: Why doesn't strrstr() exist?
Date: 25 Aug 2005 18:11:29 GMT
Lines: 15
Message-ID: <3n***********@individual.net>
User-Agent: XanaNews/1.16.3.1


And how does this imply that his statement is nonsense?

If Mr. and Mrs. User named their son Default, I'd expect his articles
to have very similar headers if he used the same server and
newsreader.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 15 '05 #46
we******@gmail.com wrote
(in article
<11**********************@o13g2000cwo.googlegroups .com>):
Randy Howard wrote:
Bad programming + good programming language does not allow for buffer
overflow exploits.


For suitably high-level languages that might be true (and
provable). Let us not forget that C is *not* a high-level
language. It's not an accident that it is called high-level
assembler.


Right. If you're not with us, you are with the terrorists.


Excuse me?
Why does being a low-level language mean you have to present a programming
interface surrounded by landmines?
If you have access to any sequence of opcodes available on the
target processor, how can it not be?
Exposing a sufficiently low level
interface may require that you expose some dangerous semantics, but why
expose them up front right in the most natural paths of usage?
Do you feel that 'gets()' is part of the most natural path in C?
I'd love for you to explain to us, by way of example, how you
could guarantee that assembly programmers can not be allowed to
code in a way that allows buffer overflows.


Ok, the halting problem means basically nobody guarantees anything
about computer programming.


Fair enough, but you're just dodging the underlying question.
But its interesting that you bring up the questions of assembly
language. If you peruse the x86 assembly USENET newsgroups, you will
see that many people are very interested in expanding the power and
syntax for assembly language (examples include HLA, RosAsm, and
others).
For a suitably generous definition of 'many', perhaps.
A recent post talked about writing a good string library for
assembly, and there was a strong endorsement for the length prefixed
style of strings, including one direct reference to Bstrlib as a design
worth following (not posted by me!).
I would have been shocked if you had not figured out a way to
bring your package up. :-)
So, while assembly clearly isn't an inherently safe language, it seems
quite possible that some assembly efforts will have a much safer (and
much faster) string interface than C does.
Which does absolutely nothing to prevent the possibility of
developing insecure software in assembler. It may offer some
advantages for string handling, but that closes at best only one
of a thousand doors.
[...] If you want to argue that too many people
write code in C when their skill level is more appropriate to a
language with more seatbelts, I won't disagree. The trick is
deciding who gets to make the rules.


But I'm not arguing that either. I am saying C is to a large degree
just capriciously and unnecessarily unsafe (and slow, and powerless,
and unportable etc., etc).


Slow? Yes, I keep forgetting how much better performance one
achieves when using Ruby or Python. Yeah, right.

Powerless? How so? It seems to be the only language other than
assembler which has been used successfully for operating system
development.

Unportable? You have got to be kidding. I must be
hallucinating when I see my C source compiled and executing on
Windows, Linux, NetWare, OS X, Solaris, *bsd, and a host of
other UNIX-like platforms, on x86, x86-64, PPC, Sparc, etc.
Ok, this is what I was talking about when I mentioned rose colored
glasses. If programmers are perfect, then what you are saying is fine,
because you can expect perfection. But real people are not. And I
think expectations of perfection in programming are really nonsensical.


/Exactly/ Expecting zero buffer overruns is nonsensical.


Well, not exactly. If you're not using C or C++, then buffer overflows
usually at worst lead to a runtime exception; in C or C++, exploits are
typically designed to gain shell access in the context of the erroneous
program. It's like honey for bees -- people attack C/C++ programs
because they have this weakness. In other safer programming languages,
even if you had a buffer overflow, allowing a control flow
zombification of the program is typically not going to be possible.


That is all true, and it does nothing to address the point that
C is still going to be used for a lot of development work. The
cost of the runtime error handling is nonzero. Sure, there are
a lot of applications today where they do not need the raw speed
and can afford to use something else. That is not always the
case. People are still writing a lot of inline assembly even
when approaching 4GHz clock speeds.
Anyway, a language so restrictive as to guarantee that nothing
can go wrong will probably never be used for any real-world
project.


How about a simpler language that is more powerful, demonstrably faster,
more portable (dictionary definition), obviously safer and still just
as low level?


That would be nice.
Just take the C standard, deprecate the garbage, replace
a few things, genericize some of the APIs, well define some of the
scenarios which are currently described as undefined, make some of the
ambiguous syntaxes that lead to undefined behavior illegal, and you're
immediately there.
I don't immediately see how this will be demonstrably faster,
but you are free to invent such a language tomorrow afternoon.
Do it, back up your claims, and no doubt the world will beat a
path to your website. Right? "D" is already taken, what will
you call it?
Your problem is that you assume making C safer (or faster, or more
portable, or whatever) will take something useful away from C that it
currently has. Think about that for a minute. How is it possible that
your mind can be in that state?
It isn't possible. What is possible is for you to make gross
assumptions about what 'my problem' is based up the post you are
replying to here. I do not assume that C can not be made safer.
What I said, since you seem to have missed it, is that the
authors of the C standard are not responsible for programmer
bugs.
So is the idea of a 'perfect language'.


But I was not advocating that. You want punishment -- so you
implicitly are *demanding* programmer perfection.


No, I am not. I do not demand that doctors are perfect, but I
expect them to be highly motivated to attempt to be perfect.
It's quite easy to simply make the use of gets() and friends
illegal for your code development. Most of us have already done
so, without a standard body telling us to do it.


So, estimate the time taken to absorb this information per programmer,
multiply it by the average wage of that programmer, multiply that by
the number of programmers that follow that and there you get the cost
of doing it correctly.


What cost? Some 'world-wide rolled-up cost'? For me, it cost
me almost nothing at all. I first discovered gets() was
problematic at least a decade ago, probably even earlier, but I
don't keep notes on such things. It hasn't cost me anything
since. If I hire a programmer, this has all been settled to my
satisfaction before they get an offer letter. It hasn't been a
problem and I do not expect it to be one in the future.
The standards body just needs to remove it and those costs go away.
They do not. As we have already seen, it takes years, if not
decades for a compiler supporting a standard to land in
programmer hands. With the stunningly poor adoption of C99, we
could not possibly hope to own or obtain an open source C0x
compiler prior to 2020-something, if ever. In the mean time,
those that are serious solved the problem years ago.
You don't think people who move code around with calls
to gets() in it should remove them?
Of course I do. In fact, I say so, which you conveniently
quoted just below...
Even so, anyone dusting off an old program that doesn't go
sifting through looking for the usual suspects is a fool.


And an old million line program?


Didn't /you/ just say that they should be removed?
I think this process should be
automated. In fact, I think it should be automated in your compiler.
In fact I think your compiler should just reject these nonsensical
functions out of hand and issue errors complaining about them.
Make up your mind. Fixing them in the the compiler, as I would
expect an 'automated' solution to do, and rejecting the
offending lines are completely different approaches.
Hey! I have an idea! Why not remove them from the standard?
Great idea. 15 years from now that will have some value.

A better idea. Patch gcc to bitch about them TODAY, regardless
of the standard.
I don't have a problem with taking gets() out of modern
compilers, but as you already pointed out, this doesn't
guarantee anything. People can still fire up an old compiler
and use it. I don't see a realistic way for the C standard to
enforce such things.


Interesting -- because I do. You make gets a reserved word, not
redefinable by the preprocessor, and have it always lead to a syntax
error.


What part of 'people can still fire up and old compiler' did you
fail to read and/or understand?
This has value because, developers can claim to be "C 2010 compliant"
or whatever, and this can tell you that you know it doesn't have gets()
or any other wart that you decided to get rid of.
They could also simply claim "we are smarter than the average
bear, and we know better to use any of the following offensive
legacy functions, such as gets(), ..."

To clarify, since it didn't soak in the first time, I am not
opposed to them being removed. I simply don't this as a magic
bullet, and certainly not in the sense that it takes far too
long for the compilers to catch up with it. I would much rather
see compilers modified to deny gets() and its ilk by default,
and require a special command line option to bypass it, /if at
all/. However, the warning message should be far more useful
than
gets.c: 325: error: gets() has been deprecated.

That's just oh so useful, especially to newbies. I wouldn't
care if it dumped a page and a half of explanation, along with a
detailed example of how to replace such calls with something
safer. After all, good code doesn't have such calls in it anyway, and
it won't annoy anyone that is competent.
This would in turn
put pressure of the legacy code owners to remove the offending calls,
in an effort that's certainly no worse than the Y2K issue (without the
looming deadline hanging over their heads).
If, and only if, they use a compiler with such changes. We
still see posts on a regular basic with people using old 16-bit
Borland compilers to write new software.
And what if its not the programmer's fault?


It is the fault of the development team, comprised of whoever
that involves for a given project. If the programmer feels like
his boss screwed him over, let him refuse to continue, swear out
an affidavit and have it notarized the bad software was
knowingly shipped, and that you refuse to endorse it.


Oh I see. So, which socialist totally unionized company do you work as
a programmer for? I'd like to apply!


I don't think you understood me. I know of no company that has
a policy for this. However, if I was working on something and
felt that something was being done that could be inherently
dangerous, and it was going to ship anyway, I would take some
form of legal action, if for no other reason than to be able to
disassociate myself from the impending lawsuits.

I would much rather go look for work than participate in
something that might wind up with people dying over the actions
of some meddling manager.
[...] If you
are being overworked, you can either keep doing it, or you can
quit, or you can convince your boss to lighten up.


Hmmm ... so you live in India?


Why would you think so?
I'm trying to guess where it is in this
day and age that you can just quit your job solely because you don't
like the pressures coming from management.
Where do you live? Because I am trying to guess where on the
planet you would /not/ have the right to quit your job.
Indentured servitude is not widely practiced anymore, AFAIK.
[...] ESPECIALLY in this case, the C standard folks are not to blame.


But if the same issue happens and you are using a safer language, the
same kinds of issues don't come up. Your code might be wrong, but it
won't allow buffer overflow exploits.


You can have 10 dozen other forms of security failure, that have
nothing to do with buffer overflows. It isn't a panacea. When
one form of attack is removed, another one shows up.

For example, the last straw the sent Microsoft windows off my
network for eternity happened recently. A computer system
running XP, SP2, all the patches, automatic Windows updates
daily, virus software with automatic updates and real-time
protection, email-virus scanning software, two different brands
of spyware protection, also with automatic updates enabled, and
both a hardware firewall and software firewall installed, got
covered up in viruses after 2 hours of letting my kids use it to
go play some stupid online kids game on disney.com or
nickelodeon.com (not sure which, since they went to both, and I
didn't want to replicate it). Suddenly, when I come back to
look at it, it has 3 or 4 new taskbar icons showing downloads in
progress of I know not what, task manager shows a bunch of extra
processes that shouldn't be there, the registry run keys are
stuffed full of malware, and it's pushing stuff out the network
to I know not where. I pull the cable, start trying to delete
files (which Windows wants to tell me I don't have permission to
do), start scanning; the browser cache directories are filled with .exe
and .dll files; it's out of control.

A few expletives later, and I was installing a new Linux distro
that I had been meaning to try out for a while.

I had done just about everything I could imagine to lock the
system down, and it still got out of control in 2 hours letting
a 12-yr-old browse a website and play some games.

Of course, if enough people do the same thing, the bad guys will
figure out how to do this on Linux boxes as well. But for now,
the OS X and Linux systems have been causing me (and the kids)
zero pain and I'm loving it.
Try and force me to write something in a way that I know is
wrong. Go ahead, it'll be a short argument, because I will
resign first.


That's a nice bubble you live in. Or is it just in your mind?


No, I'm just not a spineless jellyfish. It's rather
disappointing that it surprises you, it doesn't say much for
your own backbone that you would just roll over when faced with
this sort of thing.
We expect architects, doctors, lawyers, pretty much all
other real 'professions' to meet and typically exceed a higher
standard, and those that do not are punished, fined, or stripped
of their license to practice in the field. Why should
programmers get a pass? Is it because you do not feel it is a
professional position?


Because its not as structured, and that's simply not practical.
Doctors have training, internships, etc. Lawyers have to pass a bar
exam, etc. There's no such analogue for computer programmers.


Thank you. You get it now. That is exactly what is missing.
Because the most successful programmers are always ones that are
able to think outside the box,
Then they should have zero problems passing a rigorous training
program and examinations.
but the bar for average programmers is pretty low --
Fine. If you don't have your cert, you can be a 'nurse', you
can write scripts, or use uber-safe languages certified for
those not willing to prove themselves worthy through formal
certification.
but both can make a contribution, and neither can guarantee
perfect code.
And no doctor can guarantee that you won't die on the operating
table. But, they have to prove that they are competent anyway,
despite the lack of a guarantee of perfection. Would you like
it if they didn't have to do so?
Programmers are generally not aware of the liability of
their mistakes.


Then those you refer to must be generally incompetent.


Dennis Ritchie had no idea that NASA would put a priority inversion in
their pathfinder code.


Are you implying that Dennis Ritchie is responsible for some bad
code in the pathfinder project?
Linus Torvalds had no idea that the NSA would
take his code and use it for a security based platform.
Is there any evidence that the NSA chose his code because it was
not worth fooling with? What is your point? Oh, you're going
to tell us...
My point is
that programmers don't know what the liability of their code is,
because they are not always in control of when or where or for what it
might be used.
Wow, that is tortured at best. Presumably Ritchie is in your
list because of C or UNIX? How could he be 'liable' for an
application or driver written by somebody else 30 years later?

Are the contributors to gcc responsible for every bad piece of
software compiled with it?

If someone writes a denial-of-service attack program that sits
on a Linux host, is that Torvald's fault? I've heard of people
trying to shift blame before, but not that far. Maybe you might
want to blame Linus' parents too, since if they hadn't conceived
him, Linux wouldn't be around for evil programmers to write code
upon. Furrfu.
The recent JPEG parsing buffer overflow exploit, for example, came from
failed sample code from the JPEG website itself. You think we should
hunt down Tom Lane and lynch him?
Nope. If you take sample code and don't investigate it fully
before putting it into production use, that's /your/ problem.
You think a doctor would take a sample of medicine he found
laying on a shelf in 7-11 and administer it to a patient in the
hopes that it would work? Downloading source off the web and
using it without reading and understanding it is similarly
irresponsible, although with perhaps less chance (although no
guarantee) of it killing someone.
I highly doubt that. Low-level language programmers would be
the cream of the crop, not 'the lowest bidder' as is the case
today.


You still don't get it. You, I or anyone you know, will produce errors
if pushed. There's no such thing as a 0 error rate for programming.


Then I do get it, because I agree with you. Let me know when I
can write a device driver in Python.
Just measuring first time compile error rates, myself, I score roughly
one syntax error per 300 lines of code. I take this as an indicator
for the likely number of hidden bugs I just don't know about in my
code. Unless my first-compile error rate was 0, I just can't have any
confidence that my hidden bug rate is 0 either.
Strange logic, or lack thereof. Having no first-compile errors
doesn't provide ANY confidence that you don't have hidden bugs.
Go measure your own first-compile error rate and tell me you are
confident in your own ability to avoid hidden bugs.
That would be pointless, since measuring first-compile error
rate proves zilch about overall bug rates. If you want to avoid
hidden bugs, you have to actively look for them, test for them,
and code explicitly to avoid them, regardless of how often your
compiler detects a problem.
If you still think you can achieve a 0 or near 0 hidden bug rate, [snip, no sense following a false premise]
For a nuclear reactor, I would also include the requirement that they
use a safer programming language like Ada. Personally I would be
shocked to know that *ANY* nuclear reactor control mechanism was
written in C. Maybe a low level I/O driver library, that was
thoroughly vetted (because you probably can't do that in Ada), but
that's it.
Well gee, there you have it. It seems that there are some
places were C is almost unavoidable. What a shock. Who's
wearing those rose-colored glasses now?
[...] For
example, there are operations that have very low success rates,
yet there are doctors that specialize in them anyway, despite
the low odds.


Well, your analogy only makes some sense if you are talking about
surgeons in developing countries who simply don't have access to the
necessary anesthetic, support staff or even the proper education to do
the operation correctly. In those cases, there is little choice, so
you make do with what you have. But obviously it's a situation you just
want to move away from -- the way you solve it is you give them
access to safer and better ways to practice medicine.


You seem to ignore the /fact/ that even in the finest medical
facilities on the planet (argue where they are elsewhere) there
are medical operations that have very low success rates, yet
they are still attempted, usually because the alternative is
certain death. A 20% chance is better than zero.
If you don't want to take the risk, then go write in visual
whatever#.net and leave it to those that are.


So you want some people to stay away from C because the language is too
dangerous.


So are chainsaws, but I don't want chainsaws to be illegal, they
come in handy. So are steak knives, and despite them being illegal
on airplanes, leaving us stuck with plastic 'sporks' instead, I still
like them when cutting into a t-bone. You cannot eliminate all
risk.

Do you really think you can do anything to a language that
allows you to touch hardware that will prevent people from
misusing it? Not all development work is for use inside a VM or
other sandbox.
While I want the language to be fixed so that most people
don't trigger the landmines in the language so easily.


I am not opposed to the language removing provably faulty
interfaces, but I do not want its capabilities removed in other
ways. Even so, there is no likelihood of any short-term
benefits, due to the propagation delay of standard changes into
compilers, and no proof that it will even be beneficial
longer-term.

It would probably be a better idea for you to finish your
completely new "better C compiler" (keeping to your string
library naming) and make it so popular that C withers on the
vine. It's been so successful for you already, replacing all
those evil null-terminated strings all over the globe, I quiver
in anticipation of your next earth-shattering achievement.

--
Randy Howard (2reply remove FOOBAR)

Nov 15 '05 #47
we******@gmail.com wrote
(in article
<11**********************@g47g2000cwa.googlegroups .com>):
For a nuclear reactor, I would also include the requirement that they
use a safer programming language like Ada.
The Ariane software module that caused the problem was written in Ada.
http://sunnyday.mit.edu/accidents/Ar...entreport.html
Had it been written in C, the actual cause (integer overflow) probably would
not have caused an exception. I'm not saying that it would have been better
in C, but you *cannot* blame the C standard for what happened there.


You are right, I cannot blame C for bugs that happen in other
languages. This is the most famous one from Ada.


You just got done telling me that Ada would avoid problems.
See, the thing is, with Ada bugs, you can clearly blame the programmer
for most kinds of failures.
Oh my, SURELY the Ada standard should not allow such things to
happen. Those thoughtless bastards, how could this be? ;-)
Also, this "priority inversion" you speak of - doesn't that imply processes
or threads? C does not have that AFAIK. So you cannot blame the C standard
for allowing priority inversion bugs to occurr. It neither allows or
disallows them, because C has no notion of priorities.


The programmer used priority based threading because that's what he had
available to him.


He used something that does not even exist in standard C, and
got bit in the ass. Gee, and to think that you want to hold the
standard committee (and judging by another post of yours Ritchie
himself) responsible when people do things like this. Wow.
Let's read on and see what sort of hole you choose to dig...
Suppose, however, that C had implemented co-routines
Suppose that you hadn't blamed standard C for something not
written in standard C. That one had a much higher chance of
being true until just recently.
Maybe the Pathfinder code would have more
coroutines, and fewer threads, and may have avoided the problem
altogether (I am not privy to their source, so I really don't know).
That didn't stop you from blaming it on standard C, why stop
now?
Coroutines are one of those "perfect compromises", because you can
easily specify a portable interface, that is very likely to be widely
supportable, they are actually tremendously faster than threading in
many cases, and all without adding *any* undefined behavior or
implementation defined behavior scenarios (other than a potential
inability to allocate new stacks.)
How strange that they are so wildly popular, whereas threads are
never used. *cough*
Full-blown multithreading, such as
in POSIX, is notoriously platform-specific, and it should not surprise
anyone that only a few non-UNIX platforms support full-blown POSIX
threads.
That's interesting, because I have used the pthreads interfaces
for code on Windows (pthreads-win32), Linux, OS X, solaris, and
even Novell NetWare (libc, since they started supporting them
several years ago). I didn't realize they didn't work, because
for some strange reason, they do work for me. Maybe I'm just
lucky, or maybe you're too fond of spouting off about things you
have 'heard' but don't actually know to be true.

Have there been bugs in pthread libraries? Yes. Have there
been bugs in almost every library ever used in software
development? Yes. Were they impossible to fix? No.
This fact has been noticed and adopted by those languages
where serious development is happening (Lua, Perl, Python). I don't
know if the C standards committee would be open to this -- I highly
doubt it.


Feel free to propose a complete coroutine implementation.
--
Randy Howard (2reply remove FOOBAR)

Nov 15 '05 #48
In <news:43***************@null.net>, Douglas A. Gwyn wrote:
Antoine Leca wrote:

Not that I see any use for strrstr(), except perhaps to do the same
as strrchr() when c happens to be a multibyte character in a
stateless encoding.


Even then it's problematic, because the search would not respect
alignment with boundaries between character encodings.


Good point, you are quite right, and this is an often overlooked problem.
It will only work with self-synchronizing encodings (UTF-8 comes to mind,
but the only others I know of are using SS2/SS3, the single shifts,
_without_ using LSx/SI/SO, the locking shifts, and they are NOT very common
;-)).
Quite narrow application for a general library function.
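For illustration, a sketch of what self-synchronization buys you: in valid UTF-8, continuation bytes always match 10xxxxxx, so from any byte you can back up to the start of the character containing it (this assumes the input really is valid UTF-8):

const char *utf8_char_start(const char *s, const char *p)
{
    while (p > s && ((unsigned char)*p & 0xC0) == 0x80)
        p--;                /* skip backwards over continuation bytes */
    return p;
}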
Antoine

Nov 15 '05 #49
In <news:11********************@g49g2000cwa.googlegro ups.com>,
Paul Hsieh wrote:
Remember that almost every virus, buffer overflow exploit, core
dump/GPF/etc is basically due to some undefined situation in the ANSI
C standard.


<OT>
The worst exploit I've seen so far was because a library dealing with
Unicode was not checking for malformed, overlong UTF-8 sequences, and
allowed walking through the filesystem, including in places where web users
are not supposed to go. AFAIK, the library is written in C++ (it could
equally been written in C, that won't change the point.)
And the exploit was successful because some key directories had bad default
permissions as factory setup.

Another quite successful one was based on a broken API for address books;
the API can be accessed from (not strictly conforming) C code, but that is
not how it is usually used. And the way the API is accessed through C
purposely avoids possible buffer overflows.

The most sticky virus I had to deal with was a bootsector virus. PC
bootsectors are not known to be written in C, rather in assembly language.
Granted, all these behaviours are _not_ defined by the ANSI C standard.
</OT>

Just because C is very much used, it will statistically show up
more often in exploit, core dump, or GPF cases. This only shows it is a
successful language; there might be reasons for that; in fact, the ANSI C
Standard is a big reason for its prolonged life as a successful (= widely
used) language: I mean, had it not happened, C would probably be superseded
nowadays (same for FIV then F77; or x86/PC.)
Antoine

Nov 15 '05 #50
