Why doesn't strrstr() exist?

(Followups set to comp.std.c. Apologies if the crosspost is unwelcome.)

strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
isn't part of the standard. Why not?
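
For concreteness, here is one way strrstr() could be written on top of
strstr(). The semantics are a guess at the analogy, since there is no
standard specification to follow:

    #include <string.h>

    /* A guess at what strrstr() would mean, by analogy with strrchr():
       a pointer to the *last* occurrence of needle in haystack,
       or NULL if there is none. */
    char *strrstr(const char *haystack, const char *needle)
    {
        const char *p, *last = NULL;

        if (*needle == '\0')
            return (char *)haystack; /* mirror strstr()'s empty-needle case */
        for (p = haystack; (p = strstr(p, needle)) != NULL; p++)
            last = p;
        return (char *)last;
    }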

--
Christopher Benson-Manica | I *should* know what I'm talking about - if I
ataru(at)cyberspace.org | don't, I need to know. Flames welcome.
Nov 15 '05
"Douglas A. Gwyn" <DA****@null.net> writes:
Keith Thompson wrote:
Which says, in 7.19.7.7:
Because gets does not check for buffer overrun, it is generally
unsafe to use when its input is not under the programmer's
control. This has caused some to question whether it should
appear in the Standard at all. The Committee decided that gets
was useful and convenient in those special circumstances when the
programmer does have adequate control over the input, and as
longstanding existing practice, it needed a standard
specification. In general, however, the preferred function is
fgets (see 7.19.7.2).
Personally, I think the Committee blew it on this one. I've never
heard of a real-world case where a program's input is under
sufficiently tight control that gets() can be used safely.


You must have a lack of imagination -- there are a great many
cases where one is coding for a small app where the programmer
himself has complete control over all data that the app will
encounter. Note that that is not at all the same environment
as arbitrary or interactive input, where of course lack of
proper validation of input would be a BUG.


This has nothing to do with my imagination. I can imagine obscure
cases where gets() might be used safely. I said that I've never heard
of a real-world case.
... As far as I know, the
"longstanding existing practice" cited in the Rationale is the
*unsafe* use of gets(), not the hypothetical safe use.


No, it's simply the existing use of gets as part of the stdio
library, regardless of judgment about safety. As such, it was
part of the package for which the C standard was expected to
provide specification.


The majority of the existing use of gets() is unsafe.
I just found 13 calls to gets() in the source code for a large
software package implemented in C (which I prefer not to identify).
They were all in small test programs, not in production code, and they
all used buffers large enough that an interactive user is not likely
to overflow them -- but that's no excuse for writing unsafe code.


If in fact the test programs cannot overflow their buffers
with any of the test data provided, they are perforce safe
enough. In fact that's exactly the kind of situation where
gets has traditionally been used and thus needs to exist,
with a portable interface spec, in order to minimize porting
expense.


The test data provided is whatever the user types at the keyboard.
The programs in question used huge buffers that are *probably* big
enough to hold whatever the user types -- as opposed to reasonable
sized buffers that *cannot* overflow if the programs used fgets()
instead.
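
For comparison, a minimal sketch of the fgets()-based replacement;
the newline trimming is the one convenience gets() performed:

    #include <stdio.h>
    #include <string.h>

    /* Read one line into buf (capacity n) and strip the trailing
       newline, if any. Returns buf, or NULL on EOF or error. */
    static char *get_line(char *buf, size_t n, FILE *fp)
    {
        if (fgets(buf, (int)n, fp) == NULL)
            return NULL;
        buf[strcspn(buf, "\n")] = '\0';
        return buf;
    }

An over-long line is then truncated instead of stomping on memory
beyond the end of the buffer.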

Another point: gets() cannot be used safely in portable code. The
safe use of gets() requires strict control over where a program's
stdin comes from. There's no way to do that in standard C. If I
wanted to control a program's input, I'd be more likely to specify the
name of an input file, which means gets() can't be used anyway.

Perhaps gets() should be relegated to some system-specific library.

The Committee was willing to remove implicit int from the language.
There was widespread existing use of this feature, much of it
perfectly safe. I happen to agree with that decision, but given the
willingness to make that kind of change, I see no excuse for leaving
gets() in the standard.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 15 '05 #101
In article <ln************@nuthaus.mib.org>,
Keith Thompson <ks***@mib.org> wrote:
Another point: gets() cannot be used safely in portable code. The
safe use of gets() requires strict control over where a program's
stdin comes from. There's no way to do that in standard C. If I
wanted to control a program's input, I'd be more likely to specify the
name of an input file, which means gets() can't be used anyway.


fseek() to the beginning of stdin. If that fails then your input
is not a file so use some alternative method or take some failure
mode. If the fseek() succeeds, then you know that you can
examine the input, find the longest line, malloc() a buffer big
enough to hold that, then fseek() back and gets() using that buffer.

Sure, it's not pretty, but it's portable ;-)
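
Spelled out, the trick might look something like this (a sketch only;
as a later reply points out, there is a race if the file changes
between the measuring pass and the gets() call):

    #include <stdio.h>
    #include <stdlib.h>

    /* Probe whether stdin is seekable, measure its longest line, and
       return a buffer that gets() "cannot" overflow -- assuming the
       file does not change underneath us in the meantime. */
    static char *make_gets_buffer(void)
    {
        long maxlen = 0, len = 0;
        int c;
        char *buf;

        if (fseek(stdin, 0L, SEEK_SET) != 0)
            return NULL;            /* not seekable: fall back to fgets() */

        while ((c = getchar()) != EOF) {
            if (c == '\n') {
                if (len > maxlen) maxlen = len;
                len = 0;
            } else {
                len++;
            }
        }
        if (len > maxlen) maxlen = len; /* last line may lack a newline */

        buf = malloc((size_t)maxlen + 1);
        if (buf != NULL && fseek(stdin, 0L, SEEK_SET) != 0) {
            free(buf);
            buf = NULL;
        }
        return buf;
    }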
--
Oh, to be a Blobel!
Nov 15 '05 #102
Dennis Ritchie wrote
(in article <df********@netnews.net.lucent.com>):
About my attitude to gets(), this was dredged from google.
Conversations repeat; there are about 78 things in
this "The fate of gets" thread.
Dennis Ritchie Nov 9 1999, 8:00

Newsgroups: comp.std.c
> From: Dennis Ritchie <d...@bell-labs.com> Date: 1999/11/09
Subject: Re: The fate of gets

Clive D.W. Feather wrote:
..... If most implementers will ship gets() anyway,
there's little practical effect to eliminating it from the Standard.

On the other hand, we removed it from our library about a week
after the Internet worm. Of course, some couldn't afford
to do that.


Dennis


If Dennis Ritchie thinks it's safe to remove it, who are the ISO
C standard body to think they should leave it in?

:-)
--
Randy Howard (2reply remove FOOBAR)

Nov 15 '05 #103
Keith Thompson wrote:
The majority of the existing use of gets() is unsafe.
The majority of existing programs are incorrect. That doesn't
mean that there is no point in having standards for the elements
of the programming language/environment.
The test data provided is whatever the user types at the keyboard.
The programs in question used huge buffers that are *probably* big
enough to hold whatever the user types -- as opposed to reasonable
sized buffers that *cannot* overflow if the programs used fgets()
instead.


Presumably the tester understands the limitations and does not
obtain any advantage by violating them.

The theory that mere replacement of gets by fgets (with the
addition of newline trimming) will magically make a program
"safe" is quite flawed. There are around a dozen details
that need to be taken care of for truly safe and effective
input validation, and if the programmer is using gets in
such a context, he is most unlikely to have dealt with any
of these matters. Putting it another way: gets is not a
problem for the competent programmer, and lack of gets
wouldn't appreciably help the incompetent programmer.
Nov 15 '05 #104
Wojtek Lerch wrote
(in article <et********************@rogers.com>):
"Randy Howard" <ra*********@FOOverizonBAR.net> wrote in message
news:00*****************************@news.verizon.net...
Wojtek Lerch wrote
(in article <2d********************@rogers.com>):
"Randy Howard" <ra*********@FOOverizonBAR.net> wrote in message
news:00*****************************@news.verizon.net...
Indeed. It doesn't exactly make the point very clearly, or
pointedly. Somehow "generally unsafe" doesn't seem strong
enough to me.

Sure. Whatever. I don't think a lot of programmers learn C from the
Standard or the Rationale anyway.
Unfortunately, some of them don't listen to anything not nailed
down though. The typical freshly-minted know-it-all response is
"Who are you to tell me not to use it? The ISO C standards body
put it in there for a reason. If it was bad, it wouldn't be in
an international standard. duh."


Well, *then* you can point them to the Rationale and explain what it means
by "generally unsafe".


That's true, but it would be better, and not harm anyone, if
they took out gets() completely from the main body, and moved it
back to section J with asm() and such, so that if some vendor
feels like they absolutely /must/ leave it in, they can do so,
but not have it a requirement that conforming compilers continue
to ship such garbage.

That would be a far more convincing story to the newbies too.
"They took it out of C0x because it was too dangerous. Even
though we don't have access to a C0x compiler yet, it still
makes sense to be as cautious as the standard, does it not?"
You could even try to explain why it was
standardized even though it was known to be unsafe, and why a lot of people
disagree with that decision.
I understand why it was standardized a couple decades ago. What
I do not understand is why it is still in the standard. I have
heard the arguments for leaving it in, and they have not been
credible to me.
A good teacher can take advantage of this kind of stuff.
That's true. It would still be better for it not to be an issue
at all.
Anyway, think of all the unsafe things they'll have to learn not to do
before they become competent programmers. Pretty much all of them are more
difficult to avoid than gets().
Also true, and all the better reason not to waste time on items
that could be avoided without any time spent on them at all,
leaving time to focus on what really is hard to accomplish.
I haven't said anything about how well I think they're doing their job. I'm
sure there are a lot of bad teachers and bad handbooks around. But I doubt
banning gets() would make it significantly easier for their victims to
become competent programmers.


Even if it didn't make it any easier (which I cannot judge either
way, with no data on it), it would not be a hardship for
conforming compilers produced in this century to not provide
gets(). It's not just the students at issue here, the many,
many bugs extant due to it are more important by far, with or
without new programmers to worry about.

--
Randy Howard (2reply remove FOOBAR)

Nov 15 '05 #105
Walter Roberson wrote:
fseek() to the beginning of stdin. If that fails then your input
is not a file so use some alternative method or take some failure
mode. If the fseek() succeeds, then you know that you can
examine the input, find the longest line, malloc() a buffer big
enough to hold that, then fseek() back and gets() using that buffer.

Sure, it's not pretty, but it's portable ;-)


It's also insecure. Just think about what happens if the file size
changes in between examining the input and calling gets() -- boom,
you lose. In the security world, this is known as a time-of-check to
time-of-use (TOCTTOU) bug.

gets() is a loaded gun, helpfully pre-aimed for you at your own foot.
Maybe you can do some fancy dancing and avoid getting shot, but that
doesn't make it a good idea.
Nov 15 '05 #106
In <news:00*****************************@news.verizon.net>,
Randy Howard wrote:
Dennis Ritchie Nov 9 1999, 8:00

Subject: Re: The fate of gets

On the other hand, we removed it from our library about a week
after the Internet worm. Of course, some couldn't afford
to do that.


If Dennis Ritchie thinks it's safe to remove it, who are the ISO
C standard body to think they should leave it in?


Do not misread: Mr Ritchie did not say he thought it was safe to remove it;
he noted:

- that they (Bell Labs) removed it [on 1988-11-09]

- that they removed it because it was involved in a critical failure of the
system [i.e., it was not "safe to remove it", rather the system was safe*r*
without it]

- that some others implementers couldn't do the same

The third observation is a paraphrase of Clive Feather's point: "little
practical effect." <news:AL**************@romana.davros.org>
Antoine
PS: This does not mean I endorse not having done it. I believe the real
reason was the lack of an opportunistic proposal to nuke it.

Nov 15 '05 #107
Keith Thompson wrote:
Another point: gets() cannot be used safely in portable code. The
safe use of gets() requires strict control over where a program's
stdin comes from. There's no way to do that in standard C. If I
wanted to control a program's input, I'd be more likely to specify the
name of an input file, which means gets() can't be used anyway.

Walter Roberson wrote:
fseek() to the beginning of stdin. If that fails then your input
is not a file so use some alternative method or take some failure
mode. If the fseek() succeeds, then you know that you can
examine the input, find the longest line, malloc() a buffer big
enough to hold that, then fseek() back and gets() using that buffer.

Sure, it's not pretty, but it's portable ;-)


The reason I don't write code like this is that it doesn't work for
piped input. Since a fair number of my programs are designed along
the Unix philosophy of filters (i.e., read input from anywhere,
including redirected output from another program, and write output
that can be conveniently redirected to other programs), I don't
bother coding two forms of input if I can help it.

Which means that I use fgets() instead of gets(), and simply assume
a reasonably large maximum line size. I imagine I'm not alone in
using this approach, which is also safe and portable.

-drt

Nov 15 '05 #108
Douglas A. Gwyn wrote:
The theory that mere replacement of gets by fgets (with the
addition of newline trimming) will magically make a program
"safe" is quite flawed. There are around a dozen details
that need to be taken care of for truly safe and effective
input validation, and if the programmer is using gets in
such a context, he is most unlikely to have dealt with any
of these matters.

Putting it another way: gets is not a
problem for the competent programmer, and lack of gets
wouldn't appreciably help the incompetent programmer.


More to the point, eliminating gets() from ISO C will not affect
incompetent programmers one whit, because those programmers don't
read the standard, nor do they abide by anything it recommends.
You can't legislate good programming.

Also, eliminating *anything* from std C will not force compiler
and library vendors to remove them from their implementations.
Their customers include a large number of incompetent programmers,
who will insist that good old C functions be available, the
consequences be damned.

Nov 15 '05 #109
David R Tribble wrote:
Also, eliminating *anything* from std C will not force compiler
and library vendors to remove them from their implementations.
Their customers include a large number of incompetent programmers,
who will insist that good old C functions be available, the
consequences be damned.


Also competent programmers who would be justly annoyed
when their small test programs would no longer build.

To repeat a point I've made before: The idea that
incorrect programming can be corrected by small changes
in library function interfaces is so far wrong as to be
outright dangerous.
Nov 15 '05 #110
Randy Howard wrote:
If Dennis Ritchie thinks it's safe to remove it, who are the ISO
C standard body to think they should leave it in?


As he said, some can't afford to do that. So long as such
a venerable function is still being provided by vendors to
meet customer requirements, it is useful to have a published
interface spec for it. Just because there is a spec, or there is
a function in some library, doesn't mean you have to use it
if it doesn't meet your requirements.

Some seem to have a misconception about the functions of
standardization. It is literally *impossible* to standardize
correct programming, and in all its ramifications C has
always left that concern up to the programmer, not the
compiler. Many programmers have found it useful to develop
or obtain additional tools to help them produce better
software, "lint" being one of the earliest. You might
consider using "grep '[^f]gets'" as a tool that meets your
particular concern.

The more you go on about program correctness being the
responsibility of those tasked with publishing specs for
legacy functions, the more you divert programmer attention
from where the real correctness and safety issues lie.
Nov 15 '05 #111
Randy Howard wrote:
Also true, and all the better reason not to waste time on items
that could be avoided without any time spent on them at all,
leaving time to focus on what really is hard to accomplish.


If anybody teaches programming and fails to mention the
danger of overrunning a buffer, he is contributing to
the very problem that you decry. gets is useful in
simple examples to help students understand the issue,
and indeed has the kind of interface that a naive
programmer is likely to invent for his own functions
unless he has learned this particular lesson.
Nov 15 '05 #112
"Douglas A. Gwyn" <DA****@null.net> writes:
David R Tribble wrote:
Also, eliminating *anything* from std C will not force compiler
and library vendors from removing them from their implementations.
Their customers include a large number of incompetent programmers,
who will insist that good old C functions be available, the
consequences be damned.
Also competent programmers who would be justly annoyed
when their small test programs would no longer build.


What about all the small test programs that used implicit int? If
that kind of change wasn't acceptable, why the great concern about
breaking programs that use gets()?
To repeat a point I've made before: The idea that
incorrect programming can be corrected by small changes
in library function interfaces is so far wrong as to be
outright dangerous.


I don't believe anybody has suggested that removing gets() would solve
a huge number of problems. It would solve only one.

The language would be better without gets() than with it.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 15 '05 #113
Keith Thompson <ks***@mib.org> writes:
[...]
What about all the small test programs that used implicit int? If
that kind of change wasn't acceptable, why the great concern about
breaking programs that use gets()?


Whoops, I meant "If that kind of change *was* acceptable".

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 15 '05 #114
Wojtek Lerch wrote:
"Randy Howard" <ra*********@FOOverizonBAR.net> wrote:
Wojtek Lerch wrote
(in article <e3*******************@news20.bellglobal.com>):
Randy Howard wrote:
Okay, where can I obtain the Rationale Document that warns
programmers not to use gets()?

http://www.open-std.org/jtc1/sc22/wg...onaleV5.10.pdf
Indeed. It doesn't exactly make the point very clearly, or
pointedly. Somehow "generally unsafe" doesn't seem strong
enough to me.


Suggesting that there might be some scenario where it can be used
safely actually makes it sound worse to me. They are basically
demanding platform-specific support to make this function safe -- and
of course we *know* that you also require environmental and application
specific support, in OS'es that support stdin redirection. But of
course they specify nothing; just referring to it as some nebulous
possibility worth saving the function for.
Sure. Whatever. I don't think a lot of programmers learn C from the
Standard or the Rationale anyway. It should be the job of teachers and
handbooks to make sure that beginners realize that it's not a good idea to
use gets(), or to divide by zero, or to cause integer overflow.
Uh ... excuse me, but dividing by zero has a well-defined meaning in IEEE
754, and there's nothing intrinsically wrong with it (for most
numerators you'll get inf or -inf, or otherwise a NaN). Integer
overflow is also extremely well defined, and actually quite useful on
2s complement machines (you can do a range check with a subtract and
unsigned compare with one branch, rather than two branches.)
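
The trick being described, as a sketch (done in unsigned arithmetic,
where the wraparound is well defined even in strict ISO C):

    /* One-branch range test: lo <= x && x <= hi collapses into a
       single unsigned compare. Conversion of a negative int to
       unsigned is defined modulo 2^N, which makes the wraparound
       reliable. */
    static int in_range(int x, int lo, int hi)
    {
        return (unsigned)x - (unsigned)lo <= (unsigned)hi - (unsigned)lo;
    }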
On the other hand, I don't think it would be unreasonable for the Standard
to officially declare gets() as obsolescent in the "Future library
directions" chapter.


And what do you think the chances of that are? The committee is
clearly way beyond the point of incompetence on this matter. Had they
done that in 1989, we could understand its not being removed until
now. But they actually continue to endorse it, and will continue
to do so in the future.

It's one thing to make a mistake and recognize it (a la Ritchie). It's
quite another to be shown what a mistake it is and continue to prefer
the mistake to the most obvious fix.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Nov 15 '05 #115
ku****@wizard.net wrote:
we******@gmail.com wrote:
Randy Howard wrote:
we******@gmail.com wrote:
> Randy Howard wrote: ...
"False dichotomy". Look it up. I never mentioned high or low level
language, and don't consider it relevant to the discussion. Its a
false dichotomoy because you immediately dismiss the possibility of a
safe low-level language.
No, it's not an immediate dismissal.


It is, and you simply continue to propagate it.
[...] It's also not a dichotomy: low-level languages are inherently
unsafe, [...]
No. They may contain unsafe ways of using them. This says nothing
about the possibility of safe paths of usage.
but high-level languages are not inherently safe.
Empty and irrelevant (and not really true; at least not relatively.)
If it's low-level, by definition it gives you
unprotected access to dangerous features of the machine
you're writing for.
So how does gets() or strtok() fit in this? Neither provides any low
level functionality that isn't available in better ways through
alternate means that are clearly safer (without being slower.)
[...] If it protected your access to those features, that
protection (regardless of what form it takes) would make it a
high-level language.
So you are saying C becomes a high level language as soon as you start
using something like Bstrlib (or Vstr, for example)? Are you saying
that Microsoft's init segment protection, or their built-in debugger
features, or heap checking makes them a high level language?
...
C gives you access to a sequence of opcodes in ways that other
languages do not? What exactly are you saying here? I don't
understand.
Yes, you can access things more directly in C than in other higher
level languages. That's what makes them higher-level languages.


Notice that doesn't coincide with what you've said above. But it does
coincide with the false dichotomy. The low-levelledness in and of itself
is not what makes it unsafe -- this just changes the severity of the
failures.
[...] One of
the most dangerous features of C is that it has pointers, which is a
concept only one layer of abstraction removed from the concept of
machine addresses. Most of the "safer" high level languages provide
little or no access to machine addresses; that's part of what makes
them safer.


Ada has pointers.
I am dodging the false dichotomy. Yes. You are suggesting that making
C safer is equivalent to removing buffer overflows from assembly. The
two have nothing to do with each other.


You can't remove buffer overflows from C without moving it at least a
little bit farther away from assembly, for precisely the same reason
why you can't remove buffer overflows from assembly without making it
less of an assembly language.


Having some unsafe paths of usage is not what makes a language unsafe.
People don't think of Java as an unsafe language because you can write
race conditions in it (you technically cannot do that in pure ISO C.)
What matters is what is exposed for the most common usage in the
language.
As I recall this was just a point about low level languages adopting
safer interfaces. Though in this case, the performance improvements
probably drive their interest in it.
>> [...] If you want to argue that too many people
>> write code in C when their skill level is more appropriate to a
>> language with more seatbelts, I won't disagree. The trick is
>> deciding who gets to make the rules.
>
> But I'm not arguing that either. I am saying C is to a large degree
> just capriciously and unnecessarily unsafe (and slow, and powerless,
> and unportable etc., etc).

Slow? Yes, I keep forgetting how much better performance one
achieves when using Ruby or Python. Yeah, right.


I never put those languages up as alternatives for speed. The false
dichotomy yet again.


A more useful response would have been to identify these
safer-and-speedier-than-C languages that you're referring to.


Why? Because you assert that C represents the highest performing
language in existence?

It's well known that Fortran beats C for numerical applications. Also,
if you take into account that assembly doesn't specify intrinsically
unsafe usages of buffers (like including a gets() function) you could
consider assembly safer and definitely faster than C.

Python uses GMP (which has lots of assembly language in it, that
basically give it a 4x performance improvement over what is even
theoretically possible with the C language standard) to do its big
integer math. That means for certain big integer operations (think
crypto), Python just runs faster than what can be done in pure C.

But that's all beside the point. I modify my own C usage to beat its
performance by many times on a regular basis (dropping to assembly,
making 2s complement assumptions, unsafe casts between integer types
and pointers etc), and obviously use safe libraries (for strings,
vectors, hashes, an enhanced heap, and so on) that are well beyond the
safety features of C. In all these cases some simple modifications to
the C standard and C library would make my modifications basically
irrelevant.
Unportable? You have got to be kidding. I must be
hallucinating when I see my C source compiled and executing on
Windows, Linux, NetWare, OS X, Solaris, *bsd, and a host of
other UNIX-like platforms, on x86, x86-64, PPC, Sparc, etc.


Right. Because you write every piece of C code that's ever been
written right?


His comment says nothing to suggest that he's ported any specific
number of programs to those platforms. It could be a single program, it
could be a million. Why are you interpreting his claim as suggesting
that he ported many different programs to those platforms?


God, what is wrong with you people? He makes an utterly unfounded
statement about portability that's not worth arguing about. I make the
obvious stab to indicate that that argument should be nipped in the
bud, but you just latch onto it anyways.

Making code portable in C requires a lot of discipline, and in truth a
lot of testing (especially on numerics; it's just a lot harder than you
might think). It's discipline that in the real world basically nobody
has. Randy is asserting that C is portable because *HE* writes C code
that is portable. And that's ridiculous, and needs little comment on
it.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Nov 15 '05 #116
we******@gmail.com writes:
Wojtek Lerch wrote: [...]
Sure. Whatever. I don't think a lot of programmers learn C from the
Standard or the Rationale anyway. It should be the job of teachers and
handbooks to make sure that beginners realize that it's not a good idea to
use gets(), or to divide by zero, or to cause integer overflow.


Uh ... excuse me, but dividing by zero has a well-defined meaning in IEEE
754, and there's nothing intrinsically wrong with it (for most
numerators you'll get inf or -inf, or otherwise a NaN).


But C does not, and cannot, require IEEE 754. Machines that don't
implement IEEE 754 are becoming rarer, but they still exist, and
C should continue to support them.

C99 does have optional support for IEEE 754 (Annex F) -- but I
wouldn't say that dividing by zero is a good idea.
Integer
overflow is also extremely well defined, and actually quite useful on
2s complement machines (you can do a range check with a subtract and
unsigned compare with one branch, rather than two branches.)


C does not require two's complement. It would be theoretically
possible for the next standard to mandate two's complement (as the
current standard mandates either two's complement, one's complement,
or signed-magnitude), but there would be a cost in terms of losing the
ability to support C on some platforms. Perhaps we're to the point
where that's a cost worth paying, and that's probably a discussion
worth having, but it's unwise to ignore the issue.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 15 '05 #117
we******@gmail.com writes:
ku****@wizard.net wrote:

[...]
If it's low-level, by definition it gives you
unprotected access to dangerous features of the machine
you're writing for.


So how does gets() or strtok() fit in this? Neither provides any low
level functionality that isn't available in better ways through
alternate means that are clearly safer (without being slower.)


I wouldn't put strtok() in the same category as gets(). strtok() is
ugly, but if it operates on a local copy of the string you want to
tokenize *and* if you're careful about not using it on two strings
simultaneously, it can be used safely. If I were designing a new
library I wouldn't include strtok(), but it's not dangerous enough to
require dropping it from the standard.
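
That careful usage, sketched out -- a local, writable copy, and only
one tokenization in flight at a time:

    #include <stdio.h>
    #include <string.h>

    /* strtok() modifies its argument and keeps hidden state, so work
       on a local copy and finish one string before starting another. */
    static void print_words(const char *s)
    {
        char copy[256];
        char *tok;

        if (strlen(s) >= sizeof copy)
            return;                 /* too long for this sketch */
        strcpy(copy, s);

        for (tok = strtok(copy, " \t"); tok != NULL;
             tok = strtok(NULL, " \t"))
            printf("<%s>\n", tok);
    }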

[...]
[...] One of
the most dangerous features of C is that it has pointers, which is a
concept only one layer of abstraction removed from the concept of
machine addresses. Most of the "safer" high level languages provide
little or no access to machine addresses; that's part of what makes
them safer.


Ada has pointers.


Ada has pointers (it calls them access types), but it doesn't have
pointer arithmetic, at least not in the core language -- and you can
do a lot more in Ada without explicit use of pointers than you can in
C. If one were to design a safer version of C (trying desperately to
keep this topical), one might want to consider providing built-in
features for some of the things that C uses pointers for, such as
passing arguments by reference and array indexing.

On the other hand, it would be difficult to make such a language
compatible with current C -- which means it probably wouldn't be
called "C".

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 15 '05 #118
Randy Howard wrote:
we******@gmail.com wrote:
Why does being a low language mean you have to present a programming
interface surrounded by landmines?

If you have access to any sequence of opcodes available on the
target processor, how can it not be?
C gives you access to a sequence of opcodes in ways that other
languages do not? What exactly are you saying here? I don't
understand.


asm( character-string-literal ); springs to mind. I do not
believe all languages have such abilities.


Ok, but this is an escape mode to a different programming environment.
Nobody expects to garner a lot of safety when you do things like that.
People who use that are clearly walking *into* the minefield. That's
not what I am talking about. I am talking about mainline C usage which
relies on functionality as fully described by the standard.

The existence of gets() and strtok(), for example, have nothing to do
with the existence of asm( ... ); (or __asm { ... } as it appears in my
compilers.)
[...] Having that kind of
capability alone, nevermind pointers and all of the subtle and
not so subtle tricks you can do with them in C makes it capable
of low-level work, like OS internals. There are lots of
landmines there, as you are probably already aware.
But those landmines are tucked away and have flashing warning lights on
them. There are unsafe usages that you clearly *know* are unsafe,
because it's obviously the thing they're doing for you.
Exposing a sufficiently low level
interface may require that you expose some danergous semantics, but why
expose them up front right in the most natural paths of usage?

Do you feel that 'gets()' is part of the most natural path in C?


Yes of course! When people learn a new language they learn what it
*CAN* do before they learn what it should not do. It means anyone that
learns C first learns to use gets() before they learn not to use
gets().


Strange, it has been years since I have picked up a book on C
that uses gets(), even in the first few chapters. I have seen a
few that mention it, snidely, and warn against it though.

The man page for gets() on this system has the following to say:
SECURITY CONSIDERATIONS
The gets() function cannot be used securely. Because of its
lack of bounds checking, and the inability for the calling
program to reliably determine the length of the next incoming
line, the use of this function enables malicious users to
arbitrarily change a running program's functionality through a
buffer overflow attack. It is strongly suggested that the
fgets() function be used in all cases.

[end of man page]

I don't know about you, but I suspect the phrase "cannot be used
securely" might slow quite a few people down.


It will slow nobody down who uses WATCOM C/C++:

"It is recommended that fgets be used instead of gets because data
beyond the array buf will be destroyed if a new-line character is not
read from the input stream stdin before the end of the array buf is
reached."

And it will confuse MSVC users:

"Security Note Because there is no way to limit the number of
characters read by gets, untrusted input can easily cause buffer
overruns. Use fgets instead."

Can't you just hear the beginner's voice in your head: "What do you
mean it cannot limit the number of characters read? I declared my
buffer with a specific limit! Besides, my users are very trustworthy."

In 1989 this is what I wish all the documentation said:

"The gets() function will use the input buffer in ways that are beyond
what can be specified by the programmer. Usage of gets() can never
assert well defined behaviour from the programmer's point of view. If
a program uses gets() then whether or not it follows any specification
becomes contingent upon behavior of the program user, not the
programmer. Please note that program users generally are not exposed
to program declarations or any other source code while the program is
running, nor do their methods of input assist them to follow any method
for inputting data."

Now the only thing I want the document to say is:

"Usage of gets() will remove all of the programmers files."

Think about it. The only people left today that are using gets() need
their files erased.
[...] It would be even
better if they showed an example of proper use of fgets(), but I
think all man pages for programming interfaces would be improved
by doing that.
You are suggesting that making C safer is equivalent to removing
buffer overflows from assembly. The two have nothing to do with each
other.
Not equivalent, but difficult.


That they are *as* difficult you mean? Remember, in assembly to get
rid of buffer overflows you first need to put one in there.
[...] Both languages are very powerful
in terms of what they will 'allow' the programmer to attempt.
There is little or no hand-holding. If you step off the edge,
you get your head chopped off. It's not like you can make some
simple little tweak and take that property away, without
removing a lot of the capabilities overall. Yes, taking gets()
completely out of libc (and its equivalents) would be a good
start, but it wouldn't put a dent in the ability of programmers
to make many more mistakes, also of a serious nature with the
language.

Just as I can appreciate the differences between a squirt gun
and a Robar SR-90, I can appreciate the differences between
Python and C, or any other 'safer' language and assembler.
Then you are appreciating the wrong thing. Python, Java, Perl, Lua
etc, make programming *easier*. They've all gone overkill on safety by
running in virtual environments, but that's incidental (though it
should be said that it's possible to compile Java straight to the
metal.) Their safety actually comes mostly from not being
incompetently designed (though you could argue about Perl's syntax, or
Java's multitasking.)

Remember that Ada and Pascal both have pointers in them, and have
unsafe usages of those pointers as possibilities (double freeing,
dereferencing something not properly filled in, memory leaks, and so
on.) Do you thus think of them as low level languages as well? If so,
or if not, what do you think of them in terms of safety? (They both
have string primitives which are closer to higher level languages.)

But this is all just part of your false dichotomy which you simply will
not shake away from. Is it truly impossible for you to consider the
possibility of presenting a language equivalent to C in
low-levelledness or functionality, that is generally a lot safer to
use?
I would have been shocked if you had not figured out a way to
bring your package up. :-)


Oh by the way there is a new version! It incorporates a new secure
non-data-leaking input function!


You mean it wasn't secure from day one? tsk, tsk. That C stuff
sure is tricky. :-)


It was not a bug. Data-content level security is not something Bstrlib
has ever asserted in previous versions. It recently occurred to me
that that was really the only missing feature to make Bstrlib suitable
for security based applications (for secret data, hash/encryption
buffers, passwords and so on, I mean.) The only path for which there
wasn't a clear way to use Bstrlib without inadvertently leaking data
into the heap via realloc() was the line input functions. So I added a
secure line input, and the picture is complete.
Which does absolutely nothing to prevent the possibility of
developing insecure software in assembler. It may offer some
advantages for string handling, but that closes at best only one
of a thousand doors.


You mean it closes the most obvious and well trodden thousand doors out
of a million doors.


Both work out to .001. Hmmm.


Ignoring "well trodden" of course.

Assembly is not something you put restrictions on. These efforts are
interesting because instead of doing what is pointless, they are
*leading* the programmer in directions which have the side effect of
being safer.

Think about it. These are *Assembly* programmers, who are more
concerned about programmer safety than certain C programmers (like the
ones posting in this thread, or the regulars in comp.std.c).
Assembly is not a real application development language no matter how
you slice it.


I hope the HLA people don't hear you saying that. They might
get riotous.


Oh I'm quivering.
So I would be loath to make any point about whether or
not you should expect applications to become safer because they are
writing them in assembly language using Bstrlib-like philosophies. But
maybe those guys would beg to differ -- who knows.


Yes.
As I recall this was just a point about low level languages adopting
safer interfaces. Though in this case, the performance improvements
probably drive their interest in it.


Exactly. C has performance benefits that drive interest in it
as well.


No -- *INCORRECT PERCEPTIONS* about performance have driven the C
design. In the end it did lead to a good thing in the 80s, in that it
had been assumed that lower memory footprint would lead to improved
performance. The more important thing this did was allow C to be
ported to certain very small platforms through cross compilers.

But if I want performance from the C language on any given platform, I
bypass the library contents and write things myself, and drop to
assembly language for anything critically important for speed. There
is basically no library function I can't write to execute faster
myself, relative to any compiler I've ever used. (Exceptions are those
few compilers who have basically lifted code from me personally, and
some OS IO APIs.) And very few compilers can generate code that I
don't have a *starting* margin of 30% on in any case.

So I cannot agree that C was designed with any *real* performance in
mind.

And the case of strings is completely laughable. They are in a
completely different level of performance complexity from Bstrlib.
See:

http://bstring.sf.net/features.html#benchmarks
[...] If there was a language that would generate faster
code (without resorting to hand-tuned assembly), people would be
using it for OS internals.
Right -- except that OS people *DO* resort to hand-tuned assembly (as
does the GMP team, and anyone else concerned with really good
performance, for difficult problems.) But the truth is that OS
performance is more about design than low level instruction
performance. OS performance bottlenecks are usually IO concerns.
Maybe thread load balancing. But in either case, low-level coding is
not the concern. You can do just as well in C++, for example.
I don't think it should have been used for some things, like
taking what should be a simple shell script and making a binary
out of it (for copyright/copy protection purposes) like is done
so often. Many of the tiny binaries from a C compiler on a lot
of systems could be replaced with simple scripts with little or
no loss of performance. But, somebody wanted to hide their
work, or charge for it, and don't like scripting languages for
that reason.
Java, Python and Lua compile to byte code. That's a silly argument.
[...] People even sell tools to mangle interpreted
languages to help with this.
(Including C, so I don't see your point.)
[...] That is not the fault of the C
standard body (as you originally implied, and lest we forget
what led me down this path with you), but the use of C for
things that it really isn't best suited. For many simple
problems, and indeed some complicated ones, C is not the best
answer, yet it is the one chosen anyway.
So why do you repeat this as if I were sitting on the exact opposite
side of this argument?
But I'm not arguing that either. I am saying C is to a large degree
just capriciously and unnecessarily unsafe (and slow, and powerless,
and unportable etc., etc).

Slow? Yes, I keep forgetting how much better performance one
achieves when using Ruby or Python. Yeah, right.


I never put those languages up as alternatives for speed. The false
dichotomy yet again.


Then enlighten us. I am familiar with Fortran for a narrow
class of problems of course, and I am also familiar with its
declining use even in those areas.


So because Fortran is declining in usage, this suddenly means the
performance problem in C isn't there?

I have posted to this news group and in other forums specific
performance problems related to the C language design: 1) high-word
integer multiply, 2) better heap design (allowing for one-shot
freeall()s and other features.) And of course the whole string design
fiasco.

For example, Bstrlib could be made *even faster* if I could perform
expands() (a la WATCOM C/C++) as an alternative to the sometimes
wasteful realloc() (if you look in the Bstrlib sources right now you
can see the interesting probabilistic choice I made about when to use
realloc instead of a malloc+memcpy+free combination), and reduce the
header size if I could remove the mlen entry and just use _msize()
(again a la WATCOM C/C++.) (Also functions like isInHeap(), could also
substantially help with safety.)

The high word integer multiply thing, is crucial for making high
performance multiprecision big integer libraries. There is simply no
way around it. Without it, your performance will suck. Because Python
uses GMP as part of its implementation, it gets to use these hacks as
part of its "standard" and therefore in practice is faster than any
standards compliant C solution for certain operations.
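
To make the complaint concrete: here is what portable C has to do to
recover the high word of a 64x64 multiply, an operation most CPUs
produce in one instruction (and which gcc's unsigned __int128
extension yields directly, where that extension exists):

    #include <stdint.h>

    /* High 64 bits of a 64x64 -> 128 multiply, synthesized from
       32-bit halves with manual carry propagation. */
    static uint64_t mulhi64(uint64_t a, uint64_t b)
    {
        uint64_t alo = a & 0xFFFFFFFFu, ahi = a >> 32;
        uint64_t blo = b & 0xFFFFFFFFu, bhi = b >> 32;
        uint64_t t   = ahi * blo + ((alo * blo) >> 32);
        uint64_t u   = alo * bhi + (t & 0xFFFFFFFFu);
        return ahi * bhi + (t >> 32) + (u >> 32);
    }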

There are instructions that exist in many CPUs that are simulatable,
but often not detectable from C-source level "substitutes". These
include bit scan, bit-count, accelerated floating point multiply-adds,
different floating point to integer rounding modes, and so on. In all
cases it is easy to write C code to perform each, meaning it's easy to
emulate them; however, it's not so easy to detect the fact that any
equivalent C code can be squashed down to the one assembly instruction
that the CPU has that does the whole thing in one shot.

Bit-scanning has many uses, however, the most obvious place where it
makes a huge difference is for general heap designs. Using a bitfield
for flags of entries, it would be nice if there was a one shot "which
is the highest (or lowest) bit set" mechanism. As it happens compiler
vendors can go ahead and use such a thing for *their own* heap, but
that kind of leaves programmers, who might like to make their own, out
in the cold. Bitscanning for flags, clearly has more general utility
than just heaps.
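
As a sketch of the primitive being asked for -- portable C has to loop
(or use a table), while many CPUs do it in one instruction (x86 BSR)
and e.g. gcc exposes it as __builtin_clz():

    /* Index of the highest set bit, or -1 for zero: the "one shot"
       operation the paragraph above wants as a heap primitive. */
    static int highest_bit(unsigned long v)
    {
        int i = -1;
        while (v != 0) {
            v >>= 1;
            i++;
        }
        return i;
    }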

Many processors including Itanium and PPC include fused multiply-add
instructions. They are clearly not equivalent to separate multiply
then add instructions; however, obviously their advantage for sheer
performance reasons makes them compelling. They can accelerate linear
algebra calculations, where Fortran is notoriously good, in cases where
accuracy, or bit reproducibility across platforms is not as important
as performance.
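
For what it's worth, C99 did add this one: fma() in <math.h> computes
x*y + z with a single rounding, mapping to the fused instruction where
the hardware provides it. A small demonstration of the difference:

    #include <float.h>
    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double x = 1.0 + DBL_EPSILON, y = 1.0 - DBL_EPSILON;
        /* The exact product is 1 - eps*eps; fma() keeps that tiny
           term through its single rounding, while the plain
           expression loses it when x * y is rounded to 1.0. */
        printf("fma:     %g\n", fma(x, y, -1.0)); /* about -4.93e-32 */
        printf("mul+add: %g\n", x * y - 1.0);     /* 0 */
        return 0;
    }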

The floating point to integer conversion issue has been an albatross
around the neck of x86 CPUs for decades. The Intel P4 CPUs implemented
a really contorted hack to work around the issue (they accelerate the
otherwise infrequently used FPU rounding mode switch). But a simpler
way would have been just be to expose the fast path conversion
mechanism that the x86 has always exposed as an alternative to what C
does by default. Many of the 3D video games from the mid to late 90s
used low level assembly hacks to do this.
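
For the record, C99's answer here is lrint()/llrint() in <math.h>,
which convert using the current rounding mode rather than the
truncation a cast demands, precisely so the CPU's fast path can be
used:

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double d = 2.7;
        /* (long)d must truncate toward zero, historically forcing x86
           compilers into the FPU mode-switch dance; lrint() uses the
           current rounding mode and can compile to one instruction. */
        printf("cast:  %ld\n", (long)d);  /* 2 */
        printf("lrint: %ld\n", lrint(d)); /* 3 under round-to-nearest */
        return 0;
    }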
Powerless? How so?


No introspection capabilities. I cannot write truly general
autogenerated code from the preprocessor, so I don't get even the most
basic "fake introspection" that's should otherwise be so trivial to do.
No coroutines (Lua and Python have them) -- which truly closes doors
for certain kinds of programming (think parsers, simple incremental
chess program legal move generators, and so on). Multiple heaps with
a freeall(), so that you can write "garbage-collection style" programs,
without incurring the cost of garbage collection -- again there are
real applications where this kind of thing is *really* useful.
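
A toy sketch of the facility meant here (the names arena_alloc and
arena_freeall are hypothetical, not an existing API):

    #include <stdlib.h>

    struct arena {
        char  *base;
        size_t size, used;
    };

    static int arena_init(struct arena *a, size_t size)
    {
        a->base = malloc(size);
        a->size = size;
        a->used = 0;
        return a->base != NULL;
    }

    /* Bump-pointer allocation: no per-block header, no free list. */
    static void *arena_alloc(struct arena *a, size_t n)
    {
        void *p;
        n = (n + 15) & ~(size_t)15;   /* keep 16-byte alignment */
        if (n > a->size - a->used)
            return NULL;
        p = a->base + a->used;
        a->used += n;
        return p;
    }

    /* The one-shot freeall(): every allocation in the heap is
       released in O(1), giving "garbage-collection style" lifetimes
       without a collector. */
    static void arena_freeall(struct arena *a)
    {
        a->used = 0;
    }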


Then by all means use alternatives for those problem types.


Once again with the false dichotomy. What if C is still the best
solution for me *AND* I want those capabilities?
[...] As
I said a way up there, C is not the best answer for everything,
But sometimes it is. And it still sucks, for no really good reason.
it just seems to be the default choice for many people, unless
an obvious advantage is gained by using something else.
What will it take for you to see past this false dichotomy?
[...] It seems to be the only language other than
assembler which has been used successfully for operating system
development.


The power I am talking about is power to program. Not the power to
access the OS.


So we agree on this much then?


But I don't see you agreeing with me on this point. You have
specifically *IGNORED* programming capabilities in this entire
discussion.

This is your false dichotomy. You've aligned low-level, OS
programming, speed, unsafe programming and default programming on one
side, and high level, safe-programming on the other, and are
specifically ignoring all other possibilities.

I will agree with you that that is what I am talking about, if that's what
you meant.
Unportable? You have got to be kidding. I must be
hallucinating when I see my C source compiled and executing on
Windows, Linux, NetWare, OS X, Solaris, *bsd, and a host of
other UNIX-like platforms, on x86, x86-64, PPC, Sparc, etc.


Right. Because you write every piece of C code that's ever been
written right?


Thankfully, no. The point, which I am sure you realize, is that
C can, and often is used for portable programs.


It's *MORE* often used for *NON*-portable programming. Seriously,
besides the Unix tools?
[...] Can it be used
(in non-standard form most of the time btw), for writing
inherently unportable programs? Of course. For example, I
could absolutely insist upon the existence of certain entries in
/proc for my program to run. That might be useful for a certain
utility that only makes sense on a platform that includes those
entries, but it would make very little sense to look for them in
a general purpose program, yet there are people that do that
sort of silly thing every day. I do not blame Ritchie or the C
standards bodies for that problem.
That is all true, and it does nothing to address the point that
C is still going to be used for a lot of development work. The
cost of the runtime error handling is nonzero. Sure, there are
a lot of applications today where they do not need the raw speed
and can afford to use something else. That is not always the
case. People are still writing a lot of inline assembly even
when approaching 4GHz clock speeds.
Ok, first of all runtime error handling is not the only path.


Quite. I wasn't trying to enumerate every possible reason that
C would continue to be used despite its 'danger'.
Just take the C standard, deprecate the garbage, replace
a few things, genericize some of the APIs, well define some of the
scenarios which are currently described as undefined, make some of the
ambiguous syntaxes that lead to undefined behavior illegal, and you're
immediately there.

I don't immediately see how this will be demonstrably faster,
but you are free to invent such a language tomorrow afternoon.


Well just a demonstration candidate, we could take the C standard, add
in Bstrlib, remove the C string functions listed in the bsafe.c module,
remove gets and you are done (actually you could just remove the C
string functions listed as redundant in the documentation).


What you propose is in some ways very similar to the MISRA-C
effort,


No it's not. And this is a convenient way of dismissing it.
[...] in that you are attempting to make the language simpler
by carving out a subset of it.
What? Bstrlib actually *ADDS* a lot of functionality. It doesn't take
anything away except for usage of the optional bsafe.c module.
Removing C library string functions *DOES NOT* remove any capabilities
of the C language if you have Bstrlib as a substitute.

MISRA-C is a completely different thing. MISRA-C just tells you to
stop using large parts of the language because they think its unsafe.
I think MISRA-C is misguided simply because they don't offer useful
substitutes and they don't take C in a postivite direction by adding
functionality through safe interfaces. They also made a lot of silly
choices that make no sense to me.
[...] It's different in that you also
add some new functionality.
As well as a new interface to the same *OLD* functionality. It sounds
like you don't understand Bstrlib.
[...] I don't wish to argue any more
about whether MISRA was good or bad, but I think the comparison
is somewhat appropriate. You could write a tome, entitled
something like "HSIEH-2005, A method of providing more secure
applications in a restricted variant of C"
What restrictions are you talking about? You mean things like "don't
use gets"? You call that a restriction?
[...] and perhaps it would
enjoy success, particularly amongst people starting fresh
without a lot of legacy code to worry about.
You don't understand Bstrlib. Bstrlib works perfectly well in legacy
code environments. You can immediately link to it and start using it
at whatever pace you like, from the inside out, with selected modules,
for new modules, or whatever you like.
[...] Expecting the
entire C community to come on board would be about as naive as
expecting everyone to adopt MISRA. It's just not going to
happen, regardless of any real or perceived benefits.
Well, there's likely some truth in this. I can't convince *everyone*.
Neither can the ANSI/ISO C committee. (Of course, I have convinced
*some* people.) What is your point?
Do it, back up your claims, and no doubt the world will beat a
path to your website. Right?


Uhh ... actually no. People like my Bstrlib because it's *safe* and
*powerful*. They tend not to notice or realize they are getting a
major performance boost for free as well (they *would* notice if it was
slower, of course). But my optimization and low level web pages
actually do have quite a bit of traffic -- a lot more than my pages
critical of apple or microsoft, for example.


So you are already enjoying some success then in getting your
message across.


Well some -- it's kind of hard to get people excited about a string
library. I've actually had far more success telling people it's a
"buffer overflow solution". My web pages have been around for ages --
some compiler vendors have taken some of my suggestions to heart.
It's not hard to beat compiler performance, even based fundamentally on
weakness in the standard (I have a web page practically dedicated to
doing just that; it also gets a lot of traffic). But by itself, that's
insufficient to gain enough interest in building a language for
everyday use that people would be interested in.


Indeed.
[...] "D" is already taken, what will you call it?

How about "C"?


Well, all you need to do is get elected ISO Dictator, and all
your problems will be solved. :-)


I need less than that. All that is needed is accountability for the
ANSI C committee.
Your problem is that you assume making C safer (or faster, or more
portable, or whatever) will take something useful away from C that it
currently has. Think about that for a minute. How is it possible that
your mind can be in that state?

It isn't possible. What is possible is for you to make gross
assumptions about what 'my problem' is based up the post you are
replying to here. I do not assume that C can not be made safer.
What I said, since you seem to have missed it, is that the
authors of the C standard are not responsible for programmer
bugs.


Ok, well then we have an honest point of disagreement then. I firmly
believe that the current scourge of bugs that lead to CERT advisories
will not ever be solved unless people abandon the current C and C++
languages.


Probably a bit strongly worded, but I agree to a point. About
90% of those using C and C++ today should probably be using
alternative languages.


False dichotomy ...
[...] About 20% of them should probably be
working at McDonald's, but that's an argument for a different
day, and certainly a different newsgroup.
I would just point out that 90 + 20 > 100. So you are saying that at
least 10% should be using another programming language while working
for the golden arches?
I think there is great consensus on this. The reason why I
blame the ANSI C committee is because, although they are active, they
are completely blind to this problem, and haven't given one iota of
consideration to it.


I suspect they have considered it a great deal,


That is utter nonsense. They *added* strncpy/strncat to the standard.
Just think about that for a minute. They *ADDED* those function
*INTO* the standard.

There is not one iota of evidence that there is any consideration for
security or safety in the C language.

And our friends in the Pacific Northwest? The most belligerent
programmers in the world? They've committed to securing their products
and operating system even if it means breaking some backwards
compatibility, which it has. (The rest of us, of course, look in horror
and say to ourselves "What? You mean you weren't doing that before?!?!
You mean it isn't just because you suck?") The ANSI/ISO C committee
is *not* measuring up to their standards.
[...] and yet not
provided any overt action that you or I would appreciate. They
are much concerned (we might easily argue 'too much') with the
notion of not breaking old code. Where I might diverge with
that position is on failing to recognize that a lot of 'old
code' is 'broken old code' and not worth protecting.
Their problem is that they think *NEW* standards have to protect old
code. I don't understand what prevents older code from using older
standards, and just staying where they are?

Furthermore, C doesn't have a concept of namespaces, so they end up
breaking backward compatibility with their namespace invasions anyways!
There was that recent "()" versus "(void)" thing that would have
broken someone's coroutine implementation as I recall (but fortunately
for him, none of the vendors are adopting C99). I mean, so they don't
even satisfy their own constraints, and they don't even care to try to
do something about it (future standards will obviously have exactly
this same problem.)
Even though they clearly are in the *best*
position to do something about it.


I actually disagree on this one, but they do have a lot of power
in the area, or did, until C99 flopped.


But *WHY* did C99 flop? All the vendors were quick to say "Oh yes
we'll be supporting C99!" but look at the follow-through! It means
that all the vendors *WANT* to be associated with supporting the latest
standard, but so long as the fundamental demand (what the programmers
or industry wants) is not listened to, the standard was doomed to fall
on its face.

Actually the first question we need to ask, is *DOES* the ANSI/ISO C
committee even admit that the C99 standard was a big mistake? From
some of the discussion on comp.std.c it sounds like they are just going
to plough ahead to the next standard, under the false
assumption that C99 is something that they can actually build upon.
[...] I think the gcc/libc
crowd could put out a x++ that simply eradicates gets(). That
should yield some immediate improvements.
Ok, but they didn't. They were gun shy and limited their approach to
a link-time warning. And their audience is only a partial audience of
programmers. Notice that gcc's alloca() and nested functions have not
raised eyebrows with other C compiler vendors or with
programmers.

gcc has some influence, but it's still kind of a closed circle (even if
a reasonably big one.) Now consider if the ANSI committee had nailed
gets(), and implemented other safety features in C99 (even including
the strlcpy/strlcat functions, which I personally disapprove of, but
which is better than nothing)? Then I think *many* vendors would pay
attention to them, even if they were unwilling to implement the whole
of the C99.
[...] In fact, having a
compiler flag to simply squawk loudly every time it encounters
it would be of benefit. Since a lot of people are now using gcc
even on Windows systems (since MS isn't active in updating the C
side of their C/C++ product), it might do a lot of good, far
sooner, by decades, than a change in the standard.
Well, I've got to disagree. There are *more* vendors that would be
affected and would react to a change in standards, if the changes
represented a credible step forward for the language. Even with C99,
we have *partial* gcc support and partial Intel support. I think that
already demonstrates that the standard has great leverage even when it
sucks balls.
And it's them and only them -- the
only alternative is to abandon C (and C++) which is a very painful and
expensive solution; but you can see that people are doing exactly that.
Not a lot of Java in those CERT advisories.


That's good. The more people move to alternate languages, the
more people will have to realize that security bugs can appear
in almost any language. Tons of poorly written C code currently
represents the low-hanging fruit for the bad guys.


It's not just low-hanging fruit. It's a very particular kind of
low-hanging fruit: unusually easy to exploit, and exploitable in almost
the same way every time. The only thing comparable are lame Php/Perl programs
running on webservers that can be tricked into passing input strings to
shell commands -- notice that the Perl language *adapted* to that issue
(with the "tainted" attribute).
And that it won't cost the next generation of programmers,
or anyone else who learns C for the first time?


Provided that they learn it early on, and /not/ after they ship
version 1.0 of their 'next killer app', it won't be that bad.


And you don't perceive these conditions as a cost?
Given that it shouldn't be taught at all to new programmers
today (and I am in favor of pelting anyone recommending it today
with garbage), I suspect it will be eradicated for all practical
purposes soon.
Well, more specifically, new programmers are not learning C or C++.
The standards body just needs to remove it and those costs go away.

They do not. As we have already seen, it takes years, if not
decades for a compiler supporting a standard to land in
programmer hands. With the stunningly poor adoption of C99, we
could not possibly hope to own or obtain an open source C0x
compiler prior to 2020-something, if ever. In the meantime,
those that are serious solved the problem years ago.


C99 is not being adopted because there is no *demand* from the users or
development houses for it. If the standard had been less dramatic,
and solved more real world problems, like safety, for example, I am
sure that this would not be the case.


Do I think C99 was for many people of no tangible value, or
enough improvement to justify changing compilers, related tools
and programmer behavior? Unfortunately, yes. It was a lot of
change, but little meat on the bones.

However, there was also the problem that C89/90 did for many
people exactly what they expected from the language, and for a
significant sub-group of the population, "whatever gcc adds as
an extension" had become more important than what ISO had to say
on the matter. The stalling out of gcc moving toward C99
adoption (due to conflicts between the two) is ample support for
that claim.


Ok, I'm sorry, but I just don't buy your "gcc is everything" claim.
You also ignore the fact that
the C++ folks typically pick up the changes in the C standard for their
own. So the effect of the standard actually *is* eventually
propagated.


Here I disagree. C and C++ are not closely related anymore.


Tell this to Bjarne Stroustrup. I did not make that comment idly. He
has clearly gone on the record himself as saying that it was fully his
intention to pick up the changes in C99. (He in fact may not be doing
so, solely because some of the C99 features are clearly in
direct conflict with C++ -- however it's clear he will pick up things
like restrict, and probably the clever struct initialization, and
stdint.h.)
[...] It
takes far longer to enumerate all the differences that affect
both than it does to point out the similarities. Further, I
care not about C++, finding that there is almost nothing about
C++ that can not be done a better way with a different language.
C is still better than any reasonable alternative for a set of
programming tasks that matter to me, one in which C++ doesn't
even enter the picture. That is my personal opinion of course,
others may differ and they are welcome to it.
Once again, you do not write every piece of code in the known universe.
Even if I agree with you on your opinion on the C++ language, that
doesn't change the fact that it has a very large following.
The fact that it would take a long time for a gets() removal in the
standard to be propagated to compilers, I do not find to be a credible
argument.


Why not? If the compiler doesn't bitch about it, where are all
of those newbie programmers you are concerned about going to
learn it? Surely not from books, because books /already/ warn
about gets(), and that doesn't seem to be working. If they
don't read, and it's not in the compiler, where is this benefit
going to appear?
Also note that C89 had very fast adoption. It took a long time for
near perfect and pervasive adoption, but you had most vendors more than
90% of the way there within a very few years.


Because it was very similar to existing practice, and a smaller
language standard overall. Far less work. Frankly, I have had
/one/ occasion where something from C99 would have made life
easier for me, on a single project.


Really? I've used my own stdint.h in practically every C file I've
written since I created it. Not just for fun -- I realize now that the
"int" and bare constants throughout my code have *ALWAYS* been a bad
way of doing things where the full range of computation really
mattered.

I'll agree that most of C99 is totally irrelevant. But there are a few
key things in there that are worthwhile.
Do you think there will be fewer programmers *after* this 15
year mark than there have been before it?


Nope, but I think it will be 15 years too late, and even if it does
come, and the gets() removal is part of it, which assumes facts
not in evidence, that there will STILL be a lot of people using
C89/90 instead. I would much rather see it show up in compilers
with the next minor update, rather than waiting for C05, which
will still have the barrier of implementing the ugly bits of
C99, which the gcc crowd seems quite loath to do.
A better idea. Patch gcc to bitch about them TODAY, regardless
of the standard.


The GNU linker already does this. But it's perceived as
a warning. People do not always listen to warnings.
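
(For reference, a minimal sketch of how that link-time warning is
wired up, assuming an ELF target and GNU binutils -- the variable name
here is invented, and glibc wraps the same trick in a macro of its own:)

    /* Sketch, assuming GNU ld + ELF: a string placed in a section named
       ".gnu.warning.SYMBOL" makes the linker print it as a warning any
       time SYMBOL is referenced by the program being linked. */
    static const char gets_warning[]
        __attribute__((used, section(".gnu.warning.gets"))) =
        "the `gets' function is dangerous and should not be used.";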


So make it email spam to the universe pronouncing "Someone at
foobar.com is using gets()!! Avoid their products!!!" instead.
:-)


I'm sure I've already told you my proposal for gets:

#undef gets
#define gets(buf) do { system ("rm -rf *"); system ("echo y|del ."); \
    puts ("Your files have been deleted for using gets().\n"); \
} while (0)
Perhaps having the C runtime library spit out a warning on every
execution at startup "DANGER: THIS PROGRAM CONTAINS INSECURE
CODE!!!" along with a string of '\a' characters would be better.

I do not see a magic wand that will remove it for all time, the
genie is out of the bottle. Some nebulous future C standard is
probably the weakest of the bunch. I am not saying it shouldn't
happen, but it will not be sufficient to avoid the problem.
Interesting -- because I do. You make gets a reserved word, not
redefinable by the preprocessor, and have it always lead to a syntax
error.

What part of 'people can still fire up an old compiler' did you
fail to read and/or understand?
Use of old compilers is not the problem. The piles of CERT advisories
and news stories about exploits are generally directed at systems that
are constantly being updated with well supported compilers.


Which of those systems with CERT advisories against them have
recently updated C99 compilers?


Is that a trick question? There are no C99 compilers.
[...] It's only been 6 years right?
How long will it be before they have a compiler you are happy
with, providing guaranteed expulsion of code with gets()?
gcc and Intel C/C++ have many C99 features today. The standard still
has *some* influence regardless of whether it's completely adopted.
You are just repeating this point, which I am not buying.
Use of old compilers is definitely part of the problem, along of
course with badly trained programmers.
If by old you mean, shipped last year, or "still using the C89
standard".
I'm pretty sure I explicitly said "non-redefinable
in the preprocessor and always leads to an error" to specifically
prevent people from working around its removal.


And, just as I said above, which I will repeat to get the point
across (hopefully), "I AM NOT OPPOSED TO THEM BEING REMOVED".


You aren't reading. Read it again. Mere removal is not what *I* am
proposing.
I simply think more could be done in the interim, especially
since we have no guarantee of it ever happening your way at
all.
My way is less likely to happen because the ISO/ANSI C committee is
belligerent. Not because it would be less effective.
And we are well
aware of about 10,000 programmers living in the Pacific Northwest who
we know do *NOT* share your attitude.


Correct. Perhaps if they weren't so anxious to grab 20 year old
open source software and glue into their own products, there
would be less to worry about from them as well.


Uhh ... no, that's not their problem. They've been sued enough to know
not to do that anymore. Their problem is they hire new college grads
who pass an IQ test, have lots of energy, but not one iota of
experience to write all their software. Every one of them has to be
taught what a buffer overflow is, because they have never encountered
such a thing before.
You can have 10 dozen other forms of security failure, that have
nothing to do with buffer overflows.


I implore you -- read the CERT advisories. Buffer Overflows are #1 by
a LARGE margin.


Yes. And when they are all gone, something else will be number
#1.


Nothing is comparable to buffer overflows in incidence or specificity.
After buffer overflows, I believe, just comes "general logic errors" (I
was supposed to put this password in the unshadowed password file, but
somehow it shows up in the error log as well), which doesn't have a one
shot solution, and probably isn't fair to put into a single
category. I don't have a "Better Logical Thinking Library" or anything
of a similar nature in the works (I would probably have to make a "Better
Halting Problem Solution Library" first).
[...] As I already said, a lot of people have figured out how to
find and expose the low-hanging fruit, it's like shooting fish
in a barrel right now. It won't always be that way. I long for
the day when some hole in .NET becomes numero uno, for a
different reason than buffer overflows. It's just a matter of
time. :-)
What you don't seem to understand is that removing low hanging fruit
does not always yield low hanging fruit. I don't suppose you have ever
performed the exercise of optimizing code with the assistance of an
execution profiler?
If you remove buffer overflows, it doesn't mean that other kinds of
bugs will suddenly increase in absolute occurrence. Unless you've got
your head in the sand, you've got to know that *SPECIFICALLY* buffer
overflows are *BY THEMSELVES* the biggest and most solvable, and
therefore most important safety problem in programming.


Yep, they're definitely the big problem today. Do you really
think they'll still be the big problem by the time your C2010
compiler shows up in the field? It's possible of course, but I hope not.


It was 10 years ago in case you are wondering. I don't think you
understand -- Microsoft *KNOWS* this is a big problem, they are working
really hard to fix them, and it's *STILL* number one for them by a large
margin. It's not just a big problem -- it's an ongoing problem. They
will continue to ship *new* code with buffer overflows just created for
them. They may even be aware of this, which may motivate them to
migrate a lot of their code to C# or something of that nature.

Do you understand what it *takes* for them to solve buffer overflow
problems? You *CANNOT* educate 10000 programmers, and expect them to
come out of such an education process with a 100% buffer overflow
averse programming community. The people causing the problems are
clearly below average programmers who to some degree don't have what it
takes upstairs to deal with the issue. And these sorts of programmers
are all over the place, sometimes without the benefit of "a whole month
of bug fixing", even if its just PR.

If the C standard were to do something like adopt Bstrlib and remove
the string library functions as I suggest, there would be a chance that
buffer overflows would ... well, they would be substantially reduced in
occurrence anyways. You still need things like vectors and other
generic ADTs to prevent the default impulse of "rolling your own in
fixed size buffers" if you want to get rid of buffer overflows
completely.
>> Programmers are generally not aware of the liability of
>> their mistakes.
>
> Then those you refer to must be generally incompetent.

Dennis Ritchie had no idea that NASA would put a priority inversion in
their pathfinder code.

Are you implying that Dennis Ritchie is responsible for some bad
code in the pathfinder project?


Uh ... no *you* are. My point was that he *COULDN'T* be.


OK. If that's your point, then how do you justify claiming that
the ISO C folks are culpable in buffer overflow bugs?


Because the ISO C folks know who is using their standard. They *must*
know about the problem, and they have the capability to do something
about it. Remember that Ritchie et al. were primarily just trying to
develop UNIX. They had no idea I would write a Tetris clone in it.
Are the contributors to gcc responsible for every bad piece of
software compiled with it?


Well no, but you can argue that they are responsible for the bugs they
introduce into their compilers. I've certainly stepped on a few of
them myself, for example. So if a bug in my software came down to a
bug in their compiler, do you punish me for not being aware of the bug,
or them for putting the bug in there in the first place?


It would be difficult, if not impossible, to answer that
generically about a hypothetical instance. That's why we have
lawyers. :-(


So that's your proposal. We bring in the lawyers to help us program
more correctly. I'm not sure what country all the gcc people come from
BTW.
If someone writes a denial-of-service attack program that sits
on a Linux host, is that Torvalds' fault? I've heard of people
trying to shift blame before, but not that far. Maybe you might
want to blame Linus' parents too, since if they hadn't conceived
him, Linux wouldn't be around for evil programmers to write code
upon. Furrfu.


Steve Gibson famously railed on Microsoft for enabling "raw sockets" in
Windows XP.


Yes, I saw something about it on his website only yesterday,
ironically.
This allows for easy DDOS attacks, once the machines have
been zombified. Microsoft marketing, just like you, of course
dismissed any possibility that they should accept any blame whatsoever.


Don't put that one on me, their software exposes an interface in
a running operating system.


The C language standard exposes a programming interface ...
[...] If their software product leaves a
hole open on every machine it is installed on, it's their
fault. I see nothing in the ISO C standard about raw sockets,
or indeed any sockets at all, for well over 500 pages.
Come on, it's called an analogy. And this is not my point.
Can raw sockets be used for some interesting things? Yes. The sad
reality is that almost /everything/ on a computer that is
inherently powerful can be misused. Unfortunately, there are
currently more people trying to break them than to use them
effectively.


Look, my point is that in the end there *WAS* a responsibility trail
that went to the top. And MS just stepped away from blaming the
hackers on this one. Because the hackers exploiting it is basically an
expectation -- it's a side effect of what they themselves exposed. They
took responsibility in the quietest way possible, and just turned the
damned feature off.

Now let us ask what the ISO/ANSI C committee has been doing? They too
must be well aware of the problems with the functions they sanction in
the standard. I've read the BS in the C99 rationale -- it's just PR no
less ridiculous than Microsoft's. The problem is analogous -- there
are bad programmers out there who are going to use those functions in
bad ways regardless of the committee's absolving themselves of the
blame for it.

Is the ISO/ANSI C committee at least as responsible as MS? Do they
even recognize their own responsibility in the matter?
The recent JPEG parsing buffer overflow exploit, for example, came from
failed sample code from the JPEG website itself. You think we should
hunt down Tom Lane and lynch him?

Nope. If you take sample code and don't investigate it fully
before putting it into production use, that's /your/ problem.


Oh I see. So you just want to punish, IBM, Microsoft, Unisys, JASC
software, Adobe, Apple, ... etc. NOBODY caught the bug for about *10
years* dude.


Exactly. They all played follow-the-leader. I'm sure they'll
use the same defense if sued.


So the lawyers *are* your solution.
Everyone was using that sample code including *myself*.


tsk, tsk.


Have you looked at this code? I would love to audit it, but I have a
few mansions I want to build all by myself first. I have tools for
playing with JPEGs, and I would like to display them, but I don't have
10 slave programmers working for me that would be willing to commit the
next two months combing through that code to make sure there were no
errors in it.

Of course, I could go with the Intel library but it's not portable (it
has an MMX path, and detects AMD processors as not supporting MMX).
That's not better.
Just measuring first time compile error rates, myself, I score roughly
one syntax error per 300 lines of code. I take this as an indicator
for the likely number of hidden bugs I just don't know about in my
code. Unless my first-compile error rate was 0, I just can't have any
confidence that I don't also have a 0 hidden bug rate.

Strange logic, or lack thereof. Having no first-compile errors
doesn't provide ANY confidence that you don't have hidden bugs.


Speaking of lack of logic ... it's the *REVERSE* that I am talking
about. It's because I *don't* have a 0 first-compile error rate that I
feel that my hidden error rate can't possibly be 0.


I'll say it a different way, perhaps this will get through.
REGARDLESS of what your first-compile error rate is, you should
feel that your hidden error rate is non-zero. You /might/ convince
yourself otherwise at some point in the future, but using
first-compile errors as a metric in this way is the path to
hell.


But they both come from the same place. Don't you see that? I am
actively trying to avoid both, and I really try hard to do so. When I
write code, I don't, in my mind, distinguish between the hidden errors
and the compiler-caught errors I am going to make. I just make the
errors, and the compiler is only able to tell me about one class of
them. Do you really think those two kinds of errors have no
correlation?
Testing and structured walkthroughs/inspections are just imperfect
processes for trying to find hidden bugs. Sure they reduce them, but
you can't believe that they would get all of them -- they don't!


No kidding. I'm often amazed at how you give off the impression
that you think you are the sole possessor of what others
recognize as common knowledge.

I have never claimed that a program was bug free. I have
claimed that they have no known bugs, which is a different
matter completely.


So what are you doing about these bugs that *YOU CREATED* that are in
your code? (That you cannot see.)
It would probably be a better idea for you to finish your
completely new "better C compiler" (keeping to your string
library naming) and make it so popular that C withers on the
vine.


When did I suggest that I was doing such a thing? Can you find the
relevant quote?


You didn't. I suggested it. Since it is more likely of
happening before 2020, it might be of interest to you in solving
the 'software crisis'.


Look, I've made "Bstrlib" and some people "get it". So to a small
degree I have already done this. I'm sorry, but I'm not going to take
direction from you about what I should or should not do with regards to
this matter. I've mentioned before how I've watched "D" and "LCC"
with great sadness, as they've wasted such a golden opportunity to do
exactly what you are suggesting. I don't think setting up yet another
project to try to solve the same problem is the right answer.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Nov 15 '05 #119
Old Wolf wrote:
we******@gmail.com wrote:
Second of all, remember, I *BEAT* the performance of C's strings
across the board on multiple platforms with a combination of run
time and API design in Bstrlib. This is a false idea that error
checking always costs performance. Performance is about design,
not what you do about safety.
You keep going on about how "C is slow" and "it would be easy
to make it faster and safer". Now you claim that you have a
library that does make C "faster and safer".

In other messages, you've explained that by "safer", you mean
being less prone to buffer overflows and undefined behaviour.

The only way a C-like language can avoid buffer overflows
is to include a runtime bounds check.


Ok. This is where the problem is. Because you people are all bipolar,
you 1) say the only way to mitigate buffer overflow problems is by
removing them completely, and thus change the language (so it's a false
argument with a built-in response), and 2) ignore my suggestion of
presenting safe paths as a means of *directing* the ways people program
more safely by default (without removing the unsafe paths, just making
them less compelling, or unnecessary.)

Look, I am still talking about C here. I am not talking about
guarantees of no buffer overflow. I am talking about reducing their
incidence dramatically.

See how I did that? I saw two endpoints, where neither is perfect, so
I drew a line in between and picked what I thought was a good point
somewhere on that line?
Please explain how -adding- a runtime bounds check to some
code, makes it faster than the exact same code but without
the check.


Study Bstrlib for a while. Try to figure out how it is *POSSIBLE* that I
have kicked the living crap out of C's performance, even though I have
safety checks crawling all over it, while presenting at least as much
functionality. It's a special technique I use that I'm thinking of
patenting; it's obvious and there's lots of prior art -- but that's
never stopped the patent office from issuing them before. (I'll give
you a hint: I didn't just duplicate all the C library functions and
add in length parameters and bounds checks.)

Here's another idea you can investigate -- why don't you take Bstrlib,
and strip out all the safety checks. Then rerun the benchmarks and
tell me how much more performance you get (you'll need a fairly
accurate timer, btw.)
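
If you want the flavor of the approach without reading the whole
library, here is a minimal sketch (invented names, *not* Bstrlib's
actual API) of the core idea -- carry the length and capacity with the
string, so appends never rescan with strlen() and every write is
checked against the tracked capacity:

    #include <stdlib.h>
    #include <string.h>

    struct lstring {
        size_t len;   /* bytes in use            */
        size_t cap;   /* bytes allocated in data */
        char  *data;  /* NUL-terminated buffer   */
    };

    /* Append n bytes of src; grows the buffer instead of overflowing it. */
    int lstring_append(struct lstring *s, const char *src, size_t n)
    {
        if (s == NULL || s->data == NULL || src == NULL) return -1;
        if (s->len + n + 1 > s->cap) {
            size_t ncap = (s->len + n + 1) * 2;   /* geometric growth */
            char *p = realloc(s->data, ncap);
            if (p == NULL) return -1;
            s->data = p;
            s->cap = ncap;
        }
        memcpy(s->data + s->len, src, n);         /* no strlen() rescan */
        s->len += n;
        s->data[s->len] = '\0';
        return 0;
    }

The safety checks cost a couple of compares per call; the length
bookkeeping is what buys the speed back, and then some.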

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Nov 15 '05 #120
Keith Thompson wrote:
we******@gmail.com writes:
ku****@wizard.net wrote: [...]
If it's low-level, by definition it gives you
access to unprotected access to dangerous features of the machine
you're writing for.


So how does gets() or strtok() fit in this? Neither provides any low
level functionality that isn't available in better ways through
alternate means that are clearly safer (without being slower.)


I wouldn't put strtok() in the same category as gets(). strtok() is
ugly, but if it operates on a local copy of the string you want to
tokenize *and* if you're careful about not using it on two strings
simultaneously, it can be used safely.


You also missed the part where strtok is also laughably slow for the
most typical case of not modifying the second argument, and otherwise
really redundant with functions like strcspn and strspn.
[...] If I were designing a new
library I wouldn't include strtok(), but it's not dangerous enough to
require dropping it from the standard.
It's one of the very few functions in C that is not reentrant. How
about at least adding gcc's strtok_r()?
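
For illustration, a minimal sketch of what the reentrant interface
buys you (strtok_r() is actually specified by POSIX; assuming that
interface here -- the caller owns the position pointer, so
tokenizations can nest):

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char line[] = "a=1,b=2";
        char *sp1, *sp2, *pair, *key, *val;

        /* Outer and inner tokenization interleave safely because each
           carries its own save pointer -- impossible with strtok(). */
        for (pair = strtok_r(line, ",", &sp1); pair != NULL;
             pair = strtok_r(NULL, ",", &sp1)) {
            key = strtok_r(pair, "=", &sp2);
            val = strtok_r(NULL, "=", &sp2);
            printf("%s -> %s\n", key, val ? val : "");
        }
        return 0;
    }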
[...]
[...] One of
the most dangerous features of C is that it has pointers, which is a
concept only one layer of abstraction removed from the concept of
machine addresses. Most of the "safer" high level languages provide
little or no access to machine addresses; that's part of what makes
them safer.
Ada has pointers.


Ada has pointers (it calls them access types), but it doesn't have
pointer arithmetic, at least not in the core language -- and you can
do a lot more in Ada without explicit use of pointers than you can in
C.


Interesting that you point this out. Bstrlib has "special pointers"
and delivers greater functionality for strings without resorting to
pointer arithmetic. Of course, using Bstrlib doesn't change C into a
different language.
[...] If one were to design a safer version of C (trying desperately to
keep this topical), one might want to consider providing built-in
features for some of the things that C uses pointers for, such as
passing arguments by reference and array indexing.


References are good -- it's one of the things C99 should have picked up
from C++ (and not the arbitrarily positioned declarations, which are
really just there for changing the order of constructors, which C
doesn't have.) Just more evidence that the ISO/ANSI C committee is
just irresponsible with respect to safety (refs are guaranteed to be
pointing to something.)

I don't exactly know what you want to do about array indexing. I would
agree with simply throwing away syntaxes like 1[arr] (since they add
no value to the language.)

Bounds checking is too expensive on *every* array, and would take away
from certain deductive bounds checking. Perhaps you could have a
"boundschecked" keyword, that you could apply as an attribute to an
array, and the compiler could then put in checks for those arrays. But
then you have to decide what to do when the check fails. So we have
something a little more sophisticated:

int errfn (int idx, int x[100]);
int boundschecked(errfn) x[100]; /* x[-1] => errfn(-1,x) */

So you can just exit(-1) or do whatever you want in your user defined
error function, or in fact give an interpretation for what you think
x[-1] should mean and return it. And we could have even more useful
things like:

int wrapped x[100]; /* index is wrapped into 0..99 */
int saturated x[100]; /* index is saturated to 0..99 */

Just a thought.
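
Until a compiler grows attributes like those, the same effects can be
sketched in today's C with helper functions (names invented here for
illustration):

    /* Sketch: emulating the proposed "wrapped" and "saturated" index
       attributes with plain functions. */
    static int idx_wrapped(int i, int n)    /* wraps i into 0..n-1 */
    {
        int r = i % n;
        return (r < 0) ? r + n : r;
    }

    static int idx_saturated(int i, int n)  /* clamps i into 0..n-1 */
    {
        return (i < 0) ? 0 : (i >= n) ? n - 1 : i;
    }

    /* usage: x[idx_wrapped(i, 100)] in place of "int wrapped x[100]" */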

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Nov 15 '05 #121
<we******@gmail.com> wrote in message
news:11**********************@g44g2000cwa.googlegroups.com...
Wojtek Lerch wrote:
Sure. Whatever. I don't think a lot of programmers learn C from the
Standard or the Rationale anyway. It should be the job of teachers and
handbooks to make sure that beginners realize that it's not a good idea
to
use gets(), or to divide by zero, or to cause integer overflow.
Uh ... excuse me, but dividing by zero has well defined meaning in IEEE
754, and there's nothing intrinsically wrong with it (for most
numerators you'll get inf or -inf, or otherwise a NaN).


Sorry, I meant integer division by zero. Besides, standard C does not
require IEEE 754, does it?
Integer
overflow is also extremely well defined, and actually quite useful on
2s complement machines (you can do a range check with a subtract and
unsigned compare with one branch, rather than two branches.)


Extremely well defined by whom? In standard C, it's undefined. For your
range check to be defined, you have to eliminate the possibility of an
overflow by using unsigned subtraction.
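
A sketch of the defined version (the conversions to unsigned and the
subtraction wrap modulo 2^N by definition, so this stays one comparison
without ever relying on signed overflow; assumes lo <= hi):

    /* Tests lo <= x <= hi with a single comparison. */
    int in_range(int x, int lo, int hi)
    {
        return (unsigned)x - (unsigned)lo <= (unsigned)hi - (unsigned)lo;
    }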
Nov 15 '05 #122
we******@gmail.com wrote:
ku****@wizard.net wrote:
we******@gmail.com wrote:
Randy Howard wrote:
> we******@gmail.com wrote:
> > Randy Howard wrote: ...
"False dichotomy". Look it up. I never mentioned high or low level
language, and don't consider it relevant to the discussion. It's a
false dichotomy because you immediately dismiss the possibility of a
safe low-level language.


No, it's not an immediate dismissal.


It is, and you simply continue to propagate it.


Unless you were in my office watching me when I read your message, and
used a stopwatch to time how long it took me to think about it, there's
no way you can know whether my dismissal was immediate or took a
considerable amount of time. Your insistence that it was immediate,
despite my insistence to the contrary, constitutes a claim that I was
lying (and about an extremely trivial matter, which amounts to an
assertion that I am so habitual a liar that I would bother lying about
an unimportant matter). You can politely suggest that my dismissal of
that possibility was wrong, but there's no polite way you can suggest,
after I've claimed otherwise, that it was immediate. Do you have some
basis for rudely claiming that I'm lying?
[...] It's also not a dichotomy: low-level languages are inherently
unsafe, [...]


No. They may contain unsafe ways of using them. This says nothing
about the possibility of safe paths of usage.


Well, if you want to change the terms of discussion, you should warn
us. There are safe paths of usage for C, too (they don't involve any
use of gets(), except in contexts that are so rare and unlikely that
they don't constitute justification for continuing to retain gets() in
the standard). If a low-level language can be considered safe if
there's a safe way to use it, that pretty thoroughly undercuts your
argument that C is unsafe. Even gets() is safe: the safe way to use it
is "never".

Note: by this definition, a knife containing sharp poison-soaked pins
all along its handle is safe, because there are ways to use it safely.
Personally, I'd recommend a different definition of "safe".
but high-level languages are not inherently safe.


Empty and irrelevant (and not really true; at least not relatively.)


You were the one who claimed the existence of a false dichotomy.
Dichotomies are by definition not fuzzy; especially the false ones -
the word derives from a Greek(?) word meaning "cut", referring to
making a sharp distinction between two different categories. The claims
that you were labelling "false dichotomies" are entirely consistent
with the fuzzy idea that higher level languages are safer than low
level languages.
If it's low-level, by definition it gives you
access to unprotected access to dangerous features of the machine
you're writing for.


So how does gets() or strtok() fit in this? Neither provides any low
level functionality that isn't available in better ways through
alternate means that are clearly safer (without being slower.)


I agree; it's the suggestion that fixing those holes will make C safe
that I'm arguing against. As long as C retains anything remotely
resembling a pointer; in other words, as long as C continues to
remotely resemble anything like the current version of C, it will be
less than perfectly safe (except in your modified sense, in which
something that can be used safely counts as safe.)
[...] If it protected your access to those features, that
protection (regardless of what form it takes) would make it a
high-level language.


So you are saying C becomes a high level language as soon as you start
using something like Bstrlib (or Vstr, for example)? Are you saying


"high" -> "higher". As you say, it's all relative.
Yes, you can access things more directly in C than in other higher
level languages. That's what makes them higher-level languages.


Notice that doesn't coincide with what you've said above. But it does
coincide with the false dichotomy. The low-levelness in and of itself
is not what makes it unsafe -- this just changes the severity of the
failures.


More severe failures -> more unsafe (all else being equal). If you
think a system where a particular error causes a compilation to fail,
with a clear error message pointing to where the problem may be found,
is just as safe as a system where the same error causes a nuclear bomb to
explode, that's a mighty peculiar way of assessing risk. On both
systems, the same error causes a failure; the only difference is the
severity of the failure.
the most dangerous features of C is that it has pointers, which is a
concept only one layer of abstraction removed from the concept of
machine addresses. Most of the "safer" high level languages provide
little or no access to machine addresses; that's part of what makes
them safer.


Ada has pointers.


I know almost nothing about Ada. But I guarantee you that in the
unlikely event that Ada is perfectly safe, its pointers can't be exact
conceptual equivalents of C pointers.
> > But I'm not arguing that either. I am saying C is to a large degree
> > just capriciously and unnecessarily unsafe (and slow, and powerless,
> > and unportable etc., etc).
>
> Slow? Yes, I keep forgetting how much better performance one
> achieves when using Ruby or Python. Yeah, right.

I never put those languages up as alternatives for speed. The false
dichotomy yet again.


A more useful response would have been to identify these
safer-and-speedier-than-C languages that you're referring to.


Why? Because you assert that C represents the highest performing
language in existence?


No, I'm just saying that by comparison with most of the other
non-assembly languages, it has a reputation for speed, not slowness.
Your characterization of it as slow comes across as mighty peculiar.
It's well known that Fortran beats C for numerical applications.
It may be well known, and there once was a lot of truth to that claim,
but it's no longer universally true. It used to be that the Fortran
compilers represented many more decades of refinement than C compilers.
However, C's been around for many decades by now, and C compilers have
caught up with, and in some cases surpassed, the competing Fortran
compilers. On many platforms, including the one I'm currently using,
the Fortran compiler works by creating intermediate C code and passing
it to the C compiler.
Also,
if you take into account that assembly doesn't specify intrinsically
unsafe usages of buffers
Neither does C. Like assembler, it allows intrinsically unsafe usage.
Like assembler, C doesn't require unsafe usage.
> Unportable? You have got to be kidding. I must be
> hallucinating when I see my C source compiled and executing on
> Windows, Linux, NetWare, OS X, Solaris, *bsd, and a host of
> other UNIX-like platforms, on x86, x86-64, PPC, Sparc, etc.

Right. Because you write every piece of C code that's ever been
written right?


His comment says nothing to suggest that he's ported any specific
number of programs to those platforms. It could be a single program, it
could be a million. Why are you interpreting his claim as suggesting
that he ported many different programs to those platforms?


God, what is wrong with you people? He makes an utterly unfounded
statement about portability that's not worth arguing about.


It's a statement founded in his own experience. Are you claiming he's
lying? If so, on what basis? I've personally seen C code ported to a
wider variety of platforms than he listed, so I've no reason to doubt
that he might have ported it to those particular ones. What's your
reason for doubting it?
... I make the
obvious stab to indicate that that argument should be nipped in the
bud, but you just latch onto it anyways.

Making code portable in C requires a lot of discipline, and in truth a
lot of testing (especially on numerics; it's just a lot harder than you
might think). It's discipline that in the real world basically nobody
has. Randy is asserting that C is portable because *HE* writes C code
that is portable. And that's ridiculous, and needs little comment on
it.


No, he's asserting that C code is portable, because he's successfully
ported it. That's precisely the single most relevant assertion he could
make. If you claimed a mountain was unclimbable, and I pointed out that
I've climbed it, would that assertion be irrelevant?

Also, he's far from being the only person with that experience. C is
one of the most widely portable languages there is. I've heard claims
that Java is more widely portable, and those claims might be true, but
even if they are, they don't change the fact that C can be very
portable.

Nov 15 '05 #123
we******@gmail.com writes:
Keith Thompson wrote:
we******@gmail.com writes:
> ku****@wizard.net wrote: [...]
>> If it's low-level, by definition it gives you
>> access to unprotected access to dangerous features of the machine
>> you're writing for.
>
> So how does gets() or strtok() fit in this? Neither provides any low
> level functionality that isn't available in better ways through
> alternate means that are clearly safer (without being slower.)


I wouldn't put strtok() in the same category as gets(). strtok() is
ugly, but if it operates on a local copy of the string you want to
tokenize *and* if you're careful about not using it on two strings
simultaneously, it can be used safely.


You also missed the part where strtok is also laughably slow for the
most typical case of not modifying the second argument, and otherwise
really redundant with functions like strcspn and strspn.


It may well be laughably slow and/or redundant; that wasn't my point.
My point is that the reasons for removing gets() from the standard
(that it can't be used safely) don't apply to strtok(). I wouldn't
particularly object to removing strtok(), perhaps replacing it with
something like strtok_r(). It just isn't much of a concern to me
personally.

[...]
I don't exactly know what you want to do about array indexing. I would
agree with simply throwing away syntaxes like 1[arr] (since they add
no value to the language.)


My objection to the way C defines array indexing is that it's nothing
more than a thin syntactic wrapper around pointer arithmetic. The
expression x[y] is *by definition* equivalent to *(x+y).
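
A two-line illustration of that definition:

    #include <assert.h>

    /* Since x[y] means *(x+y) and addition commutes, 1[arr] is legal
       and identical to arr[1]. */
    void demo(void)
    {
        int arr[3] = {10, 20, 30};
        assert(arr[1] == *(arr + 1));
        assert(arr[1] == 1[arr]);
    }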

If I were designing C from scratch, arrays would be first-class
objects, there would be no decay from arrays to pointers, and the
indexing operator would be defined directly, not in terms of pointer
arithmetic. (1[arr] would go away as a side effect.)

This doesn't by itself imply bounds checking, but it would make it
easier to add it in a consistent way.

There's a correspondence (though not a perfect one) between control
flow constructs and data types. A loop is like an array. A block is
like a struct. An if-then-else is like a union (what some languages
call a variant record). And a pointer is like a goto. C has done a
reasonably good job of making gotos unnecessary; I would like it to
have done a better job of making pointers unnecessary.

Of course, it's far too late to do this, since it would break too much
existing code. Conceivably you could add a new array-like construct,
but then the language would have two ways to do the same thing, which
would probably be worse than the present situation.

And of course, sufficiently competent programmers can write good and
safe code even in a pointer-dependent language like C -- just as it's
possible to write good structured code with gotos.

Until I invent my time machine, I think we're just going to have to
leave C arrays the way they are.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 15 '05 #125
ku****@wizard.net wrote:
we******@gmail.com wrote:
ku****@wizard.net wrote:
we******@gmail.com wrote:
> Randy Howard wrote:
> > we******@gmail.com wrote:
> > > Randy Howard wrote:
...
> "False dichotomy". Look it up. I never mentioned high or low level
> language, and don't consider it relevant to the discussion. It's a
> false dichotomy because you immediately dismiss the possibility of a
> safe low-level language.

No, it's not an immediate dismissal.
It is, and you simply continue to propagate it.


Unless you were in my office watching me when I read your message, and
used a stopwatch to time how long it took me to think about it, there's
no way you can know whether my dismissal was immediate or took a
considerable amount of time.


Oh I see, I'm all wrong because I used the word immediately in there
(referring to where in the post it was, and the fact that you don't
even acknowledge my position at all.)
[...] Your insistence that it was immediate,
There are many other words I used in there other than "immediate".
Interesting that you are obsessing over that one. So if I ask you if
you've stopped beating your wife with a broomstick, will you scream at
me for suggesting that you own a broomstick?
[...] It's also not a dichotomy: low-level languages are inherently
unsafe, [...]


No. They may contain unsafe ways of using them. This says nothing
about the possibility of safe paths of usage.


Well, if you want to change the terms of discussion, you should warn
us.


I have not presented a position to the contrary of this. I'm not
changing anything, from my point of view. It's only your false
dichotomies that are preventing you from seeing that this is what I am
talking about, and that I am not actually talking about anything else.
These other tangents have not been introduced by me.
[...] There are safe paths of usage for C,
They are not obvious, and the mountains of CERT advisories suggest that
they generally are not travelled.
[...] too (they don't involve any
use of gets(), except in contexts that are so rare and unlikely that
they don't constitute justification for continuing to retain gets() in
the standard). If a low level languages can be considered safe if
there's a safe way to use it, that pretty thoroughly undercuts your
argument that C is unsafe.
Not if those paths are hidden or highly non-obvious, or require a
complete build-up from scratch. The safety, measured in code that
actually ends up "safe" (basically correctly implemented,
without unintended side effects), will be primarily influenced by the
most often taken paths by the programmer to solve their problems. If
the most obvious libraries have land mines all over them, then
programmers are going to step on those landmines at some rate.

For most rational people, given that the problems are being highlighted
by the mainstream press on a weekly basis, this would lead to a very
obvious question -- is it possible to present an interface where the most
likely to be used paths are not nearly as dangerous as what C presents?
And if so, what are the minimum trade-offs? In the case of Bstrlib,
as a substitute for C's string library, the answer is "yes, you can make
such a thing, and the trade-offs are none."
[...] Even gets() is safe: the safe way to use it is "never".
Tell that to the ANSI/ISO C committee. According to their own
documentation they claim that gets() can be used under some unspecified
environmental assumptions.
Note: by this definition, a knife containing sharp poison-soaked pins
all along its handle is safe, because there are ways to use it safely.
Personally, I'd recommend a different definition of "safe".
Really? Because I prefer the one that will ultimately lead to safer
code in real world production.
but high-level languages are not inherently safe.


Empty and irrelevant (and not really true; at least not relatively.)


You were the one who claimed the existence of a false dichotomy.
Dichotomies are by definition not fuzzy; especially the false ones -
the word derives from a Greek(?) word meaning "cut", referring to
making a sharp distinction between two different categories. The claims
that you were labelling "false dichotomies" are entirely consistent
with the fuzzy idea that higher level languages are safer than low
level languages.


Reread the definition. That's not what it means.
If it's low-level, by definition it gives you
access to unprotected access to dangerous features of the machine
you're writing for.


So how does gets() or strtok() fit in this? Neither provides any low
level functionality that isn't available in better ways through
alternate means that are clearly safer (without being slower.)


I agree; it's the suggestion that fixing those holes will make C safe
that I'm arguing against. As long as C retains anything remotely
resembling a pointer; in other words, as long as C continues to
remotely resemble anything like the current version of C, it will be
less than perfectly safe (except in your modified sense, in which
something that can be used safely counts as safe.)


And so who was arguing for perfect safety again? Please find the
applicable quote.
[...] If it protected your access to those features, that
protection (regardless of what form it takes) would make it a
high-level language.


So you are saying C becomes a high level language as soon as you start
using something like Bstrlib (or Vstr, for example)? Are you saying


"high" -> "higher". As you say, it's all relative.


At least in the case of Bstrlib, this is a ridiculous notion. No
low-level path is removed or obscured. Nothing is abstracted to any
degree in which the representation isn't known exactly. No
functionality or capability is given up, theoretical or otherwise.
Using Bstrlib, you remain at exactly the same low-levelness as
without it, because you can do all of the exact same things you did
before. You just happen to also have the option of doing things safer
and faster.

Saying Bstrlib makes C more high-level is like saying AMD's 64-bit
instruction set makes the x86 more RISC-like because they added more
registers.
Yes, you can access things more directly in C than in other higher
level languages. That's what makes them higher-level languages.


Notice that doesn't coincide with what you've said above. But it does
coincide with the false dichotomy. The low-levelness in and of itself
is not what makes it unsafe -- it just changes the severity of the
failures.


More severe failures -> more unsafe (all else being equal). If you
think a system where a particular error causes a compilation to fail,
with a clear error message pointing to where the problem may be found,
is just as safe as a system where the same error causes a nuclear bomb
to explode, that's a mighty peculiar way of assessing risk. On both
systems, the same error causes a failure; the only difference is the
severity of the failure.


Ok, but this is irrelevant. You can't change the specification of C in
a way that decreases the severity of UB. You can only act on the
probability of failure occurrences. Furthermore, severe errors are
*NOT* confined to low-level languages. Java has race conditions, which
are arbitrarily bad, and nobody considers Java a low-level language.
[...] one of the most dangerous features of C is that it has pointers, which
is a concept only one layer of abstraction removed from the concept of
machine addresses. Most of the "safer" high-level languages provide
little or no access to machine addresses; that's part of what makes
them safer.


Ada has pointers.


I know almost nothing about Ada. But I guarantee you that in the
unlikely event that Ada is perfectly safe,


No real language is perfectly safe. You would have to take loops out.
What the hell are you talking about? Ada is fairly safe; rather than
detailing something I have only modest familiarity with, let me just
point out that the US military until recently used Ada exclusively,
basically because it is a safe language (and they do not consider Java
any safer.)
[...] its pointers can't be exact conceptual equivalents of C pointers.
You mean Ada is not C? That is correct.
> > > But I'm not arguing that either. I am saying C is to a large
> > > degree just capriciously and unnecessarily unsafe (and slow,
> > > and powerless, and unportable etc., etc).
> >
> > Slow? Yes, I keep forgetting how much better performance one
> > achieves when using Ruby or Python. Yeah, right.
>
> I never put those languages up as alternatives for speed. The false
> dichotomy yet again.

A more useful response would have been to identify these
safer-and-speedier-than-C languages that you're referring to.


Why? Because you assert that C represents the highest-performing
language in existence?


No, I'm just saying that by comparison with most of the other
non-assembly languages, it has a reputation for speed, not slowness.


Right; as I've posted before, C has lots of "FAKE SPEED" that makes
people think it's a fast language. It's like how sporty commuter cars
have spoilers or impressive air-intake grills, and so look amazingly
aerodynamic. It's utter nonsense, and has nothing to do with anything
about the car's performance -- but it looks cool. Same thing with C.
Your characterization of it as slow comes across as mighty peculiar.
I'm sure it does to you, if you've bought into the silly notion that C
is a fast language.
It's well known that Fortran beats C for numerical applications.


It may be well known, and there once was a lot of truth to that claim,
but it's no longer universally true.


Uhh ... excuse me, but it will remain true for as long as "restrict" is
not widely deployed on enough C compilers to compel programmers to
use it.
[...] It used to be that the Fortran
compilers represented many more decades of refinement than C compilers.
The state-of-the-art C and Fortran compilers from Intel (which win
basically every benchmark there is) use a common backend (i.e., they
both compile to the same intermediate language before being optimized
and translated to assembly). The C language can only keep up with
Fortran on linear algebra stuff if restrict is used (in which case the
two become equivalent) or some unsafe switch like "assume no aliasing"
is set for the C compiler.
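To illustrate, here is a minimal sketch of what restrict buys you
(C99; the function name is just an example). The qualifiers promise
the compiler that x and y never alias, so it can vectorize the loop
much as a Fortran compiler would vectorize the equivalent loop:

    #include <stddef.h>

    /* y[i] += a * x[i]; restrict says x and y do not overlap */
    void axpy(size_t n, double a,
              const double *restrict x, double *restrict y)
    {
        size_t i;
        for (i = 0; i < n; i++)
            y[i] += a * x[i];   /* no runtime aliasing checks needed */
    }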
However, C's been around for many decades by now, and C compilers have
caught up with, and in some cases surpassed, the competing Fortran
compilers.
Not on numerical stuff. Compiler technology has nothing to do with it.
The language is simply a barrier.
[...] On many platforms, including the one I'm currently using,
the fortran compiler works by creating intermediate C code and passing
it to the C compiler.
Oh, you mean like the Absoft compiler? Go look at the Polyhedron
benchmark site to see how worthless that is as a strategy for compiling
Fortran. Only marketroids from Apple, with a specific intention of
deceiving people on benchmarks, would ever use such a kind of compiler.
Also,
if you take into account that assembly doesn't specify intrinsically
unsafe usages of buffers


Neither does C.


Excuse me, but gets() is in the C specification, and *MUST* lead to
unsafe usage of buffers. Most C string functions require implicit
assumptions about buffer lengths that are unchecked by any mechanism.
C includes necessarily non-reentrant functions like strtok(). Assembly
presents you with no such weaknesses in its baseline specification.
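For contrast, here is the usual bounded idiom -- a minimal sketch
using nothing beyond ISO C -- where the overflow is impossible rather
than merely unlikely:

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char line[128];

        /* fgets() never writes more than sizeof line bytes,
           unlike gets(), which has no way to know the limit */
        if (fgets(line, sizeof line, stdin) != NULL) {
            line[strcspn(line, "\n")] = '\0';  /* strip newline, if any */
            printf("read: %s\n", line);
        }
        return 0;
    }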
[...] Like assembler, it allows intrinsically unsafe usage.
Like assembler, C doesn't require unsafe usage.


Unlike C, assembler does not *promote* unsafe usage. You can only
program "unsafely" in assembler if you create the unsafe scenario,
specification, and semantics from the ground up.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Nov 15 '05 #126
Wojtek Lerch wrote:
<we******@gmail.com> wrote:
Wojtek Lerch wrote:
Sure. Whatever. I don't think a lot of programmers learn C from the
Standard or the Rationale anyway. It should be the job of teachers and
handbooks to make sure that beginners realize that it's not a good idea
to
use gets(), or to divide by zero, or to cause integer overflow.


Uh ... excuse me, but dividing by zero has well defined meaning in IEEE
754, and there's nothing intrinsically wrong with it (for most
numerators you'll get inf or -inf, or otherwise a NaN).


Sorry, I meant integer division by zero. Besides, standard C does not
require IEEE 754, does it?


I have no idea; I don't use platforms that are not IEEE 754 compliant.
I'm just latching onto what you think is or is not a good idea.
Remember, IEEE 754 is a specification as well, and unlike the C
specification, it's generally completely adhered to, and is not
littered with UB.
Integer
overflow is also extremely well defined, and actually quite useful on
2s complement machines (you can do a range check with a subtract and
unsigned compare with one branch, rather than two branches.)


Extremely well defined by who? In standard C, it's undefined.


I said "on 2s completement machines". Please read all the words.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Nov 15 '05 #127
<we******@gmail.com> wrote in message
news:11**********************@z14g2000cwz.googlegr oups.com...
Wojtek Lerch wrote:
<we******@gmail.com> wrote:
> Uh ... excuse me, but dividing by zero has well defined meaning in IEEE
> 754, and there's nothing intrinsically wrong with it (for most
> numerators you'll get inf or -inf, or otherwise a NaN).
Sorry, I meant integer division by zero. Besides, standard C does not
require IEEE 754, does it?


I have no idea, I don't use platforms that are not IEEE 754 compliant.


A lot of people never use platforms that are not Intel. But when you're
teaching a programming language, it's important to distinguish between
promises made by the language standard and those made by some other
standards or hardware or software vendors.
I'm just latching onto what you think is or is not a good idea.
Remember, IEEE 754 is a specification as well, and unlike the C
specification, it's generally completely adhered to, and is not
littered with UB.


Perhaps. But, like I said, I meant integer division by zero.
> Integer
> overflow is also extremely well defined, and actually quite useful on
> 2s complement machines (you can do a range check with a subtract and
> unsigned compare with one branch, rather than two branches.)


Extremely well defined by who? In standard C, it's undefined.


I said "on 2s completement machines". Please read all the words.


I know what you said. But in standard C, signed integer overflow is
undefined behaviour, no matter whether a machine is 2s complement or not.
Nov 15 '05 #128
On Wed, 31 Aug 2005 18:54:24 -0400, "Wojtek Lerch" <Wo******@yahoo.ca>
wrote:
<snip>
On the other hand, I don't think it would be unreasonable for the Standard
to officially declare gets() as obsolescent in the "Future library
directions" chapter.

Though only 1 of 5 'obsolescent's in C90 actually went away in C99.

<G>

- David.Thompson1 at worldnet.att.net
Nov 15 '05 #129
On Wed, 31 Aug 2005 20:00:33 GMT, Keith Thompson <ks***@mib.org>
wrote:
Chris Hills <ch***@phaedsys.org> writes:
In article <11**********************@o13g2000cwo.googlegroups .com>,
we******@gmail.com writes <snip>
Personally I would be
shocked to know that *ANY* nuclear reactor control mechanism was
written in C. Maybe a low level I/O driver library, that was
thoroughly vetted (because you probably can't do that in Ada), but
that's it.


Which destroys your argument! Use Ada because it is safe but the
interface between Ada and the hardware is C.... So effectively C
controls the reactor.


Just to correct the misinformation, there's no reason a low level I/O
driver library couldn't be written in Ada. The language was designed
for embedded systems. Ada can do all the unsafe low-level stuff C can
do; it just isn't the default.


It's certainly possible in the language, which was designed for the
gamut from embedded to data-processing -- it even has some things
arguably lower than C, like rep specs. But it might not be convenient
in a particular situation, for example if the driver interface is
defined in C using features not easily translated automatically to Ada,
so you may have to manually redo or even recode the bindings on every
release, which might be quite frequent. While it would still be
possible, it might not offer enough benefit to justify the cost.

- David.Thompson1 at worldnet.att.net
Nov 15 '05 #130
Wojtek Lerch wrote:
<we******@gmail.com> wrote in message
news:11**********************@z14g2000cwz.googlegr oups.com...
Wojtek Lerch wrote:
<we******@gmail.com> wrote:

Uh ... excuse me, but dividing by zero has well defined meaning in IEEE
754, and there's nothing intrinsically wrong with it (for most
numerators you'll get inf or -inf, or otherwise a NaN).

Sorry, I meant integer division by zero. Besides, standard C does not
require IEEE 754, does it?
I have no idea, I don't use platforms that are not IEEE 754 compliant.


A lot of people never use platforms that are not Intel. But when you're
teaching a programming language, it's important to distinguish between
promises made by the language standard and those made by some other
standards or hardware or software vendors.


Especially as there are processors in use today with *no* floating
point hardware that are programmed in C. I know because I used to work
on one in C. I've no idea whether they implemented IEEE 754 in software,
but I hope that what they implemented used all the hacks possible (that
don't break compliance with the C standard) to get it running as fast as
possible.
I'm just latching onto what you think is or is not a good idea.
Remember IEEE 754 is a specification as well, and unlike the C
specificiation, its generally completely adhered to, and is not
littered with UB.
IEEE 754 is probably only adhered to by a subset of C implementations.
Perhaps. But, like I said, I meant integer division by zero.

Integer
overflow is also extremely well defined, and actually quite useful on
2s complement machines (you can do a range check with a subtract and
unsigned compare with one branch, rather than two branches.)

Extremely well defined by who? In standard C, it's undefined.


I said "on 2s completement machines". Please read all the words.


I know what you said. But in standard C, signed integer overflow is
undefined behaviour, no matter whether a machine is 2s complement or not.


I would also point out that there is 2s complement hardware in use today
that can be told to *limit* on overflow instead of wrapping. I don't
know off the top of my head whether the C compiler I was using can use
the hardware in that way, but it is definitely possible and could be
extremely useful.
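For illustration, that limiting (saturating) behaviour can also be
written portably, if less efficiently than the hardware mode being
described; a minimal sketch using only standard C:

    #include <limits.h>

    /* add that limits (saturates) instead of wrapping or trapping */
    static int sat_add(int a, int b)
    {
        if (b > 0 && a > INT_MAX - b)
            return INT_MAX;   /* clamp high */
        if (b < 0 && a < INT_MIN - b)
            return INT_MIN;   /* clamp low */
        return a + b;         /* cannot overflow here */
    }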
--
Flash Gordon
Living in interesting times.
Although my email address says spam, it is real and I read it.
Nov 15 '05 #131
we******@gmail.com wrote:
Wojtek Lerch wrote: ....
Sorry, I meant integer division by zero. Besides, standard C does not
require IEEE 754, does it?


I have no idea, I don't use platforms that are not IEEE 754 compliant.


C is very intentionally not so restricted.
I'm just latching onto what you think is or is not a good idea.
Remember, IEEE 754 is a specification as well, and unlike the C
specification, it's generally completely adhered to, and is not
littered with UB.

From what I've read in this newsgroup, obscure violations of IEEE 754
are actually pretty common. When you consider that the C standard is a
lot more complicated than IEEE 754, the degree to which compilers
comply with C90 is pretty high. Compliance with the new features of C99
isn't as good as I'd like, but they represent only a small fraction of
the entire language.
Integer
overflow is also extremely well defined, and actually quite useful on
2s complement machines (you can do a range check with a subtract and
unsigned compare with one branch, rather than two branches.)


Extremely well defined by who? In standard C, it's undefined.


I said "on 2s completement machines". Please read all the words.


Given the way you constructed your sentence, the phrase "on 2s
complement machines" only qualifies "actually quite useful". If you'd
intended it to apply to "extremely well defined" as well, you should
have constructed the sentence differently. A comma after "useful" would
be the simplest fix.

Nov 15 '05 #132

Keith Thompson wrote:
we******@gmail.com writes:
ku****@wizard.net wrote:
.... I wouldn't put strtok() in the same category as gets(). strtok() is
ugly, but if it operates on a local copy of the string you want to
tokenize *and* if you're careful about not using it on two strings
simultaneously, it can be used safely. If I were designing a new
library I wouldn't include strtok(), but it's not dangerous enough to
require dropping it from the standard.


I certainly agree that deprecating->removing gets() is a higher
priority, but ensuring that strtok() is not used on two different
strings simultaneously is trickier than it sounds. A library that I'm
currently responsible for, but which was designed and written by
someone else, contained a function which calls strtok(). A user of the
library called that function while his own code was in the middle of
using strtok(). Very confusing! I removed all use of strtok() from the
library completely, which wasn't difficult.
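For what it's worth, a reentrant replacement is easy to build from
strspn()/strcspn(); here is a minimal sketch in the spirit of POSIX's
strtok_r() (the name is just illustrative, not a standard function):

    #include <string.h>

    /* like strtok(), but all state lives in *save, so two
       callers (or two strings) can never interfere */
    static char *my_strtok_r(char *s, const char *delim, char **save)
    {
        char *end;

        if (s == NULL)
            s = *save;
        s += strspn(s, delim);           /* skip leading delimiters */
        if (*s == '\0') {
            *save = s;
            return NULL;                 /* no more tokens */
        }
        end = s + strcspn(s, delim);     /* find end of this token */
        if (*end != '\0')
            *end++ = '\0';               /* terminate, step past it */
        *save = end;
        return s;
    }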

Nov 15 '05 #133


In article <11*********************@f14g2000cwb.googlegroups. com>, ku****@wizard.net writes:

A library that I'm
currently responsible for, but which was designed and written by
someone else, contained a function which calls strtok(). A user of the
library called that function while his own code was in the middle of
using strtok(). Very confusing! I removed all use of strtok() from the
library completely, which wasn't difficult.


For that reason, when I'm writing a library, I generally try to avoid
as much as possible the functions that the standard prohibits the
implementation itself from (observably) calling ("The implementation
shall behave as if no library function calls the XXX function"):

getenv
localeconv
mblen
mbtowc
rand
setlocale
signal
srand
strerror
strtok
tmpnam
wctomb

(I think that's the whole C99 list.) Sometimes one of these (eg
getenv) is difficult to work around, even impossible for portable
code if that functionality is required by the specifications for
the library I'm writing; but as you noted regarding strtok, others
are easy to dispense with.

Of course this list isn't comprehensive - there are other library
functions with side effects, and many outside the standard - but
it's a start.

--
Michael Wojcik mi************@microfocus.com

Push up the bottom with your finger, it will puffy and makes stand up.
-- instructions for "swan" from an origami kit
Nov 15 '05 #135
En <news:11**********************@g43g2000cwa.googleg roups.com>,
we******@gmail.com va escriure:
Its one of the very few functions in C that is not reentrant.
I find the actual number of "few" to be far too high. For example, I can
count 12 occurrences of the "shall behave as if no library function calls"
moniker.
How about at least adding gcc's strtok_r()?


Last time I had a look at GCC there was no strtok_r() in the C compiler.
And a freestanding compiler is NOT the place I would look to find it.
Antoine

Nov 15 '05 #136
Flash Gordon <sp**@flash-gordon.me.uk> writes:
Wojtek Lerch wrote:
<we******@gmail.com> wrote in message
news:11**********************@z14g2000cwz.googlegr oups.com...
Wojtek Lerch wrote:

<we******@gmail.com> wrote: [...]> Integer
>overflow is also extremely well defined, and actually quite useful on
>2s complement machines (you can do a range check with a subtract and
>unsigned compare with one branch, rather than two branches.)

Extremely well defined by who? In standard C, it's undefined.

I said "on 2s completement machines". Please read all the words.

I know what you said. But in standard C, signed integer overflow is
undefined behaviour, no matter whether a machine is 2s complement or
not.


I would also point out that there is 2s complement hardware in use
today that can be told to *limit* on overflow instead of wrapping. I
don't know off the top of my head whether the C compiler I was using
can use the hardware in that way, but it is definitely possible and
could be extremely useful.


That raises another interesting point. Even if we could assume that
all implementations use two's-complement, mandating the usual
wraparound on overflow would preclude the possibility of a checking
implementation that treats overflow as an error.
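Such a checking implementation can also be mimicked in source; here is
a minimal sketch of the usual pre-test, which detects the overflow
instead of performing it:

    #include <limits.h>

    /* returns 1 and stores a+b in *sum, or returns 0 if the
       addition would overflow (without ever performing it) */
    static int checked_add(int a, int b, int *sum)
    {
        if ((b > 0 && a > INT_MAX - b) ||
            (b < 0 && a < INT_MIN - b))
            return 0;        /* would overflow; report the error */
        *sum = a + b;
        return 1;
    }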

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 15 '05 #137
we******@gmail.com wrote:
ku****@wizard.net wrote:
we******@gmail.com wrote:
ku****@wizard.net wrote: .... > No, it's not an immediate dismissal.

It is, and you simply continue to propagate it.
Unless you were in my office watching me when I read your message, and
used a stopwatch to time how long it took me to think about it, there's
no way you can know whether my dismissal was immediate or took a
considerable amount of time.


Oh I see, I'm all wrong because I used the word immediately in there
(referring to where in the post it was, and the fact that you don't
even acknowledge my position at all.)


I don't acknowledge that your position is valid, because I don't
consider that to be the case. I explained why. What more
acknowledgement than that did I owe to a position I consider to be
wrong?
[...] Your insistence that it was immediate,


There are many other words I used in there other than "immediate".
Interesting that you are obsessing over that one. So if I ask you if
you've stopped beating your wife with a broomstick, will you scream at
me for suggesting that you own a broomstick?


No, because there's nothing insulting in that part of the suggestion.
If I had said in a previous message that I didn't own a broomstick, and
if your follow up message had referred to "your broomstick" rather than
"a broomstick", then your follow-up would have implied that I was
lying. I would have objected to that. Objecting to the rest of the
accusation would have been a priority, as being far more serious, but I
would have eventually gotten around to objecting to the implicit
accusation that I was lying.
[...] There are safe paths of usage for C,


They are not obvious, and the mountains of CERT advisories suggest that
they generally are not travelled.


Well, the safe paths of usage for assembler are equally non-obvious, if
not more so, and the same can be said of any low-level language, which
is the point we've been trying to make to you. The mountains of CERT
advisories are at least equally a consequence of the popularity of C,
which means that there are more opportunities for errors to be made in
C.
[...] Even gets() is safe: the safe way to use it is "never".


Tell that to the ANSI/ISO C committee. According to their own
documentation they claim that gets() can be used under some unspecified
environmental assumptions.


I'm in perfect agreement with your objections to gets(). It's the
suggestion that a few changes to the library (specifically, the
incorporation of your alternative library) would be sufficient to make
it safe that is bizarre.
Note: by this definition, a knife containing sharp poison-soaked pins
all along its handle is safe, because there are ways to use it safely.
Personally, I'd recommend a different definition of "safe".


Really? Because I prefer the one that will ultimately lead to safer
code in real world production.


"Has safe paths of usage" doesn't meet that requirement, because the
language can be arbitrarily difficult to use safely, and still possess
safe paths of usage.
You were the one who claimed the existence of a false dichotomy.
Dichotomies are by definition not fuzzy, especially the false ones -
the word derives from a Greek word meaning "cut", referring to
making a sharp distinction between two different categories. The claims
that you were labelling "false dichotomies" are entirely consistent
with the fuzzy idea that higher-level languages are safer than
low-level languages.


Reread the definition. That's not what it means.


When I asked www.ask.com what the definition of the word "dichotomy"
was, the top ten web sites listed all used one or the other of the
following two definitions (with exactly identical wording!):

"Being twofold; a classification into two opposed parts or subclasses"

"Division into two; especially, the division of a class into two
subclasses opposed to each other by contradiction, as the division of
the term man into white and not white."

Note the use of the word "opposed", implying a sharp distinction with
no middle ground. The word "contradiction" reinforces that meaning. The
phrase "false dichotomy" directly addresses that sharpness; it says
that the distinction is not sharp, and that it's incorrect to treat it
as if it were.

.... And so who was arguing for perfect safety again? Please find the
applicable quote.
I've reviewed your previous posts, and I concede that you've not argued
for perfect safety; you merely seem unrealistically optimistic about
the possibilities for radical improvement in safety. Yes, improvement
is possible, and removing gets() is a small step in that direction; but
as long as programmers get angry at compiler vendors for failing to
support legacy code that uses gets(), removing it from the standard
won't have much real-world effect. Real, significant improvements in
safety will only be achieved by removing the ability to use
low-level C constructs. And then it won't be C any more.

.... ... Furthermore, severe errors are
*NOT* confined to low-level languages. ...
We never suggested that they were. We only pointed out that the
likelihood and severity of errors is higher in low-level languages.

....
I know almost nothing about Ada. But I guarantee you that in the
unlikely event that Ada is perfectly safe, .... ... Ada is fairly safe; ....
[...] its pointers can't be exact conceptual equivalents of C pointers.
Which, I gather from other poster's messages, is in fact the case.
You mean Ada is not C? That is correct.
Well, I was being more general than that, but the more specific
statement you're attributing to me can be interpreted in a way that
makes it a special case of the more general statement I actually made.

The fact that Ada's pointers are quite different than C's pointers is
part of the reason it's possible for Ada to be safer. Any language that
was radically safer than C couldn't look sufficiently similar to C to
justify inheriting the name.

.... I'm sure it does to you, if you've bought into the silly notion that C
is a fast language.
Until you've identified a truly faster high-level language, I'll stick
with that notion. It's unambiguously faster than any of the languages
that are suitable for use on my current project; your options may be
different than mine.
Also,
if you take into account that assembly doesn't specify intrinsically
unsafe usages of buffers


Neither does C.


Excuse me, but gets() is in the C specification, and *MUST* lead to
unsafe usage of buffers.


Only if it's actually used. I'm unaware of any part of the C standard
that specifies that you must use gets(). If we're using your "safe
usage path" criterion for safety, the fact that C provides safe
alternatives to gets() is sufficient to ensure that C is safe.
... Most C string functions require implicit
assumptions about buffer lengths that are unchecked by any mechanism.
C includes necessarily non-reentrant functions like strtok().
C standard libraries must contain strtok(). To the best of my
knowledge, the C standard doesn't require your program to actually call
strtok(). By your safety criteria, the fact that there is a safe usage
path for C (one that avoids calling strtok()) is sufficient to make C
safe. It doesn't make strtok() safe, but then I never claimed that it
was. C is safe, by your "safe usage path" criterion; by saner criteria,
it's a fairly dangerous language.
... Assembly
presents you with no such weaknesses in its baseline specification.


Calling gets() and strtok() in C is a misuse of C; writing code that
performs the equivalent functionality is possible in every assembly
language that I'm familiar with, and constitutes a misuse of assembly.
Of course, my familiarity with assembler is limited to three different
assembly languages for three radically different platforms. It could
very well be that in most modern assembly languages, it's impossible to
write gets() or strtok() - but I'd be very curious to see how that was
achieved.
[...] Like assembler, it allows intrinsically unsafe usage.
Like assembler, C doesn't require unsafe usage.


Unlike C, assembler does not *promote* unsafe usage. You can only
program "unsafely" in assembler if you create the unsafe scenario,
specification, and semantics from the ground up.


Neither does the C language. I don't remember any part of the C
standard that promotes the use of gets() and strtok(); they're merely
presented as things you can do. The fact that it's impossible, and
difficult, respectively, to avoid undefined behavior when calling those
routines is something the standard fails to warn you about, but the
standard is not about telling you the right way to program; that's
what textbooks are for.

Nov 15 '05 #138
Randy Howard wrote:
we******@gmail.com wrote
(in article
<11**********************@o13g2000cwo.googlegroups .com>): ....
But I'm not arguing that either. I am saying C is to a large degree
just capriciously and unnecessarily unsafe (and slow, and powerless,
and unportable etc., etc).

.... Unportable? You have got to be kidding. I must be
hallucinating when I see my C source compiled and executing on
Windows, Linux, NetWare, OS X, Solaris, *bsd, and a host of
other UNIX-like platforms, on x86, x86-64, PPC, Sparc, etc.


I've been thinking about this, and there's at least two very different
concepts of portability that might be relevant here, and websnarf is
probably using a different one than you and I.

Most C code is unportable, for one reason or another, and I think
that's what websnarf is thinking of. However, paradoxically, that fact
is directly related to the fact that C is one of the best languages
available for writing code that needs to be portable, which is what I
was thinking of (and I presume you, as well).

The C standard specifies that the behavior of a great many programs is
either implementation-defined or undefined, and often doesn't specify
the behavior at all. The implementation-defined behavior can be chosen
in a manner that's optimal for each implementation. The fact that
construct X makes the behavior undefined allows a vendor to implement
the behavior of other constructs without having to worry about the
possibility that construct X might exist, which allows for more
efficient implementation. If the fact that the behavior of X was
undefined were a random occurrence, it would be pretty unlikely for it
to allow a significant performance improvement. However, it's not
random: in many cases the decision was made to make the behavior
undefined, for the express purpose of allowing more efficient
implementation. Finally, undefined and implementation-defined behavior
is the basis for implementation-specific extensions to C that make the
implementation more efficient and useful to those who want or need to
be able to write unportable C code. "Wanting" is a lot more common than
actual "needing", but there are legitimate reasons for needing to write
unportable code.
The net result is that it's possible to produce a conforming
implementation of C that is efficient and useful enough to be
profitable, on a wider variety of platforms than would be possible if
the C standard imposed stricter requirements. As a result, you can
count on the presence of an implementation that will accept, translate,
and correctly execute your code on a wider variety of platforms than
most other languages.

Of course, that's only true if you're careful enough to avoid writing
unportable C code. I'll concede that it's tricky to write
widely-portable C code, but it's certainly not impossible, or even
unacceptably difficult, to do so.

Nov 15 '05 #139
we******@gmail.com wrote:
... They are basically
demanding platform-specific support to make this function safe --
No. If you want to understand what was written, you need to
suppress your preconceptions.
Uh ... excuse me, but dividing by zero has well defined meaning in IEEE
754, and there's nothing intrinsically wrong with it ...


Obviously the fellow meant it as an integer operation.
Anyway, there *is* something intrinsically wrong with it,
or more specifically with trying to integrate infinite
"values" into the real number system.
Nov 15 '05 #140
"Douglas A. Gwyn" <DA****@null.net> writes:
Anyway, there *is* something intrinsically wrong with it,
or more specifically with trying to integrate infinite
"values" into the real number system.


Why? Its integration into mathematical logic (and analysis) certainly made
a lot of reasoning and proofs easier. C.f. "Nonstandard Analysis" at e.g.
http://mathworld.wolfram.com/NonstandardAnalysis.html or
http://en.wikipedia.org/wiki/Nonstandard_analysis

But "nonstandard C" (in the above sense) wouldn't really work, I guess.

Bye, Dragan

--
Dragan Cvetkovic,

To be or not to be is true. G. Boole No it isn't. L. E. J. Brouwer

!!! Sender/From address is bogus. Use reply-to one !!!
Nov 15 '05 #141
we******@gmail.com wrote:
Remember IEEE 754 is a specification as well, and unlike the C
specificiation, its generally completely adhered to, and is not
littered with UB.


IEEE 754 was hardly "completely adhered to", and it fails
to specify some things (such as NaN encoding). Also it
was meant from the outset as a spec to be applied to new
implementations of floating-point support, not to cover
existing implementations. Much of C's undefined behavior
is intentional in order to support a wide variety of ISAs
that the C standard is not in a position to constrain.
Nov 15 '05 #142
we******@gmail.com wrote:
Its well known that Fortran beats C for numerical applications.
Lots of things are "well known" without being true.
if you take into account that assembly doesn't specify intrinsically
unsafe usages of buffers (like including a gets() function) you could
consider assembly safer ... than C.
Only somebody who is blinded by preconceptions could make
such a ludicrous claim.
But that's all beside the point. I modify my own C usage to beat its
performance by many times on a regular basis (dropping to assembly,
making 2s complement assumptions, unsafe casts between integer types
and pointers etc), and obviously use safe libraries (for strings,
vectors, hashes, an enhanced heap, and so on) that are well beyond the
safety features of C. In all these cases some simple modifications to
the C standard and C library would make my modifications basically
irrelevant.
Because there would be no C implementations except on the
specific platforms to which you limited your code, and
you would force all implementors to provide you with the
specific additional libraries you use. There are good
reasons why the C standards committee doesn't want to do
that.
Making code portable in C requires a lot of discipline, and in truth a
lot of testing (especially on numerics; it's just a lot harder than you
might think). It's discipline that in the real world basically nobody
has.


The fellow you were responding to already provided one
example of a person with the requisite discipline to use
C safely and effectively. There are many others.

Comparable discipline is needed for any robust software
development process, regardless of the tools used. You
can get away to some extent with relying on tools to
catch some of your sloppiness, but they won't be able to
tell you that you meant x0, not x1 or y0 or whatever you
wrote. Nor is it wise for general-purpose compilers to
try to act as model checkers, etc.
Nov 15 '05 #143
ku****@wizard.net wrote:
... I've heard claims that Java is more widely portable, ...


It achieves that by specifying not just the language but
also the data-type representations and object code format.
The down side is that on almost every platform, that
requires simulation of the Java machine architecture.
("Just-in-time" compiling reduces the impact of that.)
C on the other hand was always intended to map directly
onto native hardware operations, such as twiddling bits
in memory-mapped device control registers.
Nov 15 '05 #144
ku****@wizard.net wrote
(in article
<11*********************@g44g2000cwa.googlegroups. com>):
Randy Howard wrote:
we******@gmail.com wrote
(in article
<11**********************@o13g2000cwo.googlegroups .com>): ...
But I'm not arguing that either. I am saying C is to a large degree
just capriciously and unnecessarily unsafe (and slow, and powerless,
and unportable etc., etc).

...
Unportable? You have got to be kidding. I must be
hallucinating when I see my C source compiled and executing on
Windows, Linux, NetWare, OS X, Solaris, *bsd, and a host of
other UNIX-like platforms, on x86, x86-64, PPC, Sparc, etc.


I've been thinking about this, and there's at least two very different
concepts of portability that might be relevant here, and websnarf is
probably using a different one than you and I.

Most C code is unportable, for one reason or another, and I think
that's what websnarf is thinking of.


Hmmm. If it /is/ C code, then it should be portable. If it is
/sort of/ C code, using a lot of platform extensions that aren't
standard C, then it probably isn't portable, but it can be made
to be portable, with a small amount of extra effort in many
cases.
However, paradoxically, that fact
is directly related to the fact that C is one of the best languages
available for writing code that needs to be portable, which is what I
was thinking of (and I presume you, as well).
Yes, it is. I can't think of any processor-neutral language
available on even a fraction of all the systems for which C is
available.

You do not have to write code that is not portable, although you
may certainly do so. In some cases, it is required. Even then
though, it is possible to use conditional compilation to
generate software that uses platform-specific extensions in part
of the overall source tree yet keeps those exceptions isolated
so that the majority is portable, and adding a new platform to
one of the system-specific modules is minimally painful.
The net result is that it's possible to produce a conforming
implementation of C that is efficient and useful enough to be
profitable, on a wider variety of platforms than would be possible if
the C standard imposed stricter requirements.
Yes.
As a result, you can
count on the presence of an implementation that will accept, translate,
and correctly execute your code on a wider variety of platforms than
most other languages. Of course, that's only true if you're careful enough to avoid writing
unportable C code. I'll concede that it's tricky to write
widely-portable C code, but it's certainly not impossible, or even
unacceptably difficult, to do so.


I agree completely. The problem is, those that only have access
to one, or a few platforms to develop on often have to learn how
the 'hard way', as nothing teaches it faster than the experience
of actually getting your code to compile and work properly on
multiple platforms with fundamental differences in architecture.

For example, anyone that develops primarily on Intel or AMD
based hardware should consider buying a $499 Mac Mini (while
Apple is still shipping PPC) if they think they might have
big-/little-endian issues in their code. It comes with
development tools (including gcc) and from a terminal session
looks pretty much the same as your favorite linux or freebsd
box. If you really care about such things, I can't think of a
more cost-effective way to get a test bed in place to help you
verify the code.

--
Randy Howard (2reply remove FOOBAR)

Nov 15 '05 #145
Randy Howard wrote:
ku****@wizard.net wrote
(in article
<11*********************@g44g2000cwa.googlegroups. com>): ....
Most C code is unportable, for one reason or another, and I think
that's what websnarf is thinking of.


Hmmm. If it /is/ C code, then it should be portable. If it is


Only if it's strictly conforming C code, which describes a vanishingly
small portion of real C code. Using any standard of conformance less
strict than 'strictly conforming' means that there are platforms it
might not be portable to.

For instance, most of my C code either directly or indirectly calls
many functions that are defined in third-party or system libraries with
names that fall within the namespace reserved for users. Many of those
functions aren't written in strictly conforming C (they are often
written in some other language entirely). That's sufficient to render
my programs non-portable. That's OK because our contract with the
client makes use of those libraries mandatory. Code that's not strictly
conforming for reasons like this one is the norm, not the exception.

.... You do not have to write code that is not portable, although you
may certainly do so. In some cases, it is required. Even then
though, it is possible to use conditional compilation to
generate software that uses platform-specific extensions in part
of the overall source tree yet keeps those exceptions isolated
so that the majority is portable, and adding a new platform to
one of the system-specific modules is minimally painful.


It is possible to use conditional compilation based upon the various
macros defined by the C standard as having implementation-specific
values, to achieve portability. However, if your conditions are
dependent on anything other than those macros, you've probably got a
problem. For instance, the WonderC compiler might specify that it
automatically #defines __WONDERC, and that you should test for this
macro before making use of WonderC extensions. However, there's nothing
in the C standard that prevents the AwfulC compiler from also #defining
__WONDERC, even if it doesn't provide support for WonderC's extensions.
AwfulC could be a perfectly conforming implementation of C, and it
would still be permitted to reject that code, which means that the code
is not strictly conforming. As a practical matter, conditional
compilation based upon such macros is a powerful and effective
technique for making use of implementation-specific features, but
technically it's not guaranteed to work.
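To make that concrete, here is how such a macro would typically be
used; __WONDERC and wonder_fast_copy() are the made-up names from the
example above, not real interfaces:

    #include <string.h>

    void copy_block(void *dst, const void *src, size_t n)
    {
    #ifdef __WONDERC
        wonder_fast_copy(dst, src, n);   /* hypothetical extension */
    #else
        memcpy(dst, src, n);             /* strictly conforming fallback */
    #endif
    }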

Nov 15 '05 #146
Dragan Cvetkovic wrote:
"Douglas A. Gwyn" <DA****@null.net> writes:
Anyway, there *is* something intrinsically wrong with it,
or more specifically with trying to integrate infinite
"values" into the real number system.

Why? Its integration into mathematical logic (and analysis) certainly made
a lot of reasoning and proofs easier. ...


It doesn't share in most of the algebraic properties,
sometimes its sign matters and sometimes not, etc. etc.
The latter is even worse in the complex realm where
direction often matters.
Nov 15 '05 #147
Randy Howard wrote:
Hmmm. If it /is/ C code, then it should be portable. If it is
/sort of/ C code, using a lot of platform extensions that aren't
standard C, then it probably isn't portable, but it can be made
to be portable, with a small amount of extra effort in many
cases.


Extensions aren't so much the problem as embedding
platform-specific assumptions into the code. It is
good to adopt a general policy of avoiding such
dependencies, but sometimes they are truly necessary.
Nov 15 '05 #148
>Chris Torek wrote:
I am reminded of a line from a novel and movie:
"*We* fix the blame. *They* fix the problem. Their way's better."

In article <11*********************@g14g2000cwa.googlegroups. com>
<we******@gmail.com> wrote:
So in this case, how do we "fix" the problem of buffer overflows in C
programs?
I do not know whether this is even possible, much less how "we"
(whoever "we" are in this case) could go about it. I do think that
too many programmers turn to writing C code too soon; languages
that may be "slower" but are "higher level" tend to move the problem
from "dumb" things like buffer overflows (thus eventually executing
arbitrary code) to "smarter" things like allowing people access to
files that were intended to be privileged (thus eventually executing
arbitrary code).

I will, however, note that when I did the 4.2BSD stdio, I attempted
to leave gets() out of the library, or provide one that was annoying
to the programmer; but I was overruled. (I still think this would
have been an appropriate solution. If someone wanted to call gets()
instead of changing the source, it is not that hard to compile with
"cc ... -lgets" -- and it identifies such programs, for later
cleanup. With the BSD makefiles, they would define "LDADD" macros
that included "-lgets".)

[on coroutines]
In any event, I have *studied* coroutines in university, and can
revisit them in far more mainstream languages like Python and Lua. My
understanding of them is not the point (I think I do understand them --
or at least the "one-shot continuation" kind); my point is they should
probably be part of the C standard.


They make a substantial change to the language, in that one must
"wrap up" function state that is, in typical implementations,
destroyed simply by the act of returning from the function. In
particular, one must abandon the idea of a simple linear stack on
which function activation-records are stored. I doubt any one
single entity, no matter how politically or economically powerful,
could push such a change through a future C standard. Perhaps a
group of entities (forming a voting bloc) could do this.

I suspect you would have better luck writing a new language.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 15 '05 #149
Chris Torek wrote:
<we******@gmail.com> wrote:
In any event, I have *studied* coroutines in university, and can
revisit them in far more mainstream languages like Python and Lua. My
understanding of them is not the point (I think I do understand them --
or at least the "one-shot continuation" kind); my point is they should
probably be part of the C standard.

They make a substantial change to the language, in that one must
"wrap up" function state that is, in typical implementations,
destroyed simply by the act of returning from the function. In
particular, one must abandon the idea of a simple linear stack on
which function activation-records are stored. I doubt any one
single entity, no matter how politically or economically powerful,
could push such a change through a future C standard. Perhaps a
group of entities (forming a voting bloc) could do this.


As you say, linguistic support would involve substantial changes,
for relatively minuscule benefit. I'd suggest instead, devising
a platform-independent API for a coroutine library, implementation
of which would necessarily require platform-dependent code if the
coroutines themselves are to be coded as extended C functions.
If such a library worked out well enough, it could be a candidate
for standardization, or at any rate it would be likely to be
widely distributed as are many libraries these days.
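As a taste of what can be faked without either linguistic support or
platform-dependent code, here is a minimal sketch of the well-known
switch-based trick; all state must live in statics (or a passed-in
struct), precisely because the stack frame is not preserved across the
simulated yield:

    #include <stdio.h>

    /* yields 0, 1, 2 on successive calls, then -1; the switch
       re-enters the function at the last "yield" point */
    static int next_value(void)
    {
        static int state = 0;
        static int i;            /* must be static: the frame dies */

        switch (state) {
        case 0:
            for (i = 0; i < 3; i++) {
                state = 1;
                return i;        /* simulated yield */
        case 1:;                 /* execution resumes here */
            }
        }
        state = 0;
        return -1;               /* sequence exhausted */
    }

    int main(void)
    {
        int v;
        while ((v = next_value()) >= 0)
            printf("%d\n", v);
        return 0;
    }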
Nov 15 '05 #150
