Bytes | Developer Community

Draft Secure C

http://www.open-std.org/jtc1/sc22/wg14/
http://www.open-std.org/jtc1/sc22/wg...docs/n1135.pdf

Has anyone gone through this?
Is this useful? Will it make it to the next standard?

Jan 12 '07 #1
68 replies, 2359 views
Jack wrote:
http://www.open-std.org/jtc1/sc22/wg14/
http://www.open-std.org/jtc1/sc22/wg...docs/n1135.pdf

Has anyone gone through this?
Yes.
Is this useful?
Not in my opinion, but others differ.
Will it make it to the next standard?
Let's hope not.

Robert Gamble

Jan 12 '07 #2
Robert Gamble said:
Jack wrote:
>http://www.open-std.org/jtc1/sc22/wg14/
http://www.open-std.org/jtc1/sc22/wg...docs/n1135.pdf

Has anyone gone through this?

Yes.
>Is this useful?

Not in my opinion, but others differ.
>Will it make it to the next standard?

Let's hope not.
Well, let's at least hope that we don't start designing a new Standard until
C99 has been widely implemented. What would be the *point*?

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.
Jan 12 '07 #3
Richard Heathfield wrote:
Robert Gamble said:
Jack wrote:
http://www.open-std.org/jtc1/sc22/wg14/
http://www.open-std.org/jtc1/sc22/wg...docs/n1135.pdf

Has anyone gone through this?
Yes.
Is this useful?
Not in my opinion, but others differ.
Will it make it to the next standard?
Let's hope not.

Well, let's at least hope that we don't start designing a new Standard until
C99 has been widely implemented. What would be the *point*?
I think that a new Standard, with carefully thought-out features that
reflect what the community values most, may actually serve to speed
adoption of the rest of the C99 features. If a new
version provided functionality that the majority of the community could
get excited about and rally behind, there is little doubt that
implementors would move at a much faster pace to implement it. Such a
venture would need to be extremely careful not to get bogged down in
the kinds of drastic changes, "special interests", and (what some
consider) unnecessary bloat that C99 brought along with it, lest it
further jeopardize the relevance of the Standard; but success may be
the only thing that redeems it. TR 24731 qualifies as at least
"special interest" and bloat.

Robert Gamble

Jan 13 '07 #4
Jack wrote:
>
http://www.open-std.org/jtc1/sc22/wg14/
http://www.open-std.org/jtc1/sc22/wg...docs/n1135.pdf

Has anyone gone through this?
Is this useful? Will it make it to the next standard?
It originated at Microsoft. Nuff said.

--
"I was born lazy. I am no lazier now than I was forty years
ago, but that is because I reached the limit forty years ago.
You can't go beyond possibility." -- Mark Twain
Jan 13 '07 #5
Jack wrote:
http://www.open-std.org/jtc1/sc22/wg14/
http://www.open-std.org/jtc1/sc22/wg...docs/n1135.pdf

Has anyone gone through this?
Is this useful? Will it make it to the next standard?
The lcc-win32 compiler system implements more than
50% of that proposal. Only the wide character stuff
remains to be implemented.

Even if it is not a solution to the bugs in the language,
it is a big step forward. I have discussed several
of the questions raised in that proposal, and in other
related ones, in this group (where they were greeted with
the usual remarks from the "regulars") and in comp.std.c.

Specifically, the proposal is weak concerning its solution
to the zero-terminated-strings problem. It gives some extra
security without eliminating the problem at the root.

The problem is a bad data structure, and that is the
error that should be corrected.
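For readers who haven't opened the PDF: the core of the proposal is a family of bounds-checked library variants (strcpy_s, strcat_s, and friends) that take the destination size explicitly. A minimal sketch of the idea, using a hypothetical stand-in name since not every compiler ships the TR functions:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for TR 24731's strcpy_s(dest, destmax, src).
 * The real function also invokes a runtime-constraint handler on
 * failure; this sketch just returns nonzero and empties dest. */
static int my_strcpy_s(char *dest, size_t destmax, const char *src)
{
    if (dest == NULL || destmax == 0)
        return -1;
    if (src == NULL) {
        dest[0] = '\0';
        return -1;
    }
    size_t n = strlen(src);
    if (n >= destmax) {          /* source would not fit, '\0' included */
        dest[0] = '\0';
        return -1;
    }
    memcpy(dest, src, n + 1);
    return 0;
}
```

Unlike strcpy, the call fails loudly instead of overrunning the buffer; jacob's objection above is that the length still has to be rediscovered by strlen on every call, which is the zero-terminated data structure showing through.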

jacob

---
lcc-win32: a compiler system for windows
www.cs.virginia.edu/~lcc-win32
Jan 13 '07 #6
Robert Gamble wrote:
Richard Heathfield wrote:
Robert Gamble said:
Jack wrote:
>http://www.open-std.org/jtc1/sc22/wg14/
>http://www.open-std.org/jtc1/sc22/wg...docs/n1135.pdf
>>
>Has anyone gone through this?
>
Yes.
>
>Is this useful?
>
Not in my opinion, but others differ.
>
>Will it make it to the next standard?
>
Let's hope not.
Well, let's at least hope that we don't start designing a new Standard until
C99 has been widely implemented. What would be the *point*?

I think that a new Standard with carefully thought-out features which
reflect what the community values most may actually serve to increase
adoption of the rest of the C99 features faster than not. If a new
version provided functionality that the majority of the community could
get excited about and rally behind there is little doubt that
implementors would move at a much faster pace to implement it.
Absolutely correct. The standards committee may or may not realize
this, but one thing is for sure: they do *NOT* realize that they
themselves are the reason this has not happened.
[...] Such a
venture would need to be extremely careful not to get bogged down with
the kinds of drastic changes, "special interests", and (what some
consider) unnecessary bloat that C99 brought along with it lest it
further jeopardize the relevance of the Standard, but success may be
the only thing that redeems it.
The C standard committee does not see the value of the language they
have ownership over; they don't see the problems in the industry, and they are
completely blind to the problems of the C language. There is a long
list of highly desirable features that the C language is crying out for
(no, not operator overloading -- I mean actual *functionality*) that
nobody is even thinking about.
[...] TR 24731 qualifies for at least "special interest" and bloat.
Huh? TR 24731 qualifies as *PLACEBO*. It does *NOT* accomplish what
it claims to set out to do. Because of the RSIZE_MAX fiasco, the
standard just continues to create portability problems. The thing is
just utter nonsense, is what it is. The Robert Seacord proposal is a
little better (claimed to be targeted at something different) but
*WAY* too slow. (It was suggested by someone that I propose Bstrlib
to the C standard, however mandatory availability of the source is
considered one of its features.)

The *intended* scope of TR 24731 is not what I would call special
interest. Certainly any text-processing program should be using some
kind of alternate string functionality rather than the C library's C
strings. It's just that TR 24731 is hardly any better.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Jan 13 '07 #7
we******@gmail.com wrote:
The C standard committee does not see the value of the language they
have ownership over, they don't see problems in the industry, and are
completely blind to the problems of the C language. There is a long
list of highly desirable features that the C language is crying out for
(no, not operator overloading -- I mean actual *functionality*) that
nobody is even thinking about.
Can you specify?
Jan 13 '07 #8
jacob navia <ja***@jacob.remcomp.fr> writes:
[...]
Specifically the proposal is weak concerning the solution
for the zero terminated strings problems. It gives some extra
security without eliminating the problem at the root.

The problem is a bad data structure, and that is the
error that should be corrected.
Zero-terminated strings are not themselves a problem. They have both
advantages and disadvantages. And it's not possible to eliminate them
entirely from the C language, at least not without breaking existing
code. (Requiring *any* source-level modification to make a program
work under a new standard constitutes "breaking" existing code.)
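The trade-off can be made concrete. A counted string (the record below is illustrative, not from any proposal) stores its length, so length queries are O(1) and embedded zero bytes are representable; the zero-terminated form needs no extra field but must scan:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative counted-string record (not from any proposal). */
struct cstr {
    size_t      len;   /* stored length: O(1) to query */
    const char *data;  /* may even contain embedded '\0' bytes */
};

static size_t cstr_len(const struct cstr *s)
{
    return s->len;                 /* constant time */
}

static size_t zts_len(const char *s)
{
    return strlen(s);              /* scans to the terminator: O(n) */
}
```

The advantages Keith alludes to run the other way: a char * is a single word, tail-sharing (p + 1) is free, and every existing interface already speaks it.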

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jan 13 '07 #9
Keith Thompson wrote:
jacob navia <ja***@jacob.remcomp.fr> writes:
[...]
>>Specifically the proposal is weak concerning the solution
for the zero terminated strings problems. It gives some extra
security without eliminating the problem at the root.

The problem is a bad data structure, and that is the
error that should be corrected.


Zero-terminated strings are not themselves a problem. They have both
advantages and disadvantages. And it's not possible to eliminate them
entirely from the C language, at least not without breaking existing
code. (Requiring *any* source-level modification to make a program
work under a new standard constitutes "breaking" existing code.)
Nobody wants to eliminate them in one
sweep. But an alternative could exist that makes
their usage obsolete. Then, after 10-20 years,
they are phased out.

That's all
Jan 13 '07 #10
jacob navia <ja***@jacob.remcomp.fr> writes:
Keith Thompson wrote:
>jacob navia <ja***@jacob.remcomp.fr> writes:
[...]
>>>Specifically the proposal is weak concerning the solution
for the zero terminated strings problems. It gives some extra
security without eliminating the problem at the root.

The problem is a bad data structure, and that is the
error that should be corrected.
Zero-terminated strings are not themselves a problem. They have both
advantages and disadvantages. And it's not possible to eliminate them
entirely from the C language, at least not without breaking existing
code. (Requiring *any* source-level modification to make a program
work under a new standard constitutes "breaking" existing code.)
Nobody wants to eliminate them in one
sweep. But an alternative could exist, that makes
their usage obsolete. Then, after 10-20 years
they are phased out.

That's all
And we lose any *advantages* that zero-terminated strings might have
over counted strings.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jan 13 '07 #11
jacob navia wrote:
we******@gmail.com wrote:
The C standard committee does not see the value of the language they
have ownership over, they don't see problems in the industry, and are
completely blind to the problems of the C language. There is a long
list of highly desirable features that the C language is crying out for
(no, not operator overloading -- I mean actual *functionality*) that
nobody is even thinking about.

Can you specify?
1) A better mapping to functionality commonly available in hardware.
I.e., high-word or widening multiply. Bit scan operations (i.e., fast
lg2, and scan for lowest.). Bit count. Endian swap. etc. Just look
through the instruction sets of the popular CPUs and see which useful
instructions are practically accessible through translation of C source
and what is not. Then add library functions that perform those and
allow the compiler vendor to support them however they wish. In this
way the most optimal paths for a given piece of hardware will be
available without resorting to non-portable inline assembly code. (And
for those platforms that do not support a given function, it
can still be emulated, which is equivalent to how such functionality
is portably delivered today.)

Just look through GMP or any similar multi-precision library. Yes it
has a portable pure C back-end, but it is utter nonsense and is only
invoked on compilers people have never heard of. But on those
compilers it is *AT LEAST* 4 times slower than the likely potential for
that hardware precisely because the language has no access to the
fastest, most functional instructions that are commonly available on
most hardware.
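To make point 1 concrete, here is what two such library functions might look like in their portable-fallback form; the names are invented for illustration. An implementation could map them straight to POPCNT/CTPOP and BSWAP-style instructions where the hardware has them:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names, not from any standard or proposal. */

/* Population count: Kernighan's loop clears one set bit per pass. */
static unsigned popcount32(uint32_t x)
{
    unsigned n = 0;
    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Endian swap of a 32-bit value. */
static uint32_t bswap32(uint32_t x)
{
    return (x >> 24)
         | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u)
         | (uint32_t)(x << 24);
}
```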

2) Better control flow functionality: Coroutines and alloca(). The
variable-length array nonsense in C99 probably seemed like a good,
cleaner substitute for alloca() at the time, but gcc has clearly shown
that it's actually *harder* to implement portably, in their
compiler at least.

The whole setjmp/longjmp business is precisely that: nonsense. Personally I
have never gotten it to work correctly, and I have a hard time even
approaching debugging it -- I have always just found a way around
using them instead. However, it turns out that coroutines are fairly
straightforward to implement in assembly. And they incur very little
overhead. It requires some method of allocating a new "call stack".
Although this sounds like gross exposure of platform details, if it is
abstracted correctly, it is actually a big benefit. Currently a
program cannot, within its own source code, assert a requirement for a
minimum of stack availability. So programs that use deep recursion can
crash easily. Besides that, coroutines represent truly new
functionality in the language that is absolutely not duplicable
by other means that can be considered scalable.
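The closest today's portable C gets to a coroutine is the switch-based resumption trick (the idiom behind Simon Tatham's coroutine macros and Adam Dunkels' protothreads). A sketch of a resumable generator, which also shows why first-class support would be welcome:

```c
#include <assert.h>

/* A function that "yields" 1, 2, 3 across successive calls, then -1.
 * State lives in statics, so there is exactly one instance -- the
 * limitation real coroutines (with their own stacks) would remove. */
static int next_value(void)
{
    static int state = 0;
    static int i;

    switch (state) {
    case 0:
        for (i = 1; i <= 3; i++) {
            state = 1;
            return i;           /* "yield" i */
    case 1:;                    /* next call resumes here, mid-loop */
        }
    }
    state = 0;                  /* exhausted; reset for reuse */
    return -1;
}
```

Placing a case label inside the loop body is legal C; the switch simply jumps back to the point after the last "yield".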

3) Better heap functionality. The only access we have to the heap is
through malloc/realloc/calloc/free. It's just not enough at all. The
thing is, even the best scalable implementations of those functions
have an amortized cost of around 50+ clocks per
allocation. That means memory allocation is something that you
necessarily try to push out of your inner loops. That also means that
throwing more functionality and more overhead into those calls will
not affect the real-world performance of applications that care about
performance, since such calls will not be sitting in inner loops at all.

Now you can see, here in CLC itself, that there is a great need for
heaps with built-in debugging help. Simple functions that tell you the
total amount of allocated heap, or the size of a given allocation. C
is not a language that can implement garbage collection easily, but it
is very often the case that you can implement very close stand-ins that
achieve the same level of leak safety. In particular you can implement
separate heaps, and include a "freeall()" function that tears down
an entire heap at once rather than performing individual frees -- which
is often a lot of meaningless expense.

Another useful thing to have is an isFromAllocatedMemory() function.
The idea is that the function would be well defined for void *
pointers to static memory, automatic memory, memory
allocated from the heap, and NULL. In this way you could determine
whether or not a pointer came from the heap. In practice, this would
be used primarily for debugging, but it could also be used by functions
that perform "automatic freeing" of structures passed to them when they
are done, but do *NOT* perform a free() if the structure
came from static memory.

4) A better preprocessor. Basically you want to be able to directly
compete with Lisp's lambda functionality, and just make automatic code
generation more plausible in C. To be fully general, I would recommend
modelling it after the language Lua (since it's such a small language,
but fully general) in terms of functionality. This kind of thing would
greatly assist things like generic programming, but also provide ways
for programmers to perform certain optimization techniques, like
"constant propagation", with relative ease.
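For contrast, the most the current preprocessor can offer toward that goal is textual stamping, one expansion per type, with no loops, no evaluation, and no real lambda:

```c
#include <assert.h>

/* Token-pasting is as close as C89/C99 macros get to generics. */
#define DEFINE_MAX(T) \
    static T max_##T(T a, T b) { return a > b ? a : b; }

DEFINE_MAX(int)     /* expands to: static int max_int(int a, int b) ... */
DEFINE_MAX(double)  /* expands to: static double max_double(...) ...    */
```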

5) A universal 64-bit time standard. Dave Tribble has posted his idea
and a library that explains what he is after. The point is to create a
universal time standard that is guaranteed not to barf in 2038 and that
lets you correctly calculate time differences in a universal way,
without being tied to the current local time (and without daylight saving
messing you up). I have not looked too deeply, but obviously such a
standard would also need to include accurate "sub-second" real-time
timer functionality, which is currently not available (clock() delivers
processor tick time on UNIX systems, which is a different kind of
animal.)
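The essence of such a proposal is nothing more exotic than a signed 64-bit count of seconds since a fixed epoch, so differences are plain subtraction and nothing wraps in 2038. The type and constant names below are illustrative, not Tribble's:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative 2038-safe time type: signed 64-bit seconds since
 * 1970-01-01T00:00:00Z. Range is roughly +/- 292 billion years. */
typedef int64_t utime64_t;

/* One second past the point where a signed 32-bit time_t wraps
 * (2038-01-19T03:14:08Z). */
#define PAST_Y2038 ((utime64_t)2147483648LL)

static utime64_t utime_diff(utime64_t later, utime64_t earlier)
{
    return later - earlier;     /* well defined across the 2038 boundary */
}
```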

The real point behind this list of features, is that they represent
true enhancements to the C language. They are enhancements that really
put the question to other languages like Java, Python, etc. Combined,
this set of features would enhance the performance, and rewiden the gap
between C and Java (or C#) for example, in ways that those two
languages could not easily catch up. These features would push the
frontiers of C precisely where other languages do not dare tread, or
else already have another approach they are locked into.

Today I can simply say to you, that if I want to use coroutines, I
cannot use the C language and am better off going to another language.
If I want multiprecision mathematics, I know that Python will deliver
it to me, and that I cannot really get a good C implementation unless
it is GMP (which is not thread-safe, BTW) or non-portable hand coded
assembly language. If I want better heap control, then I roll my own
non-portable solution, and go ahead and violate ANSI-strictness by
overriding malloc/free/realloc/calloc. On occasion I do write
auto-code generators -- but these days, I do it in Python, not the C
pre-processor or even C. And of course, I still do the occasional
in-line assembly code where I just don't otherwise have access to
certain instructions.

Strings, I don't care as much about, of course, because I have written
a portable library that anyone can use that completely solves the
problems using today's C compilers. So enhancements to C's string
functions at the standards level are actually completely useless to me.
It just goes to show how misguided the ANSI C committee is that it is
entertaining nonsense like TR 24731, which is both bad (or at best
benign) and not really necessary at the standards level.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Jan 14 '07 #12
Keith Thompson wrote:
jacob navia <ja***@jacob.remcomp.fr> writes:
Keith Thompson wrote:
jacob navia <ja***@jacob.remcomp.fr> writes:
[...]
Specifically the proposal is weak concerning the solution
for the zero terminated strings problems. It gives some extra
security without eliminating the problem at the root.

The problem is a bad data structure, and that is the
error that should be corrected.
Zero-terminated strings are not themselves a problem. They have both
advantages and disadvantages. And it's not possible to eliminate them
entirely from the C language, at least not without breaking existing
code. (Requiring *any* source-level modification to make a program
work under a new standard constitutes "breaking" existing code.)
Nobody wants to eliminate them in one
sweep. But an alternative could exist, that makes
their usage obsolete. Then, after 10-20 years
they are phased out.

That's all

And we lose any *advantages* that zero-terminated strings might have
over counted strings.
You can, of course, come up with a single example of such an
"advantage" (that applies to the 10-20 year time frame Jacob was
talking about)?

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Jan 14 '07 #13
In article <11********************@11g2000cwr.googlegroups.com>,
<we******@gmail.com> wrote:
>jacob navia wrote:
>we******@gmail.com wrote:
The C standard committee does not see the value of the language they
have ownership over, they don't see problems in the industry, and are
completely blind to the problems of the C language. There is a long
list of highly desirable features that the C language is crying out for
(no, not operator overloading -- I mean actual *functionality*) that
nobody is even thinking about.
>Can you specify?

1) A better mapping to functionality commonly available in hardware.
I.e., high-word or widening multiply. Bit scan operations (i.e., fast
lg2, and scan for lowest.). Bit count. Endian swap. etc.
>2) Better control flow functionality: Coroutines and alloca(). The
>3) Better heap functionality.
>The real point behind this list of features, is that they represent
true enhancements to the C language.
In 25 years of C programming, I have rarely needed any of the
functionality you list. Memory leak analysis sometimes, but only
one of the several extensions to heap functionality you propose would
make any difference to me (and that only on the odd occasion.)

Endian swap I have only needed to the extent provided by Unix's
ntohl() and htonl() -- i.e., in the context of network programming
in which such routines would definitely be available as part of the
network stack; the implementation details have been irrelevant to me.
I have needed to convert file format byte orders, but such conversions
are inherently non-portable and do not necessarily represent
conversions from any actual hardware; the ability to nail down the
order of bitfields would sometimes be a convenience, but taking the
appropriate adjustments at runtime has never ever been a factor that
affected execution time measurably.
In short, it appears to me that your view of what "the C language is
crying out for" is very heavily coloured by the -kind- of work that
you do. There was *nothing* in your list that had me nodding and
saying, "Me too!". Hence I have strong doubts that the C language is
actually "crying out for" those features, and I have a definite
suspicion that implementing the features you suggest would push
C even more towards the "portable assembler" status -- at a time in
computing when assembler has, to -most- people, become nearly irrelevant.

Now, maybe what the embedded programming industry needs is a really
first rate "portable assembler" so that developers can write code
usable on anything from GPUs to cell phones to toasters to "hyperthreaded"
PCs.... but as best I understand, programming history is littered with
the carcasses of programming languages that attempted to
provide portable high-level interfaces to hardware. And I can't say
I've ever seen -any- demand for Ada outside of the US DoD...

There is a growing gap between "implementors" (responsible for the
nitty-gritty of providing functionality on particular hardware)
and "programmers" (who don't necessarily care what happens under the
hood). The programmers are in the growing majority; catering to
implementors is going to have about the same long-term effect as
catering to DBase 3 programmers: useful to some, yes, but C would
pretty much drop out of general consciousness as a general purpose
language.
--
"It is important to remember that when it comes to law, computers
never make copies, only human beings make copies. Computers are given
commands, not permission. Only people can be given permission."
-- Brad Templeton
Jan 14 '07 #14
Walter Roberson wrote:
In article <11********************@11g2000cwr.googlegroups.com>,
<we******@gmail.com> wrote:
jacob navia wrote:
we******@gmail.com wrote:
The C standard committee does not see the value of the language they
have ownership over, they don't see problems in the industry, and are
completely blind to the problems of the C language. There is a long
list of highly desirable features that the C language is crying out for
(no, not operator overloading -- I mean actual *functionality*) that
nobody is even thinking about.
Can you specify?
1) A better mapping to functionality commonly available in hardware.
I.e., high-word or widening multiply. Bit scan operations (i.e., fast
lg2, and scan for lowest.). Bit count. Endian swap. etc.
2) Better control flow functionality: Coroutines and alloca(). The
3) Better heap functionality.
The real point behind this list of features, is that they represent
true enhancements to the C language.

In 25 years of C programming, I have rarely needed any of the
functionality you list. Memory leak analysis sometimes, but only
one of the several extensions to heap functionality you propose would
make any difference to me (and that only on the odd occasion.)

Endian swap I have only needed to the extent provided by Unix's
ntohl() and htonl() -- i.e., in the context of network programming
in which such routines would definitely be available as part of the
network stack; the implementation details have been irrelevant to me.
I have needed to convert file format byte orders, but such conversions
are inherently non-portable and do not necessarily represent
conversions from any actual hardware; the ability to nail down the
order of bitfields would sometimes be a convenience, but taking the
appropriate adjustments at runtime has never ever been a factor that
affected execution time measurably.
In short, it appears to me that your view of what "the C language is
crying out for" is very heavily coloured by the -kind- of work that
you do. There was *nothing* in your list that had me nodding and
saying, "Me too!". Hence I have strong doubts that the C language is
actually "crying out for" those features, and I have a definite
suspicion that implementing the features you suggest would push
C even more towards the "portable assembler" status -- at a time in
computing when assembler has, to -most- people, become nearly irrelevant.
Isn't it precisely because of C's rise that assembler was severely
sidelined? Wouldn't adding more functionality to C take it away from
assembler rather than towards it?
Now, maybe what the embedded programming industry needs is a really
first rate "portable assembler" so that developers can write code
usable on anything from GPUs to cell phones to toasters to "hyperthreaded"
PCs.... but as best I understand, programming history is littered with
the carcasses of programming languages that attempted to be
provide portable high-level interfaces to hardware. And I can't say
I've ever seen -any- demand for ADA outside of the US DoD...
By the time Ada matured, C/C++ had developed enough momentum to ensure
that the former would not be widely considered. As I see it, Java was
the first language to break the stranglehold of C and its brother.
There is a growing gap between "implementors" (responsible for the
nitty gritty of providing funcitonality on particular hardware),
and "programmers" (who don't necessarily care what happens under the
hood). The programmers are in the growing majority; catering to
implementors is going to have about the same long-term effect as
catering to DBase 3 programmers: useful to some, yes, but C would
pretty much drop out of general consciousness as a general purpose
language.
IMHO, C has already nearly lost out on the applications development
front, particularly on PCs. It seems that its major use now and in the
future would be in the implementation space. So additions to the
language that make it more relevant for that area might not be so bad
an idea.

Jan 14 '07 #15
Walter Roberson wrote:
<we******@gmail.com> wrote:
jacob navia wrote:
we******@gmail.com wrote:
The C standard committee does not see the value of the language they
have ownership over, they don't see problems in the industry, and are
completely blind to the problems of the C language. There is a long
list of highly desirable features that the C language is crying out for
(no, not operator overloading -- I mean actual *functionality*) that
nobody is even thinking about.
Can you specify?
1) A better mapping to functionality commonly available in hardware.
I.e., high-word or widening multiply. Bit scan operations (i.e., fast
lg2, and scan for lowest.). Bit count. Endian swap. etc.
2) Better control flow functionality: Coroutines and alloca(). The
3) Better heap functionality.
The real point behind this list of features, is that they represent
true enhancements to the C language.

In 25 years of C programming, I have rarely needed any of the
functionality you list.
I noticed that you didn't say never. So when you *have* needed some of
that functionality, what did you do about it? I would like to point
out that most of the entire C library is completely useless to me as
well, so I don't know what the point of this comment is.
[...] Memory leak analysis sometimes, but only
one of the several extensions to heap functionality you propose would
make any difference to me (and that only on the odd occasion.)
So your argument then, is that you don't think there should be memory
leak assistance, because the other proposals I made are not something
you would be interested in?
Endian swap I have only needed to the extent provided by Unix's
ntohl() and htonl() -- i.e., in the context of network programming
in which such routines would definitely be available as part of the
network stack; the implementation details have been irrelevant to me.
Okay ... So here is another one that you would use, but only if they
were tied to Unix and named "ntohl" and "htonl"? Are you saying these
are only possibly useful on Unix and therefore must not be available on
other platforms in a portable way?
I have needed to convert file format byte orders, but such conversions
are inherently non-portable and do not necessarily represent
conversions from any actual hardware; the ability to nail down the
order of bitfields would sometimes be a convenience, but taking the
appropriate adjustments at runtime has never ever been a factor that
affected execution time measurably.

In short, it appears to me that your view of what "the C language is
crying out for" is very heavily coloured by the -kind- of work that
you do.
Ok, but your belief that these are unimportant to you (even though,
apparently, some are important) is clearly coloring your response here.
So I don't see the value of your argument. BTW, what do you know about
the kind of work that I do?

My ideas come from looking at other programming languages, and from
looking at real world applications:

Coroutines come from the fact that Lua has them, Python has something
similar but less general (generators), and they are very useful for
web browsers (yielding on socket blocks, to allow a single-tasking
application to efficiently download a web page) and chess engines (the
way the jumble of loops for move generation intertwines with the
alpha-beta algorithm can be drastically simplified with coroutines).

Alloca(), and better heap management, is actually a reaction to garbage
collection. Garbage collection makes memory management in other
languages a complete non-issue. Now, C cannot easily implement GC, and
I don't recommend it. However, C has to justify its strategy as an
alternative. All the mechanisms I recommend would enhance the
functionality of C in ways that make its weaknesses versus GC less
obvious (with leak-detection assistance, GC's leak-lessness is less of
an advantage, of course; with isFromHeap(), double-free becomes less of
a problem). C's memory management strategy is better for realtime and
precise memory handling. Since it can't be exactly the same as GC, it
has to be something else -- it might as well be the very best possible
instance of its alternative strategy.

The better preprocessor is clearly aimed at Lisp's lambda and, more
generally, at more powerful macro programming. However, its purpose is
also to serve as a good stand-in for generics/templates. It can be used
to automatically unroll a massive number of loops from a single source
rendering.
[...] There was *nothing* in your list that had me nodding and
saying, "Me too!". Hence I have strong doubts that the C language is
actually "crying out for" those features, and I have a definite
suspicion that implementing the features you suggest would push
C even more towards the "portable assembler" status -- at a time in
computing when assembler has, to -most- people, become nearly irrelevant.
Ok, so do you think the need for large-number arithmetic has become
irrelevant? The use of crypto is going *up*, not down. I don't even
know what this means. People have run away from assembly because
they want to engage in sustainable programming, and Moore's Law has
enabled them to ignore performance for many applications (please note
that crypto is not one of these applications -- you write inline
assembly or you can forget it). However, enabling such functionality
in C brings the performance benefit back to a language that, in
theory, is at least partially scalable -- and if not, then C++, which is
more seriously scalable, can at least inherit the features.
Now, maybe what the embedded programming industry needs is a really
first rate "portable assembler" so that developers can write code
usable on anything from GPUs to cell phones to toasters to "hyperthreaded"
PCs.... but as best I understand, programming history is littered with
the carcasses of programming languages that attempted to
provide portable high-level interfaces to hardware. And I can't say
I've ever seen -any- demand for ADA outside of the US DoD...
What in the holy hell are you talking about? Ada is just a more
functional variant of Pascal. How is Ada a portable assembler? The
features I suggest would also not be of any help to GPUs or DSPs.
There is a growing gap between "implementors" (responsible for the
nitty gritty of providing functionality on particular hardware),
and "programmers" (who don't necessarily care what happens under the
hood). The programmers are in the growing majority; catering to
implementors is going to have about the same long-term effect as
catering to DBase 3 programmers: useful to some, yes, but C would
pretty much drop out of general consciousness as a general purpose
language.
You are saying that adding enhancements to C are not a good idea,
because adding them to C++ would be better?!?! Hint: If you add them
to C, C++ *will* pick them up. Sorry, but I find your entire response
completely vacuous.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Jan 14 '07 #16
Walter Roberson a écrit :
In article <11********************@11g2000cwr.googlegroups.com>,
<we******@gmail.com> wrote:
>>jacob navia wrote:
>>>we******@gmail.com a écrit :

The C standard committee does not see the value of the language they
have ownership over, they don't see problems in the industry, and are
completely blind to the problems of the C language. There is a long
list of highly desirable features that the C language is crying out for
(no, not operator overloading -- I mean actual *functionality*) that
nobody is even thinking about.

>>>Can you specify?

1) A better mapping to functionality commonly available in hardware.
I.e., high-word or widening multiply. Bit scan operations (i.e., fast
lg2, and scan for lowest.). Bit count. Endian swap. etc.

>>2) Better control flow functionality: Coroutines and alloca(). The

>>3) Better heap functionality.

>>The real point behind this list of features, is that they represent
true enhancements to the C language.
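For reference, item 1 of the quoted list maps onto short portable C; hardware usually provides each of these as a single instruction, which is the point of the proposal (the function names here are illustrative, not proposed spellings):

```c
#include <stdint.h>

/* Widening multiply: the full 64-bit product of two 32-bit operands. */
uint64_t mul_wide_u32(uint32_t a, uint32_t b)
{
    return (uint64_t)a * b;
}

/* High word of that product -- on most CPUs a separate
 * single-instruction result, but expressible in C only through the
 * widening form above. */
uint32_t mul_high_u32(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* Bit count (population count), written as the plain
 * clear-lowest-bit loop; hardware typically offers one instruction. */
unsigned popcount_u32(uint32_t x)
{
    unsigned n = 0;
    while (x != 0) {
        x &= x - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}
```

A compiler that recognized these idioms could lower each to its native instruction; the proposal would make that guarantee explicit instead of idiom-dependent.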


In 25 years of C programming, I have rarely needed any of the
functionality you list. Memory leak analysis sometimes, but only
one of the several extensions to heap functionality you propose would
make any difference to me (and that only on the odd occasion.)

Yes, I have seen that argument a lot of times.

"I never needed memory leak detectors"
"Only on odd occasion I did use them"

Yes. Everybody here is a genius programmer, looking at these
messages.

BUGS?

ERRORS?

Defensive programming?

That is for wimps...
Endian swap I have only needed to the extent provided by Unix's
ntohl() and htonl() -- i.e., in the context of network programming
in which such routines would definitely be available as part of the
network stack; the implementation details have been irrelevant to me.
I have needed to convert file format byte orders, but such conversions
are inherently non-portable and do not necessarily represent
conversions from any actual hardware; the ability to nail down the
order of bitfields would sometimes be a convenience, but taking the
appropriate adjustments at runtime has never ever been a factor that
affected execution time measurably.
In short, it appears to me that your view of what "the C language is
crying out for" is very heavily coloured by the -kind- of work that
you do. There was *nothing* in your list that had me nodding and
saying, "Me too!". Hence I have strong doubts that the C language is
actually "crying out for" those features, and I have a definite
suspicion that implementing the features you suggest would push
C even more towards the "portable assembler" status -- at a time in
computing when assembler has, to -most- people, become nearly irrelevant.
You get that wrong. The situation now forces you to use non portable
assembler constructs. Incorporating them into the language makes the
need for including assembly less urgent...
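As a concrete instance, the endian-swap operation under discussion has a well-known portable rendering; a compiler could recognize this pattern and emit a single instruction (the function name is illustrative):

```c
#include <stdint.h>

/* Reverse the four bytes of a 32-bit value.  ntohl()/htonl() reduce
 * to this on little-endian hosts and to a no-op on big-endian ones. */
uint32_t bswap32(uint32_t x)
{
    return  (x >> 24)
         | ((x >>  8) & 0x0000FF00u)
         | ((x <<  8) & 0x00FF0000u)
         |  (x << 24);
}
```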
Now, maybe what the embedded programming industry needs is a really
first rate "portable assembler" so that developers can write code
usable on anything from GPUs to cell phones to toasters to "hyperthreaded"
PCs.... but as best I understand, programming history is littered with
the carcasses of programming languages that attempted to
provide portable high-level interfaces to hardware. And I can't say
I've ever seen -any- demand for ADA outside of the US DoD...
Ada is nothing like a "portable assembler"!!!! You are dreaming.
There is a growing gap between "implementors" (responsible for the
nitty gritty of providing functionality on particular hardware),
and "programmers" (who don't necessarily care what happens under the
hood). The programmers are in the growing majority; catering to
implementors is going to have about the same long-term effect as
catering to DBase 3 programmers: useful to some, yes, but C would
pretty much drop out of general consciousness as a general purpose
language.
Incredible how much nonsense someone can say in just a few sentences.
C users are implementors, since they implement algorithms and
software...

If you do not want to know what happens "under the hood" just use BASIC
or VB, or C# for that matter.
Jan 14 '07 #17
we******@gmail.com writes:
Keith Thompson wrote:
>jacob navia <ja***@jacob.remcomp.fr> writes:
[...]
Nobody wants to eliminate them in one
sweep. But an alternative could exist, that makes
their usage obsolete. Then, after 10-20 years
they are phased out.

That's all

And we lose any *advantages* that zero-terminated strings might have
over counted strings.

You can, of course, come up with a single example of such an
"advantage" (that applies to the 10-20 year time frame Jacob was
talking about)?
I don't know about a 10-20 year time frame, but consider this. If a
program is going to scan a string anyway, there's not much benefit in
storing its length separately. In a recent discussion here, somebody
posted an example of such a program (a fairly small one). jacob
claimed that a solution using memcpy() (which requires knowing the
length in advance) was faster than an equivalent solution using
strcpy() (which doesn't) -- but he only provided actual numbers for an
x86 platform. I demonstrated that the strcpy() solution is actually
faster on some other platforms.

Now if you're doing a lot of processing that *does* require knowing
the length in advance, then yes, counted strings are advantageous.
But if you don't happen to need it, then computing and storing it is
useless overhead. I'm not arguing that C-style zero-terminated
strings are superior to counted strings, merely that there is a
tradeoff. I don't know which is better in general. jacob thinks he
does know, and that zero-terminated strings are inherently a bug in
the language.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jan 14 '07 #18
jacob navia said:
Walter Roberson a écrit :
<snip>
>In 25 years of C programming, I have rarely needed any of the
functionality you list. Memory leak analysis sometimes, but only
one of the several extensions to heap functionality you propose would
make any difference to me (and that only on the odd occasion.)


Yes, I have seen that argument a lot of times.

"I never needed memory leak detectors"
Who has said this?
"Only on odd occasion I did use them"
Well, that may well be true for some people. Personally, I use a leak
detector I wrote myself, which is why I don't need to use a third-party
product such as valgrind.
Yes. Everybody here is a genius programmer, looking at these
messages.
Nobody here has claimed this. But a memory leak is normally pretty easy to
track down and fix.
>
BUGS?

ERRORS?

Defensive programming?

That is for wimps...
No, defensive programming is a sound strategy, but it's not a binary state -
defensive or not defensive. Rather, it's a matter of degree, and a good
programmer will be able to decide for himself which parts of his program
are vulnerable and need to be defended deeply, and which parts can be
allowed to fly, without let or hindrance from *unnecessary* checking.

<snip>
If you do not want to know what happens "under the hood" just use BASIC
or VB, or C# for that matter.
Or C, where you very often get to choose whether you want, or do not want,
to know what happens behind the scenes.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.
Jan 14 '07 #19
In article <45***********************@news.orange.fr>,
jacob navia <ja***@jacob.remcomp.fr> wrote:
>Walter Roberson a écrit :
>In 25 years of C programming, I have rarely needed any of the
functionality you list. Memory leak analysis sometimes, but only
one of the several extensions to heap functionality you propose would
make any difference to me (and that only on the odd occasion.)
>Yes, I have seen that argument a lot of times.
>"I never needed memory leak detectors"
"Only on odd occasion I did use them"
>Yes. Everybody here is a genius programmer, looking at these
messages.
According to the RCS logs, the last time I worked on a program
for which memory leak analysis was a significant factor, was
11 years ago -- and the actual leak analysis was 12 1/2 years ago.

I have programmed a fair bit in C since that time, but the
memory usage patterns became less and less relevant: if the program
didn't run out of physical memory before termination, then
any leaks didn't matter. And I have the discipline to be consistent
in my memory usage, making leaks relatively improbable.

Self-discipline in programming doesn't take a genius programmer,
merely a stubborn one.
--
Prototypes are supertypes of their clones. -- maplesoft
Jan 15 '07 #20
In article <45***********************@news.orange.fr>,
jacob navia <ja***@jacob.remcomp.fr> wrote:
>Walter Roberson a écrit :
>Now, maybe what the embedded programming industry needs is a really
first rate "portable assembler" so that developers can write code
usable on anything from GPUs to cell phones to toasters to "hyperthreaded"
PCs.... but as best I understand, programming history is littered with
the carcasses of programming languages that attempted to
provide portable high-level interfaces to hardware. And I can't say
I've ever seen -any- demand for ADA outside of the US DoD...
>Ada is nothing like a "portable assembler"!!!! You are dreaming.
As best I recall, Ada's premise was that all operations would be
precisely defined and implemented on all platforms. It was
very important in the history of Ada that there would be only *one*
legal interpretation of any Ada program. The strong intent was that
an Ada program would have exactly the same semantics on -every-
platform that Ada was available on, and that every Ada program could
be taken and executed (with the same results) without ANY changes
(other than recompilation) on every Ada implementation. And it
was important to Ada that it be usable for robust multiprocessing
and time-sensitive work. Ada was, in short, intended to provide
portable high-level interfaces to hardware. It seems to me that the
lackluster demand for Ada should tell us something about the
market conditions for 'a really first rate "portable assembler"'.
--
"It is important to remember that when it comes to law, computers
never make copies, only human beings make copies. Computers are given
commands, not permission. Only people can be given permission."
-- Brad Templeton
Jan 15 '07 #21
In article <11**********************@a75g2000cwd.googlegroups.com>,
<we******@gmail.com> wrote:
>Walter Roberson wrote:
>In 25 years of C programming, I have rarely needed any of the
functionality you list.
>I noticed how you didn't say never. So when you *have* needed some of
that functionality, what did you do about it?
Implemented in terms of the platforms I cared about, and documented
the platform restriction. Beyond those, I never received requests
to port to additional platforms.

Which is not to say that I paid lip service to platform dependencies:
instead, it was the case that I paid close attention to what was or
was not promised by C, and in so doing, wrote code that avoided
the issues when possible, and isolated the affected areas when
dependencies were unavoidable.

Now here's the point: in NO case that I can remember, did I choose
another language because it offered portability guarantees that C did
not. Each time, I chose a language suitable for the nature of the
project.

The portability issues that you describe were never more than a
miniscule consideration in the work I did. Much more difficult was
portability at the OS level -- matters such as dealing with network
programming interfaces or serial port interfaces. The layers you
describe would be, for the work I did, essentially akin to
micro-optimizations.

>[...] Memory leak analysis sometimes, but only
one of the several extensions to heap functionality you propose would
make any difference to me (and that only on the odd occasion.)
>So your argument then, is that you don't think there should be memory
leak assistance, because the other proposals I made are not something
you would be interested in?
Your proposals would, in my opinion, do almost nothing to save C
amongst the general populace of programmers: most of your proposals
are irrelevant for most programs, I believe. They might do an
admirable job of fixing one corner of the language, but I don't
believe for a moment that C is "crying out for" that set of
changes. You accuse the C99 committee of not addressing the "real"
problems of C, but in my assessment, what you propose would be
largely greeted by a Hearty High-Ho "So What?" by the great majority
of C programmers.

>Endian swap I have only needed to the extent provided by Unix's
ntohl() and htonl() -- i.e., in the context of network programming
in which such routines would definitely be available as part of the
network stack; the implementation details have been irrelevant to me.
>Okay ... So here is another one that you would use, but only if they
were tied to Unix and named "ntohl" and "htonl"?
No, if they were provided by the C library, my first question would
be how to override them to get at the implementation's routines
of the same name: unless the C standards committee -defined-
them as operating the same way as in POSIX, C's versions would
be of no utility to me. I have no use for the operations outside
of network programming, and I'm sure the C standards committee knows
to butt out of the network programming standards area.
>Are you saying these
are only possibly useful to Unix and therefore must not be available to
other platforms in a portable way?
I'm not in the business of writing Linux drivers or OS kernels that
would, optimally, be writable without changes for every platform
that the code might -possibly- be ported to. The effort that the
implementors of my network stack have to go through to provide
ntohl() and htonl() are of little interest to me: that's host
implementation, and I don't care what compiler extension or whatever
that they hide away in a system library or system header file intended
for system use. Whether such extensions are built into C or not
wouldn't have made C more useful for much of anything I did in
the last decade; such extensions might have marginally increased the
-theoretical- portability of some of my programs, but not one iota
would they have increased the -practical- portability of what I did.

>Alloca(), and better heap management is actually a reaction to garbage
collection. Garbage collection makes memory management in other
languages a complete non-issue.
I've been using the symbolic computation language, "maple" a fair
bit over the last year. I profiled one of my programs to figure
out where to expend the most effort in speed improvement... and
found that 68% of the execution time was being spent in garbage
collection. I would have had to have developed hefty mathematical
theorems able to operate on the terms in-place (with no pointers,
and with the order of the terms open to change without notice)
to mathematically bypass the need for the garbage collection in
order to have a chance of significantly improving the speed of
my program -- greatly improving my program complexity (if such
theorems could be found at all) just to work around the slow
garbage collection. Sometime later, one of the developers mentioned
in passing that the speed of the garbage collector is proportional
to the -amount- of memory allocated, not to the number of memory
allocations. I am at a loss for words to describe how glad I am
to have your assurance that "Garbage collection makes memory
management in other languages a complete non-issue."

>There is a growing gap between "implementors" (responsible for the
nitty gritty of providing functionality on particular hardware),
and "programmers" (who don't necessarily care what happens under the
hood). The programmers are in the growing majority; catering to
implementors is going to have about the same long-term effect as
catering to DBase 3 programmers: useful to some, yes, but C would
pretty much drop out of general consciousness as a general purpose
language.
>You are saying that adding enhancements to C are not a good idea,
because adding them to C++ would be better?!?!
I have a copy of the official printed ISO C++ standard. It is
a bear to find anything useful in it, precisely because C++
added so many (mandatory) features of narrow utility that the noise
drowns out the signal. If I -had- mentioned C++ at all (and I
did not), then Yes, it might have been with the notion that it
would be better to add the features to C++ than to C -- better to
let C++ degrade even faster on its own obesity than to inflate C
for little gain.
Sorry, but I find your entire response
completely vacuous.
And I found your response to my response to be full of bad logic
and strawman arguments.
--
All is vanity. -- Ecclesiastes
Jan 15 '07 #22
ro******@ibd.nrc-cnrc.gc.ca (Walter Roberson) writes:
In article <45***********************@news.orange.fr>,
jacob navia <ja***@jacob.remcomp.fr> wrote:
[...]
>>Ada is nothing like a "portable assembler"!!!! You are dreaming.

As best I recall, Ada's premise was that all operations would be
precisely defined and implemented on all platforms. It was
very important in the history of Ada that there would be only *one*
legal interpretation of any Ada program. The strong intent was that
an Ada program would have exactly the same semantics on -every-
platform that Ada was available on, and that every Ada program could
be taken and executed (with the same results) without ANY changes
(other than recompilation) on every Ada implementation. And it
was important to Ada that it be usable for robust multiprocessing
and time-sensitive work. Ada was, in short, intended to provide
portable high-level interfaces to hardware. It seems to me that the
lackluster demand for Ada should tell us something about the
market conditions for 'a really first rate "portable assembler"'.
<OT>
Not quite. Ada has the equivalent of C's "undefined behavior" (Ada
calls it "erroneous execution"), though there are fewer instances than
in C. And there are a number of things that are system-specific. For
example, the sizes and ranges of Ada's predefined integer types
(Integer, Long_Integer, etc.) are implementation-defined, much as they
are in C.

Ada aims to make portable code easier to write (for example, you can
easily declare an integer type with a specified range), but it doesn't
make non-portable code impossible, or even particularly difficult.

It also has a number of features designed to interface to low-level
hardware (embedded systems are a major target), but an attempt is made
to keep such features cleanly separated from the higher-level
features.

Ada is no more a "portable assembler" than C is. I'm not sure what
its lackluster demand tells us.
</OT>

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jan 15 '07 #23
In article <11**********************@a75g2000cwd.googlegroups.com>,
<we******@gmail.com> wrote:
>My ideas come from looking at other programming languages, and from
looking at real world applications:
>Coroutines come from the fact that Lua has them, Python has something
similar but less general (generators) and they are very useful for
web-browsers (yielding on socket blocks to allow a single tasking
application to efficiently download a web page) and chess engines (just
the way the jumble of loops for move generation intertwines with the
alpha-beta algorithm can be drastically simplified with coroutines).
In the web-browser case, what you are essentially doing is asking
to import thread capabilities into C -- possibly only
"cooperative threading" on uniprocessors, but still thread capabilities.
It seems to me that you would need to import noticeably more than
just co-routines: you would need to import socket blocking
control, extend fread() and kin to return states such as
EAGAIN (i.e., no data is waiting), and probably a signal or two
would have to get involved so as to provide notification that
the co-routine is ready to proceed.

There is perhaps room for very lightweight threads in C: the
POSIX threading model seems to require a big library and
understanding a lot of routines. I would have to think more
about how such a thing would require extending C itself, versus
how much of it could essentially be pushed off to a set of library
routines; if it can all be reasonably handled as library routines,
then I'm not certain that it would be a good thing to nail the
functionality into the C standard.
--
"law -- it's a commodity"
-- Andrew Ryan (The Globe and Mail, 2005/11/26)
Jan 15 '07 #24
Walter Roberson wrote:
<we******@gmail.com> wrote:
My ideas come from looking at other programming languages, and from
looking at real world applications:
Coroutines come from the fact that Lua has them, Python has something
similar but less general (generators) and they are very useful for
web-browsers (yielding on socket blocks to allow a single tasking
application to efficiently download a web page) and chess engines (just
the way the jumble of loops for move generation intertwines with the
alpha-beta algorithm can be drastically simplified with coroutines).

In the web-browser case, what you are essentially doing is asking
to import thread capabilities into C -- possibly only
"cooperative threading" on uniprocessors, but still thread capabilities.
It seems to me that you would need to import noticeably more than
just co-routines: you would need to import socket blocking
control, extend fread() and kin to return states such as
EAGAIN (i.e., no data is waiting), and probably a signal or two
would have to get involved so as to provide notification that
the co-routine is ready to proceed.
That is incorrect. All you need is a probing/peeking function for any
potential blocking read. Everything else is just a matter of program
design. There is a very specific reason why you don't want to add in
full multi-threading. Multithreading is very hard to make totally
portable, and introduces advanced concepts like semaphores, mutexes,
and other critical section solutions. Coroutines are a very special
subcase that doesn't require any of those complications, is extremely
low-overhead, and does not, by itself, lend itself to deadlocking.
Full multithreading, by contrast, would introduce even more undefined
behavior into C (which would probably please the standards committee
people to no end.)

It turns out that there are many server applications that are most
appropriately solved by just coroutines. But the added value is that
coroutines are also useful for more than the simplest of multitasking
problems. They allow you to synch up two complicated loops while
keeping each loop as simple as possible.
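A flavor of what coroutine support buys: a minimal generator sketch in standard C using the well-known switch-based trick (in the style popularized by Simon Tatham's coroutine macros and the protothreads library). This is a present-day workaround, not proposed syntax:

```c
/* Generator-style coroutine via the switch/__LINE__ trick.  The
 * static state records where to resume; each call picks up inside
 * the for loop right after the previous yield. */
#define CR_BEGIN(state)    switch (state) { case 0:
#define CR_YIELD(state, v) do { state = __LINE__; return (v); \
                                case __LINE__:; } while (0)
#define CR_END             }

/* Yields 0, 1, 4, 9, ... across successive calls. */
int next_square(void)
{
    static int state = 0;
    static int i;

    CR_BEGIN(state);
    for (i = 0; ; i++)
        CR_YIELD(state, i * i);
    CR_END;
    return -1; /* not reached */
}
```

The loop reads as a plain for even though control leaves and re-enters it on every call; native coroutine support would provide the same shape without static variables or the macro contortions.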
There is perhaps room for very lightweight threads in C: the
POSIX threading model seems to require a big library and
understanding a lot of routines. I would have to think more
about how such a thing would require extending C itself, versus
how much of it could essentially be pushed off to a set of library
routines; if it can all be reasonably handled as library routines,
then I'm not certain that it would be a good thing to nail the
functionality into the C standard.
Well, Microsoft has their own threading system with, as far as I
understand it, far more complicated synchronization objects. Another
useful standard is MPI. Of course, none of these standards is
universally implemented.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Jan 16 '07 #25
Keith Thompson wrote:
we******@gmail.com writes:
Keith Thompson wrote:
jacob navia <ja***@jacob.remcomp.fr> writes:
[...]
Nobody wants to eliminate them in one
sweep. But an alternative could exist, that makes
their usage obsolete. Then, after 10-20 years
they are phased out.

That's all

And we lose any *advantages* that zero-terminated strings might have
over counted strings.
You can, of course, come up with a single example of such an
"advantage" (that applies to the 10-20 year time frame Jacob was
talking about)?

I don't know about a 10-20 year time frame, but consider this. If a
program is going to scan a string anyway, there's not much benefit in
storing its length separately. In a recent discussion here, somebody
posted an example of such a program (a fairly small one). jacob
claimed that a solution using memcpy() (which requires knowing the
length in advance) was faster than an equivalent solution using
strcpy() (which doesn't) -- but he only provided actual numbers for an
x86 platform. I demonstrated that the strcpy() solution is actually
faster on some other platforms. [...]
Well, those platforms would definitely be looking *backwards* in time.
So indeed the 10-20 year time frame qualification *does matter*. But
more to the point *EVERY* architecture created from this point forward
will prefer length prefixed string copying (that is because a parallel
dependency is always better than a serial one -- it's easier to add ALUs
than increase the clock rate). If the C standard has no interest in
the future and is only concerned with architectures from antiquity,
then fine. But don't complain when C gets branded with the COBOL label.
Now if you're doing a lot of processing that *does* require knowing
the length in advance, then yes, counted strings are advantageous.
Whether it requires it or not, having the length will *speed it up* or
be neutral for all scenarios, on all modern platforms, and make your
code safer, and make it easier to write and maintain.
But if you don't happen need it, then computing and storing it is
useless overhead.
If you dive into Bstrlib, you will find that often that additional
overhead can exist primarily in your auto-space, not necessarily in
your heap space (depending on your algorithm, or what exactly you are
doing.) In general, where it matters, this overhead can usually be
amortized using various packing methods (Bstrlib comes with good CSV
parsing code and netstrings if you really want to pack and serialize
many strings at once) or by treating the string data as if it were a
file (Bstrlib comes with something called bstreams which does exactly
this) which again only costs auto-space.
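A minimal counted-string sketch (not Bstrlib's actual API; the struct and function names here are invented) showing how a stored length lets append work at a known offset instead of re-scanning:

```c
#include <stdlib.h>
#include <string.h>

/* Length and capacity live next to the data; the buffer stays
 * '\0'-terminated so it can still be handed to the C library. */
struct cstr {
    size_t len;
    size_t cap;
    char  *data;
};

/* Append tlen bytes; returns 0 on success, -1 on allocation failure.
 * No scan for '\0' is needed -- the stored length is the offset. */
int cstr_append(struct cstr *s, const char *tail, size_t tlen)
{
    if (s->len + tlen + 1 > s->cap) {
        size_t ncap = (s->len + tlen + 1) * 2;  /* geometric growth */
        char *nd = realloc(s->data, ncap);
        if (nd == NULL)
            return -1;
        s->data = nd;
        s->cap  = ncap;
    }
    memcpy(s->data + s->len, tail, tlen);
    s->len += tlen;
    s->data[s->len] = '\0';
    return 0;
}
```

The capacity field also gives the overflow safety discussed below: an append can never write past the buffer, because the buffer is grown (or the call fails) first.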
[...] I'm not arguing that C-style zero-terminated
strings are superior to counted strings, merely that there is a
tradeoff.
Still waiting for the example.
[...] I don't know which is better in general. jacob thinks he
does know, and that zero-termainted strings are inherently a bug in
the language.
Well, I only know it from direct comparison and fairly extensive
analysis of the situation. '\0' terminated strings *are* more error
prone; there is just no comparison. It's a white-hot flash point of
MAXIMIZED manifestations of buffer overflows (which TR 24731 doesn't
usefully address, BTW). Compare that with Bstrlib where its nearly
impossible to cause any kind of UB due to a buffer overflow scenario
unless you are directly and unnecessarily hacking on it, or have
corrupted the data externally. (Other solutions such as Vstr are
basically about as good on this point.)

And in terms of performance comparison, you can put the two side by
side on any general task -- bstrings never give up the possibility of
falling back onto the Clib, so it cannot lose. However, it never needs
to do this as all the portable hand coded algorithms are equal or
faster than pretty much all the Clibs out there on a wide variety of
string kernels.

In particular, look at sub-string searching. That's an algorithm which,
intuitively, should really be equal for both styles, since you have to
do character by character stuff no matter what. But it turns out that
good algorithms try to *unroll* the inner loop so that you can examine
two characters back to back without an intervening loop check. In C
you have to put in an extra test for an intermediate '\0' check (see
GCC's Clib source for strstr() for an example of this). With Bstrlib
you only do one test to see if you have at least two characters more
that you can scan. It's little things like this that just show up all
over the place.

And that doesn't even bring up the fiasco that is strcat(), where C
actually manages to lose to pathetically slow languages like TCL and
Python.
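The strcat() fiasco referred to is the quadratic cost of repeated concatenation: each call re-scans the destination for its '\0', so building a string from n pieces costs O(n^2). A sketch of the linear-time alternative that simply carries the running length forward (join_linear is an invented name; the caller must ensure dst is large enough):

```c
#include <string.h>

/* Concatenate n pieces into dst, tracking the running length instead
 * of letting strcat() re-scan dst on every call. */
size_t join_linear(char *dst, const char **pieces, size_t n)
{
    size_t len = 0;
    size_t i;
    dst[0] = '\0';                              /* handle n == 0 */
    for (i = 0; i < n; i++) {
        size_t pl = strlen(pieces[i]);
        memcpy(dst + len, pieces[i], pl + 1);   /* copies the '\0' too */
        len += pl;
    }
    return len;
}
```

A counted-string type does this bookkeeping automatically, which is why even interpreted languages with counted strings can beat naive strcat() loops.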

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Jan 17 '07 #26
we******@gmail.com a écrit :
In particular, look at sub-string searching. That's an algorithm which,
intuitively, should really be equal for both styles, since you have to
do character by character stuff no matter what. But it turns out that
good algorithms try to *unroll* the inner loop so that you can examine
two characters back to back without an intervening loop check. In C
you have to put in an extra test for an intermediate '\0' check (see
GCC's Clib source for strstr() for an example of this). With Bstrlib
you only do one test to see if you have at least two characters more
that you can scan. It's little things like this that just show up all
over the place.

And that doesn't even bring up the fiasco that is strcat(), where C
actually manages to lose to pathetically slow languages like TCL and
Python.
I have repeated this over and over. For instance strrchr is vastly more
efficient when it can start at the end of the string and find the first
occurrence of the searched-for character *backwards* instead of
searching the whole string to find the last one!!!
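The backward scan described here is straightforward once the length is known; a sketch (rchr_counted is an invented name, and unlike strrchr() it does not treat the terminating '\0' as a searchable character):

```c
#include <stddef.h>

/* Scan backwards from the known end; the first hit is the last
 * occurrence, so nothing before it is ever touched. */
char *rchr_counted(char *s, size_t len, int c)
{
    while (len > 0) {
        len--;
        if (s[len] == (char)c)
            return s + len;
    }
    return NULL;
}
```

Whether this is a real win depends on where the match sits: for a match near the end of a long string the saving is large, while for short strings (as the follow-up below argues) it is negligible.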
Jan 17 '07 #27
jacob navia wrote:
>
I have repeated this over and over. [...]
Play it again, Sam. This time with more cowbell.

--
Eric Sosman
es*****@acm-dot-org.invalid
Jan 17 '07 #28
On Wed, 17 Jan 2007 12:54:38 +0100, jacob navia
<ja***@jacob.remcomp.fr> wrote:
>we******@gmail.com a écrit :
>In particular, look at sub-string searching. That's an algorithm which,
intuitively, should really be equal for both styles, since you have to
do character by character stuff no matter what. But it turns out that
good algorithms try to *unroll* the inner loop so that you can examine
two characters back to back without an intervening loop check. In C
you have to put in an extra test for an intermediate '\0' check (see
GCC's Clib source for strstr() for an example of this). With Bstrlib
you only do one test to see if you have at least two characters more
that you can scan. It's little things like this that just show up all
over the place.

And that doesn't even bring up the fiasco that is strcat(), where C
actually manages to lose to pathetically slow languages like TCL and
Python.

I have repeated this over and over. For instance strrchr is vastly more
efficient when it can start at the end of the string and find the first
occcurrence of the searched for character *backwards* instead of
searching the whole string to find the last one!!!
Given the mean and variance of the length of a string, I find it hard
to believe that it would be "vastly more efficient". If you're talking
about a megabyte-length string, then yes. But most strings are no more
than 4 or 16 or 64 or even hundreds of bytes in length. And in these
cases, strrchr(), as defined by the standard, should suffice.

Certainly it might take longer for strrchr() to operate on longer
strings compared to shorter strings, on the average, but if that's the
least of your worries, then you have bigger fish to fry.

#include <stdio.h>
#include <string.h>
#include <time.h>

int main(void)
{
clock_t t1;
clock_t t2;
volatile char *p;

t1= clock();
p = strrchr("Hello", 'H');
t2= clock();
printf("p is %p,\n", (void*)p);
printf("and that took %.12f seconds\n",
(double)(t2 - t1) / CLOCKS_PER_SEC);
t1= clock();
p = strrchr("Hello World!", 'H');
t2= clock();
printf("p is %p,\n", (void*)p);
printf("and that took %.12f seconds\n",
(double)(t2 - t1) / CLOCKS_PER_SEC);
t1= clock();
p = strrchr
(
"Hello the very, very, "
"quite contrary, benevolent "
"and sometimes forgiving, but also, "
"at the same time, very "
"unforgiving, World!",
'H'
);
t2= clock();
printf("p is %p,\n", (void*)p);
printf("and that took %.12f seconds\n",
(double)(t2 - t1) / CLOCKS_PER_SEC);
return 0;
}

Output:

p is 0042603C,
and that took 0.000000000000 seconds
p is 0042602C,
and that took 0.000000000000 seconds
p is 00426FA4,
and that took 0.000000000000 seconds
Press any key to continue

Regards
--
jay
Jan 18 '07 #29
jaysome a écrit :
On Wed, 17 Jan 2007 12:54:38 +0100, jacob navia
<ja***@jacob.remcomp.fr> wrote:

>>we******@gmail.com a écrit :
[...]

I have repeated this over and over. For instance strrchr is vastly more
efficient when it can start at the end of the string and find the first
occcurrence of the searched for character *backwards* instead of
searching the whole string to find the last one!!!


Given the mean and varance of the length of a string, I find it hard
to believe that it would be "vastly more efficient". If you're talking
about a Mega-byte-length sting, then yes. But most strings are no more
than 4 or 16 or 64 or even hundreds of bytes in length. And in these
cases, strrchr(), as defined by the standard, should suffice.

Certainly it might take longer for strrchr() to operate on longer
strings compared to shorter strings, on the average, but if that's the
least of your worries, then you have bigger fish to fry.

[benchmark code and output snipped]
Regards

OK. Your arguments are very convincing, being voiced by all the people
who support C strings:

WE DO NOT CARE ABOUT OPTIMIZATION OR GOOD ALGORITHMS.
Machines are fast these days. Yes. Bad constructs can go on
forever without anyone noticing them.
Jan 18 '07 #30
jacob navia said:
OK. Your arguments are very convincing, being voiced by all people that
support the c strings:

WE DO NOT CARE ABOUT OPTIMIZATION OR GOOD ALGORITHMS.
Yes, we do.

Would you mind dropping the shouting and the sarcasm and the knee-jerk
responses and the thoughtlessness and the "anyone who disagrees with me
must be an idiot" thing?

We'd get on a lot better with you if you just *tried* a little, you know.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.
Jan 18 '07 #31
jacob navia <ja***@jacob.remcomp.fr> wrote:
jaysome a écrit :
On Wed, 17 Jan 2007 12:54:38 +0100, jacob navia
>I have repeated this over and over. For instance strrchr is vastly more
efficient when it can start at the end of the string and find the first
occcurrence of the searched for character *backwards* instead of
searching the whole string to find the last one!!!
Given the mean and varance of the length of a string, I find it hard
to believe that it would be "vastly more efficient". If you're talking
about a Mega-byte-length sting, then yes. But most strings are no more
than 4 or 16 or 64 or even hundreds of bytes in length. And in these
cases, strrchr(), as defined by the standard, should suffice.
OK. Your arguments are very convincing, being voiced by all people that
support the c strings:

WE DO NOT CARE ABOUT OPTIMIZATION OR GOOD ALGORITHMS.
Wrong. Not only do we not shout, because we are not petulant children
whose favourite toy is being criticised; but also, we _do_ care about
good programming constructs. That is why, being well aware of the use of
strings in the average program, we know that counted strings are _less_
efficient under most circumstances than terminated strings, bumf and
blather notwithstanding.

Richard
Jan 18 '07 #32
In article <45****************@news.xs4all.nl>,
Richard Bos <rl*@hoekstra-uitgeverij.nl> wrote:
....
>good programming constructs. That is why, being well aware of the use of
strings in the average program, we know that counted strings are _less_
efficient under most circumstances than terminated strings, bumf and
Simply not true. As Jacob notes, simple dishonesty on your (and your
brethren's) part. Everybody knows that the only reason we stick with
terminated strings is because of history.

Note: I fully understand why you are lying and I'll even say that it
(doing so) is a necessary evil. But, it is a lie nonetheless.

Sorta like those WMDs... (another necessary lie)

Jan 28 '07 #33
In article <kM******************************@bt.com>,
Richard Heathfield <rj*@see.sig.invalid> wrote:
>jacob navia said:
>OK. Your arguments are very convincing, being voiced by all people that
support the c strings:

WE DO NOT CARE ABOUT OPTIMIZATION OR GOOD ALGORITHMS.

Yes, we do.

Would you mind dropping the shouting and the sarcasm and the knee-jerk
responses and the thoughtlessness and the "anyone who disagrees with me
must be an idiot" thing?
Um, Pot, Kettle, Black.

I.e., he (Jacob) learned from the best. He was not at all abusive until
quite a while after you and your ilk had been pouring it on him.

Jan 28 '07 #34
Kenny McCormack said:
In article <kM******************************@bt.com>,
Richard Heathfield <rj*@see.sig.invalid> wrote:
>>
Would you mind dropping the shouting and the sarcasm and the knee-jerk
responses and the thoughtlessness and the "anyone who disagrees with me
must be an idiot" thing?

Um, Pot, Kettle, Black.
Not so.
I.e., he (Jacob) learned from the best. He was not at all abusive until
quite a while after you and your ilk had been pouring it on him.
Again, not so.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.
Jan 28 '07 #35

Kenny McCormack wrote:
In article <45****************@news.xs4all.nl>,
Richard Bos <rl*@hoekstra-uitgeverij.nl> wrote:
...
good programming constructs. That is why, being well aware of the use of
strings in the average program, we know that counted strings are _less_
efficient under most circumstances than terminated strings, bumf and

Simply not true. As Jacob notes, simple dishonesty on your (and your
brethen's) part. Everybody knows that the only reason we stick with
terminated strings is because of history.

Note: I fully understand why you are lying and I'll even say that it
(doing so) is a necessary evil. But, it is a lie nonetheless.
If so, then why're you doing the disservice of exposing it?

Jan 28 '07 #36
In article <11**********************@a75g2000cwd.googlegroups.com>,
santosh <sa*********@gmail.com> wrote:
>
Kenny McCormack wrote:
>In article <45****************@news.xs4all.nl>,
Richard Bos <rl*@hoekstra-uitgeverij.nl> wrote:
...
>good programming constructs. That is why, being well aware of the use of
strings in the average program, we know that counted strings are _less_
efficient under most circumstances than terminated strings, bumf and

Simply not true. As Jacob notes, simple dishonesty on your (and your
brethen's) part. Everybody knows that the only reason we stick with
terminated strings is because of history.

Note: I fully understand why you are lying and I'll even say that it
(doing so) is a necessary evil. But, it is a lie nonetheless.

If so, then why're you doing the disservice of exposing it?
Because that is what I do.

I would, in fact, wager that most "muckrakers" - people who tell the
truth about government and other corrupt entities - actually know why
the lies are being told, but they choose to go ahead and tell the truth
(often at great personal peril) anyway, just because that is what they do.

Jan 28 '07 #37
"santosh" <sa*********@gmail.comwrites:
Kenny McCormack wrote:
[more of the same]
If so, then why're you doing the disservice of exposing it?
KM is a troll. I strongly recommend ignoring him, and killfiling him
if you're so inclined.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jan 28 '07 #38
In article <ln************@nuthaus.mib.org>,
Keith Thompson <ks***@mib.org> wrote:
>"santosh" <sa*********@gmail.com> writes:
>Kenny McCormack wrote:
[more of the same]
>If so, then why're you doing the disservice of exposing it?

KM is a troll. I strongly recommend ignoring him, and killfiling him
if you're so inclined.
KT is a moron. I strongly recommend ignoring him, and killfiling him
if you're so inclined.

Jan 29 '07 #39
santosh said:
>
Kenny McCormack wrote:
>In article <45****************@news.xs4all.nl>,
Richard Bos <rl*@hoekstra-uitgeverij.nl> wrote:
...
>good programming constructs. That is why, being well aware of the use of
strings in the average program, we know that counted strings are _less_
efficient under most circumstances than terminated strings, bumf and

Simply not true. As Jacob notes, simple dishonesty on your (and your
brethen's) part. Everybody knows that the only reason we stick with
terminated strings is because of history.

Note: I fully understand why you are lying and I'll even say that it
(doing so) is a necessary evil. But, it is a lie nonetheless.

If so, then why're you doing the disservice of exposing it?
If it is a lie, then it's a big one, and it should certainly be exposed. But
of course it's not a lie. Richard Bos may be many things, but he is no
liar. The reason we stick with terminated strings is... is... well, there
isn't one, because we *don't* all stick with them! At least, not all of us
do so all the time.

The C language provides support for a rudimentary string model, which makes
no great claims to be anything special, but which basically works. If that
is good enough for you, fine, use it - and it /is/ good enough for many
people, so they use it. But for many /other/ people, it isn't good enough,
because their needs (or desires, or perceptions) are different. So C makes
it fairly easy to develop your own string model in C.
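For illustration, a rolled-your-own counted-string model of the sort mentioned here can be as small as this; all names are hypothetical, and real libraries such as Bstrlib add growth, slicing, and error handling on top:

```c
#include <stdlib.h>
#include <string.h>

/* A minimal counted-string model: the length is stored alongside the
 * data, so length queries are O(1) and embedded NUL bytes can be
 * represented. A trailing NUL is kept for easy C interop. */
struct cstr {
    size_t len;
    char  *data;
};

static struct cstr cstr_from(const char *s)
{
    struct cstr r;
    r.len  = strlen(s);
    r.data = malloc(r.len + 1);
    if (r.data != NULL)
        memcpy(r.data, s, r.len + 1);
    else
        r.len = 0;                 /* allocation failed: empty string */
    return r;
}

static void cstr_free(struct cstr *s)
{
    free(s->data);
    s->data = NULL;
    s->len  = 0;
}
```

Whether this buys anything over plain char arrays depends entirely on the program, which is the point of the paragraph above.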

I've done this myself, but nevertheless I often find myself writing C
programs using the in-built string model. Why? Well, because it's simple
and quick to cut code that way. There are reasons to use more powerful
models, of course, but those reasons don't *always* apply. When they don't
apply, the good old-fashioned C string is perfectly adequate to the task
and is generally a bit quicker from the developer's (typist's!) point of
view. Someone who spends a lot of time writing programs that don't have to
deal with (the possibility of) insanely long inputs may well find that C
strings are more efficient than so-called "counted" (or "stretchy")
strings.

Just bear in mind when replying to Kenny McCormack that his articles give
every indication that he's not interested in C, not interested in helping
people, not interested in correctness, not interested in truth - he's only
interested in trying to poke fun at those who /are/ interested in C,
helping people, correctness, and truth. Don't expect reasoning, and don't
expect a shared objective. He's just trying to wreck the group. But we
don't have to let him.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.
Jan 29 '07 #40
"santosh" <sa*********@gmail.comwrote:
Kenny McCormack wrote:
In article <45****************@news.xs4all.nl>,
Richard Bos <rl*@hoekstra-uitgeverij.nl> wrote:
...
>good programming constructs. That is why, being well aware of the use of
>strings in the average program, we know that counted strings are _less_
>efficient under most circumstances than terminated strings, bumf and
Simply not true. As Jacob notes, simple dishonesty on your (and your
brethen's) part. Everybody knows that the only reason we stick with
terminated strings is because of history.

Note: I fully understand why you are lying and I'll even say that it
(doing so) is a necessary evil. But, it is a lie nonetheless.

If so, then why're you doing the disservice of exposing it?
Because Kenny is, as usual, talking bullshit. Just killfile it.

Richard
Jan 29 '07 #41
Richard Heathfield wrote:
>I.e., he (Jacob) learned from the best. He was not at all abusive until
quite a while after you and your ilk had been pouring it on him.

Again, not so.
I do see Jacob beaten up on by regulars, subtly and overtly, in the past. Why
not give the guy a break?
Feb 3 '07 #42
In article <11************@news-west.n>,
Christopher Layne <cl****@com.anodized> wrote:
>Richard Heathfield wrote:
>>I.e., he (Jacob) learned from the best. He was not at all abusive until
quite a while after you and your ilk had been pouring it on him.

Again, not so.

I do see Jacob beat up on by regulars subtlely and overtly in the past. Why
not give the guy a break?
Careful now. Logic makes these guys' little psyches hurt.

Feb 3 '07 #43
Christopher Layne said:
Richard Heathfield wrote:
>>I.e., he (Jacob) learned from the best. He was not at all abusive until
quite a while after you and your ilk had been pouring it on him.

Again, not so.

I do see Jacob beat up on by regulars subtlely and overtly in the past.
I don't. I see mistakes in his articles being corrected by regulars. It is
only because he makes so many mistakes that he is corrected so often.
Why not give the guy a break?
Gladly. All he has to do is make fewer mistakes. Then he won't get corrected
so often. It's very simple.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.
Feb 3 '07 #44
In article <N7*********************@bt.com>,
Loony Richard Heathfield <rj*@see.sig.invalid> blathered as usual:
....
>I don't. I see mistakes in his articles being corrected by regulars. It is
only because he makes so many mistakes that he is corrected so often.
>Why not give the guy a break?

Gladly. All he has to do is make fewer mistakes. Then he won't get corrected
so often. It's very simple.
What a tool!

Feb 3 '07 #45
Richard Heathfield wrote:
Gladly. All he has to do is make fewer mistakes. Then he won't get corrected
so often. It's very simple.
The same rationale abusers use as well.
Feb 3 '07 #46
Christopher Layne said:
Richard Heathfield wrote:
>Gladly. All he has to do is make fewer mistakes. Then he won't get
corrected so often. It's very simple.

The same rationale abusers use as well.
When I make mistakes, I hope and expect that others will correct me. When
others make mistakes, then, it is only courteous for me to correct those
mistakes, if I happen to notice them and if there is time available to me
to do that. You can misdescribe the process as much as you wish, but that
doesn't change the facts.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.
Feb 3 '07 #47
In article <11************@news-west.n>,
Christopher Layne <cl****@com.anodizedwrote:
>Richard Heathfield wrote:
>Gladly. All he has to do is make fewer mistakes. Then he won't get
corrected so often. It's very simple.

The same rationale abusers use as well.
Funny how that works, innit?

Feb 3 '07 #48
In article <Bb*********************@bt.com>,
Richard Heathfield <rj*@see.sig.invalid> wrote:
>Christopher Layne said:
>Richard Heathfield wrote:
>>Gladly. All he has to do is make fewer mistakes. Then he won't get
corrected so often. It's very simple.

The same rationale abusers use as well.

When I make mistakes, I hope and expect that others will correct me. When
others make mistakes, then, it is only courteous for me to correct those
mistakes, if I happen to notice them and if there is time available to me
to do that.
Spoken like a true abuser.
>You can misdescribe the process as much as you wish, but that
doesn't change the facts.
You are living proof of that.

Feb 3 '07 #49
On Sat, 03 Feb 2007 14:32:41 +0000, in comp.lang.c , Richard
Heathfield <rj*@see.sig.invalid> wrote:
>Christopher Layne said:
>Richard Heathfield wrote:
>>Gladly. All he has to do is make fewer mistakes. Then he won't get
corrected so often. It's very simple.

The same rationale abusers use as well.
That doesn't make it wrong, merely hijacked.
>When I make mistakes, I hope and expect that others will correct me.
Doesn't often happen of course... :-)
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
Feb 3 '07 #50

This discussion thread is closed

Replies have been disabled for this discussion.
