Bytes | Software Development & Data Engineering Community
What is Expressiveness in a Computer Language

In March, I posted an essay, “What is Expressiveness in a Computer
Language”, archived at:
http://xahlee.org/perl-python/what_i...esiveness.html

I was informed then that there is an academic paper written on this
subject:

On the Expressive Power of Programming Languages, by Matthias
Felleisen, 1990.
http://www.ccs.neu.edu/home/cobbe/pl...ive-slides.pdf

Has anyone read this paper? And, would anyone be interested in giving a
summary?

thanks.

Xah
xa*@xahlee.org
http://xahlee.org/

Jun 9 '06
Joachim Durchholz wrote:
Anton van Straaten schrieb:
It seems we have languages:
with or without static analysis
with or without runtime type information (RTTI or "tags")
with or without (runtime) safety
with or without explicit type annotations
with or without type inference

Wow. And I don't think that's a complete list, either.


Yup.


What else?
(Genuinely curious.)


I left off a big one: nominal vs. structural

Also: has subtyping polymorphism or not, has parametric polymorphism or
not.
Marshall

Jun 25 '06 #401
Anton van Straaten <an***@appsolutions.com> wrote:
The problem is that there are no useful sound definitions for the type
systems (in the static sense) of dynamically-typed languages. Yet, we
work with type-like static properties in those languages all the time,
as I've been describing.
I honestly don't find it very compelling that there are similarities
between the kinds of reasoning that goes on in static type systems
versus those used by programmers in dynamically typed languages.
Of course there are similarities; it would be shocking if there were
not. These similarities, though, have nothing to do with the core
nature of types.

Basically, there are ways to reason about programs being correct.
Static type systems describe their reasoning (which is always axiomatic
in nature) by assigning "types" to expressions. Programmers in
dynamically typed languages still care about the correctness of their
programs, though, and they find ways to reason about their programs as
well. That reasoning is not completely axiomatic, but we spend an
entire lifetime applying this basic logical system, and it makes sense
that programmers would try to reason from premises to conclusions in
ways similar to a type system. Sometimes they may even ape other type
systems with which they are familiar; but even a programmer who has
never worked in a typed language would apply the same kind of logic as
is used by type systems, because it's a form of logic with which
basically all human beings are familiar and have practiced since the age
of three (disclaimer: I am not a child psychologist).

On the other hand, programmers don't restrict themselves to that kind of
pure axiomatic logic (regardless of whether the language they are
working in is typed). They also reason inductively -- in the informal
sense -- and are satisfied with the resulting high probabilities. They
generally apply intuitions about a problem statement that is almost
surely not written clearly enough to be understood by a machine. And so
forth.

What makes static type systems interesting is not the fact that these
logical processes of reasoning exist; it is the fact that they are
formalized with definite axioms and rules of inference, performed
entirely on the program before execution, and designed to be entirely
evaluatable in finite time bounds according to some procedure or set of
rules, so that a type system may be checked and the same conclusion
reached regardless of who (or what) is doing the checking. All that,
and they still reach interesting conclusions about programs. If
informal reasoning about types doesn't meet these criteria (and it
doesn't), then it simply doesn't make sense to call it a type system
(I'm using the static terminology here). It may make sense to discuss
some of the ways that programmer reasoning resembles types, if indeed
there are resemblances beyond just that they use the same basic rules of
logic. It may even make sense to try to identify a subset of programmer
reasoning that even more resembles... or perhaps even is... a type
system. It still doesn't make sense to call programmer reasoning in
general a type system.

In the same post here, you simultaneously suppose that there's something
inherently informal about the type reasoning in dynamic languages; and
then talk about the type system "in the static sense" of a dynamic
language. How is this possibly consistent?
We know that we can easily take dynamically-typed program fragments and
assign them type schemes, and we can make sure that the type schemes
that we use in all our program fragments use the same type system.
Again, it would be surprising if this were not true. If programmers do,
in fact, tend to reason correctly about their programs, then one would
expect that there are formal proofs that could be found that they are
right. That doesn't make their programming in any way like a formal
proof. I tend to think that a large number of true statements are
provable, and that programmers are good enough to make a large number of
statements true. Hence, I'd expect that it would be possible to find a
large number of true, provable statements in any code, regardless of
language.
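To make this concrete (my own illustrative sketch, not drawn from anyone's post): even a completely unannotated fragment in a dynamically typed language often admits an obvious type scheme, which is the sense in which provable statements lurk in code regardless of language.

```python
def twice(f, x):
    # No annotations anywhere, yet this fragment admits the
    # Hindley-Milner-style type scheme (a -> a, a) -> a, provable
    # independently of any particular use of the function.
    return f(f(x))

# The same scheme, instantiated at two different types:
print(twice(lambda n: n + 1, 0))       # a = int  → 2
print(twice(lambda s: s + "!", "hi"))  # a = str  → hi!!
```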
So we actually have quite a bit of evidence about the presence of static
types in dynamically-typed programs.
No. What we have is quite a bit of evidence about properties remaining
true in dynamically typed programs, which could have been verified by
static typing.
We're currently discussing something that so far has only been captured
fairly informally. If we restrict ourselves to only what we can say
about it formally, then the conversation was over before it began.


I agree with that statement. However, I think the conversation
regarding static typing is also over when you decide that the answer is
to weaken static typing until the word applies to informal reasoning.
If the goal isn't to reach a formal understanding, then the word "static
type" shouldn't be used; and when that is the goal, it still shouldn't
be applied to processes that aren't formalized until they manage to become
formalized.

Hopefully, this isn't perceived as too picky. I've already conceded
that we can use "type" in a way that's incompatible with all existing
research literature. I would, however, like to retain some word which
actually has that meaning. Otherwise, we are just going to be
linguistically prevented from discussing it.

--
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
Jun 25 '06 #402
Scott David Daniels wrote:
ro******@ps.uni-sb.de wrote:
Huh? There is a huge, fundamental difference: namely whether a type
system is sound or not. A soundness proof is obligatory for any
serious type theory, and failure to establish it simply is a bug in
the theory.


So you claim Java and Objective C are "simply bugs in the theory."


They are certainly off topic in CLPM.

jue
Jun 25 '06 #403
Chris Uppal wrote:
It seems to me that most (all ? by definition ??) kinds of reasoning where we
want to invoke the word "type" tend to have a form where we reduce values (and
other things we want to reason about) to equivalence classes[*] w.r.t the
judgements we wish to make, and (usually) enrich that structure with
more-or-less stereotypical rules about which classes are substitutable for
which other ones. So that once we know what "type" something is, we can
short-circuit a load of reasoning based on the language semantics, by
translating into type-land where we already have a much simpler calculus to
guide us to the same answer.

(Or representative symbols, or... The point is just that we throw away the
details which don't affect the judgements.)


One question: is this more specific than simply saying that we use
predicate logic? A predicate divides a set into two subsets: those
elements for which the predicate is true, and those for which it is
false. If we then apply axioms or theorems of the form P(x) -> Q(x),
then we are reasoning from an "equivalence class", throwing away the
details we don't care about (anything that's irrelevant to the
predicate), and applying stereotypical rules (the theorem or axiom). I
don't quite understand what you mean by substituting classes.
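For what it's worth, here is a small sketch (purely my own illustration) of the pattern I'm describing: a predicate splits a set in two, and a rule of the form P(x) -> Q(x) then applies to the whole class at once, without re-examining individual elements.

```python
# A predicate partitions a set into two "classes":
def is_even(n):
    return n % 2 == 0

values = list(range(10))
evens = [n for n in values if is_even(n)]
odds = [n for n in values if not is_even(n)]

# A stereotypical rule P(x) -> Q(x): "n even implies n*n even".
# Once class membership is established, the conclusion follows for
# every member at once; the details thrown away don't matter.
assert all((n * n) % 2 == 0 for n in evens)
assert all((n * n) % 2 == 1 for n in odds)
```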

--
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
Jun 25 '06 #404
Chris Smith schrieb:
Joachim Durchholz <jo@durchholz.org> wrote:
Sorry, I have to insist that it's not me who's stretching terms here.

All textbook definitions that I have seen define a type as the
set/operations/axioms triple I mentioned above.
No mention of immutability, at least not in the definitions.


The immutability comes from the fact (perhaps implicit in these
textbooks, or perhaps they are not really texts on formal type theory)
that types are assigned to expressions,


That doesn't *define* what's a type or what isn't!

If it's impossible to assign types to all expressions of a program in a
language, that does mean that there's no useful type theory for the
program, but it most definitely does not mean that there are no types in
the program.
I can still sensibly talk about sets of values, sets of allowable
operations over each value, and about relationships between inputs and
outputs of these operations.

So programs have types, even if they don't have a static type system.
Q.E.D.
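(A quick illustration of what I mean, sketched in Python for concreteness, though nothing hinges on the language: one can sensibly talk about the set a value belongs to and its allowable operations even though no static system assigns types to expressions.)

```python
# Each value belongs to a set with a definite collection of allowable
# operations, checked at run time rather than assigned statically:
x = "3"
print(x + "4")           # str supports concatenation → 34
try:
    x + 4                # mixing the sets is rejected at run time...
except TypeError as e:
    print("rejected:", e)
print(type(x).__name__)  # ...and the set membership is queryable → str
```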

Regards,
Jo
Jun 25 '06 #405
Gabriel Dos Reis wrote:
ro******@ps.uni-sb.de writes:

| Gabriel Dos Reis wrote:
| > |
| > | (Unfortunately, you can hardly write interesting programs in any safe
| > | subset of C.)
| >
| > Fortunately, some people do, as a living.
|
| I don't think so. Maybe the question is what a "safe subset" consists
| of. In my book, it excludes all features that are potentially unsafe.

if you equate "unsafe" with "potentially unsafe", then you have
changed gear and redefined things on the fly, and things that could
be given sense before cease to have meaning. I decline to follow
such an uncertain, extreme path.
An unsafe *feature* is one that can potentially exhibit unsafe
behaviour. How else would you define it, if I may ask?

A safe *program* may or may not use unsafe features, but that is not
the point when we talk about safe *language subsets*.
I would suggest you give more thoughts to the claims made in

http://www.seas.upenn.edu/~sweirich/.../msg00298.html


Thanks, I am aware of it. Taking into account the hypothetical nature
of the argument, and all the caveats listed with respect to C, I do not
think that it is too relevant for the discussion at hand. Moreover,
Harper talks about a relative concept of "C-safety". I assume that
everybody here understands that by "safe" in this discussion we mean
something else (in particular, memory safety).

Or are you trying to suggest that we should indeed consider C safe for
the purpose of this discussion?

- Andreas

Jun 25 '06 #406
ro******@ps.uni-sb.de schrieb:
Joachim Durchholz write:
Another observation: type safeness is more of a spectrum than a clearcut
distinction. Even ML and Pascal have ways to circumvent the type system,
No. I'm not sure about Pascal,


You'd have to use an untagged union type.
It's the standard maneuver in Pascal to get unchecked bitwise
reinterpretation.
Since it's an undefined behavior, you're essentially on your own, but in
practice, any compilers that implemented a different semantics were
hammered with bug reports until they complied with the standard - in
this case, an unwritten (and very unofficial) one, but a rather
effective one.
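(For readers who haven't seen the trick: the effect is unchecked bitwise reinterpretation. A rough analogy, sketched in Python with the struct module rather than Pascal variant records, so the bit-level effect is visible:)

```python
import struct

# Write one arm of an "untagged union" and read the other: the bit
# pattern 0x40490FDB, stored as a little-endian 32-bit int, reads
# back as an IEEE-754 single-precision float.
bits = struct.pack('<i', 0x40490FDB)
value, = struct.unpack('<f', bits)
print(value)  # approximately 3.14159: the int's bits, reinterpreted
```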
but (Standard) ML surely has none.
NLFFI?
Same with Haskell as defined by its spec.
Um... I'm not 100% sure, but I dimly (mis?)remember having read that
unsafePerformIO also offered some ways to circumvent the type system.

(Not that this would be an important point anyway.)
OCaml has a couple of clearly
marked unsafe library functions, but no unsafe feature in the language
semantics itself.
If there's a library with not-typesafe semantics, then at least that
language implementation is not 100% typesafe.
If all implementations of a language are not 100% typesafe, then I
cannot consider the language itself 100% typesafe.

Still, even Pascal is quite a lot "more typesafe" than, say, C.
and even C is typesafe unless you use unsafe constructs.


Tautology. Every language is "safe unless you use unsafe constructs".


No tautology - the unsafe constructs aren't just typewise unsafe ;-p

That's exactly why I replaced Luca Cardelli's "safe/unsafe" with
"typesafe/not typesafe". There was no definition to the original terms
attached, and this discussion is about typing anyway.
(Unfortunately, you can hardly write interesting programs in any safe
subset of C.)


Now that's a bold claim that I'd like to see substantiated.
IOW from a type-theoretic point of view, there is no real difference
between their typesafe and not typesafe languages in the "statically
typed" column; the difference is in the amount of unsafe construct usage
in practical programs.


Huh? There is a huge, fundamental difference: namely whether a type
system is sound or not.


I think you're overstating the case.

In type theory, of course, there's no such thing as an "almost typesafe
language" - it's typesafe or it isn't.

In practice, I find no implementation where type mismatches cannot
occur, if only when interfacing with the external world (except if you
cheat by treating everything external as a byte sequence, which is like
saying that all languages have at least a universal ANY type and are
hence statically-typed).
And in those languages that do have type holes, these holes may be more
or less relevant - and it's a *very* broad spectrum here.
And from that perspective, if ML indeed has no type hole at all, then
it's certainly an interesting extremal point, but I still have a
relevant spectrum down to assembly language.

Regards,
Jo
Jun 25 '06 #407
Scott David Daniels wrote:
ro******@ps.uni-sb.de wrote:
Huh? There is a huge, fundamental difference: namely whether a type
system is sound or not. A soundness proof is obligatory for any serious
type theory, and failure to establish it simply is a bug in the theory.


So you claim Java and Objective C are "simply bugs in the theory."


You're misquoting me. What I said is that lack of soundness of a
particular type theory (i.e. a language plus type system) is a bug in
that theory.

But anyway, Java in fact is frequently characterised as having a bogus
type system (e.g. with respect to array subtyping). Of course, in a
hybrid language like Java with its dynamic checks it can be argued
either way.

Naturally, the type system of Objective-C is as broken as the one of
plain C.

- Andreas

Jun 25 '06 #408
ro******@ps.uni-sb.de schrieb:
Gabriel Dos Reis wrote:
|
| (Unfortunately, you can hardly write interesting programs in any safe
| subset of C.)

Fortunately, some people do, as a living.


I don't think so. Maybe the question is what a "safe subset" consists
of. In my book, it excludes all features that are potentially unsafe.


Unless you define "safe", *any* program is unsafe.
Somebody could read the program listing, which could trigger a traumatic
childhood experience.
(Yes, I'm being silly. But the point is very serious. Even with less
silly examples, whether a language or subset is "safe" entirely depends
on what you define to be "safe", and these definitions tend to vary
vastly across language communities.)

Regards,
Jo
Jun 25 '06 #409
Pascal Costanza schrieb:
Another observation: type safeness is more of a spectrum than a
clearcut distinction. Even ML and Pascal have ways to circumvent the
type system, and even C is typesafe unless you use unsafe constructs.
IOW from a type-theoretic point of view, there is no real difference
between their typesafe and not typesafe languages in the "statically
typed" column; the difference is in the amount of unsafe construct
usage in practical programs.


It's also relevant how straightforward it is to distinguish between safe
and unsafe code, how explicit you have to be when you use unsafe code,
how likely it is that you accidentally use unsafe code without being
aware of it, what the generally accepted conventions in a language
community are, etc. pp.


Fully agreed.
Jun 25 '06 #410
ro******@ps.uni-sb.de wrote:
In this context, the term "latently-typed language" refers to the
language that a programmer experiences, not to the subset of that
language which is all that we're typically able to formally define.

That language is not a subset; if anything, it's the other way round, but
I'd say they are rather incomparable. That is, they are different
languages.


The "subset" characterization is not important for what I'm saying. The
fact that they are different languages is what's important. If you
agree about that, then you can at least understand which language I'm
referring to when I say "latently-typed language".

Besides, many dynamically-typed languages have no formal models, in
which case the untyped formal model I've referred to is just a
speculative construct. The language I'm referring to with
"latently-typed language" is the language that programmers are familiar
with, and work with.
That is starting to get a bit too mystical for my tastes.

I have to agree.

\sarcasm One step further, and somebody starts calling C a "latently
memory-safe language", because a real programmer "knows" that his code
is in a safe subset... And where he is wrong, dynamic memory page
protection checks will guide him.


That's a pretty apt comparison, and it probably explains how it is that
the software we all use, which relies so heavily on C, works as well as
it does.

But the comparison critiques the practice of operating without static
guarantees, it's not a critique of the terminology.

Anton
Jun 25 '06 #411
Joachim Durchholz <jo@durchholz.org> wrote:
The immutability comes from the fact (perhaps implicit in these
textbooks, or perhaps they are not really texts on formal type theory)
that types are assigned to expressions,


That doesn't *define* what's a type or what isn't!


I'm sorry if you don't like it, but that's all there is. That's the
point that's being missed here. It's not as if there's this thing
called a type, and then we can consider how it is used by formal type
systems. A type, in formal type theory, is ANY label that is assigned
to expressions of a program for the purpose of making a formal type
system work. If you wish to claim a different use of the word and then
try to define what is or is not a type, then be my guest. However,
formal type theory will still not adopt whatever restrictions you come
up with, and will continue to use the word type as the field has been
using the word for a good part of a century now.

The first step toward considering the similarities and differences
between the different uses of "type" in dynamic type systems and formal
type theory is to avoid confusing aspects of one field with the other.

--
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
Jun 25 '06 #412
Joachim Durchholz wrote:
Chris Smith schrieb:
Joachim Durchholz <jo@durchholz.org> wrote:
Sorry, I have to insist that it's not me who's stretching terms here.

All textbook definitions that I have seen define a type as the
set/operations/axioms triple I mentioned above.
No mention of immutability, at least not in the definitions.
The immutability comes from the fact (perhaps implicit in these
textbooks, or perhaps they are not really texts on formal type theory)
that types are assigned to expressions,


That doesn't *define* what's a type or what isn't!

If it's impossible to assign types to all expressions of a program in a
language, that does mean that there's no useful type theory for the
program, but it most definitely does not mean that there are no types in
the program.
I can still sensibly talk about sets of values, sets of allowable
operations over each value, and about relationships between inputs and
outputs of these operations.

So programs have types, even if they don't have a static type system.
Q.E.D.


Of course not. Otherwise programs using dynamically typed systems
wouldn't exist.
Regards,
Jo


I haven't read all of this thread, I wonder, is the problem to do with
Class being mistaken for Type? (which is usually the issue)
Jun 25 '06 #413
Chris F Clark wrote:
Chris F Clark (I) wrote:
I'm particularly interested if something unsound (and perhaps
ambiguous) could be called a type system. I definitely consider such
things type systems.
"Marshall" <ma*************@gmail.com> wrote:
I don't understand. You are saying you prefer to investigate the
unsound over the sound?

...
Again, I cannot understand this. In a technical realm, vagueness
is the opposite of understanding.


At the risk of injecting too much irrelevant philosophy into the
discussion, I will with great trepidation reply.


I agree this is OT, but I'm not sure about the source of trepidation.

First, in the abstract: no, understanding is approximating.
Agreed.

The world is inherently vague.
Our understanding of the world is vague. The world itself is
not at all vague.

We make false symbolic models of the world which
are consistent, but at some level they do not reflect reality,
Yes...
because reality isn't consistent.
What?!

Only by abstracting away the inherent
infinite amount of subtlety present in the real universe can we come to
comprehensible models.
Sure. (Although I object to "infinite.")

Those models can be consistent, but they are
not the universe. The models in their consistency, prove things which
are not true about the real universe.
Sure, sure, sure. None of these is a reason to prefer the unsound
over the sound.

Now in the concrete: In my work productivity is ultimately important.
Therefore, we live by the 80-20 rule in numerous variations. One of
ths things we do to achieve productivity is simplify things. In fact,
we are more interested in an unsound simplification that is right 80%
of the time, but takes only 20% of the effort to compute, than a
completely sound (100% right) model which takes 100% of the time to
compute (which is 5 times as long). We are playing the probabilities.
What you are describing is using a precise mathematical function
to approximate a different precise mathematical function. This
argues for the value of approximation functions, which I do not
dispute. But this does not in any way support the idea of vague
trumping precise, informal trumping formal, or unsoundness as
an end in itself.

It's not that we don't respect the sound underpining, the model which
is consistent and establishes certain truths. However, what we want
is the rule of thumb which is far simpler and approximates the sound
model with reasonable accuracy. In particular, we accept two types of
unsoundness in the model. One, we accept a model which gives wrong
answers which are detectable. We code tests to catch those cases, and
use a different model to get the right answer. Two, we accept a model
which gets the right answer only when over-provisioned. For example,
if we know a loop will occasionally not converge in the n steps that
it theoretically should, we will run the loop for n+m steps until the
approximation also converges--even if that requires allocating m more
copies of some small element than are theoretically necessary. A
small waste of a small resource, is ok if it saves a waste of a more
critical resource--the most critical resource of all being our project
timeliness.
Yes, I'm quite familiar with modelling, abstraction, approximation,
etc. However, nothing about those endeavours suggests to me
that unsoundness is any kind of goal.

Marshall's last point:
I flipped a coin to see who would win the election; it came
up "Bush". Therefore I *knew* who was going to win the
election before it happened. See the problem?
Flipping one coin to determine an election is not playing the
probabilities correctly. You need a plausible explanation for why the
coin should predict the right answer and a track record of it being
correct. If you had a coin that had correctly predicted the previous
42 presidencies and you had an explanation why the coin was right,
then it would be credible and I would be willing to wager that the
coin could tell us who the 44th president would be. One flip and no
explanation is not sufficient. (And to the
abstract point, to me that is all knowledge is, some convincing amount
of examples and a plausible explanation--anyone who thinks they have
more is appealing to a "knowledge" of the universe that I don't
accept.)


I used a coin toss; I could also have used a psychic hotline. There
is an explanation for why those things work, but the explanation
is unsound.

Look at where that got Russell and Whitehead.
Universal acclaim, such that their names will be praised for
centuries to come?

I'm just trying to be "honest" about that fact and find ways to
compensate for my own failures.


Limitation != failure.
Marshall

Jun 25 '06 #414
Andrew McDonagh <ne**@andmc.com> wrote:
I haven't read all of this thread, I wonder, is the problem to do with
Class being mistaken for Type? (which is usually the issue)


Hi Andrew!

Not much of this thread has to do with object oriented languages... so
the word "class" would be a little out of place. However, it is true
that the word "type" is being used in the dynamically typed sense to
include classes from class-based OO languages (at least those that
include run-time type features), as well as similar concepts in other
languages. Some of us are asking for, and attempting to find, a formal
definition to justify this concept, and are so far not finding it.
Others are saying that the definition is somehow implicitly
psychological in nature, and therefore not amenable to formal
definition... which others (including myself) find rather unacceptable.

I started out insisting that "type" be used with its correct formal
definition, but I'm convinced that was a mistake. Asking someone to
change their entire terminology is unlikely to succeed. I'm now
focusing on just trying to represent the correct formal definition of
types in the first place, and make it clear when one or the other
meaning is being used.

Hopefully, that's a fair summary of the thread to date.

--
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
Jun 25 '06 #415
Chris Smith wrote:
Andrew McDonagh <ne**@andmc.com> wrote:
I haven't read all of this thread, I wonder, is the problem to do with
Class being mistaken for Type? (which is usually the issue)
Hi Andrew!


Hi Chris

Not much of this thread has to do with object oriented languages... so
the word "class" would be a little out of place.
Glad to hear.
However, it is true
that the word "type" is being used in the dynamically typed sense to
include classes from class-based OO languages (at least those that
include run-time type features), as well as similar concepts in other
languages. Some of us are asking for, and attempting to find, a formal
definition to justify this concept, and are so far not finding it.
Others are saying that the definition is somehow implicitly
psychological in nature, and therefore not amenable to formal
definition... which others (including myself) find rather unacceptable.

I started out insisting that "type" be used with its correct formal
definition, but I'm convinced that was a mistake. Asking someone to
change their entire terminology is unlikely to succeed. I'm now
focusing on just trying to represent the correct formal definition of
types in the first place, and make it clear when one or the other
meaning is being used.

Hopefully, that's a fair summary of the thread to date.


Cheers much appreciated!

Andrew
Jun 25 '06 #416
On Sun, 25 Jun 2006 13:42:45 +0200, Joachim Durchholz
<jo@durchholz.org> wrote:
George Neuner schrieb:
The point is really that the checks that prevent these things must be
performed at runtime and can't be prevented by any practical type
analysis performed at compile time. I'm not a type theorist but my
opinion is that a static type system that could, a priori, prevent the
problem is impossible.
No type theory is needed.
Assume that the wide index type goes into a function and the result is
assigned to a variable of the narrow type, and it's instantly clear that
the problem is undecidable.


Yes ... the problem is undecidable and that can be statically checked.
But the result is that your program won't compile even if it can be
proved at runtime that an illegal value would never be possible.

Undecidability can always be avoided by adding annotations, but of
course that would be gross overkill in the case of index type widening.


Just what sort of type annotation will convince a compiler that a
narrowing conversion which could produce an illegal value will, in
fact, never produce an illegal value?

[Other than "don't check this" of course.]
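To pin down the scenario (my sketch, not from the original exchange): the narrowing conversion below is safe or unsafe depending entirely on runtime data, which is why the check can't in general be discharged at compile time.

```python
NARROW_MAX = 127  # hypothetical narrow type with range 0..127

def narrow(wide):
    # The range check must run at run time; whether it can ever fail
    # depends on where `wide` came from, which is undecidable in general.
    if not 0 <= wide <= NARROW_MAX:
        raise ValueError("out of range for narrow type")
    return wide

print(narrow(100))       # fits: passes through unchanged
try:
    narrow(300)          # doesn't fit: rejected at run time
except ValueError as e:
    print("rejected:", e)
```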

George
--
for email reply remove "/" from address
Jun 25 '06 #417
Joachim Durchholz wrote:
> but (Standard) ML surely has none.


NLFFI?
> Same with Haskell as defined by its spec.


Um... I'm not 100% sure, but I dimly (mis?)remember having read that
unsafePerformIO also offered some ways to circumvent the type system.


Neither NLFFI nor unsafePerformIO are official part of the respective
languages. Besides, foreign function interfaces are outside a language
by definition, and can hardly be taken as an argument - don't blame
language A that unsafety arises when you subvert it by interfacing with
unsafe language B on a lower level.
and even C is typesafe unless you use unsafe constructs.


Tautology. Every language is "safe unless you use unsafe constructs".


No tautology - the unsafe constructs aren't just typewise unsafe ;-p

That's exactly why I replaced Luca Cardelli's "safe/unsafe" with
"typesafe/not typesafe". There was no definition to the original terms
attached, and this discussion is about typing anyway.


The Cardelli paper I was referring to discusses it in detail. And if
you look up the context of my posting: it was exactly whether safety is
to be equated with type safety.
IOW from a type-theoretic point of view, there is no real difference
between their typesafe and not typesafe languages in the "statically
typed" column; the difference is in the amount of unsafe construct usage
in practical programs.


Huh? There is a huge, fundamental difference: namely whether a type
system is sound or not.


I think you're overstating the case.

In type theory, of course, there's no such thing as an "almost typesafe
language" - it's typesafe or it isn't.


Right, and you were claiming to argue from a type-theoretic POV.

[snipped the rest]

- Andreas

Jun 25 '06 #418
Marshall wrote:
Also: has subtyping polymorphism or not, has parametric polymorphism or
not.


And covariant or contravariant.

--
Darren New / San Diego, CA, USA (PST)
Native Americans used every part
of the buffalo, including the wings.
Jun 25 '06 #419
David Hopwood wrote:
Marshall wrote:
David Hopwood wrote:
A type system that required an annotation on all subprograms that do not
provably terminate, OTOH, would not impact expressiveness at all, and would
be very useful.
Interesting. I have always imagined doing this by allowing an
annotation on all subprograms that *do* provably terminate. If
you go the other way, you have to annotate every function that
uses general recursion (or iteration if you swing that way) and that
seems like it might be burdensome.


Not at all. Almost all subprograms provably terminate (with a fairly
easy proof), even if they use general recursion or iteration.


Well, um, hmmm. If a subprogram uses recursion, and it is not
structural recursion, then I don't know how to go about proving
it terminates. Are there termination-proof techniques I don't
know about?

If we can specify a total order on the domain or some
subcomponents of the domain, and show that recursive
calls are always invoked with arguments less than (by
the order) those of the parent call, (and the domain is
well founded) then we have a termination proof.

(If we don't use recursion at all, and we only invoke
proven-terminating functions, same thing; actually
this is a degenerate case of the above.)

For iteration? Okay, a bounded for-loop is probably easy,
but a while loop? What about a function that calculates
the next prime number larger than its argument? Do we
have to embed a proof of an infinity of primes somehow?
That seems burdensome.
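The two cases just contrasted can be made concrete. Below is a minimal Python sketch (the function names are illustrative, not from the thread): the recursive gcd terminates by an obvious well-founded measure on its second argument, while the next-prime search terminates only because of an outside number-theoretic fact (infinitely many primes; Bertrand's postulate even bounds the search) that a checker would somehow have to know.

```python
def gcd(a: int, b: int) -> int:
    # Terminates: on each recursive call the second argument strictly
    # decreases (0 <= a % b < b), and the naturals are well founded,
    # so the recursion cannot descend forever.
    if b == 0:
        return a
    return gcd(b, a % b)

def next_prime(n: int) -> int:
    # Terminates only because infinitely many primes exist; a proof
    # checker would need that fact (Bertrand's postulate bounds the
    # search by 2*n for n > 1) to discharge the termination proof.
    def is_prime(k: int) -> bool:
        return k >= 2 and all(k % d for d in range(2, int(k ** 0.5) + 1))
    candidate = n + 1
    while not is_prime(candidate):
        candidate += 1
    return candidate
```

The first function's proof is the easy, common case; the second is the kind that would plausibly need an annotation.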

If it were not the case that almost all functions provably terminate,
then the whole idea would be hopeless.
I agree!

If a subprogram F calls G, then
in order to prove that F terminates, we probably have to prove that G
terminates. Consider a program where only half of all subprograms are
annotated as provably terminating. In that case, we would be faced with
very many cases where the proof cannot be discharged, because an
annotated subprogram calls an unannotated one.
Right, and you'd have to be applying the non-terminating annotation
all over the place.
If, on the other hand, more than, say, 95% of subprograms provably
terminate, then it is much more likely that we (or the inference
mechanism) can easily discharge any particular proof involving more
than one subprogram. So provably terminating should be the default,
and other cases should be annotated.
Well, I can still imagine that the programmer doesn't care to have
non-termination examined for every part of his code. In which case,
he would still be required to add annotations even when he doesn't
care about a particular subprogram's lack of a termination proof.

The pay-for-it-only-if-you-want-it approach has some benefits.
On the other hand, if it really is as common and easy as you
propose, then annotating only when no proof is available is
perhaps feasible.

I'm still a bit sceptical, though.

I do not know how well such a type system would work in practice; it may
be that typical programs would involve too many non-trivial proofs. This
is something that would have to be tried out in a research language.


Yeah.
Marshall

Jun 25 '06 #420
George Neuner <gneuner2/@comcast.net> wrote:
Undecidability can always be avoided by adding annotations, but of
course that would be gross overkill in the case of index type widening.


Just what sort of type annotation will convince a compiler that a
narrowing conversion which could produce an illegal value will, in
fact, never produce an illegal value?


The annotation doesn't make the narrowing conversion safe; it prevents
the narrowing conversion from happening. If, for example, I need to
subtract two numbers and all I know is that they are both between 2 and
40, then I only know that the result is between -38 and 38, which may
contain invalid array indices. However, if the numbers were part of a
pair, and I knew that the type of the pair was <pair of numbers, 2
through 40, where the first number is greater than the second>, then I
would know that the difference is between 0 and 38, and that may be a
valid index.

Of course, the restrictions on code that would allow me to retain
knowledge of the form [pair of numbers, 2 through 40, a > b] may be
prohibitive. That is an open question in type theory, as a matter of
fact; whether types of this level of power may be inferred by any
tractable procedure so that safe code like this may be written without
giving the development process undue difficulty by requiring ten
times as much type annotation as actual code. There are attempts that
have been made, and they don't look too awfully bad.
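The ordered-pair example above can be sketched with runtime checks standing in for the refinement type (the constructor and bounds below are illustrative; this is a dynamic check, not the static inference the open question is about):

```python
def make_ordered_pair(a, b):
    # Refinement from the example: both components in 2..40, with the
    # first at least the second.
    assert 2 <= b <= a <= 40, "refinement violated: need 2 <= b <= a <= 40"
    return (a, b)

def safe_index(pair):
    a, b = pair
    # Given the refinement, a - b is guaranteed to lie in 0..38, so it
    # can index a 39-element array without a further bounds check.
    return a - b
```

A static system of the power discussed would establish the 0..38 range at compile time instead of asserting it at construction.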

--
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
Jun 25 '06 #421
Chris F Clark <cf*@shell01.TheWorld.com> (I) wrote:
Do you reject that there could be something more general than what a
type theorist discusses? Or do you reject calling such things a type?
Chris Smith <cd*****@twu.net> wrote:
I think that the correspondence is partly in the wrong direction to
describe it that way. .... What I can't agree to is that what you propose is actually more general.
It is more general in some ways, and more limited in others. As such,
the best you can say is that is analogous to formal types in some ways,
but certainly not that it's a generalization of them.
Yes, I see your point.

Me: I'm particularly interested if something unsound (and perhaps
ambiguous) could be called a type system.
Chris Smith:
Yes, although this is sort of a pedantic question in the first place, I
believe that an unsound type system would still generally be called a
type system in formal type theory. However, I have to qualify that with
a clarification of what is meant by unsound. We mean that it DOES
assign types to expressions, but that those types are either not
sufficient to prove what we want about program behavior, or they are not
consistent so that a program that was well typed may reduce to poorly
typed program during its execution. Obviously, such type systems don't
prove anything at all, but they probably still count as type systems.


Yes, and you see mine.

These informal systems, which may not prove what they claim to prove
are my concept of a "type system". That's because at some point,
there is a person reasoning informally about the program. (There may
also be people reasoning formally about the program, but I am going to
neglect them.) It is to the people who are reasoning informally about
the program we wish to communicate that there is a type error. That
is, we want someone who is dealing with the program informally to
realize that there is an error, and that this error is somehow related
to some input not being kept within proper bounds (inside the domain
set) and that as a result the program will compute an unexpected and
incorrect result.

I stressed the informal aspect, because the intended client of the
program is almost universally *NOT* someone who is thinking rigorously
about some mathematical model, but is dealing with some "real world"
phenomena which they wish to model and may not have a completely
formed concrete representation of. Even the author of the program is
not likely to have a completely formed rigorous model in mind, only
some approximation to such a model.

I also stress the informality, because beyond a certain nearly trivial
level of complexity, people are not capable of dealing with truly
formal systems. As a compiler person, I often have to read sections
of language standards to try to discern what the standard is requiring
the compiler to do. Many language standards attempt to describe some
formal model. However, most are so imprecise and ambiguous when they
attempt to do so that the result is only slightly better than not
trying at all. Only when one understands the standard in the context
of typical practice can one begin to fathom what the standard is
trying to say and why they say it the way they do.

A typical example of this is the grammars for most programming
languages, they are expressed in a BNF variant. However, most of the
BNF descriptions omit important details and gloss over certain
inconsistencies. Moreover, most BNF descriptions are totally
unsuitable for formal translation (e.g. by an LL or LR parser generator,
or even an Earley or CYK one). Yet, when considered in context of
typical implementations, the errors in the formal specification can
easily be overlooked and resolved to produce what the standard authors
intended.

For example, a few years ago I wrote a Verilog parser by
transliterating the grammar in the standard (IEEE 1364-1995) into the
notation used by my parser generator, Yacc++. It took a couple of
days to do so. The resulting parser essentially worked on the first
try. However, the reason it did so, is that I had a lot of outside
knowledge when I was making the transliteration. There were places
where I knew that this part of the grammar was lexical for lexing and
this part was for parsing and cut the grammar correctly into two
parts, despite there being nothing in the standard which described
that dichotomy. There were other parts, like the embedded
preprocessor, that I knew were borrowed from C, so that I could use
the resolutions of issues the way a C compiler would do so, to guide
resolving the ambiguities in the Verilog standard. There were other
parts, I could tell the standard was simply written wrong, that is
where the grammar did not specify the correct BNF, because the BNF
they had written did not specify something which would make
sense--for example, the endif on an if-then-else statement was bound
to the else, so that one would write an if-then without an endif, but
an if-then-else with the endif--and I knew that was not the correct
syntax.

The point of the above paragraph is that the very knowledgable people
writing the Verilog standard made mistakes in their formal
specification, and in fact these mistakes made it past several
layers of review and had been formally ratified. Given the correct
outside knowledge these mistakes are relatively trivial to ignore and
correct. However, from a completely formal perspective, the standard
is simply incorrect.

More importantly, chip designers use knowledge of the standard to
guide the way they write chip descriptions. Thus, at some level these
formal errors creep into chip designs. Fortunately, the chip
designers also look past the inconsistencies and "program" (design) in
a language which looks very like standard Verilog, but actually says
what they mean.

Moreover, not only is the grammar in the standard a place where
Verilog is not properly formalized. There are several parts of the
type model which suffer similar problems. Again, we have what the
standard specifies, and what is a useful (and expected) description of
the target model. There are very useful and informal models of how
the Verilog language work and what types are involved. There is also
an attempt in the standard to convey some of these aspects formally.
Occasionally, the formal description is right and illuminating; at
other times it is opaque, and at still other times it is inconsistent,
saying contradictory things in different parts of the standard.

Some of the more egregious errors were corrected in the 2001 revision.
However, at the same time, new features were added which were (as far
as I can tell) just as inconsistently specified. And, this is not the
case of incompetence or unprofessionalism. The Verilog standard is
not alone in these problems--it's just my most recent experience, and
thus the one I am most familiar with at the moment.

I would expect the Python, Scheme, Haskell, and ML standards to be
more precise and less inconsistent, because of their communities.
However, I would be surprised if they were entirely unambiguous and
error-free. In general, I think most non-trivial attempts at
formalization flounder on our inability to deal with truly formal
systems.

Therefore, I do not want to exclude from type systems things which are
informal and unsound. They are simply artifacts of human creation.

-Chris

Jun 25 '06 #422
Joachim Durchholz <jo@durchholz.org> writes:

[...]

| (Yes, I'm being silly. But the point is very serious. Even with less
| silly examples, whether a language or subset is "safe" entirely
| depends on what you define to be "safe", and these definitions tend to
| vary vastly across language communities.)

Yes, I know. I hoped we would escape sillyland for the sake of
productive conversation.

-- Gaby
Jun 25 '06 #423
ro******@ps.uni-sb.de writes:

| think that it is too relevant for the discussion at hand. Moreover,
| Harper talks about a relative concept of "C-safety".

Then, I believe you missed the entire point.

First point: "safety" is a *per-language* property. Each language
comes with its own notion of safety. ML is ML-safe; C is C-safe;
etc. I'm not being facetious; I think this is the core of the
confusion.

Safety is an internal consistency check on the formal definition of
a language. In a sense it is not interesting that a language is
safe, precisely because if it weren't, we'd change the language to
make sure it is! I regard safety as a tool for the language
designer, rather than a criterion with which we can compare
languages.
[...]

| Or are you trying to suggest that we should indeed consider C safe for
| the purpose of this discussion?

I'm essentially suggesting "silly arguments" (as noted by someone
in another message) be left out for the sake of productive
conversation. Apparently, I did not communicate that point well enough.

-- Gaby
Jun 25 '06 #424
Scott David Daniels wrote:
ro******@ps.uni-sb.de wrote:
Huh? There is a huge, fundamental difference: namely whether a type
system is sound or not. A soundness proof is obligatory for any serious
type theory, and failure to establish it simply is a bug in the theory.


So you claim Java and Objective C are "simply bugs in the theory."


Java's type system was unsound for much of its life. I think that supports
the point that it's inadequate to simply "wish and hope" for soundness,
and that a proof should be insisted on.

--
David Hopwood <da******************@blueyonder.co.uk>
Jun 25 '06 #425
Gabriel Dos Reis wrote:
I would suggest you give more thoughts to the claims made in
http://www.seas.upenn.edu/~sweirich/.../msg00298.html


I'm not sure I understand this. Looking at "Example 2", where C is
claimed to be "C-safe", he makes two points that I disagree with.

First, he says Unix misimplements the operations semantics of C. It
seems more like Unix simply changes the transition system such that
programs which access unmapped memory are terminal.

Second, he says that he hasn't seen a formal definition of C which
provides precise operational semantics. However, the language standard
itself specified numerous "undefined" results from operations such as
accessing unallocated memory, or even pointing pointers to unallocated
memory. So unless he wants to modify the C standard to indicate what
should happen in such situations, I'm not sure why he thinks it's "safe"
by his definition. There is no one "q" that a "p" goes to based on the
language. He's confusing language with particular implementations. He
seems to be saying "C is safe, oh, except for all the parts of the
language that are undefined and therefore unsafe. But if you just
clearly defined every part of C, it would be safe." It would also seem
to be the case that one could not say that a program "p" is either
terminal or not, as that would rely on input, right? gets() is "safe" as
long as you don't read more than the buffer you allocated.

What's the difference between "safe" and "well-defined semantics"?
(Ignoring for the moment things like two threads modifying the same
memory at the same time and other such hard-to-define situations?)

--
Darren New / San Diego, CA, USA (PST)
Native Americans used every part
of the buffalo, including the wings.
Jun 26 '06 #426
Chris F Clark <cf*@shell01.TheWorld.com> wrote:
These informal systems, which may not prove what they claim to prove
are my concept of a "type system".
Okay, that works. I'm not sure where it gets us, though, except for
gaining you the right to use "type system" in a completely uninteresting
sense to describe your mental processes. You still need to be careful,
since most statements from type theory about type systems tend to
implicitly assume the property of being sound.

I'm not exactly convinced that it's possible to separate that set of
reasoning people do about programs into types and not-types, though. If
you want to stretch the definition like this, then I'm afraid that you
end up with the entire realm of human thought counting as a type system.
I seem to recall that there are epistemologists who would be happy with
that conclusion (I don't recall who, and I'm not actually interested
enough to seek them out), but it doesn't lead to a very helpful sort of
characterization.
It is to the people who are reasoning informally about
the program we wish to communicate that there is a type error. That
is, we want someone who is dealing with the program informally to
realize that there is an error, and that this error is somehow related
to some input not being kept within proper bounds (inside the domain
set) and that as a result the program will compute an unexpected and
incorrect result.
I've lost the context for this part of the thread. We want this for
what purpose and in what situation?

I don't think we necessarily want this as the goal of a dynamic type
system, if that's what you mean. There is always a large (infinite? I
think so...) list of potential type systems under which the error would
not have been a type error. How does the dynamic type system know which
informally defined, potentially unsound, mental type system to check
against? If it doesn't know, then it can't possibly know that something
is a type error. If it does know, then the mental type system must be
formally defined (if nothing else, operationally by the type checker.)
I also stress the informality, because beyond a certain nearly trivial
level of complexity, people are not capable of dealing with truly
formal systems.
I think (hope, anyway) that you overstate the case here. We can deal
with formal systems. We simply make errors. When those errors are
found, they are corrected. Nevertheless, I don't suspect that a large
part of the established base of theorems of modern mathematics are
false, simply because they are sometimes rather involved reasoning in a
complex system.

Specifications are another thing, because they are often developed under
time pressure and then extremely difficult to change. I agree that we
should write better specs, but I think the main challenges are political
and pragmatic, rather than fundamental to human understanding.
Therefore, I do not want to exlcude from type systems, things wich are
informal and unsound. They are simply artifacts of human creation.


Certainly, though, when it is realized that a type system is unsound, it
should be fixed as soon as possible. Otherwise, the type system doesn't
do much good. It's also true, I suppose, that as soon as a person
realizes that their mental thought process of types is unsound, they
would want to fix it. The difference, though, is that there is then no
formal definition to fix.

--
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
Jun 26 '06 #427
Gabriel Dos Reis <gd*@integrable-solutions.net> writes:
ro******@ps.uni-sb.de writes:

| think that it is too relevant for the discussion at hand. Moreover,
| Harper talks about a relative concept of "C-safety".

Then, I believe you missed the entire point.

First point: "safety" is a *per-language* property. Each language
comes with its own notion of safety. ML is ML-safe; C is C-safe;
etc. I'm not being facetious; I think this is the core of the
confusion.

Safety is an internal consistency check on the formal definition of
a language. In a sense it is not interesting that a language is
safe, precisely because if it weren't, we'd change the language to
make sure it is! I regard safety as a tool for the language
designer, rather than a criterion with which we can compare
languages.


I agree with Bob Harper about safety being language-specific and all
that. But, with all due respect, I think his characterization of C is
not accurate. In particular, the semantics of C (as specified by the
standard) is *not* just a mapping from memories to memories. Instead,
there are lots of situations that are quite clearly marked in C's
definition as "undefined" (or whatever the technical term may be). In
other words, C's specified transition system (to the extent that we
can actually call it "specified") has lots of places where programs
can "get stuck", i.e., where there is no transition to take. Most
actual implementations of C (including "Unix") are actually quite
generous by filling in some of the transitions, so that programs that
are officially "legal C" will run anyway.

Example:

The following program (IIRC) is not legal C, even though I expect it
to run without problem on almost any machine/OS, printing 0 to stdout:

#include <stdio.h>

int a[] = { 0, 1, 2 };

int main (void)
{
int *p, *q;
p = a;
q = p-1; /** illegal **/
printf("%d\n", q[1]);
return 0;
}

Nevertheless, the line marked with the comment is illegal because it
creates a pointer that is not pointing into a memory object. Still, a
C compiler is not required to reject this program. It is allowed,
though, to make the program behave in unexpected ways. In particular,
you can't really blame the compiler if the program does not print 0,
or if it even crashes.

AFAIC, C is C-unsafe by Bob's reasoning.

---

Of course, C can be made safe quite easily:

Define a state "undefined" that is considered "safe" and add a
transition to "undefined" wherever necessary.
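The fix above can be pictured as a toy transition system (the state names and dictionary encoding are purely illustrative):

```python
# An explicit sink state that absorbs every otherwise-stuck program
# state; by fiat, reaching it counts as "safe".
UNDEFINED = "undefined"

def step(state, transitions):
    # Where the original semantics had no transition for this state,
    # take the added transition to UNDEFINED instead of getting stuck.
    return transitions.get(state, UNDEFINED)
```

With every formerly stuck state now stepping to UNDEFINED, the system is total, and safety in Harper's sense holds trivially.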

Kind regards,
Matthias
Jun 26 '06 #428
On Sun, 25 Jun 2006 20:11:22 +0200, Anton van Straaten
<an***@appsolutions.com> wrote:
ro******@ps.uni-sb.de wrote:
In this context, the term "latently-typed language" refers to the
language that a programmer experiences, not to the subset of that
language which is all that we're typically able to formally define.

That language is not a subset, if at all, it's the other way round,
but
I'd say they are rather incomparable. That is, they are different
languages.


The "subset" characterization is not important for what I'm saying. The
fact that they are different languages is what's important. If you
agree about that, then you can at least understand which language I'm
referring to when I say "latently-typed language".

Besides, many dynamically-typed languages have no formal models, in
which case the untyped formal model I've referred to is just a
speculative construct. The language I'm referring to with
"latently-typed language" is the language that programmers are familiar
with, and work with.
That is starting to get a bit too mystical for my tastes.

I have to agree.
\sarcasm One step further, and somebody starts calling C a "latently
memory-safe language", because a real programmer "knows" that his code
is in a safe subset... And where he is wrong, dynamic memory page
protection checks will guide him.


That's a pretty apt comparison, and it probably explains how it is that
the software we all use, which relies so heavily on C, works as well as
it does.

But the comparison critiques the practice of operating without static
guarantees, it's not a critique of the terminology.

Anton


Actually, I have never developed a C/C++ program
without a bounds checker in the last 15 years.
It checks all memory references and on program shutdown
checks for memory leaks. What is it about you guys that makes you blind
to these facts? Allocation problems haven't really bugged me at all
since forever. Now, debugging fluky templates on the other hand...
But then debugging Lisp macros isn't much better.

--
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
Jun 26 '06 #429
Chris Smith wrote:
What makes static type systems interesting is not the fact that these
logical processes of reasoning exist; it is the fact that they are
formalized with definite axioms and rules of inference, performed
entirely on the program before execution, and designed to be entirely
evaluatable in finite time bounds according to some procedure or set of
rules, so that a type system may be checked and the same conclusion
reached regardless of who (or what) is doing the checking. All that,
and they still reach interesting conclusions about programs.
There's only one issue mentioned there that needs to be negotiated
w.r.t. latent types: the requirement that they are "performed entirely
on the program before execution". More below.
If
informal reasoning about types doesn't meet these criteria (and it
doesn't), then it simply doesn't make sense to call it a type system
(I'm using the static terminology here). It may make sense to discuss
some of the ways that programmer reasoning resembles types, if indeed
there are resemblances beyond just that they use the same basic rules of
logic. It may even make sense to try to identify a subset of programmer
reasoning that even more resembles... or perhaps even is... a type
system. It still doesn't make sense to call programmer reasoning in
general a type system.
I'm not trying to call programmer reasoning in general a type system.
I'd certainly agree that a programmer muddling his way through the
development of a program, perhaps using a debugger to find all his
problems empirically, may not be reasoning about anything that's
sufficiently close to types to call them that. But "latent" means what
it implies: someone who is more aware of types can do better at
developing the latent types.

As a starting point, let's assume we're dealing with a disciplined
programmer who (following the instructions found in books such as the
one at htdp.org) reasons about types, writes them in his comments, and
perhaps includes assertions (or contracts in the sense of "Contracts for
Higher Order Functions[1]), to help check his types at runtime (and I'm
talking about real type-checking, not just tag checking).

When such a programmer writes a type signature as a comment associated
with a function definition, in many cases he's making a claim that's
provable in principle about the inputs and outputs of that function.

For example, if the type in question is "int -> int", then the provable
claim is that given an int, the function cannot return anything other
than an int. Voila, assuming we can actually prove our claim, then
we've satisfied the requirement in Pierce's definition of type system,
i.e. "proving the absence of certain program behaviors by classifying
phrases according to the kinds of values they compute."

The fact that the proof in question may not be automatically verified is
no more relevant than the fact that so many proofs in mathematics have
not been automatically verified. Besides, if the proof turns out to be
wrong, we at least have an automated mechanism for finding witnesses to
the error: runtime tag checks will generate an error. Such detection is
not guaranteed, but again, "proof" does not imply "automation".
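As a concrete sketch of the "int -> int" case just described (Python used for illustration; the function name is hypothetical, and the assertion is a tag check in the post's sense, not a proof):

```python
def increment(x):
    # Latent type, written as a comment per the discipline described
    # above: int -> int
    # The runtime tag check below does not prove the claim; it merely
    # produces a witness whenever a caller violates it.
    assert isinstance(x, int), "latent type violation: expected int"
    return x + 1
```

Given an int, this function cannot return anything but an int, which is the provable-in-principle claim the comment-signature makes.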

What I've just described walks like a type and talks like a type, or at
least it appears to do so according to Pierce's definition. We can
classify many terms in a given dynamically-typed program on this basis
(although some languages may be better at supporting this than others).

So, on what grounds do you reject this as not being an example of a
type, or at the very least, something which has clear and obvious
connections to formal types, as opposed to simply being arbitrary
programmer reasoning?

Is it the lack of a formalized type system, perhaps? I assume you
recognize that it's not difficult to define such a system.

Regarding errors, we haven't proved that the function in question can
never be called with something other than an int, but we haven't claimed
to prove that, so there's no problem there. I've already described how
errors in our proofs can be detected.

Another possible objection is that I'm talking about local cases, and
not making any claims about extending it to an entire program (other
than to observe that it can be done). But let's take this a step at a
time: considered in isolation, if we can assign types to terms at the
local level in a program, do you agree that these are really types, in
the type theory sense?

If you were to agree, then we could move on to looking at what it means
to have many local situations in which we have fairly well-defined
types, which may all be defined within a common type system.

As an initial next step, we could simply rely on the good old universal
type everywhere else in the program - it ought to be possible to make
the program well-typed in that case, it would just mean that the
provable properties would be nowhere near as pervasive as with a
traditional statically-typed program. But the point is we would now
have put our types into a formal context.
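The universal-type suggestion above can be pictured using Python's optional annotations as a stand-in for a formal system, with typing.Any playing the universal type (the function names are hypothetical):

```python
from typing import Any

def double(x: int) -> int:
    # A locally well-typed fragment: int -> int, checkable in isolation.
    return x * 2

def process(data: Any) -> Any:
    # The rest of the program lives at the universal type; nothing is
    # proved here, but double's local property still holds wherever
    # it is applied to an int.
    return double(len(str(data)))
```

The program as a whole is well typed, but the provable properties are confined to the annotated fragment, as the paragraph above anticipates.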
In the same post here, you simultaneously suppose that there's something
inherently informal about the type reasoning in dynamic languages; and
then talk about the type system "in the static sense" of a dynamic
language. How is this possibly consistent?
The point about inherent informality is just that if you fully formalize
a static type system for a dynamic language, such that entire programs
can be statically typed, you're likely to end up with a static type
system with the same disadvantages that the dynamic language was trying
to avoid.

However, that doesn't preclude type system being defined which can be
used to reason locally about dynamic programs. Some people actually do
this, albeit informally.
So we actually have quite a bit of evidence about the presence of static
types in dynamically-typed programs.

No. What we have is quite a bit of evidence about properties remaining
true in dynamically typed programs, which could have been verified by
static typing.


Using my example above, at least some of those properties would have
been verified by manual static typing. So those properties, at least,
seem to be types, at least w.r.t. the local fragment they're proved within.
We're currently discussing something that so far has only been captured
fairly informally. If we restrict ourselves to only what we can say
about it formally, then the conversation was over before it began.

I agree with that statement. However, I think the conversation
regarding static typing is also over when you decide that the answer is
to weaken static typing until the word applies to informal reasoning.
If the goal isn't to reach a formal understanding, then the word "static
type" shouldn't be used


Well, I'm trying to use the term "latent type", as a way to avoid the
ambiguity of "type" alone, to distinguish it from "static type", and to
get away from the obvious problems with "dynamic type".

But I'm much less interested in the term itself than in the issues
surrounding dealing with "real" types in dynamically-checked programs -
if someone had a better term, I'd be all in favor of it.
and when that is the goal, it still shouldn't
be applied to processes that aren't formalized until they manage to become
formalized.
This comes back to the phrase I highlighted at the beginning of this
message: "performed entirely on the program before execution".
Formalizing local reasoning about types is pretty trivial, in many
cases. It's just that there are other cases where that's less
straightforward.

So when well-typed program fragments are considered as part of a larger
untyped program, you're suggesting that this so radically changes the
picture, that we can no longer distinguish the types we identified as
being anything beyond programmer reasoning in general? All the hard
work we did figuring out the types, writing them down, writing
assertions for them, and testing the program to look for violations
evaporates into the epistemological black hole that exists outside the
boundaries of formal type theory?
Hopefully, this isn't perceived as too picky. I've already conceded
that we can use "type" in a way that's incompatible with all existing
research literature.
Not *all* research literature. There's literature on various notions of
dynamic type.
I would, however, like to retain some word with
actually has that meaning. Otherwise, we are just going to be
linguistically prevented from discussing it.


For now, you're probably stuck with "static type". But I think it was
never likely to be the case that type theory would succeed in absconding
with the only technically valid use of the English word "type".
Although not for want of trying!

Besides, I think it's worth considering why Bertrand Russell used the
term "types" in the first place. His work had deliberate connections to
the real world. Type theory in computer science unavoidably has similar
connections. You could switch to calling it Zoltar Theory, and you'd
still have to deal with the fact that a zoltar would have numerous
connections to the English word "type". In CS, we don't have the luxury
of using the classic mathematician's excuse when confronted with
inconvenient philosophical issues, of claiming that type theory is just
a formal symbolic game, with no meaning outside the formalism.

Anton

[1]
http://people.cs.uchicago.edu/~robby...techreport.pdf
Jun 26 '06 #430
John Thingstad wrote:
On Sun, 25 Jun 2006 20:11:22 +0200, Anton van Straaten
<an***@appsolutions.com> wrote:
ro******@ps.uni-sb.de wrote: ....
\sarcasm One step further, and somebody starts calling C a "latently
memory-safe language", because a real programmer "knows" that his code
is in a safe subset... And where he is wrong, dynamic memory page
protection checks will guide him.

That's a pretty apt comparison, and it probably explains how it is
that the software we all use, which relies so heavily on C, works as
well as it does.

But the comparison critiques the practice of operating without static
guarantees, it's not a critique of the terminology.

Anton

Actually I have never developed a C/C++ program
without a bounds checker in the last 15 years.
It checks all memory references and on program shutdown
checks for memory leaks. What is it about you guys that makes you blind
to these facts.


You misunderstand -- for the purposes of the above comparison, a bounds
checker serves essentially the same purpose as "dynamic memory page
protection checks". The point is that it happens dynamically, i.e. at
runtime, and that there's a lack of static guarantees about memory
safety in C or C++. That's why, as I said, the comparison to latent vs.
static typing is an apt one.

Anton
Jun 26 '06 #431
Joachim Durchholz wrote:
Anton van Straaten schrieb:
Marshall wrote:
Can you be more explicit about what "latent types" means?

Sorry, that was a huge omission. (What I get for posting at 3:30am.)

The short answer is that I'm most directly referring to "the types in
the programmer's head".

Ah, finally that terminology starts to make sense to me. I have been
wondering whether there's any useful difference between "latent" and
"run-time" typing. (I tend to avoid the term "dynamic typing" because
it's overloaded with too many vague ideas.)


Right, that's the reason for using a different term. Runtime and
dynamic types are both usually associated with the idea of tagged
values, and don't address the types of program phrases.
there are usually many possible static type schemes that can be
assigned to a given program.

This seems to apply to latent types as well.


That's what I meant, yes.
Actually the set of latent types seems to be the set of possible static
type schemes.
Um, well, a superset of these - static type schemes tend to be slightly
less expressive than what the programmer has in his head. (Most type schemes
cannot really express things like "the range of this index variable is
such-and-so", and the boundary to general assertions about the code is
quite blurry anyway.)
Yes, although this raises the type theory objection of how you know
something is a type if you haven't formalized it. For the purposes of
comparison to static types, I'm inclined to be conservative and stick to
things that have close correlates in traditional static types.
There's a close connection between latent types in the sense I've
described, and the "tagged values" present at runtime. However, as
type theorists will tell you, the tags used to tag values at runtime,
as e.g. a number or a string or a FooBar object, are not the same
thing as the sort of types which statically-typed languages have.

Would that be a profound difference, or is it just that annotating a
value with a full type expression would cause just too much runtime
overhead?


It's a profound difference. The issue is that it's not just the values
that need to be annotated with types, it's also other program terms. In
addition, during a single run of a program, all it can ever normally do
is record the types seen on the path followed during that run, which
doesn't get you to static types of terms. To figure out the static
types, you really need to do static analysis.

Of course, static types give an approximation of the actual types of
values that flow at runtime, and if you come at it from the other
direction, recording the types of values flowing through terms on
multiple runs, you can get an approximation to the static type. Some
systems use that kind of thing for method lookup optimization.
In your terminology:
So, where do tagged values fit into this? Tags help to check types at
runtime, but that doesn't mean that there's a 1:1 correspondence
between tags and types.

Would it be possible to establish such a correspondence, would it be
common consensus that such a system should be called "tags" anymore, or
are there other, even more profound differences?


There's a non 1:1 correspondence which I gave an example of in the
quoted message. For 1:1 correspondence, you'd need to associate types
with terms, which would need some static analysis. It might result in
an interesting program browsing and debugging tool...

Anton
Jun 26 '06 #432
"Rob Thorpe" <ro***********@antenova.com> writes:
I think statements like this are confusing, because there are
different interpretations of what a "value" is.
But I mean the value as the semantics of the program itself sees it.
Which mostly means the datum in memory.


I don't agree with that. Generally, a language specifies a virtual
machine and doesn't need to concern itself with things like "memory"
at all. Although languages like C try to; look at all the undefined
behavior you get when you make assumptions about memory layout etc.

Memory representation is just an artifact of a particular
implementation of the language for a particular architecture.

-k
--
If I haven't seen further, it is by standing in the footprints of giants
Jun 26 '06 #433
Anton van Straaten <an***@appsolutions.com> wrote:
I'm not trying to call programmer reasoning in general a type system.
I'd certainly agree that a programmer muddling his way through the
development of a program, perhaps using a debugger to find all his
problems empirically, may not be reasoning about anything that's
sufficiently close to types to call them that.
Let me be clear, then. I am far less agreeable than you that this
couldn't be called a type system. One of my goals here is to try to
understand, if we are going to work out what can and can't be called a
type system in human thought, how we can possible draw a boundary
between kinds of logic and say that one of them is a type system while
the other is not. I haven't seen much of a convincing way to do it.
The only things that I can see rejecting out of hand are the empirical
things; confidence gained through testing or observation of the program
in a debugger. Everything else seems to qualify, and that's the
problem.

So let's go ahead and agree about everything up until...

[...]
As an initial next step, we could simply rely on the good old universal
type everywhere else in the program - it ought to be possible to make
the program well-typed in that case, it would just mean that the
provable properties would be nowhere near as pervasive as with a
traditional statically-typed program.
It would, in fact, mean that interesting provable properties would only
be provable if data can't flow outside of those individual small bits
where the programmer reasoned with types. Of course that's not true, or
the programmer wouldn't have written the rest of the program in the
first place. If we define the universal type as a variant type
(basically, tagged or something like it, so that the type can be tested
at runtime) then certain statements become provable, of the form "either
X or some error condition is raised because a value is inappropriate".
That is indeed a very useful property; but wouldn't it be easier to
simply prove it by appealing to the language semantics that say that "if
not X, the operation raises an exception," or to the existence of
assertions within the function that verify X, or some other such thing
(which must exist, or the statement wouldn't be true)?
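The weaker provable property being discussed can be illustrated with a small sketch (hypothetical code, not from the thread; a runtime tag check stands in for the variant-type dispatch):

```python
# Sketch of the property "either X, or an error is raised because a
# value is inappropriate": each operation either yields its result,
# or fails when the runtime tag test finds an inappropriate value.

def add(a, b):
    # Runtime tag check standing in for variant-type dispatch.
    if not isinstance(a, (int, float)) or not isinstance(b, (int, float)):
        raise TypeError("add: expected numbers")
    return a + b

print(add(2, 3))  # 5
```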
But the point is we would now
have put our types into a formal context.
Yes, but somewhat vacuously so...
The point about inherent informality is just that if you fully formalize
a static type system for a dynamic language, such that entire programs
can be statically typed, you're likely to end up with a static type
system with the same disadvantages that the dynamic language was trying
to avoid.
Okay, so you've forced the admission that you have a static type system
that isn't checked and doesn't prove anything interesting. If you
extended the static type system to cover the whole program, then you'd
have a statically typed language that lacks an implementation of the
type checker.

I don't see what we lose in generality by saying that the language lacks
a static type system, since it isn't checked, and doesn't prove anything
anyway. The remaining statement is that the programmer may use static
types to reason about the code. But, once again, the problem here is
that I don't see any meaning to that. It would basically mean the same
thing as that the programmer may use logic to reason about code. That
isn't particularly surprising to me. I suppose this is why I'm
searching for what is meant by saying that programmers reason about
types, and am having that conversation in several places throughout this
thread... because otherwise, it doesn't make sense to point out that
programmers reason with types.
Well, I'm trying to use the term "latent type", as a way to avoid the
ambiguity of "type" alone, to distinguish it from "static type", and to
get away from the obvious problems with "dynamic type".
I replied to your saying something to Marshall about the "type systems
(in the static sense) of dynamically-typed languages." That's what my
comments are addressing. If you somehow meant something other than the
static sense, then I suppose this was all a big misunderstanding.
But I'm much less interested in the term itself than in the issues
surrounding dealing with "real" types in dynamically-checked programs -
if someone had a better term, I'd be all in favor of it.
I will only repeat again that in static type systems, the "type" becomes
a degenerate concept when it is removed from the type system. It is
only the type system that makes something a type. Therefore, if you
want "real" types, then my first intuition is that you should avoid
looking in the static types of programming languages. At best, they
happen to coincide frequently with "real" types, but that's only a
pragmatic issue if it means anything at all.
So when well-typed program fragments are considered as part of a larger
untyped program, you're suggesting that this so radically changes the
picture, that we can no longer distinguish the types we identified as
being anything beyond programmer reasoning in general?
Is this so awfully surprising, when the type system itself is just
programmer reasoning? When there ceases to be anything that's provable
about the program, there also ceases to be anything meaningful about the
type system. You still have the reasoning, and that's all you ever had.
You haven't really lost anything, except a name for it. Since I doubt
the name had much meaning at all (as explained above), I don't count
that as a great loss. If you do, though, then this simply demonstrates
that you didn't really mean "type systems (in the static sense)".
Hopefully, this isn't perceived as too picky. I've already conceded
that we can use "type" in a way that's incompatible with all existing
research literature.


Not *all* research literature. There's literature on various notions of
dynamic type.


I haven't seen it, but I'll take your word that some of it exists. I
wouldn't tend to read research literature on dynamic types, so that
could explain the discrepancy.
I would, however, like to retain some word which
actually has that meaning. Otherwise, we are just going to be
linguistically prevented from discussing it.


For now, you're probably stuck with "static type".


That's fine. I replied to your attempting to apply "static type" in a
sense where it doesn't make sense.
In CS, we don't have the luxury
of using the classic mathematician's excuse when confronted with
inconvenient philosophical issues, of claiming that type theory is just
a formal symbolic game, with no meaning outside the formalism.


As it turns out, my whole purpose here is to address the idea that type
theory for programming languages is limited to some subset of problems
which people happen to think of as type problems. If type theorists
forty years ago had limited themselves to considering what people
thought of as types back then, we'd be in a considerably worse position
today with regard to the expressive power of types in commonly used
programming languages. Hence, your implication that insisting on a
formal rather than intuitive definition is opposed to the real-world
usefulness of the field is actually rather ironic. The danger lies in
assuming that unsolved problems are unsolvable, and therefore defining
them out of the discussion until the very idea that someone would solve
that problem with these techniques starts to seem strange.

--
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
Jun 26 '06 #434
Chris Smith <cd*****@twu.net> writes:
I've since abandoned any attempt to be picky about use of the word
"type". That was a mistake on my part. I still think it's legitimate
to object to statements of the form "statically typed languages X, but
dynamically typed languages Y", in which it is implied that Y is
distinct from X. When I used the word "type" above, I was adopting the
working definition of a type from the dynamic sense.
Umm...what *is* the definition of a "dynamic type"? In this thread
it's been argued from tags and memory representation to (almost?)
arbitrary predicates.
That is, I'm considering whether statically typed languages may be
considered to also have dynamic types, and it's pretty clear to me
that they do.


Well, if you equate tags with dynamic types:

data Universal = Str String | Int Integer | Real Double
| List [Universal] ...

A pattern match failure can then be seen as a dynamic type error.

Another question is how to treat a system (and I think C++'s RTTI
qualifies, and probably related languages like Java) where static type
information is available at runtime. Does this fit with the term
"dynamic typing" with "typing" in the same sense as "static typing"?

-k
--
If I haven't seen further, it is by standing in the footprints of giants
Jun 26 '06 #435
Chris Smith <cd*****@twu.net> writes:
Joachim Durchholz <jo@durchholz.org> wrote:
Assume a language that
a) defines that a program is "type-correct" iff HM inference establishes
that there are no type errors
b) compiles a type-incorrect program anyway, with an established
rigorous semantics for such programs (e.g. by throwing exceptions as
appropriate).
So the compiler now attempts to prove theorems about the program, but
once it has done so it uses the results merely to optimize its runtime
behavior and then throws the results away.


Hmm... here's my compiler front end (for Haskell or other languages
that ..er.. define 'undefined'):

while(type error) {
replace an incorrectly typed function with 'undefined'
}

Wouldn't the resulting program still be statically typed?

In practice I prefer to (and do) define troublesome functions as
'undefined' manually during development.
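A rough Python analogue of this stubbing trick (illustrative only) would be: the stub satisfies every caller syntactically and only fails if the path is actually exercised.

```python
# Python sketch of replacing troublesome definitions with 'undefined':
# the stub never returns normally, so it "fits" any call site, and
# only fails if that code path is actually reached at run time.

def undefined(*args, **kwargs):
    raise NotImplementedError("undefined")

def report():
    # Not yet written; stubbed out during development.
    return undefined()

def main(use_report):
    if use_report:
        return report()
    return "ok"

print(main(False))  # ok -- the stub is never reached on this path
```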

-k
--
If I haven't seen further, it is by standing in the footprints of giants
Jun 26 '06 #436
Anton van Straaten <an***@appsolutions.com> writes:
But a program as seen by the programmer has types: the programmer
performs (static) type inference when reasoning about the program, and
debugs those inferences when debugging the program, finally ending up
with a program which has a perfectly good type scheme.


I'd be tempted to go further, and say that the programmer performs
(informally, incompletely, and probably incorrectly) a proof of
correctness, and that type correctness is just a part of general
correctness.

-k
--
If I haven't seen further, it is by standing in the footprints of giants
Jun 26 '06 #437
Gabriel Dos Reis wrote:
ro******@ps.uni-sb.de writes:

| think that it is too relevant for the discussion at hand. Moreover,
| Harper talks about a relative concept of "C-safety".

Then, I believe you missed the entire point.


I think the part of my reply you snipped addressed it well enough.

Anyway, I can't believe that we actually need to argue about the fact
that - for any *useful* and *practical* notion of safety - C is *not* a
safe language. I refrain from continuing the discussion along this line,
because *that* is *really* silly.

- Andreas
Jun 26 '06 #438
Matthias Blume wrote:
I agree with Bob Harper about safety being language-specific and all
that. But, with all due respect, I think his characterization of C is
not accurate. [...] AFAIC, C is C-unsafe by Bob's reasoning.
Agreed.
Of course, C can be made safe quite easily:

Define a state "undefined" that is considered "safe" and add a
transition to "undefined" wherever necessary.


I wouldn't say that was "quite easy" at all.

C99 4 #2:
# If a "shall" or "shall not" requirement that appears outside of a constraint
# is violated, the behavior is undefined. Undefined behavior is otherwise
# indicated in this International Standard by the words "undefined behavior"
# *or by the omission of any explicit definition of behavior*. [...]

In other words, to fix C to be a safe language (compatible with Standard C89
or C99), you first have to resolve all the ambiguities in the standard where
the behaviour is *implicitly* undefined. There are a lot of them.

--
David Hopwood <da******************@blueyonder.co.uk>
Jun 26 '06 #439
David Hopwood <da******************@blueyonder.co.uk> writes:
Matthias Blume wrote:
I agree with Bob Harper about safety being language-specific and all
that. But, with all due respect, I think his characterization of C is
not accurate.

[...]
AFAIC, C is C-unsafe by Bob's reasoning.


Agreed.
Of course, C can be made safe quite easily:

Define a state "undefined" that is considered "safe" and add a
transition to "undefined" wherever necessary.


I wouldn't say that was "quite easy" at all.

C99 4 #2:
# If a "shall" or "shall not" requirement that appears outside of a constraint
# is violated, the behavior is undefined. Undefined behavior is otherwise
# indicated in this International Standard by the words "undefined behavior"
# *or by the omission of any explicit definition of behavior*. [...]

In other words, to fix C to be a safe language (compatible with Standard C89
or C99), you first have to resolve all the ambiguities in the standard where
the behaviour is *implicitly* undefined. There are a lot of them.


Yes, if you want to make the transition system completely explict, it
won't be easy. I was thinking of a catch-all rule that says:
transition to "undefined" unless specified otherwise. (Note that I am
not actually advocating this approach to making a language "safe".
For practical purposes, C is unsafe. (And so is C++.))
Jun 26 '06 #440
David Hopwood <da******************@blueyonder.co.uk> wrote:
Marshall wrote:
David Hopwood wrote:
A type system that required an annotation on all subprograms that
do not provably terminate, OTOH, would not impact expressiveness
at all, and would be very useful.
Interesting. I have always imagined doing this by allowing an
annotation on all subprograms that *do* provably terminate.

Maybe the paper "Linear types and non-size-increasing polynomial time
computation" by Martin Hofmann is interesting in this respect.
From the abstract:

We propose a linear type system with recursion operators for inductive
datatypes which ensures that all definable functions are polynomial
time computable. The system improves upon previous such systems in
that recursive definitions can be arbitrarily nested; in particular,
no predicativity or modality restrictions are made.

It does not only ensure termination, but termination in polynomial time,
so you can use those types to carry information about this as well.
If the annotation marks not-provably-terminating subprograms, then it
calls attention to those subprograms. This is what we want, since it is
less safe/correct to use a nonterminating subprogram than a terminating
one (in some contexts).


That would be certainly nice, but it may be almost impossible to do in
practice. It's already hard enough to guarantee termination with the
extra information present in the type annotation. If this information
is not present, then the language has to be probably restricted so
severely to ensure termination that it is more or less useless.

- Dirk
Jun 26 '06 #441

Marshall wrote:

I stand corrected: if one is using C and writing self-modifying
code, then one *can* zip one's pants.


Static proofs notwithstanding, I'd prefer a dynamic check just prior to
this operation.

I want my code to be the only self-modifying thing around here.

Jun 26 '06 #442

David Hopwood wrote:
Joe Marshall wrote:

I do this quite often. Sometimes I'll develop `in the debugger'. I'll
change some piece of code and run the program until it traps. Then,
without exiting the debugger, I'll fix the immediate problem and
restart the program at the point it trapped. This technique takes a
bit of practice, but if you are working on something that's complex and
has a lot of state, it saves a lot of time because you don't have to
reconstruct the state every time you make a change.

The problem with this is that from that point on, what you're running
is neither the old nor the new program, since its state may have been
influenced by the bug before you corrected it.


Yes. The hope is that it is closer to the new program than to the old.
I find it far simpler to just restart the program after correcting
anything. If this is too difficult, I change the design to make it
less difficult.
In the most recent case where I was doing this, I was debugging
transaction rollback that involved multiple database extents. It was
somewhat painful to set up a clean database to the point where I wanted
to try the rollback, and I wanted a pristine database for each trial so
I could examine the raw bits left by a rollback. It was pretty easy to
deal with simple errors in the debugger, so I chose to do that instead.


Wow, interesting.

(I say the following only to point out differing strategies of
development, not to say one or the other is right or bad or
whatever.)

I occasionally find myself doing the above, and when I do,
I take it as a sign that I've screwed up. I find that process
excruciating, and I do everything I can to avoid it. Over the
years, I've gotten more and more adept at trying to turn
as many bugs as possible into type errors, so I don't have
to run the debugger.

Now, there might be an argument to be made that if I had
been using a dynamic language, the process wouldn't be
so bad, and I wouldn't dislike it so much. But maybe:

As a strawman: suppose there are two broad categories
of mental strategies for thinking about bugs, and people
fall naturally into one way or the other, the way there
are morning people and night people. One group is
drawn to the static analysis, one group hates it.
One group hates working in the debugger, one group
is able to use that tool very effectively and elegantly.

Anyway, it's a thought.


I don't buy this -- or at least, I am not in either group.

A good debugger is invaluable regardless of your attitude to type
systems. Recently I was debugging two programs written to do similar
things in the same statically typed language (C), but where a debugger
could not be used for one of the programs. It took maybe 5 times
longer to find and fix each bug without the debugger, and I found it
a much more frustrating experience.

--
David Hopwood <da******************@blueyonder.co.uk>


Jun 26 '06 #443
David Hopwood wrote:
Pascal Costanza wrote:
Chris Smith wrote:
While this effort to salvage the term "type error" in dynamic
languages is interesting, I fear it will fail. Either we'll all have
to admit that "type" in the dynamic sense is a psychological concept
with no precise technical definition (as was at least hinted by
Anton's post earlier, whether intentionally or not) or someone is
going to have to propose a technical meaning that makes sense,
independently of what is meant by "type" in a static system.

What about this: You get a type error when the program attempts to
invoke an operation on values that are not appropriate for this operation.

Examples: adding numbers to strings; determining the string-length of a
number; applying a function on the wrong number of parameters; applying
a non-function; accessing an array with out-of-bound indexes; etc.


This makes essentially all run-time errors (including assertion failures,
etc.) "type errors". It is neither consistent with, nor any improvement
on, the existing vaguely defined usage.


Nope. This is again a matter of getting the levels right.

Consider division by zero: appropriate arguments for division are
numbers, including the zero. The dynamic type check will typically not
check whether the second argument is zero, but will count on the fact
that the processor will raise an exception one level deeper.

This is maybe better understandable in user-level code. Consider the
following class definition:

class Person {
String name;
int age;

void buyPorn() {
if (this.age < 18) throw new AgeRestrictionException();
...
}
}

The message p.buyPorn() is a perfectly valid message send and will pass
both static and dynamic type tests (given p is of type Person in the
static case). However, you will still get a runtime error.
Pascal

--
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
Jun 26 '06 #444
Pascal Costanza <pc@p-cos.net> wrote:
Consider division by zero: appropriate arguments for division are
numbers, including the zero. The dynamic type check will typically not
check whether the second argument is zero, but will count on the fact
that the processor will raise an exception one level deeper.
Of course zero is not appropriate as a second argument to the division
operator! I can't possibly see how you could claim that it is. The
only reasonable statement worth making is that there doesn't exist a
type system in widespread use that is capable of checking this.
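One conceivable encoding of such a check (a sketch of the general idea, not a claim about any existing type system) is to make "nonzero" a distinct wrapper type whose constructor enforces the invariant, so division never sees a zero:

```python
# Wrapper type enforcing the "divisor is nonzero" invariant at
# construction time; division then only accepts the wrapper.

class NonZero:
    def __init__(self, n):
        if n == 0:
            raise ValueError("NonZero: zero rejected at construction")
        self.n = n

def divide(a, d):
    # d is expected to be a NonZero; the invariant was established
    # when d was constructed.
    return a / d.n

print(divide(10, NonZero(2)))  # 5.0
```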
This is maybe better understandable in user-level code. Consider the
following class definition:

class Person {
String name;
int age;

void buyPorn() {
if (this.age < 18) throw new AgeRestrictionException();
...
}
}

The message p.buyPorn() is a perfectly valid message send and will pass
both static and dynamic type tests (given p is of type Person in the
static case).


It appears you've written the code above to assume that the type system
can't certify that age >= 18... otherwise, the if statement would not
make sense. It also looks like Java, in which the type system is indeed
not powerful enough to do that check statically. However, it sounds as
if you're claiming that it wouldn't be possible for the type system to
do this? If so, that's not correct. If such a thing were checked at
compile-time by a static type check, then failing to actually provide
that guarantee would be a type error, and the compiler would tell you
so.

Whether you'd choose to call an equivalent runtime check a "dynamic type
check" is a different matter, and highlights the fact that again there's
something fundamentally different about the definition of a "type" in
the static and dynamic sense.

--
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
Jun 26 '06 #445
Chris Smith wrote:
Pascal Costanza <pc@p-cos.net> wrote:
Consider division by zero: appropriate arguments for division are
numbers, including the zero. The dynamic type check will typically not
check whether the second argument is zero, but will count on the fact
that the processor will raise an exception one level deeper.
Of course zero is not appropriate as a second argument to the division
operator! I can't possibly see how you could claim that it is. The
only reasonable statement worth making is that there doesn't exist a
type system in widespread use that is capable of checking this.


....and this is typically not even checked at the stage where type tags
are checked in dynamically-typed languages. Hence, it is not a type
error. (A type error is always what you define to be a type error within
a given type system, right?)

Note, this example was in response to David's reply that my definition
turns every runtime error into a type error. That's not the case, and
that's all I want to say.
This is maybe better understandable in user-level code. Consider the
following class definition:

class Person {
String name;
int age;

void buyPorn() {
if (this.age < 18) throw new AgeRestrictionException();
...
}
}

The message p.buyPorn() is a perfectly valid message send and will pass
both static and dynamic type tests (given p is of type Person in the
static case).


It appears you've written the code above to assume that the type system
can't certify that age >= 18... otherwise, the if statement would not
make sense.


Right.
It also looks like Java, in which the type system is indeed
not powerful enough to do that check statically.
Right.
However, it sounds as
if you're claiming that it wouldn't be possible for the type system to
do this?
No. I only need an example where a certain error is not a type error in
_some_ language. I don't need to make a universal claim here.
If so, that's not correct. If such a thing were checked at
compile-time by a static type check, then failing to actually provide
that guarantee would be a type error, and the compiler would tell you
so.


That's fine, but not relevant in this specific subplot.
Pascal

--
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
Jun 26 '06 #446
On Sun, 25 Jun 2006 14:28:22 -0600, Chris Smith <cd*****@twu.net>
wrote:
George Neuner <gneuner2/@comcast.net> wrote:
>Undecidability can always be avoided by adding annotations, but of
>course that would be gross overkill in the case of index type widening.
Just what sort of type annotation will convince a compiler that a
narrowing conversion which could produce an illegal value will, in
fact, never produce an illegal value?


The annotation doesn't make the narrowing conversion safe; it prevents
the narrowing conversion from happening.


That was my point ... you get a program that won't compile.

If, for example, I need to
subtract two numbers and all I know is that they are both between 2 and
40, then I only know that the result is between -38 and 38, which may
contain invalid array indices. However, if the numbers were part of a
pair, and I knew that the type of the pair was <pair of numbers, 2
through 40, where the first number is greater than the second>, then I
would know that the difference is between 0 and 38, and that may be a
valid index.

Of course, the restrictions on code that would allow me to retain
knowledge of the form [pair of numbers, 2 through 40, a > b] may be
prohibitive. That is an open question in type theory, as a matter of
fact; whether types of this level of power may be inferred by any
tractable procedure so that safe code like this may be written without
giving the development process undue difficulty by requiring ten
times as many type annotations as actual code. There are attempts that
have been made, and they don't look too awfully bad.
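The ordered-pair type described above can be sketched at runtime with a checked wrapper (the `OrderedPair` class here is a hypothetical illustration, not from the thread): the constructor enforces the invariant, so the difference is always a safe index. A static checker with refinement or dependent types would discharge the same check at compile time instead.

```python
# Runtime sketch of the "pair of numbers, 2 through 40, a >= b" type.
# The constructor enforces the invariant, so index() is guaranteed to
# lie in 0..38 -- a valid index into a 39-element table.
class OrderedPair:
    def __init__(self, a, b):
        if not (2 <= b <= a <= 40):
            raise ValueError("invariant violated: need 2 <= b <= a <= 40")
        self.a, self.b = a, b

    def index(self):
        # By the constructor's invariant, this difference is in 0..38.
        return self.a - self.b

table = list(range(39))
p = OrderedPair(40, 2)
print(table[p.index()])  # 38 -- always a valid access
```

Here the check happens once, at construction; every use of the value afterwards needs no further test, which is exactly the guarantee a static type of this power would give for free.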


I worked in signal and image processing for many years and those are
places where narrowing conversions are used all the time - in the form
of floating point calculations reduced to integers to produce sample or
pixel values, or values and indices for lookup tables. Often the same
calculation needs to be done for several different purposes.

I can know that my conversion of floating point to integer is going to
produce a value within a certain range ... but, in general, the
compiler can't determine what that range will be. All it knows is
that a floating point value is being truncated and the stupid
programmer wants to stick the result into some type too narrow to
represent the range of possible values.
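The float-to-pixel narrowing described here is often handled with an explicit saturating (clamping) conversion, which makes the narrowing safe by construction even when the compiler cannot infer the range. A minimal sketch (the helper `to_pixel` and the 8-bit sample range are illustrative assumptions):

```python
# Saturating conversion: truncate to int, then clamp into the 8-bit
# sample range 0..255. The programmer's range knowledge is encoded in
# the function rather than inferred by the compiler.
def to_pixel(x: float) -> int:
    return min(255, max(0, int(x)))

print(to_pixel(183.7))  # 183
print(to_pixel(-4.2))   # 0
print(to_pixel(999.0))  # 255
```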

Like I said to Ben, I haven't seen any _practical_ static type system
that can deal with things like this. Writing a generic function is
impossible under the circumstances, and writing a separate function
for each narrow type is ridiculous and a maintenance nightmare even if
they can share the bulk of the code.

George
--
for email reply remove "/" from address
Jun 26 '06 #447
Pascal Costanza <pc@p-cos.net> wrote:
Chris Smith wrote:
Of course zero is not appropriate as a second argument to the division
operator! I can't possibly see how you could claim that it is. The
only reasonable statement worth making is that there doesn't exist a
type system in widespread use that is capable of checking this.
...and this is typically not even checked at the stage where type tags
are checked in dynamically-typed languages. Hence, it is not a type
error. (A type error is always what you define to be a type error within
a given type system, right?)


Sure, it's not a type error for that language.
Note, this example was in response to David's reply that my definition
turns every runtime error into a type error. That's not the case, and
that's all I want to say.


But your definition does do that. Your definition of a type error was
when a program attempts to invoke an operation on values that are not
appropriate for this operation. Clearly, in this example, the program
is invoking an operation (division) on values that are not appropriate
(zero for the second argument). Hence, if your definition really is a
definition, then this must qualify.
However, it sounds as
if you're claiming that it wouldn't be possible for the type system to
do this?


No. I only need an example where a certain error is not a type error in
_some_ language. I don't need to make a universal claim here.


Definitions are understood to be statements of the form "if and only
if". They assert that /everything/ that fits the definition is
describable by the word, and /everything/ that doesn't is not
describable by the word. If that's not what you meant, then we are
still in search of a definition.

If you want instead to make a statement of the sort you've implied
above... namely that a type error is *any* error that's raised by a type
system, then that's fine. It certainly brings us closer to static
types. Now, though, the task is to define a type system without making
a circular reference to types. You've already rejected the statement
that all runtime errors are type errors, because you specifically reject
the division by zero case as a type error for most languages. What is
that distinction?
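The division-by-zero case under discussion can in fact be encoded by a type, as a hedged sketch: a hypothetical `NonZero` wrapper (not from any language in the thread) whose constructor rejects zero, so the division itself can never fail. A static checker tracking this wrapper type would turn the one runtime check into a compile-time guarantee.

```python
# A divisor type that cannot hold zero: the single check lives in the
# constructor, so safe_div never sees a zero divisor.
class NonZero:
    def __init__(self, n: int):
        if n == 0:
            raise ValueError("NonZero cannot hold 0")
        self.n = n

def safe_div(a: int, b: NonZero) -> float:
    return a / b.n  # cannot raise ZeroDivisionError

print(safe_div(10, NonZero(4)))  # 2.5
```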

--
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
Jun 26 '06 #448
George Neuner <gneuner2/@comcast.net> wrote:
On Sun, 25 Jun 2006 14:28:22 -0600, Chris Smith <cd*****@twu.net>
wrote:
George Neuner <gneuner2/@comcast.net> wrote:
>Undecidability can always be avoided by adding annotations, but of
>course that would be gross overkill in the case of index type widening.

Just what sort of type annotation will convince a compiler that a
narrowing conversion which could produce an illegal value will, in
fact, never produce an illegal value?
The annotation doesn't make the narrowing conversion safe; it prevents
the narrowing conversion from happening.


That was my point ... you get a program that won't compile.


That's not actually the case here. If we're talking only about type
conversions and not value conversions (see below), then narrowing
conversions are only necessary because you've forgotten some important
bit of type information. By adding annotations, you can preserve that
piece of information and thus avoid the conversion and get a program
that runs fine.
I worked in signal and image processing for many years and those are
places where narrowing conversions are used all the time - in the form
of floating point calculations reduced to integers to produce sample or
pixel values, or values and indices for lookup tables. Often the same
calculation needs to be done for several different purposes.
These are value conversions, not type conversions. Basically, when you
convert a floating point number to an integer, you are not simply
promising the compiler something about the type; you are actually asking
the compiler to convert one value to another -- i.e., see to it that
whatever this is now, it /becomes/ an integer by the time we're done.
This also results in a type conversion, but you've just converted the
value to the appropriate form. There is a narrowing value conversion,
but the type conversion is perfectly safe.
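The value-conversion/type-conversion distinction drawn here can be illustrated in miniature: `int(x)` computes a new value that genuinely is an integer (a value conversion), after which the change of type is trivially safe.

```python
# int(x) is a *value* conversion: it produces a new, different value
# (truncating toward zero) that already has the target type's form.
x = 3.9
i = int(x)
print(i, type(i).__name__)  # 3 int
```

What a type checker rightly rejects is the other case: reinterpreting `x` as an integer without converting the value at all, as an unchecked narrowing cast would.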
I can know that my conversion of floating point to integer is going to
produce a value within a certain range ... but, in general, the
compiler can't determine what that range will be.
If you mean "my compiler can't", then this is probably the case. If you
mean "no possible compiler could", then I'm not sure this is really very
likely at all.
Like I said to Ben, I haven't seen any _practical_ static type system
that can deal with things like this.


I agree. Such a thing doesn't currently exist for general-purpose
programming languages, although it does exist in limited languages for
some specific domains.

--
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
Jun 26 '06 #449
Andrew McDonagh schrieb:
Joachim Durchholz wrote:
Chris Smith schrieb:
Joachim Durchholz <jo@durchholz.org> wrote:
Sorry, I have to insist that it's not me who's stretching terms here.

All textbook definitions that I have seen define a type as the
set/operations/axioms triple I mentioned above.
No mention of immutability, at least not in the definitions.

The immutability comes from the fact (perhaps implicit in these
textbooks, or perhaps they are not really texts on formal type
theory) that types are assigned to expressions,
That doesn't *define* what's a type or what isn't!

If it's impossible to assign types to all expressions of a program in
a language, that does mean that there's no useful type theory for the
program, but it most definitely does not mean that there are no types
in the program.
I can still sensibly talk about sets of values, sets of allowable
operations over each value, and about relationships between inputs and
outputs of these operations.

So programs have types, even if they don't have a static type system.
Q.E.D.


Of course not. Otherwise programs using dynamically typed systems
wouldn't exist.


I don't understand.
Do you mean dynamic typing (aka runtime types)?
I haven't read all of this thread, I wonder, is the problem to do with
Class being mistaken for Type? (which is usually the issue)


No, not at all. I have seen quite a lot beyond OO ;-)

Regards,
Jo
Jun 26 '06 #450
