What is Expressiveness in a Computer Language

In March, I posted an essay, “What is Expressiveness in a Computer
Language”, archived at:
http://xahlee.org/perl-python/what_i...esiveness.html

I was informed then that there is an academic paper written on this
subject.

On the Expressive Power of Programming Languages, by Matthias
Felleisen, 1990.
http://www.ccs.neu.edu/home/cobbe/pl...ive-slides.pdf

Has anyone read this paper? And, would anyone be interested in giving a
summary?

thanks.

Xah
xa*@xahlee.org
http://xahlee.org/

Jun 9 '06
Rob Thorpe wrote:
The compiler
relies entirely on the types of the variables to know how to correctly
operate on the values. The values themselves have no type information
associated with them.
int x = (int) (20.5 / 3);

What machine code operations does the "/" there invoke? Integer
division, or floating point division? How did the variables involved in
the expression affect that?
Casting in C takes values of one type to values of another type.

No it doesn't. Casting reinterprets a value of one type as a value of
another type.


No it doesn't.
int x = (int) 20.5;
There's no point at which bits from the floating point representation
appear in the variable x.

int * x = (int *) 0;
There's nothing that indicates all the bits of "x" are zero, and indeed
in some hardware configurations they aren't.

--
Darren New / San Diego, CA, USA (PST)
My Bath Fu is strong, as I have
studied under the Showerin' Monks.
Jun 20 '06 #101
Andreas Rossberg wrote:
Rob Thorpe wrote:
Andreas Rossberg wrote:
Rob Thorpe wrote:

>>"A language is latently typed if a value has a property - called its
>>type - attached to it, and given its type it can only represent values
>>defined by a certain class."
>
>"it [= a value] [...] can [...] represent values"?

???

I just quoted, in condensed form, what you said above: namely, that a
value represents values - which I find a strange and circular definition.
Yes, but the point is, as the other poster mentioned: values defined by
a class.


I'm sorry, but I still think that the definition makes little sense.
Obviously, a value simply *is* a value, it does not "represent" one, or
several ones, regardless how you qualify that statement.


You've clipped a lot of context. I'll put some back in, I said:-
Yes, but the point is, as the other poster mentioned: values defined by
a class.
For example, in lisp:
"xyz" is a string, #(1 2 3) is an array, '(1 2 3) is a list, 45 is a
fixed-point number.
Each item that can be stored has a type; no item can be stored without
one. The types can be tested. Most dynamically typed languages are like
that.
Compare this to an untyped language where types cannot generally be
tested.
I think this should make it clear. If I have a "xyz" in lisp I know it
is a string.
If I have "xyz" in an untyped language like assembler, it may be
anything: two pointers packed in binary, an integer, a bitfield. There
is no data at compile time or runtime to tell what it is; the
programmer has to remember.

(I'd point out that this isn't true of all assemblers; there are some
typed assemblers.)
Well I haven't programmed in any statically typed language where values
have types themselves.

They all have - the whole purpose of a type system is to ensure that any
expression of type T always evaluates to a value of type T.


But it only guarantees this because the variables themselves have a
type,


No, variables are insignificant in this context. You can consider a
language without variables at all (such languages exist, and they can
even be Turing-complete) and still have evaluation, values, and a
non-trivial type system.


Hmm. You're right, ML is nowhere in my definition since it has no
variables.
But the value itself has no type


You mean that the type of the value is not represented at runtime? True,
but that's simply because the type system is static. It's not the same
as saying it has no type.


Well, is it even represented at compile time?
The compiler doesn't know in general what values will exist at runtime;
it knows only what types variables have. Sometimes it has only partial
knowledge and sometimes the programmer deliberately overrides it. From
that knowledge you could say it knows what types values will have.
In a C program, for example, I can take
the value from some variable and recast it in any way I feel, and the
language cannot correct any errors I make because there is no
information in the variable to indicate what type it is.


Nothing in the C spec precludes an implementation from doing just that.


True, that would be an interesting implementation.
The problem with C rather is that its semantics is totally
underspecified. In any case, C is about the worst example to use when
discussing type systems. For starters, it is totally unsound - which is
what your example exploits.


Yes. Unfortunately it's often necessary to break static type systems.

Regarding C, the problem is: what should we discuss instead that would
be understood in all the newsgroups this thread spans?

Jun 20 '06 #102
On Mon, 19 Jun 2006 22:02:55 +0000 (UTC), Dimitri Maziuk
<di**@127.0.0.1> wrote:
Yet Another Dan sez:

... Requiring an array index to be an integer is considered a typing
problem because it can be checked based on only the variable itself,
whereas checking whether it's in bounds requires knowledge about the array.


You mean like
subtype MyArrayIndexType is INTEGER range 7 .. 11;
type MyArrayType is array (MyArrayIndexType) of MyElementType;


If the index computation involves wider types it can still produce
illegal index values. The runtime computation of an illegal index
value is not prevented by narrowing subtypes and cannot be statically
checked.

George
--
for email reply remove "/" from address
Jun 20 '06 #103

Chris Smith wrote:
Joe Marshall <ev********@gmail.com> wrote:

Agreed. That is why there is the qualifier `dynamic'. This indicates
that it is a completely different thing from static types.


If we agree about this, then there is no need to continue this
discussion. I'm not sure we do agree, though, because I doubt we'd be
right here in this conversation if we did.


I think we do agree.

The issue of `static vs. dynamic types' comes up about twice a year in
comp.lang.lisp. It generally gets pretty heated, but eventually people
come to understand what the other person is saying (or they get bored
and drop out of the conversation - I'm not sure which). Someone always
points out that the phrase `dynamic types' really has no meaning in the
world of static type analysis. (Conversely, the notion of a `static
type' that is available at runtime has no meaning in the dynamic
world.) Much confusion usually follows.

You'll get much farther in your arguments by explaining what you mean
in detail rather than attempting to force a unification of terminology.
You'll also get farther by remembering that many of the people here
have not had much experience with real static type systems. The static
typing of C++ or Java is so primitive that it is barely an example of
static typing at all, yet these are the most common examples of
statically typed languages people typically encounter.

Jun 20 '06 #104
Matthias Blume wrote:
Pascal Costanza <pc@p-cos.net> writes:
- In a dynamically typed language, you can run programs successfully
that are not acceptable by static type systems.
This statement is false.


The example I have given is more important than this statement.
For every program that can run successfully to completion there exists
a static type system which accepts that program. Moreover, there is
at least one static type system that accepts all such programs.

What you mean is that for static type systems that are restrictive
enough to be useful in practice there always exist programs which
(after type erasure in an untyped setting, i.e., by switching to a
different language) would run to completion, but which are rejected by
the static type system.


No, that's not what I mean.
Here is an example in Common Lisp:

; A class "person" with no superclasses and with the only field "name":
(defclass person ()
  (name))

; A test program:
(defun test ()
  (let ((p (make-instance 'person)))
    (eval (read))
    (slot-value p 'address)))

(slot-value p 'address) is an attempt to access the field 'address in
the object p. In many languages, the notation for this is p.address.

Although the class definition for person doesn't mention the field
address, the call to (eval (read)) allows the user to change the
definition of the class person and update its existing
instances. Therefore at runtime, the call to (slot-value p 'address)
has a chance to succeed.


I am quite comfortable with the thought that this sort of evil would
get rejected by a statically typed language. :-)


This sort of feature is clearly not meant for you. ;-P
Pascal

--
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
Jun 20 '06 #105
Darren New wrote:
Rob Thorpe wrote:
The compiler
relies entirely on the types of the variables to know how to correctly
operate on the values. The values themselves have no type information
associated with them.


int x = (int) (20.5 / 3);

What machine code operations does the "/" there invoke? Integer
division, or floating point division? How did the variables involved in
the expression affect that?


In that case it knew because it could see at compile time. In general
though it doesn't.
If I divide x / y it only knows which to use because of types declared
for x and y.
Casting in C takes values of one type to values of another type.

No it doesn't. Casting reinterprets a value of one type as a value of
another type.


No it doesn't.
int x = (int) 20.5;
There's no point at which bits from the floating point representation
appear in the variable x.

int * x = (int *) 0;
There's nothing that indicates all the bits of "x" are zero, and indeed
in some hardware configurations they aren't.


I suppose some are conversions and some reinterpretations. What I
should have said is that there are cases where a cast reinterprets.

Jun 20 '06 #106
Pascal Costanza wrote:
Matthias Blume wrote:
Pascal Costanza <pc@p-cos.net> writes:
(slot-value p 'address) is an attempt to access the field 'address in
the object p. In many languages, the notation for this is p.address.

Although the class definition for person doesn't mention the field
address, the call to (eval (read)) allows the user to change the
definition of the class person and update its existing
instances. Therefore at runtime, the call to (slot-value p 'address)
has a chance to succeed.


I am quite comfortable with the thought that this sort of evil would
get rejected by a statically typed language. :-)


This sort of feature is clearly not meant for you. ;-P


To be fair though that kind of thing would only really be used while
debugging a program.
It's no different from adding a new member to a class while in the
debugger.

There are other places where you might add a slot to an object at
runtime, but they would be done in tidier ways.

Jun 20 '06 #107
Rob Thorpe wrote:
Darren New wrote:
Rob Thorpe wrote:
The values themselves have no type information associated with them.
int x = (int) (20.5 / 3);

In that case it knew because it could see at compile time.


Well, yes. That's the point of static typing.
In general though it doesn't.
Sure it does. There are all kinds of formal rules about type promotions
and operator version selection.

Indeed, in *general* every value has a type associated with it (at
compile time). Some values have a type associated with them due to the
declaration of the variable supplying the value, but that's far from the
general case.

Note that in
main() { char x = 'x'; foo(x); }
the value passed to "foo" is not even the same type as the declaration
of "x", so it's far from the general case that variables even determine
the values they provide to the next part of the calculation.
If I divide x / y it only knows which to use because of types declared
for x and y.
Yes? So? All you're saying is that the value of the expression "x" is
based on the declared type for the variable "x" in scope at that point.
That doesn't mean values don't have types. It just means that *some*
values' types are determined by the type of the variable the value is
stored in. As soon as you do anything *else* with that value, such as
passing it to an operator, a function, or a cast, the value potentially
takes on a type different from that of the variable from which it came.
I suppose some are conversions and some reinterpretations. What I
should have said is that there are cases where a cast reinterprets.


There are cases where the cast is defined to return a value that has the
same bit pattern as its argument, to the extent the hardware can support
it. However, this is obviously limited to the values for which C
actually defines the bit patterns of its values, namely the scalar
integers.

Again, you're taking a special case and treating all the rest (which are
a majority) as aberrations. For the most part, casts do *not* return the
same bit pattern as the value. For the most part
union {T1 x; T2 y;};
can be used to do transformations that
T2 y; T1 x = (T1) y;
does not. Indeed, the only bit patterns that don't change are when
casting from signed to unsigned versions of the same underlying scalar
type and back. Everything else must change, except perhaps pointers,
depending on your architecture. (I've worked on machines where char* had
more bits than long*, for example.)

(Funny that comp.lang.c isn't on this thread. ;-)

--
Darren New / San Diego, CA, USA (PST)
My Bath Fu is strong, as I have
studied under the Showerin' Monks.
Jun 20 '06 #108
Chris Smith wrote:
Easy, any statically typed language is not latently typed.
I'm actually not sure I agree with this at all. I believe that
reference values in Java may be said to be latently typed. Practically
all class-based OO
languages are subject to similar consideration, as it turns out.


Quite probably true of GC-ed statically typed languages in general, at least up
to a point (and provided you are not using something like a tagless ML
implementation). I think Rob is assuming a rather too specific implementation
of statically typed languages.

I'm unsure whether to consider explicitly stored array lengths, which
are present in most statically typed languages, to be part of a "type"
in this sense or not.


If I understand your position correctly, wouldn't you be pretty much forced to
reject the idea of the length of a Java array being part of its type? If you
want to keep the word "type" bound to the idea of static analysis, then --
since Java doesn't perform any size-related static analysis -- the size of a
Java array cannot be part of its type.

That's assuming that you would want to keep the "type" connected to the actual
type analysis performed by the language in question. Perhaps you would prefer
to loosen that and consider a different (hypothetical) language (perhaps
producing identical bytecode) which does do compile time size analysis.

But then you get into an area where you cannot talk of the type of a value (or
variable) without relating it to the specific type system under discussion.
Personally, I would be quite happy to go there -- I dislike the idea that a
value has a specific inherent type.

It would be interesting to see what a language designed specifically to support
user-defined, pluggable, and perhaps composable, type systems would look like.
Presumably the syntax and "base" semantics would be very simple, clean, and
unrestricted (like Lisp, Smalltalk, or Forth -- not that I'm convinced that any
of those would be ideal for this), with a defined result for any possible
sequence of operations. The type-system(s) used for a particular run of the
interpreter (or compiler) would effectively reduce the space of possible
sequences. For instance, one could have a type system which /only/ forbade
dereferencing null, or another with the job of ensuring that mutability
restrictions were respected, or a third which implemented access control...

But then, I don't see a semantically critical distinction between such space
reduction being done at compile time vs. runtime. Doing it at compile time
could be seen as an optimisation of sorts (with downsides to do with early
binding etc). That's particularly clear if the static analysis is /almost/
able to prove that <some sequence> is legal (by its own rules) but has to make
certain assumptions in order to construct the proof. In such a case the
compiler might insert a few runtime checks to ensure that its assumptions were
valid, but do most of its checking statically.

There would /be/ a distinction between static and dynamic checks in such a
system, and it would be an important distinction, but not nearly as important
as the distinctions between the different type systems. Indeed I can imagine
categorising type systems by /whether/ (or to what extent) a tractable static
implementation exists.

-- chris

P.S. Apologies, Chris, btw, for dropping out of a conversation we were having on
this subject a little while ago -- I've now said everything that I /would/ have
said in reply to your last post if I'd got around to it in time...

Jun 20 '06 #109

Joe Marshall wrote:
{...}
The issue of `static vs. dynamic types' comes up about twice a year in
comp.lang.lisp. It generally gets pretty heated, but eventually people
come to understand what the other person is saying (or they get bored
and drop out of the conversation - I'm not sure which). {...}


I think that the thing about "Language Expressiveness" is just so
elusive because it is based on each programmer's way of thinking about
things, and also on the general types of problems that that programmer
is dealing with on a daily basis. There are folks out there like the
Paul Grahams of the world who do wonderfully complex things in Lisp,
eschew totally the facilities of CLOS, the Lisp object system, and get
the job done... just because they can hack and have a mental picture of
what all the type matches are "in their head".

I used to use Forth, where everything goes on a stack and it is up to
the programmer to remember what the heck the type of a thing stored
there was... maybe the address of a string, another word {read:
function}, or an integer... NO TYPING AT ALL. But can you be expressive
in Forth? You can, if you are a good Forth programmer.

NOW that being said, I think that the reason I like Haskell, a very
strongly typed language, is that because of its type system the
language is able to do things like lazy evaluation, and though it is
strongly typed, and has wonderful things like type classes, a person
can write wildly expressive code and NEVER write a thing like:

fromtoby :: forall b a.
            (Num a, Enum a) =>
            a -> a -> a -> (a -> b) -> [b]

The above was generated by the Haskell compiler for me... and it does
that all the time, without any fuss. I just wrote the function and it
did the rest for me. By the way, the function was a replacement for the
{ for / to / by } construct of a C-like language, and it was done in
one line... THAT TO ME IS EXPRESSIVE: when I can build whole new
features into my language in just a few lines... usually only one line.

I think that is why guys who are good to great in dynamic or, if it
floats your boat, type-free languages like Lisp and Scheme find their
languages so expressive: they have found the macros or whatever other
facility that gives them the power to extend their language to meet a
problem, or to fit maybe closer to an entire class of problems, giving
them the tool box. Haskellers, folks using Ruby, Python, ML, OCaml,
Unicon... even C or whatever... by building either modules or
libraries, and understanding that whole attitude of tool building, can
be equally as expressive "for their own way of doing things". Heck, I
don't use one size of hammer out in my workshop, and I sure as hell
don't on my box.

I find myself picking up Icon and its derivative Unicon to do a lot of
data-washing chores, as it allows functions as first-class objects...
any type can be stored in a list... tables, like a hash, with any type
as the data and the key... You can do things like

i := 0
every write(i +:= 1 || read())

which will prefix each line read from stdin with a line number...
pretty damn concise and expressive in my book. Now for other
problems... Haskell with its single-type lists, which of course can
have tuples, which themselves have tuples in them with a list as one
element of that tuple... etc. And you can build accessors for all of
that, for function application brought to bear on one element of one
tuple, allowing mappings of that function to all of that particular
element of... well, you get the idea.

It is all about how you see things and what your particular mindset
is. I would agree with someone who said early on in the discussion
that thing about "types warping your mind"; depending on the
individual it is either a good warp or a bad warp, but that is why
there is Ruby and Python, and Haskell and even Forth... for a given
problem, and a given programmer, any one of those, or even COBOL or
Fortran, might be the ideal tool, if nothing else based on that
person's familiarity or existing tool and code base.

enough out of me...
-- gene

Jun 20 '06 #110
Chris Uppal wrote:
Personally, I would be quite happy to go there -- I dislike the idea that a
value has a specific inherent type.


Interestingly, Ada defines a type as a collection of values. It works
quite well, when one consistantly applies the definition. For example,
it makes very clear the difference between a value of (type T) and a
value of (type T or one of its subtypes).

--
Darren New / San Diego, CA, USA (PST)
My Bath Fu is strong, as I have
studied under the Showerin' Monks.
Jun 20 '06 #111
"Rob Thorpe" <ro***********@antenova.com> writes:
I think we're discussing this at cross-purposes. In a language like C
or another statically typed language there is no information passed
with values indicating their type.
You seem to be confusing "does not have a type" with "no type
information is passed at runtime".
Have a look in a C compiler if you don't believe me.
Believe me, I have.
No it doesn't. Casting reinterprets a value of one type as a value of
another type.
There is a difference. If I cast an unsigned integer 2000000000 to a
signed integer in C on the machine I'm using then the result I will get
will not make any sense.


Which result are you getting? What does it mean to "make sense"?
Jun 20 '06 #112
Rob Thorpe wrote:
Pascal Costanza wrote:
Matthias Blume wrote:
Pascal Costanza <pc@p-cos.net> writes:
(slot-value p 'address) is an attempt to access the field 'address in
the object p. In many languages, the notation for this is p.address.

Although the class definition for person doesn't mention the field
address, the call to (eval (read)) allows the user to change the
definition of the class person and update its existing
instances. Therefore at runtime, the call to (slot-value p 'address)
has a chance to succeed.
I am quite comfortable with the thought that this sort of evil would
get rejected by a statically typed language. :-)

This sort of feature is clearly not meant for you. ;-P


To be fair though that kind of thing would only really be used while
debugging a program.
It's no different from adding a new member to a class while in the
debugger.

There are other places where you might add a slot to an object at
runtime, but they would be done in tidier ways.


Yes, but the question remains how a static type system can deal with
this kind of update.
Pascal

--
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
Jun 20 '06 #113
Chris Uppal <ch*********@metagnostic.REMOVE-THIS.org> wrote:
I'm unsure whether to consider explicitly stored array lengths, which
are present in most statically typed languages, to be part of a "type"
in this sense or not.
If I understand your position correctly, wouldn't you be pretty much forced to
reject the idea of the length of a Java array being part of its type?


I've since abandoned any attempt to be picky about use of the word
"type". That was a mistake on my part. I still think it's legitimate
to object to statements of the form "statically typed languages X, but
dynamically typed languages Y", in which it is implied that Y is
distinct from X. When I used the word "type" above, I was adopting the
working definition of a type from the dynamic sense. That is, I'm
considering whether statically typed languages may be considered to also
have dynamic types, and it's pretty clear to me that they do.
If you
want to keep the word "type" bound to the idea of static analysis, then --
since Java doesn't perform any size-related static analysis -- the size of a
Java array cannot be part of its type.
Yes, I agree. My terminology has been shifting constantly throughout
this thread as I learn more about how others are using terms. I realize
that probably confuses people, but my hope is that this is still
superior to stubbornly insisting that I'm right. :)
That's assuming that you would want to keep the "type" connected to the actual
type analysis performed by the language in question. Perhaps you would prefer
to loosen that and consider a different (hypothetical) language (perhaps
producing identical bytecode) which does do compile time size analysis.
In the static sense, I think it's absolutely critical that "type" is
defined in terms of the analysis done by the type system. Otherwise,
you miss the definition entirely. In the dynamic sense, I'm unsure; I
don't have any kind of deep understanding of what's meant by "type" in
this sense. Certainly it could be said that there are somewhat common
cross-language definitions of "type" that get used.
But then you get into an area where you cannot talk of the type of a value (or
variable) without relating it to the specific type system under discussion.
Which is entirely what I'd expect in a static type system.
Personally, I would be quite happy to go there -- I dislike the idea that a
value has a specific inherent type.
Good! :)
It would be interesting to see what a language designed specifically to support
user-defined, pluggable, and perhaps composable, type systems would look like.
Presumably the syntax and "base" semantics would be very simple, clean, and
unrestricted (like Lisp, Smalltalk, or Forth -- not that I'm convinced that any
of those would be ideal for this), with a defined result for any possible
sequence of operations. The type-system(s) used for a particular run of the
interpreter (or compiler) would effectively reduce the space of possible
sequences. For instance, one could have a type system which /only/ forbade
dereferencing null, or another with the job of ensuring that mutability
restrictions were respected, or a third which implemented access control...
You mean in terms of a practical programming language? If not, then
lambda calculus is used in precisely this way for the static sense of
types.
But then, I don't see a semantically critical distinction between such space
reduction being done at compile time vs. runtime. Doing it at compile time
could be seen as an optimisation of sorts (with downsides to do with early
binding etc). That's particularly clear if the static analysis is /almost/
able to prove that <some sequence> is legal (by its own rules) but has to make
certain assumptions in order to construct the proof. In such a case the
compiler might insert a few runtime checks to ensure that its assumptions were
valid, but do most of its checking statically.


I think Marshall got this one right. The two are accomplishing
different things. In one case (the dynamic case) I am safeguarding
against negative consequences of the program behaving in certain
nonsensical ways. In the other (the static case) I am proving theorems
about the impossibility of this nonsensical behavior ever happening.
You mention static typing above as an optimization of dynamic typing;
that is certainly one possible application of these theorems.

In some sense, though, it is interesting in its own right to know that
these theorems have been proven. Of course, there are important doubts
to be had whether these are the theorems we wanted to prove in the first
place, and whether the effort of proving them was worth the additional
confidence I have in my software systems.

I acknowledge those questions. I believe they are valid. I don't know
the answers. As an intuitive judgement call, I tend to think that
knowing the correctness of these things is of considerable benefit to
software development, because it means that I don't have as much to
think about at any one point in time. I can validly make more
assumptions about my code and KNOW that they are correct. I don't have
to trace as many things back to their original source in a different
module of code, or hunt down as much documentation. I also, as a
practical matter, get development tools that are more powerful.
(Whether it's possible to create the same for a dynamically typed
language is a potentially interesting discussion; but as a practical
matter, no matter what's possible, I still have better development tools
for Java than for JavaScript when I do my job.)

In the end, though, I'm just not very interested in them at the moment.
For me, as a practical matter, choices of programming language are
generally dictated by more practical considerations. I do basically all
my professional "work" development in Java, because we have a gigantic
existing software system written in Java and no time to rewrite it. On
the other hand, I do like proving theorems, which means I am interested
in type theory; if that type theory relates to programming, then that's
great! That's probably not the thing to say to ensure that my thoughts
are relevant to the software development "industry", but it's
nevertheless the truth.

--
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
Jun 20 '06 #114
Torben Ægidius Mogensen schrieb:
That's not really the difference between static and dynamic typing.
Static typing means that there exists a typing at compile time that
guarantees against run-time type violations. Dynamic typing means
that such violations are detected at run-time.
Agreed.
This is orthogonal to
strong versus weak typing, which is about whether such violations are
detected at all. The archetypal weakly typed language is machine code
-- you can happily load a floating point value from memory, add it to
a string pointer and jump to the resulting value.
I'd rather call machine code "untyped".
("Strong typing" and "weak typing" don't have a universally accepted
definition anyway, and I'm not sure that this terminology is helpful
anyway.)
Anyway, type inference for statically typed languages doesn't make them
any more dynamically typed. It just moves the burden of assigning the
types from the programmer to the compiler. And (for HM type systems)
the compiler doesn't "guess" at a type -- it finds the unique most
general type from which all other legal types (within the type system)
can be found by instantiation.
Hmm... I think this distinction doesn't cover all cases.

Assume a language that
a) defines that a program is "type-correct" iff HM inference establishes
that there are no type errors
b) compiles a type-incorrect program anyway, with an established,
rigorous semantics for such programs (e.g. by throwing exceptions as
appropriate).
The compiler might actually refuse to compile type-incorrect programs,
depending on compiler flags and/or declarations in the code.

Typed ("strongly typed") it is, but is it statically typed or
dynamically typed?
("Softly typed" doesn't capture it well enough - if it's declarations in
the code, then those parts of the code are statically typed.)
You miss some of the other benefits of static typing,
though, such as a richer type system -- soft typing often lacks
features like polymorphism (it will find a set of monomorphic
instances rather than the most general type) and type classes.


That's not a property of soft typing per se, it's a consequence of
tacking on type inference on a dynamically-typed language that wasn't
designed for allowing strong type guarantees.

Regards,
Jo
Jun 20 '06 #115
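Torben's definitions above (static typing means violations are ruled out before the program runs; dynamic typing means they are detected at run time; weak typing means they may go undetected) can be illustrated with a small sketch. This is hypothetical example code in Python, not from any of the posts:

```python
# Dynamic typing in the sense discussed above: the violation is not
# ruled out before the program runs, but it is reliably detected
# (rather than silently ignored, as in a "weakly typed" setting).

def add(a, b):
    return a + b

print(add(2, 3))            # fine: prints 5

try:
    add("two", 3)           # a run-time type violation...
except TypeError:
    print("violation detected at run time")
```

A static type checker would instead reject the second call before the program ever ran; machine code, at the other extreme, would just combine the bits.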
Matthias Blume schrieb:
Perhaps better: A language is statically typed if its definition
includes (or even better: is based on) a static type system, i.e., a
static semantics with typing judgments derivable by typing rules.
Usually typing judgments associate program phrases ("expressions") with
types given a typing environment.


This is defining a single term ("statically typed") using three
undefined terms ("typing judgements", "typing rules", "typing environment").

Regards,
Jo
Jun 20 '06 #116
Chris F Clark schrieb:
In that sense, a static type system is eliminating tags, because the
information is pre-computed and not explicitly stored as a part of the
computation. Now, you may not view the tag as being there, but in my
mind if there exists a way of performing the computation that requires
tags, the tag was there and that tag has been eliminated.
On a semantic level, the tag is always there - it's the type (and
definitely part of an axiomatic definition of the language).
Tag elimination is "just" an optimization.
To put it another way, I consider the tags to be axiomatic. Most
computations involve some decision logic that is driven by distinct
values that have previously been computed. The separation of the
values which drive the computation one way versus another is a tag.
That tag can potentially be eliminated by some apriori computation.


Um... just as precomputing constants, I'd say.
Are the constants that went into a precomputed constant eliminated?
On the implementation level, yes. On the semantic/axiomatic level, no.
Or, well, maybe - since that's just an optimization, the compiler may
have decided not to precompute the constant at all.

(Agreeing with the snipped parts.)

Regards,
Jo
Jun 20 '06 #117
Joachim Durchholz <jo@durchholz.org> writes:
Matthias Blume schrieb:
Perhaps better: A language is statically typed if its definition
includes (or even better: is based on) a static type system, i.e., a
static semantics with typing judgments derivable by typing rules.
Usually typing judgments associate program phrases ("expressions") with
types given a typing environment.


This is defining a single term ("statically typed") using three
undefined terms ("typing judgements", "typing rules", "typing
environment").


This was not meant to be a rigorous definition. Also, I'm not going
to repeat the textbook definitions for those three standard terms
here. Next thing you are going to ask me to define the meaning of the
word "is"...
Jun 20 '06 #118
Chris Smith wrote:
Chris Uppal <ch*********@metagnostic.REMOVE-THIS.org> wrote:
I'm unsure whether to consider explicitly stored array lengths, which
are present in most statically typed languages, to be part of a "type"
in this sense or not.


If I understand your position correctly, wouldn't you be pretty much forced to
reject the idea of the length of a Java array being part of its type ?


I've since abandoned any attempt to be picky about use of the word "type".


I think you should stick to your guns on that point. When people talk about
"types" being associated with values in a "latently typed" or "dynamically typed"
language, they really mean *tag*, not type.

It is remarkable how much of the fuzzy thinking that often occurs in the
discussion of type systems can be dispelled by insistence on this point (although
much of the benefit can be obtained just by using this terminology in your own
mind and translating what other people are saying to it). It's a good example of
the weak Sapir-Whorf hypothesis, I think.

--
David Hopwood <da******************@blueyonder.co.uk>
Jun 21 '06 #119
Pascal Costanza wrote:
Chris Smith wrote:
Knowing that it'll cause a lot of strenuous objection, I'll
nevertheless interject my plea not to abuse the word "type" with a
phrase like "dynamically typed". If anyone considers "untyped" to be
pejorative, as some people apparently do, then I'll note that another
common term is "type-free," which is marketing-approved but doesn't
carry the misleading connotations of "dynamically typed." We are
quickly losing any rational meaning whatsoever to the word "type," and
that's quite a shame.


The words "untyped" or "type-free" only make sense in a purely
statically typed setting. In a dynamically typed setting, they are
meaningless, in the sense that there are _of course_ types that the
runtime system respects.

Types can be represented at runtime via type tags. You could insist on
using the term "dynamically tagged languages", but this wouldn't change
a lot. Exactly _because_ it doesn't make sense in a statically typed
setting, the term "dynamically typed language" is good enough to
communicate what we are talking about - i.e. not (static) typing.


Oh, but it *does* make sense to talk about dynamic tagging in a statically
typed language.

That's part of what makes the term "dynamically typed" harmful: it implies
a dichotomy between "dynamically typed" and "statically typed" languages,
when in fact dynamic tagging and static typing are (mostly) independent
features.

--
David Hopwood <da******************@blueyonder.co.uk>
Jun 21 '06 #120

Quoth David Hopwood <da******************@blueyonder.co.uk>:
Pascal Costanza wrote:
Chris Smith wrote:

Types can be represented at runtime via type tags. You could insist on
using the term "dynamically tagged languages", but this wouldn't change
a lot. Exactly _because_ it doesn't make sense in a statically typed
setting, the term "dynamically typed language" is good enough to
communicate what we are talking about - i.e. not (static) typing.
Oh, but it *does* make sense to talk about dynamic tagging in a statically
typed language.


Though I'm *seriously* reluctant to encourage this thread...

A prime example of this is Perl, which has both static and dynamic
typing. Variables are statically typed scalar/array/hash, and then
scalars are dynamically typed string/int/unsigned/float/ref.
That's part of what makes the term "dynamically typed" harmful: it implies
a dichotomy between "dynamically typed" and "statically typed" languages,
when in fact dynamic tagging and static typing are (mostly) independent
features.


Nevertheless, I see no problem in calling both of these 'typing'. They
are both means to the same end: causing a bunch of bits to be
interpreted in a meaningful fashion. The only difference is whether the
distinction is made at compile- or run-time. The above para had no
ambiguities...

Ben

--
Every twenty-four hours about 34k children die from the effects of poverty.
Meanwhile, the latest estimate is that 2800 people died on 9/11, so it's like
that image, that ghastly, grey-billowing, double-barrelled fall, repeated
twelve times every day. Full of children. [Iain Banks]
be*******@tiscali.co.uk
Jun 21 '06 #121
Pascal Costanza wrote:
Rob Thorpe wrote:
Pascal Costanza wrote:
Matthias Blume wrote:
Pascal Costanza <pc@p-cos.net> writes:

> (slot-value p 'address) is an attempt to access the field 'address in
> the object p. In many languages, the notation for this is p.address.
>
> Although the class definition for person doesn't mention the field
> address, the call to (eval (read)) allows the user to change the
> definition of the class person and update its existing
> instances. Therefore at runtime, the call to (slot-value p 'adress)
> has a chance to succeed.

I am quite comfortable with the thought that this sort of evil would
get rejected by a statically typed language. :-)

This sort of feature is clearly not meant for you. ;-P


To be fair though that kind of thing would only really be used while
debugging a program.
It's no different than adding a new member to a class while in the
debugger.

There are other places where you might add a slot to an object at
runtime, but they would be done in tidier ways.


Yes, but the question remains how a static type system can deal with
this kind of updates.


It's not difficult in principle:

- for each class[*], define a function which converts an 'old' value of
that class to a 'new' value (the ability to do this is necessary anyway
to support some kinds of upgrade). A default conversion function may be
autogenerated if the class definition has changed only in minor ways.

- typecheck the new program and the conversion functions, using the old
type definitions for the argument of each conversion function, and the
new type definitions for its result.

- have the debugger apply the conversions to all values, and then resume
the program.

[*] or nearest equivalent in a non-OO language.

--
David Hopwood <da******************@blueyonder.co.uk>
Jun 21 '06 #122
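Hopwood's three-step upgrade scheme above (per-class conversion functions, typechecked against the old types on the argument side and the new types on the result side, then applied to all live values) can be sketched as follows. This is a minimal, hypothetical Python illustration with invented class names; the proposal itself is language-neutral:

```python
# Sketch of the upgrade scheme: a per-class conversion function maps
# 'old' values to 'new' values before the program resumes.

class OldPerson:                 # the class as originally deployed
    def __init__(self, name):
        self.name = name

class NewPerson:                 # the redefined class adds a field
    def __init__(self, name, address):
        self.name = name
        self.address = address

def convert_person(old):
    """Conversion function: takes a value of the old definition,
    returns a value of the new one. A default like this could be
    autogenerated when the change is minor."""
    return NewPerson(old.name, address="unknown")

# The debugger (or upgrade machinery) applies the conversion to all
# live values, then resumes the program.
live_values = [OldPerson("Ada"), OldPerson("Alan")]
live_values = [convert_person(p) for p in live_values]
print(live_values[0].address)    # → unknown
```

In a statically typed setting, the point is that `convert_person` itself is an ordinary function that can be typechecked, so the upgrade does not escape the type system.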
genea wrote:
[...] NOW that being said, I think
that the reason I like Haskell, a very strongly typed language, is that
because of its type system, the language is able to do things like
lazy evaluation, [...]


Lazy evaluation does not depend on, nor is it particularly helped by
static typing (assuming that's what you mean by "strongly typed" here).

An example of a non-statically-typed language that supports lazy evaluation
is Oz. (Lazy functions are explicitly declared in Oz, as opposed to Haskell's
implicit lazy evaluation, but that's not because of the difference in type
systems.)

--
David Hopwood <da******************@blueyonder.co.uk>
Jun 21 '06 #123
Marshall wrote:
Joe Marshall wrote:
They *do* have a related meaning. Consider this code fragment:
(car "a string")
[...]
Both `static typing' and `dynamic typing' (in the colloquial sense) are
strategies to detect this sort of error.

The thing is though, that putting it that way makes it seems as
if the two approaches are doing the same exact thing, but
just at different times: runtime vs. compile time. But they're
not the same thing. Passing the static check at compile
time is universally quantifying the absence of the class
of error; passing the dynamic check at runtime is existentially
quantifying the absence of the error. A further difference is
the fact that in the dynamically typed language, the error is
found during the evaluation of the expression; in a statically
typed language, errors are found without attempting to evaluate
the expression.

I find everything about the differences between static and
dynamic to be frustratingly complex and subtle.


Let me add another complex subtlety, then: the above description misses
an important point, which is that *automated* type checking is not the
whole story. I.e. that compile time/runtime distinction is a kind of
red herring.

In fact, automated type checking is orthogonal to the question of the
existence of types. It's perfectly possible to write fully typed
programs in a (good) dynamically-checked language.

In a statically-checked language, people tend to confuse automated
static checking with the existence of types, because they're thinking in
a strictly formal sense: they're restricting their world view to what
they see "within" the language.

Then they look at programs in a dynamically-checked language, and see
checks happening at runtime, and they assume that this means that the
program is "untyped".

It's certainly close enough to say that the *language* is untyped. One
could also say that a program, as seen by the language, is untyped.

But a program as seen by the programmer has types: the programmer
performs (static) type inference when reasoning about the program, and
debugs those inferences when debugging the program, finally ending up
with a program which has a perfectly good type scheme. It may be
messy compared to, say, an HM type scheme, and it's usually not proved to
be perfect, but that again is an orthogonal issue.

Mathematicians operated for thousands of years without automated
checking of proofs, so you can't argue that because a
dynamically-checked program hasn't had its type scheme proved correct,
that it somehow doesn't have types. That would be a bit like arguing
that we didn't have Math until automated theorem provers came along.

These observations affect the battle over terminology in various ways.
I'll enumerate a few.

1. "Untyped" is really quite a misleading term, unless you're talking
about something like the untyped lambda calculus. That, I will agree,
can reasonably be called untyped.

2. "Type-free" as suggested by Chris Smith is equally misleading. It's
only correct in a relative sense, in a narrow formal domain which
ignores the process of reasoning about types which is inevitably
performed by human programmers, in any language.

3. A really natural term to refer to types which programmers reason
about, even if they are not statically checked, is "latent types". It
captures the situation very well intuitively, and it has plenty of
precedent -- e.g. it's mentioned in the Scheme reports, R5RS and its
predecessors, going back at least a decade or so (haven't dug to check
when it first appeared).

4. Type theorists like to say that "universal" types can be used in a
statically-typed language to subsume "dynamic types". Those theorists
are right, the term "dynamic type", with its inextricable association
with runtime checks, definitely gets in the way here. It might be
enlightening to rephrase this: what's really happening is that universal
types allow you to embed a latently-typed program in a
statically-checked language. The latent types don't go anywhere,
they're still latent in the program with universal types. The program's
statically-checked type scheme doesn't capture the latent types.
Describing it in these terms clarifies what's actually happening.

5. Dynamic checks are only part of the mechanism used to verify latent
types. They shouldn't be focused on as being the primary equivalent to
static checks. The closest equivalent to the static checks is a
combination of human reasoning and testing, in which dynamic checks play
an important but ultimately not a fundamental part. You could debug a
program and get the type scheme correct without dynamic checks, it would
just be more difficult.

So, will y'all just switch from using "dynamically typed" to "latently
typed", and stop talking about any real programs in real programming
languages as being "untyped" or "type-free", unless you really are
talking about situations in which human reasoning doesn't come into
play? I think you'll find it'll help to reason more clearly about this
whole issue.

Thanks for your cooperation!!

Anton
Jun 21 '06 #124
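Anton's point 5 above - that dynamic checks are only part of how latent types are verified - can be made concrete with the `(car "a string")` example quoted at the top of his post. The following is a hypothetical Python rendering, not code from the thread:

```python
# The latent type of 'car' -- "takes a non-empty list" -- lives in the
# programmer's reasoning about the program; the dynamic check below is
# only part of the mechanism that verifies that reasoning.

def car(xs):
    if not isinstance(xs, list) or not xs:
        raise TypeError("car expects a non-empty list")
    return xs[0]

print(car([1, 2, 3]))       # → 1

try:
    car("a string")         # the latent type is violated...
except TypeError:
    print("...and the check catches it during evaluation")
```

Note the contrast with the static case: here the error surfaces only when the offending expression is actually evaluated, whereas a static checker rejects it without evaluating anything.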
David Hopwood wrote:
Pascal Costanza wrote:
Rob Thorpe wrote:
Pascal Costanza wrote:
Matthias Blume wrote:
> Pascal Costanza <pc@p-cos.net> writes:
>
>> (slot-value p 'address) is an attempt to access the field 'address in
>> the object p. In many languages, the notation for this is p.address.
>>
>> Although the class definition for person doesn't mention the field
>> address, the call to (eval (read)) allows the user to change the
>> definition of the class person and update its existing
>> instances. Therefore at runtime, the call to (slot-value p 'adress)
>> has a chance to succeed.
> I am quite comfortable with the thought that this sort of evil would
> get rejected by a statically typed language. :-)
This sort of feature is clearly not meant for you. ;-P
To be fair though that kind of thing would only really be used while
debugging a program.
It's no different than adding a new member to a class while in the
debugger.

There are other places where you might add a slot to an object at
runtime, but they would be done in tidier ways.

Yes, but the question remains how a static type system can deal with
this kind of updates.


It's not difficult in principle:

- for each class[*], define a function which converts an 'old' value of
that class to a 'new' value (the ability to do this is necessary anyway
to support some kinds of upgrade). A default conversion function may be
autogenerated if the class definition has changed only in minor ways.


Yep, this is more or less exactly how CLOS does it. (The conversion
function is called update-instance-for-redefined-class, and you can
provide your own methods on it.)
- typecheck the new program and the conversion functions, using the old
type definitions for the argument of each conversion function, and the
new type definitions for its result.
The problem here is: The program is already executing, so this typecheck
isn't performed at compile-time, in the strict sense of the word (i.e.,
before the program is deployed). It may still be a syntactic analysis,
but you don't get the kind of guarantees anymore that you typically
expect from a static type checker _before_ the program is started in the
first place.

(It's really important to understand that the idea is to use this for
deployed programs - albeit hopefully in a more structured fashion - and
not only for debugging. The example I have given is an extreme one that
you would probably not use as such in a "real-world" setting, but it
shows that there is a boundary beyond which static type systems cannot
be used in a meaningful way anymore, at least as far as I can tell.)
- have the debugger apply the conversions to all values, and then resume
the program.
In CLOS, this conversion is defined as part of the language proper, but
this is mostly because Common Lisp doesn't make a sharp distinction
between debugging capabilities and "regular" language features. (I think
it's a good thing that there is no strong barrier against having
debugging capabilities in a deployed program.)

[*] or nearest equivalent in a non-OO language.

Pascal

--
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
Jun 21 '06 #125
"Rob Thorpe" <ro***********@antenova.com> writes:
Andreas Rossberg wrote:
No, variables are insignificant in this context. You can consider a
language without variables at all (such languages exist, and they can
even be Turing-complete) and still have evaluation, values, and a
non-trivial type system.


Hmm. You're right, ML is nowhere in my definition since it has no
variables.


That's not true. ML has variables in the mathematical sense of
variables -- symbols that can be associated with different values at
different times. What it doesn't have is mutable variables (though it
can get the effect of those by having variables be immutable
references to mutable memory locations).

What Andreas was alluding to was presumably FP-style languages where
functions or relations are built by composing functions or relations
without ever naming values.

Torben
Jun 21 '06 #126
David Hopwood wrote:

Oh, but it *does* make sense to talk about dynamic tagging in a statically
typed language.


It even makes perfect sense to talk about dynamic typing in a statically
typed language - but keeping the terminology straight, this rather
refers to something like described in the well-known paper of the same
title (and its numerous follow-ups):

Martin Abadi, Luca Cardelli, Benjamin Pierce, Gordon Plotkin
Dynamic typing in a statically-typed language.
Proc. 16th Symposium on Principles of Programming Languages, 1989
/ TOPLAS 13(2), 1991

Note how this is totally different from simple tagging, because it deals
with real types at runtime.

- Andreas
Jun 21 '06 #127
Matthias Blume wrote:
"Rob Thorpe" <ro***********@antenova.com> writes:
I think we're discussing this at cross-purposes. In a language like C
or another statically typed language there is no information passed
with values indicating their type.


You seem to be confusing "does not have a type" with "no type
information is passed at runtime".
Have a look in a C compiler if you don't believe me.


Believe me, I have.


In a C compiler the compiler has no idea what the values are in the
program.
It knows only their type in that it knows the type of the variable they
are contained within.
Would you agree with that?
No it doesn't. Casting reinterprets a value of one type as a value of
another type.
There is a difference. If I cast an unsigned integer 2000000000 to a
signed integer in C on the machine I'm using then the result I will get
will not make any sense.


Which result are you getting? What does it mean to "make sense"?


Well the right one actually, bad example.

But, if I cast an unsigned int 2500000000 to signed I get -1794967296.

Jun 21 '06 #128
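The arithmetic behind Rob's two cast examples can be checked directly. Here is a Python sketch of the two's-complement reinterpretation that a typical C implementation performs for an unsigned-to-signed cast, assuming a 32-bit `int` as on the machine described (the function name is invented for the example):

```python
# Reinterpret an unsigned 32-bit value as a two's-complement signed
# int, as a typical C cast does on a machine with 32-bit ints.
def to_signed32(u):
    u &= 0xFFFFFFFF                 # keep the low 32 bits
    # values with the top bit set wrap around to negative numbers
    return u - 0x100000000 if u >= 0x80000000 else u

print(to_signed32(2000000000))   # → 2000000000 (fits; "the right one")
print(to_signed32(2500000000))   # → -1794967296 (wraps around)
```

2000000000 is below 2^31, so it survives the cast unchanged, which is why the first example looked uneventful; 2500000000 has the sign bit set, giving the -1794967296 Rob reports. (Strictly, C leaves out-of-range unsigned-to-signed conversion implementation-defined; the wraparound shown is merely the common behavior.)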
Rob Thorpe wrote:

I think this should make it clear. If I have a "xyz" in lisp I know it
is a string.
If I have "xyz" in an untyped language like assembler it may be
anything, two pointers in binary, an integer, a bitfield. There is no
data at compile time or runtime to tell what it is, the programmer has
to remember.
You have to distinguish between values (at the level of language
semantics) and their low-level representation (at the implementation
level). In a high-level language, the latter should be completely
immaterial to the semantics, and hence not interesting for the discussion.
No, variables are insignificant in this context. You can consider a
language without variables at all (such languages exist, and they can
even be Turing-complete) and still have evaluation, values, and a
non-trivial type system.


Hmm. You're right, ML is nowhere in my definition since it has no
variables.


Um, it has. Mind you, it has no /mutable/ variables, but that was not
even what I was talking about.
But the value itself has no type


You mean that the type of the value is not represented at runtime? True,
but that's simply because the type system is static. It's not the same
as saying it has no type.


Well, is it even represented at compile time?
The compiler doesn't know in general what values will exist at runtime,
it knows only what types variables have. Sometimes it only has partial
knowledge and sometimes the programmer deliberately overrides it. From
that knowledge you could say it knows what types values will have.


Again, variables are insignificant. From the structure of an expression
the type system derives the type of the resulting value. An expression
may contain variables, and then the type system generally must know (or
be able to derive) their types too, but that's a separate issue. Most
values are anonymous. Nevertheless their types are known.
Unfortunately it's often necessary to break static type systems.


You're definitely using the wrong static language then. ;-)

- Andreas
Jun 21 '06 #129
Andreas Rossberg schrieb:
Rob Thorpe wrote:
Hmm. You're right, ML is nowhere in my definition since it has no
variables.


Um, it has. Mind you, it has no /mutable/ variables, but that was not
even what I was talking about.


Indeed. A (possibly non-exhaustive) list of program entities that (can)
have a type would comprise mutable variables, immutable variables (i.e.
constants and parameter names), and functions resp. their results.

Regards,
Jo
Jun 21 '06 #130
Matthias Blume schrieb:
Joachim Durchholz <jo@durchholz.org> writes:
Matthias Blume schrieb:
Perhaps better: A language is statically typed if its definition
includes (or even better: is based on) a static type system, i.e., a
static semantics with typing judgments derivable by typing rules.
Usually typing judgments associate program phrases ("expressions") with
types given a typing environment.

This is defining a single term ("statically typed") using three
undefined terms ("typing judgements", "typing rules", "typing
environment").


This was not meant to be a rigorous definition.


Rigorous or not, introducing additional undefined terms doesn't help
with explaining a term.
Also, I'm not going to repeat the textbook definitions for those
three standard terms here.


These terms certainly aren't standard for Perl, Python, Java, or Lisp,
and they aren't even standard for topics covered on comp.lang.functional
(which includes dynamically-typed languages after all).

Regards,
Jo
Jun 21 '06 #131
Pascal Costanza schrieb:
(It's really important to understand that the idea is to use this for
deployed programs - albeit hopefully in a more structured fashion - and
not only for debugging. The example I have given is an extreme one that
you would probably not use as such in a "real-world" setting, but it
shows that there is a boundary beyond which static type systems cannot
be used in a meaningful way anymore, at least as far as I can tell.)


As soon as the running program can be updated, the distinction between
"static" (compile time) and "dynamic" (run time) blurs.
You can still erect a definition for such a case, but it needs to refer
to the update process, and hence becomes language-specific. In other
words, language-independent definitions of dynamic and static typing
won't give any meaningful results for such languages.

I'd say it makes more sense to talk about what advantages of static vs.
dynamic typing can be applied in such a situation.
E.g. one interesting topic would be the change in trade-offs: making
sure that a type error cannot occur becomes much more difficult
(particularly if the set of available types can change during an
update), so static typing starts to lose some of its appeal; OTOH a good
type system can give you a lot of guarantees even in such a situation,
even if it might have to revert to the occasional run-time type check,
so static checking still has its merits.

Regards,
Jo
Jun 21 '06 #132
Darren New wrote:

[me:]
Personally, I would be quite happy to go there -- I dislike the idea
that a value has a specific inherent type.


Interestingly, Ada defines a type as a collection of values. It works
quite well, when one consistently applies the definition.


I have never been very happy with relating type to sets of values (objects,
whatever). I'm not saying that it's formally wrong (but see below), but it
doesn't fit with my intuitions very well -- most noticeably in that the sets
are generally unbounded so you have to ask where the (intentional) definitions
come from.

Two other notions of what "type" means might be interesting, both come from
attempts to create type-inference mechanisms for Smalltalk or related
languages. Clearly one can't use the set-of-values approach for these purposes
;-) One approach takes "type" to mean "set of classes" the other takes a
finer-grained approach and takes it to mean "set of selectors" (where
"selector" is Smalltalk for "name of a method" -- or, more accurately, name of
a message).

But I would rather leave the question of what a type "is" open, and consider
that to be merely part of the type system. For instance the hypothetical
nullability analysis type system I mentioned might have only three types
NULLABLE, ALWAYSNULL, and NEVERNULL.

It's worth noting, too, that (in some sense) the type of an object can change
over time[*]. That can be handled readily (if not perfectly) in the informal
internal type system(s) which programmers run in their heads (pace the very
sensible post by Anton van Straaten today in this thread -- several branches
away), but cannot be handled by a type system based on sets-of-values (and is
also a counter-example to the idea that "the" dynamic type of an object/value
can be identified with its tag).

([*] if the set of operations in which it can legitimately partake changes.
That can happen explicitly in Smalltalk (using DNU proxies for instance if the
proxied object changes, or even using #becomeA:), but can happen anyway in less
"free" languages -- the State Pattern for instance, or even (arguably) in the
difference between an empty list and a non-empty list).

-- chris
Jun 21 '06 #133
Anton van Straaten wrote:
But a program as seen by the programmer has types: the programmer
performs (static) type inference when reasoning about the program, and
debugs those inferences when debugging the program, finally ending up
with a program which has a perfectly good type scheme. It may be
messy compared to, say, an HM type scheme, and it's usually not proved to
be perfect, but that again is an orthogonal issue.


I like this way of looking at it.

-- chris
Jun 21 '06 #134
David Hopwood wrote:
When people talk
about "types" being associated with values in a "latently typed" or
"dynamically typed" language, they really mean *tag*, not type.


I don't think that's true. Maybe /some/ people do confuse the two, but I am
certainly a counter-example ;-)

The tag (if any) is part of the runtime machinery (or, if not, then I don't
understand what you mean by the word), and while that is certainly a reasonably
approximation to the type of the object/value, it is only an approximation,
and -- what's more -- is only an approximation to the type as yielded by one
specific (albeit abstract, maybe even hypothetical) type system.

If I send #someMessage to a proxy object which has not had its referent set
(and assuming the default value, presumably some variant of nil, does not
understand #someMessage), then that's just as much a type error as sending
#someMessage to a variable holding a nil value. If I then assign the referent
of the proxy to some object which does understand #someMessage, then it is not
a type error to send #someMessage to the proxy. So the type has changed, but
nothing in the tag system of the language implementation has changed.

-- chris
Jun 21 '06 #135
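Chris Uppal's proxy example translates readily out of Smalltalk. Here is a hypothetical Python rendering (class and message names invented for the illustration) in which whether a "message send" is a type error changes over time, while the run-time tag of the proxy never does:

```python
# The proxy's run-time tag (its class) never changes, yet whether
# sending 'some_message' to it is a type error does change -- so the
# type the programmer reasons about is not the same thing as the tag.

class Proxy:
    def __init__(self):
        self._referent = None
    def set_referent(self, obj):
        self._referent = obj
    def __getattr__(self, name):            # forward unknown messages
        return getattr(self._referent, name)

class Greeter:
    def some_message(self):
        return "hello"

p = Proxy()
tag_before = type(p)
try:
    p.some_message()        # type error: referent not yet set
except AttributeError:
    print("type error before the referent is set")

p.set_referent(Greeter())
print(p.some_message())     # → hello: no longer a type error
print(type(p) is tag_before)  # → True: the tag never changed
```

The `AttributeError` here stands in for the Smalltalk doesNotUnderstand; the point is only that nothing in the tagging machinery changed when the error went away.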
Chris Smith wrote:
It would be interesting to see what a language designed specifically to
support user-defined, pluggable, and perhaps composable, type systems
would look like. [...]
You mean in terms of a practical programming language? If not, then
lambda calculus is used in precisely this way for the static sense of
types.


Good point. I was actually thinking about what a practical language might look
like, but -- hell -- why not start with theory for once ? ;-)

I think Marshall got this one right. The two are accomplishing
different things. In one case (the dynamic case) I am safeguarding
against negative consequences of the program behaving in certain non-
sensical ways. In the other (the static case) I am proving theorems
about the impossibility of this non-sensical behavior ever happening.
And so conflating the two notions of type (-checking) as a kind of category
error ? If so then I see what you mean, and it's a useful distinction, but am
unconvinced that it's /so/ helpful a perspective that I would want to exclude
other perspectives which /do/ see the two as more-or-less trivial variants on
the same underlying idea.
I acknowledge those questions. I believe they are valid. I don't know
the answers. As an intuitive judgement call, I tend to think that
knowing the correctness of these things is of considerable benefit to
software development, because it means that I don't have as much to
think about at any one point in time. I can validly make more
assumptions about my code and KNOW that they are correct. I don't have
to trace as many things back to their original source in a different
module of code, or hunt down as much documentation. I also, as a
practical matter, get development tools that are more powerful.
Agreed that these are all positive benefits of static declarative (more or
less) type systems.

But then (slightly tongue-in-cheek) shouldn't you be agitating for Java's type
system to be stripped out (we hardly /need/ it since the JVM does latent typing
anyway), leaving the field free for more powerful or more specialised static
analysis ?

(Whether it's possible to create the same for a dynamically typed
language is a potentially interesting discussion; but as a practical
matter, no matter what's possible, I still have better development tools
for Java than for JavaScript when I do my job.)
Acknowledged. Contrariwise, I have better development tools in Smalltalk than
I ever expect to have in Java -- in part (only in part) because of the late
binding in Smalltalk and its lack of insistence on declared types from an
arbitrarily chosen type system.

On
the other hand, I do like proving theorems, which means I am interested
in type theory; if that type theory relates to programming, then that's
great! That's probably not the thing to say to ensure that my thoughts
are relevant to the software development "industry", but it's
nevertheless the truth.


Saying it will probably win you more friends in comp.lang.functional than it
loses in comp.lang.java.programmer ;-)

-- chris
Jun 21 '06 #136
> So, will y'all just switch from using "dynamically typed" to "latently
typed", and stop talking about any real programs in real programming
languages as being "untyped" or "type-free", unless you really are
talking about situations in which human reasoning doesn't come into
play? I think you'll find it'll help to reason more clearly about this
whole issue.


I agree with most of what you say except regarding "untyped".

In machine language or most assembly the type of a variable is
something held only in the mind of the programmer writing it, and
nowhere else. In latently typed languages, though, the programmer can
ask what the type of a particular value is. There is a vast
difference to writing code in the latter kind of language to writing
code in assembly.

I would suggest that at least assembly should be referred to as
"untyped".

Jun 21 '06 #137
Chris Uppal wrote:

I have never been very happy with relating type to sets of values (objects,
whatever).
Indeed, this view is much too narrow. In particular, it cannot explain
abstract types, which is *the* central aspect of decent type systems.
There were papers observing this as early as 1970. A type system should
rather be seen as a logic, stating invariants about a program. This can
include operations supported by values of certain types, as well as more
advanced properties, e.g. whether something can cause a side-effect, can
diverge, can have a deadlock, etc.

(There are also theoretic problems with the types-as-sets view, because
sufficiently rich type systems can no longer be given direct models in
standard set theory. For example, first-class polymorphism would run
afoul of the axiom of foundation.)
It's worth noting, too, that (in some sense) the type of an object can change
over time[*].


No. Since a type expresses invariants, this is precisely what may *not*
happen. If certain properties of an object may change then the type of
the object has to reflect that possibility. Otherwise you cannot
legitimately call it a type.

Taking your example of an uninitialised reference, its type is neither
"reference to nil" nor "reference to object that understands message X",
it is in fact the union of both (at least). And indeed, languages with
slightly more advanced type systems make things like this very explicit
(in ML for example you have the option type for that purpose).

- Andreas
Jun 21 '06 #138
Joachim Durchholz wrote:
Pascal Costanza schrieb:
(It's really important to understand that the idea is to use this for
deployed programs - albeit hopefully in a more structured fashion -
and not only for debugging. The example I have given is an extreme one
that you would probably not use as such in a "real-world" setting, but
it shows that there is a boundary beyond which static type systems
cannot be used in a meaningful way anymore, at least as far as I can
tell.)


As soon as the running program can be updated, the distinction between
"static" (compile time) and "dynamic" (run time) blurs.
You can still erect a definition for such a case, but it needs to refer
to the update process, and hence becomes language-specific. In other
words, language-independent definitions of dynamic and static typing
won't give any meaningful results for such languages.

I'd say it makes more sense to talk about what advantages of static vs.
dynamic typing can be applied in such a situation.
E.g. one interesting topic would be the change in trade-offs: making
sure that a type error cannot occur becomes much more difficult
(particularly if the set of available types can change during an
update), so static typing starts to lose some of its appeal; OTOH a good
type system can give you a lot of guarantees even in such a situation,
even if it might have to revert to the occasional run-time type check,
so static checking still has its merits.


I am not opposed to this view. The two examples I have given for things
that are impossible in static vs. dynamic type systems were
intentionally extreme to make the point that you have to make a choice,
that you cannot just blindly throw (instances of) both approaches
together. Static type systems potentially change the semantics of a
language in ways that cannot be captured by dynamically typed languages
anymore, and vice versa.

There is, of course, room for research on performing static type checks
in a running system, for example immediately after or before a software
update is applied, or maybe even on separate type checking on software
increments such that guarantees for their composition can be derived.
However, I am not aware of a lot of work in that area, maybe because the
static typing community is too focused on compile-time issues.

Personally, I also don't think that's the most interesting issue in that
area, but that's of course only a subjective opinion.
Pascal

--
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
Jun 21 '06 #139
"Rob Thorpe" <ro***********@antenova.com> writes:
Matthias Blume wrote:
"Rob Thorpe" <ro***********@antenova.com> writes:
> I think we're discussing this at cross-purposes. In a language like C
> or another statically typed language there is no information passed
> with values indicating their type.
You seem to be confusing "does not have a type" with "no type
information is passed at runtime".
> Have a look in a C compiler if you don't believe me.


Believe me, I have.


In a C compiler the compiler has no idea what the values are in the
program.


It is no different from any other compiler, really. If the compiler
sees the literal 1 in a context that demands type int, then it knows
perfectly well what value that is.
It knows only their type in that it knows the type of the variable they
are contained within.
Would you agree with that?
> No it doesn't. Casting reinterprets a value of one type as a value of
> another type.
> There is a difference. If I cast an unsigned integer 2000000000 to a
> signed integer in C on the machine I'm using then the result I will get
> will not make any sense.


Which result are you getting? What does it mean to "make sense"?


Well the right one actually, bad example.

But, if I cast an unsigned int 2500000000 to signed I get -1794967296.


So, why do you think this "does not make sense"? And, as this example
illustrates, casting in C maps values to values. Depending on the
types of the source and the target, a cast might change the underlying
representation, or it might leave it the same. But it does produce a
value, and the result value is usually not the same as the argument
value, even if the representation is the same.

Matthias
Jun 21 '06 #140
Joachim Durchholz <jo@durchholz.org> writes:
Matthias Blume schrieb:
Joachim Durchholz <jo@durchholz.org> writes:
Matthias Blume schrieb:
Perhaps better: A language is statically typed if its definition
includes (or even better: is based on) a static type system, i.e., a
static semantics with typing judgments derivable by typing rules.
Usually typing judgments associate program phrases ("expressions") with
types given a typing environment.
This is defining a single term ("statically typed") using three
undefined terms ("typing judgements", "typing rules", "typing
environment").

This was not meant to be a rigorous definition.


Rigorous or not, introducing additional undefined terms doesn't help
with explaining a term.


I think you missed my point. My point was that a language is
statically typed IF IT IS DEFINED THAT WAY, i.e., if it has a static
type system that is PART OF THE LANGUAGE DEFINITION. The details are
up to each individual definition.
Also, I'm not going to repeat the textbook definitions for those
three standard terms here.


These terms certainly aren't standard for Perl, Python, Java, or Lisp,


Indeed. That's because these languages are not statically typed.
Jun 21 '06 #141
Chris Uppal schrieb:
Chris Smith wrote:
I think Marshall got this one right. The two are accomplishing
different things. In one case (the dynamic case) I am safeguarding
against negative consequences of the program behaving in certain non-
sensical ways. In the other (the static case) I am proving theorems
about the impossibility of this non-sensical behavior ever happening.


And so you see conflating the two notions of type (-checking) as a kind of category
error? If so then I see what you mean, and it's a useful distinction, but am
unconvinced that it's /so/ helpful a perspective that I would want to exclude
other perspectives which /do/ see the two as more-or-less trivial variants on
the same underlying idea.


It is indeed helpful.
Just think of all the unit tests that you don't have to write.

Regards,
Jo
Jun 21 '06 #142
Matthias Blume schrieb:
Joachim Durchholz <jo@durchholz.org> writes:
Matthias Blume schrieb:
Joachim Durchholz <jo@durchholz.org> writes:

Matthias Blume schrieb:
> Perhaps better: A language is statically typed if its definition
> includes (or even better: is based on) a static type system, i.e., a
> static semantics with typing judgments derivable by typing rules.
> Usually typing judgments associate program phrases ("expressions") with
> types given a typing environment.
This is defining a single term ("statically typed") using three
undefined terms ("typing judgements", "typing rules", "typing
environment").
This was not meant to be a rigorous definition.

Rigorous or not, introducing additional undefined terms doesn't help
with explaining a term.


I think you missed my point. My point was that a language is
statically typed IF IT IS DEFINED THAT WAY, i.e., if it has a static
type system that is PART OF THE LANGUAGE DEFINITION. The details are
up to each individual definition.


Well, that certainly makes more sense to me.

Regards,
Jo
Jun 21 '06 #143
Matthias Blume wrote:
"Rob Thorpe" <ro***********@antenova.com> writes:
Matthias Blume wrote:
"Rob Thorpe" <ro***********@antenova.com> writes:

> I think we're discussing this at cross-purposes. In a language like C
> or another statically typed language there is no information passed
> with values indicating their type.

You seem to be confusing "does not have a type" with "no type
information is passed at runtime".

> Have a look in a C compiler if you don't believe me.

Believe me, I have.
In a C compiler the compiler has no idea what the values are in the
program.


It is no different from any other compiler, really. If the compiler
sees the literal 1 in a context that demands type int, then it knows
perfectly well what value that is.


Well, with a literal, yes. But with the value of some variable x at
runtime it only knows the type because it knows the type of the
variable. Similarly, the type of the values generated by an expression
is only known because the type the expression generates is known.
It knows only their type in that it knows the type of the variable they
are contained within.
Would you agree with that?
Would you?
> No it doesn't. Casting reinterprets a value of one type as a value of
> another type.
> There is a difference. If I cast an unsigned integer 2000000000 to a
> signed integer in C on the machine I'm using then the result I will get
> will not make any sense.

Which result are you getting? What does it mean to "make sense"?


Well the right one actually, bad example.

But, if I cast an unsigned int 2500000000 to signed I get -1794967296.


So, why do you think this "does not make sense"?


Well, it makes sense in terms of the C spec certainly.
It does not make sense in that it does not emit an error.
And, as this example
illustrates, casting in C maps values to values. Depending on the
types of the source and the target, a cast might change the underlying
representation, or it might leave it the same. But it does produce a
value, and the result value is usually not the same as the argument
value, even if the representation is the same.


Yes. I'm not arguing with that.

Jun 21 '06 #144
Pascal Costanza schrieb:
Static type systems potentially change the semantics of a
language in ways that cannot be captured by dynamically typed languages
anymore, and vice versa.
Very true.

I also suspect that this is why adding type inference to a
dynamically-typed language doesn't give you all the benefits of static
typing: the added-on type system is (usually) too weak to express really
interesting guarantees, usually because the language's semantics isn't
tailored towards making the inference steps easy enough.

Conversely, I suspect that adding dynamic typing to statically-typed
languages tends to miss the most interesting applications, mostly
because all the features that can "simply be done" in a
dynamically-typed language have to be retrofitted to the
statically-typed language on a case-by-case basis.

In both cases, the language designers often don't know the facilities of
the opposed camp well enough to really assess the trade-offs they are doing.
There is, of course, room for research on performing static type checks
in a running system, for example immediately after or before a software
update is applied, or maybe even on separate type checking on software
increments such that guarantees for their composition can be derived.
However, I am not aware of a lot of work in that area, maybe because the
static typing community is too focused on compile-time issues.


I think it's mostly because it's intimidating.

The core semantics of an ideal language fits on a single sheet of paper,
to facilitate proofs of language properties. Type checking
dynamically-loaded code probably wouldn't fit on that sheet of paper.
(The non-core semantics is then usually a set of transformation rules
that map the constructs that the programmer sees to constructs of the
core language.)

Regards,
Jo
Jun 21 '06 #145
Chris Smith wrote:

When I used the word "type" above, I was adopting the
working definition of a type from the dynamic sense. That is, I'm
considering whether statically typed languages may be considered to also
have dynamic types, and it's pretty clear to me that they do.


I suppose this statement has to be evaluated on the basis of a
definition of "dynamic types." I don't have a firm definition for
that term, but my working model is runtime type tags. In which
case, I would say that among statically typed languages,
Java does have dynamic types, but C does not. C++ is
somewhere in the middle.
Marshall

Jun 21 '06 #146
Joachim Durchholz wrote:

Hmm... I think this distinction doesn't cover all cases.

Assume a language that
a) defines that a program is "type-correct" iff HM inference establishes
that there are no type errors
b) compiles a type-incorrect program anyway, with an established,
rigorous semantics for such programs (e.g. by throwing exceptions as
appropriate).
The compiler might actually refuse to compile type-incorrect programs,
depending on compiler flags and/or declarations in the code.

Typed ("strongly typed") it is, but is it statically typed or
dynamically typed?


I think what this highlights is the fact that our existing terminology
is not up to the task of representing all the possible design
choices we could make. Some parts of dynamic vs. static
are mutually exclusive; some parts are orthogonal. Maybe
we have reached the point where trying to cram everything
into one of two possible ways of doing things isn't going
to cut it any more.

Could it be that the US two-party system has influenced our
thinking?</joke>
Marshall

Jun 21 '06 #147
"Rob Thorpe" <ro***********@antenova.com> writes:
>> > No it doesn't. Casting reinterprets a value of one type as a value of
>> > another type.
>> > There is a difference. If I cast an unsigned integer 2000000000 to a
>> > signed integer in C on the machine I'm using then the result I will get
>> > will not make any sense.
>>
>> Which result are you getting? What does it mean to "make sense"?
>
> Well the right one actually, bad example.
>
> But, if I cast an unsigned int 2500000000 to signed I get -1794967296.


So, why do you think this "does not make sense"?


Well, it makes sense in terms of the C spec certainly.
It does not make sense in that it does not emit an error.


Why not?
Jun 21 '06 #148
Rob Thorpe wrote:
Matthias Blume wrote:
"Rob Thorpe" <ro***********@antenova.com> writes:
I think we're discussing this at cross-purposes. In a language like C
or another statically typed language there is no information passed
with values indicating their type.


You seem to be confusing "does not have a type" with "no type
information is passed at runtime".
Have a look in a C compiler if you don't believe me.


Believe me, I have.


In a C compiler the compiler has no idea what the values are in the program.
It knows only their type in that it knows the type of the variable they
are contained within.
Would you agree with that?


No. In any language, it may be possible to statically infer that the
value of an expression will belong to a set of values smaller than that
allowed by the expression's type in that language's type system. For
example, all constants have a known value, but most constants have a
type which allows more than one value.

(This is an essential point, not just nitpicking.)

--
David Hopwood <da******************@blueyonder.co.uk>
Jun 21 '06 #149
Joachim Durchholz wrote:

On a semantic level, the tag is always there - it's the type (and
definitely part of an axiomatic definition of the language).
Tag elimination is "just" an optimization.


I see what you're saying, but the distinction is a bit fine for me.
If the language has no possible mechanism to observe the
it-was-only-eliminated-as-an-optimization tag, in what sense
is it "always there"? E.g., the C programming language.
Marshall

Jun 21 '06 #150
