What is Expressiveness in a Computer Language

Xah Lee

in March, i posted a essay â€œWhat is Expressiveness in a Computer
Languageâ€, archived at:
http://xahlee.org/perl-python/what_i...esiveness.html

I was informed then that there is a academic paper written on this
subject.

On the Expressive Power of Programming Languages, by Matthias
Felleisen, 1990.
http://www.ccs.neu.edu/home/cobbe/pl...ive-slides.pdf

Has anyone read this paper? And, would anyone be interested in giving a
summary?

thanks.

Xah
xa*@xahlee.org
âˆ‘ http://xahlee.org/

Jun 9 '06

Subscribe Post Reply

669

25314

« First
<
4
12
13
14

Darren New

Joachim Durchholz wrote:

of no assertion language that can express such temporal relationships,
and even if there is (I'm pretty sure there is), I'm rather sceptical
that programmers would be able to write correct assertions, or correctly
interpret them - temporal logic offers several traps for the unwary.

FWIW, this is exactly the area to which LOTOS (Language Of Temporal
Orderering Specifications) is targetted at. It's essentially based on
CSP, but somewhat extended. It's pretty straightforward to learn and
understand, too. Some have even added "realtime" constraints to it.

--
Darren New / San Diego, CA, USA (PST)
This octopus isn't tasty. Too many
tentacles, not enough chops.

Jul 17 '06 #651

Chris Smith

Marshall <ma*************@gmail.comwrote:

We seem to have slipped back from the hypothetical relation language
with only assignement back to SQL.

I missed the point where we started discussing such a language. I
suspect it was while some of us were still operating under the
misconception that you assignment to attributes of tuples, rather than
to entire relations.

I don't see how such a language (limited to assignment of entire
relations) is particularly helpful to consider. If the relations are to
be considered opaque, then there's clearly no aliasing going on.
However, such a language doesn't seem to solve any actual problems. It
appears to be nothing other than a toy language, with a fixed finite set
of variables having only value semantics, no scope, etc. I assume that
relational databases will have the equivalent of SQL's update statement;
and if that's not the case, then I would need someone to explain how to
accomplish the same goals in the new relational language; i.e. it would
need some way of expressing transformations of relations, not just
complete replacement of them with new relations that are assumed to
appear out of thin air.

--
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation

Jul 17 '06 #652

Darren New

Marshall wrote:

I would propose that variables have identity, and values do not.
In part this is via the supplied definition of identity, in which, when
you change one thing, if something else changes as well, they
share identity.

Maybe you gave a better definition the first time, but this one doesn't
really hold up.

of equality here is too narrow; it is only necessary to show that
two things both change, not that they change in the same way.)

If I change X, then Y[X] changes also. I don't think X is identical to Y
or Y[X], nor is it an alias of either. I think that's where the
equality comes into it.

--
Darren New / San Diego, CA, USA (PST)
This octopus isn't tasty. Too many
tentacles, not enough chops.

Jul 17 '06 #653

Joe Marshall

Marshall wrote:

>
I am having a hard time with this very broad definition of aliasing.

How about this definition: Consider three variables, i, j, and k, and
a functional equivalence predicate (EQUIVALENT(i, j) returns true if
for every pure function F, F(i) = F(j)). Now suppose i and j are
EQUIVALENT at some point, then a side effecting function G is invoked
on k, after which i and j are no longer equivalent. Then there is
aliasing.

This is still a little awkward, but there are three main points:
1. Aliasing occurs between variables (named objects).
2. It is tied to the notion of equivalence.
3. You can detect it when a procedure that has no access to a value
can nonetheless modify the value.

In a call-by-value language, you cannot alias values directly, but if
the values are aggregate data structures (like in Java), you may be
able to modify a shared subcomponent.

Jul 17 '06 #654

Marshall

Chris Smith wrote:

Marshall <ma*************@gmail.comwrote:
We seem to have slipped back from the hypothetical relation language
with only assignement back to SQL.

[...]
I don't see how such a language (limited to assignment of entire
relations) is particularly helpful to consider.

I find it useful to consider various language features together and
individually, to see how these features interact. If nothing else,
it furthers my understanding of how languages work.

If the relations are to
be considered opaque, then there's clearly no aliasing going on.

Not certain I understand, but I think I agree.

However, such a language doesn't seem to solve any actual problems. It
appears to be nothing other than a toy language, with a fixed finite set
of variables having only value semantics, no scope, etc.

No, such a language is entirely useful, since relational assignment
is *more* expressive than insert/update/delete. (How did you get "no
scope"
by the way? And fixed variables? All I said was to change ins/upd/del
to
assignment.)

Also note one could fairly easily write a macro for ins/upd/del if one
had assignement. (See below.)

Now, there are some significant issues with trying to produce a
performant implementation in a distributed environment with strictness.
In fact I will go so far as to say it's impossible. However on a single
machine I don't see any reason to think it would perform any worse,
(although I haven't thought about it deeply.)

I assume that
relational databases will have the equivalent of SQL's update statement;
and if that's not the case, then I would need someone to explain how to
accomplish the same goals in the new relational language; i.e. it would
need some way of expressing transformations of relations, not just
complete replacement of them with new relations that are assumed to
appear out of thin air.

Consider:

i := i + 1;

Note that the new value of i didn't just appear out of thin air; it was
in fact based on the previous value of i.

Given:
variable T : relation of r;
update_fn : r -r // takes one record and returns new, updated
record
filter_fn : r -boolean // indicator function

insert(T, new_rows) ={ T := T union new_rows; }

update(T, update_fn, filter_fn) ={
transaction {
let Tmp = filter(T, filter_fn); // select only those rows to
update
T := T minus Tmp;
T := T union update_fn(Tmp);
}
}

delete(T,filter_fn) ={ T := T minus filter(T, filter_fn); }

So we can define insert, update and delete in terms of relational
assignment, relational subtraction, and relational union. Type
checking details omitted.
Marshall

Jul 17 '06 #655

David Hopwood

Darren New wrote:

From what I can determine, the authors seem to imply that typestate is
dataflow analysis modified in (at least) two ways:

1) When control flow joins, the new typestate is the intersection of
typestates coming into the join, where as dataflow analysis doesn't
guarantee that. (They imply they think dataflow analysis is allowed to
say "the variable might or might not be initialized here", while
typestate would ensure the variable is uninitialized.)

Right, but this is a disadvantage of their typestate algorithm. It is why
the example in Figure 2 of
<http://www.cs.ubc.ca/local/reading/proceedings/spe91-95/spe/vol25/issue4/spe950wk.pdf>
fails to check, even though it "obviously" initializes all variables.

Consider the equivalent Java program:

public class LoopInitTest {
public static void main(String[] args) {
boolean b;
int i;

while (true) {
b = true;
if (b) {
v = 1;
}
v = v + 1;
}
}
}

As it happens, javac rejects this:

LoopInitTest.java:12: variable v might not have been initialized
v = v + 1;
^

but for a different and more trivial reason than the Hermes algorithm.
Change "if (b) { v = 1; }" to just "v = 1;", and the Java version will be
accepted by its definite assignment analysis (which is a dataflow analysis),
but the equivalent Hermes program still would not.

--
David Hopwood <da******************@blueyonder.co.uk>

Jul 17 '06 #656

Chris Smith

Marshall <ma*************@gmail.comwrote:

If the relations are to
be considered opaque, then there's clearly no aliasing going on.

Not certain I understand, but I think I agree.

My condition, though, was that relations be opaque. Since you will be
violating that condition further on down, I just felt that it's useful
to point that out here.

No, such a language is entirely useful, since relational assignment
is *more* expressive than insert/update/delete.

Assignment is more powerful *as* assignment. However, it is less
powerful when the task at hand is deriving new relations from old ones.
Assignment provides absolutely no tools for doing that. I thought you
were trying to remove those tools from the language entirely in order to
remove the corresponding aliasing problems. I guess I was wrong, since
you make it clear below that you intend to keep at least basic set
operations on relations in your hypothetical language.

Consider:

i := i + 1;

Note that the new value of i didn't just appear out of thin air; it was
in fact based on the previous value of i.

Right. That's exactly the kind of thing I thought you were trying to
avoid.

So we can define insert, update and delete in terms of relational
assignment, relational subtraction, and relational union. Type
checking details omitted.

Then the problem is in the step where you assign the new relation to the
old relational variable. You need to check that the new relation
conforms to the invariants that are expressed on that relational
variable. If you model this as assignment of relations (or relation
values... I'm unclear on the terminology at this point) then naively
this requires scanning through an entire set of relations in the
constraint, to verify that the invariant holds. You've may have avoided
"aliasing" in any conventional sense of the word by stretching the word
itself beyond breaking... but you've only done it by proactively
accepting its negative consequences.

It remains non-trivial to scan through a 2 GB database table to verify
that some attribute of every tuple matches some attribute of another
table, even if you call the entire thing one relational variable. The
implementation, of course, isn't at all going to make a copy of the
entire (possibly several GB) relation and rewrite it all every time it
makes a change, and it isn't going to give up and rescan all possible
invariants every time every change is made. In other words, you've
risen to a layer of abstraction where the aliasing problem does not
exist. The implementation is still going to deal with the aliasing
problem, which will resurface once you pass over to the other side of
the abstraction boundary.

--
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation

Jul 17 '06 #657

David Hopwood

David Hopwood wrote:

Darren New wrote:

>>From what I can determine, the authors seem to imply that typestate is
dataflow analysis modified in (at least) two ways:

1) When control flow joins, the new typestate is the intersection of
typestates coming into the join, where as dataflow analysis doesn't
guarantee that. (They imply they think dataflow analysis is allowed to
say "the variable might or might not be initialized here", while
typestate would ensure the variable is uninitialized.)

Right, but this is a disadvantage of their typestate algorithm. It is why
the example in Figure 2 of
<http://www.cs.ubc.ca/local/reading/proceedings/spe91-95/spe/vol25/issue4/spe950wk.pdf>
fails to check, even though it "obviously" initializes all variables.

Consider the equivalent Java program:

I mixed up Figures 1 and 2. Here is the Java program that Figure 2 should
be compared to:

public class LoopInitTest {
public static String getString() { return "foo"; }

public static void main(String[] args) {
String line = getString();
boolean is_last = false;

while (!is_last) {
if (line.charAt(0) == 'q') {
is_last = true;
}

// insert line into inputs (not important for analysis)

if (!is_last) {
line = getString();
}
}
}
}

which compiles without error, because is_last is definitely initialized.

--
David Hopwood <da******************@blueyonder.co.uk>

Jul 17 '06 #658

Darren New

David Hopwood wrote:

>
public class LoopInitTest {
public static String getString() { return "foo"; }

public static void main(String[] args) {
String line = getString();
boolean is_last = false;

while (!is_last) {
if (line.charAt(0) == 'q') {
is_last = true;
}

// insert line into inputs (not important for analysis)

if (!is_last) {
line = getString();
}
}
}
}

which compiles without error, because is_last is definitely initialized.

At what point do you think is_last or line would seem to not be
initialized? They're both set at the start of the function, and (given
that it's Java) nothing can unset them.

At the start of the while loop, it's initialized. At the end of the
while loop, it's initialized. So the merge point of the while loop has
it marked as initialized.

Now, if the "insert line into inputs" actually unset "line", then yes,
you're right, Hermes would complain about this.

Alternately, if you say
if (x) v = 1;
if (x) v += 1;
then Hermes would complain when it wouldn't need to. However, that's
more a limitation of the typestate checking algorithms than the concept
itself; that is to say, clearly the typestate checker could be made
sufficiently intelligent to track most simple versions of this problem
and not complain, by carrying around conditionals in the typestate
description.

--
Darren New / San Diego, CA, USA (PST)
This octopus isn't tasty. Too many
tentacles, not enough chops.

Jul 17 '06 #659

Darren New

Darren New wrote:

Now, if the "insert line into inputs" actually unset "line", then yes,
you're right, Hermes would complain about this.

Oh, I see. You translated from Hermes into Java, and Java doesn't have
the "insert into" statement. Indeed, the line you commented out is
*exactly* what's important for analysis, as it unsets line.

Had it been
insert copy of line into inputs
then you would not have gotten any complaint from Hermes, as it would
not have unset line there.

In this case, it's equivalent to
if (!is_line) line = getString();
if (!is_line) use line for something...
except the second test is at the top of the loop instead of the bottom.

--
Darren New / San Diego, CA, USA (PST)
This octopus isn't tasty. Too many
tentacles, not enough chops.

Jul 17 '06 #660

Marshall

Chris Smith wrote:

Marshall <ma*************@gmail.comwrote:

If the relations are to
be considered opaque, then there's clearly no aliasing going on.
Not certain I understand, but I think I agree.

My condition, though, was that relations be opaque. Since you will be
violating that condition further on down, I just felt that it's useful
to point that out here.

No, such a language is entirely useful, since relational assignment
is *more* expressive than insert/update/delete.

Assignment is more powerful *as* assignment. However, it is less
powerful when the task at hand is deriving new relations from old ones.

At the implementation level, it makes some things harder, however
as a logical model, it is more powerful. While this is very much a real
world issue, it is worth noting that it is a performance issue *merely*
and not a semantic issue.

Assignment provides absolutely no tools for doing that. I thought you
were trying to remove those tools from the language entirely in order to
remove the corresponding aliasing problems. I guess I was wrong, since
you make it clear below that you intend to keep at least basic set
operations on relations in your hypothetical language.

Consider:

i := i + 1;

Note that the new value of i didn't just appear out of thin air; it was
in fact based on the previous value of i.

Right. That's exactly the kind of thing I thought you were trying to
avoid.

I was under the impression tat Joachim, for example, did not
consider "i+1" as an alias for i.

So we can define insert, update and delete in terms of relational
assignment, relational subtraction, and relational union. Type
checking details omitted.

Then the problem is in the step where you assign the new relation to the
old relational variable. You need to check that the new relation
conforms to the invariants that are expressed on that relational
variable. If you model this as assignment of relations (or relation
values... I'm unclear on the terminology at this point) then naively
this requires scanning through an entire set of relations in the
constraint, to verify that the invariant holds. You've may have avoided
"aliasing" in any conventional sense of the word by stretching the word
itself beyond breaking... but you've only done it by proactively
accepting its negative consequences.

It remains non-trivial to scan through a 2 GB database table to verify
that some attribute of every tuple matches some attribute of another
table, even if you call the entire thing one relational variable. The
implementation, of course, isn't at all going to make a copy of the
entire (possibly several GB) relation and rewrite it all every time it
makes a change, and it isn't going to give up and rescan all possible
invariants every time every change is made. In other words, you've
risen to a layer of abstraction where the aliasing problem does not
exist. The implementation is still going to deal with the aliasing
problem, which will resurface once you pass over to the other side of
the abstraction boundary.

Yes, these *performance* issues make assignment prohibitive for
real-world use, at least if we are talking about data management
in the large. This is not the same thing as saying the resulting
language is a toy language, though; its semantics are quite
interesting and possibly a better choice for *defining* the semantics
of the imperative operations than directly modelling the imperative
operations. (Or maybe not.) In any event, it's worth thinking about,
even if performance considerations make it not worth implementing.
Marshall

Jul 17 '06 #661

Chris Smith

Marshall <ma*************@gmail.comwrote:

Yes, these *performance* issues make assignment prohibitive for
real-world use, at least if we are talking about data management
in the large. This is not the same thing as saying the resulting
language is a toy language, though; its semantics are quite
interesting and possibly a better choice for *defining* the semantics
of the imperative operations than directly modelling the imperative
operations. (Or maybe not.) In any event, it's worth thinking about,
even if performance considerations make it not worth implementing.

My "toy language" comment was directed at a language that I mistakenly
thought you were proposing, but that you really weren't. You can ignore
it, and all the corresponding comments about assignment being less
powerful, etc. I was apparently not skilled at communication when I
tried to say that in the last message.

It is, perhaps, worth thinking about. My assertion here (which I think
I've backed up, but there's been enough confusion that I'm not surprised
if it was missed) is that the underlying reasons that performance might
be poor for this language are a superset of the performance problems
caused by aliasing. Hence, when discussing the problems caused by
aliasing for the performance of language implementations (which I
believe was at some point the discussion here), this isn't a
particularly useful example.

It does, though, have the nice property of hiding the aliasing from the
semantic model. That is interesting and worth considering, but is a
different conversation; and I don't know how to start it.

--
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation

Jul 17 '06 #662

David Hopwood

Darren New wrote:

David Hopwood wrote:

>public class LoopInitTest {
public static String getString() { return "foo"; }

public static void main(String[] args) {
String line = getString();
boolean is_last = false;

while (!is_last) {
if (line.charAt(0) == 'q') {
is_last = true;
}

// insert line into inputs (not important for analysis)

if (!is_last) {
line = getString();
}
}
}
}

which compiles without error, because is_last is definitely initialized.

At what point do you think is_last or line would seem to not be
initialized? They're both set at the start of the function, and (given
that it's Java) nothing can unset them.

At the start of the while loop, it's initialized. At the end of the
while loop, it's initialized. So the merge point of the while loop has
it marked as initialized.

Apparently, Hermes (at least the version of it described in that paper)
essentially forgets that is_last has been initialized at the top of the
loop, and so when it does the merge, it is merging 'not necessarily initialized'
with 'initialized'.

This sounds like a pretty easy thing to fix to me (and maybe it was fixed
later, since there are other papers on Hermes' typestate checking that I
haven't read yet).

--
David Hopwood <da******************@blueyonder.co.uk>

Jul 18 '06 #663

Joachim Durchholz

Marshall schrieb:

Chris Smith wrote:
>Joachim Durchholz <jo@durchholz.orgwrote:
I *think* I understand Marshall here. When you are saying "assignment",
you mean assignment to values of attributes within tuples of the cell.
When Marshall is saying "assignment", he seems to mean assigning a
completely new *table* value to a relation; i.e., wiping out the entire
contents of the relation and replacing it with a whole new set of
tuples. Your assignment is indeed less powerful than DML, whereas
Marshall's assignment is more powerful than DML.

Exactly.

Ah well. I never meant that kind of assignment.

Besides, all the aliasing possibilities already happen at the field
level, so it never occurred to me that whole-table assignment might
enter into the picture.

Regards,
Jo

Jul 18 '06 #664

Joachim Durchholz

Marshall schrieb:

No. The variable is the table, not the records.

In your examples, yes.

Relations are not arrays.

They are, in all ways that matter for aliasing:
They are a collection of mutable data, accessible via selector values.

Records are not lvalues.

Agreed, but they have identity. An lvalue is an identity that you can
track in software; record identity is untrackable in standard SQL, but
it's real enough to create aliasing problems.

I would propose that variables have identity, and values do not.

What's a "variable", by that definition?
It can't be "something that has a name in a program", since that would
exclude variables on the heap.
Another definition would be "something that may have a different value
if I look at it sometime later", but then that would include SQL records
as well. Besides, the temporal aspect is problematic, since it's unclear
whether the "it" some time later is still the same as the "it" before
(if you have a copying garbage collector, you'd have to somehow
establish identity over time, and we're full circle back at defining
identity).

Personally, I say "two things are identical / have the same identity",
if they are (a) equal and (b) stay equal after changing just one of
them. That's a nice local definition, and captures exactly those cases
where aliasing is relevant in the first place (it's actually that
"changing here also changes there" aspect that makes aliasing so dangerous).
It also covers SQL records. Well, they do have aliasing problems, or
problems that are similar enough to them, so I'd say this is a
surprising but actually interesting and useful result of having that
definition.

In part this is via the supplied definition of identity, in which, when
you change one thing, if something else changes as well, they
share identity. Since one cannot change values, they necessarily
lack identity.

I agree that immutable values don't have identity (or, rather, that
identity isn't an interesting concept for them since it's already
subsumed under equality).

>It's certainly not storage location:
if you DELETE a record and then INSERT it with the same values, it may
be allocated somewhere entirely else, and our intuition would say it's
not "the same" (i.e. not identical).

Well, it would depend on how our intuition had been primed. If it
was via implementation techniques in low level languages, we
might reach a different conclusion than if our intuition was primed
via logical models and relation theory.

Sure.

However, since I'm interested in practical consequences for programming,
I'm looking for definitions that allow me to capture those effects that
create bugs.
In the case of identity, I'd say that's aliasing.

>Since intuition gives me ambivalent results here, I'll go back to my
semiformal definition (and take the opportunity to make it a bit more
precise):
Two path expressions (lvalues, ...) are aliases if and only if the
referred-to values compare equal, and if they stay equal after applying
any operation to the referred-to value through either of the path
expressions.

Alas, this leaves me confused. I don't see how a path expression
(in this case, SELECT ... WHERE) can be an l-value. You cannot
apply imperative operations to the result.

It's not SELECT...WHERE..., it's WHERE... that's an lvalue.
And you can indeed use it with UPDATE, so yes it's an lvalue by my book.
(A given WHERE clause may become invalid or refer to a different record
after some updates, so it's not the kind of lvalue that you have in a C
program.)

(Also I think the use
of equality here is too narrow; it is only necessary to show that
two things both change, not that they change in the same way.)

Sorry for using misleading terminology.
An "alias" is when two names refer to an identical object (both in real
life and IMHO in programming), and that's what I wrote.
"Aliasing" is the kind of problems that you describe.

I was under the impression you agred that "i+2" was not
a "path expression".

Yes, but [i+2] is one.

In the same vein, "name = 'John'" is an expression, "WHERE name =
'John'" is a path expression.

If our hypothetical language lacks record
identity, then I would say that any query is simply an expression
that returns a value, as in "i+2."

As long as the data is immutable, yes.
The question is how equality behaves under mutation.

>On the other hand, variable addresses as tangible identities don't hold
much water anyway.
Imagine data that's written out to disk at program end, and read back
in. Further imagine that while the data is read into main memory,
there's a mechanism that redirects all further reads and writes to the
file into the read-in copy in memory, i.e. whenever any program changes
the data, all other programs see the change, too.
Alternatively, think about software agents that move from machine to
machine, carrying their data with them. They might be communicating with
each other, so they need some means of establishing identity
("addressing") the memory buffers that they use for communication.

These are exactly why content-based addressing is so important.
Location addressing depends on an address space, and this
concept does not distribute well.

I think that's an overgeneralization. URLs have distributed well in the
past.

> I don't know what "new" would be in a value-semantics, relational
world.
It would be INSERT.

Um, my idea of "value semantics" is associated with immutable values.
SQL with INSERT/DELETE/UPDATE certainly doesn't match that definition.

Sorry, I was vague. Compare, in OOP, the difference between a value
object and a "regular" object.

Yes, one if mutable, the other is not.
I'm not sure how that relates to SQL though.

>>>Filters are just like array indexing: both select a subset of variables
from a collection.
I can't agree with this wording. A filter produces a collection
value from a collection value. I don't see how variables
enter in to it.
A collection can consist of values or variables.

And yes, I do think that WHERE is a selection over a bunch of variables
- you can update records after all, so they are variables! They don't
have a name, at least none which is guaranteed to be constant over their
lifetime, but they can be mutated!

We seem to have slipped back from the hypothetical relation language
with only assignement back to SQL.

This all started with your claim that SQL does not have identities and
aliasing problems, which grew out of a discussion of what identities and
aliases and aliasing really are.
Hypothetical relation languages are interesting to illustrate fine
points, but they don't help with that.

Anyway, if a hypothetical relational language is what you wish to
discuss, let me return to an even more general model: I'm pretty sure
that any language that has
a) collections
b) means of addressing individual items
c) means of changing individiual items
has aliasing problems, and that the indidual items have identity.

(Actually all programming languages with mutation and any kind of
reference fit that description: the "collection" is the pool of
variables, the "means of addressing" is the references, and "means of
changing" is the mutation.)

> One can filter either a collection constant or
a collection variable; if one speaks of filtering a collection
variable, on is really speaking of filtering the collection value
that the variable currently contains; filtering is not an operation
on the variable as such, the way the "address of" operator is.
Note you can't update the result of a filter.
If that's your definition of a filter, then WHERE is not a filter,
simple as that.

Fair enough! Can you correct my definition of filter, though?

I can't. You were using the term "filter", and seemed to apply it to
WHERE but assume something that didn't match WHERE.
I wasn't rejecting your definition but that you applied the term to
something that didn't match it.

Oh, and personally, I'd say a filter is something that selects items
from a collection, be collection or items mutable or not. But that's
just my personal definition, and I'll be happy to stick with anything
that's remotely reasonable ;-)

Your English is extraordinary. I could easily conclude that you
were born in Boston and educated at Harvard, and either have
Germanic ancestry or have simply adopted a Germanic name
out of whimsy. If English is not your native tongue, there is no
way to detect it.

Thanks :-)

But, no, my last visit to an English-speaking country was decades ago.
Attribute my English to far too much time in technical newsgroups, and
far too much googling ;-)
I'm pretty sure that my oral English is much, much worse, and that I've
got a heavy German accent. I'd just try to improve on that if I were to
relocate to GB or the US (and maybe that's just what made the difference
for my English: constant striving for improvement).

Regards,
Jo

Jul 18 '06 #665

Darren New

David Hopwood wrote:

Darren New wrote:

>>David Hopwood wrote:

>>>public class LoopInitTest {
public static String getString() { return "foo"; }

public static void main(String[] args) {
String line = getString();
boolean is_last = false;

while (!is_last) {
if (line.charAt(0) == 'q') {
is_last = true;
}

// insert line into inputs (not important for analysis)

if (!is_last) {
line = getString();
}
}
}
}

which compiles without error, because is_last is definitely initialized.

At what point do you think is_last or line would seem to not be
initialized? They're both set at the start of the function, and (given
that it's Java) nothing can unset them.

At the start of the while loop, it's initialized. At the end of the
while loop, it's initialized. So the merge point of the while loop has
it marked as initialized.

Apparently, Hermes (at least the version of it described in that paper)
essentially forgets that is_last has been initialized at the top of the
loop, and so when it does the merge, it is merging 'not necessarily initialized'
with 'initialized'.

No, it's not that it "forgets". It's that the "insert line into inputs"
unitializes "line". Hence, "line" is only conditionally set at the
bottom of the loop, so it's only conditionally set at the top of the loop.

This sounds like a pretty easy thing to fix to me (and maybe it was fixed
later, since there are other papers on Hermes' typestate checking that I
haven't read yet).

You simply misunderstand the "insert line into inputs" semantics. Had
that line actually been commented out in the Hermes code, the loop would
have compiled without a problem.

That said, in my experience, finding this sort of typestate error
invariably led me to writing clearer code.

boolean found_ending = false;
while (!found_ending) {
string line = getString();
if (line.charAt(0) == 'q')
found_ending = true;
else
insert line into inputs;
}

It seems that's much clearer to me than the contorted version presented
as the example. If you want it to work like the Java code, where you can
still access the "line" variable after the loop, the rearrangement is
trivial and transparent as well.

--
Darren New / San Diego, CA, USA (PST)
This octopus isn't tasty. Too many
tentacles, not enough chops.

Jul 18 '06 #666

David Hopwood

Darren New wrote:

David Hopwood wrote:

[...]

>Apparently, Hermes (at least the version of it described in that paper)
essentially forgets that is_last has been initialized at the top of the
loop, and so when it does the merge, it is merging 'not necessarily
initialized' with 'initialized'.

No, it's not that it "forgets". It's that the "insert line into inputs"
unitializes "line". Hence, "line" is only conditionally set at the
bottom of the loop, so it's only conditionally set at the top of the loop.

>This sounds like a pretty easy thing to fix to me (and maybe it was fixed
later, since there are other papers on Hermes' typestate checking that I
haven't read yet).

You simply misunderstand the "insert line into inputs" semantics.

Yes, you're right, I did misunderstand this.

Had that line actually been commented out in the Hermes code, the loop would
have compiled without a problem.

That said, in my experience, finding this sort of typestate error
invariably led me to writing clearer code.

boolean found_ending = false;
while (!found_ending) {
string line = getString();
if (line.charAt(0) == 'q')
found_ending = true;
else
insert line into inputs;
}

It seems that's much clearer to me than the contorted version presented
as the example. If you want it to work like the Java code, where you can
still access the "line" variable after the loop, the rearrangement is
trivial and transparent as well.

I agree.

--
David Hopwood <da******************@blueyonder.co.uk>

Jul 18 '06 #667

Marshall

Joachim Durchholz wrote:

Marshall schrieb:
Chris Smith wrote:
Joachim Durchholz <jo@durchholz.orgwrote:
I *think* I understand Marshall here. When you are saying "assignment",
you mean assignment to values of attributes within tuples of the cell.
When Marshall is saying "assignment", he seems to mean assigning a
completely new *table* value to a relation; i.e., wiping out the entire
contents of the relation and replacing it with a whole new set of
tuples. Your assignment is indeed less powerful than DML, whereas
Marshall's assignment is more powerful than DML.
Exactly.

Ah well. I never meant that kind of assignment.

Sorry for the confusion.
Marshall

Jul 19 '06 #668

Xah Lee

Hi all,

in the past years, i have written few hundreds of essays and tutorials
on computing. Now, i've but a index page to this collection:
http://xahlee.org/Periodic_dosage_dir/skami_prosa.html

many of these, originated from online forum. The writing style is
enticing and the content astute.

also, as many of you may know, in the past month there's some
controversy regarding me, that resulted in a small web hosting company
kicking me out, giving the legal reason of _no reason_ as by the
contract agreement (with a 30 days notice in advance). I feel there's
some public obligation to give this update. More details is at
http://xahlee.org/Periodic_dosage_di...arassment.html

i have been busy in the past month so i haven't wrote anything new
(regarding computing). Recently i started to do Java again, and will
probably soon complete the other sections of this exposition:
What are OOP's Jargons and Complexities
http://xahlee.org/Periodic_dosage_dir/t2/oop.html

Xah
xa*@xahlee.org
âˆ‘ http://xahlee.org/

Jul 21 '06 #669

Benjamin Franksen

Darren New wrote:

Chris Smith wrote:
>Specialized
language already exist that reliably (as in, all the time) move array
bounds checking to compile time;

It sounds like this means the programmer has to code up what it means to
index off an array, yes? Otherwise, I can't imagine how it would work.

x := read_integer_from_stdin();
write_to_stdout(myarray[x]);

What does the programmer have to do to implement this semantic in the
sort of language you're talking about? Surely something somewhere along
the line has to "fail" (for some meaning of failure) at run-time, yes?

You should really take a look at Epigram. One of its main features is that
it makes it possible not only to statically /check/ invariants, but also
to /establish/ them.

In your example, of course the program has to check the integer at runtime.
However, in a dependent type system, the type of the value returned from
the check-function can very well indicate whether it passed the test (i.e.
being a valid index for myarray) or not.

Ben

Jul 22 '06 #670

Similar topics