Bytes IT Community

program surgery vs. type safety

I'm doing a heart/lung bypass procedure on a largish Python
program at the moment and it prompted the thought that the
methodology I'm using would be absolutely impossible with a
more "type safe" environment like C++, C#, Java, ML, et cetera.

Basically I'm ripping apart the organs and sewing them back
together, testing all the while and the majority of the program
at the moment makes no sense in a type safe world... Nevertheless,
since I've done this many times before I'm confident that it
will rapidly all get fixed and I will ultimately come up with
something that could be transliterated into a type safe system
(with some effort). It's the intermediate development stage
which would be impossible without being able to "cheat". A type
conscious compiler would go apoplectic attempting to make sense of
the program in its present form.

If I were forced to do the transformation in a type safe way
I would not be able to do as much experimentation and backtracking
because each step between type safe snapshots that could be tested
would be too painful and expensive to throw away and repeat.
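A toy sketch of the kind of intermediate state Aaron describes (the function and data are invented for illustration): one code path has been moved to a new interface and is testable right away, while untouched callers elsewhere still use the old one. Most static checkers would refuse the mixed return types; Python happily runs whichever path you exercise:

```python
def load_record(source):
    """Mid-surgery: one path returns the new dict interface,
    the other still returns the old tuple interface."""
    if source == "new-path":
        return {"id": 1, "name": "widget"}   # new shape, under test now
    return (1, "widget")                     # old shape, not yet retrofitted

def tested_path():
    rec = load_record("new-path")            # already migrated, testable today
    return rec["name"]

def legacy_path():
    ident, name = load_record("old")         # still expects the old tuple
    return name
```

Both paths can be run and tested throughout the transition; once the experiment settles, the remaining callers get retrofitted and the tuple branch deleted.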

This musing is something of a relief for me because I've lately
been evolving towards the view that type safety is much more
important in software development than I have pretended in the past.

ah well... back to work...

-- Aaron Watters

===
You were so cool
back in high school
what happened? -- Tom Petty
Jul 18 '05 #1
12 Replies


aa***@reportlab.com (Aaron Watters) wrote in message news:<9a**************************@posting.google. com>...
If I were forced to do the transformation in a type safe way
I would not be able to do as much experimentation and backtracking
because each step between type safe snapshots that could be tested
would be too painful and expensive to throw away and repeat.


This entire post is the "musing" of a dynamic typing enthusiast and
primarily consists of simply assuming the point you're apparently
attempting to prove, that dynamic typing allows you to do what you're
doing, and static typing would not.

I am curious, however -- what are these "type unsafe" stages you have
to go through to refactor your program? I've refactored my personal
project several times and haven't yet gone through what I'd consider a
type-unsafe stage, where I'm *fundamentally* required to use
operations that aren't type-safe.

Jeremy
Jul 18 '05 #2

In article <9a**************************@posting.google.com >,
aa***@reportlab.com (Aaron Watters) wrote:
I'm doing a heart/lung bypass procedure on a largish Python
program at the moment and it prompted the thought that the
methodology I'm using would be absolutely impossible with a
more "type safe" environment like C++, C#, Java, ML, et cetera.

Basically I'm ripping apart the organs and sewing them back
together, testing all the while and the majority of the program
at the moment makes no sense in a type safe world... Nevertheless,
since I've done this many times before I'm confident that it
will rapidly all get fixed and I will ultimately come up with
something that could be transliterated into a type safe system
(with some effort). It's the intermediate development stage
which would be impossible without being able to "cheat". A type
conscious compiler would go apoplectic attempting to make sense of
the program in its present form.

If I were forced to do the transformation in a type safe way
I would not be able to do as much experimentation and backtracking
because each step between type safe snapshots that could be tested
would be too painful and expensive to throw away and repeat.

This musing is something of a relief for me because I've lately
been evolving towards the view that type safety is much more
important in software development than I have pretended in the past.


It's interesting that you lump ML in with the rest of those
languages. There are at least a few people around who reject
any thinking on type safety if it's cast in the context of
C++ or Java, because strict static typing, type inference,
and other tools you don't get with either of those languages
make them poor representatives. But ML has that stuff.

I have the sources here for a largish Python program. We have
been using it here in production for some months, and I have
a collection of changes to adapt it to our environment. A lot
of changes, by my standards - 4560 lines of context diff, plus
some new modules and programs. I have a minor upgrade from
the author, and this afternoon I finished patching in our changes.

That is, I have run the context diffs through patch, and hand
patched what it couldn't deal with. So I have one automated
structural analysis tool helping me out here - patch. I will
also be able to run them through the "compiler" to verify that
they're still syntactically correct, but that won't help much
here - patch already noticed the kind of local changes that would
make for syntactical breakage. I'm more concerned about non-local
changes - some other function that now behaves differently than
it did when we wrote a change around it.

There's no guarantee that if this program were written in ML
instead, I'd find every upgrade error, but it would be a hell
of a lot better than patch.
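Donn's worry about non-local changes can be made concrete with a small invented example: a (hypothetical) upgrade changes what a helper returns in the empty case. Every module still byte-compiles, and only running the code -- in a test, or in production -- reveals the break:

```python
def find_users(pattern, db=("alice", "bob")):
    """After the hypothetical upgrade: returns None instead of []
    when nothing matches."""
    matches = [u for u in db if pattern in u]
    return matches or None          # the upgrade changed the "no match" value

def count_matches(pattern):
    # written against the old contract, which always returned a list
    return len(find_users(pattern))
```

count_matches("ali") still works, but count_matches("zzz") now raises TypeError at run time; an ML-style checker would have rejected the mismatch before the program ever ran, which is exactly the point here.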

If I were as confident as you that ``it will rapidly all get
fixed,'' then I guess it would not be an issue. But my
experience is that too much of it won't get fixed until it
breaks in production, and I hate to mess with it for that
reason. I find Haskell and Objective CAML kind of liberating
in this way - I can go in and really tear it up, and the
compiler won't let me call it finished until all the boards
are back, wires and fixtures re-connected - stuff that I can't
see but it can.

Donn Cave, do**@u.washington.edu
Jul 18 '05 #3

Donn Cave wrote:
...
There's no guarantee that if this program were written in ML
instead, I'd find every upgrade error, but it would be a hell
of a lot better than patch.
Yep, but nowhere as good as unit-tests (and acceptance tests
and stress tests and whateveryouwant-tests). Particularly if
I could run it with DBC-constructs (preconditions, postconditions,
invariants) enabled (but I admit I've never yet done that on
any serious, large Python program -- it's more of a memory of
how we did it with C++ a few years ago -- _tests_, however, I
_am_ deadly serious about, not just "nostalgic"...:-).

If I were as confident as you that ``it will rapidly all get
fixed,'' then I guess it would not be an issue. But my
Having a good battery of tests gives me that confidence --
quite independent of the language. (Having asserts makes it
a bit better, having systematic pre/post conditions and
invariants better still -- but tests are It, all in all).

experience is that too much of it won't get fixed until it
breaks in production, and I hate to mess with it for that
reason. I find Haskell and Objective CAML kind of liberating
in this way - I can go in and really tear it up, and the
compiler won't let me call it finished until all the boards
are back, wires and fixtures re-connected - stuff that I can't
see but it can.


I wish I was still young and optimistic enough to believe that
the compiler's typechecking (even if as strong and clean as in
ML or Haskell) was able to spot "all" the whatevers. Sure,
"tests can only show the _presence_ of errors, not their
_absence_". But so can static, compiler-enforced typing -- it
can show the presence of some errors, but never the absence of
others ("oops I meant a+b, not a-b"! and the like...). In my
experience, the errors that static type-checking reliably catches
are a subset of those caught by systematic tests, particularly
with test-driven design. But systematic use of tests also
catches quite a few other kinds of errors, so, it gives me MORE
confidence than static type-checking would; and the "added
value" of static type-checking _given_ that I'll have good
batteries of tests anyway is too small for me to yearn to go
back to statically checked languages.

I _know_ I'd feel differently if e.g. management didn't LET
me do TDD and systematic testing because of deadline pressures.
But I don't think I'd stay long in a job managed like that;-).
Alex

Jul 18 '05 #4

In article <QA*******************@news2.tin.it>,
Alex Martelli <al***@aleax.it> wrote:
....
I _know_ I'd feel differently if e.g. management didn't LET
me do TDD and systematic testing because of deadline pressures.
But I don't think I'd stay long in a job managed like that;-).


Nice for you. Depending on the application, that's an option
for me too - but unlikely to be combined with the option to
write in Python. Sometimes I believe a lot of the disparity
of reactions to programming languages comes from the fact that
we live in different worlds and have no idea what it's like to
work in other collaborative models, development tools, etc.

It makes little difference here anyway, since as far as I can
tell no language remotely like Python could productively be
adapted to static typing. In my world, the applications I write
and the free open source software I get from elsewhere and have
to modify, static type analysis looks like more help than burden
to me. See you next time around.

Donn Cave, do**@u.washington.edu
Jul 18 '05 #5

tw*********@hotmail.com (Jeremy Fincher) wrote in message news:<69*************************@posting.google.c om>...
aa***@reportlab.com (Aaron Watters) wrote in message news:<9a**************************@posting.google. com>...>
I am curious, however -- what are these "type unsafe" stages you have
to go through to refactor your program? I've refactored my personal
project several times and haven't yet gone through what I'd consider a
type-unsafe stage, where I'm *fundamentally* required to use
operations that aren't type-safe.


It's a bit hard to describe, but it works something like this:
there is a problem in one particular path through the code which
requires a fundamental data structure/interface change...
to fix, try several approaches
(which each invalidate big hunks of code not on the test path)
and once you find the one that works best, THEN retrofit the
remaining code.

Now if I wrote zillions of tiny modules with zillions of tiny
functions, methods and classes in them this procedure might be type safe.
But I don't do that.
-- Aaron Watters
===
If I haven't seen as far as others it is because
giants have been standing on my shoulders. -- Gerald Sussman
Jul 18 '05 #6

aa***@reportlab.com (Aaron Watters) wrote in message news:<9a*************************@posting.google.c om>...
Now if I wrote zillions of tiny modules with zillions of tiny
functions, methods and classes in them this procedure might be type safe.
But I don't do that.


Well that's it then. Static typing isn't your problem; coupling is.

Jeremy
Jul 18 '05 #7

Alex Martelli <al***@aleax.it> wrote in message news:<QA*******************@news2.tin.it>...
Sure,
"tests can only show the _presence_ of errors, not their
_absence_". But so can static, compiler-enforced typing -- it
can show the presence of some errors, but never the absence of
others ("oops I meant a+b, not a-b"! and the like...).
But it *does* show the absence of type errors, and almost any
invariant can be coded into the Hindley-Milner typesystem. Writing to
a file opened for reading, multiplying matrices with mismatched
dimensions, and the like are all valid candidates for encoding in the
typesystem. Too many dynamic typing advocates look at a typesystem
and see only a jail (or a padded room ;)) to restrict them. A good
static typesystem isn't a jail, but the raw material for building
compiler-enforced invariants into your code. Think DBC that the
compiler won't compile unless it can *prove* the contract is never
violated.
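Python cannot enforce Jeremy's "file opened for reading" invariant at compile time, but the flavor of the idea can be sketched with invented wrapper classes that each expose only one capability (an external annotation checker such as mypy could then flag a swap of the two, and here the misuse would also fail at run time):

```python
import io

class ReadHandle:
    """Exposes only reading; there is no write method to misuse."""
    def __init__(self, f):
        self._f = f
    def read_line(self):
        return self._f.readline()

class WriteHandle:
    """Exposes only writing."""
    def __init__(self, f):
        self._f = f
    def write_line(self, s):
        self._f.write(s + "\n")

def copy_first_line(src: ReadHandle, dst: WriteHandle) -> None:
    dst.write_line(src.read_line().rstrip("\n"))

src = ReadHandle(io.StringIO("hello\nworld\n"))
out = io.StringIO()
copy_first_line(src, WriteHandle(out))   # out now holds "hello\n"
```

In ML or Haskell the equivalent of copy_first_line(dst, src) would be rejected before the program ever ran; in Python it fails only when that call is actually reached.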

The main point, however, you made yourself: tests can only show the
*presence* of errors, whereas static typing can prove their absence.
In my
experience, the errors that static type-checking reliably catches
are a subset of those caught by systematic tests, particularly
with test-driven design.
But does the compiler write the tests for you? At the very least, one
could argue that static typing saves the programmer from having to
write a significant number of tests.
But systematic use of tests also
catches quite a few other kinds of errors, so, it gives me MORE
confidence than static type-checking would; and the "added
value" of static type-checking _given_ that I'll have good
batteries of tests anyway is too small for me to yearn to go
back to statically checked languages.


You make it seem like static typing and tests are mutually exclusive.
Obviously, they're not, though admittedly when I programmed in O'Caml
I felt far less *need* for tests because I saw far fewer bugs.

Good thing, too -- the testing libraries available for O'Caml (like
most everything else for that language) are pretty nasty :)

Jeremy
Jul 18 '05 #8

On 14 Nov 2003 04:17:08 -0800, tw*********@hotmail.com (Jeremy
Fincher) wrote:

[text snipped]

The main point, however, you made yourself: tests can only show the
*presence* of errors, whereas static typing can prove their absence.


Huh? Surely you mean proving the absence of *type errors*. And of
the total number of errors, how many are type errors? Many people on
this newsgroup have argued, based on their experience, that type
errors are a tiny fraction of the errors found in programs.

With my best regards,
G. Rodrigues
Jul 18 '05 #9

Jeremy Fincher wrote:
Alex Martelli <al***@aleax.it> wrote in message
news:<QA*******************@news2.tin.it>...
Sure,
"tests can only show the _presence_ of errors, not their
_absence_". But so can static, compiler-enforced typing -- it
can show the presence of some errors, but never the absence of
others ("oops I meant a+b, not a-b"! and the like...).
But it *does* show the absence of type errors, and almost any
invariant can be coded into the Hindley-Milner typesystem. Writing to


How do most _typical_ invariants of application programs, such
as "x > y", get coded e.g. in Haskell's HM typesystem? I don't
think "almost any invariant" makes any real sense here. When I'm
doing geometry I need to ensure that any side of a triangle is
always less than the sum of the other two; when I'm computing a
payroll I need to ensure that the amount of tax to pay does not
exceed the gross on which it is to be paid; etc, etc. Simple
inequalities of this sort _are_ "most invariants" in many programs.
Others include "there exists at least one x in xs and at least
one y in ys such that f(x,y) holds" and other such combinations
of simple propositional logic with quantifiers.
a file opened for reading, multiplying matrices with improper
dimensions, etc. are both (among others) valid for encoding in the
typesystem. Too many dynamic typing advocates look at a typesystem
and see only a jail (or a padded room ;)) to restrict them. A good
And (IMHO) too many static typing advocates have a hammer (a static
typesystem) and look at the world as a collection of nails (the very
restricted kinds of invariants they actually can have that system
check at compile-time), conveniently ignoring (most likely in good
faith) the vast collection of non-nails which happen to fill, by
far, most of the real world.
static typesystem isn't a jail, but the raw material for building
compiler-enforced invariants into your code. Think DBC that the
compiler won't compile unless it can *prove* the contract is never
violated.
What I want is actually a DBC which will let me state invariants I
"know" should hold even when it's not able to check them *at run
time*, NOT one that is the very contrary -- so restrictive that it
won't let me even state things that would easily be checkable at
run time, just because it can't check them at _compile_ time.
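For instance, the triangle invariant from earlier in the post is trivially checkable at run time, which is all that is being asked for here (the function and message are invented for illustration):

```python
def make_triangle(a, b, c):
    """Runtime-checked invariant: each side must be less than
    the sum of the other two."""
    sides = sorted((a, b, c))
    assert sides[0] + sides[1] > sides[2], \
        "any side must be less than the sum of the other two"
    return (a, b, c)
```

make_triangle(3, 4, 5) passes; make_triangle(1, 2, 10) fails immediately with a clear message, even though no mainstream static typesystem could express the check.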

If I state "function f when called with parameter x will terminate
and return a result r such that pred(r,x) holds", it may well be
that even the first part can't be proven or checked without solving
the Halting Problem. I don't care: I'd like to STATE it explicitly
anyway in certain cases, and perhaps have some part of the compiler
insert a comment about what it has not been able to prove. Maybe it
IS able to prove that _IF_ f terminates _THEN_ pred(r, x) holds;
that's fine, and it might be helpful to a maintainer to read the
(very hypothetical) computer-generated comment about having proven
that but not having been able to prove the antecedent.
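The contract described here -- f returns r such that pred(r, x) holds -- can at least be stated and checked at run time with a small decorator (a minimal hypothetical sketch, not a full DBC system):

```python
import functools

def ensures(pred):
    """Attach a postcondition pred(result, *args), checked at run time."""
    def deco(f):
        @functools.wraps(f)
        def wrapper(*args):
            r = f(*args)
            assert pred(r, *args), (
                "postcondition failed for %s%r" % (f.__name__, args))
            return r
        return wrapper
    return deco

@ensures(lambda r, x: r * r <= x < (r + 1) * (r + 1))
def isqrt(x):
    """Naive integer square root; the decorator verifies every result."""
    r = 0
    while (r + 1) * (r + 1) <= x:
        r += 1
    return r
```

No compiler can prove the termination half of the contract in general, but every actual call gets the pred(r, x) half verified, which is the trade being described.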

But I'm not going to be willing to pay very much for this kind of
neat features -- either in terms of money (or equivalents thereof,
such as time) or convenience and comfort. I would no doubt feel
otherwise, if the kinds of applications I code and the environments
in which I work were vastly different. But they aren't, haven't
been for the > 1/4 century I've been programming, and aren't at all
likely to change drastically any time soon. So, I see static typing
as a theoretically-interesting field of no real applicability to my
work. If I felt otherwise about it, I would most likely be coding
in Haskell or some kind of ML, of course -- nobody's come and FORCED
me to choose a dynamically-typed language, you know?
The main point, however, you made yourself: tests can only show the
*presence* of errors, whereas static typing can prove their absence.
Static typing *cannot* "prove the absence of errors": it can prove the
absence of "static typing errors", just like a compilation phase can
prove the absence of "syntax errors", and the tests can equally well
prove the absence of the EQUALLY SPECIFIC errors they're testing for.

NONE of these techniques can "prove the absence of errors". CS
theoreticians have been touting theorem-proving techniques that IN
THEORY should be able to do so for the last, what, 40+ years? So
far, the difference between theory and practice in practice has
proven larger than the difference between theory and practice in
theory.

Incidentally, at least as much of this theoretical work has been
done with such dynamic-typing languages as Scheme as with such
static-typing languages as ML. Static typing doesn't seem to be
particularly necessary for THAT purpose, either.

In my
experience, the errors that static type-checking reliably catches
are a subset of those caught by systematic tests, particularly
with test-driven design.


But does the compiler write the tests for you? At the very least, one
could argue that static typing saves the programmer from having to
write a significant number of tests.


One could, and one would be dead wrong. That is not just my own
real-life experience -- check out Robert Martin's blog for much more
of the same, for example. Good unit tests are not type tests: they
are _functionality_ tests, and types get exercised as a side effect.
(This might break down in a weakly-typed language, such as Forth or
BCPL: I don't have enough practical experience using TDD with such
weakly-typed -- as opposed to dynamically strongly-typed -- languages
to know, and as I'm not particularly interested in switching over to
any such language at the moment, that's pretty academic to me now).
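A minimal illustration of the claim that functionality tests exercise types only as a side effect (the example function is invented):

```python
def mean(xs):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(xs) / len(xs)

def test_mean():
    # These assertions check behavior; if mean() ever returned None
    # or a string, the same comparisons would fail anyway, so the
    # "type test" comes along for free.
    assert mean([1, 2, 3]) == 2
    assert abs(mean([0.5, 1.5]) - 1.0) < 1e-9

test_mean()
```

No assertion here mentions a type, yet a type error in mean() could not slip past them.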

But systematic use of tests also
catches quite a few other kinds of errors, so, it gives me MORE
confidence than static type-checking would; and the "added
value" of static type-checking _given_ that I'll have good
batteries of tests anyway is too small for me to yearn to go
back to statically checked languages.


You make it seem like static typing and tests are mutually exclusive.


No: the fact that the added value is too small does not mean it's zero,
i.e., that it would necessarly be "irrational" to use both if the
costs were so tiny as to be even smaller than the added value. Say
that I'm typing in some mathematical formulas from one handbook and
checking them on a second one; it's not necessarily _irrational_ to
triple check on a third handbook just in case both of the first should
happen to have the same error never noticed before -- it's just such
a tiny added value that you have to be valuing your time pretty low,
compared to the tiny probability of errors slipping by otherwise, to
make this a rational strategy. There may be cases of extremely costly
errors and/or an extremely low-paid workforce in which it could be so
(e.g., if the N-fold checking were fully automated and thus cost only
dirt-cheap computer time, and NO human time at all, then why not).

In practice, I see test-driven design practiced much more widely by
users of dynamically typed languages (Smalltalk, Python, Ruby, &c),
maybe in part because of the psychological effect you mention...:
Obviously, they're not, though admittedly when I programmed in O'Caml
I felt far less *need* for tests because I saw far fewer bugs.
....but also, no doubt, because for most people using dynamically
typed languages is so much faster and more productive, that TDD is
a breeze. The scarcity of TDD in (e.g.) O'Caml then in turn produces:
Good thing, too -- the testing libraries available for O'Caml (like
most everything else for that language) are pretty nasty :)


....this kind of effect further discouraging sound testing practices.

(There are exceptions -- for reasons that escape me, many Java shops
appear to have decent testing practices, compared to C++ shops -- I
don't know of any FP-based shop on which to compare, though).
Alex

Jul 18 '05 #10

On 14 Nov 2003 04:17:08 -0800, Jeremy Fincher wrote:
Alex Martelli <al***@aleax.it> wrote in message news:<QA*******************@news2.tin.it>...
Sure,
"tests can only show the _presence_ of errors, not their
_absence_". But so can static, compiler-enforced typing -- it
can show the presence of some errors, but never the absence of
others ("oops I meant a+b, not a-b"! and the like...).


But it *does* show the absence of type errors,


Not all the time. Casting (a la C, C++, Java) allows the programmer
to say "silly compiler, you don't know what you're saying" (usually,
it also converts int<->float and such, but apart from that). That
results in a runtime type error the compiler didn't detect. A Java
runtime will detect that later, but C and C++ will just behave wrong.

-D

--
What good is it for a man to gain the whole world, yet forfeit his
soul? Or what can a man give in exchange for his soul?
Mark 8:36-37

www: http://dman13.dyndns.org/~dman/ jabber: dm**@dman13.dyndns.org
Jul 18 '05 #11

aa***@reportlab.com wrote:
I'm doing a heart/lung bypass procedure on a largish Python
program at the moment and it prompted the thought that the
methodology I'm using would be absolutely impossible with a
more "type safe" environment like C++, C#, java, ML etcetera.


Python is so wonderful for this that in a few cases, I have
actually converted code *to* Python for the express purpose
of refactoring it, even when it has to be converted _back_
to use in the final system! (Most of the code I have done
this for is control code which runs on a DSP inside a modem.)

Most people who have done serious refactoring would probably
agree that the number of interrelated items a human can consider
at a single time is quite small (probably because the number of
relationships between the items goes up roughly as the square of
the number of items). For this reason, I have found that for
complicated systems, it can be extremely useful to take very
tiny, iterative steps when refactoring (especially when full,
robust unit tests are not available on the original code).

Tiny iterative steps can be examined and reasoned about in
isolation very successfully, in cases where the sum total
of the changes is beyond this human's comprehension capacity.

However, in some cases (as in perhaps the heart/lung scenario
discussed by Aaron), code requires a fundamental shift in its
structure that is impossible (or at least impractical) to
capture with small iterative steps.

Even when faced with this scenario, I try to design my _process_
for refactoring _this particular piece of code_ in such a
fashion that the scope of this fundamental shift is as small
as possible, e.g. by taking lots of small steps before making
the fundamental shift, and lots of small steps after making it.

So (if you're still with me :) the most interesting thing about
the process is: The actual conversion of source code to and
from Python can be among the tiniest of iterative steps!

Treating a code conversion to Python as a tiny step in a
refactoring process allows all the hard work of the fundamental
shift to be done _in Python_, which gives you access to all
the wonderful facilities of the language for designing and
testing your new code. The first runs of your unit tests will
basically ensure that you have successfully captured the essence
of the original code during the conversion process.

Python is so malleable that I have very successfully used
it to "look like" C and a few different kinds of assembly
language. It particularly shines (e.g. in comparison to
C) for modelling assembly language. Have a function which
returns stuff in registers AX and BX? No problem:

def myfunc():
    ...
    return ax, bx

...

ax, bx = myfunc()

Some preexisting code will not convert as nicely as other
code to Python, but this is not a huge problem because, as
described above, you can immediately write Python unit tests
to verify that you have accurately captured the existing code.

Conversion back to the target system can be slightly more
problematic in that it may be impossible to unit-test the
software in its native environment. The good news here is
that it is almost always possible (in my experience) to make
Python code look arbitrarily close to the new assembly
language I am authoring.

In fact, for the conversion back to assembly language, I tend
to iterate on both the Python and assembly versions simultaneously.
I'll start coding the assembly language to look like the Python,
then realize that I have a construct which doesn't flow very
well in assembler, go back and iterate on (and unit test!) the
Python again to make it look more like the final result, and then
recapture those changes in assembler.

At the end of the process, I will have a fully tested Python version
(with a unit test for subsequent changes) and some assembler which
almost any programmer would agree looks _just like_ the Python (which
admittedly doesn't look like very good Python any more :)

In some cases I just slap the assembly language back into the
system and run system tests on it; in other cases I have used
the Python unit tests to generate test data which can be fed
to a test harness for the assembly language version in a
simulator. (In either case, I will have finished more quickly and
have more faith in the resultant code than if I had just tried
to refactor in the original language, using the available tools.)
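One way that test-data generation step might look (the routine and the vector format are invented for illustration): run the tested Python reference model over chosen inputs and emit input/output vectors that a harness can replay against the assembly version in the simulator:

```python
def checksum(data):
    """The Python reference model of the routine being ported."""
    acc = 0
    for b in data:
        acc = (acc + b) & 0xFFFF    # 16-bit accumulator, as on the DSP
    return acc

def emit_vectors(cases):
    """Render "input -> expected output" lines for a replay harness."""
    lines = []
    for data in cases:
        lines.append("%s -> %d" % (" ".join(str(b) for b in data),
                                   checksum(data)))
    return "\n".join(lines)
```

For example, emit_vectors([[1, 2, 3], [255, 255]]) yields the lines "1 2 3 -> 6" and "255 255 -> 510", which the harness feeds to the simulator and compares against the assembly version's outputs.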

In the Python version (which doesn't run in a real system at speed),
I am prone to inserting the sort of assertions which Alex asserts
(heh -- got you for yesterday's "reduce") a real design by contract
system would easily enforce, e.g. assert x > y, "The frobowitz fritzed out!"

Given the fact that assembly language is basically untyped and the
fact that I can make the corresponding Python arbitrarily similar
to the assembly language while fully testing and instrumenting it,
I could argue that, for my purposes, the _lack_ of static typing
in Python _contributes heavily_ to its viability as a code refactoring
tool, which seems to parallel your experience.

Regards,
Pat
Jul 18 '05 #12

dm**@dman13.dyndns.org wrote:
On 14 Nov 2003 04:17:08 -0800, Jeremy Fincher wrote:
Alex Martelli <al***@aleax.it> wrote in message
news:<QA*******************@news2.tin.it>...
Sure,
"tests can only show the _presence_ of errors, not their
_absence_". But so can static, compiler-enforced typing -- it
can show the presence of some errors, but never the absence of
others ("oops I meant a+b, not a-b"! and the like...).


But it *does* show the absence of type errors,


Not all the time. Casting (a la C, C++, Java) allows the programmer
to say "silly compiler, you don't know what you're saying" (usually,
it also converts int<->float and such, but apart from that). That
results in a runtime type error the compiler didn't detect. A Java
runtime will detect that later, but C and C++ will just behave wrong.


Jeremy was arguing for a _GOOD_ static typing system, as in ML or Haskell,
not the travesty thereof found in those other languages.
I do not think I've seen anybody defending the "staticoid almost-typing"
approach in this thread.
Alex

Jul 18 '05 #13
