program surgery vs. type safety

Aaron Watters

I'm doing a heart/lung bypass procedure on a largish Python
program at the moment and it prompted the thought that the
methodology I'm using would be absolutely impossible with a
more "type safe" environment like C++, C#, java, ML etcetera.

Basically I'm ripping apart the organs and sewing them back
together, testing all the while and the majority of the program
at the moment makes no sense in a type safe world... Nevertheless,
since I've done this many times before I'm confident that it
will rapidly all get fixed and I will ultimately come up with
something that could be transliterated into a type safe system
(with some effort). It's the intermediate development stage
which would be impossible without being able to "cheat". A type
conscious compiler would go apoplectic attempting to make sense of
the program in its present form.

If I were forced to do the transformation in a type safe way
I would not be able to do as much experimentation and backtracking
because each step between type safe snapshots that could be tested
would be too painful and expensive to throw away and repeat.

This musing is something of a relief for me because I've lately
been evolving towards the view that type safety is much more
important in software development than I have pretended in the past.

ah well... back to work...

-- Aaron Watters

===
You were so cool
back in highschool
what happened? -- Tom Petty

Jul 18 '05 #1

Jeremy Fincher

aa***@reportlab.com (Aaron Watters) wrote in message news:<9a**************************@posting.google.com>...

> If I were forced to do the transformation in a type safe way
> I would not be able to do as much experimentation and backtracking
> because each step between type safe snapshots that could be tested
> would be too painful and expensive to throw away and repeat.

This entire post is the "musing" of a dynamic typing enthusiast and
consists primarily of simply assuming the point you're apparently
attempting to prove: that dynamic typing allows you to do what you're
doing, and static typing would not.

I am curious, however -- what are these "type unsafe" stages you have
to go through to refactor your program? I've refactored my personal
project several times and haven't yet gone through what I'd consider a
type-unsafe stage, where I'm *fundamentally* required to use
operations that aren't type-safe.

Jeremy

Jul 18 '05 #2

Donn Cave

In article <9a**************************@posting.google.com>,
aa***@reportlab.com (Aaron Watters) wrote:

> I'm doing a heart/lung bypass procedure on a largish Python
> program at the moment and it prompted the thought that the
> methodology I'm using would be absolutely impossible with a
> more "type safe" environment like C++, C#, java, ML etcetera.
>
> Basically I'm ripping apart the organs and sewing them back
> together, testing all the while and the majority of the program
> at the moment makes no sense in a type safe world... Nevertheless,
> since I've done this many times before I'm confident that it
> will rapidly all get fixed and I will ultimately come up with
> something that could be transliterated into a type safe system
> (with some effort). It's the intermediate development stage
> which would be impossible without being able to "cheat". A type
> conscious compiler would go apoplectic attempting to make sense of
> the program in its present form.
>
> If I were forced to do the transformation in a type safe way
> I would not be able to do as much experimentation and backtracking
> because each step between type safe snapshots that could be tested
> would be too painful and expensive to throw away and repeat.
>
> This musing is something of a relief for me because I've lately
> been evolving towards the view that type safety is much more
> important in software development than I have pretended in the past.

It's interesting that you lump ML in with the rest of those
languages. There are at least a few people around who reject
any thinking on type safety if it's cast in the context of
C++ or Java, because those languages lack the strict static
typing, type inference and other tools that would make them
fair representatives. But ML has that stuff.

I have the sources here for a largeish Python program. We have
been using it here in production for some months, and I have
a collection of changes to adapt it to our environment. A lot
of changes, by my standards - 4560 lines of context diff, plus
some new modules and programs. I have a minor upgrade from
the author, and this afternoon I finished patching in our changes.

That is, I have run the context diffs through patch, and hand
patched what it couldn't deal with. So I have one automated
structural analysis tool helping me out here - patch. I will
also be able to run them through the "compiler" to verify that
they're still syntactically correct, but that won't help much
here - patch already noticed the kind of local changes that would
make for syntactical breakage. I'm more concerned about non-local
changes - some other function that now behaves differently than
it did when we wrote a change around it.
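
A minimal sketch of the gap described above, assuming a hypothetical patched module in which an upstream helper was renamed (the names normalize_name, normalise and local_change are invented for illustration). Python's built-in compile() stands in for the "compiler" pass: it accepts the source because the syntax is fine, and the stale call only fails once it actually runs.

```python
# Hypothetical patched module: our local change survived the patch, but the
# helper it relies on was renamed upstream (an assumed scenario).
PATCHED_SOURCE = '''
def normalize_name(raw):
    return raw.strip().lower()

def local_change(raw):
    # Our local patch still calls the old helper name.
    return normalise(raw)
'''

# Step 1: the "compiler" pass. compile() only checks syntax, so it succeeds.
code = compile(PATCHED_SOURCE, "<patched module>", "exec")

# Step 2: load the module. Defining the functions also succeeds.
namespace = {}
exec(code, namespace)

# Step 3: the non-local breakage shows up only when the stale call runs.
try:
    namespace["local_change"]("  Donn ")
    outcome = "no error"
except NameError as exc:
    outcome = "NameError: " + str(exc)

print(outcome)
```

Nothing before step 3 hints at the problem, which is the kind of non-local breakage that neither patch nor a syntax check can see.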

There's no guarantee that if this program were written in ML
instead, I'd find every upgrade error, but it would be a hell
of a lot better than patch.

If I were as confident as you that ``it will rapidly all get
fixed,'' then I guess it would not be an issue. But my
experience is that too much of it won't get fixed until it
breaks in production, and I hate to mess with it for that
reason. I find Haskell and Objective CAML kind of liberating
in this way - I can go in and really tear it up, and the
compiler won't let me call it finished until all the boards
are back, wires and fixtures re-connected - stuff that I can't
see but it can.

Donn Cave, do**@u.washington.edu

Jul 18 '05 #3

Alex Martelli

Donn Cave wrote:
...

> There's no guarantee that if this program were written in ML
> instead, I'd find every upgrade error, but it would be a hell
> of a lot better than patch.

Yep, but nowhere near as good as unit-tests (and acceptance tests
and stress tests and whateveryouwant-tests). Particularly if
I could run it with DBC-constructs (preconditions, postconditions,
invariants) enabled (but I admit I've never yet done that on
any serious, large Python program -- it's more of a memory of
how we did it with C++ a few years ago -- _tests_, however, I
_am_ deadly serious about, not just "nostalgic" ...:-).

> If I were as confident as you that ``it will rapidly all get
> fixed,'' then I guess it would not be an issue. But my

Having a good battery of tests gives me that confidence --
quite independent of the language. (Having asserts makes it
a bit better, having systematic pre/post conditions and
invariants better still -- but tests are It, all in all).

> experience is that too much of it won't get fixed until it
> breaks in production, and I hate to mess with it for that
> reason. I find Haskell and Objective CAML kind of liberating
> in this way - I can go in and really tear it up, and the
> compiler won't let me call it finished until all the boards
> are back, wires and fixtures re-connected - stuff that I can't
> see but it can.

I wish I was still young and optimistic enough to believe that
the compiler's typechecking (even if as strong and clean as in
ML or Haskell) was able to spot "all" the whatevers. Sure,
"tests can only show the _presence_ of errors, not their
_absence_". But so can static, compiler-enforced typing -- it
can show the presence of some errors, but never the absence of
others ("oops I meant a+b, not a-b"! and the like...). In my
experience, the errors that static type-checking reliably catches
are a subset of those caught by systematic tests, particularly
with test-driven design. But systematic use of tests also
catches quite a few other kinds of errors, so, it gives me MORE
confidence than static type-checking would; and the "added
value" of static type-checking _given_ that I'll have good
batteries of tests anyway is too small for me to yearn to go
back to statically checked languages.

I _know_ I'd feel differently if e.g. management didn't LET
me do TDD and systematic testing because of deadline pressures.
But I don't think I'd stay long in a job managed like that;-).
Alex

Jul 18 '05 #4

Donn Cave

In article <QA*******************@news2.tin.it>,
Alex Martelli <al***@aleax.it> wrote:
....

> I _know_ I'd feel differently if e.g. management didn't LET
> me do TDD and systematic testing because of deadline pressures.
> But I don't think I'd stay long in a job managed like that;-).

Nice for you. Depending on the application, that's an option
for me too - but unlikely to be combined with the option to
write in Python. Sometimes I believe a lot of the disparity
of reactions to programming languages comes from the fact that
we live in different worlds and have no idea what it's like to
work in other collaborative models, development tools, etc.

It makes little difference here anyway, since as far as I can
tell no language remotely like Python could productively be
adapted to static typing. In my world, the applications I write
and the free open source software I get from elsewhere and have
to modify, static type analysis looks like more help than burden
to me. See you next time around.

Donn Cave, do**@u.washington.edu

Jul 18 '05 #5

Aaron Watters

tw*********@hotmail.com (Jeremy Fincher) wrote in message news:<69*************************@posting.google.com>...

> aa***@reportlab.com (Aaron Watters) wrote in message news:<9a**************************@posting.google.com>...
> I am curious, however -- what are these "type unsafe" stages you have
> to go through to refactor your program? I've refactored my personal
> project several times and haven't yet gone through what I'd consider a
> type-unsafe stage, where I'm *fundamentally* required to use
> operations that aren't type-safe.

It's a bit hard to describe, but it works something like this:
there is a problem in one particular path through the code which
requires a fundamental data structure/interface change.
To fix it, try several approaches
(which each invalidate big hunks of code not on the test path)
and once you find the one that works best, THEN retrofit the
remaining code.
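
A minimal sketch of that intermediate stage, with invented names (make_record, total, report) and an assumed change from tuple records to dict records. Only the path under test has been adapted; the rest is knowingly stale and would not survive a static type check, yet the module still loads and the test path can be run and thrown away cheaply.

```python
# Experimental representation: a record is now a dict, not a tuple.
def make_record(name, amount):
    return {"name": name, "amount": amount}


# On the test path: already adapted to the new representation.
def total(records):
    return sum(r["amount"] for r in records)


# Not on the test path: still written for the old (name, amount) tuples.
# It is wrong right now, but nothing stops the module from loading.
def report(records):
    return ["%s: %d" % (r[0], r[1]) for r in records]


records = [make_record("a", 2), make_record("b", 3)]

# The path under test runs and can be checked immediately.
assert total(records) == 5

# The stale code fails only if it is actually called.
try:
    report(records)
    stale_code_works = True
except KeyError:
    stale_code_works = False

print("stale code works:", stale_code_works)
```

Once one representation wins, report() and its siblings get retrofitted; until then they cost nothing.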

Now if I wrote zillions of tiny modules with zillions of tiny
functions, methods and classes in them this procedure might be type safe.
But I don't do that.
-- Aaron Watters
===
If I haven't seen as far as others it is because
giants have been standing on my shoulders. -- Gerald Sussman

Jul 18 '05 #6

Jeremy Fincher

aa***@reportlab.com (Aaron Watters) wrote in message news:<9a*************************@posting.google.com>...

> Now if I wrote zillions of tiny modules with zillions of tiny
> functions, methods and classes in them this procedure might be type safe.
> But I don't do that.

Well that's it then. Static typing isn't your problem; coupling is.

Jeremy

Jul 18 '05 #7

Jeremy Fincher

Alex Martelli <al***@aleax.it> wrote in message news:<QA*******************@news2.tin.it>...

> Sure,
> "tests can only show the _presence_ of errors, not their
> _absence_". But so can static, compiler-enforced typing -- it
> can show the presence of some errors, but never the absence of
> others ("oops I meant a+b, not a-b"! and the like...).

But it *does* show the absence of type errors, and almost any
invariant can be coded into the Hindley-Milner typesystem. Writing to
a file opened for reading, multiplying matrices with improper
dimensions, etc. are both (among others) valid for encoding in the
typesystem. Too many dynamic typing advocates look at a typesystem
and see only a jail (or a padded room ;)) to restrict them. A good
static typesystem isn't a jail, but the raw material for building
compiler-enforced invariants into your code. Think DBC that the
compiler won't compile unless it can *prove* the contract is never
violated.
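
A rough sketch of the "file opened for reading" idea, written in Python with type hints rather than in an ML-family language, so it is only an approximation of what a Hindley-Milner system would enforce. All names here (ReadMode, WriteMode, Handle, open_for_reading, open_for_writing, write_line) are hypothetical, and the handle is a toy in-memory stand-in for a file. The mode is carried as a type parameter that has no run-time effect; an external static checker such as mypy is expected to reject the last call, while Python itself lets it through.

```python
from typing import Generic, List, TypeVar


class ReadMode:
    """Marker type: handle opened for reading (hypothetical)."""


class WriteMode:
    """Marker type: handle opened for writing (hypothetical)."""


Mode = TypeVar("Mode")


class Handle(Generic[Mode]):
    # Toy in-memory stand-in for a file; Mode exists only for the checker.
    def __init__(self, name: str) -> None:
        self.name = name
        self.lines: List[str] = []


def open_for_reading(name: str) -> "Handle[ReadMode]":
    return Handle(name)


def open_for_writing(name: str) -> "Handle[WriteMode]":
    return Handle(name)


def write_line(handle: "Handle[WriteMode]", line: str) -> None:
    handle.lines.append(line)


out = open_for_writing("log.txt")
write_line(out, "ok")

src = open_for_reading("data.txt")
# A static checker is expected to flag the next call, because
# Handle[ReadMode] is not Handle[WriteMode]. Python does not enforce
# annotations, so at run time the call simply goes through.
write_line(src, "should not be allowed")
```

In a language where the check is part of compilation, the last line would stop the build instead of being a report from a separate tool.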

The main point, however, you made yourself: tests can only show the
*presence* of errors, whereas static typing can prove their absence.

> In my
> experience, the errors that static type-checking reliably catches
> are a subset of those caught by systematic tests, particularly
> with test-driven design.

But does the compiler write the tests for you? At the very least, one
could argue that static typing saves the programmer from having to
write a significant number of tests.

> But systematic use of tests also
> catches quite a few other kinds of errors, so, it gives me MORE
> confidence than static type-checking would; and the "added
> value" of static type-checking _given_ that I'll have good
> batteries of tests anyway is too small for me to yearn to go
> back to statically checked languages.

You make it seem like static typing and tests are mutually exclusive.
Obviously, they're not, though admittedly when I programmed in O'Caml
I felt far less *need* for tests because I saw far fewer bugs.

Good thing, too -- the testing libraries available for O'Caml (like
most everything else for that language) are pretty nasty :)

Jeremy

Jul 18 '05 #8

Gonçalo Rodrigues

On 14 Nov 2003 04:17:08 -0800, tw*********@hotmail.com (Jeremy
Fincher) wrote:

[text snipped]

> The main point, however, you made yourself: tests can only show the
> *presence* of errors, whereas static typing can prove their absence.

Huh? Surely you mean proving the absence of *type errors*. And on the
total amount of errors how many are type errors? Many people on this
ng have argued based on their experience that type errors are a tiny
fraction of the errors found in programs.

With my best regards,
G. Rodrigues

Jul 18 '05 #9

Alex Martelli

Jeremy Fincher wrote:

> Alex Martelli <al***@aleax.it> wrote in message
> news:<QA*******************@news2.tin.it>...
> > Sure,
> > "tests can only show the _presence_ of errors, not their
> > _absence_". But so can static, compiler-enforced typing -- it
> > can show the presence of some errors, but never the absence of
> > others ("oops I meant a+b, not a-b"! and the like...).
> But it *does* show the absence of type errors, and almost any
> invariant can be coded into the Hindley-Milner typesystem. Writing to

How do most _typical_ invariants of application programs, such
as "x > y", get coded e.g. in Haskell's HM typesystem? I don't
think "almost any invariant" makes any real sense here. When I'm
doing geometry I need to ensure that any side of a triangle is
always less than the sum of the other two; when I'm computing a
payroll I need to ensure that the amount of tax to pay does not
exceed the gross on which it is to be paid; etc, etc. Simple
inequalities of this sort _are_ "most invariants" in many programs.
Others include "there exists at least one x in xs and at least
one y in ys such that f(x,y) holds" and other such combinations
of simple propositional logic with quantifiers.
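
The invariants listed above are easy to state as run-time checks. A minimal sketch with invented function names (check_triangle, check_payroll, exists_pair), using plain assert statements; note that Python skips assert statements when run with -O.

```python
def check_triangle(a, b, c):
    # Each side must be strictly shorter than the sum of the other two.
    assert a < b + c and b < a + c and c < a + b, "not a triangle"
    return (a, b, c)


def check_payroll(gross, tax):
    # The tax due must never exceed the gross on which it is paid.
    assert 0 <= tax <= gross, "tax exceeds gross"
    return gross - tax


def exists_pair(xs, ys, f):
    # "There exists at least one x in xs and at least one y in ys
    # such that f(x, y) holds."
    return any(f(x, y) for x in xs for y in ys)


check_triangle(3, 4, 5)
net = check_payroll(1000, 250)
found = exists_pair([1, 2], [3, 4], lambda x, y: x + y == 6)
print(net, found)
```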

> a file opened for reading, multiplying matrices with improper
> dimensions, etc. are both (among others) valid for encoding in the
> typesystem. Too many dynamic typing advocates look at a typesystem
> and see only a jail (or a padded room ;)) to restrict them. A good

And (IMHO) too many static typing advocates have a hammer (a static
typesystem) and look at the world as a collection of nails (the very
restricted kinds of invariants they actually can have that system
check at compile-time), conveniently ignoring (most likely in good
faith) the vast collection of non-nails which happen to fill, by
far, most of the real world.

> static typesystem isn't a jail, but the raw material for building
> compiler-enforced invariants into your code. Think DBC that the
> compiler won't compile unless it can *prove* the contract is never
> violated.

What I want is actually a DBC which will let me state invariants I
"know" should hold even when it's not able to check them *at run
time*, NOT one that is the very contrary -- so restrictive that it
won't let me even state things that would easily be checkable at
run time, just because it can't check them at _compile_ time.

If I state "function f when called with parameter x will terminate
and return a result r such that pred(r,x) holds", it may well be
that even the first part can't be proven or checked without solving
the Halting Problem. I don't care, I'd like to STATE it explicitly
anyway in certain cases, perhaps have some part of the compiler
insert a comment about what it's not been able to prove (maybe it
IS able to prove that _IF_ f terminates _THEN_ pred(r, x) holds,
that's fine, it might be helpful to a maintainer to read the (very
hypothetical) computer-generated comment about having proven that
but not having been able to prove the antecedent).
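
The checkable half of such a statement can be sketched as a decorator. This is an illustration only: ensure, isqrt and bad_isqrt are invented names, not an existing DBC library. It checks pred(r, x) after f returns; the "will terminate" half stays a stated but unchecked claim.

```python
import functools


def ensure(pred):
    """Check pred(result, *args) after the call; raise if it fails."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args):
            result = func(*args)
            if not pred(result, *args):
                raise AssertionError(
                    "postcondition of %s violated" % func.__name__)
            return result
        return wrapper
    return decorate


@ensure(lambda r, x: r >= 0 and r * r <= x < (r + 1) * (r + 1))
def isqrt(x):
    # Integer square root by simple search; assumes x >= 0.
    r = 0
    while (r + 1) * (r + 1) <= x:
        r += 1
    return r


@ensure(lambda r, x: r * r <= x)
def bad_isqrt(x):
    # Deliberately wrong for x > 1, to show the check firing.
    return x


print(isqrt(10))
try:
    bad_isqrt(4)
    violated = False
except AssertionError:
    violated = True
print("postcondition violated:", violated)
```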

But I'm not going to be willing to pay very much for this kind of
neat feature -- either in terms of money (or equivalents thereof,
such as time) or convenience and comfort. I would no doubt feel
otherwise, if the kinds of applications I code and the environments
in which I work were vastly different. But they aren't, haven't
been for the > 1/4 century I've been programming, and aren't at all
likely to change drastically any time soon. So, I see static typing
as a theoretically-interesting field of no real applicability to my
work. If I felt otherwise about it, I would most likely be coding
in Haskell or some kind of ML, of course -- nobody's come and FORCED
me to choose a dynamically-typed language, you know?

> The main point, however, you made yourself: tests can only show the
> *presence* of errors, whereas static typing can prove their absence.

Static typing *cannot* "prove the absence of errors": it can prove the
absence of "static typing errors", just like a compilation phase can
prove the absence of "syntax errors", and the tests can equally well
prove the absence of the EQUALLY SPECIFIC errors they're testing for.

NONE of these techniques can "prove the absence of errors". CS
theoreticians have been touting theorem-proving techniques that IN
THEORY should be able to do so for the last, what, 40+ years? So
far, the difference between theory and practice in practice has proven
larger than the difference between practice and theory in theory.

Incidentally, at least as much of this theoretical work has been
done with such dynamic-typing languages as Scheme as with such
static-typing languages as ML. Static typing doesn't seem to be
particularly necessary for THAT purpose, either.

> > In my
> > experience, the errors that static type-checking reliably catches
> > are a subset of those caught by systematic tests, particularly
> > with test-driven design.
>
> But does the compiler write the tests for you? At the very least, one
> could argue that static typing saves the programmer from having to
> write a significant number of tests.

One could, and one would be dead wrong. That is not just my own
real-life experience -- check out Robert Martin's blog for much more
of the same, for example. Good unit tests are not type tests: they
are _functionality_ tests, and types get exercised as a side effect.
(This might break down in a weakly-typed language, such as Forth or
BCPL: I don't have enough practical experience using TDD with such
weakly-typed -- as opposed to dynamically strongly-typed -- languages
to know, and as I'm not particularly interested in switching over to
any such language at the moment, that's pretty academic to me now).

> > But systematic use of tests also
> > catches quite a few other kinds of errors, so, it gives me MORE
> > confidence than static type-checking would; and the "added
> > value" of static type-checking _given_ that I'll have good
> > batteries of tests anyway is too small for me to yearn to go
> > back to statically checked languages.
>
> You make it seem like static typing and tests are mutually exclusive.

No: the fact that the added value is too small does not mean it's zero,
i.e., that it would necessarily be "irrational" to use both if the
costs were so tiny as to be even smaller than the added value. Say
that I'm typing in some mathematical formulas from one handbook and
checking them on a second one; it's not necessarily _irrational_ to
triple check on a third handbook just in case both of the first should
happen to have the same error never noticed before -- it's just such
a tiny added value that you have to be valuing your time pretty low,
compared to the tiny probability of errors slipping by otherwise, to
make this a rational strategy. There may be cases of extremely costly
errors and/or extremely low-paid workforce in which it could be so (e.g.,
if the N-uple checking was fully automated and thus only cost dirt-cheap
computer-time, NO human time at all, then, why not).

In practice, I see test-driven design practiced much more widely by
users of dynamically typed languages (Smalltalk, Python, Ruby, &c),
maybe in part because of the psychological effect you mention...:

> Obviously, they're not, though admittedly when I programmed in O'Caml
> I felt far less *need* for tests because I saw far fewer bugs.

....but also, no doubt, because for most people using dynamically
typed languages is so much faster and more productive, that TDD is
a breeze. The scarcity of TDD in (e.g.) O'Caml then in turn produces:

> Good thing, too -- the testing libraries available for O'Caml (like
> most everything else for that language) are pretty nasty :)

....this kind of effect further discouraging sound testing practices.

(There are exceptions -- for reasons that escape me, many Java shops
appear to have decent testing practices, compared to C++ shops -- I
don't know of any FP-based shop on which to compare, though).
Alex

Jul 18 '05 #10
