
Code correctness, and testing strategies

Hi list.

What strategies do you use to ensure correctness of new code?

Specifically, if you've just written 100 new lines of Python code, then:

1) How do you test the new code?
2) How do you ensure that the code will work correctly in the future?

Short version:

For (1) I thoroughly (manually) test code as I write it, before
checking in to version control.

For (2) I code defensively.

Long version:

For (2), I have a lot of error checks, similar to contracts (post &
pre-conditions, invariants). I've read about Python libs which help
formalize this[1][2], but I don't see a great advantage over using
regular ifs and asserts (and a few disadvantages, like additional
complexity). Simple ifs are good enough for Python built-in libs :-)

[1] PEP 316: http://www.python.org/dev/peps/pep-0316/
[2] An implementation:
http://aspn.activestate.com/ASPN/Coo.../Recipe/436834

An aside: What is the correct situation in which to use assert
statements in Python? I'd like to use them for enforcing 'contracts'
because they're quick to type, but from the docs:

"Assert statements are a convenient way to insert debugging assertions
into a program:"

So to me it sounds like 'assert' statements are only useful while
debugging, and not when an app is live, where you would also
(especially!) want it to enforce contracts. Also, asserts can be
removed with -O, and you only ever get AssertionError, where
ValueError and the like might be more appropriate.
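
To illustrate what I mean (a toy example; the function and its checks are made up): explicit `if`/`raise` survives -O and gives a meaningful exception type, while `assert` is better kept for internal sanity checks.

```python
def withdraw(balance, amount):
    # Contract on the caller: enforce with a real exception, so the
    # check survives `python -O` and raises a specific, catchable type.
    if amount <= 0:
        raise ValueError("amount must be positive")
    if amount > balance:
        raise ValueError("amount exceeds balance")
    new_balance = balance - amount
    # Internal sanity check: assert is appropriate here, since it can
    # only fire if this function itself is buggy (and it is stripped
    # under -O, which is fine for a pure self-check).
    assert 0 <= new_balance <= balance
    return new_balance
```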

As for point 1 (how do you test the new code?):

I like the idea of automated unit tests. However, in practice I find
they take a long time to write and test, especially if you want to
have good coverage (not just lines, but also possible logic branches).

So instead, I prefer to thoroughly test new code manually, and only
then check in to version control. I feel that if you are disciplined,
then unit tests are mainly useful for:

1) Maintenance of legacy code
2) More than 1 person working on a project

One recent personal example:

My workstation is a Debian Unstable box. I like to upgrade regularly
and try out new library & app versions. Usually this doesn't cause
major problems. One exception is sqlalchemy. Its API seems to change
every few months, causing warnings and breakage in code which used the
old API. This happened regularly enough that for one project I spent a
day adding unit tests for the ORM-using code, and getting the unit
tests up to 100% coverage. These tests should allow me to quickly
catch and fix all sqlalchemy API breakages in my app in the future.
The breakages also make me want to stop using ORM entirely, but it
would take longer to switch to SQL-only code than to keep the unit
tests up to date :-)

My 'test code thoroughly before checkin' methodology is as follows:

1) Add "raise 'UNTESTED'" lines to the top of every function
2) Run the script
3) Look where the script terminated
4) Add print lines just before the exception to check the variable values
5) Re-run and check that the variables have the expected values.
6) Remove the print and 'raise "UNTESTED"' lines
7) Add liberal 'raise "UNTESTED"' lines to the body of the function.
8.1) For short funcs, before every line (if it seems necessary)
8.2) For longer funcs, before and after each logic entry/exit point
(blocks, exits, returns, throws, etc):

eg, before:

if A():
    B()
    C()
    D()
E()

after:

raise 'UNTESTED'
if A():
    raise 'UNTESTED'
    B()
    C()
    D()
    raise 'UNTESTED'
raise 'UNTESTED'
E()

8.2.1) Later I add "raise 'UNTESTED'" lines before each line in the
blocks also, if it seems necessary.

9) Repeat steps 2 to 8 until the script stops throwing exceptions
10) Check for 'raise "UNTESTED"' lines still in the script
11) Cause those sections of code to be run also (sometimes I need to
temporarily set vars to impossible values inside the script, since the
logic will never run otherwise)

And here is one of my biggest problems with unit tests. How do you unit
test code which almost never runs? The only easy way I can think of is
for the code to have 'if <some almost impossible condition> or <busy
running test case XYZ>' lines. I know I'm meant to make 'fake' testing
classes which return erroneous values, and then pass these objects to
the code being tested. But this can take a long time and even then
isn't guaranteed to reach all your error-handling code.
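
To illustrate the 'fake class' idea (a contrived sketch, all names invented): the fake forces the almost-never-runs error branch to execute deterministically.

```python
class FakeDisk:
    """Test double that always fails, to force the error-handling path."""
    def read_block(self, n):
        raise IOError("simulated hardware fault")

def safe_read(disk, n):
    # The except branch below would almost never run against real hardware.
    try:
        return disk.read_block(n)
    except IOError:
        return None  # degrade gracefully instead of crashing

# The fake makes the rare branch run on every test, no broken disk needed:
assert safe_read(FakeDisk(), 0) is None
```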

The above methodology works well for me. It goes fairly quickly, and
is much faster than writing and testing elaborate unit tests.

So finally, my main questions:

1) Are there any obvious problems with my 'correctness' strategies?

2) Should I (regardless of time it takes initially) still be adding
unit tests for everything? I'd like to hear what XP/agile programming
advocates have to say on the subject.

3) Are there easy and fast ways to write and test (complete) unit tests?

4) Any other comments?

Thanks for your time.

David.
Jun 27 '08 #1


David wrote:
Specifically, if you've just written 100 new lines of Python code, then:
1) How do you test the new code?
2) How do you ensure that the code will work correctly in the future?

Short version:

For (1) I thoroughly (manually) test code as I write it, before
checking in to version control.

For (2) I code defensively.
....
As for point 1 (how do you test the new code?):
I like the idea of automated unit tests. However, in practice I find
they take a long time to write and test, especially if you want to
have good coverage (not just lines, but also possible logic branches).
This is why I have reluctantly come to accept the XP people's view:
if you write the tests _as_ you develop (that is as far as I go
w/o re-enforcement; they would have you write them _before_), you will
have a body of tests that work to demonstrate the correctness or
deficiencies of your code based on what it _should_ do. If you write
tests after you've written the code, you will write tests that are
based on what your code _actually_does_. You don't want the latter;
the tests are brittle. The tests don't match needs, rather they
match implementations. Therefore you'll need to discard more tests at
every local rewrite.
1) Add "raise 'UNTESTED'" lines to the top of every function
String exceptions are deprecated. Just raise UNTESTED (and let the
NameError from the undefined global be the failure).
....<describes how to do code coverage by hand>...
11) Cause those sections of code to be run also (sometimes I need to
temporarily set vars to impossible values inside the script, since the
logic will never run otherwise)

And here is one of my biggest problems with unit tests. How do you unit
test code which almost never runs? The only easy way I can think of is
for the code to have 'if <some almost impossible condition> or <busy
running test case XYZ>' lines. I know I'm meant to make 'fake' testing
classes which return erroneous values, and then pass these objects to
the code being tested. But this can take a long time and even then
isn't guaranteed to reach all your error-handling code.
Ah, but now your tests are "brittle"; they only work for the code you
have now.
If you want to make sure you have code coverage with your test, the XP
way is:
Write a test for behavior you need.
Watch it fail.
Fix the code so all tests pass.
Lather, rinse, repeat.
You should not have untested code, because there was no test that made
you write it. If you want to do code coverage, find a code coverage
tool and count your code while running your unit tests.
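
A sketch of that loop (with a made-up `slugify` function): the test is written first and fails, then the simplest passing code is added.

```python
import unittest

# 1. Write a test for behaviour you need. Run it: it fails (red),
#    because nothing implements slugify yet.
class TestSlugify(unittest.TestCase):
    def test_lowercases_and_hyphenates(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

# 2. Fix the code so all tests pass (green): the simplest thing
#    that could possibly work.
def slugify(title):
    return "-".join(title.lower().split())

# 3. Lather, rinse, repeat: the next failing test drives the next change.
```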
--Scott David Daniels
Sc***********@Acm.Org
Jun 27 '08 #2

On Sat, 24 May 2008 17:51:23 +0200
David <wi******@gmail.com> wrote:
Basically, with TDD you write the tests first, then the code which
passes/fails the tests as appropriate. However, as you're writing the
code you will also think of a lot of corner cases you should also
handle. The natural way to do this is to add them to the code first.
But with TDD you have to first write a test for the corner case, even
if setting up test code for it is very complicated. So, you have these
options:

- Take as much time as needed to put a complicated test case in place.
Absolutely. You may think that it is slowing you down but I can assure
you that in the long run you are saving yourself time.
- Don't add corner case to your code because you can't (don't have
time to) write a test for it.
If you don't have time to write complete, working, tested code then you
have a problem with your boss/client, not your methodology.
- Add the corner case handling to the code first, and try to add a
test later if you have time for it.
Never! It won't happen.
Having to write tests for all code takes time. Instead of eg: 10 hours
coding and say 1/2 an hour manual testing, you spend eg: 2-3 hours
writing all the tests, and 10 on the code.
In conventional development, 10 hours of code requires 90 hours of
testing, debugging and maintenance. Under TDD (and agile in general)
you spend 20 hours testing and coding. That's the real economics if
you want to deliver a good product.
I think that automated tests can be very valuable for maintainability,
making sure that you or other devs don't break something down the
line. But these benefits must be worth the time (and general
inconvenience) spent on adding/maintaining the tests.
I can assure you from experience that it always is worth the time.
If I did start doing some kind of TDD, it would be more of the 'smoke
test' variety. Call all of the functions with various parameters, test
some common scenarios, all the 'low hanging fruit'. But don't spend a
lot of time trying to test all possible scenarios and corner cases,
100% coverage, etc, unless I have enough time for it.
Penny wise, pound foolish. Spend the time now or spend the time later
after your client complains.
I'm going to read more on the subject (thanks to Ben for the link).
Maybe I have some misconceptions.
Perhaps just lack of experience. Read up on actual case studies.

--
D'Arcy J.M. Cain <da***@druid.net | Democracy is three wolves
http://www.druid.net/darcy/ | and a sheep voting on
+1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
Jun 27 '08 #3

On Sat, 24 May 2008 21:14:36 +0200
David <wi******@gmail.com> wrote:
Is it considered to be cheating if you make a test case which always
fails with a "TODO: Make a proper test case" message?
Yes. It's better to have the daily reminder that some code needs to be
finished.
While it is possible to describe all problems in docs, it can be very
hard to write actual test code.
It may be hard to start but once you have your framework in place it
becomes very easy.
For example: sanity tests. Functions can have tests for situations
that can never occur, or are very hard to reproduce. How do you unit
test for those?
Believe me, thousands of people reading this are remembering situations
where something that couldn't possibly happen happened.
A few examples off the top of my head:

* Code which checks for hardware defects (pentium floating point,
memory or disk errors, etc).

* Code that checks that a file is less than 1 TB large (but you only
have 320 GB harddrives in your testing environment).

* Code which checks if the machine was rebooted over a year ago.

And so on. These I would manually test by temporarily changing
variables in the code, then changing them back. To unit test these you
would need to write mock functions and arrange for the tested code to
call them instead of the python built-ins.
Yes but the mock functions can be wrappers around the real functions
which only change the results that you are testing for.
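
For instance, a sketch against the "rebooted over a year ago" example (the function and fake clock are invented for illustration): the fake wraps the real clock and shifts only the value under test.

```python
import time

def rebooted_long_ago(boot_time, now=time.time):
    """True if the machine booted more than a year before now()."""
    YEAR = 365 * 24 * 3600
    return now() - boot_time > YEAR

# Fake that wraps the real clock, changing only what the test needs:
def clock_next_year():
    return time.time() + 400 * 24 * 3600  # real time, shifted forward

# The rare condition is now reproducible on any machine:
assert rebooted_long_ago(time.time(), now=clock_next_year) is True
assert rebooted_long_ago(time.time()) is False
```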
eg: You call function MyFunc with argument X, and expect to get result Y.

MyFunc calls __private_func1, and __private_func2.

You can check in your unit test that MyFunc returns result Y, but you
shouldn't check __private_func1 and __private_func2 directly, even if
they really should be tested (maybe they sometimes have unwanted side
effects unrelated to MyFunc's return value).
It isn't your job to test __private_func1 and __private_func2 unless
you are writing MyFunc.
Depends on the type of bug. If it's a bug which breaks the unit tests,
then it can be found quickly. Unit tests won't help with bugs they
don't explicitly cover. eg off-by-one, memory leaks, CPU load,
side-effects (outside what the unit tests test), and so on.
No but when you find that your code breaks due to these problems that's
when you write new unit tests.
But once you track down problems like the above you can write more
unit tests to catch those exact bugs in the future. This is one case
where I do favour unit tests.
Yes! One of the biggest advantages to unit testing is that you never
ever deliver the same bug to the client twice. Delivering software
with a bug is bad but delivering it with the same bug after it was
reported and fixed is calamitous.

--
D'Arcy J.M. Cain <da***@druid.net | Democracy is three wolves
http://www.druid.net/darcy/ | and a sheep voting on
+1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
Jun 27 '08 #4

David <wi******@gmail.com> wrote:
So, at what point do you start writing unit tests? Do you decide:
"Version 1 I am going to definitely throw away and not put it into
production, but version 2 will definitely go into production, so I
will start it with TDD?".
If you are going to prototype something without tests then you decide that
up-front and throw the code away afterwards. It never is version 1 nor 2.
Where this doesn't work so well is if version 2 is a refactored and
incrementally-improved version of version 1. At some point you need to
decide "this is close to the version that will be in production, so
let's go back and write unit tests for all the existing code".
No you've gone beyond prototyping at that point. Prototypes are for
throwing away.
Seriously, 10 hours of testing for code developed in 10 hours? What
kind of environment do you write code for? This may be practical for
large companies with hordes of full-time testing & QA staff, but not
for small companies with just a handful of developers (and where you
need to borrow someone from their regular job to do non-developer
testing). In a small company, programmers do the lion's share of
testing. For programmers to spend 2 weeks on a project, and then
another 2 weeks testing it is not very practical when they have more
than one project.
I've done plenty of projects using conventional methods in small companies.
If anything test time equal to development time is a vast underestimate.
As for your other points - agreed, bugs getting to the customer is not
a good thing. But depending on various factors, it may not be the end
of the world if they do. eg: There are many thousands of bugs in open
source bug trackers, but people still use open source for important
things. Sometimes it is better to have software with a few bugs, than
no software (or very expensive, or very long time in development). See
"Worse is Better": http://en.wikipedia.org/wiki/Worse_is_better. See
also: Microsoft ;-)
I'm not saying it is the end of the world. What it is though is expensive.
If as the developer you spot a problem and fix it then all the time is
yours. If it is caught in QA then it involves bug reporting, assigning
priority, fixing and then the code has to be re-tested. If the customer
finds the problem then customer support has to report it, someone has to
assign it a priority, someone has to figure out how to replicate it, then
someone has to figure out the problem, then you have to fix it, then the
code has to be re-tested. Every level out from the initial developer you
increase the number of people and number of steps involved.

The other reason is that if your unit test breaks after you've just written
5 lines of code then you know which 5 lines of code are the problem.
Probably you can see the problem instantly. At worst you just hit undo in
your editor and try again. If a bug report comes back months later you have
no idea where look in thousands of lines of code.
The next time your project is running late, your manager and the
customer will be upset if you spend time updating your unit tests
rather than finishing off the project (and handing it over to QA etc)
and adding the unit tests when there's actually time for it.
But you aren't updating unit tests. What you are doing is coding: the tests
are an integral part of that, or rather the tests are part of the final
design rather than coding.
Clients, deadlines, etc require actual software, not
tests for software (that couldn't be completed on time because you
spent too much time writing tests first ;-)).

Clients generally require *working* software. Unfortunately it is all
too easy to ship something broken because then you can claim you
completed the coding on time and any slippage gets lost in the next 5
years of maintenance.

That's why you have human testing & QA. Unit tests can help, but they
are a poor substitute. If the customer is happy with the first
version, you can improve it, fix bugs, and add more unit tests later.
No, they aren't a substitute at all. You still need human testing and QA,
the difference is that with a good set of unit tests you reduce the number
of times code comes back from QA before it can be passed and make it more
likely that the customer will be happy with the first version.

Jun 27 '08 #5

David <wi******@gmail.com> wrote:
Problem 1: You can only code against tests
Yup. That's the flavor of Kool-Aide being served at a convenient TDD
outlet near you.
Basically, with TDD you write the tests first, then the code which
passes/fails the tests as appropriate. However, as you're writing the
code you will also think of a lot of corner cases you should also
handle. The natural way to do this is to add them to the code first.
That's only the natural way if you haven't drunk the Kool-Aide :-) It
takes a while to get used to, but once you get the hang of it, doing it
this way becomes very natural. Sure, I think of corner cases when I'm
writing code. But, what I do with them is write a test for them first,
then write the code which implements it.
Jun 27 '08 #6

David <wi******@gmail.com> wrote:
While it is possible to describe all problems in docs, it can be very
hard to write actual test code.

For example: sanity tests. Functions can have tests for situations
that can never occur, or are very hard to reproduce. How do you unit
test for those?
In some cases, you can use mock objects to mimic the "can never happen"
situations. But, you are right, there are certainly cases which are
difficult or impossible to test for. TDD is a very powerful tool, but it's
just that: a tool. It's not a magic wand.

My suggestion is to make using TDD a habit, but don't turn it into a
religion. You will undoubtedly find places where it's just the wrong tool.
Don't let the fact that it can't do everything keep you from using it when
it makes sense.
Jun 27 '08 #7

David <wi******@gmail.com> writes:
Is it considered to be cheating if you make a test case which always
fails with a "TODO: Make a proper test case" message?
I consider it so.

What I often do, though, is write a TODO comment in the unit test
suite:

# TODO: test_frobnitz_produces_widget_for_znogplatz_input(self):

At which point I've got:

* A unit test suite that has the same number of tests.

* A comment that my editor will highlight, and that many tools are
tuned to look for and flag for my attention (the "TODO:"
convention).

* A description of what the test should be testing, as the name of the
function that will implement the test. (This also forces me to think
of exactly what it is I want to assert about the code's behaviour,
by writing a test case name that embodies that assertion.)

* The beginnings of the test case function itself, simply by removing
the "# TODO: " part when I get around to that test case.

Then I get back to the work I was doing when that idea struck me,
knowing that it's recorded so I don't have to focus on it now.
For example: sanity tests. Functions can have tests for situations
that can never occur, or are very hard to reproduce. How do you unit
test for those?
That depends, of course, on how hard they are to reproduce. But one
common technique is isolation via test doubles
<URL:http://xunitpatterns.com/Test%20Double.html>.

If you want to see how a specific unit of code behaves under certain
conditions, such as a disk failure, you're not interested in testing
the disk subsystem itself, only your code's behaviour. So you isolate
your code by having the unit test (often in a fixture) provide a "test
double" for the disk subsystem, and rig that double so that it will
predictably provide exactly the hard-to-get event that you want your
code to respond to correctly.
A few examples off the top of my head:

* Code which checks for hardware defects (pentium floating point,
memory or disk errors, etc).

* Code that checks that a file is less than 1 TB large (but you only
have 320 GB harddrives in your testing environment).

* Code which checks if the machine was rebooted over a year ago.
All of these can, with the right design, be tested by providing a test
double in place of the subsystem that the code under test will
exercise, and making that test double provide exactly the response you
want to trigger the behaviour in your code.
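
For example, a sketch of the 1 TB check with a test double, using the standard library's unittest.mock (which arrived well after this thread); the file names and limit are invented:

```python
import os
from unittest import mock

def file_too_big(path, limit=10**12):
    """Flag files over roughly 1 TB."""
    return os.path.getsize(path) > limit

# Substitute a test double for the filesystem call: no 1 TB disk needed.
with mock.patch("os.path.getsize", return_value=2 * 10**12):
    assert file_too_big("huge.bin") is True

with mock.patch("os.path.getsize", return_value=4096):
    assert file_too_big("small.bin") is False
```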
Also, there are places where mock objects can't be used that easily.

eg 1: A complicated function, which needs to check the consistency of
its local variables at various points.
That's a clear sign the function is too complicated. Break out the
"stages" in the function to separate functions, and unit test those so
you know they're behaving properly. Then, if you need to, you can
easily provide test doubles for those functions; or you may find you
don't need to once you know they're fully test covered.
It *is* possible to unit test those consistency checks, but you may
have to do a lot of re-organization to enable unit testing.
This results in more modular code with looser coupling (fewer
interdependencies between components). This is a good thing.
In other cases it might not be appropriate to unit test, because it
makes your tests brittle (as mentioned by another poster).

eg: You call function MyFunc with argument X, and expect to get result Y.

MyFunc calls __private_func1, and __private_func2.
If you've got name-mangled functions, that's already a bad code smell.
Not necessarily wrong, but that's the way to bet.
You can check in your unit test that MyFunc returns result Y, but you
shouldn't check __private_func1 and __private_func2 directly, even if
they really should be tested (maybe they sometimes have unwanted side
effects unrelated to MyFunc's return value).
Don't write functions with unwanted side effects. Or, if those side
effects are actually caused by other subsystems beyond your control,
test those functions with test doubles in place of the external
subsystems.
eg: Resource usage.

How do you unit test how much memory, cpu, temporary disk space, etc
a function uses?
This is beyond the scope of unit tests. You should be testing resource
usage by profiling the entire application, not in unit tests.
eg: Platforms for which unit tests are hard to setup/run.

- embedded programming. You would need to load your test harness into
the device, and watch LED patterns or feedback over serial. Assuming
it has enough memory and resources :-)
- mobile devices (probably the same issues as above)
Yes, it's unfortunate that some systems don't yet have good tools to
enable unit testing. Fix that, or pay others to fix it, because it's
in your best interests (and your customers' interests) to have unit
testing be as easy as feasible on all platforms that you target.
eg: race conditions in multithreaded code: You can't unit test
effectively for these.
Another good reason to avoid multithreaded code where possible. I find
that "it's really hard to deterministically figure out what's going
on" is reason enough to avoid it. If it's so hard to know the current
state of the system, it's therefore a poor choice, because I have no
good way to report to my customer how close I am to finishing the
implementation.

Others may have successfully achieved unit tests for conditions that
only arise in multithreaded code, but that's outside my experience.

--
\ "How many people here have telekenetic powers? Raise my hand." |
`\ -- Emo Philips |
_o__) |
Ben Finney
Jun 27 '08 #8

Roy Smith <ro*@panix.com> writes:
But, you are right, there are certainly cases which are difficult or
impossible to test for. TDD is a very powerful tool, but it's just
that: a tool. It's not a magic wand.

My suggestion is to make using TDD a habit, but don't turn it into a
religion. You will undoubtedly find places where it's just the wrong
tool.
All true.

I find it best to remember that the tests discussed in Behaviour
Driven Development are always *unit* tests: they test one small,
isolated unit of the application code, to ensure its behaviour is
correct given specific environment and inputs.

The cases where unit tests are not applicable include anything where
we want to assert behaviour of the entire application: performance
tests, acceptance tests, stress tests, etc.

Unit tests get a lot of discussion time simply because "test the
entire running application" isn't something many programmers will
disagree with, so they don't end up discussing it much. That doesn't
make unit tests *more* important; but it does mean more time is spent
convincing people that unit tests are *at least as* important as the
other tests.
Don't let the fact that it can't do everything keep you from using
it when it makes sense.
Yes.

--
\ "Don't worry about people stealing your ideas. If your ideas |
`\ are any good, you'll have to ram them down people's throats." |
_o__) -- Howard Aiken |
Ben Finney
Jun 27 '08 #9


"D'Arcy J.M. Cain" <da***@druid.net> wrote in message
news:20***************************@druid.net...
| But once you track down problems like the above you can write more
| unit tests to catch those exact bugs in the future. This is one case
| where I do favour unit tests.
|
| Yes! One of the biggest advantages to unit testing is that you never
| ever deliver the same bug to the client twice. Delivering software
| with a bug is bad but delivering it with the same bug after it was
| reported and fixed is calamitous.

Writing a test for each code bug is now part of the Python maintenance
procedure.

Jun 27 '08 #10

Duncan Booth <du**********@invalid.invalid> writes:
You still need human testing and QA, the difference is that with a
good set of unit tests you reduce the number of times code comes
back from QA before it can be passed and make it more likely that
the customer will be happy with the first version.
Paradoxically, you also end up writing less code than you would
otherwise.

Behaviour Driven Development insists that you be lazy in satisfying
the tests and implement only the simplest thing that could possibly
work (and refactor code whenever the unit test suite passes). This
results in never writing code into the application that isn't needed
to satisfy some unit test, saving a lot of fruitless coding time.

--
\ "...one of the main causes of the fall of the Roman Empire was |
`\ that, lacking zero, they had no way to indicate successful |
_o__) termination of their C programs." -- Robert Firth |
Ben Finney
Jun 27 '08 #11

"D'Arcy J.M. Cain" <da***@druid.net> writes:
On Sat, 24 May 2008 17:51:23 +0200
David <wi******@gmail.com> wrote:
If I did start doing some kind of TDD, it would be more of the
'smoke test' variety. Call all of the functions with various
parameters, test some common scenarios, all the 'low hanging
fruit'. But don't spend a lot of time trying to test all possible
scenarios and corner cases, 100% coverage, etc, unless I have
enough time for it.

Penny wise, pound foolish. Spend the time now or spend the time
later after your client complains.
Worse, if you don't spend the time now, you'll end up spending the
time later *plus* all the time to find the bug in the first place, all
the time to tease its behaviour out from all the interdependent code,
all the time to go through more rounds of QA, etc. With BDD, only the
time to code the test is spent, so you save all that other time.

--
\ "Laugh and the world laughs with you; snore and you sleep |
`\ alone." -- Anonymous |
_o__) |
Ben Finney
Jun 27 '08 #12

"D'Arcy J.M. Cain" <da***@druid.net> writes:
Yes! One of the biggest advantages to unit testing is that you never
ever deliver the same bug to the client twice.
More specifically, this is a benefit of putting all unit tests into an
automated test suite, and running that test suite all the time during
development so it's immediately clear when any of them fails due to a
code change.

It's not enough to *have* the unit tests, nor is it enough to only run
them at certain times. They need to run after every change, so the
feedback is immediate and local to the change that was made.

(good sigmonster, have a cookie)

--
\ "Program testing can be a very effective way to show the |
`\ presence of bugs, but is hopelessly inadequate for showing |
_o__) their absence." -- Edsger Dijkstra |
Ben Finney
Jun 27 '08 #13

David wrote:
Seriously, 10 hours of testing for code developed in 10 hours? What
kind of environment do you write code for? This may be practical for
large companies with hordes of full-time testing & QA staff, but not
for small companies with just a handful of developers (and where you
need to borrow someone from their regular job to do non-developer
testing). In a small company, programmers do the lion's share of
testing. For programmers to spend 2 weeks on a project, and then
another 2 weeks testing it is not very practical when they have more
than one project.
Watch your programmers then. They do have to write and debug the code.
And they will spend at least as much or more time debugging as writing
the code. It's a fact. I have several programmers working for me on
several projects. What you have been told is fact. In my experience
it's 3-10x more time debugging than programming. I've heard that *good*
programmers write, on average, 10 new lines of code per day. I can also
verify that this is pretty accurate, both in my own programming
experience, and watching programmers working for me.
Jun 27 '08 #14

Michael L Torrie <to*****@gmail.com> wrote:
Watch your programmers then. They do have to write and debug the
code. And they will spend at least as much or more time debugging as
writing the code. It's a fact. I have several programmers working
for me on several projects. What you have been told is fact.
This isn't the case for everyone. In my workplace the time we spend
debugging is small compared to the time writing the code in the first
place. I wonder what the difference is?

We do use unit-testing quite widely, but by no means everywhere. The
code which doesn't have unit tests doesn't tend to be any buggier than
the code which does. Where testsuites really help is when you have to
upgrade some library or service that your programs are depending on,
and you get to find out about subtle backwards-incompatibilities.

-M-
Jun 27 '08 #15

Hi again.

Taking the advice of numerous posters, I've been studying BDD further.

I spent a while looking for a Python library which implemented BDD in
Python similar to jbehave, as described by Dan North on this page:
http://dannorth.net/introducing-bdd. I did find a few, but they either
had awful-looking syntax, or they were overly-complicated. So I
decided that using regular unit tests (via nosetest) was good enough,
even if it doesn't support stories, scenarios, givens, etc., and it
uses names containing "Test" instead of "Behaviour".

One thing I just tried was to put together a basic stack class
following BDD, with nosetest. I got the idea from this page:
http://www.ibm.com/developerworks/we...187/index.html

It was an interesting exercise, and I'm encouraged to try it further.

I ended up with these 2 modules:

======test_stack.py========

from nose.tools import raises
import stack

class TestStackBehaviour:
    def setup(self):
        self.stack = stack.Stack()
    @raises(stack.Empty)
    def test_should_throw_exception_upon_pop_without_push(self):
        self.stack.pop()
    def test_should_pop_pushed_value(self):
        self.stack.push(12345)
        assert self.stack.pop() == 12345
    def test_should_pop_second_pushed_value_first(self):
        self.stack.push(1)
        self.stack.push(2)
        assert self.stack.pop() == 2
    def test_should_leave_value_on_stack_after_peep(self):
        self.stack.push(999)
        assert self.stack.peep() == 999
        assert self.stack.pop() == 999
    def test_should_pop_values_in_reverse_order_of_push(self):
        self.stack.push(1)
        self.stack.push(2)
        self.stack.push(3)
        assert self.stack.pop() == 3
        assert self.stack.pop() == 2
        assert self.stack.pop() == 1
    @raises(stack.Empty)
    def test_peep_should_fail_when_stack_is_empty(self):
        self.stack.peep()
    def test_should_be_empty_when_new(self):
        assert len(self.stack) == 0

======stack.py========

class Empty(Exception):
    """Thrown when a stack operation is impossible because it is empty"""
    pass

class Stack:
    """Basic implementation of a stack"""
    def __init__(self):
        self._data = []
    def push(self, value):
        """Push an element onto a stack"""
        self._data.append(value)
    def pop(self):
        """Pop an element off a stack"""
        try:
            return self._data.pop()
        except IndexError:
            raise Empty
    def peep(self):
        """Return the top-most element of the stack"""
        try:
            return self._data[-1]
        except IndexError:
            raise Empty
    def __len__(self):
        """Return the number of elements in the stack"""
        return len(self._data)

===================

Does the above look like a decent BDD-developed class?

Is it ok that there are no 'scenarios', 'stories', 'having', 'given',
etc references?

Some pages suggest that you should use so-called contexts
(EmptyStackContext, StackWithOneElementContext, FullStackContext,
AlmostFullStackContext, etc).

Would you normally start with a basic TestStackBehaviour class, and
when Stack becomes more complicated, split the tests up into
TestEmptyStackContext, TestStackWithOneElementContext, etc?

Another thing I noticed is that some of my test cases were redundant.
Would you normally leave in the redundant tests, or remove the ones
which are included in the more general test?

Also, I have another question. How do you unit test event loops?

eg: Your app is a (very basic) service, and you want to add some
functionality (following BDD principles)

Here's an example unit test:

class TestServiceBehavior:
    def setup(self):
        ...
    def test_handles_event_xyz(self):
        ...

If your service is normally single-threaded, would your unit test need
to start the service in a separate thread to test it?

Another method would be to update the event loop to enable unit
testing. eg only iterate once if a 'being_tested' variable is set
somewhere.

None of the above are ideal. What is a good way to unit test event loops?
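For what it's worth, one pattern worth sketching (my own illustration, not from the thread; the names Service and run_once are invented) is to factor the loop body into a method that processes a single event. The unit test then drives the loop one step at a time, with no threads and no 'being_tested' flag:

```python
import queue

class Service:
    """Toy single-threaded service; the loop body is a separate, testable step."""
    def __init__(self):
        self.events = queue.Queue()
        self.handled = []

    def handle_event(self, event):
        # Real dispatch logic would live here.
        self.handled.append(event)

    def run_once(self, block=False):
        """Process at most one pending event; return True if one was handled."""
        try:
            event = self.events.get(block=block)
        except queue.Empty:
            return False
        self.handle_event(event)
        return True

    def run_forever(self):
        # Production entry point: nothing but run_once in a loop.
        while True:
            self.run_once(block=True)

class TestServiceBehavior:
    def setup(self):
        self.service = Service()

    def test_handles_event_xyz(self):
        self.service.events.put("xyz")
        assert self.service.run_once() is True
        assert self.service.handled == ["xyz"]

    def test_run_once_returns_false_when_idle(self):
        assert self.service.run_once() is False
```

With this shape, run_forever is a trivial wrapper around tested code, so leaving the infinite loop itself untested costs very little.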

David.
Jun 27 '08 #16

D'Arcy J.M. Cain wrote:
Yes! One of the biggest advantages to unit testing is that you never
ever deliver the same bug to the client twice. Delivering software
with a bug is bad but delivering it with the same bug after it was
reported and fixed is calamitous.
QOTW for sure.

--Scott David Daniels
Sc***********@Acm.Org

Jun 27 '08 #17

Ben Finney wrote:
David <wi******@gmail.com> writes:
>You need to justify the extra time spent on writing test code.

From the perspective of someone who once thought this way, and then
dug in and did Behaviour Driven Development, I can tell you the time
is entirely justified: by better design, more maintainable code,
higher confidence when speaking to customers, high visibility of
progress toward completion, freedom to refactor code when it needs it,
and many other benefits.
David, here's another way to convince yourself you have the time:
For the next week, count how many times you repeat the same "hand-test"
of some code on a subsequent version. Think of how much time that can
add up to if you no longer do that; your hand-testing time is subtracted
from the time you have to write unit tests.

--Scott David Daniels
Sc***********@Acm.Org
Jun 27 '08 #18

In article <ma***************************************@python.org>,
David <wi******@gmail.com> wrote:
>
Seriously, 10 hours of testing for code developed in 10 hours? What
kind of environment do you write code for? This may be practical for
large companies with hordes of full-time testing & QA staff, but not
for small companies with just a handful of developers (and where you
need to borrow somone from their regular job to do non-developer
testing). In a small company, programmers do the lion's share of
testing. For programmers to spend 2 weeks on a project, and then
another 2 weeks testing it is not very practical when they have more
than one project.
You must have poor project management/tracking. You WILL pay the cost
of testing, the only question is when. The when does have an impact on
other aspects of the development process.

Speaking as someone who started in my current job four years ago as the
third developer in a five-person company, I believe that your claim about
the differences between small companies and large companies is specious.
--
Aahz (aa**@pythoncraft.com) <* http://www.pythoncraft.com/

Need a book? Use your library!
Jun 27 '08 #19

>
You must have poor project management/tracking. You WILL pay the cost
of testing, the only question is when. The when does have an impact on
other aspects of the development process.

Speaking as someone who started in my current job four years ago as the
third developer in a five-person company, I believe that your claim about
the differences between small companies and large companies is specious.
--
Might be a difference in project size/complexity then, rather than
company size. Most of my work projects are fairly small (a few
thousand lines each), very modular, and each is usually written and
maintained by one developer. A lot of the programs will be installed
together on a single server, but their boundaries are very clearly
defined.

Personally I've had to do very little bug fixing and maintenance.
Thorough testing of all my changes before they go into production
means that I've caught 99% of the problems, and there is very little
to fix later.

That's why I'm surprised to hear that such a huge amount of time is
spent on testing and maintenance, and why the other posters make such a
big deal about unit tests.

I'm not a genius programmer, so it must be that I'm lucky to work on
smaller projects most of the time.

David.
Jun 27 '08 #20

David <wi******@gmail.com> wrote:
Might be a difference in project size/complexity then, rather than
company size. Most of my works projects are fairly small (a few
thousand lines each), very modular, and each is usually written and
maintained by one developer. A lot of the programs will be installed
together on a single server, but their boundaries are very clearly
defined.
Personally I've had to do very little bug fixing and maintenance.
Thorough testing of all my changes before they go into production
means that I've caught 99% of the problems, and there is very little
to fix later.
That's why I'm surprised to hear that such a huge amount of time is
spent on testing maintenance, and why the other posters make such a
big deal about unit tests.
I'm not a genius programmer, so it must be that I'm lucky to work on
smaller projects most of the time.
This is close to my experience. One lesson we might draw is that
there's an advantage to structuring your work as multiple small
projects whenever possible, even if making one big project seems more
natural.

I should think everyone would be happy with a bug-coping strategy of
"don't write the bugs in the first place", if at all possible. My guess
is that the main part of the 'small project' advantage is that all
changes can be written or reviewed by someone who is decently familiar
with the whole program.

That does suggest that if programmers do find that they're spending
more than half of their time fighting bugs, it might be worthwhile to
invest time in having more people become very familiar with the
existing code.

-M-
Jun 27 '08 #21

aa**@pythoncraft.com (Aahz) writes:
You must have poor project management/tracking. You WILL pay the cost
of testing, the only question is when. The when does have an impact on
other aspects of the development process.
Well, let's say you used TDD and your program has 5000 tests. One
might reasonably ask: why 5000 test? Why not 10000? Why not 20000?
No number of tests can give you mathematical certainty that your code
is error-free. The only sensible answer I can think of to "why 5000"
is that 5000 empirically seemed to be enough to make the program
reliable in practice. Maybe if you used a more error-prone coding
process, or wrote in assembly language instead of Python, you would
have needed 10000 or 20000 tests instead of 5000 to get reliable code.

But then one might reasonably ask again: why 5000 tests? Why not 2000
or 1000? Was there something wrong with the coding process, that
couldn't produce reliable code with fewer tests?

So, I think you have to consider the total development cycle and not
treat test development as if it were free.

I also haven't yet seen an example of a real program written in this
test-driven style that people keep touting. I use doctest when
writing purely computational code, and maybe it helps some, but for
more typical code involving (e.g.) network operations, writing
automatic tests (with "mock objects" and all that nonsense) is a heck
of a lot more work than testing manually in the interactive shell, and
doesn't seem to help reliability much. I'd be interested in seeing
examples of complex, interactive or network-intensive programs with
automatic tests.
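As one concrete (if tiny) illustration of the mock-object style being dismissed above, here is a sketch using the standard-library unittest.mock module (which postdates this thread); fetch_greeting and its injectable opener parameter are invented for the example:

```python
from unittest import mock
import urllib.request

def fetch_greeting(url, opener=urllib.request.urlopen):
    """Read one line from a URL; `opener` is injectable so tests avoid the network."""
    with opener(url) as response:
        return response.readline().decode("utf-8").strip()

def test_fetch_greeting():
    # Build a fake response object instead of touching the network.
    fake_response = mock.MagicMock()
    fake_response.__enter__.return_value = fake_response
    fake_response.readline.return_value = b"hello\n"
    fake_opener = mock.MagicMock(return_value=fake_response)

    assert fetch_greeting("http://example.invalid/", opener=fake_opener) == "hello"
    fake_opener.assert_called_once_with("http://example.invalid/")
```

Whether this is worth the setup over a quick interactive check is exactly the trade-off being debated; it does, however, run unattended and keeps running after the code changes.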
Jun 27 '08 #22

ja***@cd.chalmers.se (Jacob Hallen) wrote:
The most important aspect of unit testing is actually that it makes
the code testable. This may sound like an oxymoron but it is actually a
really important property. Testable code has to have a level of
modularity as well as simplicity and clarity in its interfaces that
you will not achieve in code that lacks automated unit tests.
Excellent clear description, thanks Jacob.

I made the mistake at one point, when I was trying to sell the concept of
TDD, of telling the people I was trying to persuade that by writing the tests
up front it influences the design of the code. I felt the room go cold:
they said the customer has to sign off the design before we start coding,
and once they've signed it off we can't change anything.

I wish I'd had your words then.

--
Duncan Booth http://kupuguy.blogspot.com
Jun 27 '08 #23

Duncan Booth <du**********@invalid.invalid> writes:
I made the mistake at one point when I was trying to sell the concept of
TDD telling the people I was trying to persuade that by writing the tests
up front it influences the design of the code. I felt the room go cold:
they said the customer has to sign off the design before we start coding,
and once they've signed it off we can't change anything.
Usually the customer signs off on a functional specification but that
has nothing to do with the coding style. Jacob makes a very good
point that TDD influences coding style, for example by giving a strong
motivation to separate computational code from I/O. But that is
independent of the external behavior that the customer cares about.
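A minimal sketch of that separation (invented names, not from the thread): keep the computation pure and the I/O shell thin, so the tests only ever need the pure part:

```python
def summarise(lines):
    """Pure computation: total of the integers in an iterable of lines."""
    return sum(int(line) for line in lines if line.strip())

def summarise_file(path):
    """Thin I/O shell: no logic of its own beyond opening the file."""
    with open(path) as f:
        return summarise(f)

# The pure function is testable with plain in-memory data, no files needed:
assert summarise(["1", "2", "", "3"]) == 6
```

The customer only ever sees summarise_file's behaviour; how the logic is split underneath is invisible to the signed-off specification.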
Jun 27 '08 #24

Duncan Booth <du**********@invalid.invalid> writes:
I made the mistake at one point when I was trying to sell the
concept of TDD telling the people I was trying to persuade that by
writing the tests up front it influences the design of the code. I
felt the room go cold: they said the customer has to sign off the
design before we start coding, and once they've signed it off we
can't change anything.
It's for misunderstandings like you describe here that I prefer (these
days) to use the term "behaviour driven development". The coding is
driven by desired new and/or altered behaviour of the application;
everything else follows from that statement of direction.

People, such as customers, who don't care about the term "unit test"
can easily relate to the term "behaviour". They are happy to talk
about how they want the application to behave, and are usually easy to
convince that such descriptions of behaviour are what should be the
driving force behind the implementation.

--
\ "Truth would quickly cease to become stranger than fiction, |
`\ once we got as used to it." -- Henry L. Mencken |
_o__) |
Ben Finney
Jun 27 '08 #25
