Python for large projects

assaf__

Hello,

I am beginning work on a fairly large project and I'm considering using Python for most of the coding, but first I need to make sure that it is reliable enough.

I need to make sure that I won't have surprises when my program runs on different real-world systems. So far I have written a small script in Python using urllib, and on one computer it failed completely because of a problem in getting the proxies (in my opinion this is a bug). How likely are such things to happen, how often, and to what extent are they more prevalent in Python than in C++?

If Python is indeed suitable for large projects, how common is it to actually use it for such purposes? Is there perhaps a list of examples of real projects using Python?

Thanks,

Bob


Jul 18 '05 #1

Andrew Wilkinson

as*****@walla.com wrote:

If python is indeed suitable for large projects, how common is it to
actually use it for such purposes? Is there perhaps a list of examples of
real projects using python?

I would say that Python is very suitable for use in large projects. The
benefit of automatic memory management alone is enough to convince me that
it beats C++. The powerful built-in types are another big win: what you
would spend ages coding in C++ is often a no-brainer in Python.

Obviously there are downsides. I find that the lack of static type checking
lets some bugs hide in my code that in other languages would be found
at compile time. In general (and IMHO) though, the pros far outweigh the cons.

There is a decent sized list of companies that use Python at
http://pythonology.org/success.

HTH,
Andrew

Jul 18 '05 #2

Paul Rubin

as*****@walla.com writes:

I am beginning work on a fairly large project and I'm considering
using python for most of the coding, but I need to make sure first
that it is reliable enough.

I need to make sure that I won't have surprises when my program runs
on different real-world systems. So far I wrote a little script with
python using urllib, and on one computer it failed completely
because of a problem in getting the proxies (in my opinion this is a
bug). How likely are such things to happen and how often, and to
what extent are they more prevalent in python in comparison to C++?

The Python language is in general well-designed and much more concise
than C++. A big program in C++ may map to a much smaller Python
program, turning a large project into a small project, and making it
less relevant whether Python works for large projects.

Python's library does have a lot of small gaps like the one you found
in urllib. As the Twisted Matrix documentation puts it, you find
yourself re-inventing the wheel a lot, because you discover that the
existing wheels are often square and made of glue.

If python is indeed suitable for large projects, how common is it to
actually use it for such purposes? Is there perhaps a list of
examples of real projects using python?

The canonical example of a complex Python project is Zope
(www.zope.com). It's medium sized by the standards of big C++
projects, but as mentioned, a medium amount of Python code can
implement functionality that would take a much larger amount of C++.

Because Python is interpreted and highly dynamic, Python programs tend
to run slower than comparable C++ programs. Whether that's a problem
for you depends on your application. If you have specific Python
functions that are bottlenecks, you can re-implement them in C and
call them through Python's C API. There's also a semi-experimental
native-code Python compiler called Psyco that produces a considerable
speedup at the cost of increased memory consumption. The
next-generation Python implementation (PyPy or Python in Python) will
reportedly use Psyco or something similar, in a more fundamental way.
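
Before re-implementing anything in C it helps to know which functions are actually the bottlenecks. A minimal sketch using the standard-library cProfile and pstats modules; slow_sum, main and profile_report are names made up for this illustration, not part of any real project:

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical hot spot: a pure-Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def main():
    """Hypothetical program entry point that calls the hot spot repeatedly."""
    return [slow_sum(20000) for _ in range(20)]


def profile_report(func):
    """Run func under the profiler and return the top entries as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(5)
    return buffer.getvalue()
```

Calling print(profile_report(main)) lists the most expensive functions; only those near the top of the list are candidates for a C rewrite.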

In short, there's not a quick and simple answer to your question of
whether Python is right for what you're doing. It's great for lots of
things, not so hot for some others, and is still evolving rather
quickly, so an unsuitable application today may become suitable in a
forthcoming release.

Jul 18 '05 #3

Cameron Laird

In article <7x************@ruckus.brouhaha.com>,
Paul Rubin <http://ph****@NOSPAM.invalid> wrote:

as*****@walla.com writes:
I am beginning work on a fairly large project and I'm considering
using python for most of the coding, but I need to make sure first
that it is reliable enough.

I need to make sure that I won't have surprises when my program runs
on different real-world systems. So far I wrote a little script with
python using urllib, and on one computer it failed completely
because of a problem in getting the proxies (in my opinion this is a
bug). How likely are such things to happen and how often, and to
what extent are they more prevalent in python in comparison to C++?

The Python language is in general well-designed and much more concise
than C++. A big program in C++ may map to a much smaller Python
program, turning a large project into a small project, and making it
less relevant whether Python works for large projects.

Python's library does have a lot of small gaps like the one you found
in urllib. As the Twisted Matrix documentation puts it, you find
yourself re-inventing the wheel a lot, because you discover that the
existing wheels are often square and made of glue.

If python is indeed suitable for large projects, how common is it to
actually use it for such purposes? Is there perhaps a list of
examples of real projects using python?

The canonical example of a complex Python project is Zope
(www.zope.com). It's medium sized by the standards of big C++
projects, but as mentioned, a medium amount of Python code can
implement functionality that would take a much larger amount of C++.

Because Python is interpreted and highly dynamic, Python programs tend
to run slower than comparable C++ programs. Whether that's a problem
for you depends on your application. If you have specific Python
functions that are bottlenecks, you can re-implement them in C and
call them through Python's C API. There's also a semi-experimental
native-code Python compiler called Psyco that produces a considerable
speedup at the cost of increased memory consumption. The
next-generation Python implementation (PyPy or Python in Python) will
reportedly use Psyco or something similar, in a more fundamental way.

In short, there's not a quick and simple answer to your question of
whether Python is right for what you're doing. It's great for lots of
things, not so hot for some others, and is still evolving rather
quickly, so an unsuitable application today may become suitable in a
forthcoming release.

You might also have occasion to learn about Pyrex and
... well, I'm in the Trotskyite wing on this question.
It's not just that Python can work for large projects.
I sincerely regard it as an even *better* comparative
choice on large projects; C++ and Java, the usual
competition, show all sorts of blemishes when one scales
the size of the project. Python remains usable, even
at the high end.
--

Cameron Laird <cl****@phaseit.net>
Business: http://www.Phaseit.net

Jul 18 '05 #4

Paul Rubin

cl****@lairds.com (Cameron Laird) writes:

You might also have occasion to learn about Pyrex and ... well, I'm
in the Trotskyite wing on this question. It's not just that Python
can work for large projects. I sincerely regard it as an even
*better* comparative choice on large projects; C++ and Java, the
usual competition, show all sorts of blemishes when one scales the
size of the project. Python remains usable, even at the high end.

I keep hearing this, but I don't see any large (much less very large)
applications that have been done in Python; Zope is medium sized.

Suppose you wanted to write any of the following:

1) Optimizing C/C++ compiler, like GCC
2) Full featured web browser, like MSIE or Mozilla
3) Full featured office suite, like MS Office or Open Office or KDE
4) Avionics for the space shuttle
5) Internals of a large telephone/data switch
6) Tax processing software for the IRS
7) Operating system kernel (Linux: the next generation)
8) Accounting software for a big bank
9) Full featured database like Oracle or Postgres
10) ... well you get the idea.

Which, if any, would you write in Python? By "write in Python" I mean
the central framework and most of the code is written in Python,
though you're allowed to use the C API as needed. Using Python to
provide some functions around the edges (e.g. the operator UI for the
phone switch) doesn't count. Note that these are all supposed to be
used in production and not just as technology demos, so speed matters
(nobody wants a compiler that's 10x slower than GCC. GCC is slow
enough already).

I think I'd spend some time at least considering Python in each of the
above cases, but I'm not sure I could convincingly make it fly for any
of them.

Jul 18 '05 #5

Diez B. Roggisch

Paul Rubin wrote:

1) Optimizing C/C++ compiler, like GCC
2) Full featured web browser, like MSIE or Mozilla
3) Full featured office suite, like MS Office or Open Office or KDE
4) Avionics for the space shuttle
5) Internals of a large telephone/data switch
6) Tax processing software for the IRS
7) Operating system kernel (Linux: the next generation)
8) Accounting software for a big bank
9) Full featured database like Oracle or Postgres
10) ... well you get the idea.

Which, if any, would you write in Python? By "write in Python" I mean

Which, if any, would you write in Java or C#? I think for a project in
"userland" that's not on the number-crunching or machine-architecture side
(like a kernel), Python often would be a viable alternative.

And the space shuttle runs on a 68K processor - I bet a modern 3GHz processor
running Python can beat that :)
--
Regards,

Diez B. Roggisch

Jul 18 '05 #6

Heather Coppersmith

On 22 Mar 2004 15:37:27 -0800,
Paul Rubin <http://ph****@NOSPAM.invalid> wrote:

1) Optimizing C/C++ compiler, like GCC
2) Full featured web browser, like MSIE or Mozilla
3) Full featured office suite, like MS Office or Open Office or KDE
4) Avionics for the space shuttle
5) Internals of a large telephone/data switch
6) Tax processing software for the IRS
7) Operating system kernel (Linux: the next generation)
8) Accounting software for a big bank
9) Full featured database like Oracle or Postgres
10) ... well you get the idea.

Which, if any, would you write in Python? ... Note that these
are all supposed to be used in production and not just as
technology demos, so speed matters ...

With the "speed counts" caveat, that's a bit of a trick question,
trivially answered with None Of The Above (okay, so maybe most
bankers (application 8) are in less of a hurry than most user
processes (application 7)).

With sufficiently powerful hardware and sufficiently patient
clients, I'd take Python over C for any of those projects except:

Life/death applications, like 4, and maybe 5, and maybe 7
(depends on who's using the OS and for what). Python is not
well-specified enough (too many things depend on the
underlying C library) for this type of application.

Hardware-oriented applications, like maybe parts of 4 and 5
(depending on what the interfaces to the hardware look like),
and the low-level pieces of 7.

Specifically, I can't think of a single advantage C has over
Python (except speed and footprint) for applications 1, 2, 3, 5,
6, 8, and 9.

What advantages *do* C and Java have for such large projects? How
many bug-free, meets-the-full-spec, on-time, under-budget examples
of applications written in those languages do we have?

Regards,
Heather

--
Heather Coppersmith
That's not right; that's not even wrong. -- Wolfgang Pauli

Jul 18 '05 #7

Jay O'Connor

Paul Rubin wrote:

cl****@lairds.com (Cameron Laird) writes:
You might also have occasion to learn about Pyrex and ... well, I'm
in the Trotskyite wing on this question. It's not just that Python
can work for large projects. I sincerely regard it as an even
*better* comparative choice on large projects; C++ and Java, the
usual competition, show all sorts of blemishes when one scales the
size of the project. Python remains usable, even at the high end.

I keep hearing this, but I don't see any large (much less very large)
applications that have been done in Python; Zope is medium sized.

Suppose you wanted to write any of the following:

1) Optimizing C/C++ compiler, like GCC
2) Full featured web browser, like MSIE or Mozilla
3) Full featured office suite, like MS Office or Open Office or KDE
4) Avionics for the space shuttle
5) Internals of a large telephone/data switch
6) Tax processing software for the IRS
7) Operating system kernel (Linux: the next generation)
8) Accounting software for a big bank
9) Full featured database like Oracle or Postgres
10) ... well you get the idea.

Which, if any, would you write in Python?

6 and 8, maybe 3...and 10 of course

Take care,
Jay

Jul 18 '05 #8

Paul Rubin

Heather Coppersmith <me@privacy.net> writes:

What advantages *do* C and Java have for such large projects? How
many bug-free, meets-the-full-spec, on-time, under-budget examples
of applications written in those languages do we have?

It was a real question, not a rhetorical one, and your answers are
reasonable.

We may not have C or Java examples of those apps that are on-time and
under budget, but at least we have examples now that are deployed and
that work, even if they were delivered late and over budget. We don't
have ANY examples for programs like that in Python, whether delivered
on time or not. On the other hand, we haven't been writing stuff like
that in Python for as long.

I'd like to think that a suitable implementation of Python, or
something pretty closely resembling Python as we know it, could do
pretty much anything Common Lisp can do, and CL has been used for some
very big apps. However, the current Python implementations seem aimed
at smaller projects.

Jul 18 '05 #9

Paul Rubin

Jay O'Connor <ja********@earthlink.net> writes:

1) Optimizing C/C++ compiler, like GCC
2) Full featured web browser, like MSIE or Mozilla
3) Full featured office suite, like MS Office or Open Office or KDE
4) Avionics for the space shuttle
5) Internals of a large telephone/data switch
6) Tax processing software for the IRS
7) Operating system kernel (Linux: the next generation)
8) Accounting software for a big bank
9) Full featured database like Oracle or Postgres
10) ... well you get the idea.
Which, if any, would you write in Python?

6 and 8, maybe 3...and 10 of course

I think 6 and 8 would depend on decimal arithmetic; some modules for
that have been proposed but nothing is released yet, afaik.
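
For what it's worth, later Python releases (2.4 onward) ship a decimal module in the standard library. A minimal sketch of why it matters for money; the amounts and the add_tax helper are invented for illustration:

```python
from decimal import Decimal, ROUND_HALF_UP


def add_tax(amount, rate):
    """Hypothetical helper: apply a tax rate and round to whole cents."""
    total = amount * (Decimal("1") + rate)
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Binary floats cannot represent 0.10 exactly, so sums drift.
float_sum = 0.10 + 0.20
# Decimal values built from strings keep the digits as written.
decimal_sum = Decimal("0.10") + Decimal("0.20")

print(float_sum == 0.30)                # False
print(decimal_sum == Decimal("0.30"))   # True
print(add_tax(Decimal("19.99"), Decimal("0.075")))  # 21.49
```

Constructing Decimal from strings rather than floats is the important design choice here: Decimal(0.10) would inherit the float's representation error.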

I can't see answering 3 without also answering 2.

Jul 18 '05 #10

Peter Hansen

Paul Rubin wrote:

I keep hearing this, but I don't see any large (much less very large)
applications that have been done in Python; Zope is medium sized.

Suppose you wanted to write any of the following:

1) Optimizing C/C++ compiler, like GCC
2) Full featured web browser, like MSIE or Mozilla
3) Full featured office suite, like MS Office or Open Office or KDE
4) Avionics for the space shuttle
5) Internals of a large telephone/data switch
6) Tax processing software for the IRS
7) Operating system kernel (Linux: the next generation)
8) Accounting software for a big bank
9) Full featured database like Oracle or Postgres
10) ... well you get the idea.

Which, if any, would you write in Python? By "write in Python" I mean
the central framework and most of the code is written in Python,
though you're allowed to use the C API as needed. Using Python to
provide some functions around the edges (e.g. the operator UI for the
phone switch) doesn't count. Note that these are all supposed to be
used in production and not just as technology demos, so speed matters
(nobody wants a compiler that's 10x slower than GCC. GCC is slow
enough already).

I think I'd spend some time at least considering Python in each of the
above cases, but I'm not sure I could convincingly make it fly for any
of them.

D'oh! You ask this just before some of us have to leave for PyCon?!
How cruel! ;-)

(Hmm... maybe a BOF for "serious s/w engineering with Python" or
something like that?)

-Peter

Jul 18 '05 #11

Jay O'Connor

Paul Rubin wrote:

Jay O'Connor <ja********@earthlink.net> writes:
1) Optimizing C/C++ compiler, like GCC
2) Full featured web browser, like MSIE or Mozilla
3) Full featured office suite, like MS Office or Open Office or KDE
4) Avionics for the space shuttle
5) Internals of a large telephone/data switch
6) Tax processing software for the IRS
7) Operating system kernel (Linux: the next generation)
8) Accounting software for a big bank
9) Full featured database like Oracle or Postgres
10) ... well you get the idea.
Which, if any, would you write in Python?
6 and 8, maybe 3...and 10 of course

I think 6 and 8 would depend on decimal arithmetic; some modules for
that have been proposed but nothing is released yet, afaik.

Depends on what you mean. If you're talking desktop line-of-business software
for the IRS, a bank, or any business then there's probably not going to be a lot
of heavy math. If you're talking some sort of automatic transaction processing
then maybe it's different, but I can't see the math really being that hairy. I've
seen financial transaction software written in Tcl which was running a *lot*
slower than Python. Even in that case, I/O for getting your transactions and
storing your results is going to be overhead, so I wouldn't really discount Python
unless the example could be defined better. I'd seriously consider Python to
start with (but I'd consider Smalltalk first so..)

I can't see answering 3 without also answering 2.

A browser is a fairly straightforward piece of work in which, I think, any
slowness in a particular language is going to be noticed, because you repeat a
fundamental operation (rendering an element) many, many times. I think there's
going to be a lot more 'stuff' going on in an office suite. That's just
speculation; the real reason I didn't consider #2 was that I overlooked it.

Take care,
Jay

Jul 18 '05 #12

Ville Vainio

>>>>> "Peter" == Peter Hansen <pe***@engcorp.com> writes:

Peter> (Hmm... maybe a BOF for "serious s/w engineering with
Peter> Python" or something like that?)

We're living in interesting times, what with all the noise created by
Havoc Pennington in

http://ometer.com/desktop-language.html

I think Python might have a great shot as a more "official" strategic
direction for Linux desktop app development, since it's quite possible
that neither Mono nor Java gets picked because they are encumbered.

Some lobbying is needed, of course. Getting something like IBM or
Novell (well, Novell is probably stuck w/ Mono... IronPython, anyone?)
officially endorsing and sponsoring Python would be oh-so-cool.

--
Ville Vainio http://tinyurl.com/2prnb

Jul 18 '05 #13

Jacek Generowicz

Andrew Wilkinson <aj******@SPAMyork.ac.uk> writes:

as*****@walla.com wrote:

I find the lack of static type checking to cause some bugs to hide
in my code that in other languages would be found at compile time.

I feel honour-bound to point out that I consider static typing (explicit
static typing, in particular) as a means of creating more correct
programs to be one of the greatest contemporary myths of software
engineering.

Static typing makes it easier for compilers to produce more efficient
code. That is the only advantage that static typing offers.

I am of the opinion that (explicit) static typing contributes to the
bugginess of programs.

Jul 18 '05 #14

Neil Hodgson

Jacek Generowicz:

Do you have a reference describing such an attempt?

It was mentioned in Dr Dobbs, July 1996, page 87, although I also heard
about this from a colleague who worked with someone ...
Googling for "zero defect development", it looks like the Dr Dobbs piece
requires an account:
http://www.ddj.com/documents/s=959/ddj9607j/

Neil

Jul 18 '05 #15

Alan Gauld

On 25 Mar 2004 12:21:36 +0100, Matthias <no@spam.pls> wrote:

Jacek Generowicz <ja**************@cern.ch> writes:
"After", is far too late, in my opinion. It's a bit like suggesting to
a static-typing-for-safety fan, that he should only run his program
through the compiler _after_ he has finished developing.

I think this method was advertised as the "cleanroom approach".
Google finds some references.

The clean room approach was slightly different although heading in
that direction. It relied on rigorous review, inspection and
testing at every stage of the process. (sound familiar?)

It was popular in the early/mid eighties and here are a few
references:

Wicked Problems, Righteous Solutions; P DeGrace & L Hulet Stahl
- many methodologies including a section on clean room.

Cleanroom Approach to Reliable Software Devt; Dyer & Mills
Proceedings Validation Methods Research for Fault Tolerant
Avionics....; Research Triangle Institute, 1981

Cleanroom Software Devt, An Empirical Investigation;
Selby, Basili, Baker, 1987
IEEE Transactions on Software Engineering,
Vol SE-13, #9, Sept 1987

HTH,

Alan G.

PS. Keeping programmers away from compilers is not that old a
practice. I was working on a VAX project in 1989 that only allowed
us one compile each per day, with a full compile overnight (which
took 6 hours).

Author of the Learn to Program website
http://www.freenetpages.co.uk/hp/alan.gauld

Jul 18 '05 #16

Hung Jung Lu

> On Tue, 2004-03-23 at 17:24, Cameron Laird wrote:

they're at a particular DISadvantage there. If you have a
big job, you *particularly* need to look at Python (or Erlang,
or Eiffel, or ...)
--------------------------------------------
gabor <ga***@z10n.net> wrote in message news:<ma************************************@python.org>...
i wanted to use python for a project in our company... we wanted to
build a fairly big system/program.

but when i recommended python, i got a question like:
(previously all the programs were written in java)
"if one of our programmers changes a method in a class/interface, we
immediately will know about it, because the next program-rebuild will
simply fail. but if we would use python, we wouldn't find it out".

--------------------------------------------

I use C++ and Python every day. Let us be fair and point out some good
things about each of them.

(a) In a compiled language like C++, changing function prototypes and
variable names is comfortable, because the compiler will find all
those spots that you need to change. In Python, you do not have the
same level of comfort. Sure, there are other techniques, but it's
different than clicking a button.
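
One of those other techniques could be a small check, run with the test suite, that fails when a method disappears or its parameter list changes. A rough sketch using the standard inspect module; Account, EXPECTED and interface_mismatches are names invented for the example, and the hand-maintained EXPECTED table is an assumption of this sketch, not a standard practice:

```python
import inspect


class Account:
    """Hypothetical class whose interface other code depends on."""

    def deposit(self, amount, currency="USD"):
        return (amount, currency)


# Parameter names that callers rely on, recorded by hand.
EXPECTED = {"deposit": ["self", "amount", "currency"]}


def interface_mismatches(cls, expected):
    """Return method names that are missing or whose parameters changed."""
    problems = []
    for name, params in expected.items():
        method = getattr(cls, name, None)
        if method is None:
            problems.append(name)
            continue
        actual = list(inspect.signature(method).parameters)
        if actual != params:
            problems.append(name)
    return problems


# An empty list means the class still matches what callers expect.
print(interface_mismatches(Account, EXPECTED))
```

This only checks names and arity, not types, so it is weaker than what a C++ or Java compiler gives you, but it does turn a silent interface change into a failing test.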

(b) Cameron said something very true in my opinion: for large
projects, you want Python. But he said so without giving more details.
So let me add some comments.

In my opinion, the essence of software development is code/task
factorization. It seems such a trivial concept, but if you really
really think about it, goto statements, loops, functions, classes,
arrays, pointers, OOP, macros/templates, metaprogramming, AOP,
databases, etc, just about every single technique in programming has
its base in the concept of code/task factorization. Take for instance
classes and inheritance, basically, you factor out the common parts of
two classes and push it up into a common parent class. To go one level
deeper, my belief is that at the bottom, all human intellectual
activities are based on factorization: no more, no less.

In large projects, you'll find that you need to factor out even more.
Let us take an example. Suppose you write an application, and later on
you realize that you need to make it transactional: that is, if some
exceptions happen, you want to roll back the changes. Now, this kind
of major after-thought is terrible for languages without
metaprogramming capabilities. To add a new feature, you will have to
make modifications in hundreds or thousands of spots. Another example:
suppose your software is versioned; moreover, you have different
versions for the application and for the data file format, and your
application needs to work with legacy file formats. Again, without
metaprogramming capabilities, your code will have many redundant lines
of code, or be cluttered with tons of if-statements or
switch-statements. Another similar problem: you have several different
clients that buy your application, and they want some different extra
features. Again, without metaprogramming, your code will be either
hard to code (using virtual functions, function pointers, and/or
templates in C++), or will be cluttered with if-else- and switch-
statements (a terrible practice that will make your code
unmaintainable.)
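
To make the transactional example concrete, here is a minimal sketch of factoring rollback out into a single decorator instead of editing every method body. It only restores in-memory attributes of one object, which is an assumption of the sketch (a real system would also have to deal with files, databases and so on); transactional and Ledger are hypothetical names:

```python
import copy
import functools


def transactional(method):
    """Restore the object's attributes if the wrapped method raises."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        # Take a deep snapshot so nested containers are restored too.
        snapshot = copy.deepcopy(self.__dict__)
        try:
            return method(self, *args, **kwargs)
        except Exception:
            self.__dict__.clear()
            self.__dict__.update(snapshot)
            raise
    return wrapper


class Ledger:
    """Hypothetical application object."""

    def __init__(self):
        self.entries = []

    @transactional
    def post(self, amounts):
        for amount in amounts:
            if amount < 0:
                raise ValueError("negative amount")
            self.entries.append(amount)
```

The rollback logic lives in one place; making another method transactional is a one-line change, which is the kind of factorization meant above.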

As your project grows more and more complex (becomes threaded, gains many
new client requirements, supports legacy versions, uses distributed
computing in a cluster, etc.) you will realize more and more that you
need to factorize efficiently, otherwise your pain will be unbearable.

When you have reached that point, you'll come to appreciate simplicity
and purity in a language. Frankly, Python is good but still not good
enough.

For large projects, if you use a rigid language, then your best bet is
to use tons of programmers coding trivial interfaces and APIs to make
up for the shortcomings of the language. In flexible languages like
Python, you often can use metaprogramming features to factor out the
common areas. At that point, I think that issues like automatically
finding name changes as I mentioned in point (a) become small issues,
because you will have bigger concerns. The fact that you may miss a
name change or function header change is not the thing that will kill
you. The fact that your entire system is unmaintainable is the thing
that will kill you. Don't look at individual bugs when you are talking
about large projects, because your worry should not be there: your
worry should be focused on how to make your system maintainable. Bugs
can and will be fixed. But if your language does not allow you to
factorize efficiently, at the end of the day, that's what's going to
kill you.

regards,

Hung Jung

Jul 18 '05 #17

Roger Binns

> (a) In a compiled language like C++, changing function prototypes and
> variable names is comfortable, because the compiler will find all
> those spots that you need to change.

It won't catch some stuff such as where a prototype changes from
pass by value to pass by reference (or vice versa), or if another
operator or explicit conversion is available. [That is true
of many languages, but C++ gives the impression it has this
rigid type checking system that avoids errors if the code compiles]

In reality I find the best approach is to use multiple languages.
You can code components in C++ and glue them together using
Swig and Python. You can make multiple binaries and execute
them telling them where to send their output, or use a pipe.
That kind of thing also makes it easier dealing with issues
in the field. For example you can send the customer a different
binary (that has the same interface) or the debugging version
of a DLL/so etc.
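
As a small illustration of the separate-binaries-and-pipes approach, here is a sketch of Python driving another program over a pipe with the standard subprocess module. The child here is just another Python process standing in for a real C++ binary, which is an assumption made so the example is self-contained; run_component is a hypothetical name:

```python
import subprocess
import sys

# Stand-in for a separately built component: a child process that
# upper-cases its standard input. In practice this would be a C++ binary.
CHILD_COMMAND = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]


def run_component(command, text):
    """Send text to the component over a pipe and return what it writes."""
    result = subprocess.run(
        command,
        input=text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the component fails
    )
    return result.stdout


print(run_component(CHILD_COMMAND, "hello"))
```

Because the caller only depends on the command line and the pipe, swapping in a debugging build of the component means changing CHILD_COMMAND and nothing else.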

At the end of the day, use the best tool for the job, and
don't use any that preclude you from using others at the
same time as well.

Roger

Jul 18 '05 #18

Bill Rubenstein

In article <ma************************************@python.org >, ga***@z10n.net
says...

On Wed, 2004-03-24 at 15:16, Bill Rubenstein wrote:
...snip...
> other thing is, that in the projects i work on, there seems to be
> very hard to do unit tests

...snip...

The ability to do unit testing should not be an afterthought. It should be
considered as a major influence on the architecture of a project.

If one cannot do proper unit testing, the architecture of the project is
questionable.

ok, so let's use a specific example:

imagine you're building a library, which fetches webpages.

you have a library which can fetch 1 webpage at a time, but it is a
synchronous library (like wget). you call him, and he returns the page.

but you want an async one.

so you decide to build a threadpool, where every thread will do this:
look into a queue, and if there is a new URL to fetch, fetches it with
his wget-like library, and saves the html page somewhere (and maybe
signals something).

and now the user who uses your library, simply adds the URL to fetch,
and can check later asynchronously whether they are already fetched or
not.
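
A minimal sketch of the design just described, using the standard concurrent.futures thread pool. The blocking fetch function is passed in rather than hard-wired, which is an assumption added here so that a test can substitute a fake; AsyncFetcher and its method names are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor


class AsyncFetcher:
    """Queue URLs for a pool of threads and check on them later."""

    def __init__(self, fetch, workers=4):
        self._fetch = fetch  # blocking function: url -> page text
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._futures = {}

    def add(self, url):
        """Schedule a URL; returns immediately."""
        self._futures[url] = self._pool.submit(self._fetch, url)

    def is_fetched(self, url):
        """True once the fetch has finished (successfully or not)."""
        return self._futures[url].done()

    def page(self, url, timeout=None):
        """Wait for the result; re-raises any error from the fetch."""
        return self._futures[url].result(timeout=timeout)

    def close(self):
        self._pool.shutdown(wait=True)
```

In production the fetch argument would wrap the wget-like library; in a unit test it can be a plain function returning canned HTML, so no network is needed.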

could you tell me what unit tests would you create for this example?
(a more generic request: is there on the internet a webpage with
something like this? one where they have some complex
modules/programs/algorithms, and they show how to write unittests for
them?)

thanks,
gabor

Ok, I think I understand what the job is so, here is a try.

I'm assuming that this async wget's job is to start at a url, fetch it, track
down and fetch any links and such, get them, and make all of that available on
the local system for later viewing.

To make it testable, I'd design so that the application part of the system
(described above) has as limited a knowledge of its surroundings as possible --
except for the actual work performed. It should have no knowledge of a gui, for
instance.

Instead it should know about an object which represents a 'job'. This object
should have attributes and/or functions which can be accessed to find out the
base URL, the current status or state of the specific job (not started, in
progress (various states here), ..., complete). There should be a log associated
with the job object where both normal and abnormal stuff can be kept. It should
also be able to provide information about the user if there is one, instructions
about the base URL, where in the local file system to store the results, etc.
During the development phase this job object is going to be a bit dynamic as new
needs for it are discovered.

There should probably be one object which can keep track of all of the job
objects and is responsible for creating new ones and deleting old ones.

All of the interfaces to the job management object and the job object need to be
formalized and properly documented. This whole subsystem can be tested, then, by
a test driver requesting services via the documented interfaces, changing the
state of a job via the documented interfaces and determining that the state
transitions are as expected. There is no need to fetch any real URLs to do this,
just pretend you did. This test driver also needs to exercise the interfaces
intended for use by a gui.
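
A rough sketch of what such a job object, job manager and test driver might look like. The state names come from the description above except "failed", which is an assumption; Job, JobManager and their methods are hypothetical names, not an existing library:

```python
class Job:
    """One fetch job with an explicit state and a log of transitions."""

    # Which states may follow which; anything else is an error.
    ALLOWED = {
        "not started": {"in progress"},
        "in progress": {"complete", "failed"},
    }

    def __init__(self, base_url):
        self.base_url = base_url
        self.state = "not started"
        self.log = []

    def advance(self, new_state):
        if new_state not in self.ALLOWED.get(self.state, set()):
            raise ValueError(
                "illegal transition %s -> %s" % (self.state, new_state))
        self.log.append((self.state, new_state))
        self.state = new_state


class JobManager:
    """Creates, deletes and reports on jobs."""

    def __init__(self):
        self._jobs = {}

    def create(self, base_url):
        job = Job(base_url)
        self._jobs[base_url] = job
        return job

    def delete(self, base_url):
        del self._jobs[base_url]

    def states(self):
        return {url: job.state for url, job in self._jobs.items()}


# Test driver: no real URL is fetched, we just pretend we did.
manager = JobManager()
job = manager.create("http://example.invalid/")
job.advance("in progress")
job.advance("complete")
print(manager.states())
```

A GUI would read the same states() interface, so the driver above exercises exactly what the GUI depends on.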

Now, as to testing the actual application code -- I'd think that you'd need a set
of URLs which would return known and stable results and a number of error
situations (bad links and such) to test against. Then a test driver would be
written to use the standard interfaces to the job management object and the job
object to schedule work against those URLs, determine when that work is done and
test that the results are as expected, highlight the differences between a prior
run against the particular URL and the current run, etc.

I've been retired for years but that was pretty much how we did it. There were
two small programming teams -- one writing application code against the formal
interface documentation and one writing test scaffolding against the same
documentation and building test cases. Things worked, the bug rate was very low,
implementation changes were localized and testable...

Anyway, it worked for us and we never had to claim that we just couldn't test
something except in production.

Bill

Jul 18 '05 #19

Cameron Laird

In article <c3**********@atlantis.news.tpi.pl>,
Jarek Zgoda <jz****@gazeta.usun.pl> wrote:

Jul 18 '05 #20

Cameron Laird

In article <8e**************************@posting.google.com >,
Hung Jung Lu <hu********@yahoo.com> wrote:

Jul 18 '05 #21

Joe Mason

In article <10*************@corp.supernews.com>, Cameron Laird wrote:
Remarkable fact that I see as turning up all over: we work with
grep(1). There are visual programming and language-savvy editors
and IDEs and refactoring plugins and all sorts of other tools,
and we find our variables with text searches. 'Know how to make
a C programmer mad? Name a global variable 'i'. 'Know how to
make him happy? Change the name to 'ii'. Both Lisp's inventor
and I keep our human address collection in a plaintext file.

My address collection was scattered all over various databases and
phones, and I lost the phone with the most recent one. I spent a good
hour searching for an important number, and realized that the one
database I might still have access to was for a PDA I no longer owned,
with a desktop app that I could no longer run, in a Windows partition
that I couldn't boot to at the time.

I could see the actual data, but knowing the Windows world I was almost
positive it'd be some binary database, and I'd be out of luck.

Nope, XML. Almost as good as plain text for grepping. I've never been
so relieved.

Joe

Jul 18 '05 #22

Isaac Gouy

Jacek Generowicz <ja**************@cern.ch> wrote in message news:<ty*************@pcepsft001.cern.ch>...

I am of the opinion that (explicit) static typing contributes to the
bugginess of programs.

Is there a theory for the periodicity of static-checking /
dynamic-checking debates?

A couple of weeks' worth has drawn to a close on comp.lang.object

http://groups.google.com/groups?hl=e...9i%25404ax.com

The last discussion on comp.lang.functional was back in Nov 2003

Jul 18 '05 #23

Aahz

[quoting unsnipped, voting this for post of the week]

In article <8e**************************@posting.google.com >,
Hung Jung Lu <hu********@yahoo.com> wrote:

I use C++ and Python everyday. Let us be fair and point out some good
things about each of them.

(a) In a compiled language like C++, changing function prototypes and
variable names is comfortable, because the compiler will find all
those spots that you need to change. In Python, you do not have the
same level of comfort. Sure, there are other techniques, but it's
different than clicking a button.

(b) Cameron said something very true in my opinion: for large
projects, you want Python. But he said so without giving more details.
So let me add some comments.

In my opinion, the essence of software development is code/task
factorization. It seems such a trivial concept, but if you really
really think about it, goto statements, loops, functions, classes,
arrays, pointers, OOP, macros/templates, metaprogramming, AOP,
databases, etc, just about every single technique in programming has
its base in the concept of code/task factorization. Take for instance
classes and inheritance: basically, you factor out the common parts of
two classes and push them up into a common parent class. To go one level
deeper, my belief is that at the bottom, all human intellectual
activities are based on factorization: no more, no less.

In large projects, you'll find that you need to factor out even more.
Let us take an example. Suppose you write an application, and later on
you realize that you need to make it transactional: that is, if some
exceptions happen, you want to roll back the changes. Now, this kind
of major after-thought is terrible for languages without
metaprogramming capabilities. To add a new feature, you will have to
make modifications in hundreds or thousands of spots. Another example:
suppose your software is versioned; moreover, you have different
versions for the application and for the data file format, and your
application needs to work with legacy file formats. Again, without
metaprogramming capabilities, your code will have many redundant lines
of code, or be cluttered with tons of if-statements or
switch-statements. Another similar problem: you have several different
clients that buy your application, and they want some different extra
features. Again, without metaprogramming, your code will be either
hard to code (using virtual functions, function pointers, and/or
templates in C++), or will be cluttered with if-else- and switch-
statements (a terrible practice that will make your code
unmaintainable.)
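
To make the transactional example above concrete, here is a minimal sketch (not from the original post) of how rollback could be added in one place with decorators. The names `transactional`, `make_transactional` and `Account` are invented for illustration, and the snapshot-by-deepcopy approach is only one assumption about what "roll back the changes" might mean; it covers an object's own attributes, not files or databases.

```python
import copy
import functools


def transactional(method):
    """Restore the instance's attributes if the method raises."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        snapshot = copy.deepcopy(self.__dict__)
        try:
            return method(self, *args, **kwargs)
        except Exception:
            # Roll back to the state before the call, then re-raise.
            self.__dict__.clear()
            self.__dict__.update(snapshot)
            raise
    return wrapper


def make_transactional(cls):
    """Class decorator: wrap every public method defined on cls."""
    for name, attr in list(vars(cls).items()):
        if callable(attr) and not name.startswith("_"):
            setattr(cls, name, transactional(attr))
    return cls


@make_transactional
class Account:
    """Hypothetical application class; its methods know nothing of rollback."""

    def __init__(self, balance):
        self.balance = balance
        self.history = []

    def withdraw(self, amount):
        self.history.append(("withdraw", amount))
        self.balance -= amount
        if self.balance < 0:
            raise ValueError("insufficient funds")
```

The point is that the after-thought is applied with one line per class rather than by editing every method body.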

As your project grows more and more complex (becomes threaded, many new
client requirements, support for legacy versions, using distributed
computing in a cluster, etc.) you will realize more and more that you
need to factorize efficiently, otherwise your pain will be unbearable.

When you have reached that point, you'll come to appreciate simplicity
and purity in a language. Frankly, Python is good but still not good
enough.

For large projects, if you use a rigid language, then your best bet is
to use tons of programmers coding trivial interfaces and APIs to make
up for the shortcomings of the language. In flexible languages like
Python, you often can use metaprogramming features to factor out the
common areas. At that point, I think that issues like automatically
finding name changes as I mentioned in point (a) become small issues,
because you will have bigger concerns. The fact that you may miss a
name change or function header change is not the thing that will kill
you. The fact that your entire system is unmaintainable is the thing
that will kill you. Don't look at individual bugs when you are talking
about large projects, because your worry should not be there: your
worry should be focused on how to make your system maintainable. Bugs
can and will be fixed. But if your language does not allow you to
factorize efficiently, at the end of the day, that's what's going to
kill you.

regards,

Hung Jung

--
Aahz (aa**@pythoncraft.com) <*> http://www.pythoncraft.com/

"usenet imitates usenet" --Darkhawk

Jul 18 '05 #24
