473,387 Members | 1,493 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,387 software developers and data experts.

Automatic debugging of copy by reference errors?

Is there a module that allows me to find errors that occur due to copy
by reference? I am looking for something like the following:
>>import mydebug
mydebug.checkcopybyreference = True
a=2
b=[a]
a=4
Traceback (most recent call last):
File "<stdin>", line 1, in ?
CopyByReferenceError: Variable b refers to variable a, so please do not
change variable a.

Does such a module exist?
Would it be possible to code such a module?
Would it be possible to add the functionality to array-copying in
numpy?
What would be the extra cost in terms of memory and CPU power?

I realize that this module would disable some useful features of the
language. On the other hand it could be helpful for new python users.

Niels

Dec 9 '06 #1
16 1150
On 9 dic, 02:22, "Niels L Ellegaard" <niels.ellega...@gmail.comwrote:
Is there a module that allows me to find errors that occur due to copy
by reference?
What do you mean by "copy by reference"?
I am looking for something like the following:
>import mydebug
mydebug.checkcopybyreference = True
a=2
b=[a]
a=4
Traceback (most recent call last):
File "<stdin>", line 1, in ?
CopyByReferenceError: Variable b refers to variable a, so please do not
change variable a.
(I won't be pedantic to say that Python has no "variables"). What's
wrong with that code? You are *not* changing "variable a", you are
binding the integer 4 to the name "a". That name used previously to be
bound to another integer, 2 - what's wrong with it? Anyway it has no
effect on the list referred by the name "b", still holds b==[2]
What do you want? Forbid the re-use of names? So once you say a=2, you
can't modify it? That could be done, yes, a "write-once-read-many"
namespace. But I don't see the usefullness.
Does such a module exist?
No
Would it be possible to code such a module?
I don't know what do you want to do exactly, but I feel it's not
useful.
Would it be possible to add the functionality to array-copying in
numpy?
No idea.
What would be the extra cost in terms of memory and CPU power?
No idea yet.
I realize that this module would disable some useful features of the
language.
Not only "some useful features", most programs would not even work!
On the other hand it could be helpful for new python users.
I think you got in trouble with something and you're trying to avoid it
again - but perhaps this is not the right way. Could you provide some
example?

--
Gabriel Genellina

Dec 9 '06 #2
Gabriel Genellina wrote:
I think you got in trouble with something and you're trying to avoid it
again - but perhaps this is not the right way. Could you provide some
example?
I have been using scipy for some time now, but in the beginning I made
a few mistakes with copying by reference. The following example is
exagerated
for clarity, but the principle is the same:

import os
output=[]
firstlines =[0,0]
for filename in os.listdir('.'):
try:
firstlines[0] = open(filename,"r").readlines()[0]
firstlines[1] = open(filename,"r").readlines()[1]
output.append((filename,firstlines))
except:continue
print output

Now some of my fortran-using friends would like to use python to
analyze their data files. I wanted them to avoid making the same
mistakes as I did so I thought it would be good if they could get some
nanny-like warnings saying: "Watch out young man. If do this, then
python will behave differently from fortran and matlab". That could
teach them to do things the right way.

I am not an expert on all this, but I guessed that it would be possible
to make a set of constraints that could catch a fair deal of simple
errors such as the one above, but still allow for quite a bit of
programming.

Niels

Dec 9 '06 #3
In <11**********************@n67g2000cwd.googlegroups .com>, Niels L
Ellegaard wrote:
I have been using scipy for some time now, but in the beginning I made
a few mistakes with copying by reference.
But "copying by reference" is the way Python works. Python never copies
objects unless you explicitly ask for it. So what you want is a warning
for *every* assignment.

Ciao,
Marc 'BlackJack' Rintsch
Dec 9 '06 #4
Marc 'BlackJack' Rintsch wrote:
In <11**********************@n67g2000cwd.googlegroups .com>, Niels L
Ellegaard wrote:
I have been using scipy for some time now, but in the beginning I made
a few mistakes with copying by reference.
But "copying by reference" is the way Python works. Python never copies
objects unless you explicitly ask for it. So what you want is a warning
for *every* assignment.
Maybe I am on the wrong track here, but just to clarify myself:

I wanted a each object to know whether or not it was being referred to
by a living object, and I wanted to warn the user whenever he tried to
change an object that was being refered to by a living object. As far
as I can see the garbage collector module would allow to do some of
this, but one would still have to edit the assignment operators of each
of the standard data structures:

http://docs.python.org/lib/module-gc.html

Anyway you are probably right that the end result would be a somewhat
crippled version of python
Niels

Dec 9 '06 #5
Niels L Ellegaard wrote:

I wanted a each object to know whether or not it was being referred to
by a living object, and I wanted to warn the user whenever he tried to
change an object that was being refered to by a living object. As far
as I can see the garbage collector module would allow to do som
*all* objects in Python are referred to by "living" objects; those that
don't are garbage, and are automatically destroyed sooner or later.

(maybe you've missed that namespaces are objects too?)

</F>

Dec 9 '06 #6
On Sat, 09 Dec 2006 05:58:22 -0800, Niels L Ellegaard wrote:
I wanted a each object to know whether or not it was being referred to
by a living object, and I wanted to warn the user whenever he tried to
change an object that was being refered to by a living object. As far
as I can see the garbage collector module would allow to do some of
this, but one would still have to edit the assignment operators of each
of the standard data structures:
I think what you want is a namespace that requires each object to have
exactly one reference - the namespace. Of course, additional references
will be created during evaluation of expressions. So the best you can do
is provide a function that checks reference counts for a namespace when
called, and warns about objects with multiple references. If that could
be called for every statement (i.e. not during expression evaluation -
something like C language "sequence points"), it would probably catch the
type of error you are looking for. Checking such a thing efficiently would
require deep changes to the interpreter.

The better approach is to revel in the ease with which data can be
referenced rather than copied. I'm not sure it's worth turning python
into fortran - even for selected namespaces.

--
Stuart D. Gathman <st****@bmsi.com>
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flamis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.

Dec 9 '06 #7
I think what you want is a namespace that requires each object to have
exactly one reference - the namespace. Of course, additional references
will be created during evaluation of expressions. So the best you can do
It's not enough. It won't catch the case where a list holds many
references to the same object - the example provided by the OP.

--
Gabriel Genellina

Dec 9 '06 #8


On 9 dic, 09:08, "Niels L Ellegaard" <niels.ellega...@gmail.comwrote:
Now some of my fortran-using friends would like to use python to
analyze their data files. I wanted them to avoid making the same
mistakes as I did so I thought it would be good if they could get some
nanny-like warnings saying: "Watch out young man. If do this, then
python will behave differently from fortran and matlab". That could
teach them to do things the right way.
Best bet is to learn to use python the right way. It's not so hard...
I am not an expert on all this, but I guessed that it would be possible
to make a set of constraints that could catch a fair deal of simple
errors such as the one above, but still allow for quite a bit of
programming.
The problem is, it's an error only becasuse it's not the *intent* of
the programmer - but it's legal Python code, and useful. It's hard for
the computer to guess the intent of the programmer -yet!...-

--
Gabriel Genellina

Dec 9 '06 #9
Niels L Ellegaard wrote:
I wanted to warn the user whenever he tried to
change an object that was being refered to by a living object.
To see how fundamentally misguided this idea is,
consider that, under your proposal, the following
program would produce a warning:

a = 1

The reason being that the assignment is modifying
the dictionary holding the namespace of the main
module, which is referred to by the main module
itself. So you are "changing an object that is
being referred to by a living object".

--
Greg
Dec 10 '06 #10
Niels L Ellegaard wrote:
Is there a module that allows me to find errors that occur due to copy
by reference? I am looking for something like the following:
I don't have a solution for you, but I certainly appreciate your
concern. The copy by reference semantics of Python give it great
efficiency but are also its achille's heel for tough-to-find bugs.

I once wrote a simple "vector" class in Python for doing basic vector
operations such as adding and subtracting vectors and multiplying them
by a scalar. (I realize that good solutions exist for that problem, but
I was more or less just experimenting.) The constructor took a list as
an argument to initialize the vector. I later discovered that a
particularly nasty bug was due to the fact that my constructor "copied"
the initializing list by reference. When I made the constructor
actually make its own copy using "deepcopy," the bug was fixed. But
deecopy is less efficient than copy by reference, of course.

So a fundamental question in Python, it seems to me, is when to take
the performance hit and use "copy" or "deepcopy."

If a debugger could tell you how many references exist to an object,
that would be helpful. For all I know, maybe some of them already do. I
confess I don't use debuggers as much as I should.

Dec 11 '06 #11
Russ wrote:
If a debugger could tell you how many references exist to an object,
that would be helpful.
import sys
sys.getrefcount(a)

But I doubt it would be very helpful.

Carl Banks

Dec 11 '06 #12

Niels L Ellegaard wrote:
Marc 'BlackJack' Rintsch wrote:
In <11**********************@n67g2000cwd.googlegroups .com>, Niels L
Ellegaard wrote:
I have been using scipy for some time now, but in the beginning I made
a few mistakes with copying by reference.
But "copying by reference" is the way Python works. Python never copies
objects unless you explicitly ask for it. So what you want is a warning
for *every* assignment.

Maybe I am on the wrong track here, but just to clarify myself:

I wanted a each object to know whether or not it was being referred to
by a living object, and I wanted to warn the user whenever he tried to
change an object that was being refered to by a living object.
This really wouldn't work, trust us. Objects do not know who
references them, and are not notified when bound to a symbol or added
to a container. However, I do think you're right about one thing: it
would be nice to have a tool that can catch errors of this sort, even
if it's imperfect (as it must be).

ISTM the big catch for Fortran programmers is when a mutable container
is referenced from multiple places; thus a change via one reference
will confusingly show up via the other one. (This, of course, is
something that happens all the time in Python, but it's not something
people writing glue for numerical code need to do a lot.) Instead of
the interpreter doing it automatically, it might be better (read:
possible) to use an explicit check. Here is a simple, untested
example:
class DuplicateReference(Exception):
pass

def _visit_container(var,cset):
if id(var) in cset:
raise DuplicateReference("container referenced twice")
cset.add(id(var))

def assert_no_duplicate_container_refs(datalist,cset=N one):
if cset is None:
cset = set()
for var in data:
if isinstance(var,(list,set)):
_visit_container(var,cset)
assert_no_duplicate_container_refs(var,cset)
elif isinstance(var,dict):
_visit_container(var,cset)
assert_no_duplicate_container_refs(var.itervalues( ),cset)
Then, at important points along the way, call this function with data
you provide yourself. For example, we might modify your example as
follows:

import os
output=[]
firstlines =[0,0]
for filename in os.listdir('.'):
try:
firstlines[0] = open(filename,"r").readlines()[0]
firstlines[1] = open(filename,"r").readlines()[1]
output.append((filename,firstlines))
except (IOError, IndexError):
# please don't use bare except unless reraising exception
continue
assert_no_duplicate_container_refs([output,firstlines])
print output
We passed the function output and firstlines because those are the two
names you defined yourself. It'll check to see if any containers are
referenced more than once. And it turns out there are--firstlines is
referenced many times. It'll raise DuplicateReference on you.

There's plenty of room for improvement in the recipe, for sure. It'll
catch real basic stuff, but it doesn't account for all containers, and
doesn't show you the loci of the duplicate references.

Now, to be honest, the biggest benefit of this check is it gives
newbies a chance to learn about references some way other than the hard
way. It's not meant to catch a common mistake, so much as a
potentially very confusing one. (It's really not a common mistake for
programmers who are aware of references.) Once the lesson is learned,
this check should be put aside, as it unnecessarily restricts
legitimate use of referencing.
Carl Banks

Dec 11 '06 #13
Russ wrote:
The copy by reference semantics of Python give it great
efficiency but are also its achille's heel for tough-to-find bugs.
You need to stop using the term "copy by reference",
because it's meaningless. Just remember that assignment
in Python is always reference assignment. If you want
something copied, you need to be explicit about it.
I later discovered that a
particularly nasty bug was due to the fact that my constructor "copied"
the initializing list by reference.
The reason you made that mistake was that you were
using the wrong mental model for how assignment works
in Python -- probably one that you brought over from
some other language.

When you become more familiar with Python, you won't
make mistakes like that anywhere near as often. And
if you do, you'll be better at recognising the
symptoms, so the cause won't be hard to track down.
So a fundamental question in Python, it seems to me, is when to take
the performance hit and use "copy" or "deepcopy."
Again, this is something you'll find easier when
you've had more experience with Python. Generally,
you only need to copy something when you want an
independent object that you can manipulate without
affecting anything else, although that probably
doesn't sound very helpful.

In your vector example, it depends on whether you
want your vectors to be mutable or immutable. It
sounds like you were treating them as mutable, i.e.
able to be modified in-place. In that case, each
vector obviously needs to be a new object with the
initial values copied into it.

The alternative would be to treat your vectors as
immutable, i.e. once created you never change their
contents, and any operation, such as adding two
vectors, produces a new vector holding the result.
In that case, two vectors could happily share a
reference to a list of values (as long as there is
nothing else that might modify the contents of the
list).

--
Greg
Dec 11 '06 #14

Carl Banks wrote:
Niels L Ellegaard wrote:
Marc 'BlackJack' Rintsch wrote:
In <11**********************@n67g2000cwd.googlegroups .com>, Niels L
Ellegaard wrote:
I have been using scipy for some time now, but in the beginning I made
a few mistakes with copying by reference.
But "copying by reference" is the way Python works. Python never copies
objects unless you explicitly ask for it. So what you want is a warning
for *every* assignment.
Maybe I am on the wrong track here, but just to clarify myself:

I wanted a each object to know whether or not it was being referred to
by a living object, and I wanted to warn the user whenever he tried to
change an object that was being refered to by a living object.

This really wouldn't work, trust us. Objects do not know who
references them, and are not notified when bound to a symbol or added
to a container. However, I do think you're right about one thing: it
would be nice to have a tool that can catch errors of this sort, even
if it's imperfect (as it must be).

ISTM the big catch for Fortran programmers is when a mutable container
is referenced from multiple places; thus a change via one reference
will confusingly show up via the other one.
As a Fortranner, I agree. Is there an explanation online of why Python
treats lists the way it does? I did not see this question in the Python
FAQs at http://www.python.org/doc/faq/ .

Here is a short Python code and a Fortran 95 equivalent.

a = [1]
c = a[:]
b = a
b[0] = 10
print a,b,c

output: [10] [10] [1]

program xalias
implicit none
integer, target :: a(1)
integer :: c(1)
integer, pointer :: b(:)
a = [1]
c = a
b = a
b(1) = 10
print*,a,b,c
end program xalias

output: 10 10 1

It is possible to get similar behavior when assigning an array (list)
in Fortran as in Python, but one must explicitly use a pointer and "=>"
instead of "=". This works well IMO, causing fewer surprises, and I
have never heard Fortranners complain about it.

Another way of writing the Fortran code so that "a" and "b" occupy the
same memory is to use EQUIVALENCE.

program xequivalence
implicit none
integer :: a(1),b(1)
integer :: c(1)
equivalence (a,b)
a = [1]
c = a
b = a
b(1) = 10
print*,a,b,c
end program xequivalence

output: 10 10 1

EQUIVALENCE is considered a "harmful" feature of early FORTRAN
http://www.ibiblio.org/pub/languages/fortran/ch1-5.html .

Dec 11 '06 #15
In <11*********************@79g2000cws.googlegroups.c om>, Beliavsky wrote:
>ISTM the big catch for Fortran programmers is when a mutable container
is referenced from multiple places; thus a change via one reference
will confusingly show up via the other one.

As a Fortranner, I agree. Is there an explanation online of why Python
treats lists the way it does? I did not see this question in the Python
FAQs at http://www.python.org/doc/faq/ .
This question sounds as if lists are treated somehow special. They are
treated like any other object in Python. Any assignment binds an object
to a name or puts a reference into a "container" object.

Ciao,
Marc 'BlackJack' Rintsch
Dec 11 '06 #16

greg wrote:
You need to stop using the term "copy by reference",
because it's meaningless. Just remember that assignment
I agree that "copy by reference" is a bad choice of words. I meant pass
by reference and assign by reference. But the effect is to make a
virtual copy, so although the phrase is perhaps misleading, it is not
"meaningless."

<cut>
Again, this is something you'll find easier when
you've had more experience with Python. Generally,
you only need to copy something when you want an
independent object that you can manipulate without
affecting anything else, although that probably
doesn't sound very helpful.
Yes, I realize that, but deciding *when* you need a copy is the hard
part.

Dec 11 '06 #17

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

2
by: Johann Blake | last post by:
I can hardly believe I'm the first one to report this, but having gone through the newsgroup, it appears that way. I would like to open a solution in the VS.NET IDE that consists of multiple...
11
by: Alexei | last post by:
Hello, The following doesn't compile due to absence of the copy constructor in class FileStream: FileInfo ^ fi = ...; FileStream fs = fi->OpenRead(); The compiler is Beta 2. Is this...
0
by: RadekP | last post by:
Hi Gurus .. I would really appreciate some insights for the problem that bugs me for quite some time. I keep my custom controls in their own shared (private/public key signed) assembly. I need...
3
by: Brian Bischof | last post by:
I'm having troubles getting the debugging process to work consistenly for external classes. I got it to work once and then I turned it off. But now I can't get re-enabled. Here is what I'm doing....
6
by: Brian Bischof | last post by:
I'm having troubles getting the debugging process to work consistenly for external classes. I got it to work once and then I turned it off. But now I can't get re-enabled. Here is what I'm doing....
0
by: Ken Allen | last post by:
The MSDN documentation on remote debugging is a bit sparse, to say the least, and there is almost no information available on the 'best' way to configure this. I should note that my development...
1
by: Ethan Strauss | last post by:
Hi, I have a C#.net Web application which calls a web service (http://eutils.ncbi.nlm.nih.gov/entrez/eutils/soap/eutils.wsdl). It has run just fine for months. Recently I started getting...
9
by: puzzlecracker | last post by:
From my understanding, if you declare any sort of constructors, (excluding copy ctor), the default will not be included by default. Is this correct? class Foo{ public: Foo(int); // no...
6
kenobewan
by: kenobewan | last post by:
Congratulations! You are one of the few who realise that over 80% of errors are simple and easy to fix. It is important to realise this as it can save a lot of time. Time that could be wasted making...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.