Hello!
I have a question about filecmp.cmp(). The short code snippet below
does not behave as I would expect:
import filecmp
f0 = "foo.dat"
f1 = "bar.dat"
f = open(f0, "w")
f.write("1:2")
f.close()
f = open(f1, "w")
f.write("1:2")
f.close()
print "cmp 1: " + str(filecmp.cmp (f0, f1, False))
f = open(f1, "w")
f.write("2:3")
f.close()
print "cmp 2: " + str(filecmp.cmp (f0, f1, False))
I would expect the second comparison to return False instead of True.
Looking at the docs for filecmp.cmp() I found the following: "This
function uses a cache for past comparisons and the results, with a
cache invalidation mechanism relying on stale signatures." I guess
that this is the reason for my test case failing.
Is there someone here that can tell me how I should invalidate this
cache? If that is not possible, what workaround could I use? I guess
that I can write my own file comparison function, but I would not like
to have to do that since we have filecmp.
Any ideas?
Regards,
Mattias
You can clear the cache with
    filecmp._cache = {}
as a glance into the filecmp module would have shown.
If you don't want to use the cache at all (untested):
    class NoCache:
        def __setitem__(self, key, value):
            pass
        def get(self, key):
            return None
    filecmp._cache = NoCache()
Alternatively, an update to Python 2.5 might work, as the type of
os.stat(filename).st_mtime was changed from int to float and now offers
subsecond resolution.
Peter
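For readers following along, here is a minimal sketch of why the timestamp resolution matters. It assumes Python 3 and that a file's "signature" is a summary of os.stat() made of file type, size, and modification time. The signature() helper is a hypothetical name for illustration, not part of the filecmp API, and os.utime() is used only to simulate a clock too coarse to tell the two writes apart.

```python
import os
import stat
import tempfile


def signature(path):
    """Hypothetical helper: summarise a file as (type, size, mtime)."""
    st = os.stat(path)
    return (stat.S_IFMT(st.st_mode), st.st_size, st.st_mtime)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "bar.dat")

    with open(path, "w") as f:
        f.write("1:2")
    first_stat = os.stat(path)
    before = signature(path)

    with open(path, "w") as f:
        f.write("2:3")  # same length, different content

    # Simulate a coarse clock by restoring the timestamps of the first write.
    os.utime(path, ns=(first_stat.st_atime_ns, first_stat.st_mtime_ns))
    after = signature(path)

# The content changed, yet a signature-based check cannot see it.
print("signatures equal:", before == after)
```

With whole-second timestamps, two quick writes of equal length can end up in exactly this situation, which would explain the stale result in the original test case.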
On Feb 15, 5:56 pm, Peter Otten <__pete...@web.de> wrote:
> You can clear the cache with
>     filecmp._cache = {}
> as a glance into the filecmp module would have shown.
You are right, a quick glance would have enlightened me. Next time I
will RTFS first. :-)
> If you don't want to use the cache at all (untested):
>     class NoCache:
>         def __setitem__(self, key, value):
>             pass
>         def get(self, key):
>             return None
>     filecmp._cache = NoCache()
Just one small thought/question. How likely am I to run into trouble
because of this? I mean, by setting _cache to another value I'm
mucking about in filecmp's implementation details. Is this generally
considered OK when dealing with Python's standard library?
:.:: mattias
Mattias Brändström wrote:
> Just one small thought/question. How likely am I to run into trouble
> because of this? I mean, by setting _cache to another value I'm
> mucking about in filecmp's implementation details. Is this generally
> considered OK when dealing with Python's standard library?
I think it's a feature that Python lends itself to monkey-patching, but
still there are a few things to consider:
- Every hack increases the likelihood that your app will break in the next
version of Python.
- You take some responsibility for the "patched" code. It's no longer the
tried and tested module as provided by the core developers.
- The module may be used elsewhere in the standard library or third-party
packages, and failures (or in the above example: performance degradation)
may ensue.
For a script and a relatively obscure module like 'filecmp' monkey-patching
is probably OK, but for a larger app or a module like 'os' that is heavily
used throughout the standard lib I would play it safe and reimplement.
Peter
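As an illustration of the "play it safe and reimplement" option, a minimal sketch of an uncached content comparison, assuming Python 3. The name files_equal and the chunk_size parameter are made up for this example; this is not how filecmp is implemented, only one straightforward way to avoid depending on its private cache.

```python
import os
import tempfile


def files_equal(path_a, path_b, chunk_size=8192):
    """Compare two files byte for byte, keeping no state between calls."""
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            chunk_a = fa.read(chunk_size)
            chunk_b = fb.read(chunk_size)
            if chunk_a != chunk_b:
                return False
            if not chunk_a:  # both files ended without a mismatch
                return True


# The scenario from the original post, using temporary files.
with tempfile.TemporaryDirectory() as tmp:
    f0 = os.path.join(tmp, "foo.dat")
    f1 = os.path.join(tmp, "bar.dat")
    for path in (f0, f1):
        with open(path, "w") as f:
            f.write("1:2")
    same = files_equal(f0, f1)

    with open(f1, "w") as f:
        f.write("2:3")
    changed = files_equal(f0, f1)

print("cmp 1:", same)
print("cmp 2:", changed)
```

The trade-off is the one described above: nothing is cached, so repeated comparisons of large, unchanged files read them again every time.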
On Feb 15, 11:43 pm, Peter Otten <__pete...@web.de> wrote:
> I think it's a feature that Python lends itself to monkey-patching, but
> still there are a few things to consider:
> - Every hack increases the likelihood that your app will break in the next
> version of Python.
> - You take some responsibility for the "patched" code. It's no longer the
> tried and tested module as provided by the core developers.
> - The module may be used elsewhere in the standard library or third-party
> packages, and failures (or in the above example: performance degradation)
> may ensue.
> For a script and a relatively obscure module like 'filecmp' monkey-patching
> is probably OK, but for a larger app or a module like 'os' that is heavily
> used throughout the standard lib I would play it safe and reimplement.
Thanks for the insight! Right now I need this for a unit test, so in
this case I'm quite happy to use the NoCache solution you suggested.
:.:: brasse
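For the unit-test case, one way to keep the patch from leaking into other code is to scope it to the test. The sketch below assumes Python 3 and that the filecmp module still keeps its cache in a private module-level attribute named _cache, which is an implementation detail and may change. NoCache is written here as a dict subclass that drops every store, a variation on the class suggested above, so that any other dict methods the module may call keep working. CompareTest and its file names are hypothetical.

```python
import filecmp
import os
import tempfile
import unittest
from unittest import mock


class NoCache(dict):
    """A dict that silently drops every store, so lookups always miss."""

    def __setitem__(self, key, value):
        pass


class CompareTest(unittest.TestCase):
    def setUp(self):
        tmp = tempfile.TemporaryDirectory()
        self.addCleanup(tmp.cleanup)
        self.f0 = os.path.join(tmp.name, "foo.dat")
        self.f1 = os.path.join(tmp.name, "bar.dat")
        # Assumption: filecmp._cache exists. It is swapped for this test
        # only, and the original object is put back by patcher.stop().
        patcher = mock.patch.object(filecmp, "_cache", NoCache())
        patcher.start()
        self.addCleanup(patcher.stop)

    def write(self, path, text):
        with open(path, "w") as f:
            f.write(text)

    def test_detects_rewrite(self):
        self.write(self.f0, "1:2")
        self.write(self.f1, "1:2")
        self.assertTrue(filecmp.cmp(self.f0, self.f1, shallow=False))
        self.write(self.f1, "2:3")
        self.assertFalse(filecmp.cmp(self.f0, self.f1, shallow=False))


# Run the test case programmatically so the example does not exit the process.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CompareTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the patch is undone when each test finishes, other users of filecmp in the same process get the regular cache back.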
Peter Otten wrote:
> I think it's a feature that Python lends itself to monkey-patching, but
> still there are a few things to consider:
> - Every hack increases the likelihood that your app will break in the next
> version of Python.
> - You take some responsibility for the "patched" code. It's no longer the
> tried and tested module as provided by the core developers.
> - The module may be used elsewhere in the standard library or third-party
> packages, and failures (or in the above example: performance degradation)
> may ensue.
> For a script and a relatively obscure module like 'filecmp' monkey-patching
> is probably OK, but for a larger app or a module like 'os' that is heavily
> used throughout the standard lib I would play it safe and reimplement.
It would probably be a good idea to add a clear_cache() function to the
module API for 2.6 to avoid such issues.
regards
Steve
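A note for readers on current Python versions: a public filecmp.clear_cache() function does exist in Python 3 (it was added in 3.4, not 2.6). The sketch below reruns the original test case with it, assuming Python 3.4 or later. The write() helper and the temporary directory are additions for this example.

```python
import filecmp
import os
import tempfile


def write(path, text):
    """Hypothetical helper: replace the file's content with text."""
    with open(path, "w") as f:
        f.write(text)


with tempfile.TemporaryDirectory() as tmp:
    f0 = os.path.join(tmp, "foo.dat")
    f1 = os.path.join(tmp, "bar.dat")

    write(f0, "1:2")
    write(f1, "1:2")
    first = filecmp.cmp(f0, f1, shallow=False)

    write(f1, "2:3")
    # Public API since Python 3.4: forget all cached comparison results.
    filecmp.clear_cache()
    second = filecmp.cmp(f0, f1, shallow=False)

print("cmp 1:", first)
print("cmp 2:", second)
```

Clearing the cache this way needs no access to private module attributes, so the monkey-patching concerns discussed earlier in the thread do not apply.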
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
Blog of Note: http://holdenweb.blogspot.com
See you at PyCon? http://us.pycon.org/TX2007