473,747 Members | 2,822 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

filecmp.cmp() cache

Hello!

I have a question about filecmp.cmp(). The short code snippet blow
does not bahave as I would expect:

import filecmp

f0 = "foo.dat"
f1 = "bar.dat"

f = open(f0, "w")
f.write("1:2")
f.close()

f = open(f1, "w")
f.write("1:2")
f.close()

print "cmp 1: " + str(filecmp.cmp (f0, f1, False))

f = open(f1, "w")
f.write("2:3")
f.close()

print "cmp 2: " + str(filecmp.cmp (f0, f1, False))

I would expect the second comparison to return False instead of True.
Looking at the docs for filecmp.cmp() I found the following: "This
function uses a cache for past comparisons and the results, with a
cache invalidation mechanism relying on stale signatures.". I guess
that this is the reason for my test case failing.

Is there someone here that can tell me how I should invalidate this
cache? If that is not possible, what workaround could I use? I guess
that I can write my own file comparison function, but I would not like
to have to do that since we have filecmp.

Any ideas?

Regards,
Mattias

Feb 15 '07 #1
5 2433
Mattias Brändström wrote:
I have a question about filecmp.cmp(). The short code snippet blow
does not bahave as I would expect:

import filecmp

f0 = "foo.dat"
f1 = "bar.dat"

f = open(f0, "w")
f.write("1:2")
f.close()

f = open(f1, "w")
f.write("1:2")
f.close()

print "cmp 1: " + str(filecmp.cmp (f0, f1, False))

f = open(f1, "w")
f.write("2:3")
f.close()

print "cmp 2: " + str(filecmp.cmp (f0, f1, False))

I would expect the second comparison to return False instead of True.
Looking at the docs for filecmp.cmp() I found the following: "This
function uses a cache for past comparisons and the results, with a
cache invalidation mechanism relying on stale signatures.". I guess
that this is the reason for my test case failing.

Is there someone here that can tell me how I should invalidate this
cache? If that is not possible, what workaround could I use? I guess
that I can write my own file comparison function, but I would not like
to have to do that since we have filecmp.

Any ideas?
You can clear the cache with

filecmp._cache = {}

as a glance into the filecmp module would have shown.
If you don't want to use the cache at all (untested):

class NoCache:
def __setitem__(sel f, key, value):
pass
def get(self, key):
return None
filecmp._cache = NoCache()
Alternatively an update to Python 2.5 might work as the type of
os.stat(filenam e).st_mtime was changed from int to float and now offers
subsecond resolution.

Peter
Feb 15 '07 #2
On Feb 15, 5:56 pm, Peter Otten <__pete...@web. dewrote:
You can clear the cache with

filecmp._cache = {}

as a glance into the filecmp module would have shown.
You are right, a quick glance would have enlighten me. Next time I
will RTFS first. :-)
If you don't want to use the cache at all (untested):

class NoCache:
def __setitem__(sel f, key, value):
pass
def get(self, key):
return None
filecmp._cache = NoCache()
Just one small tought/question. How likely am I to run into trouble
because of this? I mean, by setting _cache to another value I'm
mucking about in filecmp's implementation details. Is this generally
considered OK when dealing with Python's standard library?

:.:: mattias

Feb 15 '07 #3
Mattias Brändström wrote:
On Feb 15, 5:56 pm, Peter Otten <__pete...@web. dewrote:
>You can clear the cache with

filecmp._cac he = {}

as a glance into the filecmp module would have shown.

You are right, a quick glance would have enlighten me. Next time I
will RTFS first. :-)
>If you don't want to use the cache at all (untested):

class NoCache:
def __setitem__(sel f, key, value):
pass
def get(self, key):
return None
filecmp._cac he = NoCache()

Just one small tought/question. How likely am I to run into trouble
because of this? I mean, by setting _cache to another value I'm
mucking about in filecmp's implementation details. Is this generally
considered OK when dealing with Python's standard library?
I think it's a feature that Python lends itself to monkey-patching, but
still there are a few things to consider:

- Every hack increases the likelihood that your app will break in the next
version of Python.
- You take some responsibility for the "patched" code. It's no longer the
tried and tested module as provided by the core developers.
- The module may be used elsewhere in the standard library or third-party
packages, and failures (or in the above example: performance degradation)
may ensue.

For a script and a relatively obscure module like 'filecmp' monkey-patching
is probably OK, but for a larger app or a module like 'os' that is heavily
used throughout the standard lib I would play it safe and reimplement.

Peter

Feb 15 '07 #4
On Feb 15, 11:43 pm, Peter Otten <__pete...@web. dewrote:
Mattias Brändström wrote:
Just one small tought/question. How likely am I to run into trouble
because of this? I mean, by setting _cache to another value I'm
mucking about in filecmp's implementation details. Is this generally
considered OK when dealing with Python's standard library?

I think it's a feature that Python lends itself to monkey-patching, but
still there are a few things to consider:

- Every hack increases the likelihood that your app will break in the next
version of Python.
- You take some responsibility for the "patched" code. It's no longer the
tried and tested module as provided by the core developers.
- The module may be used elsewhere in the standard library or third-party
packages, and failures (or in the above example: performance degradation)
may ensue.

For a script and a relatively obscure module like 'filecmp' monkey-patching
is probably OK, but for a larger app or a module like 'os' that is heavily
used throughout the standard lib I would play it safe and reimplement.
Thanks for the insight! Right now I need this for a unit test, so in
this case I'm quite happy to use the NoCache solution you suggested.

:.:: brasse

Feb 15 '07 #5
Peter Otten wrote:
Mattias Brändström wrote:
>On Feb 15, 5:56 pm, Peter Otten <__pete...@web. dewrote:
>>You can clear the cache with

filecmp._cach e = {}

as a glance into the filecmp module would have shown.
You are right, a quick glance would have enlighten me. Next time I
will RTFS first. :-)
>>If you don't want to use the cache at all (untested):

class NoCache:
def __setitem__(sel f, key, value):
pass
def get(self, key):
return None
filecmp._cach e = NoCache()
Just one small tought/question. How likely am I to run into trouble
because of this? I mean, by setting _cache to another value I'm
mucking about in filecmp's implementation details. Is this generally
considered OK when dealing with Python's standard library?

I think it's a feature that Python lends itself to monkey-patching, but
still there are a few things to consider:

- Every hack increases the likelihood that your app will break in the next
version of Python.
- You take some responsibility for the "patched" code. It's no longer the
tried and tested module as provided by the core developers.
- The module may be used elsewhere in the standard library or third-party
packages, and failures (or in the above example: performance degradation)
may ensue.

For a script and a relatively obscure module like 'filecmp' monkey-patching
is probably OK, but for a larger app or a module like 'os' that is heavily
used throughout the standard lib I would play it safe and reimplement.
It would probably be a good idea to add a clear_cache() function to the
module API for 2.6 to avoid such issues.

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
Blog of Note: http://holdenweb.blogspot.com
See you at PyCon? http://us.pycon.org/TX2007

Feb 16 '07 #6

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

0
1832
by: Borek | last post by:
In my project I have to access in one business method of session bean usually more then 10 CMP Beans. I would like to have some utils classes, which could get me right instance of CMP or create one. Instead of making the same thing again and again. My proposition looks like: 1) Class for taking home interface: public final class JNDILookup { private static Context initialContext; private static HashMap homeInterfaces; private static...
3
1734
by: Ivan Van Laningham | last post by:
Hi All-- I noticed recently that a few of the jpgs from my digital cameras have developed bitrot. Not a real problem, because the cameras are CD Mavicas, and I can simply copy the original from the cd. Except for the fact that I've got nearly 25,000 images to check. So I wrote a set of programs to both index the disk versions with the cd versions, and to compare, using filecmp.cmp(), the cd and disk version. Works fine. Turned up...
1
2539
by: iqbal | last post by:
Hi Folks, If someone could please help me out with my frustration trying to install the Visual Age C++ libraries (vacpp.cmp.lib) V5.0.2 on AIX 5.1, ML1. I keep getting the following message from installp (using smitty): /vacpp.cmp.lib.post_i: TYPERML: 0403-009 The specified number is not valid for this command. instal: Failed while executing the ./vacpp.cmp.lib.post_i script.
39
2811
by: Antoon Pardon | last post by:
I was wondering how people would feel if the cmp function and the __cmp__ method would be a bit more generalised. The problem now is that the cmp protocol has no way to indicate two objects are incomparable, they are not equal but neither is one less or greater than the other. So I thought that either cmp could return None in this case or throw a specific exception. People writing a __cmp__ method could do the same.
4
1532
by: Schüle Daniel | last post by:
Hello, first question In : cmp("ABC",) Out: 1 against what part of the list is the string "ABC" compared? second question
26
2450
by: Ping | last post by:
Hi, I'm wondering if it is useful to extend the count() method of a list to accept a callable object? What it does should be quite intuitive: count the number of items that the callable returns True or anything logically equivalent (non-empty sequence, non-zero number, etc). This would return the same result as len(filter(a_callable, a_list)), but without constructing an intermediate list which is thrown away after len() is done.
6
2183
by: xkenneth | last post by:
Looking to do something similair. I'm working with alot of timestamps and if they're within a couple seconds I need them to be indexed and removed from a list. Is there any possible way to index with a custom cmp() function? I assume it would be something like... list.index(something,mycmp) Thanks!
9
1836
by: George Sakkis | last post by:
I want to sort sequences of strings lexicographically but those with longer prefix should come earlier, e.g. for s = , the sorted sequence is . Currently I do it with: s.sort(cmp=lambda x,y: 0 if x==y else -1 if x.startswith(y) else +1 if y.startswith(x) else cmp(x,y)) Can this be done with an equivalent key function instead of cmp ?
0
8979
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
8818
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
9354
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
9307
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
9223
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
8233
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
0
6067
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
1
3296
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
3
2203
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.