Bytes | Software Development & Data Engineering Community
Re: how to remove oldest files up to a limit efficiently



Dan Stromberg wrote:
> On Tue, 08 Jul 2008 15:18:23 -0700, li******@gmail.com wrote:
>> I need to maintain a filesystem where I'll keep only the most recently
>> used (MRU) files; least recently used ones (LRU) have to be removed to
>> leave space for newer ones. The filesystem in question is a clustered fs
>> (glusterfs) which is very slow on "find" operations. To add complexity
>> there are more than 10^6 files in 2 levels: 16³ dirs with equally
>> distributed number of files inside.
>> Any suggestions of how to do it effectively?
>
> os.walk once.
>
> Build a list of all files in memory.
>
> Sort them by whatever time you prefer - you can get times from os.stat.

Since you do not need all 10**6 files sorted, you might also try the
heapq module. The entries into the heap would be (time, fileid)

Jul 9 '08 #1
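
A minimal sketch of the os.walk plus heapq suggestion above, assuming "time" means access time (st_atime) and that the file path serves as the "fileid". The function name oldest_files and its parameters are made up for illustration; they do not come from the thread.

```python
import heapq
import os


def oldest_files(root, count):
    """Return up to `count` (atime, path) pairs, least recently used first."""
    def entries():
        # One pass over the tree, one stat per file.
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    yield (os.stat(path).st_atime, path)
                except OSError:
                    continue  # file vanished or cannot be stat'ed; skip it

    # nsmallest keeps only `count` entries in memory at a time,
    # so the full list of 10**6 files is never sorted.
    return heapq.nsmallest(count, entries())
```

The returned paths could then be passed to os.remove one by one. This still stats every file once, which is the slow part on a clustered filesystem; it only avoids the full sort.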
On Jul 9, 7:08 pm, Terry Reedy <tjre...@udel.edu> wrote:
> Dan Stromberg wrote:
>> On Tue, 08 Jul 2008 15:18:23 -0700, linux...@gmail.com wrote:
>>> I need to maintain a filesystem where I'll keep only the most recently
>>> used (MRU) files; least recently used ones (LRU) have to be removed to
>>> leave space for newer ones. The filesystem in question is a clustered fs
>>> (glusterfs) which is very slow on "find" operations. To add complexity
>>> there are more than 10^6 files in 2 levels: 16³ dirs with equally
>>> distributed number of files inside.
>>> Any suggestions of how to do it effectively?
>> os.walk once.
>> Build a list of all files in memory.
>> Sort them by whatever time you prefer - you can get times from os.stat.
>
> Since you do not need all 10**6 files sorted, you might also try the
> heapq module. The entries into the heap would be (time, fileid)

I'll look into it: probably sorting dirs by atime and adding the files
inside to the heapq until I can remove enough of them would work very
efficiently.

Thanks
Pau
Jul 9 '08 #2
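
One way to read the idea in the last reply, as a rough sketch rather than a tested recipe for glusterfs: order directories by their own atime, push files from the oldest directories onto a heap until the candidates cover the space needed, then delete oldest first. The names dirs_by_atime, remove_oldest, bytes_to_free and the remove hook are hypothetical. It also assumes a directory's atime is a usable hint for the age of its files, which depends on mount options and is not guaranteed.

```python
import heapq
import os


def dirs_by_atime(root):
    """Return (atime, path) for every directory below `root`, oldest first."""
    found = []
    for dirpath, dirnames, _filenames in os.walk(root):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            found.append((os.stat(path).st_atime, path))
    found.sort()
    return found


def remove_oldest(root, bytes_to_free, remove=os.remove):
    """Remove least recently used files until `bytes_to_free` is reclaimed.

    Only files from the oldest directories are stat'ed. Returns bytes freed.
    """
    heap = []  # (atime, path, size) candidates from the oldest directories
    candidate_bytes = 0
    for _dir_atime, dirpath in dirs_by_atime(root):
        if candidate_bytes >= bytes_to_free:
            break  # enough candidates collected; skip the newer directories
        with os.scandir(dirpath) as entries:
            for entry in entries:
                if entry.is_file(follow_symlinks=False):
                    info = entry.stat(follow_symlinks=False)
                    heapq.heappush(heap, (info.st_atime, entry.path, info.st_size))
                    candidate_bytes += info.st_size

    freed = 0
    while heap and freed < bytes_to_free:
        _atime, path, size = heapq.heappop(heap)  # oldest candidate first
        remove(path)
        freed += size
    return freed
```

Passing remove=print (or a list's append method) gives a dry run that shows what would be deleted. Files sitting directly in root are ignored here, matching the layout described in the question where files live only in the leaf directories.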
