Locking around

Hello,

I need to synchronize the access to a couple of hundred-thousand
files[1]. It seems to me that creating one lock object for each of the
files is a waste of resources, but I cannot use a global lock for all
of them either (since the locked operations go over the network, this
would make the whole application essentially single-threaded even
though most operations act on different files).

My idea is therefore to create and destroy per-file locks "on-demand"
and to protect the creation and destruction by a global lock
(self.global_lock). For that, I add a "usage counter"
(wlock.user_count) to each lock, and destroy the lock when it reaches
zero. The currently active lock objects are stored in a dict:

    def lock_s3key(s3key):

        self.global_lock.acquire()
        try:

            # If there is a lock object, use it
            if s3key in self.key_lock:
                wlock = self.key_lock[s3key]
                wlock.user_count += 1

            # otherwise create a new lock object
            else:
                wlock = WrappedLock()
                wlock.lock = threading.Lock()
                wlock.user_count = 1
                self.key_lock[s3key] = wlock

            # Needed in both branches, otherwise `lock` is undefined below
            lock = wlock.lock

        finally:
            self.global_lock.release()

        # Lock the key itself
        lock.acquire()

and similarly

    def unlock_s3key(s3key):

        # Lock dictionary of lock objects
        self.global_lock.acquire()
        try:

            # Get lock object
            wlock = self.key_lock[s3key]

            # Unlock key
            wlock.lock.release()

            # We don't use the lock object any longer
            wlock.user_count -= 1

            # If no other thread uses the lock, dispose it
            if wlock.user_count == 0:
                del self.key_lock[s3key]
            assert wlock.user_count >= 0

        finally:
            self.global_lock.release()

WrappedLock is just an empty class that allows me to add the
additional user_count attribute.

My questions:

- Does that look like a proper solution, or does anyone have a better
one?

- Did I overlook any deadlock possibilities?
Best,
Nikolaus

[1] Actually, it's not really files (because in that case I could use
fcntl) but blobs stored on Amazon S3.
--
»It is not worth an intelligent man's time to be in the majority.
By definition, there are already enough people to do that.«
-J.H. Hardy

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
Aug 4 '08 #1
Ulrich Eckhardt <ec******@satorlaser.com> writes:
> Nikolaus Rath wrote:
>> I need to synchronize the access to a couple of hundred-thousand
>> files[1]. It seems to me that creating one lock object for each of the
>> files is a waste of resources, but I cannot use a global lock for all
>> of them either (since the locked operations go over the network, this
>> would make the whole application essentially single-threaded even
>> though most operations act on different files).
>
> Just wondering, but at what time do you know what files are needed?

As soon as I have read a client request. Also, I will only need one
file per request, not multiple.

> If you know that rather early, you could simply 'check out' the
> required files, do whatever you want with them and then release them
> again. If one of the requested files is marked as already in use,
> you simply wait (without reserving the others) until someone
> releases files and then try again. You could also wait for that
> precise file to be available, but that would require that you
> already reserve the other files, which might unnecessarily block
> other accesses.
>
> Note that this idea requires that each access locks one set of files at the
> beginning and releases them at the end, i.e. no attempts to lock files in
> between, which would otherwise easily lead to deadlocks.
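
The 'check out' scheme quoted above can be sketched roughly as follows. This is only an illustration under assumptions: the class name FileCheckout and its method names are hypothetical, and it uses the modern spelling notify_all() rather than notifyAll(). A thread waits without reserving anything until every key it asked for is free, then marks the whole set as in use in one step.

```python
import threading


class FileCheckout:
    """Hypothetical helper: reserve a whole set of keys in one step."""

    def __init__(self):
        self._cond = threading.Condition()
        self._in_use = set()

    def check_out(self, keys):
        keys = set(keys)
        with self._cond:
            # Wait, without reserving anything, until none of the
            # requested keys is marked as in use.
            while self._in_use & keys:
                self._cond.wait()
            # All requested keys are free: mark them in one step.
            self._in_use |= keys

    def release(self, keys):
        keys = set(keys)
        with self._cond:
            self._in_use -= keys
            # Wake every waiter so that each one re-checks its own set.
            self._cond.notify_all()
```

Because a request either gets all of its keys or none, no thread ever holds some keys while waiting for others, which is what rules out the deadlock mentioned in the note.
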
I am not sure that I understand your idea. To me this sounds exactly
like what I'm already doing, just replace 'check out' by 'lock' in
your description... Am I missing something?

>> My idea is therefore to create and destroy per-file locks "on-demand"
>> and to protect the creation and destruction by a global lock
>> (self.global_lock). For that, I add a "usage counter"
>> (wlock.user_count) to each lock, and destroy the lock when it reaches
>> zero.
>> [...code...]
>> - Does that look like a proper solution, or does anyone have a better
>> one?
>
> This should work, at least the idea is not flawed. However, I'd say
> there are too many locks involved. Rather, you just need a simple
> flag and the global lock. Further, you need a condition/event that
> tells waiting threads that you released some of the files so that it
> should see again if the ones it wants are available.

I have to agree that this sounds like an easier implementation. I just
have to think about how to do the signalling. Thanks a lot!
Best,

-Nikolaus

Aug 6 '08 #2
Nikolaus Rath <Ni******@rath.org> writes:
>> This should work, at least the idea is not flawed. However, I'd say
>> there are too many locks involved. Rather, you just need a simple
>> flag and the global lock. Further, you need a condition/event that
>> tells waiting threads that you released some of the files so that it
>> should see again if the ones it wants are available.
>
> I have to agree that this sounds like an easier implementation. I
> just have to think about how to do the signalling. Thanks a lot!

Here's the code I use now. I think it's also significantly easier to
understand (cv is a threading.Condition() object and cv.locked_keys a
set()).

    def lock_s3key(s3key):
        cv = self.s3_lock

        try:
            # Lock set of locked s3 keys (global lock)
            cv.acquire()

            # Wait for given s3 key becoming unused
            while s3key in cv.locked_keys:
                cv.wait()

            # Mark it as used (local lock)
            cv.locked_keys.add(s3key)
        finally:
            # Release global lock
            cv.release()

    def unlock_s3key(s3key):
        cv = self.s3_lock

        try:
            # Lock set of locked s3 keys (global lock)
            cv.acquire()

            # Mark key as free (release local lock)
            cv.locked_keys.remove(s3key)

            # Notify other threads
            cv.notify()

        finally:
            # Release global lock
            cv.release()

Best,

-Nikolaus

Aug 6 '08 #3
On Aug 4, 9:30 am, Nikolaus Rath <Nikol...@rath.org> wrote:
> Hello,
>
> I need to synchronize the access to a couple of hundred-thousand
> files[1]. It seems to me that creating one lock object for each of the
> files is a waste of resources, but I cannot use a global lock for all
> of them either (since the locked operations go over the network, this
> would make the whole application essentially single-threaded even
> though most operations act on different files).
>
> My idea is therefore to create and destroy per-file locks "on-demand"
> and to protect the creation and destruction by a global lock
> (self.global_lock). For that, I add a "usage counter"
> (wlock.user_count) to each lock, and destroy the lock when it reaches
> zero.
> [snip]
> My questions:
>
> - Does that look like a proper solution, or does anyone have a better
> one?

You don't need the per-file locks at all if you use a global lock like
this. Here's a way to do it using threading.Condition objects. I
suspect it might not perform so well if there is a lot of competition
for certain keys but it doesn't sound like that's the case for you.
Performance and robustness improvements left as an exercise. (Note:
I'm not sure where self comes from in your examples so I left it out
of mine.)

    global_lock = threading.Condition()
    locked_keys = set()

    def lock_s3key(s3key):
        global_lock.acquire()
        while s3key in locked_keys:
            global_lock.wait()
        locked_keys.add(s3key)
        global_lock.release()

    def unlock_s3key(s3key):
        global_lock.acquire()
        locked_keys.remove(s3key)
        global_lock.notifyAll()
        global_lock.release()

Carl Banks
Aug 6 '08 #4
On Aug 6, 6:34 am, Nikolaus Rath <Nikol...@rath.org> wrote:
> Nikolaus Rath <Nikol...@rath.org> writes:
>>> This should work, at least the idea is not flawed. However, I'd say
>>> there are too many locks involved. Rather, you just need a simple
>>> flag and the global lock. Further, you need a condition/event that
>>> tells waiting threads that you released some of the files so that it
>>> should see again if the ones it wants are available.
>> I have to agree that this sounds like an easier implementation. I
>> just have to think about how to do the signalling. Thanks a lot!
>
> Here's the code I use now. I think it's also significantly easier to
> understand (cv is a threading.Condition() object and cv.locked_keys a
> set()).
>
>     def lock_s3key(s3key):
>         cv = self.s3_lock
>
>         try:
>             # Lock set of locked s3 keys (global lock)
>             cv.acquire()
>
>             # Wait for given s3 key becoming unused
>             while s3key in cv.locked_keys:
>                 cv.wait()
>
>             # Mark it as used (local lock)
>             cv.locked_keys.add(s3key)
>         finally:
>             # Release global lock
>             cv.release()
>
>     def unlock_s3key(s3key):
>         cv = self.s3_lock
>
>         try:
>             # Lock set of locked s3 keys (global lock)
>             cv.acquire()
>
>             # Mark key as free (release local lock)
>             cv.locked_keys.remove(s3key)
>
>             # Notify other threads
>             cv.notify()
>
>         finally:
>             # Release global lock
>             cv.release()

Freaky... I just posted nearly this exact solution.

I have a couple comments. First, the call to acquire should come
before the try block. If the acquire were to fail, you wouldn't want
to release the lock on cleanup.

Second, you need to change notify() to notifyAll(); notify alone won't
cut it. Consider what happens if you have two threads waiting for
keys A and B respectively. When the thread that has B is done, it
releases B and calls notify, but notify happens to wake up the thread
waiting on A. Thus the thread waiting on B is starved.
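
Both points can be put together in a short sketch. It is not code from the thread: the class name KeyLocks and the method names lock/unlock are hypothetical, and it uses the modern spelling notify_all() for what is called notifyAll() above. The acquire sits outside the try block, and every waiter is woken so that the one waiting for the released key is sure to re-check.

```python
import threading


class KeyLocks:
    """Hypothetical wrapper around one Condition and a set of locked keys."""

    def __init__(self):
        self._cond = threading.Condition()
        self._locked = set()

    def lock(self, key):
        # Acquire *before* the try block: if acquire() itself failed,
        # the finally clause must not try to release the lock.
        self._cond.acquire()
        try:
            # Re-check after every wake-up; another key may have been freed.
            while key in self._locked:
                self._cond.wait()
            self._locked.add(key)
        finally:
            self._cond.release()

    def unlock(self, key):
        # "with" is the shorter equivalent of acquire/try/finally/release.
        with self._cond:
            self._locked.remove(key)
            # Wake all waiters, not just one, so that a thread waiting
            # for this particular key cannot be skipped.
            self._cond.notify_all()
```

With notify() in place of notify_all(), a release of B could wake only the thread waiting for A, which would go straight back to sleep while the thread waiting for B never hears about it.
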
Carl Banks
Aug 6 '08 #5
Carl Banks <pa************@gmail.com> writes:
> Freaky... I just posted nearly this exact solution.
>
> I have a couple comments. First, the call to acquire should come
> before the try block. If the acquire were to fail, you wouldn't want
> to release the lock on cleanup.
>
> Second, you need to change notify() to notifyAll(); notify alone won't
> cut it. Consider what happens if you have two threads waiting for
> keys A and B respectively. When the thread that has B is done, it
> releases B and calls notify, but notify happens to wake up the thread
> waiting on A. Thus the thread waiting on B is starved.

You're right. Thanks for pointing it out.

Best,

-Nikolaus

Aug 6 '08 #6
On Mon, 04 Aug 2008 15:30:51 +0200, Nikolaus Rath wrote:
> Hello,
>
> I need to synchronize the access to a couple of hundred-thousand
> files[1]. It seems to me that creating one lock object for each of the
> files is a waste of resources, but I cannot use a global lock for all
> of them either (since the locked operations go over the network, this
> would make the whole application essentially single-threaded even
> though most operations act on different files).

Do you think you could use an SQL database on the network to
handle the locking? I was thinking of a table with one row
per file. If the lock field is clear, you could update with a unique
ID, and query back to make sure that it is still yours before
accessing the file.
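
A minimal sketch of that table idea, using the standard sqlite3 module only so that it runs locally. The table name locks, the column names s3key and owner, and all function names are assumptions made for illustration; a networked SQL server would need its own connection setup and transaction handling.

```python
import sqlite3
import uuid


def new_owner_id():
    # Unique ID that identifies whoever is trying to take the lock.
    return uuid.uuid4().hex


def make_lock_table(conn, keys):
    # One row per file; a NULL owner means the lock field is clear.
    conn.execute("CREATE TABLE locks (s3key TEXT PRIMARY KEY, owner TEXT)")
    conn.executemany(
        "INSERT INTO locks (s3key, owner) VALUES (?, NULL)",
        [(k,) for k in keys],
    )
    conn.commit()


def try_lock(conn, s3key, owner):
    # The UPDATE only changes the row if the lock field is still clear.
    conn.execute(
        "UPDATE locks SET owner = ? WHERE s3key = ? AND owner IS NULL",
        (owner, s3key),
    )
    conn.commit()
    # Query back to make sure the row really carries our ID.
    row = conn.execute(
        "SELECT owner FROM locks WHERE s3key = ?", (s3key,)
    ).fetchone()
    return row is not None and row[0] == owner


def unlock(conn, s3key, owner):
    # Clear the field only if we are the current owner.
    conn.execute(
        "UPDATE locks SET owner = NULL WHERE s3key = ? AND owner = ?",
        (s3key, owner),
    )
    conn.commit()
```

A caller would create an ID with new_owner_id(), call try_lock() and retry later if it returns False. Unlike the condition-variable versions, nothing here wakes a waiting caller, so waiting means polling.
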

Hey, maybe the files themselves should go into blobs.
Aug 6 '08 #7
Hey,

I'm trying to figure out how I can validate an XML file using a DTD that
isn't specified in the XML file.

My code so far is:

from xml import sax
from xml.sax import sax2exts

parser = sax2exts.XMLValParserFactory.make_parser()

parser.setContentHandler(handler)
parser.setErrorHandler(handler)

parser.parse(xml_file)

And this works fine if the DTD is specified in the XML file, i.e. errors
are generated for non-compliant entities. But I would like to force the
file to be valid according to one other DTD file that is not referenced
in the XML file.

Anyone know how to do this?

Cheers,
Brian
Aug 6 '08 #8
Tobiah <to**@tobiah.org> writes:
> On Mon, 04 Aug 2008 15:30:51 +0200, Nikolaus Rath wrote:
>> Hello,
>>
>> I need to synchronize the access to a couple of hundred-thousand
>> files[1]. It seems to me that creating one lock object for each of the
>> files is a waste of resources, but I cannot use a global lock for all
>> of them either (since the locked operations go over the network, this
>> would make the whole application essentially single-threaded even
>> though most operations act on different files).
>
> Do you think you could use an SQL database on the network to
> handle the locking?

Yeah, I could. It wouldn't even have to be over the network (I'm
synchronizing access from within the same program). But I think that
is even more resource-wasteful than my original idea.

> Hey, maybe the files themselves should go into blobs.

Nope, not possible. They're on Amazon S3.

Best,

-Nikolaus

Aug 6 '08 #9
Brian Quinlan <br***@sweetapp.com> writes:
> I'm trying to figure out how I can validate an XML file using a DTD
> that isn't specified in the XML file.

When your intention is to start a new discussion, you could compose a
new message, *not* reply to an existing message. Your message here is
now part of an existing thread of discussion, yet is confusingly
unrelated in its content, and will not be noticed by most readers.

--
\ “Whatever you do will be insignificant, but it is very |
`\ important that you do it.” —Mahatma Gandhi |
_o__) |
Ben Finney
Aug 6 '08 #10
