1 file, multiple threads

If I have multiple threads reading from the same file, would that be a
problem?

if yes, how would I solve it?

Let's say I want to take it a step further and start writing to 1 file from
multiple threads, how would I solve that?
thanx,

Guyon
Jul 18 '05 #1
In article <41***********************@news.nl.uu.net>,
Guyon Morée <gumuz@NO_looze_SPAM.net> wrote:

If I have multiple threads reading from the same file, would that be a
problem?

if yes, how would I solve it?

Let's say I want to take it a step further and start writing to 1 file from
multiple threads, how would I solve that?


Make a new thread just for dealing with the file and post messages to it
using Queue.
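
A minimal sketch of that idea, assuming modern Python 3 (where the Queue module is spelled queue). The names writer_loop, run_demo, and _STOP are made up for illustration, and the sentinel-based shutdown is just one possible way to stop the writer:

```python
import queue
import threading

_STOP = object()  # sentinel telling the writer thread to finish


def writer_loop(path, inbox):
    """Own the file: this is the only thread that opens or writes it."""
    with open(path, "a") as fh:
        while True:
            item = inbox.get()
            if item is _STOP:
                break
            fh.write(item + "\n")


def run_demo(path, n_workers=4, lines_each=100):
    inbox = queue.Queue()
    writer = threading.Thread(target=writer_loop, args=(path, inbox))
    writer.start()

    def worker(worker_id):
        # Workers never touch the file; they only post messages.
        for i in range(lines_each):
            inbox.put("worker %d line %d" % (worker_id, i))

    workers = [threading.Thread(target=worker, args=(w,))
               for w in range(n_workers)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()

    inbox.put(_STOP)  # all producers are done, so nothing follows the sentinel
    writer.join()
```

Because Queue does its own locking, the worker threads need no explicit lock, and each message is written as one whole line.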
--
Aahz (aa**@pythoncraft.com) <*> http://www.pythoncraft.com/

WiFi is the SCSI of the 21st Century -- there are fundamental technical
reasons for sacrificing a goat. (with no apologies to John Woods)
Jul 18 '05 #2
> If I have multiple threads reading from the same file, would that be a
> problem?

As long as you open each file with 'r' or 'rb' access only, it is not
a problem. I believe you can even write to that file from one (but
only one) thread while reading the file in multiple other threads.
This probably isn't the most robust way and may even cause some
conflicts. You may want to check on this though.

> Let's say I want to take it a step further and start writing to 1 file from
> multiple threads, how would I solve that?

If you use the built-in 'threading' module in Python, you can simply
acquire a mutex (implemented using Semaphore/Lock/RLock) before
writing and release it immediately afterward.

Example...

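Your main thread would allocate a file lock:

    import threading
    mutex_writefile = threading.Lock()

Each thread writing to a file can do the following:

    mutex_writefile.acquire()
    try:
        fh = open(fname, 'a')
        fh.write(text)   # text: the string to append
        fh.close()       # flush this thread's data before unlocking
    finally:
        # Release even if open() or write() raises.
        mutex_writefile.release()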

It is important that you release your locks or you will have a
deadlock on each thread waiting to write. See the threading module's
documentation for more details.
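
For what it's worth, here is a self-contained sketch of the same locking pattern for current Python 3, where a "with" block handles the release automatically. The names append_line and run_writers are hypothetical helpers for this example only:

```python
import threading

write_lock = threading.Lock()  # one lock shared by every writing thread


def append_line(path, text):
    # "with" releases the lock even if open() or write() raises.
    with write_lock:
        with open(path, "a") as fh:
            fh.write(text + "\n")


def run_writers(path, n_threads=4, lines_each=50):
    def worker(tid):
        for i in range(lines_each):
            append_line(path, "thread %d line %d" % (tid, i))

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Opening and closing the file inside the lock is slow but keeps the sketch simple: the data is flushed before the next thread gets its turn.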

You could also do the same with the thread module's lock object;
however, in my opinion you gain much more functionality with threading
with little increase in complexity.

-Jay
Jul 18 '05 #3
Jason wrote:
If I have multiple threads reading from the same file, would that be a
problem?


As long as you open each file with 'r' or 'rb' access only, it is not
a problem. I believe you can even write to that file from one (but
only one) thread while reading the file in multiple other threads.


You *can*, but due to buffering issues, it's likely that the reader
threads will not see changes made by the writer thread properly, and may
have issues with separate disk-reads which nominally stop/start at the
same location not actually matching because the underlying disk file has
changed. (Note that disk reads, which go into a buffer, do not
necessarily correlate in any predictable way to calls to the read*()
family of functions.)

In order to ensure consistent access to a mutable (i.e. not read-only)
file from multiple threads, it would be necessary to ensure that only
one thread was accessing the file at a given instant (i.e. use some form
of locking/synchronizing mechanism), and to be careful to flush all
buffers both before and after any file access. If the file can change
at all, then the only time that a given thread can make *any*
assumptions about the state of the file is during a single section in
which that thread has exclusive access to the file.
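
As a rough illustration of that locking-plus-flushing discipline, here is a sketch in Python 3 that funnels all access through one handle guarded by one lock. SharedFile is a hypothetical helper class, not anything from the standard library, and flush() here only hands data to the OS (it does not force it to disk):

```python
import threading


class SharedFile:
    """Hypothetical helper: one handle, one lock, explicit flushes."""

    def __init__(self, path):
        self._fh = open(path, "a+")
        self._lock = threading.Lock()

    def append(self, text):
        with self._lock:
            self._fh.seek(0, 2)   # move to the end of the file
            self._fh.write(text)
            self._fh.flush()      # hand buffered data to the OS before unlocking

    def read_all(self):
        with self._lock:
            self._fh.flush()      # make sure nothing is still buffered
            self._fh.seek(0)
            return self._fh.read()

    def close(self):
        with self._lock:
            self._fh.close()
```

A thread can only trust what read_all() returned for as long as it takes another thread to grab the lock, which is the point made above about exclusive sections.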

Much simpler to designate a single file-handler 'server', and have each
thread access the file only through the intermediary of this server
(which could be implemented as a separate thread itself). The server
then manages all of the file buffers, both in and out, and can ensure
that each access happens in a consistent way.

Jeff Shannon
Technician/Programmer
Credit International

Jul 18 '05 #4
Jeff Shannon wrote:
Much simpler to designate a single file-handler 'server', and have each
thread access the file only through the intermediary of this server
(which could be implemented as a separate thread itself). The server
then manages all of the file buffers, both in and out, and can ensure
that each access happens in a consistent way.


Yes - Aahz's somewhat terse suggestion of using the threading module and a Queue
is the best way to achieve this.

Localising everything to do with parsing and formatting the file correctly also
becomes trivial in this case - the clients of the file handler thread can deal
with standard Python objects, leaving the file handler to cope with the file
format issues.
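
A small sketch of what that split might look like in Python 3: clients pass plain dicts, and only the handler thread knows the on-disk format. The JSON-lines format, the (op, payload, reply) request tuple, and the names file_handler and call are all assumptions made up for this example. Error handling is left out, so a failing open() would leave the caller waiting.

```python
import json
import queue
import threading


def file_handler(path, requests):
    """Serve requests until None arrives; only this thread touches the file."""
    while True:
        req = requests.get()
        if req is None:
            break
        op, payload, reply = req
        if op == "append":
            with open(path, "a") as fh:
                fh.write(json.dumps(payload) + "\n")
            reply.put(True)
        elif op == "read_all":
            with open(path) as fh:
                reply.put([json.loads(line) for line in fh])
        else:
            reply.put(None)  # unknown operation


def call(requests, op, payload=None):
    """Send one request and block until the handler answers."""
    reply = queue.Queue(maxsize=1)
    requests.put((op, payload, reply))
    return reply.get()
```

A client thread would then do something like call(requests, "append", {"id": 1}) and never see the file format at all.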

It can also be very fast, if the file is small enough that the whole thing can
be loaded into memory.

Regards,
Nick.
Jul 18 '05 #5
"Guyon Morée" <gumuz@NO_looze_SPAM.net> wrote in message news:<41***********************@news.nl.uu.net>...
If I have multiple threads reading from the same file, would that be a
problem?

if yes, how would I solve it?

Let's say I want to take it a step further and start writing to 1 file from
multiple threads, how would I solve that?
thanx,

Guyon


The short answer: yes, it will.
The long answer: how it will hurt you depends on the OS, the sequence of
operations, and the kind of operations.

If you read the "next line" (or something like that) or read data from
random positions, putting lock/unlock around seek/read will solve the
problem (and maybe defeat the multithreading gain).
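
A sketch of that lock-around-seek/read idea in Python 3, for threads sharing one read-only handle. LockedReader and read_at are hypothetical names for this example:

```python
import threading


class LockedReader:
    """Share one read-only handle between threads."""

    def __init__(self, path):
        self._fh = open(path, "rb")
        self._lock = threading.Lock()

    def read_at(self, offset, size):
        # seek and read must happen as one step, otherwise another
        # thread could move the file position in between.
        with self._lock:
            self._fh.seek(offset)
            return self._fh.read(size)

    def close(self):
        self._fh.close()
```

Only one thread reads at a time here, which is the lost multithreading gain mentioned above. Giving each thread its own handle avoids the lock for read-only files.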

In any case, writes may not be seen by readers.

Remember that each open file handle has a single current position,
shared by reads and writes.
For writing from multiple threads you also need a lock/unlock guard.

You really would be better off if you used Queues in all threads.
Jul 18 '05 #6
Other folks have answered your questions, although some of the answers
may be open to interpretation.

I just want to throw in a warning. Multi-threaded programming can be a
little tricky. If you don't know what you are doing, or if you are not
careful, you can program bugs that manifest themselves very rarely. I've
seen them that cropped up only about once every two months on average in
a continuously running program. Reproducing the failure can be next to
impossible. Finding the bugs can be a nightmare.

You can even write a program that works correctly "by accident".
Later you introduce an apparently innocuous change that introduces a bug,
but it does not show up in testing. Over the course of days or months you
make more changes to the program. The bug you introduced way back when
finally manifests itself. You will assume the problem was caused by the
last change you made. Big trouble.

Look before you leap.
Jul 18 '05 #7
Sorry about the line wraps. I've gone over to the dark side and Outlook
Express. Still haven't quite got the hang of it.

Sorry about the HTML. Ditto.
Jul 18 '05 #8
