page faults when spawning subprocesses

I am working on a network management program written in python that has
multiple threads (typically 20+) spawning subprocesses which are used
to communicate with other systems on the network. This runs fine for a
while, but eventually slows down to a crawl. Running sar shows that
when it is running slowly there is an exceptionally large number of
minor page faults - there are continuously 14000 faults/sec, with a
variation of about +/-100. There are no pages swapped to disk, these
are purely in-memory faults.

I have a hypothesis about what is happening, but have not been able to
prove or disprove it:
the theory is that when a subprocess is spawned, there is a small
window between the call to fork and the call to exec where the parent's
memory is shared between the two processes. Linux marks the memory as
copy-on-write, so if the parent process then accesses memory during
that window a minor page fault is generated and the page is copied.
Normally this is not a problem, but with a large number of threads all
spawning subprocesses there is a chance of another process being
spawned during that window, so that the whole of memory is copied. This
slows everything else down so the probability of another collision
increases, and the whole thing snowballs. This could also happen if
something else tries to write to large areas of memory (maybe the
python garbage collector?).
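
A rough way to check the copy-on-write part of the theory is to count the minor faults the parent takes when it writes to memory while a forked child is still alive. The sketch below is only an illustration: the function names and the 16 MB buffer size are made up for the example, it assumes Linux and Python 3, and transparent huge pages or the interpreter's own writes may skew the numbers.

```python
import os
import resource


def minor_faults():
    """Minor (in-memory) page faults taken so far by this process."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_minflt


def cow_fault_demo(n_bytes=16 * 1024 * 1024):
    """Return (faults_without_child, faults_with_child) for the same writes."""
    page = resource.getpagesize()
    buf = bytearray(n_bytes)

    def touch():
        # Write one byte in every page of the buffer.
        for off in range(0, n_bytes, page):
            buf[off] = 1

    touch()  # make sure every page is resident and private first

    start = minor_faults()
    touch()
    baseline = minor_faults() - start

    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: do nothing except stay alive until the parent is done,
        # so the parent's pages remain shared copy-on-write.
        os.close(w)
        os.read(r, 1)  # returns at EOF, when the parent closes its end
        os._exit(0)

    os.close(r)
    start = minor_faults()
    touch()  # each write to a still-shared page should fault and copy
    shared = minor_faults() - start

    os.close(w)  # let the child exit
    os.waitpid(pid, 0)
    return baseline, shared


if __name__ == "__main__":
    print(cow_fault_demo())
```

If the second number is much larger than the first, the parent is paying for page copies while the child exists. That only shows the mechanism can be observed; it does not prove that this is what the real program is suffering from.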

This is running on a Sun V40 64-bit SMP with Fedora Core 3. The same
code has been run on Intel systems and the problem has not been seen.
This could be because the problem is specific to that hardware, or
because the Intel systems are not fast enough for a collision to occur.

My questions are:

1) is the theory plausible/likely?

2) what could I do to prove/disprove it?

3) has anyone else seen this problem?

4) are there any other situations that could be causing a continuous
stream of minor page faults?

5) WTF can I do about it?

Dave Kirby
(dave.x.kirby at
gmail dot
com)

Nov 9 '05 #1
Dave Kirby wrote:

5) WTF can I do about it?


Maybe using vfork rather than fork would help. But
I'm not sure that will work as intended when there
are multiple threads; in fact, I'm not sure fork
will work either. The fork could race against
another thread that is inside a critical region,
duplicating the memory map at a point where some
data structures are in an inconsistent state and
appear to be locked by a thread that exists only
in the parent.

A possible solution would be to use fork to create
two processes before creating any threads. Have
them communicate over pipes or sockets when new
processes are to be created. Then one process can
create all the threads you need, and the other can
fork off children.
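
A minimal sketch of that two-process layout in Python, assuming Python 3 on a system where the "fork" start method is available. The Spawner class and _spawner_loop function are hypothetical names invented for this example, and the helper runs one command at a time, which a real program would probably want to relax.

```python
import multiprocessing as mp
import subprocess
import threading


def _spawner_loop(conn, parent_end):
    """Runs in the single-threaded helper: execute each requested command."""
    parent_end.close()  # the helper only needs its own end of the pipe
    while True:
        argv = conn.recv()
        if argv is None:  # shutdown sentinel
            break
        done = subprocess.run(argv, stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT)
        conn.send((done.returncode, done.stdout))
    conn.close()


class Spawner:
    """Create this BEFORE starting any threads."""

    def __init__(self):
        ctx = mp.get_context("fork")
        self._conn, child_conn = ctx.Pipe()
        self._proc = ctx.Process(target=_spawner_loop,
                                 args=(child_conn, self._conn),
                                 daemon=True)
        self._proc.start()
        child_conn.close()
        self._lock = threading.Lock()

    def run(self, argv):
        """Ask the helper to run argv; returns (returncode, output bytes)."""
        with self._lock:  # one request/reply exchange at a time
            self._conn.send(list(argv))
            return self._conn.recv()

    def close(self):
        with self._lock:
            self._conn.send(None)
        self._proc.join()
```

The threaded process never forks once its threads are running; every fork happens in the small helper, which has little memory to mark copy-on-write and no other threads that could be caught inside a critical region.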

Even in that case vfork may come in handy. If you
dislike the semantics of vfork, but still want the
parent to block until the child has called execve,
then you can do so manually using a pipe. Create
the pipe before calling fork. In the parent process,
close the write end and read from the pipe. In the
child process, close the read end and mark the write
end close-on-exec. When exec succeeds, the pipe is
closed and the parent gets EOF.
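
The pipe trick can be written in Python roughly as follows. This is a sketch, assuming Python 3 on a POSIX system; spawn_and_wait_for_exec is a name invented here. Sending the errno back through the pipe when exec fails is an addition to the description above. The code still runs interpreter code in the child between fork and exec, so it does not remove the multi-threading hazard.

```python
import os


def spawn_and_wait_for_exec(argv):
    """Fork, exec argv in the child, and block until the exec has happened.

    Returns (pid, err), where err is 0 if exec succeeded, or the errno
    reported by the child if it failed. The caller must still waitpid(pid).
    """
    r, w = os.pipe()
    # os.pipe() already returns non-inheritable (close-on-exec) descriptors
    # in Python 3.4+; this call just makes the intent explicit.
    os.set_inheritable(w, False)

    pid = os.fork()
    if pid == 0:
        # Child: keep only the write end, which closes itself on exec.
        try:
            os.close(r)
            os.execvp(argv[0], argv)  # does not return on success
        except OSError as exc:
            os.write(w, str(exc.errno).encode())
        finally:
            os._exit(127)

    # Parent: close our copy of the write end, then read until EOF.
    os.close(w)
    data = b""
    while True:
        chunk = os.read(r, 64)
        if not chunk:  # EOF: the child exec'ed or exited
            break
        data += chunk
    os.close(r)
    return pid, (int(data) if data else 0)
```

An empty read means the write end was closed by a successful exec; any bytes in the pipe mean the exec failed and carry the reason.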

(I have tried some of this in C, but I must admit,
I don't know if it can be done in Python as well.)

--
Kasper Dupont
Note to self: Don't try to allocate
256000 pages with GFP_KERNEL on x86.
Nov 9 '05 #2
Dave Kirby wrote:
I am working on a network management program written in python that has
multiple threads (typically 20+) spawning subprocesses which are used
to communicate with other systems on the network. ...


Let me check if I got you right: You are using fork() inside a thread in
a multi-threaded environment. That sounds complicated. :-)

Have a look at

http://www.opengroup.org/onlinepubs/...ions/fork.html

It mentions your use of fork, i.e. to create a new process running a
different program (in this case the call to fork() is soon followed by a
call to exec()).

If you fork in your multi-threaded environment, what happens with all
your threads? The document says only that "the effects of calling functions
that require certain resources between the call to fork() and the call
to an exec function are undefined." Maybe you are just experiencing
this :-)

The document above recommends: "to avoid errors, the child process may
only execute async-signal-safe operations until such time as one of the
exec functions is called."
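
One way to follow that advice from Python is to avoid running any interpreter code in the child at all, for example by letting the C library create the process through os.posix_spawnp. The sketch below assumes Python 3.9 or later on a POSIX system (posix_spawnp was added in 3.8, waitstatus_to_exitcode in 3.9); spawn_capture is a hypothetical helper name, and whether the C library uses fork or something closer to vfork internally depends on the platform.

```python
import os


def spawn_capture(argv):
    """Run argv via posix_spawn and return (exit_code, stdout bytes)."""
    r, w = os.pipe()
    # In the child, make the pipe's write end its stdout. The original
    # descriptors are close-on-exec, so only the duplicate survives exec.
    actions = [(os.POSIX_SPAWN_DUP2, w, 1)]
    pid = os.posix_spawnp(argv[0], argv, os.environ, file_actions=actions)

    os.close(w)  # parent keeps only the read end
    chunks = []
    while True:
        chunk = os.read(r, 65536)
        if not chunk:  # EOF: child closed its stdout
            break
        chunks.append(chunk)
    os.close(r)

    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status), b"".join(chunks)
```

Because the file descriptor setup is passed as data rather than executed as Python in the child, nothing here depends on locks or other state that another thread might have held at the moment of the fork.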

Maybe this discussion is also of some help:

http://groups.google.com/group/comp....7660515af867ea

Cheers
Daniel
Nov 9 '05 #3
