Why does start_new_thread() create an extra process under Linux?

Jon Perez

Running the following under Linux creates
3 processes instead of 2. Once the started
thread exits, 2 processes still remain. Why?

import thread
from thread import start_new_thread

def newthread():
    print "child"

    # .... suitable delay ...

    thread.exit()

start_new_thread(newthread, ())
while 1:
    pass
Note: I am running Linux kernel 2.6.7 / glibc 2.3.2 (Slackware 10).

Jul 18 '05 #1


Jp Calderone

Jon Perez wrote:

Running the following under Linux creates
3 processes instead of 2. Once the started
thread exits, 2 processes still remain. Why?

Most likely, the "extra" process you are seeing is an implementation detail
of your platform's underlying thread library. It probably exists to act
as a scheduler or perform other administrative tasks for the "real"
threads of your application.

Jp

Jul 18 '05 #2

Heiko Wundram

On Thursday, 29 July 2004 16:00, Jp Calderone wrote:

Most likely, the "extra" process you are seeing is an implementation detail
of your platform's underlying thread library. It probably exists to act
as a scheduler or perform other administrative tasks for the "real"
threads of your application.

Well, first of all, what the OP was seeing wasn't actually what he thought he
was seeing.

In Python there's always the main thread (which is started when python starts
up), and other threads may be started. Thus, if you start two threads in your
program, you'll see three processes in the process list (one for the main
thread, two for the started threads).

But whether these threads will show up as processes depends on the threading
library you use...

LinuxThreads creates a process for each thread that is run. All these
processes share the same memory, although they show up as separate processes,
and to the kernel they actually are. They are started with the clone() system
call, which creates a new process with its own process ID, stack and
instruction pointer, but keeps the data and code segments of the cloning
process.

NPTL (Native POSIX Thread Library), the "next-generation" threads library for
Linux, handles threads "correctly" in the sense that they are just one
process with separate execution frames but shared memory. NPTL requires
kernel >= 2.5.40-something and a specially adapted glibc. Most new Linux
distributions (>= 9.0 something, Debian sid a.k.a. unstable) ship with NPTL
enabled by default, although this creates compatibility problems with apps
written for LinuxThreads, as LinuxThreads isn't completely POSIX threads
compatible (which NPTL is). NPTL also uses some form of syscall, but you'd
have to see the docs for this; I don't know the details. ps from procps was
augmented to support NPTL threads some time ago; there's a specific flag you
have to specify to have threads shown.
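
As a rough illustration of the "one process, several threads" model, the following sketch prints the process ID and the kernel thread ID seen from each thread. It assumes Python 3.8 or later (for threading.get_native_id), not the Python 2 thread module used in the original post, and the helper name report_ids is made up for this example. On an NPTL system every thread should report the same PID but a different kernel thread ID.

```python
import os
import threading


def report_ids(n_threads=3):
    """Start n_threads workers and collect (pid, native thread id) from each."""
    results = []
    lock = threading.Lock()
    start = threading.Barrier(n_threads)

    def worker():
        # Wait until every worker is alive, so no kernel thread ID can be
        # reused by a worker that starts after another one has exited.
        start.wait()
        ids = (os.getpid(), threading.get_native_id())
        with lock:
            results.append(ids)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print("main: pid=%d tid=%d" % (os.getpid(), threading.get_native_id()))
    for pid, tid in report_ids():
        print("worker: pid=%d tid=%d" % (pid, tid))
```

The kernel thread IDs printed here are the same numbers that thread-aware tools display for each thread of the process.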

There are also other Linux threads libraries out there, all of them completely
implemented in user space, using dispatch/longjmp and other black magic. When
a program uses one of these, you'll also see only one process, although I
don't know of any production program that uses one of these threading libraries.

Anyway, hope this clears it up a little...

Heiko.

Jul 18 '05 #3

Heiko Wundram

On Thursday, 29 July 2004 17:31, Heiko Wundram wrote:

Well, first of all, what the OP was seeing wasn't actually what he thought
he was seeing.

In Python there's always the main thread (which is started when python
starts up), and other threads may be started. Thus, if you start two
threads in your program, you'll see three processes in the process list
(one for the main thread, two for the started threads).

Forget that, I read the first post wrong. What the OP was probably seeing was
output from an NPTL-patched ps, which always shows the threads that are
running (not only when asked to). ps outputs one line (the first) for the
process; each of the following lines is for one of the threads currently
running under this process. So the following output from ps (actually
ps ax -m on my machine) means that pickup (part of postfix) runs only one
thread (not two processes), and xmms runs five threads.

17539 ?        -      0:00 pickup -l -t fifo -u
    - -        S      0:00 -
18338 ?        -      0:02 /usr/bin/xmms
    - -        S      0:02 -
    - -        S      0:00 -
    - -        S      0:00 -
    - -        S      0:00 -
    - -        S      0:00 -

I have an NPTL-enabled glibc and kernel (without the somewhat strange patch
that the OP seems to have). When I just type ps ax, the same processes show up as:

17539 ? S 0:00 pickup -l -t fifo -u
18338 ? S 0:02 /usr/bin/xmms

To see whether you have an NPTL-enabled glibc, run /lib/libc.so.6, and it'll
output something like:

....
Available extensions:
....
NPTL 0.61 by Ulrich Drepper
....
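
The same information can probably be read from Python without running the C library directly. This is only a sketch: it assumes Python 3 on a glibc system, where os.confstr knows the glibc-specific name CS_GNU_LIBPTHREAD_VERSION, and the function name libpthread_version is invented for the example. On other C libraries or platforms it simply returns None.

```python
import os


def libpthread_version():
    """Return the thread library version string reported by glibc, or None."""
    name = "CS_GNU_LIBPTHREAD_VERSION"
    # os.confstr_names only lists the names this platform's C library defines.
    if not hasattr(os, "confstr") or name not in os.confstr_names:
        return None
    try:
        return os.confstr(name)
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print(libpthread_version())
```

On an NPTL-enabled glibc the returned string is expected to start with "NPTL"; a LinuxThreads build would be expected to name LinuxThreads instead.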

Heiko.

Jul 18 '05 #4

Jon Perez

Heiko Wundram wrote:

NPTL (Native POSIX Thread Library), the "next-generation" threads library for
Linux, handles threads "correctly" in the sense that they are just one
process with separate execution frames but shared memory.

Does the fact that NPTL threads are 'just one process' mean they
are not created using clone()? Are NPTL threads not scheduled by
the kernel?

If so, then how come NPTL is described as a 1:1 model, which AFAIK
means 1 application thread is mapped to exactly 1 'kernel' scheduled thread
(or lightweight process if you will), which, again AFAIK, can only be created
via a clone() call (albeit with different flags) and nothing else?

If NPTL threads are scheduled by NPTL code as opposed to kernel code and are
all mapped onto one process started by one clone() call, wouldn't that make it M:1?

What the op was probably seeing was output from an NPTL patched ps, which
always shows the threads that are running (not only when asked for it).

Note that my glibc is not NPTL-enabled; it is the stock 2.3.2 used
in Slackware 10 (although the procps-3.2.1 it uses may already be NPTL-ready),
so this would not seem to be the explanation.

If you start the sample program in my original message and it hasn't launched
a thread yet, ps will only show one running process. The moment it calls
start_new_thread() however, two new processes show up in ps (so that makes
three total)! Once this newly started thread dies, only one process gets removed,
so there will still be two processes running and that's what's puzzling me.

The same thing applies if you start N threads. It seems there's always
one extra thread lying around after you call start_new_thread().

Jul 18 '05 #5

Erno Kuusela

Jon Perez <jb********@wahoo.com> writes:

Does the fact that NPTL threads are 'just one process' mean they
are not created using clone()? Are NPTL threads not scheduled by
the kernel?

They are just hidden from the top-level /proc directory listing.

(erno@fabulous) /home.b/erno % ls -l /proc/`pidof firefox-bin`/task
total 0
dr-xr-xr-x 3 erno erno 0 Jul 30 17:56 28319
dr-xr-xr-x 3 erno erno 0 Jul 30 17:56 31596
dr-xr-xr-x 3 erno erno 0 Jul 30 17:56 31597
dr-xr-xr-x 3 erno erno 0 Jul 30 17:56 31599
(erno@fabulous) /home.b/erno % ls -l /proc/28319 | wc -l
16
(erno@fabulous) /home.b/erno % ls -l /proc|grep -c 28319
0
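
The same check can be sketched in Python by listing /proc/self/task from inside a threaded program. This assumes Python 3 on a Linux kernel that provides /proc/<pid>/task, as in the session above; the function name list_task_ids is made up for the example, and it returns an empty list where the directory does not exist.

```python
import os
import threading


def list_task_ids(pid="self"):
    """Return the kernel thread IDs listed under /proc/<pid>/task (Linux only)."""
    task_dir = "/proc/%s/task" % pid
    try:
        return sorted(int(name) for name in os.listdir(task_dir))
    except OSError:
        # No /proc on this platform, or the process has already gone away.
        return []


if __name__ == "__main__":
    stop = threading.Event()
    # Two extra threads that just block until told to stop.
    workers = [threading.Thread(target=stop.wait) for _ in range(2)]
    for t in workers:
        t.start()
    print("pid:", os.getpid(), "tasks:", list_task_ids())
    stop.set()
    for t in workers:
        t.join()
```

While the two workers are blocked, the task list should contain one entry per thread, with the main thread's entry equal to the PID.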

-- erno

Jul 18 '05 #6
