Bytes IT Community

os.walk() and threads problem (is os.walk thread safe?)

Hello list,

I have a strange problem with os.walk and threads in a Python script. I
have one script that creates some threads which consume a Queue. For every
value in the Queue, the script runs os.walk() and prints the root dir. But if I
increase the number of threads, the results are inconsistent compared with one
thread.

For example, run this code with one thread and sort the output, then run it
again with ten threads, sort that output, and compare the two with diff(1).

--code--
#!/usr/local/bin/python

import os, time, glob
import Queue
import threading

EXIT=False
POOL=Queue.Queue(0)
NRO_THREADS=1
#NRO_THREADS=10

class Worker(threading.Thread):
    def run(self):
        global POOL, EXIT
        while True:
            try:
                # Wait up to one second for the next directory to walk.
                mydir=POOL.get(timeout=1)
                if mydir is None:
                    continue

                for root, dirs, files in os.walk(mydir):
                    print root

            except Queue.Empty:
                # Queue drained: stop only once the main thread says so.
                if EXIT:
                    break
                else:
                    continue
            except KeyboardInterrupt:
                break
            except Exception:
                raise

for x in xrange(NRO_THREADS):
    Worker().start()
try:
    for i in glob.glob('/usr/ports/*'):
        POOL.put(i)

    while not POOL.empty():
        time.sleep(1)
    EXIT = True

    # Wait until only the main thread is left.
    while (threading.activeCount() > 1):
        time.sleep(1)
except KeyboardInterrupt:
    EXIT=True
--code--

If someone can help with this, I'd appreciate it.

Regards

--
Marcus Alves Grando
marcus(at)sbh.eng.br | Personal
mnag(at)FreeBSD.org | FreeBSD.org
Nov 13 '07 #1
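
The comparison described above (one thread versus ten, sorted, then diffed) can also be done inside a single process, which takes printing and the shell out of the picture. The following is only a minimal sketch under assumptions: it uses Python 3 names (`queue` rather than `Queue`), and `walk_roots` is a hypothetical helper that does not appear in the original script.

```python
import os
import queue
import threading


def walk_roots(top_dirs, n_threads):
    """Walk every directory in top_dirs with n_threads workers; return sorted roots."""
    pool = queue.Queue()
    roots = []
    roots_lock = threading.Lock()

    def worker():
        while True:
            top = pool.get()
            if top is None:  # sentinel: no more work for this worker
                break
            # Collect this tree locally, then publish it under the lock.
            found = [root for root, dirs, files in os.walk(top)]
            with roots_lock:
                roots.extend(found)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for top in top_dirs:
        pool.put(top)
    for _ in threads:
        pool.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return sorted(roots)
```

Calling `walk_roots(tops, 1)` and `walk_roots(tops, 10)` on the same list of directories and comparing the two lists gives the same check as sort plus diff(1), provided the tree itself does not change between the two calls.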
4 Replies


Marcus Alves Grando wrote:
> Hello list,
>
> I have a strange problem with os.walk and threads in python script. I
> have one script that create some threads and consume Queue. For every
> value in Queue this script run os.walk() and printing root dir. But if i
> increase number of threads the result are inconsistent compared with one
> thread.
>
> For example, run this code plus sort with one thread and after run again
> with ten threads and see diff(1).
I don't see any difference. I ran it with 1 and 10 workers + sorted the
output. No diff whatsoever.

And I don't know what you mean by diff(1) - was that supposed to be some
output?

Diez
Nov 13 '07 #2

Diez B. Roggisch wrote:
> Marcus Alves Grando wrote:
>> Hello list,
>>
>> I have a strange problem with os.walk and threads in python script. I
>> have one script that create some threads and consume Queue. For every
>> value in Queue this script run os.walk() and printing root dir. But if i
>> increase number of threads the result are inconsistent compared with one
>> thread.
>>
>> For example, run this code plus sort with one thread and after run again
>> with ten threads and see diff(1).
>
> I don't see any difference. I ran it with 1 and 10 workers + sorted the
> output. No diff whatsoever.

Did you test on a directory with many subdirs, like /usr or /usr/ports (on
FreeBSD), for example?

> And I don't know what you mean by diff(1) - was that supposed to be some
> output?

No. One thread produces one result and ten threads produce another result
with fewer lines.

See the example below:

@@ -13774,8 +13782,6 @@
/usr/compat/linux/proc/44
/usr/compat/linux/proc/45
/usr/compat/linux/proc/45318
-/usr/compat/linux/proc/45484
-/usr/compat/linux/proc/45532
/usr/compat/linux/proc/45857
/usr/compat/linux/proc/45903
/usr/compat/linux/proc/46

Regards

--
Marcus Alves Grando
marcus(at)sbh.eng.br | Personal
mnag(at)FreeBSD.org | FreeBSD.org
Nov 13 '07 #3

Marcus Alves Grando wrote:
> Diez B. Roggisch wrote:
>> Marcus Alves Grando wrote:
>>> Hello list,
>>>
>>> I have a strange problem with os.walk and threads in python script. I
>>> have one script that create some threads and consume Queue. For every
>>> value in Queue this script run os.walk() and printing root dir. But if i
>>> increase number of threads the result are inconsistent compared with one
>>> thread.
>>>
>>> For example, run this code plus sort with one thread and after run again
>>> with ten threads and see diff(1).
>>
>> I don't see any difference. I ran it with 1 and 10 workers + sorted the
>> output. No diff whatsoever.
>
> Do you test in one dir with many subdirs? like /usr or /usr/ports (in
> freebsd) for example?

Yes, over 1000 subdirs/files.

>> And I don't know what you mean by diff(1) - was that supposed to be some
>> output?
>
> No. One thread produce one result and ten threads produce another result
> with less lines.
>
> Se example below:
>
> @@ -13774,8 +13782,6 @@
> /usr/compat/linux/proc/44
> /usr/compat/linux/proc/45
> /usr/compat/linux/proc/45318
> -/usr/compat/linux/proc/45484
> -/usr/compat/linux/proc/45532
> /usr/compat/linux/proc/45857
> /usr/compat/linux/proc/45903
> /usr/compat/linux/proc/46

I'm not sure what that directory is, but to me it looks like the
Linux /proc dir, containing process IDs. Its contents incidentally change
between the two runs, as more threads will have process ID aliases.

Try your script on another directory.

Diez
Nov 13 '07 #4
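
To take a volatile directory such as a /proc mount out of the comparison without moving to a different tree, the walk can skip it. The following is a minimal sketch, assuming the directory to skip is literally named `proc`; `stable_roots` and `skip_names` are hypothetical names used only for illustration. It relies on the documented behaviour that, with the default top-down walk, editing `dirs` in place controls which subdirectories os.walk descends into.

```python
import os


def stable_roots(top, skip_names=("proc",)):
    """Return sorted roots under top, not descending into directories in skip_names."""
    roots = []
    for root, dirs, files in os.walk(top):  # topdown=True is the default
        # Slice assignment edits the list os.walk itself uses for recursion.
        dirs[:] = [d for d in dirs if d not in skip_names]
        roots.append(root)
    return sorted(roots)
```

Rebinding the name with `dirs = [...]` instead of assigning to `dirs[:]` would have no effect on the walk, because os.walk keeps its own reference to the original list.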

Diez B. Roggisch wrote:
> Marcus Alves Grando wrote:
>> Diez B. Roggisch wrote:
>>> Marcus Alves Grando wrote:
>>>> Hello list,
>>>>
>>>> I have a strange problem with os.walk and threads in python script. I
>>>> have one script that create some threads and consume Queue. For every
>>>> value in Queue this script run os.walk() and printing root dir. But if i
>>>> increase number of threads the result are inconsistent compared with one
>>>> thread.
>>>>
>>>> For example, run this code plus sort with one thread and after run again
>>>> with ten threads and see diff(1).
>>>
>>> I don't see any difference. I ran it with 1 and 10 workers + sorted the
>>> output. No diff whatsoever.
>>
>> Do you test in one dir with many subdirs? like /usr or /usr/ports (in
>> freebsd) for example?
>
> Yes, over 1000 subdirs/files.

Strange, because for me it occurs every time.

>>> And I don't know what you mean by diff(1) - was that supposed to be some
>>> output?
>>
>> No. One thread produce one result and ten threads produce another result
>> with less lines.
>>
>> Se example below:
>>
>> @@ -13774,8 +13782,6 @@
>> /usr/compat/linux/proc/44
>> /usr/compat/linux/proc/45
>> /usr/compat/linux/proc/45318
>> -/usr/compat/linux/proc/45484
>> -/usr/compat/linux/proc/45532
>> /usr/compat/linux/proc/45857
>> /usr/compat/linux/proc/45903
>> /usr/compat/linux/proc/46
>
> I'm not sure what that directory is, but to me that looks like the
> linux /proc dir, containing process ids. Which incidentially changes
> between the two runs, as more threads will have process id aliases.

My example was not good enough. I ran this script on the ports directory of
FreeBSD and on the IMAP folders on my Linux server, with the same result.

@@ -182,7 +220,6 @@
/usr/ports/archivers/p5-POE-Filter-Bzip2
/usr/ports/archivers/p5-POE-Filter-LZF
/usr/ports/archivers/p5-POE-Filter-LZO
-/usr/ports/archivers/p5-POE-Filter-LZW
/usr/ports/archivers/p5-POE-Filter-Zlib
/usr/ports/archivers/p5-PerlIO-gzip
/usr/ports/archivers/p5-PerlIO-via-Bzip2
@@ -234,7 +271,6 @@
/usr/ports/archivers/star-devel
/usr/ports/archivers/star-devel/files
/usr/ports/archivers/star/files
-/usr/ports/archivers/stuffit
/usr/ports/archivers/szip
/usr/ports/archivers/tardy
/usr/ports/archivers/tardy/files

Regards

--
Marcus Alves Grando
marcus(at)sbh.eng.br | Personal
mnag(at)FreeBSD.org | FreeBSD.org
Nov 13 '07 #5
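
The sort-and-diff comparison only works if every root ends up on its own output line, so one thing worth ruling out (a possibility, not something established in this thread) is output from concurrent threads getting mixed together. The following is a minimal sketch that sends every line through one shared lock and one `write` call; `emit` and `_print_lock` are hypothetical names, not part of the original script.

```python
import sys
import threading

_print_lock = threading.Lock()


def emit(line, stream=None):
    """Write one complete line while holding a lock shared by all worker threads."""
    if stream is None:
        stream = sys.stdout
    with _print_lock:
        # Text and newline go out in a single write, under the lock.
        stream.write(line + "\n")
```

In the worker loop, `emit(root)` would take the place of `print root`. If the single-thread and ten-thread outputs still differ after that, mixed output lines can be excluded as the cause.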
