os.walk() and threads problems (is os.walk thread safe?)

Hello list,

I have a strange problem with os.walk and threads in a Python script. The
script creates some threads that consume a Queue. For every value in the
Queue, a worker runs os.walk() and prints each root dir. But if I increase
the number of threads, the results are inconsistent compared with one
thread.

For example, run this code with one thread and sort the output, then run it
again with ten threads and compare the two with diff(1).

--code--
#!/usr/local/bin/python

import os, time, glob
import Queue
import threading

EXIT=False
POOL=Queue.Queue(0)
NRO_THREADS=1
#NRO_THREADS=10

class Worker(threading.Thread):
    def run(self):
        global POOL, EXIT
        while True:
            try:
                mydir=POOL.get(timeout=1)
                if mydir == None:
                    continue

                for root, dirs, files in os.walk(mydir):
                    print root

            except Queue.Empty:
                if EXIT:
                    break
                else:
                    continue
            except KeyboardInterrupt:
                break
            except Exception:
                raise

for x in xrange(NRO_THREADS):
    Worker().start()

try:
    for i in glob.glob('/usr/ports/*'):
        POOL.put(i)

    while not POOL.empty():
        time.sleep(1)
    EXIT = True

    while (threading.activeCount() > 1):
        time.sleep(1)
except KeyboardInterrupt:
    EXIT=True
--code--

If someone can help with this, I would appreciate it.

Regards

--
Marcus Alves Grando
marcus(at)sbh.eng.br | Personal
mnag(at)FreeBSD.org | FreeBSD.org
Nov 13 '07 #1
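
For readers reproducing this on a current interpreter: the script in the post is Python 2 (`print root`, `xrange`, the `Queue` module). Below is a minimal Python 3 sketch of the same idea, not the poster's code. The function name `walk_roots`, the `None` sentinel used for shutdown, and collecting the roots into a list instead of printing from every thread are all assumptions made for illustration.

```python
import os
import queue
import threading


def walk_roots(top_dirs, n_threads=1):
    """Walk every directory in top_dirs using n_threads worker threads.

    Returns the sorted list of every root yielded by os.walk().
    (Hypothetical helper, not part of the original script.)
    """
    pool = queue.Queue()
    roots = []
    roots_lock = threading.Lock()

    def worker():
        while True:
            top = pool.get()
            if top is None:  # sentinel: no more work for this worker
                break
            found = [root for root, _dirs, _files in os.walk(top)]
            with roots_lock:  # guard the shared result list
                roots.extend(found)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for top in top_dirs:
        pool.put(top)
    for _ in threads:
        pool.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return sorted(roots)
```

Calling `walk_roots(dirs, 1)` and `walk_roots(dirs, 10)` on the same list of directories and comparing the two results should mirror the experiment described in the post, without relying on interleaved `print` output.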
Marcus Alves Grando wrote:
> Hello list,
>
> I have a strange problem with os.walk and threads in a Python script. The
> script creates some threads that consume a Queue. For every value in the
> Queue, a worker runs os.walk() and prints each root dir. But if I increase
> the number of threads, the results are inconsistent compared with one
> thread.
>
> For example, run this code with one thread and sort the output, then run it
> again with ten threads and compare the two with diff(1).

I don't see any difference. I ran it with 1 and 10 workers + sorted the
output. No diff whatsoever.

And I don't know what you mean by diff(1) - was that supposed to be some
output?

Diez
Nov 13 '07 #2
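
The `diff(1)` in the original post reads as man-page notation for the Unix `diff` utility (section 1 of the manual), not as program output. The same comparison can be sketched in Python with the standard `difflib` module. The helper name `diff_runs` and the file labels are assumptions chosen for illustration.

```python
import difflib


def diff_runs(single, threaded):
    """Return unified-diff lines between two collections of paths.

    Both inputs are sorted first, matching the "run, sort, then diff"
    procedure described in the post. (Hypothetical helper.)
    """
    return list(difflib.unified_diff(
        sorted(single),
        sorted(threaded),
        fromfile="1-thread",
        tofile="10-threads",
        lineterm="",
    ))
```

An empty list means the two runs listed exactly the same directories. Lines starting with `-` are paths that appear only in the single-threaded run.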
Diez B. Roggisch wrote:
> Marcus Alves Grando wrote:
>> Hello list,
>>
>> I have a strange problem with os.walk and threads in a Python script. The
>> script creates some threads that consume a Queue. For every value in the
>> Queue, a worker runs os.walk() and prints each root dir. But if I increase
>> the number of threads, the results are inconsistent compared with one
>> thread.
>>
>> For example, run this code with one thread and sort the output, then run it
>> again with ten threads and compare the two with diff(1).
>
> I don't see any difference. I ran it with 1 and 10 workers + sorted the
> output. No diff whatsoever.

Did you test in a dir with many subdirs, like /usr or /usr/ports (on
FreeBSD) for example?

> And I don't know what you mean by diff(1) - was that supposed to be some
> output?

No. One thread produces one result and ten threads produce another result
with fewer lines.

See the example below:

@@ -13774,8 +13782,6 @@
 /usr/compat/linux/proc/44
 /usr/compat/linux/proc/45
 /usr/compat/linux/proc/45318
-/usr/compat/linux/proc/45484
-/usr/compat/linux/proc/45532
 /usr/compat/linux/proc/45857
 /usr/compat/linux/proc/45903
 /usr/compat/linux/proc/46

Regards

--
Marcus Alves Grando
marcus(at)sbh.eng.br | Personal
mnag(at)FreeBSD.org | FreeBSD.org
Nov 13 '07 #3
Marcus Alves Grando wrote:
> Diez B. Roggisch wrote:
>> Marcus Alves Grando wrote:
>>> Hello list,
>>>
>>> I have a strange problem with os.walk and threads in a Python script. The
>>> script creates some threads that consume a Queue. For every value in the
>>> Queue, a worker runs os.walk() and prints each root dir. But if I increase
>>> the number of threads, the results are inconsistent compared with one
>>> thread.
>>>
>>> For example, run this code with one thread and sort the output, then run it
>>> again with ten threads and compare the two with diff(1).
>>
>> I don't see any difference. I ran it with 1 and 10 workers + sorted the
>> output. No diff whatsoever.
>
> Did you test in a dir with many subdirs, like /usr or /usr/ports (on
> FreeBSD) for example?

Yes, over 1000 subdirs/files.

>> And I don't know what you mean by diff(1) - was that supposed to be some
>> output?
>
> No. One thread produces one result and ten threads produce another result
> with fewer lines.
>
> See the example below:
>
> @@ -13774,8 +13782,6 @@
>  /usr/compat/linux/proc/44
>  /usr/compat/linux/proc/45
>  /usr/compat/linux/proc/45318
> -/usr/compat/linux/proc/45484
> -/usr/compat/linux/proc/45532
>  /usr/compat/linux/proc/45857
>  /usr/compat/linux/proc/45903
>  /usr/compat/linux/proc/46

I'm not sure what that directory is, but to me that looks like the
linux /proc dir, containing process ids. That incidentally changes
between the two runs, as more threads will have process id aliases.

Try your script on another directory.

Diez
Nov 13 '07 #4
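
Diez's point about /proc can be taken into account by leaving such volatile subtrees out of the walk before comparing runs. The sketch below assumes the subtree to skip can be identified by directory name. `walk_stable` and `skip_names` are hypothetical names. It relies on the documented behaviour that editing `dirs` in place during a top-down `os.walk()` controls which subdirectories are visited.

```python
import os


def walk_stable(top, skip_names=("proc",)):
    """Yield os.walk() roots under top, skipping volatile subtrees.

    skip_names is an assumed, illustrative list of directory names
    (such as a mounted procfs) whose contents change between runs.
    """
    for root, dirs, _files in os.walk(top):
        # In-place edit: os.walk() will not descend into removed names.
        dirs[:] = [d for d in dirs if d not in skip_names]
        yield root
```

With the volatile directories pruned, any remaining difference between a one-thread and a ten-thread run is harder to attribute to the filesystem changing underneath the script.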
Diez B. Roggisch wrote:
> Marcus Alves Grando wrote:
>> Diez B. Roggisch wrote:
>>> Marcus Alves Grando wrote:
>>>> Hello list,
>>>>
>>>> I have a strange problem with os.walk and threads in a Python script. The
>>>> script creates some threads that consume a Queue. For every value in the
>>>> Queue, a worker runs os.walk() and prints each root dir. But if I increase
>>>> the number of threads, the results are inconsistent compared with one
>>>> thread.
>>>>
>>>> For example, run this code with one thread and sort the output, then run it
>>>> again with ten threads and compare the two with diff(1).
>>>
>>> I don't see any difference. I ran it with 1 and 10 workers + sorted the
>>> output. No diff whatsoever.
>>
>> Did you test in a dir with many subdirs, like /usr or /usr/ports (on
>> FreeBSD) for example?
>
> Yes, over 1000 subdirs/files.

Strange, because for me it occurs every time.

>>> And I don't know what you mean by diff(1) - was that supposed to be some
>>> output?
>>
>> No. One thread produces one result and ten threads produce another result
>> with fewer lines.
>>
>> See the example below:
>>
>> @@ -13774,8 +13782,6 @@
>>  /usr/compat/linux/proc/44
>>  /usr/compat/linux/proc/45
>>  /usr/compat/linux/proc/45318
>> -/usr/compat/linux/proc/45484
>> -/usr/compat/linux/proc/45532
>>  /usr/compat/linux/proc/45857
>>  /usr/compat/linux/proc/45903
>>  /usr/compat/linux/proc/46
>
> I'm not sure what that directory is, but to me that looks like the
> linux /proc dir, containing process ids. That incidentally changes
> between the two runs, as more threads will have process id aliases.

My example was not good enough. I ran this script on the ports directory
on FreeBSD and on the IMAP folders on my Linux server, and got the same
thing.

@@ -182,7 +220,6 @@
 /usr/ports/archivers/p5-POE-Filter-Bzip2
 /usr/ports/archivers/p5-POE-Filter-LZF
 /usr/ports/archivers/p5-POE-Filter-LZO
-/usr/ports/archivers/p5-POE-Filter-LZW
 /usr/ports/archivers/p5-POE-Filter-Zlib
 /usr/ports/archivers/p5-PerlIO-gzip
 /usr/ports/archivers/p5-PerlIO-via-Bzip2
@@ -234,7 +271,6 @@
 /usr/ports/archivers/star-devel
 /usr/ports/archivers/star-devel/files
 /usr/ports/archivers/star/files
-/usr/ports/archivers/stuffit
 /usr/ports/archivers/szip
 /usr/ports/archivers/tardy
 /usr/ports/archivers/tardy/files

Regards

--
Marcus Alves Grando
marcus(at)sbh.eng.br | Personal
mnag(at)FreeBSD.org | FreeBSD.org
Nov 13 '07 #5
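
One thing that may be worth ruling out, although nothing in the thread confirms it as the cause: by default `os.walk()` ignores errors raised while listing a directory, so a directory that fails to list is simply never yielded and leaves no trace in the output. Passing the `onerror` callback makes such failures visible. This is a small diagnostic sketch; `walk_with_errors` is a hypothetical helper name.

```python
import os


def walk_with_errors(top):
    """Return (roots, errors) for an os.walk() of top.

    errors collects the OSError instances that os.walk() would
    otherwise swallow silently. (Hypothetical diagnostic helper.)
    """
    errors = []
    roots = [root for root, _dirs, _files in os.walk(top, onerror=errors.append)]
    return roots, errors
```

If the multi-threaded run reports entries in `errors` for exactly the directories missing from its output, the problem lies in listing those directories rather than in the queue handling.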
