
string parsing screwing up on large files?

Hello, I'm fairly new to Python, but I've written a script that takes
in a special text file (a RenderMan .rib, to be specific) and filters
some of the commands. The .rib file is a simple text file, but in
some cases it's very large: 20 MB or more at times.

The script steps through each line looking for keywords and changes the
line if necessary, but most lines just pass through the script
unmodified. The problem is that sometimes the lines aren't written out
correctly, and it's intermittent. If I re-run the script on the same
input, it usually works fine. After filtering about 100 files I might
get 4 or 5 that come out bad; simply re-running those fixes them.
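
The core of it is just a read-filter-write loop, roughly like this
sketch (simplified; the keyword check and the rewrite are made-up
stand-ins for the real logic):

    def rewrite_line(line):
        # Stand-in for the real keyword rewrite.
        return line

    def filter_rib(in_path, out_path):
        fin = open(in_path, "r")
        fout = open(out_path, "w")
        for line in fin:
            if line.startswith("Display"):   # example RIB keyword check
                line = rewrite_line(line)
            fout.write(line)
        fout.close()
        fin.close()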

Anyone know what I might look for? It's possible that the machine is
under heavy I/O and/or CPU load when it happens, but I'm not sure;
I normally send this processing to a render farm, so it's hard to
predict exactly what sort of load is going on at that time. It feels
like a buffer isn't getting flushed before the text is written out, or
something like that.
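
One thing I plan to rule out is whether the output file is fully
flushed and closed before the farm reads it. A more defensive version
of the write side might look like this (a sketch; the filtering itself
is elided):

    import os

    def filter_rib_flushed(in_path, out_path):
        fin = open(in_path, "r")
        fout = open(out_path, "w")
        try:
            for line in fin:
                fout.write(line)   # keyword filtering elided
        finally:
            fout.flush()
            os.fsync(fout.fileno())   # force data to disk, not just the OS cache
            fout.close()
            fin.close()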

Any suggestions where I might look?

thanks

daniel
Jul 18 '05 #1
2 Replies


Daniel Kramer:
> Any suggestions where I might look?


In the source code, probably. I've looked long and hard at your posting,
but I didn't find any bug there.

--
René Pijlman
Jul 18 '05 #2

On 19 Dec 2003 18:55:29 -0800, da*********@yahoo.com (Daniel Kramer) wrote:
> [...]
> Any suggestions where I might look?

What is telling you that some lines aren't correct? RenderMan syntax errors?
Maybe if you saved the bad file(s) and re-ran the changes until you got a good
one, and then ran diff -u goodfile badfile to see how things were actually
changing, it would become clear. Or if not, you could post some diffs and
the code that should be accomplishing the changes, and we could go from there.
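
If you'd rather do the comparison from Python, difflib in the standard
library produces the same kind of unified diff. A quick sketch, with
made-up file names:

    import difflib, sys

    good = open("goodfile.rib").readlines()
    bad = open("badfile.rib").readlines()
    diff = difflib.unified_diff(good, bad,
                                fromfile="goodfile.rib", tofile="badfile.rib")
    sys.stdout.writelines(diff)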

Is the code threaded? Are you perhaps clobbering something across threads
occasionally? Accidental name collisions? Unsynchronized accesses?
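
If threads are involved, any file object or buffer they share needs a
lock, or two threads can interleave partial lines. Just as an
illustration:

    import threading

    write_lock = threading.Lock()   # one lock shared by all writer threads

    def locked_write(fout, line):
        # Serialize writes so partial lines can't interleave.
        write_lock.acquire()
        try:
            fout.write(line)
        finally:
            write_lock.release()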

You might also want to mention what platform and Python version etc. you are running.
Maybe there is a file system bug that an upgrade would fix? It doesn't happen often,
but it might be worth googling for your platform.
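
The version and platform info is two lines to capture and paste into a post:

    import sys, platform

    print(sys.version)          # Python version and build
    print(platform.platform())  # OS name and version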

Regards,
Bengt Richter
Jul 18 '05 #3
