
why am I getting a segmentation fault?

I have a short multi-threaded script that checks web images to make
sure they are still there. I get a segmentation fault every time I run
it and I can't figure out why. Writing threaded scripts is new to me so
I may be doing something wrong that should be obvious :(

Google messes up the Python code, so here is a link to it.

http://kracomp.com/~jay/py.txt

This is the output of the script.
[jay@localhost scripts]$ ./py.py
update item set goodImage = 'yes' where productId='12603'
update item set goodImage = 'yes' where productId='18272'
update item set goodImage = 'yes' where productId='1927'
update item set goodImage = 'no' where productId='12709'
update item set goodImage = 'yes' where productId='32087'
update item set goodImage = 'no' where productId='25803'
Segmentation fault

Thanks in advance.

Jul 18 '05 #1
"Jay donnell" <ja********@gmail.com> wrote in message
news:11*********************@f14g2000cwb.googlegroups.com...
I have a short multi-threaded script that checks web images to make
sure they are still there. I get a segmentation fault every time I run
it and I can't figure out why. Writing threaded scripts is new to me so
I may be doing something wrong that should be obvious :(

<snip>

Here is a code excerpt from your link (the main routine, omits the class
definition for ImageChecker, which extends threading.Thread):

db = MySQLdb.connect(host="localhost", user="xxx", passwd="xxx", db="xxx")
cursor = db.cursor()
query = "select * from item order by rand() limit 0, 100"
#query = "select * from item"
cursor.execute(query)
result = cursor.fetchall()

maxThreads = 5

for r in result:
    while(threading.activeCount() > maxThreads):
        pass
    flag='good'
    #pass
    #print str(r[0]) + ', ' + str(r[7])
    tmp = r[7].split('/')
    filename = tmp[-1]
    #print 'filename ' + filename

    filename = '/tmp/'+filename

    threadList = []

    #r[7] is the url of the image
    #r[0] is the id for the row
    imageChecker = ImageChecker(r[7], filename, r[0])
    imageChecker.start()
    threadList.append(imageChecker)

----------------------------------------------
1. What happens after you fall out of the loop "for r in result"? Shouldn't
you wait for the remaining threads to finish? Perhaps you need another
busy-loop to wait for the threads to finish, something like
while threading.activeCount() > 1: pass
(the main thread is counted too, so the count never drops below 1).
2. Is this the best way to busy-wait? What about some kind of
thread.join()? At least throw a sleep call in there or something, or this
loop will churn and churn, diverting CPU from your threads that are actually
trying to do some real work.
3. I find it easier to work with named variables than numeric subscripts.
At the top of your for loop, try something like:
id,height,width,numBytes,whatever,slkdjf1,slkdjf2, url = r
This way you have much more meaningful names than r[0] and r[7], which you
later have to comment to explain what's going on!
4. filename=r[7].split('/')[-1] is not terribly portable. See if there is a
standard module for parsing filespecs (I'll bet there is).
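
Points 1 and 2 above can be sketched roughly as follows. This is an illustration under stated assumptions, not the original script: it targets Python 3, `check_image` is a hypothetical stand-in for `ImageChecker.run()` that does no network access, and using a semaphore to cap the number of workers is one possible choice rather than anything the thread prescribes.

```python
import threading

MAX_THREADS = 5  # same cap as maxThreads in the excerpt


def check_image(url, results, lock, slots):
    """Hypothetical stand-in for ImageChecker.run(); no network access."""
    try:
        ok = url.lower().endswith((".jpg", ".png", ".gif"))
        with lock:
            results[url] = ok
    finally:
        slots.release()  # free a slot for the next worker


def check_all(urls):
    slots = threading.BoundedSemaphore(MAX_THREADS)
    lock = threading.Lock()
    results = {}
    workers = []
    for url in urls:
        slots.acquire()  # blocks without spinning until a slot is free
        worker = threading.Thread(
            target=check_image, args=(url, results, lock, slots)
        )
        worker.start()
        workers.append(worker)
    for worker in workers:
        worker.join()  # wait for the stragglers before returning
    return results
```

Blocking on `acquire()` and `join()` lets the operating system put the main thread to sleep, so it does not compete for CPU with the workers the way a `while ...: pass` loop does.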

-- Paul
Jul 18 '05 #2
Paul McGuire wrote:
4. filename=r[7].split('/')[-1] is not terribly portable. See if there is a
standard module for parsing filespecs (I'll bet there is).


Indeed there is -- os.path. In particular, os.path.basename() seems
to do exactly what that snippet is intending, in a much more robust (and
readable) fashion.
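
A minimal sketch of that idea, assuming Python 3. Because r[7] is a URL rather than a local path, this version first takes the path component with `urllib.parse.urlsplit` and applies `posixpath.basename` (URL paths always use `/`), then builds the local file name with `os.path.join`. The function name `local_path_for` and the example URL are hypothetical.

```python
import os.path
import posixpath
from urllib.parse import urlsplit


def local_path_for(url, directory="/tmp"):
    """Return a local file path for the last segment of the URL's path."""
    url_path = urlsplit(url).path          # drops scheme, host and query string
    name = posixpath.basename(url_path)    # last '/'-separated segment
    return os.path.join(directory, name)


print(local_path_for("http://example.com/images/12603.jpg?size=large"))
```

Unlike `split('/')[-1]` on the raw URL, this does not carry a query string into the file name.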

Jeff Shannon
Technician/Programmer
Credit International

Jul 18 '05 #3

Jay donnell wrote:
I have a short multi-threaded script that checks web images to make
sure they are still there. I get a segmentation fault every time I run
it and I can't figure out why. Writing threaded scripts is new to me so I may be doing something wrong that should be obvious :(


def run(self):
    try:
        self.site = urllib.urlopen(self.url)
        self.f=open(self.filename, 'w')
        self.im = Image.open(self.filename)
        self.size = self.im.size
        self.flag = 'yes'
        self.q = "yadda yadda"

That's SIX names that don't need to be attributes of the object; they
are used _only_ in this method, so they can be local to the method,
saving you a whole lot of typing "self." and saving puzzlement &
careful scrutiny by those trying to read your code.

Back to your real problem: does it work when you set maxThreads to 1?
Did it work before you added the threading code? What does your
debugger tell you about the location of the seg fault?
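
On the question of locating the fault: in current Python (3.3 and later, so not the interpreter used in this thread) the standard `faulthandler` module can dump the Python-level traceback of every thread when the process receives SIGSEGV. A minimal sketch, assuming Python 3; the log file name and location are arbitrary choices for illustration.

```python
import faulthandler
import os
import tempfile

# Keep the file object alive for the life of the process: faulthandler
# writes to its file descriptor when a fatal signal arrives.
log_path = os.path.join(tempfile.gettempdir(), "segfault_trace.log")
crash_log = open(log_path, "w")

# all_threads=True asks for the traceback of every thread, not only the
# one that crashed, which matters for a threaded script like this one.
faulthandler.enable(file=crash_log, all_threads=True)
```

Starting the interpreter with `python -X faulthandler` enables the same handler with output on stderr.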

MOST IMPORTANTLY, sort this mess out:

self.q = "update item set goodImage = '" + self.flag + "' where productId='" + str(self.id) + "'"
print self.q
self.cursor.execute(query) ###############################

### "query" is a global variable[YUK!] (see below) which isn't going to
do what you want. Looks like you meant "self.q".

self.db.close()

db = MySQLdb.connect(host="localhost", user="xxx", passwd="xxx", db="xxx")
cursor = db.cursor()
query = "select * from item order by rand() limit 0, 100"
### Have you looked in your database to see if the script has actually
updated item.goodImage? Do you have a test plan?

Jul 18 '05 #4
Thank you.

I made all the changes you recommended and everything seems to be
working.

Jul 18 '05 #5
> ### Have you looked in your database to see if the script has actually
> updated item.goodImage? Do you have a test plan?


Thank you for the help. Sorry for the messy code. I was under time
constraints. I had class, and I was rushing to get this working before
class. I should have waited a day and read over it before I asked. Sorry
again.

Jul 18 '05 #6

Jay donnell wrote:
### Have you looked in your database to see if the script has actually
updated item.goodImage? Do you have a test plan?


Thank you for the help. Sorry for the messy code. I was under time
constraints. I had class, and I was rushing to get this working
before class. I should have waited a day and read over it before I asked. Sorry
again.


Apologise to _yourself_ for the messy code. If 'query' had been safely
tucked away as a local variable instead of a global, that problem
(typing 'query' instead of 'self.q') would have caused an exception on
the first time around.
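
A small sketch of that point, assuming Python 3 and using hypothetical names (`build_select`, `Checker`, `statement`). The select string lives inside a function, so there is no module-level `query` for the typo to pick up, and the mistake fails loudly on the first call instead of silently running the wrong statement.

```python
def build_select():
    # 'query' is local to this function; it never becomes a global name.
    query = "select * from item order by rand() limit 0, 100"
    return query


class Checker:
    def __init__(self, product_id, flag):
        self.product_id = product_id
        self.flag = flag

    def statement(self):
        q = "update item set goodImage = %s where productId = %s"
        # Deliberate typo, mirroring the thread: 'query' instead of 'q'.
        return query


try:
    Checker(12603, "yes").statement()
except NameError as exc:
    print("caught:", exc)
```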

A few points: Yes, don't rush. I've-forgotten-whom wrote something like
"Don't program standing up". Good advice.

Build things a piece at a time. Build your tests at the same time or
earlier. In this case, a stand-alone method or function that checked
that one file was OK, would have been a reasonable place to start. Then
add the code to query the db. Then add the threading stuff, gingerly.
If you build and test incrementally, then a segfault or other disaster
is highly likely to have been caused by the latest addition.
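
As an illustration of starting with one small, testable piece: the sketch below is a stand-alone check plus its test, assuming Python 3. The original script opens the file with PIL's `Image.open`; to stay dependency-free, this version swaps in a comparison of the leading bytes against the PNG, JPEG and GIF signatures, which is a weaker check than actually decoding the image. The function names are hypothetical.

```python
IMAGE_SIGNATURES = (
    b"\x89PNG\r\n\x1a\n",  # PNG
    b"\xff\xd8\xff",        # JPEG
    b"GIF87a",              # GIF, 1987 revision
    b"GIF89a",              # GIF, 1989 revision
)


def looks_like_image(data):
    """Return True if data starts with a known image signature."""
    return data.startswith(IMAGE_SIGNATURES)


def test_looks_like_image():
    assert looks_like_image(b"\xff\xd8\xff\xe0 fake jpeg body")
    assert looks_like_image(b"GIF89a fake gif body")
    assert not looks_like_image(b"<html>404 Not Found</html>")
    assert not looks_like_image(b"")


test_looks_like_image()
```

Once this piece passes on its own, the database query and then the threading can be layered on top, each with its own check.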

AND SOME MORE CRUFT:
def __init__(self, url, filename, id):
    self.t = time.time()   <<<<<<<<<======= never used again
    threading.Thread.__init__(self)
    self.db = MySQLdb.connect(host="localhost", user="xxx", passwd="xxx", db="xxx")
    # create a cursor
    self.cursor = db.cursor()   <<<=== should be self.db.cursor()
    #### picks up the *GLOBAL* 'db'
    self.url = url
    self.filename = filename
    self.id = id
===========================
threadList = []
[snip]
threadList.append(imageChecker)
===>>>> that doesn't achieve much!
============================

N.B. You still have two problems: (1) Your script as you said now
"seems to work". That doesn't sound like you have a test plan. (2) You
have shuffled your code around and the segfault went away; i.e. you
waved a dead chicken and the volcano stopped erupting. Most of the
changes suggested by others and myself were of a style/clarity nature.
The self.q/query thing would have caused a select instead of an update;
hardly segfault territory. I wouldn't expect that busy-wait loop to
cause a segfault. You still don't know what caused the segfault. That
means you don't know how to avoid it in the future. You are still
living in the shadow of the volcano. Will the chicken trick work next
time?

Looking forward to the next episode,
John

Jul 18 '05 #7
