fixing random segfaults

Hi,

I have the following script (see below), which queries various sites for
statistics about a list of domains. The problem is that roughly 1 in 3
attempts to run it results in "Segmentation Fault". Sometimes it works fine
10 times in a row; other times it segfaults 3 times in a row before it works.

How can I tell what's causing the segfault? It doesn't happen every time,
but when it does I'm pretty sure $thr->join is what triggers it: if I
uncomment the print statements, it says "Waiting for thread number 1 to
join" followed immediately by "Segmentation Fault".

Chris
#!/usr/bin/t-perl -W

use strict;
use threads;
use Thread::Queue;
use LWP::UserAgent;

my $dls = './domains.txt'; # List of domain names
my $out = './results.csv'; # Where to output results to

my @nme = ('Yahoo','Alexa','Overture');
my @url = (
    'http://search.yahoo.com/search?p=link%3Ahttp%3A%2F%2Fdomain.com&ei=UTF-8&fr=FP-tab-web-t&x=wrt',
    'http://www.alexa.com/data/ds/linksin?q=link:domain.com&url=',
    'http://inventory.overture.com/d/searchinventory/suggestion/?mkt=us&lang=en_US&term=',
);

my @rgx = (
    qr/of about ((\d|,)+)/,
    qr/Sites \s+1\s+-\s+\d+\s+of\s+((\d|,)+)/i,
    qr/color=E8E8E8> (\d+)/,
);

my $t1 = time();
open (FILE, "<$dls") || die "Unable to open $dls: $!";
my @domains = <FILE>;
close (FILE);

my @stream;
my $cthreads;

# Start 5 new threads per search engine:
for (my $i=0;$i<=$#url;$i++){

    $stream[$i] = new Thread::Queue;

    for (0..4){
        # One slot per thread; $i*$_ collides (every thread of the first engine lands in slot 0)
        my $tid = $i*5 + $_;
        my $qnm = "q".($i+1);
        $cthreads->[$tid] = threads->new(\&fetchStats, $stream[$i], $url[$i], $rgx[$i], $nme[$i], $qnm);
        #print "Thread number ".($tid + 1)." created.\n";
    }

}

# Add our domains to the queue:
foreach my $dom (@domains){
    chomp($dom);
    #print "Domain '$dom' added to the queue.\n";

    foreach my $str (@stream){ $str->enqueue($dom); }
}

# Join all our threads (except ourselves):
foreach my $thr (threads->list) {
    if ($thr->tid && !threads::equal($thr, threads->self) ){
        #print "Waiting for thread number ".$thr->tid." to join\n";
        $thr->join;
        #print "Thread number ".$thr->tid." has joined.\n";
    }
}

my $t2 = time();
my $t3 = $t2-$t1;
print "Total execution time: $t3 seconds\n";

sub fetchStats
{
    my ($upstream,$url,$rgx,$nme,$qnm) = @_;

    while (my $dom = $upstream->dequeue){

        my $ua = new LWP::UserAgent;
        $ua->parse_head(0);
        $ua->agent("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0;)");
        $ua->proxy('http','http://59.18.143.176:8080/');
        $ua->timeout(2);
        $ua->max_size(9400);

        #print "Checking $dom\n";

        my $response = $ua->get($url.$dom);

        if (!$response->is_success){
            #$response = $ua->get($url.$dom);
            #print "Retrying $dom...\n";
        }

        if ($response->is_success){

            my $txt = $response->content;
            $txt =~ s/\r|\n//gs;

            my $hits = ($txt =~ $rgx) ? $1 : 0;

            print "$dom ($nme): $hits hits\n";

        }else{
            print "$dom ($nme): timeout\n";
        }

        if ($upstream->pending < 1){ $upstream->enqueue(undef); }
    }
}
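
A note on the shutdown logic in the script above: each worker enqueues an undef whenever it sees an empty queue, and the workers are started before any domains are queued. That means a worker can signal "done" while the main thread is still filling the queue, and with fewer domains than workers some threads can stay blocked in dequeue so that join never returns. Whether this has anything to do with the segfault is not established here. The following is only a minimal sketch of a more conventional pattern, in which the main thread queues all the work and then one undef per worker. The worker sub, the pool size constant and the placeholder domain list are hypothetical stand-ins (no network access) and are not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $WORKERS = 5;    # assumed pool size, matching the 5 threads per engine in the post
my @domains = qw(example.com example.org example.net);    # placeholder data

my $queue = Thread::Queue->new;

# Hypothetical stand-in for fetchStats: counts the items it receives
# instead of making HTTP requests.
sub worker {
    my ($q) = @_;
    my $seen = 0;
    # Test definedness explicitly so that only the undef sentinel ends the loop.
    while (defined(my $dom = $q->dequeue)) {
        $seen++;
    }
    return $seen;
}

# Keep every thread object so each one can be joined exactly once.
my @pool = map { threads->create(\&worker, $queue) } 1 .. $WORKERS;

# The main thread alone decides when the work is complete:
# first all the domains, then one undef sentinel per worker.
$queue->enqueue(@domains);
$queue->enqueue((undef) x $WORKERS);

my $total = 0;
for my $thr (@pool) {
    my ($n) = $thr->join;    # each worker returns how many items it handled
    $total += $n;
}
print "Processed $total of ", scalar(@domains), " domains\n";
```

Because only the main thread enqueues sentinels, every worker receives exactly one and exits, regardless of how many domains there are or how quickly each worker runs.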

Jun 4 '06 #1