Hi all,
I'm having some trouble with a linked list function and was wondering if
anyone could shed any light on it. Basically I have a singly-linked list
which stores pid numbers of a process's children - when a child is fork()ed
its pid is added to the linked list. I then have a SIGCHLD handler which is
supposed to remove the pid from the list when a child exits. The problem I'm
having is that very occasionally, and seemingly unpredictably, my function to
remove an item from the list silently fails to do so: a child exits, the
signal handler runs, but the child's pid is not removed from the list. I'm
99% sure this is a problem with my linked list
code and not my signal handling code so I hope this is not off-topic for
comp.lang.c.
Anyway, here's my code. First the definitions for my structures:
struct queue_node {
    char ipaddr[IPSIZE+1];
    int pid;
    struct queue_node *next;
};
struct queue_list {
    struct queue_node *head;
    int elements;
};
struct queue_list.head points to the first node in the list and I refer to
my list by passing a pointer to a struct queue_list as an argument to
functions.
Here's my qdeletepid() function, which removes a node from anywhere in the
list if its pid field matches the pid passed to the function:
int qdeletepid (struct queue_list *pqueue, int pid) {
    struct queue_node *lcur;
    struct queue_node *lprev;
    if (pqueue->elements == 0) {
        listop=FALSE;
        return(0);
    }
    if (pqueue->elements == 1) {
        lcur=pqueue->head;
        if (lcur->pid == pid) {
            pqueue->head=NULL;
            pqueue->elements=0;
            free(lcur);
        }
        listop=FALSE;
        return(0);
    }
    lcur=pqueue->head;
    lprev=NULL;
    while (lcur!=NULL) {
        if (lcur->pid == pid) {
            if (lprev==NULL) {
                pqueue->head=NULL;
                pqueue->elements=0;
            } else {
                lprev->next=lcur->next;
                pqueue->elements=pqueue->elements-1;
            }
            free(lcur);
        }
        lprev=lcur;
        lcur=lcur->next;
    }
    return(0);
}
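To be clear about the behaviour I'm after, here is a minimal sketch of the
removal I expect, written separately from the code that is actually running.
The names sketch_push() and sketch_deletepid() are hypothetical, and
IPSIZE 15 is an assumption made only so the snippet compiles on its own. It
walks the list through a pointer to each link, so removing the head, a middle
node or the last node all go through the same path.

```c
#include <assert.h>
#include <stdlib.h>

#define IPSIZE 15 /* assumption: enough room for a dotted-quad IPv4 string */

struct queue_node {
    char ipaddr[IPSIZE + 1];
    int pid;
    struct queue_node *next;
};

struct queue_list {
    struct queue_node *head;
    int elements;
};

/* Hypothetical helper: put a new node at the front of the list.
 * Returns 0 on success, 1 if malloc/calloc fails. */
int sketch_push(struct queue_list *q, int pid)
{
    struct queue_node *n;

    assert(q != NULL);
    n = calloc(1, sizeof *n);
    if (n == NULL)
        return 1;
    n->pid = pid;
    n->next = q->head;
    q->head = n;
    q->elements++;
    return 0;
}

/* Hypothetical reference removal: unlink and free the first node whose
 * pid matches. Returns 1 if a node was removed, 0 if none matched. */
int sketch_deletepid(struct queue_list *q, int pid)
{
    struct queue_node **link;

    assert(q != NULL);
    /* link always points at the pointer that leads to the current node:
     * first q->head, then each node's next field in turn. */
    for (link = &q->head; *link != NULL; link = &(*link)->next) {
        if ((*link)->pid == pid) {
            struct queue_node *victim = *link;

            *link = victim->next; /* bypass the node before freeing it */
            free(victim);
            q->elements--;
            return 1;
        }
    }
    return 0;
}
```

With three nodes pushed, I would expect deleting the head pid to leave the
other two in place and elements at 2.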
----------
For completeness, here's my qaddpid() function:
int qaddpid (struct queue_list *nqueue, char *ip, int pid) {
    struct queue_node *new;
    struct queue_node *cur;
    struct queue_node *prev;
    listop=TRUE;
    new= (struct queue_node *) malloc(sizeof(struct queue_node));
    if (new==NULL) {
        syslog(LOG_NOTICE,"Failed to malloc");
        return(1);
    }
    new->next=NULL;
    snprintf(new->ipaddr,IPSIZE+1,"%s",ip);
    new->pid=pid;
    prev=NULL;
    cur=nqueue->head;
    while (cur != NULL) {
        prev=cur;
        cur=cur->next;
    }
    if (prev!=NULL) {
        new->next=prev->next;
        prev->next=new;
    } else {
        nqueue->head=new;
    }
    nqueue->elements++;
    return(0);
}
------
And here's the function I have set to handle SIGCHLD:
void childhandle (int signum) {
    pid_t cpid=0;
    while ((cpid=waitpid(0,NULL,WNOHANG)) > 0)
        qdeletepid(&queue,cpid);
}
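In case the problem turns out to be the handler firing while the main code
is part-way through qaddpid(), or before a freshly forked child's pid has
been added at all, here is a minimal sketch of masking SIGCHLD around list
updates. This is an assumption about a possible cause, not something I have
confirmed. The helper names sketch_block_sigchld(), sketch_restore_mask() and
sketch_sigchld_blocked() are hypothetical; the calls they wrap
(sigemptyset, sigaddset, sigprocmask, sigismember) are standard POSIX.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical helper: block SIGCHLD and save the previous signal mask
 * in *oldmask. Returns 0 on success, -1 on failure. */
int sketch_block_sigchld(sigset_t *oldmask)
{
    sigset_t block;

    assert(oldmask != NULL);
    if (sigemptyset(&block) != 0 || sigaddset(&block, SIGCHLD) != 0)
        return -1;
    return sigprocmask(SIG_BLOCK, &block, oldmask);
}

/* Hypothetical helper: restore the mask saved by sketch_block_sigchld().
 * A SIGCHLD that arrived while blocked is delivered once it is unblocked. */
int sketch_restore_mask(const sigset_t *oldmask)
{
    assert(oldmask != NULL);
    return sigprocmask(SIG_SETMASK, oldmask, NULL);
}

/* Hypothetical helper: report whether SIGCHLD is currently blocked.
 * Returns 1 if blocked, 0 if not, -1 on error. */
int sketch_sigchld_blocked(void)
{
    sigset_t cur;

    /* With a NULL set, sigprocmask only reports the current mask. */
    if (sigprocmask(SIG_BLOCK, NULL, &cur) != 0)
        return -1;
    return sigismember(&cur, SIGCHLD);
}
```

The idea would be to call sketch_block_sigchld() before fork() and the
matching qaddpid(), then sketch_restore_mask() afterwards, so that
childhandle() can only run between list updates and never in the middle of
one.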
--
Any ideas why qdeletepid() very occasionally fails to remove nodes from the
list? It happens only rarely, but the program forks children constantly,
24/7, so over the course of a couple of days the list gets progressively
bigger. That makes it very difficult to debug without knowing why the
problem occurs.
Help very much appreciated.
~Kieran Simkin
Digital Crocus
http://digital-crocus.com/