Bytes | Software Development & Data Engineering Community

Stuck with a strange core (C code) -- Please help.

SZ
Hi,

I've hit this core dump multiple times when running an application
(written in C) on BSD. Loading the core in gdb shows:

.....
(gdb) bt
#0 0x8541fa3 in mTimerQUnlink (t=0xd7f5a51, head=0x8fec13c)
at ../../rsvp/eventwheel.c:160
#1 0x854288d in mTimerInsert (t=0xd7f5a51,
callback=0x873ce88 <retran_unacked>, param=0xd7f5a01, time=50,
type=1)
at ../../rsvp/eventwheel.c:536
#2 0x873d266 in retran_unacked (unacked_cb=0xd7f5a00)
at ../../rsvp/rrrmsgi3.c:1425
....
Notice that the address (0xd7f5a00) in "unacked_cb" (a pointer to a
struct) at frame 2 is passed to mTimerInsert as "param". But "param"
becomes 0xd7f5a01 (a "1" has been added at the end). All the
subsequent processing is based on this wrong address, which causes the
core dump. The application is compiled by gcc with "-O2" along with
some other options.

I know an address here cannot be an odd number. This is a weird
problem. The stack shown by the bt command in gdb seems to be complete.

Does anyone know which direction I should take for troubleshooting?

Thanks a lot

-SZ
Nov 14 '05 #1
SZ wrote:
Anyone knows what direction should I go with for the trouble shooting?


Towards a system or application specific newsgroup or mailing list,
for example, if this is not your own code.
You did not provide us with C code, let alone a minimal example.
Is it something you cooked up yourself? What does the structure
look like? What does the function call look like? ...

It may be an easy to spot error -- but not without source.
Quality crystal balls don't come cheap.
Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.
Nov 14 '05 #2
If I do:

void foo(struct foo *p)
{
char *a = (char *)p;
a++;
...
}

I increment the address by one.

Two possible problems spring to mind:
1:
You are passing a wrong parameter, or the routine is
expecting a char pointer that gets incremented in the
code of the routine
2:
There is a memory overwrite somewhere.

Procedure:

Follow the called routine (in assembler if you do
not have the source) and see where it is being changed

jacob

Nov 14 '05 #3
SZ
Thanks, Jacob, for your response.

The address is never incremented on purpose in the C code. It is
not a char * either. Incrementing it would move it by something much
larger, since the structure has at least 40 bytes.
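
To make the size argument concrete, here is a minimal sketch. The struct below is a hypothetical stand-in, not the real RRR_SENT_UNACKED_CB; only the fact that it is larger than one byte matters. Incrementing a struct pointer moves it by sizeof the struct, while the same address viewed as a char * moves by exactly one byte, so a "+1" in the raw address cannot come from ordinary arithmetic on the struct pointer.

```c
#include <stddef.h>

/* Hypothetical stand-in for the real control block; only its size matters. */
struct cb {
    int   fields[8];
    void *links[2];
};

/* Bytes covered by incrementing a struct pointer once. */
size_t step_as_struct(struct cb *p)
{
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* Bytes covered by incrementing the same address as a char pointer. */
size_t step_as_char(struct cb *p)
{
    char *a = (char *)p;
    return (size_t)((a + 1) - a);
}
```

step_as_struct() returns sizeof(struct cb) and step_as_char() returns 1, which is why a stray byte-sized increment points towards a char-style access or a memory overwrite.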

NOTE: the same routine is run many times and the problem does not
occur every time. I do have the source code available. I've checked
it many times and did not find anywhere that the address is purposely
changed before it gets passed to the next function.

So, I tend to believe this is a memory violation of some sort. I once added
an assert in the related routine to catch an odd (bad) pointer, but it
then seemed to core somewhere else.
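
A check of this kind could look roughly like the sketch below. The helper name is_aligned is hypothetical, and the code assumes that converting a pointer to uintptr_t yields its numeric address, which holds on common flat-memory platforms but is not guaranteed by the C standard.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero when p is a multiple of align (align must be a power of two).
 * Assumes the uintptr_t conversion exposes the numeric address. */
int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}
```

In the routine itself this would be used as something like assert(is_aligned(unacked_cb, sizeof(void *))). Such a check only reports that the pointer is already bad at that point; it does not say who changed it.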

I do not know if there is a good way to catch such a memory violation. I would
appreciate it if anyone knows a way to catch it.

Thanks again

-SZ
Nov 14 '05 #4
SZ
Thanks for your response, Michael.

Sorry, I did not make it clear. The function is actually run
many times, and it does not core every time. The code
never purposely increments the address. The source is like the
following (the original source is too long to put here).
int retran_unacked (unacked_cb=0xd7f5a00)
{
int sub = unacked_cb->sub; <---- value of sub is correct here.

other_func(unacked_cb);

... <-- some local operations.

mTimerInsert (&unacked_cb->time, callback_func, unacked_cb, time, type);

...
}

The function itself is not so complicated, and the function (other_func())
cannot change "unacked_cb". So I have to assume there must be some
kind of memory violation. But I am not sure if there is a good general
way to track down such a problem. I am not even sure whether the violation is
somewhere nearby or at a totally irrelevant place.

I can mail you the source if needed.

Thanks again.

-SZ
Nov 14 '05 #5
SZ wrote:

int retran_unacked (unacked_cb=0xd7f5a00)
Since you said you were using some flavor of BSD, that address seems to
refer to some stack... so my guess is that the unacked_cb was declared
as an automatic variable, and in this case...
{
int sub = unacked_cb->sub; <---- value of sub is correct here.

other_func(unacked_cb);

... <-- some local operations.

mTimerInsert (&unacked_cb->time, callback_func, unacked_cb, time, type);
.... this line is wrong. You are passing a pointer to a field of an
automatic variable (&unacked_cb->time) to a function that probably
remembers it and will try to modify the pointed-to integer at some later
point in time, when the automatic object is no longer live.
manuel,

...
}
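
The lifetime concern raised above can be sketched as follows. Everything here is hypothetical (a one-slot "timer queue" and a tiny control block), not the real mTimerInsert API. The point is only that a function which remembers a pointer needs the pointed-to object to outlive the caller, which heap allocation provides and an automatic variable does not.

```c
#include <stdlib.h>

/* Hypothetical timer record embedded in a control block. */
struct timer { int ticks; };
struct cb    { int sub; struct timer t; };

/* Hypothetical one-slot "timer queue" that remembers the pointer it is given. */
static struct timer *pending;

void timer_insert(struct timer *t, int ticks)
{
    t->ticks = ticks;
    pending  = t;               /* kept beyond the caller's lifetime */
}

/* Later, e.g. from the event loop: dereferences the remembered pointer. */
int timer_fire(void)
{
    int ticks = pending->ticks;
    pending = NULL;
    return ticks;
}

/* Safe shape: the block lives on the heap, so it outlives this function. */
struct cb *arm_heap_cb(int ticks)
{
    struct cb *cb = malloc(sizeof *cb);
    if (cb == NULL)
        return NULL;
    cb->sub = 0;
    timer_insert(&cb->t, ticks);
    return cb;                  /* caller frees it after the timer has fired */
}

/* Unsafe shape (deliberately not compiled): declaring `struct cb cb;` as a
 * local, calling timer_insert(&cb.t, ticks) and returning leaves `pending`
 * pointing into a dead stack frame. */
```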

Nov 14 '05 #6
Hello SZ,
one thing first: Please do not top-post.

Sorry, I did not make it clear. The function actually are run
through multiple times and not every time it will core. The code
never purposely increment the address.
Okay, so this really speaks of stack corruption or some other
flavour of pointer trouble.

The source is just like the
following (the original source it too long to put here).
Note: Many people who post code look-alikes leave out the critical
parts so the original code is really necessary if you cannot
break it down to a minimal example.
int retran_unacked (unacked_cb=0xd7f5a00)
{
int sub = unacked_cb->sub; <---- value of sub is correct here.

other_func(unacked_cb);

... <-- some local operations.

mTimerInsert (&unacked_cb->time, callback_func, unacked_cb, time, type);

...
}

The function itself is not so complicated and the function (other_func())
can not change "unacked_cb".
Is the place in the parameter lists where you pass unacked_cb
or the address of unacked_cb->time const qualified? Otherwise
I would say try it. Your compiler may tell you interesting
things about it.

So, I have to assume there must be some
kind of memory violation. But I am not sure if there is a general good
way to track down such problem. I am not even sure the violation is somewhere
near or at totally irrelavent places.
Okay, this is somewhat offtopic:
<OT>
You claim to have tried finding the error by just about every other means,
so I have only one suggestion now:
Find out the addresses where things are stored and watch the contents
with hardware watchpoints. Example:
--------------
$ print &object
0xdeadbeef
$ watch *((struct objtype *)0xdeadbeef)
--------------
It is important to use the actual address; otherwise the watchpoint
will cease to exist after leaving the function where it was defined.
You have to delete the hardware watchpoints before a new run.
</OT>

I can mail you the source if needed.


Don't. If you have some webspace, put it up and post the URL.
There are many people here who know more than I do, or who at least
have time to take a peek at it sooner than I do.
Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.
Nov 14 '05 #7
SZ
Thanks, Michael, please see the following:

Michael Mair <Mi**********@invalid.invalid> wrote in message news:<2v*************@uni-berlin.de>...

....
Note: Many people who post code look-alikes leave out the critical
parts so the original code is really necessary if you cannot
break it down to a minimal example.
Ok, I've put the source of the problematic function at the end.


Is the place in the parameter lists where you pass unacked_cb
or the address of unacked_cb->time const qualified? Otherwise
I would say try it. Your compiler may tell you interesting
things about it.
I had not used a const qualifier. But I just added one and it compiled
without any warning or error. NOTE: it is not the contents of "unacked_cb"
that get changed; the pointer itself gets incremented unexpectedly.
I tried: retran_unacked(RRR_SENT_UNACKED_CB * const unacked_cb CCXT_T CXT)
and the compiler gave no warning.
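
As a side note on what each placement of const covers, here is a small sketch with a hypothetical struct. A const after the star protects the pointer variable from assignments written in that function; a const before the type protects the pointed-to object from writes made through that pointer. Neither can stop an unrelated wild write from changing the memory where the pointer is stored, which would explain why the compiler stays silent.

```c
/* Hypothetical control block for illustration. */
struct cb { int attempts; };

/* struct cb *const p: p itself cannot be reassigned in this function,
 * but the object it points to may still be modified. */
void touch(struct cb *const p)
{
    p->attempts++;          /* allowed */
    /* p++;                    would not compile: p is const */
}

/* const struct cb *p: the pointee is read-only through p,
 * but p itself may be changed. */
int peek(const struct cb *p)
{
    /* p->attempts++;          would not compile: pointee is const */
    return p->attempts;
}
```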
Okay, this is somewhat offtopic:
<OT>
You claim to have tried finding the error with about any other means,
so I have only one suggestion now:
Find out the addresses where things are stored and watch the contents
with hardware watchpoints. Example:
--------------
$ print &object
0xdeadbeef
$ watch *((struct objtype *)0xdeadbeef)
--------------
Important is to use the actual "address", otherwise the watchpoint
will cease existance after leaving the function where it was defined.
You have to delete the hardware watchpoints before a new run.
</OT>


The problem here is that unacked_cb is a pointer to dynamically
allocated memory and there are many such pointers. Before the problem occurs,
I don't know which one will have the problem. Therefore, it is not possible
to type the debug command beforehand. I wish the "watch" function could be
coded into the C code to ensure that no such pointer gets changed
during its life cycle.

Notice the remarks I put behind the "<--------" signs. The struct
RRR_SENT_UNACKED_CB really has nothing special. It has some pointers and
ints defined inside. But in this case, I guess it is irrelevant since we are talking
about a change to the pointer itself, not to its contents.

Thanks a lot.

-SZ
Here is the source code:

void
retran_unacked(RRR_SENT_UNACKED_CB *unacked_cb CCXT_T CXT)
{
RRR_NEXT_HOP_CB *next_hop = unacked_cb->next_hop_cb;
int frr_rc;
PSB *psbp;

next_hop->nh_event_usage_count++; /* <---- next_hop is correct, so unacked_cb is ok here. */
if (RPL_RL && (unacked_cb->sa_resend_attempts >= RPL_RL))
{
if ((unacked_cb->sa_pkt_msg_type == PATH) ||
(unacked_cb->sa_pkt_msg_type == RESV))
{
frr_rc = RRR_FRR_CONTINUE;
if (RRR_EX_FAST_REROUTE())
{
frr_rc = rrr_frr_proc_sent_unacked_msg(
unacked_cb->sa_pkt_msg_type,
unacked_cb->spi_msg_id_info->msg_id_psb_parent CCXT);
}

if (frr_rc == RRR_FRR_CONTINUE)
{
rrr_maybe_send_error_packet(unacked_cb->sa_pkt_msg_type,
unacked_cb->sa_state_handle,
unacked_cb->sa_upstrm_lih_set,
unacked_cb->sa_upstrm_lih
CCXT);
}
}

rrr_delete_sent_unacked_cb(unacked_cb, next_hop CCXT);
}
else
{
rrr_resend_packet(unacked_cb, next_hop CCXT);

if (next_hop->nh_use_msg_ids == ATG_NO)
{
rrr_delete_sent_unacked_cb(unacked_cb, next_hop CCXT);
}
else
{
unacked_cb->sa_resend_attempts++;
if (next_hop->nh_rr_decay != 100) {
unacked_cb->sa_retrans_interval *=
(100 + next_hop->nh_rr_decay);
unacked_cb->sa_retrans_interval /= 100;
} else {
unacked_cb->sa_retrans_interval =
unacked_cb->sa_retrans_interval << 1;
}
if (unacked_cb->sa_pkt_msg_type == PATH_TEAR) {
if (!unacked_cb->spi_msg_id_info) {
rrr_delete_sent_unacked_cb(unacked_cb,
next_hop CCXT);
goto EXIT_LABEL;
}
psbp = sent_unacked_cb->spi_msg_id_info->msg_id_psb_parent;
if (!psbp) {
rrr_delete_sent_unacked_cb(unacked_cb,
next_hop CCXT);
goto EXIT_LABEL;
}
if (psbp->rapid_retran & RPL_PATH_TEAR) {
psbp->rapid_retran &= ~(RPL_PATH_TEAR|RPL_PATH_ERROR);
rrr_delete_retries_for_psb(psbp,
RRR_KILL_RETRY_FOR_ALL_MSGS CCXT);
deferred_kill_PSB(psbp CCXT);
}
goto EXIT_LABEL;
}
if (unacked_cb->sa_retrans_interval >= RPL_RM) {
if (unacked_cb->sa_pkt_msg_type == PATH_TEAR) {
if (!unacked_cb->spi_msg_id_info) {
rrr_delete_sent_unacked_cb(unacked_cb,
next_hop CCXT);
goto EXIT_LABEL;
}
psbp = sent_unacked_cb->spi_msg_id_info->msg_id_psb_parent;
if (!psbp) {
rrr_delete_sent_unacked_cb(unacked_cb,
next_hop CCXT);
goto EXIT_LABEL;
}
if (psbp->rapid_retran & RPL_PATH_TEAR) {
psbp->rapid_retran &=
~(RPL_PATH_TEAR|RPL_PATH_ERROR);
rrr_delete_retries_for_psb(psbp,
RRR_KILL_RETRY_FOR_ALL_MSGS CCXT);
deferred_kill_PSB(psbp CCXT);
}
goto EXIT_LABEL;
} else if (unacked_cb->sa_pkt_msg_type == PATH_ERR){
if (!unacked_cb->spi_msg_id_info) {
rrr_delete_sent_unacked_cb(unacked_cb,
next_hop CCXT);
goto EXIT_LABEL;
}
psbp = sent_unacked_cb->spi_msg_id_info->msg_id_psb_parent;
if (!psbp) {
rrr_delete_sent_unacked_cb(unacked_cb,
next_hop CCXT);
goto EXIT_LABEL;
}
if (psbp->rapid_retran & RPL_PATH_ERROR) {
rrr_delete_sent_unacked_cb(unacked_cb,
next_hop CCXT);
psbp->rapid_retran &= ~RPL_PATH_ERROR;
frr_rc = rrr_frr_proc_path_tear(psbp,
ATG_MPLS_XC_REL_REAS_IF_DOWN,
TRUE
CCXT);
if (frr_rc == RRR_FRR_CONTINUE) {
psbp->ps_rel_reason = ATG_MPLS_XC_REL_REAS_IF_DOWN;
tear_or_kill_PSB(psbp CCXT);
}
}
goto EXIT_LABEL;
} else if (unacked_cb->sa_pkt_msg_type == RESV_TEAR){
rrr_delete_sent_unacked_cb(unacked_cb,
next_hop CCXT);
goto EXIT_LABEL;
} else if (unacked_cb->sa_pkt_msg_type == RESV_ERR) {
rrr_delete_sent_unacked_cb(unacked_cb,
next_hop CCXT);
goto EXIT_LABEL;
}
unacked_cb->sa_retrans_interval = RPL_RF;
}
unacked_cb->sa_resend_time =
unacked_cb->sa_retrans_interval;

mTimerInsert(&unacked_cb->m_timer, (vfcnptr_2)retran_unacked,
unacked_cb,unacked_cb->sa_resend_time, TTYPE_ONESHOT);
/* <------ The stack shows that both &unacked_cb->m_timer
and unacked_cb are added by 1. */
}
}
EXIT_LABEL:
next_hop->nh_event_usage_count--;
rrr_maybe_free_next_hop(next_hop CCXT);

return;

} /* retran_unacked */
Nov 14 '05 #8
see inline
[snip]

#include <stdio.h>

/* NOTE: both pointer parameters are void * because one of them may not
   point to a valid object - i.e. off by one byte */
void CheckPointers(const void *p1, const void *p2, const char *context)
{
    if (p1 != p2)
    {
        /* set a break point here */
        fprintf(stderr, "%s %p!=%p\n", context, (void *)p1, (void *)p2);
    }
}
void
retran_unacked(RRR_SENT_UNACKED_CB *unacked_cb CCXT_T CXT)
{
RRR_NEXT_HOP_CB *next_hop = unacked_cb->next_hop_cb;
int frr_rc;
PSB *psbp;
RRR_SENT_UNACKED_CB *unacked_cb_back=unacked_cb;
next_hop->nh_event_usage_count++; /* <---- next_hop is correct, so unacked_cb is ok here. */
if (RPL_RL && (unacked_cb->sa_resend_attempts >= RPL_RL))
{
if ((unacked_cb->sa_pkt_msg_type == PATH) ||
(unacked_cb->sa_pkt_msg_type == RESV))
{
frr_rc = RRR_FRR_CONTINUE;
if (RRR_EX_FAST_REROUTE())
{
frr_rc = rrr_frr_proc_sent_unacked_msg(
unacked_cb->sa_pkt_msg_type,
unacked_cb->spi_msg_id_info->msg_id_psb_parent CCXT);
}

if (frr_rc == RRR_FRR_CONTINUE)
{
rrr_maybe_send_error_packet(unacked_cb->sa_pkt_msg_type,
unacked_cb->sa_state_handle,
unacked_cb->sa_upstrm_lih_set,
unacked_cb->sa_upstrm_lih
CCXT);
}
}

rrr_delete_sent_unacked_cb(unacked_cb, next_hop CCXT);
}
else
{
rrr_resend_packet(unacked_cb, next_hop CCXT);

if (next_hop->nh_use_msg_ids == ATG_NO)
{
rrr_delete_sent_unacked_cb(unacked_cb, next_hop CCXT);
}
else
{
unacked_cb->sa_resend_attempts++;
if (next_hop->nh_rr_decay != 100) {
unacked_cb->sa_retrans_interval *=
(100 + next_hop->nh_rr_decay);
unacked_cb->sa_retrans_interval /= 100;
} else {
unacked_cb->sa_retrans_interval =
unacked_cb->sa_retrans_interval << 1;
}
if (unacked_cb->sa_pkt_msg_type == PATH_TEAR) {
if (!unacked_cb->spi_msg_id_info) {
rrr_delete_sent_unacked_cb(unacked_cb,
next_hop CCXT);
goto EXIT_LABEL;
}
psbp = sent_unacked_cb->spi_msg_id_info->msg_id_psb_parent;
if (!psbp) {
rrr_delete_sent_unacked_cb(unacked_cb,
next_hop CCXT);
goto EXIT_LABEL;
}
if (psbp->rapid_retran & RPL_PATH_TEAR) {
psbp->rapid_retran &= ~(RPL_PATH_TEAR|RPL_PATH_ERROR);
rrr_delete_retries_for_psb(psbp,
RRR_KILL_RETRY_FOR_ALL_MSGS CCXT);
deferred_kill_PSB(psbp CCXT);
}
goto EXIT_LABEL;
}
if (unacked_cb->sa_retrans_interval >= RPL_RM) {
if (unacked_cb->sa_pkt_msg_type == PATH_TEAR) {
if (!unacked_cb->spi_msg_id_info) {
rrr_delete_sent_unacked_cb(unacked_cb,
next_hop CCXT);
goto EXIT_LABEL;
}
psbp = sent_unacked_cb->spi_msg_id_info->msg_id_psb_parent;
if (!psbp) {
rrr_delete_sent_unacked_cb(unacked_cb,
next_hop CCXT);
goto EXIT_LABEL;
}
if (psbp->rapid_retran & RPL_PATH_TEAR) {
psbp->rapid_retran &=
~(RPL_PATH_TEAR|RPL_PATH_ERROR);
rrr_delete_retries_for_psb(psbp,
RRR_KILL_RETRY_FOR_ALL_MSGS CCXT);
deferred_kill_PSB(psbp CCXT);
}
goto EXIT_LABEL;
} else if (unacked_cb->sa_pkt_msg_type == PATH_ERR){
if (!unacked_cb->spi_msg_id_info) {
rrr_delete_sent_unacked_cb(unacked_cb,
next_hop CCXT);
goto EXIT_LABEL;
}
psbp = sent_unacked_cb->spi_msg_id_info->msg_id_psb_parent;
if (!psbp) {
rrr_delete_sent_unacked_cb(unacked_cb,
next_hop CCXT);
goto EXIT_LABEL;
}
if (psbp->rapid_retran & RPL_PATH_ERROR) {
rrr_delete_sent_unacked_cb(unacked_cb,
next_hop CCXT);
psbp->rapid_retran &= ~RPL_PATH_ERROR;
frr_rc = rrr_frr_proc_path_tear(psbp,
ATG_MPLS_XC_REL_REAS_IF_DOWN,
TRUE
CCXT);
if (frr_rc == RRR_FRR_CONTINUE) {
psbp->ps_rel_reason = ATG_MPLS_XC_REL_REAS_IF_DOWN;
tear_or_kill_PSB(psbp CCXT);
}
}
goto EXIT_LABEL;
} else if (unacked_cb->sa_pkt_msg_type == RESV_TEAR){
rrr_delete_sent_unacked_cb(unacked_cb,
next_hop CCXT);
goto EXIT_LABEL;
} else if (unacked_cb->sa_pkt_msg_type == RESV_ERR) {
rrr_delete_sent_unacked_cb(unacked_cb,
next_hop CCXT);
goto EXIT_LABEL;
}
unacked_cb->sa_retrans_interval = RPL_RF;
}
unacked_cb->sa_resend_time =
unacked_cb->sa_retrans_interval;
/* if unacked_cb is bad below then it should be bad above - work back from here
adding calls to CheckPointers. */
mTimerInsert(&unacked_cb->m_timer, (vfcnptr_2)retran_unacked,
unacked_cb,unacked_cb->sa_resend_time, TTYPE_ONESHOT);
/* <------ The stack shows that both &unacked_cb->m_timer
and unacked_cb are added by 1. */

}
}
EXIT_LABEL:
next_hop->nh_event_usage_count--;
rrr_maybe_free_next_hop(next_hop CCXT);

return;

} /* retran_unacked */

see top and inline
Add copious calls to CheckPointers(unacked_cb,unacked_cb_back,"Context String");
throughout retran_unacked to figure out where unacked_cb is being
corrupted.
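
One possible refinement, sketched here under assumed names (check_pointers_at and CHECK_PTR are not part of the original code): let a macro supply the file and line, so each checkpoint identifies itself without a hand-written context string. The parameters stay void * for the same reason as above, since one of the two may not point to a valid object.

```c
#include <stdio.h>

/* Returns 1 when the pointers match, 0 (after logging) when they differ. */
int check_pointers_at(const void *now, const void *saved,
                      const char *file, int line)
{
    if (now != saved) {
        /* set a break point here */
        fprintf(stderr, "%s:%d: pointer changed: %p != %p\n",
                file, line, (void *)now, (void *)saved);
        return 0;
    }
    return 1;
}

/* Records the location of each checkpoint automatically. */
#define CHECK_PTR(now, saved) \
    check_pointers_at((now), (saved), __FILE__, __LINE__)
```

Placing CHECK_PTR(unacked_cb, unacked_cb_back); after each call made inside retran_unacked narrows the corruption down to the first call after which the check fails.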


Nov 14 '05 #10
SZ
I think the problem is found:

there is a function, a couple of layers down from
rrr_resend_packet(), that does something like a[i]++ with a[] defined
as a local and i uninitialized. This causes it to randomly pick
a place (on the stack) and increment it.
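
For illustration only, a defensive shape for that kind of increment might look like the sketch below. The function name bump and its signature are hypothetical; the real code was not posted. The index is passed in explicitly and validated against the array length before the write.

```c
#include <stddef.h>

/* Increments a[i] only when i is a valid index into an array of n ints.
 * Returns 0 on success, -1 when the index is out of range. */
int bump(int *a, size_t n, size_t i)
{
    if (i >= n)
        return -1;
    a[i]++;
    return 0;
}
```

A range check like this turns a stray write into a reportable error, although it cannot make an uninitialized index meaningful; the index still has to be assigned before use.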

Geez, I hope there is a better, systematic way to catch a memory
violation like this.

Thanks everyone for your help.

-SZ
Nov 14 '05 #11
sh******@yahoo.com (SZ) wrote:
there is a function, which is a couple of layers down from
rrr_resend_packet(), got something like a[i]++ with a[] defined
as local and i uninitialized. This cause it radmomly pick up
a place (in stack) and increment it.

Geez, I hope there is a better systematic way to catch memory
violation like this.


Well, in this case, a good way to catch it would probably have been not
to rely on a memory tracker and your OS alone, but to turn up the
warning level on your compiler until the knob nearly comes off, and then
turning it one stop back. All compilers I've ever worked with were
capable of telling you that you're using an uninitialised variable,
which would've caught this bug before it got out.
(The reason for the one stop back, btw, is that most compilers' "really
anal-retentive" settings are _too_ strict and pick up things that are,
e.g., only a problem if you're compiling for an embedded system, which
you probably aren't; at least one edition of one compiler famously
complains about its own system headers when set to the highest warning
level. Nevertheless, below the extreme, stricter warnings is better.)
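
A small sketch of the kind of code such warnings are aimed at. The function name and the SHOW_BUG macro are made up for this example. Built normally, the fixed branch is compiled. Built with something like gcc -O2 -Wall -DSHOW_BUG -c, GCC would be expected to report that i is used uninitialized, though the exact wording, and whether the diagnostic fires at all, depends on the compiler version and optimization level.

```c
/* Counts one "hit" in slot 0 of a local array and returns that slot. */
int count_hit(void)
{
    int a[4] = {0, 0, 0, 0};
#ifdef SHOW_BUG
    int i;          /* never assigned: a[i]++ would write to an arbitrary place */
#else
    int i = 0;      /* fixed: the index is initialized before use */
#endif
    a[i]++;
    return a[0];
}
```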

Richard
Nov 14 '05 #12
