Bytes | Software Development & Data Engineering Community

Signal dispositions

Hi what's the reason for having the default disposition for SIGSEGV,
SIGFPE, SIGBUS etc to be terminating the program, when these signals can
just be ignored by the program? Many programs crash with SIGSEGV -
they'd be much less flakey if the default was to try to carry on.

~Jon~

Nov 2 '07 #1
"Leet Jon" <jo*@nospam.comschrieb im Newsbeitrag
news:sl*******************@nospam.com...
Hi what's the reason for having the default disposition for SIGSEGV,
SIGFPE, SIGBUS etc to be terminating the program, when these signals can
just be ignored by the program? Many programs crash with SIGSEGV -
they'd be much less flakey if the default was to try to carry on.
Carry on with corrupted data? No, that's not a sane default.

Bye, Jojo
Nov 2 '07 #2
Leet Jon <jo*@nospam.com> writes:
Hi what's the reason for having the default disposition for SIGSEGV,
SIGFPE, SIGBUS etc to be terminating the program, when these signals can
just be ignored by the program? Many programs crash with SIGSEGV -
they'd be much less flakey if the default was to try to carry on.
The default handling for signals is implementation-defined (C99
7.14p4), so you might get better answers in comp.unix.programmer than
here in comp.lang.c. (I just noticed the cross-post; I've set
followups to comp.unix.programmer.)

However, letting a program continue running by default after a
catastrophic data-corrupting failure would not be a good idea. If a
program dies immediately after "an invalid access to storage" (which
is all the C standard says about SIGSEGV), then you have a good chance
of diagnosing and correcting the problem before putting the code into
production. If the error is ignored, the program will very likely
continue to corrupt your data in subtle ways; tracking it down and
fixing it is going to be difficult if the error occurs at a customer
site, or even during an important demo.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Looking for software development work in the San Diego area.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Nov 2 '07 #3
On 2 Nov 2007 at 19:10, Keith Thompson wrote:
However, letting a program continue running by default after a
catastrophic data-corrupting failure would not be a good idea. If a
program dies immediately after "an invalid access to storage" (which
is all the C standard says about SIGSEGV), then you have a good chance
of diagnosing and correcting the problem before putting the code into
production. If the error is ignored, the program will very likely
continue to corrupt your data in subtle ways; tracking it down and
fixing it is going to be difficult if the error occurs at a customer
site, or even during an important demo.
I believe you are completely wrong on this point. Very often a SIGSEGV
will be caused by (say) a single bad array access - the consequences
will be highly localized, and carrying on with the program will not
cause any significant problems.

Who wants their customer to run their program and have it just crash
with a segfault? That hardly comes across as professional. Better to try
your best to carry on and weather the storm than to just dump the user
with a crash.

I can understand that for debugging purposes you might want to have
SIGSEGV etc. generate a core file, but in production code the default
should be for these signals to be ignored.

Nov 2 '07 #4
Leet Jon wrote:
Hi what's the reason for having the default disposition for SIGSEGV,
SIGFPE, SIGBUS etc to be terminating the program,
That's not specified by the C Standard but is a matter for
implementation.
when these signals can just be ignored by the program?
Do you routinely ignore early signs of serious illness? No? Then why
should one ignore signs of erroneous conditions in a program and allow
it to result in erroneous output?
Many programs crash with SIGSEGV - they'd be much less flakey if the
default was to try to carry on.
Try a little demo yourself. Write a data processing program of any kind
and deliberately code in a bounds violation condition. Then make sure
to catch SIGSEGV and continue execution. Observe if the end result is
what the program is supposed to do.

Signals indicate exceptional situations that the program must imminently
address. Ignoring a signal, regardless of whether it's because of
program error or a normal but exceptional condition, is only likely to
break the program further.

Nov 2 '07 #5
On Fri, 2 Nov 2007 21:16:31 +0100 (CET), Leet Jon <jo*@nospam.com>
wrote:
>On 2 Nov 2007 at 19:10, Keith Thompson wrote:
>However, letting a program continue running by default after a
catastrophic data-corrupting failure would not be a good idea. If a
program dies immediately after "an invalid access to storage" (which
is all the C standard says about SIGSEGV), then you have a good chance
of diagnosing and correcting the problem before putting the code into
production. If the error is ignored, the program will very likely
continue to corrupt your data in subtle ways; tracking it down and
fixing it is going to be difficult if the error occurs at a customer
site, or even during an important demo.

I believe you are completely wrong on this point. Very often a SIGSEGV
will be caused by (say) a single bad array access - the consequences
will be highly localized, and carrying on with the program will not
cause any significant problems.
How on earth would you know what the consequences might be? If the
program in question is calculating my paycheck, I don't want any bad
array access to be ignored.

What kind of programs do you write? Games?
>
Who wants their customer to run their program and have it just crash
with a segfault? That hardly comes across as professional.
What's not professional is writing code that causes segfaults.
Better to try
your best to carry on and weather the storm than to just dump the user
with a crash.

I can understand that for debugging purposes you might want to have
SIGSEGV etc. generate a core file, but in production code the default
should be for these signals to be ignored.
In production code, those signals should never be generated. If they
are, they should crash, so that the user can complain, and someone can
fix it.

--
Al Balmer
Sun City, AZ
Nov 2 '07 #6
Leet Jon wrote:
On 2 Nov 2007 at 19:10, Keith Thompson wrote:
>However, letting a program continue running by default after a
catastrophic data-corrupting failure would not be a good idea. If a
program dies immediately after "an invalid access to storage" (which
is all the C standard says about SIGSEGV), then you have a good chance
of diagnosing and correcting the problem before putting the code into
production. If the error is ignored, the program will very likely
continue to corrupt your data in subtle ways; tracking it down and
fixing it is going to be difficult if the error occurs at a customer
site, or even during an important demo.

I believe you are completely wrong on this point. Very often a SIGSEGV
will be caused by (say) a single bad array access - the consequences
will be highly localized, and carrying on with the program will not
cause any significant problems.

Who wants their customer to run their program and have it just crash
with a segfault? That hardly comes across as professional. Better to try
your best to carry on and weather the storm than to just dump the user
with a crash.

I can understand that for debugging purposes you might want to have
SIGSEGV etc. generate a core file, but in production code the default
should be for these signals to be ignored.
OK, but you really haven't made a case that it would be worthwhile to
change the default behavior. What's wrong with supplying your own
signal handler when you want something else?

--
SM
rot13 for email
Nov 2 '07 #7
Leet Jon <jo*@nospam.com> wrote in
news:sl*******************@nospam.com:
Hi what's the reason for having the default disposition for
SIGSEGV, SIGFPE, SIGBUS etc to be terminating the program,
when these signals can just be ignored by the program?
In spite of what you seem to believe, an application is not
allowed to ignore the signal, and is very limited in how it can
handle the signal. To quote POSIX/SUSv3:

"The behavior of a process is undefined after it ignores a SIGFPE,
SIGILL, SIGSEGV, or SIGBUS signal that was not generated by kill
( ),sigqueue( ),or raise( )."

and

"The behavior of a process is undefined after it returns normally
from a signal-catching function for a SIGBUS, SIGFPE, SIGILL, or
SIGSEGV signal that was not generated by kill( ),sigqueue( ),or
raise( )."

MV

--
I do not want replies; please follow-up to the group.
Nov 2 '07 #8
Leet Jon wrote:
On 2 Nov 2007 at 19:10, Keith Thompson wrote:
However, letting a program continue running by default after a
catastrophic data-corrupting failure would not be a good idea. If a
program dies immediately after "an invalid access to storage" (which
is all the C standard says about SIGSEGV), then you have a good chance
of diagnosing and correcting the problem before putting the code into
production. If the error is ignored, the program will very likely
continue to corrupt your data in subtle ways; tracking it down and
fixing it is going to be difficult if the error occurs at a customer
site, or even during an important demo.

I believe you are completely wrong on this point. Very often a SIGSEGV
will be caused by (say) a single bad array access - the consequences
will be highly localized, and carrying on with the program will not
cause any significant problems.
In general, if that bad array access is a write, it may completely
mess up some other part of the program. Also, code sufficiently
defective to generate a bad array access is extremely unlikely to
generate only one such access; they usually produce large numbers of
them.
Who wants their customer to run their program and have it just crash
with a segfault?
Given the choice between crashing, and continuing to run, I strongly
prefer the crash. If someone desperately needs that program to be
running, they presumably need it to run correctly, and that's highly
unlikely after an ignored SIGSEGV signal.

Nov 2 '07 #9
Leet Jon <jo*@nospam.com> writes:
On 2 Nov 2007 at 19:10, Keith Thompson wrote:
>However, letting a program continue running by default after a
catastrophic data-corrupting failure would not be a good idea. If a
program dies immediately after "an invalid access to storage" (which
is all the C standard says about SIGSEGV), then you have a good chance
of diagnosing and correcting the problem before putting the code into
production. If the error is ignored, the program will very likely
continue to corrupt your data in subtle ways; tracking it down and
fixing it is going to be difficult if the error occurs at a customer
site, or even during an important demo.

I believe you are completely wrong on this point. Very often a SIGSEGV
will be caused by (say) a single bad array access - the consequences
will be highly localized, and carrying on with the program will not
cause any significant problems.

Who wants their customer to run their program and have it just crash
with a segfault? That hardly comes across as professional. Better to try
your best to carry on and weather the storm than to just dump the user
with a crash.
Better to continue with bad data? Better to corrupt the user's
important files than to crash and leave them in their initial state?
Better to continue operating incorrectly and produce wrong answers, as
long as it *looks* good?

I don't think so.

Your goal should be for your code never to produce a SIGSEGV in the
first place. Since all software has bugs, that's not always
achievable, but you should certainly want to *know* when the program
produces SIGSEGV (or any other signal that indicates a problem).
I can understand that for debugging purposes you might want to have
SIGSEGV etc. generate a core file, but in production code the default
should be for these signals to be ignored.
At the very least, you might consider handling the signal and logging
the error (and, preferably, cleanly shutting down the program). If
you just ignore it, then you might never know that there's a problem
-- except that users will decide that your software is unreliable.

What you advocate is the equivalent of putting a piece of black tape
over the oil warning light on your car's dashboard. It makes for a
more pleasant driving experience -- until your engine seizes up and
leaves you stranded in the middle of nowhere.

Nov 3 '07 #10
In article <11**********************@19g2000hsx.googlegroups.com>,
ja*********@verizon.net wrote:
Leet Jon wrote:
On 2 Nov 2007 at 19:10, Keith Thompson wrote:
However, letting a program continue running by default after a
catastrophic data-corrupting failure would not be a good idea. If a
program dies immediately after "an invalid access to storage" (which
is all the C standard says about SIGSEGV), then you have a good chance
of diagnosing and correcting the problem before putting the code into
production. If the error is ignored, the program will very likely
continue to corrupt your data in subtle ways; tracking it down and
fixing it is going to be difficult if the error occurs at a customer
site, or even during an important demo.
I believe you are completely wrong on this point. Very often a SIGSEGV
will be caused by (say) a single bad array access - the consequences
will be highly localized, and carrying on with the program will not
cause any significant problems.

In general, if that bad array access is a write, it may completely
mess up some other part of the program.
If the write gets a SIGSEGV, it doesn't actually write anything. The
meaning of that signal is that you tried to write to a nonexistent
virtual address. So if by "mess up some other part of the program" you
meant that it would overwrite that part's data, that obviously can't
happen.

On the other hand, if some other part of the program was expecting to
read what you wrote, it will certainly be messed up by the lack of that
data.
Who wants their customer to run their program and have it just crash
with a segfault?

Given the choice between crashing, and continuing to run, I strongly
prefer the crash. If someone desperately needs that program to be
running, they presumably need it to run correctly, and that's highly
unlikely after an ignored SIGSEG signal.
Agreed. Almost any time a program gets one of these signals, it means
it has a serious bug. It's better to find out that it's broken than to
pretend it isn't.

--
Barry Margolin, ba****@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***
Nov 3 '07 #11
Leet Jon wrote:
I believe you are completely wrong on this point. Very often a SIGSEGV
will be caused by (say) a single bad array access - the consequences
will be highly localized, and carrying on with the program will not
cause any significant problems.
And very often this will not be the case and quitting before more damage
happens is the best thing.

For example, consider a program which moves a tree of files from one
filesystem to another by copying them and then deleting the originals
once the copy finishes successfully. And suppose you get a SIGSEGV
while building the list of files to copy. If you ignore it, you might
end up making copies of half the files, then deleting all the originals!

If you really want to recover from local faults, then use a language that
does bounds checking on arrays and pointer/reference dereferencing and
throws exceptions when these things happen. Then if you know such
errors really won't corrupt the state of the larger program and that the
fault is really localized, you can write an exception handler to do the
error recovery and contain the fault within whatever bounds you've
pre-determined it actually *can* be confined within.
Who wants their customer to run their program and have it just crash
with a segfault?
I'd much rather the customer encounter a segfault, file a bug report,
and give me a chance to fix it than I would have it just silently fail
and let the error continue, corrupting data or whatever else for who
knows how many years upon years. There was a trend in business a
decade or two ago called "total quality management" (or TQM), and the
basic idea was that when faults happen, you should not whitewash over
them, and you should instead stop what you're doing and not proceed
until you've corrected the problem. This was carried a little too far
(like most trendy business ideas), but there is some merit to this
approach. Ignoring failures just (a) causes problems and (b) encourages
people to stop caring about whether they cause failures.

- lOGAN
Nov 3 '07 #12
Leet Jon <jo*@nospam.com> writes:
On 2 Nov 2007 at 19:10, Keith Thompson wrote:
>However, letting a program continue running by default after a
catastrophic data-corrupting failure would not be a good idea. If a
program dies immediately after "an invalid access to storage" (which
is all the C standard says about SIGSEGV), then you have a good chance
of diagnosing and correcting the problem before putting the code into
production. If the error is ignored, the program will very likely
continue to corrupt your data in subtle ways; tracking it down and
fixing it is going to be difficult if the error occurs at a customer
site, or even during an important demo.

I believe you are completely wrong on this point. Very often a SIGSEGV
will be caused by (say) a single bad array access - the consequences
will be highly localized, and carrying on with the program will not
cause any significant problems.

Who wants their customer to run their program and have it just crash
with a segfault? That hardly comes across as professional. Better to try
your best to carry on and weather the storm than to just dump the user
with a crash.

I can understand that for debugging purposes you might want to have
SIGSEGV etc. generate a core file, but in production code the default
should be for these signals to be ignored.
Let me paraphrase this ('my' is here supposed to mean 'Leet Jon'):
My code is full of occasionally occurring invalid memory accesses,
which I am too lazy to debug, even if I could. But my customers have no
way of knowing this, except, unfortunately, when these invalid memory
accesses lead to the kernel terminating the process. They cannot
possibly tell whether some output of the program was produced by the
algorithms they think it performs on the data they fed to it, or was
instead calculated from left-over register contents of arbitrary
functions (which could not be replaced because the load instructions
faulted), with intermediate results vanished into nowhere land because
the stores intended to save them faulted too, and with the control flow
mostly unpredictable due to corrupted stack frames. They would just
happily accept it. I am convinced it works most of the time.

Now, for the sake of the argument, let's swap 'program' with 'electrical
device', 'invalid memory access' with 'improperly isolated flow of
current', and 'works most of the time' with 'only kills someone every
now and then'.

Except for the traditional lenience with respect to software, there is
no 'functional' difference.
Nov 3 '07 #13
On 2 Nov 2007 at 20:34, Al Balmer wrote:
>>I can understand that for debugging purposes you might want to have
SIGSEGV etc. generate a core file, but in production code the default
should be for these signals to be ignored.

In production code, those signals should never be generated. If they
are, they should crash, so that the user can complain, and someone can
fix it.
Perhaps you are unaware that some C code is run in safety-critical
environments - having a program that dumps core at the drop of a hat
rather than carrying on running could literally be the difference
between life and death. OK, so *maybe* the error condition causing the
SIGSEGV will propagate and bring the program down later, but taking that
chance is a better option than immediately failing.

~Jon~

Nov 3 '07 #14

"Leet Jon" <jo*@nospam.comschrieb im Newsbeitrag
news:sl*******************@nospam.com...
On 2 Nov 2007 at 20:34, Al Balmer wrote:
>>>I can understand that for debugging purposes you might want to have
SIGSEGV etc. generate a core file, but in production code the default
should be for these signals to be ignored.

In production code, those signals should never be generated. If they
are, they should crash, so that the user can complain, and someone can
fix it.

Perhaps you are unaware that some C code is run in safety-critical
environments - having a program that dumps core at the drop of a hat
rather than carrying on running could literally be the difference
between life and death. OK, so *maybe* the error condition causing the
SIGSEGV will propagate and bring the program down later, but taking that
chance is a better option than immediately failing.
I beg your pardon? A program in a safety-critical environment should be
tested properly so that SIGSEGVs simply don't happen.
Also, I'd rather have the program crash and a human operator take over
control.

Example: a flight autopilot. The program gets a SIGSEGV but continues
without telling anybody, and the plane crashes into a mountain as a
result of its wrong calculations. Alternative: the program abends, the
system tells the pilot about it, and the pilot takes over control.

Bye, Jojo
Nov 3 '07 #15
Leet Jon wrote:
On 2 Nov 2007 at 20:34, Al Balmer wrote:
>>I can understand that for debugging purposes you might want to have
SIGSEGV etc. generate a core file, but in production code the default
should be for these signals to be ignored.
In production code, those signals should never be generated. If they
are, they should crash, so that the user can complain, and someone can
fix it.

Perhaps you are unaware that some C code is run in safety-critical
environments - having a program that dumps core at the drop of a hat
rather than carrying on running could literally be the difference
between life and death.
In safety-critical systems, you don't want to depend on a module in an
undefined state. In a fault-tolerant design, you avoid depending on
single point of failure modules.

How are you supposed to detect a HW fault in time if you are ignoring
signals/exceptions?

The way this usually works is that faults are not ignored: when one is
detected, the module is taken down by a monitor program. Some error
recovery can be performed by restarting the module, and if that fails,
the module is shut down for good.

The system continues working by resuming processing on independent HW,
from a well-defined state.
OK, so *maybe* the error condition causing the
SIGSEGV will propagate and bring the program down later, but taking that
chance is a better option than immediately failing.
Hopefully, you are not programming a nuclear plant control system.

--
Tor <bw****@wvtqvm.vw | tr i-za-h a-z>
Nov 3 '07 #16
Leet Jon wrote:
On 2 Nov 2007 at 20:34, Al Balmer wrote:
>>I can understand that for debugging purposes you might want to have
SIGSEGV etc. generate a core file, but in production code the default
should be for these signals to be ignored.
In production code, those signals should never be generated. If they
are, they should crash, so that the user can complain, and someone can
fix it.

Perhaps you are unaware that some C code is run in safety-critical
environments - having a program that dumps core at the drop of a hat
rather than carrying on running could literally be the difference
between life and death. OK, so *maybe* the error condition causing the
SIGSEGV will propagate and bring the program down later, but taking that
chance is a better option than immediately failing.

~Jon~
The error handler which you must provide should bring whatever the
program is controlling into a normal or safe state that won't cause
death, ring a klaxon, turn on a warning light, and then crash. It should
arrange to make itself not restartable until it has been pulled from the
working environment and placed on a test bench! Anything else would be
criminal.

If that means that the anti-lock brake system light stays on with the check
vehicle light flashing and you are operating on analog backup only then that is
what it means! (To put this into context)
You might obtain a copy of the National Bureau of Standards (NBS)
Computer Science and Technology series, Special Publication 500-75,
February 1981, "Validation, Verification, and Testing of Computer
Software" by W. Richards Adrion, Martha A. Branstad, and John C.
Cherniavsky, Library of Congress Card Number 80-600199. I'm sure other
publications have followed it, but it will give you a sense of the
programmer's responsibility to design a test suite proving the program
works as expected under all conditions, expected and unexpected.

Nov 3 '07 #17
On Fri, 2 Nov 2007 21:16:31 +0100 (CET), Leet Jon <jo*@nospam.com> wrote:
>On 2 Nov 2007 at 19:10, Keith Thompson wrote:
>However, letting a program continue running by default after a
catastrophic data-corrupting failure would not be a good idea. If a
program dies immediately after "an invalid access to storage" (which
is all the C standard says about SIGSEGV), then you have a good
chance of diagnosing and correcting the problem before putting the
code into production. If the error is ignored, the program will very
likely continue to corrupt your data in subtle ways; tracking it down
and fixing it is going to be difficult if the error occurs at a
customer site, or even during an important demo.

I believe you are completely wrong on this point. Very often a SIGSEGV
will be caused by (say) a single bad array access - the consequences
will be highly localized, and carrying on with the program will not
cause any significant problems.
So the program may 'think' it has saved the important dataset from a
medical patient's important test, but the data has disappeared because
it was written ... well, nowhere in particular.

Do you *really* want this program to go on?

Nov 3 '07 #18
Leet Jon wrote:
On 2 Nov 2007 at 20:34, Al Balmer wrote:
>>I can understand that for debugging purposes you might want to have
SIGSEGV etc. generate a core file, but in production code the default
should be for these signals to be ignored.
In production code, those signals should never be generated. If they
are, they should crash, so that the user can complain, and someone can
fix it.

Perhaps you are unaware that some C code is run in safety-critical
environments - having a program that dumps core at the drop of a hat
rather than carrying on running could literally be the difference
between life and death.
If a SIGSEGV can be the difference between life and death, then such
code has *no* *right* to ever *cause* a SIGSEGV, regardless of how the
system is going to respond to the SIGSEGV (ignoring it and letting
the program continue, or aborting it).

There are several approaches that could be proper here:
(1) Keep the code simple enough that you can use mathematics to
prove it correct. This has been done successfully with
some designs. It's not easy, but then we're talking about a
life or death situation here.
(2) Exhaustively test the code. Sometimes this is not possible
due to exponential explosion of test cases, but sometimes
it actually is.
(3) Nearly-exhaustively test the code. Maybe testing every possible
program path isn't possible, but very thorough test coverage
(not just of lines of code, but of "interesting" combination
of inputs) is possible. That might be acceptable if combined
with other quality efforts.
(4) Use a system where, on a *local* basis, *individual* faults can
be determined to be harmless and the program can proceed.
Notice that this is not the same thing as ignoring SIGSEGV
for the entire program and assuming all invalid memory
accesses are OK. Instead, what I'm talking about is a
system where you can say "if THIS block of code goes
outside the bounds of THAT array, then THAT ONE THING
should not be a fatal error, and here is the routine that
will do the error handling and keep the system in a known
good state".

Of course, it's silly to be having a discussion about safety-critical
software in comp.unix.programmer. Maybe there's one that I don't know
about, but as far as I know, there isn't a version of Unix that is
meant to be used in an environment like that. In fact, where I've
checked, license agreements often specifically exclude the use of the
software in such an environment. And for good reason: a system that
can get somebody killed needs to use software that's simpler than Unix.

- Logan
Nov 3 '07 #19
Logan Shaw wrote:

<snip>

You might consider dropping c.l.c. from the cross-post and perhaps
replace it with comp.programming and set followups to the same.

Nov 3 '07 #20
Leet Jon wrote:
On 2 Nov 2007 at 20:34, Al Balmer wrote:
>>I can understand that for debugging purposes you might want to have
SIGSEGV etc. generate a core file, but in production code the default
should be for these signals to be ignored.
In production code, those signals should never be generated. If they
are, they should crash, so that the user can complain, and someone can
fix it.

Perhaps you are unaware that some C code is run in safety-critical
environments - having a program that dumps core at the drop of a hat
rather than carrying on running could literally be the difference
between life and death. OK, so *maybe* the error condition causing the
SIGSEGV will propagate and bring the program down later, but taking that
chance is a better option than immediately failing.

~Jon~
Continue after error
http://www.netcomp.monash.edu.au/cpe...~tgallagh.html

yeah right.

Nov 3 '07 #21
Leet Jon wrote:
>
.... snip ...
>
Perhaps you are unaware that some C code is run in safety-critical
environments - having a program that dumps core at the drop of a
hat rather than carrying on running could literally be the
difference between life and death. OK, so *maybe* the error
condition causing the SIGSEGV will propagate and bring the program
down later, but taking that chance is a better option than
immediately failing.
Apparently you are unaware that such a program has absolutely no
business running in such a 'safety critical environment'.

--
Chuck F (cbfalconer at maineline dot net)
<http://cbfalconer.home.att.net>
Try the download section.

--
Posted via a free Usenet account from http://www.teranews.com

Nov 3 '07 #22
On Sat, 03 Nov 2007 04:14:45 GMT, al****@brothers.orgy (Almond) wrote:
>Send any feedback, ideas, suggestions, test results to
Here's some feedback: Your advertising, release notes, and privacy
policy are inappropriate here, even in a sig block.

Limit your signature to three or four lines, which is plenty of space
to include your URL.

Nov 4 '07 #23
On Sat, 3 Nov 2007 10:15:52 +0100 (CET), Leet Jon <jo*@nospam.com>
wrote:
>On 2 Nov 2007 at 20:34, Al Balmer wrote:
>>>I can understand that for debugging purposes you might want to have
SIGSEGV etc. generate a core file, but in production code the default
should be for these signals to be ignored.

In production code, those signals should never be generated. If they
are, they should crash, so that the user can complain, and someone can
fix it.

Perhaps you are unaware that some C code is run in safety-critical
environments
That's pretty funny, considering I wrote safety-critical code for the
process control industry for over twenty years. Food, petroleum,
polymers, paper, you name it.

If the coolant control program on a PVC reactor crashes, you don't
ignore it and keep cooking. You kill not only the program, but the
process. Otherwise, you kill people.
>- having a program that dumps core at the drop of a hat
rather than carrying on running could literally be the difference
between life and death. OK, so *maybe* the error condition causing the
SIGSEGV will propagate and bring the program down later, but taking that
chance is a better option than immediately failing.
Don't bother applying for a job here. We don't insist that all
new-hires be expert, but we do want them to be trainable.

--
Al Balmer
Sun City, AZ
Nov 4 '07 #24
["Followup-To:" header set to comp.unix.programmer.]
On 2007-11-03, Leet Jon <jo*@nospam.com> wrote:
On 2 Nov 2007 at 20:34, Al Balmer wrote:
>>>I can understand that for debugging purposes you might want to have
SIGSEGV etc. generate a core file, but in production code the default
should be for these signals to be ignored.

In production code, those signals should never be generated. If they
are, they should crash, so that the user can complain, and someone can
fix it.

Perhaps you are unaware that some C code is run in safety-critical
environments - having a program that dumps core at the drop of a hat
rather than carrying on running could literally be the difference
between life and death. OK, so *maybe* the error condition causing the
Allowing a process with corrupted data to continue running can also
cause death.
SIGSEGV will propagate and bring the program down later, but taking that
chance is a better option than immediately failing.

~Jon~

--

Nov 6 '07 #25
On 3 Nov, 09:15, Leet Jon <j...@nospam.com> wrote:
On 2 Nov 2007 at 20:34, Al Balmer wrote:
>I can understand that for debugging purposes you might want to have
SIGSEGV etc. generate a core file, but in production code the default
should be for these signals to be ignored.
In production code, those signals should never be generated. If they
are, they should crash, so that the user can complain, and someone can
fix it.

Perhaps you are unaware that some C code is run in safety-critical
environments
OHMYGOD

*please* tell me you don't write safety critical code!
- having a program that dumps core at the drop of a hat
rather than carrying on running could literally be the difference
between life and death. OK, so *maybe* the error condition causing the
SIGSEGV will propagate and bring the program down later, but taking that
chance is a better option than immediately failing.


--
Nick Keighley

Nov 6 '07 #26
Nick Keighley wrote:
On 3 Nov, 09:15, Leet Jon <j...@nospam.com> wrote:
>On 2 Nov 2007 at 20:34, Al Balmer wrote:
>>>I can understand that for debugging purposes you might want to have
SIGSEGV etc. generate a core file, but in production code the default
should be for these signals to be ignored.
In production code, those signals should never be generated. If they
are, they should crash, so that the user can complain, and someone can
fix it.
Perhaps you are unaware that some C code is run in safety-critical
environments

OHMYGOD

*please* tell me you don't write safety critical code!
>- having a program that dumps core at the drop of a hat
rather than carrying on running could literally be the difference
between life and death. OK, so *maybe* the error condition causing the
SIGSEGV will propagate and bring the program down later, but taking that
chance is a better option than immediately failing.

--
Nick Keighley

I suspect he did, until his boss found out what he was planning and canned his
ass. Now he's looking to prove his boss wrong. That, or he's a troll.

Nov 6 '07 #27
"Al Balmer" <al******@att.net> wrote in message news:
r6********************************@4ax.com...
On Fri, 2 Nov 2007 21:16:31 +0100 (CET), Leet Jon <jo*@nospam.com>
wrote:
>>On 2 Nov 2007 at 19:10, Keith Thompson wrote:
>>However, letting a program continue running by default after a
catastrophic data-corrupting failure would not be a good idea. If a
program dies immediately after "an invalid access to storage" (which
is all the C standard says about SIGSEGV), then you have a good chance
of diagnosing and correcting the problem before putting the code into
production. If the error is ignored, the program will very likely
continue to corrupt your data in subtle ways; tracking it down and
fixing it is going to be difficult if the error occurs at a customer
site, or even during an important demo.

I believe you are completely wrong on this point. Very often a SIGSEGV
will be caused by (say) a single bad array access - the consequences
will be highly localized, and carrying on with the program will not
cause any significant problems.

How on earth would you know what the consequences might be? If the
program in question is calculating my paycheck, I don't want any bad
array access to be ignored.
Someone else might want to check first if the error is worth such a drastic
treatment.

With your suggested behaviour, the paycheck is not printed, and who knows
when the problem will be fixed... If you can wait for your paycheck, you'll
be OK, else too bad.

Alternately, let it print the damn check, there is a good chance the check
will be correct and arrive in time. There is some possibility that the
error is so small as to not be worth reporting. If the error is large, then
you can complain and have it fixed... Or you will not complain and wait for
the bank to figure where these millions came from ;-)

If you are the payer, you probably want the process to stop. If you are the
payee, it is not so obvious.
What kind of programs do you write? Games?
>>
Who wants their customer to run their program and have it just crash
with a segfault? That hardly comes across as professional.

What's not professional is writing code that causes segfaults.
>Better to try
your best to carry on and weather the storm than to just dump the user
with a crash.

I can understand that for debugging purposes you might want to have
SIGSEGV etc. generate a core file, but in production code the default
should be for these signals to be ignored.

In production code, those signals should never be generated. If they
are, they should crash, so that the user can complain, and someone can
fix it.
If they are, they should be logged and reported, yet best efforts should be
extended to minimize the impact on the user. Warning the user of potential
malfunction and requesting urgent attention may be more appropriate than a core
dump with no warning and no restart. Use common sense to determine what would
least impact the user. When the oil gauge trips, the dashboard turns a
light on, it does not immediately block the engine, fire the ejector seats
and vaporize the contents of the trunk.

--
Chqrlie.
Nov 7 '07 #28
Charlie Gordon wrote:
"Al Balmer" <al******@att.net> wrote in message news:
r6********************************@4ax.com...
>On Fri, 2 Nov 2007 21:16:31 +0100 (CET), Leet Jon <jo*@nospam.com>
wrote:
>>On 2 Nov 2007 at 19:10, Keith Thompson wrote:
....
>How on earth would you know what the consequences might be? If the
program in question is calculating my paycheck, I don't want any bad
array access to be ignored.

Someone else might want to check first if the error is worth such a drastic
treatment.

With your suggested behaviour, the paycheck is not printed, and who knows
when the problem will be fixed... If you can wait for your paycheck, you'll
be OK, else too bad.
You're much better off with the program aborting than with it producing
large amounts of problems, which is the most likely case. It might
transfer money from one account to another, without making the balancing
adjustment to another account. It might accidentally wipe out all of the
employee records. It might accidentally wipe out a small random subset
of the employee records, which is worse because it will take longer to
notice.
Alternately, let it print the damn check, there is a good chance the check
will be correct and arrive in time. There is some possibility that the
error is so small as to not be worth reporting. If the error is large, then
you can complain and have it fixed... Or you will not complain and wait for
the bank to figure where these millions came from ;-)
Everything about that paragraph is wrong. The chances are not good that
the paycheck will be correct and arrive on time. There's a large
probability that the error will be a big one. There is no error so small
that it's not worth reporting; tax auditors tend to get very concerned
about even small errors, because they think they might be a signs of
something more serious (and they are right to think that). If the error
is large, fixing it can be very expensive for the payer, and a lot of
hassle for the payee.

....
>>I can understand that for debugging purposes you might want to have
SIGSEGV etc. generate a core file, but in production code the default
should be for these signals to be ignored.
In production code, those signals should never be generated. If they
are, they should crash, so that the user can complain, and someone can
fix it.

If they are, they should be logged and reported, yet best efforts should be
extended to minimize the impact on the user. Warning the user of potential
malfunction and requesting urgent attention may be more appropriate than a core
dump with no warning and no restart.
The core dump IS your warning, and restart should NOT be attempted until
the problem has been resolved, otherwise you could easily add to the
damage created by the first run of the program.
Use common sense to determine what would
least impact the user. When the oil gauge trips, the dashboard turns a
light on, it does not immediately block the engine, fire the ejector seats
and vaporize the contents of the trunk.
Yes, but the oil gauge isn't analogous to a SIGSEGV. A better analogy
would be to compare a SIGSEGV to the motion sensor alarm which triggers
an air bag to explode in your face. When that air bag explodes, you are
guaranteed to lose control of the car, if you haven't already done so
(as I can unfortunately testify to from personal experience). However,
if that motion sensor trips, the situation is generally so serious that
you're probably better off losing control with an airbag in your face
than you would be if you retained control with no airbag protection.
This isn't necessarily true; in some cases the airbag can kill you when
you would have survived without it, but in general you're safer with it
exploding in your face. That is a very accurate analogy to a SIGSEGV.
Nov 7 '07 #29
On Wed, 7 Nov 2007 15:38:28 +0100, "Charlie Gordon" <ne**@chqrlie.org>
wrote:
>"Al Balmer" <al******@att.net> wrote in message news:
r6********************************@4ax.com...
>On Fri, 2 Nov 2007 21:16:31 +0100 (CET), Leet Jon <jo*@nospam.com>
wrote:
>>>On 2 Nov 2007 at 19:10, Keith Thompson wrote:
However, letting a program continue running by default after a
catastrophic data-corrupting failure would not be a good idea. If a
program dies immediately after "an invalid access to storage" (which
is all the C standard says about SIGSEGV), then you have a good chance
of diagnosing and correcting the problem before putting the code into
production. If the error is ignored, the program will very likely
continue to corrupt your data in subtle ways; tracking it down and
fixing it is going to be difficult if the error occurs at a customer
site, or even during an important demo.

I believe you are completely wrong on this point. Very often a SIGSEGV
will be caused by (say) a single bad array access - the consequences
will be highly localized, and carrying on with the program will not
cause any significant problems.

How on earth would you know what the consequences might be? If the
program in question is calculating my paycheck, I don't want any bad
array access to be ignored.

Someone else might want to check first if the error is worth such a drastic
treatment.

With your suggested behaviour, the paycheck is not printed, and who knows
when the problem will be fixed... If you can wait for your paycheck, you'll
be OK, else too bad.
And if the error is in a control process that blows up a reactor and
kills a few people? How do you correct that mistake?

I think your point is that the problem analysis should take account of
the consequences of an error - that's obvious. Basic systems
engineering. I'm not advocating that the only possible way to treat a
segfault is to stop the program, though in a properly designed control
system, it's usually the best way.
>
Alternately, let it print the damn check, there is a good chance the check
will be correct and arrive in time. There is some possibility that the
error is so small as to not be worth reporting. If the error is large, then
you can complain and have it fixed... Or you will not complain and wait for
the bank to figure where these millions came from ;-)
All of which will cause more problems, eventually, both to the payer
and the payee. If the system stops, it *will* get fixed. People in
data processing take payroll runs *very* seriously. Did you imagine
that they would just not pay anybody else, and hope for a better run
next week?
>
If you are the payer, you probably want the process to stop. If you are the
payee, it is not so obvious.
>What kind of programs do you write? Games?
>>>
Who wants their customer to run their program and have it just crash
with a segfault? That hardly comes across as professional.

What's not professional is writing code that causes segfaults.
>>Better to try
your best to carry on and weather the storm than to just dump the user
with a crash.

I can understand that for debugging purposes you might want to have
SIGSEGV etc. generate a core file, but in production code the default
should be for these signals to be ignored.

In production code, those signals should never be generated. If they
are, they should crash, so that the user can complain, and someone can
fix it.

If they are, they should be logged and reported, yet best efforts should be
extended to minimize the impact on the user. Warning the user of potential
malfunction and requesting urgent attention may be more appropriate than a core
dump with no warning and no restart.
How do you warn of a segfault before it happens?
Use common sense to determine what would
least impact the user. When the oil gauge trips, the dashboard turns a
light on, it does not immediately block the engine, fire the ejector seats
and vaporize the contents of the trunk.
Not "common sense." Systems analysis.

--
Al Balmer
Sun City, AZ
Nov 7 '07 #30
