Graceful failures

Hello all,

I'm nearing the completion of my first graphical console game and my
thoughts have turned to the subject of gracefully handling runtime
errors. During development I like to not handle exceptions, so that
program execution will halt and I can immediately read the traceback
to see what's up. Once the bugs are more or less worked out, I have a
system ready wherein any exceptions are caught and an attempt is made
to write the traceback object to a log file. This system suits me for
the upcoming testing phase - if my beta testers do anything to make
the game crash, I can read about the error and its causes in the log
later.
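
The catch-and-log arrangement described above could look roughly like the sketch below. It is only an illustration, not Jake's actual code: `run_logged` and the `crash.log` default are hypothetical names, and it assumes a current Python 3 interpreter rather than the Python of 2003.

```python
import traceback


def run_logged(main, log_path="crash.log"):
    """Run main(); on any exception, append the traceback to log_path.

    Returns 0 on success and 1 on failure, suitable for sys.exit().
    """
    try:
        main()
    except Exception:
        try:
            with open(log_path, "a", encoding="utf-8") as log:
                log.write(traceback.format_exc())
                log.write("\n")
        except OSError:
            # Writing the log failed too (for example, a read-only disk);
            # there is nothing more to do safely at this point.
            pass
        return 1
    return 0
```

During development the wrapper can simply be left out, so that the traceback still halts the program in the usual way.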

But what about the eventual production version that I distribute to
the public? My current understanding is that there's always a chance a
computer program can fail in some way at runtime, disk access being an
example. Suppose the startup code for my game fails to load one of the
images from disk. I can't think of any reasonable way to rescue
things, and under such conditions I would prefer that the program
gracefully exits. The aforementioned traceback logging is fine for me
as the developer, but useless to the end user of the production
version. So - I'm real curious - what are the canonical ways that a
computer application should gracefully fail?

I'm somewhat reluctant to write a bunch of code for pretty windows
that pop up with some message to the effect of, "internal error, game
exiting." My main reservation here is it doesn't seem it can be done
cross platform, and since my game is in a graphical console there's no
place for stdout to write. However, if this is the best way to go
about error handling, I'm willing to write the code. And I would
appreciate advice about how to handle the cross platform problem. :)

Thanks in advance for any help!

Jake
Jul 18 '05 #1
On Mon, 29 Dec 2003 14:14:43 -0800, Jacob H wrote:
> I'm somewhat reluctant to write a bunch of code for pretty windows that
> pop up with some message to the effect of, "internal error, game
> exiting." My main reservation here is it doesn't seem it can be done
> cross platform, and since my game is in a graphical console there's no
> place for stdout to write. However, if this is the best way to go about
> error handling, I'm willing to write the code. And I would appreciate
> advice about how to handle the cross platform problem. :)


I can't speak for any "canonical" way of doing things, but I do have some
thoughts on the matter from my experience in customer service and some
computer tutoring I used to do. Maybe they can be of some use to you.
Forgive me as I ramble for a minute, but I haven't spent time collecting
these thoughts.

Most computer users react to error messages with fear and panic. They
feel like the computer is telling them that *they* have done something wrong.
This misconception is compounded by the generally unintelligible error
messages thrown by most programs. Computers have a way of making even
some of the most intelligent people feel stupid.

The second most common type of user understands that they didn't
necessarily do anything wrong, and just wishes the program would work.
They might thumb through the manual to see if it has anything on the
matter. If they are given an obvious way to find a solution to the
problem, they will take it, but they generally won't work that hard.

The rest of us will dig in and try to solve the problem. This could be
anything from using google, to posting on usenet, to reading a disassembly
of the core dump.

My "dream error dialog box" would do the following:

Tell the user in a very neutral and nontechnical manner that it has
encountered a problem and tell the user what it was trying to do when it
failed.

for instance:
"""
Foo has encountered a problem.
Foo was unable to load the necessary images to continue.
"""

The program should avoid words like "error." You might consider having
the program keep a stack of descriptions of what it's trying to accomplish
at any given moment. That allows you to differentiate between being
unable to open the configuration file with the intent of initially loading
the configuration and opening the configuration file with the intent of
reverting to original settings.

Next, tell the user whether or not they can continue and what some of the
consequences of continuing might be.

The error dialog box should allow the user the option of viewing technical
details, but should clearly label them as technical details.

If possible, the user should be presented with possible courses of action
to correct the problem.
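
The split between a neutral summary and clearly labelled technical details can be sketched as plain data, leaving the question of which toolkit draws the dialog aside. Everything here is an assumption for illustration: `ProblemReport`, `build_report` and their parameters are hypothetical names, and the wording follows the example message above.

```python
import traceback
from dataclasses import dataclass


@dataclass
class ProblemReport:
    summary: str        # neutral, nontechnical text shown first
    can_continue: bool  # whether the user may keep working
    suggestion: str     # a possible course of action, may be empty
    details: str        # traceback text, shown only on request


def build_report(app_name, goal, exc, can_continue=False, suggestion=""):
    """Build the two layers of a failure message from an exception."""
    if can_continue:
        outcome = "You can keep working."
    else:
        outcome = f"{app_name} cannot continue and will close."
    summary = (
        f"{app_name} has encountered a problem.\n"
        f"{app_name} was unable to {goal}.\n"
        f"{outcome}"
    )
    # File names and stack frames stay in the technical layer only.
    details = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )
    return ProblemReport(summary, can_continue, suggestion, details)
```

A dialog would show `summary` and `suggestion` up front and reveal `details` behind a button labelled as technical details.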

Finally, I really, really wish that users were presented with the option
of being automagically taken to the technical support website where
they're allowed to see and discuss with others how to solve their
particular problem. This is where I get a little bit hazy in my ideas. I
envision a website/forum where a user can read and discuss possible
solutions to their problem *and* similar problems. The proper web page
could be located from the stack trace. If they're the first one to
encounter that particular stack trace, then tell them that they're the
first, politely ask them to describe the problem (even if the
description's useless to you, it makes the user feel included) and tell
them that your development staff has been notified and will be making
contact with them soon. Offer to keep them updated via email when other
people post about having the same problem.

This type of system could be very useful to your development team and tech
support staff by allowing you to identify, track and fix the real life errors
that your users are encountering.

anyways...

HTH

Sam Walters

P.S.

If you haven't already, look at the Interface Hall of Shame to see what
*not* to do:
http://digilander.libero.it/chiedilo...tect/shame.htm
Jul 18 '05 #2
It is quite a coincidence that a thread like this pops up right when I
feel I have to start one - with almost exactly the same subject - myself.

Error handling can easily be the hardest task in a program, and that's why
it's being neglected most of the time.

On Wed, 31 Dec 2003 05:28:08 +0000, Samuel Walters wrote:
> My "dream error dialog box" would do the following:
>
> Tell the user in a very neutral and nontechnical manner that it has
> encountered a problem and tell the user what it was trying to do when it
> failed.
>
> for instance:
> """
> Foo has encountered a problem.
> Foo was unable to load the necessary images to continue.
> """

I would like it much more detailed, like:

"""
Foo has encountered a problem. It was trying to load a necessary image
from the file '/usr/share/Foo/images/up.png'. This file does apparently
not exist. The problem may be caused by a broken installation of Foo.
Alas, Foo cannot be continued and will be closed.
"""

(I'm not the best error message designer, but I hope you get the point:
Tell exactly *what* is missing, so that even a moderately experienced user can
try to fix it. I *hate* messages like "Cannot find image". What image?
Where is it supposed to be?)

> The error dialog box should allow the user the option of viewing
> technical details, but should clearly label them as technical details.

Which means the details I want go there - fine with me.

> If possible, the user should be presented with possible courses of
> action to correct the problem.

+1

> Finally, I really, really wish that users were presented with the option
> of being automagically taken to the technical support website where
> they're allowed to see and discuss with others how to solve their
> particular problem.

[snip]

This, of course, is a bit overkill for a little freeware program, but
sounds good for a "big" application with a hefty price tag.

Handling every conceivable error right is quite a challenge and lots of
work, especially in the test department.

Hans-Joachim Widmaier
Jul 18 '05 #3
|Thus Spake Hans-Joachim Widmaier On the now historical date of Wed, 31
Dec 2003 18:19:00 +0100|
> It is quite a coincidence that a thread like this pops up right when I
> feel I have to start one - with almost exactly the same subject -
> myself.
>
> Error handling can easily be the hardest task in a program, and that's
> why it's being neglected most of the time.

Agreed. Another problem is that programmers often forget what it's like
for people who don't understand computers. It's hard to overestimate the
apprehension many users feel towards computers. I've seen very competent
and intelligent people frozen with fear in front of a word processor
because they were afraid that they'd break the computer.

> I would like it much more detailed, like:

Most people who will read this post would prefer a more detailed message.
Of course you should tailor the system to your program's target audience,
but let's assume, for the sake of this particular discussion, that the
errors will mostly be read by non-technically oriented people. I think
that detailed error reports should always be readily available, but should
not necessarily be the first thing presented.

> """
> Foo has encountered a problem. It was trying to load a necessary image
> from the file '/usr/share/Foo/images/up.png'. This file does apparently
> not exist. The problem may be caused by a broken installation of Foo.
> Alas, Foo cannot be continued and will be closed.
> """

That's a good message. I would avoid putting raw filenames in the
non-technical description. (Unless, of course, the user was trying to
open a document, then it's okay to list the path.) I would also avoid
using words like "broken." Both of these things look normal to you and
me, but to a lot of people they're Scary Things(tm). By simply hiding the
filename behind a button that says "Technical Details" you have said "Hey,
you might not understand this, but that's OK because it's meant for the
geeks." Heck, they might take a look at it and feel proud that they
understand more of it than they thought they would.

I like the increased specificity of your error message. I might suggest:

"""
Foo has encountered a problem. It was trying to load necessary image
files. At least one image apparently does not exist. Uninstalling and then
reinstalling Foo may correct this problem. Unfortunately, Foo cannot be
continued and will be closed.
"""

> (I'm not the best error message designer, but I hope you get the point:

Nor am I. Most of the reason I offered up my opinion is so that other
people might clue me into new ideas on the matter.

> Which means the details I want go there - fine with me.

Most people like you and me wouldn't mind one extra click to get at the
juicy details.

> This, of course, is a bit overkill for a little freeware program, but
> sounds good for a "big" application with a hefty price tag.

It makes as much sense as installing something like bugzilla. (more on
that in a minute)

> Handling every conceivable error right is quite a challenge and lots of
> work, especially in the test department.


Let me digress for a moment and explain how I stumbled into this idea of
allowing the user to jump to an online information system based on their
particular error. It might make more sense if you know where it came
from. Or, at least, you can pin-point the critical flaw in my logic.

My last development job was with a large company that preprocessed medical
insurance claims. Essentially, we took on contracts to collect insurance
claims from doctors, verify that the data was well-formatted (there are
several hundred formats for medical insurance claims), translate them into
the single format requested by our client, transmit them to the client,
receive the response and deliver the responses back to the doctors. Karl
Marx would have hated us. We were nothing but big-time middle-men. When
asked where I worked, I used to say "I work for a huge quasi-governmental
corporation that exists solely to shuffle paper."

The method for transferring data was plain old 56k modems and a simple
BBS. Our competitive advantage in the market was that we handled all
end-user support and that we offered a pretty gui program tailored
specifically for the medical field. (At this point, anyone who's worked
in that field knows exactly who I worked for.) The gui program was
essentially a fancy modem driver that allowed the user to track which
files had been sent and pair them with the proper responses.
Unfortunately the code for it was one huge tangled mess. It was the first
program I was asked to make changes to and, as I hadn't mastered the
Borland C++ debugger, I resorted to the tried and true method of keeping a
log file while debugging. Every time that the program began an action, I
had it write and flush, in plain English, what it was attempting to do. I
found that when I sent the product on to QA, having that log was
invaluable. After some discussion with our tech support department, we
decided to keep the code in for the production version. It was a smashing
success. The front line tech support people were able to get much more
reliable information about how the user's particular error came about.
They started a database around the log-file so that they could reference
the solutions for people with the same or similar problems. Most
importantly, for me at least, I knew exactly where the end users were
having problems. I didn't have to deal with errors like "Version Foo
failed an assertion on line X in module Bar." I was able to implement
more graceful error handling.
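
The write-and-flush habit described above is easy to reproduce. The sketch below is a Python rendering under stated assumptions, not the original Borland C++ code: `ActionLog`, its `begin` method and the line format are hypothetical.

```python
import time


class ActionLog:
    """Append-only log of what the program is about to attempt."""

    def __init__(self, stream):
        # Any writable text stream works: an open file, sys.stderr, etc.
        self._stream = stream

    def begin(self, action):
        # Write *before* acting and flush at once, so the last line in
        # the log names the action that was in progress if the program dies.
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        self._stream.write(f"{stamp} Attempting to {action}\n")
        self._stream.flush()
```

Note that `flush()` only hands the text to the operating system; surviving a power failure would additionally need something like `os.fsync`, which this sketch leaves out.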

So, I read the OP and thought to myself: Couldn't Python keep a stack of
the 'goals' it's attempting to achieve, then if an exception was thrown or
an assertion failed, couldn't this become part of the information about the
error? You would push goals onto the stack, then pop them off if the
program was reasonably certain that the action wasn't going to cause a
problem. Then, couldn't that information be used to see if other users
were having the same or similar problems? Couldn't that information be
used to register and track problems via a hybrid of a message-board and
bugzilla? Well, I don't see why not. In fact, since Python's exception
system is so wonderful you could have the error dialog marshal error codes
and basic stack-trace info into the get portion of a url and probably even
make it open a browser with the press of a button. The user wouldn't have
to know that all this went on in the background. All they would know is "I
pressed a button and it took me to a place where people wanted to help me
make it work."
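
Marshalling the failure into the GET portion of a URL and opening a browser needs only the standard library. The following is a rough sketch, not a finished design: the `support.example.com` address, the query parameter names and the helper functions are all made up for illustration, and no such support site exists.

```python
import hashlib
import os
import traceback
import webbrowser
from urllib.parse import urlencode

SUPPORT_URL = "https://support.example.com/report"  # hypothetical endpoint


def trace_signature(exc):
    """Short hash of the exception type plus (file, function) per frame.

    Line numbers are left out on purpose, so that the same failure in a
    slightly edited build still maps to the same page.
    """
    frames = traceback.extract_tb(exc.__traceback__)
    parts = [type(exc).__name__]
    parts += [f"{os.path.basename(f.filename)}:{f.name}" for f in frames]
    digest = hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
    return digest[:12]


def support_url(exc, app_version, goals=()):
    """Build the URL a help button would open for this failure."""
    query = urlencode({
        "sig": trace_signature(exc),
        "type": type(exc).__name__,
        "version": app_version,
        "goal": " > ".join(goals),
    })
    return f"{SUPPORT_URL}?{query}"


def open_support_page(exc, app_version, goals=()):
    """Open the page in the default browser; False if none could be opened."""
    return webbrowser.open(support_url(exc, app_version, goals))
```

Only the exception type, a hash and the goal descriptions leave the machine in this sketch; sending raw file paths or full tracebacks would deserve the user's explicit consent.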

I want to work on something like that someday. I've already got a couple
of projects on my plate, so at the moment this is blue sky thinking, but
don't you think it would be nice if things could work this way? Maybe
when my current project becomes stable, I'll see if I can add this as a
feature.

Sam Walters

--
Never forget the halloween documents.
http://www.opensource.org/halloween/
""" Where will Microsoft try to drag you today?
Do you really want to go there?"""

Jul 18 '05 #4
