Return code of 4294967295 (UINT_MAX)

If anybody has any insight into this problem I'm running into I would
really appreciate if you could write to me...

I'm running a simple C++ program on Solaris 8 that forks and execs a
bunch of processes. It's been running fine for years, but now that
I've moved to faster hardware, I'm running into a problem that
surfaces more frequently as the hardware I'm using gets better/faster
-- it seems like some sort of race condition issue.

Basically, at random, a handful of the processes immediately fail
(before actually doing anything) and return an exit code of 4294967295
(UINT_MAX). I imagine that this is just an umbrella status code for
all unexpected/unexplained errors, so I'm not sure if it means
anything?

One thing to note is that if I just trap this error and execute the
process again, it runs fine. It just seems like at the time the
fork/exec takes place, something in the system temporarily screws up
but I don't know what. Of course I do have a workaround (just re-run
the process) but I'd like to know what's going on.

If anybody has encountered anything like this, please let me know, and I
can provide you with more information if need be...thanks.
Nov 14 '05 #1
7 Replies


"Vineet" <vi*************@mantas.com> wrote in message news:4a**************************@posting.google.com...
: If anybody has any insight into this problem I'm running into I would
: really appreciate if you could write to me...
:
: I'm running a simple C++ program on Solaris 8 that forks and execs a
: bunch of processes. It's been running fine for years, but now that
: I've moved to faster hardware, I'm running into a problem that
: surfaces more frequently as the hardware I'm using gets better/faster
: -- it seems like some sort of race condition issue.

Sounds like fork is failing. If the program is not running as root,
you are probably exceeding maxuproc (see your kernel parameters
documentation; Solaris spells it maxuprc). If it is running as root,
you're exceeding nprocs (max_nprocs on Solaris). Nprocs determines how
many process-table entries the kernel creates. Maxuproc determines how
many processes a single user can have. It should always be less than
half of nprocs and is usually quite a bit smaller. Default parameters
for most machines are not set up properly to handle heavy usage by a
small number of programs or users.

One other problem: it is likely that you are not waiting for your
children properly and may have a large number of zombie processes.
These consume limited process-table slots. Even if your processes
are being reaped by init, it's possible to get into a race where zombies
are created faster than init can reap them. Without seeing any
actual code, such as your fork-exec code, it's impossible to say.

Post some code if you want more help.

Good luck,

Dan

:
: Basically, at random, a handful of the processes immediately fail
: (before actually doing anything) and return an exit code of 4294967295
: (UINT_MAX). I imagine that this is just an umbrella status code for
: all unexpected/unexplained errors, so I'm not sure if it means
: anything?

It's a -1.

:
: One thing to note is that if I just trap this error and execute the
: process again, it runs fine. It just seems like at the time the
: fork/exec takes place, something in the system temporarily screws up
: but I don't know what. Of course I do have a workaround (just re-run
: the process) but I'd like to know what's going on.
:
: If anybody has encountered anything like this, please let me know, and I
: can provide you with more information if need be...thanks.
Nov 14 '05 #2

joe
vi*************@mantas.com (Vineet) writes:
> If anybody has any insight into this problem I'm running into I
> would really appreciate if you could write to me...
>
> I'm running a simple C++ program on Solaris 8 that forks and execs a
> bunch of processes. It's been running fine for years, but now that
> I've moved to faster hardware, I'm running into a problem that
> surfaces more frequently as the hardware I'm using gets better/faster
> -- it seems like some sort of race condition issue.
>
> Basically, at random, a handful of the processes immediately fail
> (before actually doing anything) and return an exit code of
> 4294967295 (UINT_MAX). I imagine that this is just an umbrella
> status code for all unexpected/unexplained errors, so I'm not sure
> if it means anything?


That's -1, which the man page says fork() returns in case of an
error. In that case you should check the value of errno. You can get
a textual version of the error by calling either perror() or
strerror(errno). That will probably shed some light on things.

Joe
--
Folks who don't know why America is the Land of Promise should be here
during an election campaign.
-- Milton Berle
Nov 14 '05 #3


> Post some code if you want more help.


Really, guys, this discussion belongs off-line or in a forum that is more
appropriate. This newsgroup doesn't discuss platform-specific stuff like
processes/threads, etc. This is a language newsgroup.

-Howard
Nov 14 '05 #4

"Howard" <al*****@hotmail.com> writes:
>> Post some code if you want more help.
>
> Really, guys, this discussion belongs off-line or in a forum that is more
> appropriate. This newsgroup doesn't discuss platform-specific stuff like
> processes/threads, etc. This is a language newsgroup.


This thread is cross-posted to comp.lang.c++, comp.unix.programmer,
comp.lang.c, and comp.unix.solaris. It's probably appropriate in
comp.unix.programmer and/or comp.unix.solaris, which do discuss
platform-specific stuff. Please trim the newsgroups line on any
followups.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 14 '05 #5

vi*************@mantas.com (Vineet) writes:
> Basically, at random, a handful of the processes immediately fail
> (before actually doing anything) and return an exit code of 4294967295
> (UINT_MAX). I imagine that this is just an umbrella status code for
> all unexpected/unexplained errors, so I'm not sure if it means
> anything?

A process cannot fail with an exit code of 4294967295 (UINT_MAX);
the only valid exit codes are between 0 and 255 (inclusive).

So the first question is: what is returning -1? (Whatever returns
a number with all bits set is more likely to be returning -1 than UINT_MAX.)

> One thing to note is that if I just trap this error and execute the
> process again, it runs fine. It just seems like at the time the
> fork/exec takes place, something in the system temporarily screws up
> but I don't know what. Of course I do have a workaround (just re-run
> the process) but I'd like to know what's going on.


Have you tried "truss -f"?

Casper
--
Expressed in this posting are my opinions. They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.
Nov 14 '05 #6

Howard wrote:
.... snip ...
> Really, guys, this discussion belongs off-line or in a forum that
> is more appropriate. This newsgroup doesn't discuss
> platform-specific stuff like processes/threads, etc. This is a
> language newsgroup.


So, instead of simply muttering, set followups.

--
Replies should be to the newsgroup
Replace this line to publish your e-mail address.
Nov 14 '05 #7

In article <40***********************@news.xs4all.nl>, Casper H.S. Dik wrote:
>> fork/exec takes place, something in the system temporarily screws up
>> but I don't know what. Of course I do have a workaround (just re-run
>> the process) but I'd like to know what's going on.
>
> Have you tried "truss -f"?

This might help spot the difference between 2 truss outputs
(which you name on the command line). Random interleaving
of parent and child contributions is left for you to sort
out in this version. I might have done more if I'd thought
I'd need it regularly.

#!/usr/bin/perl -w

# Compare two truss output files (named on the command line) line by
# line and stop at the first difference in system call or result.

# Print the last few lines read from each file, up to the current line.
sub display
{
    for ($ln=$lineno-5; $ln<=$lineno; $ln++) {
        next if ($ln<1);
        printf("%s(%d) %s\n", $ARGV[0], $ln, $left{$ln});
    }
    print "\n";
    for ($ln=$lineno-5; $ln<=$lineno; $ln++) {
        next if ($ln<1);
        printf("%s(%d) %s\n", $ARGV[1], $ln, $right{$ln});
    }
}

# Set the globals $syscall and $result from one line of truss output.
sub lineparse
{
    $_=shift;

    TEST: while (1) {
        $syscall="NA";
        $result="NA";
        # The result is the last whitespace-separated field on the line.
        if ($_ =~ /\s(\S+)$/) {
            $result=$1;
        }
        # A system call line: "pid: name(args) ..."
        if ($_ =~ /^\d+:\s+([^\(]+)\(/) {
            $syscall=$1;
            last TEST;
        }
        # A "pid: *** notice ***" line is echoed but not compared.
        if ($_ =~ /^\d+:\s+\*\*\*.*\*\*\*$/) {
            print "$_\n";
            last TEST;
        }
        die("RE not matched: $_\n");
    }
}

#######################################

die("usage: $0 left-truss-output right-truss-output\n") unless (@ARGV == 2);

open(LFH, "<$ARGV[0]") or die("open $ARGV[0]: $!");
open(RFH, "<$ARGV[1]") or die("open $ARGV[1]: $!");

for($lineno=1;;$lineno++) {

    $left=<LFH>;
    $right=<RFH>;
    # Both files ended together: no difference was found.
    last if ( (!defined($left)) && (!defined($right)) );
    if ( (!defined($left)) && (defined($right)) ) {
        die("end of $ARGV[0]");
    }
    if ( (defined($left)) && (!defined($right)) ) {
        die("end of $ARGV[1]");
    }
    chomp($left);
    chomp($right);

    #print "DEBUG $left\n";

    lineparse($left);
    $left_syscall=$syscall;
    $left_result=$result;
    $left{$lineno}=$left;
    lineparse($right);
    $right_syscall=$syscall;
    $right_result=$result;
    $right{$lineno}=$right;

    # Different system calls at the same position.
    if ($right_syscall ne $left_syscall) {
        print "syscall difference\n\n";
        display();
        exit(1);
    }
    # One side returned a number, the other did not.
    if ( ($right_result =~ /^\d+$/) && ($left_result !~ /^\d+$/)) {
        print "Non-Numerical Result (left)\n\n";
        display();
        exit(1);
    }
    if ( ($right_result !~ /^\d+$/) && ($left_result =~ /^\d+$/)) {
        print "Non-Numerical Result (right)\n\n";
        display();
        exit(1);
    }
    # One side returned 0 and the other returned something else.
    if ( ( ("0" eq $right_result) && ("0" ne $left_result) ) ||
         ( ("0" ne $right_result) && ("0" eq $left_result) ) ){
        print "Numerical Results\n\n";
        display();
        exit(1);
    }

}

exit(0);
--
Elvis Notargiacomo master AT barefaced DOT cheek
http://www.notatla.org.uk/goen/
Nov 14 '05 #8
