Return code of 4294967295 (UINT_MAX)

If anybody has any insight into this problem I'm running into I would
really appreciate if you could write to me...

I'm running a simple C++ program on Solaris 8 that forks and execs a
bunch of processes. It's been running fine for years, but now that
I've moved to faster hardware, I'm running into a problem that
surfaces more frequently as the hardware I'm using gets better/faster
-- it seems like some sort of race condition issue.

Basically, at random, a handful of the processes immediately fail
(before actually doing anything) and return an exit code of 4294967295
(UINT_MAX). I imagine that this is just an umbrella status code for
all unexpected/unexplained errors, so I'm not sure if it means
anything?

One thing to note is that if I just trap this error and execute the
process again, it runs fine. It just seems like at the time the
fork/exec takes place, something in the system temporarily screws up
but I don't know what. Of course I do have a workaround (just re-run
the process) but I'd like to know what's going on.

If anybody has encountered anything like this, please let me know, and I
can provide you with more information if need be...thanks.
Nov 14 '05 #1

"Vineet" <vi*************@mantas.com> wrote in message news:4a**************************@posting.google.c om...
: If anybody has any insight into this problem I'm running into I would
: really appreciate if you could write to me...
:
: I'm running a simple C++ program on Solaris 8 that forks and execs a
: bunch of processes. It's been running fine for years, but now that
: I've moved to faster hardware, I'm running into a problem that
: surfaces more frequently as the hardware I'm using gets better/faster
: -- it seems like some sort of race condition issue.

Sounds like fork is failing. If the program is not running as root,
you are probably exceeding maxuproc (see your kernel parameters
documentation). If it is running as root, you're exceeding nprocs.
Nprocs determines how many structures controlling processes are
created. Maxuproc determines how many processes a user can
have - it should always be less than half nprocs and is usually quite
a bit smaller. Default parameters for most machines are not set up
properly to handle heavy usage by a small number of programs or users.

One other problem: it is likely that you are not handling wait situations
properly and may have a large number of zombie processes. These
will consume limited process resources. Even if your processes
are being reaped by init, it's possible to get into race conditions where zombies
are created faster than init can reap them. Without seeing any
actual code - like your fork-exec code - it's impossible to say.

Post some code if you want more help.

Good luck,

Dan

:
: Basically, at random, a handful of the processes immediately fail
: (before actually doing anything) and return an exit code of 4294967295
: (UINT_MAX). I imagine that this is just an umbrella status code for
: all unexpected/unexplained errors, so I'm not sure if it means
: anything?

It's a -1.

:
: One thing to note is that if I just trap this error and execute the
: process again, it runs fine. It just seems like at the time the
: fork/exec takes place, something in the system temporarily screws up
: but I don't know what. Of course I do have a workaround (just re-run
: the process) but I'd like to know what's going on.
:
: If anybody has encountered anything like this, please let me know, and I
: can provide you with more information if need be...thanks.
Nov 14 '05 #2
joe
vi*************@mantas.com (Vineet) writes:
> If anybody has any insight into this problem I'm running into I
> would really appreciate if you could write to me...
>
> I'm running a simple C++ program on Solaris 8 that forks and execs a
> bunch of processes. It's been running fine for years, but now that
> I've moved to faster hardware, I'm running into a problem that
> surfaces more frequently as the hardware I'm using gets better/faster
> -- it seems like some sort of race condition issue.
>
> Basically, at random, a handful of the processes immediately fail
> (before actually doing anything) and return an exit code of
> 4294967295 (UINT_MAX). I imagine that this is just an umbrella
> status code for all unexpected/unexplained errors, so I'm not sure
> if it means anything?


That's -1, which the man page documents fork() to return in case of an
error. In that case you should check to see what the value of errno
is. You can get a textual version of the error by calling either
perror() or strerror(errno). That will probably enlighten things.

Joe
--
Folks who don't know why America is the Land of Promise should be here
during an election campaign.
-- Milton Berle
Nov 14 '05 #3

> Post some code if you want more help.


Really, guys, this discussion belongs off-line or in a forum that is more
appropriate. This newsgroup doesn't discuss platform-specific stuff like
processes/threads, etc. This is a language newsgroup.

-Howard
Nov 14 '05 #4
"Howard" <al*****@hotmail.com> writes:
>> Post some code if you want more help.

> Really, guys, this discussion belongs off-line or in a forum that is more
> appropriate. This newsgroup doesn't discuss platform-specific stuff like
> processes/threads, etc. This is a language newsgroup.


This thread is cross-posted to comp.lang.c++, comp.unix.programmer,
comp.lang.c, and comp.unix.solaris. It's probably appropriate in
comp.unix.programmer and/or comp.unix.solaris, which do discuss
platform-specific stuff. Please trim the newsgroups line on any
followups.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 14 '05 #5
vi*************@mantas.com (Vineet) writes:
> Basically, at random, a handful of the processes immediately fail
> (before actually doing anything) and return an exit code of 4294967295
> (UINT_MAX). I imagine that this is just an umbrella status code for
> all unexpected/unexplained errors, so I'm not sure if it means
> anything?

A process cannot fail with an exit code of 4294967295 (UINT_MAX);
the only valid exit codes are between 0 and 255 (inclusive).

So the first question is: what is returning -1? (Whatever returns
a number with all bits set is more likely to return -1 than UINT_MAX.)

> One thing to note is that if I just trap this error and execute the
> process again, it runs fine. It just seems like at the time the
> fork/exec takes place, something in the system temporarily screws up
> but I don't know what. Of course I do have a workaround (just re-run
> the process) but I'd like to know what's going on.


Have you tried "truss -f"?

Casper
--
Expressed in this posting are my opinions. They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.
Nov 14 '05 #6
Howard wrote:
> .... snip ...
> Really, guys, this discussion belongs off-line or in a forum that
> is more appropriate. This newsgroup doesn't discuss
> platform-specific stuff like processes/threads, etc. This is a
> language newsgroup.


So, instead of simply muttering, set followups.

--
Replies should be to the newsgroup
Replace this line to publish your e-mail address.
Nov 14 '05 #7
In article <40***********************@news.xs4all.nl>, Casper H.S. Dik wrote:
>> fork/exec takes place, something in the system temporarily screws up
>> but I don't know what. Of course I do have a workaround (just re-run
>> the process) but I'd like to know what's going on.

> Have you tried "truss -f"?

This might help spot the difference between 2 truss outputs
(which you name on the command line). Random interleaving
of parent and child contributions is left for you to sort
out in this version. I might have done more if I'd thought
I'd need it regularly.

#!/usr/bin/perl -w
# Compare two "truss -f" outputs line by line and stop at the first
# difference. The two file names are given on the command line.

# Print the current line and up to five preceding lines from each file.
sub display
{
    for ($ln=$lineno-5; $ln<=$lineno; $ln++) {
        next if ($ln<1);
        printf("%s(%d) %s\n", $ARGV[0], $ln, $left{$ln});
    }
    print "\n";
    for ($ln=$lineno-5; $ln<=$lineno; $ln++) {
        next if ($ln<1);
        printf("%s(%d) %s\n", $ARGV[1], $ln, $right{$ln});
    }
}

# Split one truss line into the globals $syscall and $result.
sub lineparse
{
    $_=shift;

    TEST: while (1) {
        $syscall="NA";
        $result="NA";
        # The result is the last whitespace-separated field.
        if ($_ =~ /\s(\S+)$/) {
            $result=$1;
        }
        # "pid: syscall(args...)" lines.
        if ($_ =~ /^\d+:\s+([^\(]+)\(/) {
            $syscall=$1;
            last TEST;
        }
        # "pid: *** ... ***" notices are echoed and not treated as syscalls.
        if ($_ =~ /^\d+:\s+\*\*\*.*\*\*\*$/) {
            print "$_\n";
            last TEST;
        }
        die("RE not matched: $_\n");
    }
}

#######################################

open(LFH, "<$ARGV[0]") or die("open $ARGV[0]: $!");
open(RFH, "<$ARGV[1]") or die("open $ARGV[1]: $!");

for($lineno=1;;$lineno++) {

    $left=<LFH>;
    $right=<RFH>;
    # Both files ended on the same line: no difference was found.
    last if ( (!defined($left)) && (!defined($right)) );
    if ( (!defined($left)) && (defined($right)) ) {
        die("end of $ARGV[0]");
    }
    if ( (defined($left)) && (!defined($right)) ) {
        die("end of $ARGV[1]");
    }
    chomp($left);
    chomp($right);

    #print "DEBUG $left\n";

    lineparse($left);
    $left_syscall=$syscall;
    $left_result=$result;
    $left{$lineno}=$left;
    lineparse($right);
    $right_syscall=$syscall;
    $right_result=$result;
    $right{$lineno}=$right;

    # Different system calls on the same line number.
    if ($right_syscall ne $left_syscall) {
        print "syscall difference\n\n";
        display();
        exit(1);
    }
    # One side returned a number and the other did not.
    if ( ($right_result =~ /^\d+$/) && ($left_result !~ /^\d+$/)) {
        print "Non-Numerical Result (left)\n\n";
        display();
        exit(1);
    }
    if ( ($right_result !~ /^\d+$/) && ($left_result =~ /^\d+$/)) {
        print "Non-Numerical Result (right)\n\n";
        display();
        exit(1);
    }
    # Exactly one side returned 0.
    if ( ( ("0" eq $right_result) && ("0" ne $left_result) ) ||
         ( ("0" ne $right_result) && ("0" eq $left_result) ) ){
        print "Numerical Results\n\n";
        display();
        exit(1);
    }

}

exit(0);
--
Elvis Notargiacomo master AT barefaced DOT cheek
http://www.notatla.org.uk/goen/
Nov 14 '05 #8
