I'm implementing an algorithm whose computational flow is somewhat
deep. That is, function A makes many calls to function B, which makes
many calls to function C, and so on. The return value of the outermost
function is a boolean, and there are certain places within the inner
functions where it may become apparent that the return value is false.
In that case I'd like to halt the computation immediately and return false.
My current approach is to have any of the inner functions throw an
exception if the return value is ever determined early like this. Then
within function A I catch the exception and return false. Is this a good
approach, or is there a more C++ way to handle it? It's sort of a
glorified goto, but it does get the job done.
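A minimal sketch of the pattern described above (the function and exception names here are invented for illustration; the real call chain is presumably much deeper):

```cpp
#include <stdexcept>

// Hypothetical exception type used only to signal an early "false" result.
struct EarlyFalse {};

// Innermost function: throws as soon as the result is known to be false.
void fcn_c(int x) {
    if (x < 0)
        throw EarlyFalse();   // result determined early; unwind to fcn_a
    // ... continue computing ...
}

void fcn_b(int x) {
    for (int i = 0; i < 3; ++i)
        fcn_c(x - i);
}

// Outermost function: the only place the exception is caught.
bool fcn_a(int x) {
    try {
        for (int i = 0; i < 3; ++i)
            fcn_b(x + i);
        return true;          // no inner function bailed out
    } catch (const EarlyFalse&) {
        return false;         // computation halted early
    }
}
```

With these toy bodies, fcn_a(10) runs to completion and returns true, while fcn_a(0) returns false the moment fcn_c sees a negative argument, without any of the intermediate frames testing a result.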
On a related note: if an exception is never thrown, is there any
performance penalty to enclosing a block of code in a try-catch
block? The exceptional case above should be rare, and I don't want to
hurt the common case by doing this.
Thanks,
Mark
Mar 30 '06
Phlip wrote: Kaz Kylheku wrote: Oh man, what an incompetent moron. ANSI C gives us the clock() function for doing this type of thing. No need for Windows extensions.
You must be describing yourself. clock() has a resolution in tens of milliseconds.
So does GetTickCount(). ISTR that on some WinCE systems, they crank it
up a bit, but normally it's 100Hz.
Everyone knows it's not good for profiling.
Idiot, you need a high resolution hardware clock in order to time a
/single/ event accurately. If you can repeat the event enough times,
you can use averaging to magnify the precision of a low resolution
clock.
You can even time events that are much shorter than the granularity of
the clock.
Suppose that you have a clock that increments every millisecond. Over
many trials, the event you are timing reads as zero milliseconds about
90% of the time, but 10% of the time it registers as one millisecond.
From that, you can estimate the duration of the event to be 0.1
milliseconds. One way to look at it is that there is a 10%
probability that the increment of the clock lands in the middle of the
event, which means that the event spans 10% of the clock period. This
estimate does assume that there is no bias introduced by some
dependency between the clock and the timing of the event.
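That estimate can be checked with a small simulation (a sketch; the 1 ms tick and 0.1 ms event are arbitrary choices, and the uniformly random start phase stands in for the "no bias" assumption above):

```cpp
#include <random>

// Estimate the duration of a short event with a coarse clock by averaging
// over many trials. The clock increments once per `tick`; the event lasts
// `duration` (< tick) and starts at a random phase within a tick period.
double estimate_duration(double tick, double duration, int trials) {
    std::mt19937 rng(42);  // fixed seed so the result is repeatable
    std::uniform_real_distribution<double> phase(0.0, tick);
    long ticks_seen = 0;
    for (int i = 0; i < trials; ++i) {
        double start = phase(rng);
        // A tick boundary lands inside the event exactly when the event,
        // starting at `start`, reaches the next multiple of `tick`.
        if (start + duration >= tick)
            ++ticks_seen;  // the clock reads 1 tick; otherwise it reads 0
    }
    // The average reading, scaled by the tick period, estimates the duration.
    return tick * static_cast<double>(ticks_seen) / trials;
}
```

With a 1 ms tick and a 0.1 ms event, roughly 10% of trials register one tick, so the averaged estimate comes out near 0.1 ms even though no single measurement can resolve it.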
I think this has to do with that probability and statistics stuff I had
to pass once upon a time to get a CS degree. What do you think?
Or would you have lost your cool over a POSIX method posted here?
I haven't lost my cool; I just like to insult stupid people, because it
feels good.
Why don't you go net-psychoanalyze someone else.
Noah Roberts wrote: Of course that is good because after a depth > 800 you get a few microseconds faster using exceptions on every compiler and architecture.
You might consider focusing on people you can actually learn from here.
(And also me;)
--
Phlip http://www.greencheese.org/ZeekLand <-- NOT a blog!!!
Noah Roberts wrote: Let's not take multiple samples of time and average them together. 5000 is not enough samples??!! I could do more but then I would have to take the exception test out as it would take way too long.
5000 is the number times the function is called, not the number of
times the clock is sampled.
Calling the function isn't sampling. Reading the clock is sampling.
You can do the math to get the average if you really want... it's easy: (v2 - v1)/5000
That depends on whether v2 and v1 have a good enough resolution
relative to the 5000, since they represent a single sampled clock
delta.
In the case of the return test, you witnessed that there was usually no
increment of the clock at all, so a zero average would be computed. At
odd times, you might catch a single increment of the tick by 10
milliseconds, and at those times, the above formula would yield 1/500th
of a millisecond.
A more accurate estimate is somewhere between zero and 1/500 ms, and can
be obtained with more clock samples.
Let's not parametrize anything interesting, like the call stack depth.
Granted, it isn't a perfect test but it also doesn't have to be. With differences that great you don't need to get into much detail. It is sufficient to show that in a simple case like above, the one you described, using exceptions as a return mechanism (something they were never intended to be)
Regardless of their intent, slow or fast, exceptions are a dynamic
non-local return mechanism. That is what they are. This is proved
semantically by the fact that function activations terminate and
control ends up at a higher level.
is several orders of magnitude slower than using the _appropriate_ mechanism built for that purpose.
You don't appear to be qualified to prescribe to others what is an
appropriate mechanism for a given purpose.
I reproduced a situation in which returning to a top level by exception
was faster than cascading returns.
I didn't deserve to be insulted for proposing the skeptical hypothesis
that under some circumstances, it /might/ be faster to return by
exception.
I defended that hypothesis. And thus my wristwatch tells me that I'm,
oh, right about done here.
Kaz Kylheku wrote: Phlip wrote: The topic we have all been avoiding:
Is it possible, somewhere, somehow, that throwing the exception could actually be faster than returning?
Yes. Here is a benchmark program.
bool ex_top(int max_depth, int depth) { ex_middle(max_depth, depth + 1); }
Doesn't gcc complain about the lack of a return?
The results:
chrono_bias = 0.000000400s
max_depth = 100 return_time = 0.000002000s exception_time = 0.000009000s
max_depth = 200 return_time = 0.000003000s exception_time = 0.000009400s
max_depth = 300 return_time = 0.000004900s exception_time = 0.000009900s
max_depth = 400 return_time = 0.000006300s exception_time = 0.000010000s
max_depth = 500 return_time = 0.000007700s exception_time = 0.000010800s
max_depth = 600 return_time = 0.000009400s exception_time = 0.000010500s
max_depth = 700 return_time = 0.000010900s exception_time = 0.000011300s
max_depth = 800 return_time = 0.000012800s exception_time = 0.000011100s
max_depth = 900 return_time = 0.000014200s exception_time = 0.000011900s
max_depth = 1000 return_time = 0.000016400s exception_time = 0.000012100s
As you can see, as the stack grows deeper, the exception mechanism eventually becomes faster. The break-even point is at a depth of about 700 to 800.
The results suggest that exceptions have a high fixed cost on this compiler, but only a marginal additional cost to unwind deeper.
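The benchmark program is only shown in fragments in this thread; a rough reconstruction of its shape might look like the following (function names and the clock()-based timing loop are guesses at the original, and the three-deep top/middle/bottom chain is collapsed into one recursive function):

```cpp
#include <ctime>

struct Done {};  // hypothetical type thrown to unwind to the top in one shot

// Return-based variant: the "false" result cascades back through every frame.
bool rt_dive(int max_depth, int depth) {
    if (depth >= max_depth)
        return false;                      // result determined at the bottom
    return rt_dive(max_depth, depth + 1);  // each level tests and forwards it
}

// Exception-based variant: one throw pops all the frames at once.
void ex_dive(int max_depth, int depth) {
    if (depth >= max_depth)
        throw Done();
    ex_dive(max_depth, depth + 1);
}

bool ex_test(int max_depth) {
    try {
        ex_dive(max_depth, 0);
        return true;   // never reached in this sketch
    } catch (const Done&) {
        return false;
    }
}

// Time `iters` runs of one variant with the coarse clock(), relying on
// averaging to recover a per-call figure smaller than the clock resolution.
double average_seconds(bool use_exceptions, int max_depth, int iters) {
    std::clock_t t0 = std::clock();
    for (int i = 0; i < iters; ++i) {
        if (use_exceptions)
            ex_test(max_depth);
        else
            rt_dive(max_depth, 0);
    }
    return static_cast<double>(std::clock() - t0) / CLOCKS_PER_SEC / iters;
}
```

Comparing average_seconds(false, d, N) against average_seconds(true, d, N) over a range of depths d reproduces the kind of table shown above; the absolute numbers, of course, depend entirely on the compiler and its unwinding implementation.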
OK, I'll bite:
Pentium M 2GHz, Solaris snv_b33, Sun Studio 11.
Built with -O2.
chrono_bias = 0.000000540s
max_depth = 100
return_time = 0.000001060s
exception_time = 0.000005960s
max_depth = 200
return_time = 0.000002060s
exception_time = 0.000011260s
max_depth = 300
return_time = 0.000002360s
exception_time = 0.000016460s
max_depth = 400
return_time = 0.000003960s
exception_time = 0.000022360s
max_depth = 500
return_time = 0.000004860s
exception_time = 0.000027460s
max_depth = 600
return_time = 0.000006060s
exception_time = 0.000033260s
max_depth = 700
return_time = 0.000007160s
exception_time = 0.000038560s
max_depth = 800
return_time = 0.000008360s
exception_time = 0.000043560s
max_depth = 900
return_time = 0.000009360s
exception_time = 0.000049660s
max_depth = 1000
return_time = 0.000010260s
exception_time = 0.000054660s
As you can see, with this compiler, exceptions are always slower.
--
Ian Collins.
Kaz Kylheku wrote: Noah Roberts wrote: Let's not take multiple samples of time and average them together.
5000 is not enough samples??!! I could do more but then I would have to take the exception test out as it would take way too long.
5000 is the number times the function is called, not the number of times the clock is sampled.
Calling the function isn't sampling. Reading the clock is sampling.
WTF are you talking about? You do it that way, you DEFINITELY have to
account for all of the following and more:
All calls to timing functions.
All time spent outside the program (OS preemption).
All time re-establishing program counters, etc.
Since each run takes less than your measurable clock tick, there is no
way to account for any of them. No... you run multiple times, take the
time pre/post iterations, and then average. Then all the little stuff is
insignificant.
You can't take an average of any unit that is smaller than your
smallest measure. Only after you take enough samples to add up to at
least one of your measuring units can you even think about it.
I'm surprised you passed your statistics class with your methodology.
I could run 5000 iterations several times and take an average, and this
would be more accurate, but the variation would not alter results this
vastly different. It is simply not necessary to get that detailed. It
would be like taking several samples of a blade of grass and comparing
them to several samples of full-grown adult fir trees. You don't have to
sample more than one of each to know what the outcome will be.
Besides, I did run the tests several times to see if there was ever a
significant value change, and there wasn't. Exceptions always take at
least 2000x longer on this arch/compiler... no amount of multi-sample
averaging will change that.
Ian Collins wrote: OK, I'll bite:
Pentium M 2GHz, Solaris snv_b33, Sun Studio 11.
Built with -O2.
chrono_bias = 0.000000540s
Wouldn't you know it, that awful clock() function that may return "just
about anything, useful or otherwise" is actually working on a
different OS and development platform.
max_depth = 400 return_time = 0.000003960s exception_time = 0.000022360s
Wow, that is quite interesting. Thanks for the data points.
GNU C++ works more according to my intuition about exceptions. I
suspected that just mashing through the stack would be faster than
actually taking the branches through the return points, regardless of
whatever fixed cost. That's why I wrote the test that way: to see how a
delta in the depth would affect the two approaches.
Still, both compilers could use some work. That fixed cost in g++'s
implementation is quite disturbing. Something happens at the start of
the exception processing or upon its termination which eats up a lot of
time. But once the unwinding loop actually gets going, it's quite
rapid. If that apparently fixed cost could be drastically reduced,
exceptions could compete with normal returns in broader circumstances.
I'm recompiling the program with -pg to get some profiling info out of
it.
The profiling shows that less time is spent in the ex_() functions.
Part of the reason for that is that they never return. The way I set up
the test, the recursion just dives straight down and then throws. The
other half of the work, returning, is never done.
Flat profile:
Each sample counts as 0.01 seconds.
  %   cumulative   self               self     total
 time    seconds  seconds     calls  us/call  us/call  name
30.40       3.07     3.07  184000000    0.02     0.02  rt_bottom(int, int)
26.24       5.72     2.65  184000000    0.01     0.01  rt_middle(int, int)
24.75       8.22     2.50  184000000    0.01     0.01  rt_top(int, int)
 6.73       8.90     0.68  184000000    0.00     0.00  ex_top(int, int)
 5.94       9.50     0.60  184000000    0.00     0.00  ex_bottom(int, int)
 4.85       9.99     0.49  184000000    0.00     0.00  ex_middle(int, int)
 0.79      10.07     0.08    3000000    0.03     0.03  stop_chrono()
 0.20      10.09     0.02    3000000    0.01     0.01  start_chrono()
 0.10      10.10     0.01    1000000    0.01     1.78  ex_test(int)
 0.00      10.10     0.00    1000000    0.00     8.22  rt_test(int)
 0.00      10.10     0.00         21    0.00     0.00  reset_chrono()
 0.00      10.10     0.00         21    0.00     0.00  get_average_time()
 0.00      10.10     0.00         10    0.00     0.00  putchar
Still, 0.60 seconds spent in all the ex_bottom() calls versus 3.07
seconds in all the rt_bottom() calls. That's a whopping discrepancy
even considering that one of the two actually returns whereas the other
never does. Is it that much more expensive to return from a function
than to enter into it?
It would be nice to profile into the run-time support routines that
support exception handling, which is what I was /really/ after, but it
appears that for that we'd have to get libgcc.a recompiled with
profiling support.
As you can see, with this compiler, exceptions are always slower.
By the way, I'm skeptical of your results. I have it from a reliable
source that
"after a depth > 800 you get a few microseconds faster
using exceptions on every compiler and architecture. "
Hee hee. Sigh.
Kaz Kylheku wrote: By the way, I'm skeptical of your results. I have it from a reliable source that
"after a depth > 800 you get a few microseconds faster using exceptions on every compiler and architecture. "
Hee hee. Sigh.
... and _I'm_ the moron.
Ian Collins wrote: bool ex_top(int max_depth, int depth) { ex_middle(max_depth, depth + 1); } Doesn't gcc complain about the lack of a return?
Ah yes it does! But not if you have -O2 on the command line.
Before I posted the program, I wanted to have a peek at the assembly
language output, to make sure that there was no funny optimization,
like Scheme-like tail calls. But I didn't specify -O2. Now I remember
that the warnings did appear.
The optimized assembly language output shows that gcc has analyzed the
functions and turned the whole thing into tail calls (which I was
hoping wouldn't happen!)
Actually each function builds up and tears down the stack frame, but
then does a jump to the next one. With -fomit-frame-pointer, that
disappears, and here is what ex_top looks like, haha, with identifiers
demangled through c++filt:
.globl ex_top(int, int)
        .type ex_top(int, int),@function
ex_top(int, int):
.LFB31:
        incl 8(%esp)
.LCFI34:
        jmp ex_middle(int, int)
Increment the depth and jump!
This is why the exception returns appear so fast and don't appear to
become more expensive with increasing depth. There are no additional
stack frames to search through.
Let's re-do the test with -O2 and -fno-optimize-sibling-calls. I
changed the ex_() functions to have void returns:
Now the exception times get quite gross. Your Sun compiler trounces
g++:
max_depth = 100
return_time = 0.000001670s
exception_time = 0.000200370s
max_depth = 200
return_time = 0.000003270s
exception_time = 0.000388070s
Ick!
Kaz Kylheku wrote: Wow, that is quite interesting. Thanks for the data points.
GNU C++ works more according to my intuition about exceptions. I suspected that just mashing through the stack would be faster than actually taking the branches through the return points, regardless of whatever fixed cost. That's why I wrote the test that way: to see how a delta in the depth would affect the two approaches.
Have you tried higher levels of optimisation? At -O4, CC optimises most
of the code away in the non-exception case, giving a constant time less
than chrono_bias.
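A common guard against the optimiser deleting the measured calls like this (a sketch, not taken from the original program) is to route every result through a volatile object:

```cpp
// Hypothetical stand-in for the benchmarked recursion; at high optimisation
// a compiler can prove a call like this side-effect-free and delete it.
bool work(int depth) {
    return depth <= 0 ? false : work(depth - 1);
}

// Writing each result to a volatile object is an observable side effect,
// so the calls in the timing loop cannot be discarded as dead code.
volatile bool sink = false;

void run_trials(int trials, int depth) {
    for (int i = 0; i < trials; ++i)
        sink = sink | work(depth);  // observable write: can't be elided
}
```

The timed region then measures the real calls rather than an empty loop, at the cost of one volatile store per iteration, which should be folded into the measured overhead the same way the chrono_bias figure is.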
Still, both compilers could use some work. That fixed cost in g++'s implementation is quite disturbing. Something happens at the start of the exception processing on upon its termination which eats up a lot of time. But once the unwinding loop actually gets going, it's quite rapid. If that apparently fixed cost could be drastically reduced, exceptions could compete with normal returns in broader circumstances.
Considering exceptions are for exceptional circumstances, I'm only really
concerned with normal operation; if the trade-off is expensive throwing,
so be it.
--
Ian Collins.
Ian Collins wrote: Considering exceptions are for exceptional circumstances, I'm only really concerned with normal operation; if the trade-off is expensive throwing, so be it.
I don't know if you missed it, but that is exactly what this argument
is all about: using exceptions as a standard return mechanism: http://groups.google.com/group/comp....3fe58585dd209c