hello,
i'm trying to answer a "prove an upper bound on the number of machine
instructions required to process M connections on N objects" exercise
from an algorithms book. i don't understand a very basic detail of how
i should go about answering the exercise. here's a simplified example
question and example programme to hopefully get at what i'd like to
know:
Prove an upper bound on the number of machine instructions required to
process N objects using the below programme. You may
assume, for example, that any C assignment statement always requires
less than c instructions, for some fixed constant c.
/* a very simple and silly example "algorithm" */
#include <stdio.h>
#define N 10000

int main(void)
{
    int n = 0;
    int x[3] = { 1, 4, 9 };
    long unsigned total = 0;

    while( n < N ) {
        total += x[ n % 3 ];
        n++;
    }
    printf("%lu\n", total);
    return 0;
}
i know the "upper bound" aspect of the question isn't particularly
meaningful with this programme, as after the value of N has been
decided there's nothing variable about the number of times the loop is
run -- upper and lower bound the same -- one fixed, by N, value. but i
wanted to leave the question as unchanged as possible.
to answer that exercise, is it a case of assigning a certain number of
machine instructions to each and every C statement, or operation, in
the loop, adding those up, and then multiplying that by the number of
loops?
it's the "You may assume, for example, that any C assignment statement
always requires less than c instructions, for some fixed constant c."
part i'm unsure on.
how exactly would you go about answering the above exercise (ignoring
the silliness part of it)?
thanks, ben.
ben wrote: hello,
Not about CLC...
What do you do in big-Oh with constants?
e.g. the inner loop takes O(35n) [35 opcodes per pass, not 35 statements
though] -- that's O(n) time... then if the outer loop is O(n log n), it's
the product, or O(n^2 log n) time.
.... so the question is really moot.
If you want really precise timings you're gonna have to sample.
Compilers vary too much and they have too many options to enumerate all
possible results..
Tom
In article <11**********************@o13g2000cwo.googlegroups .com>, Tom
St Denis <to********@gmail.com> wrote: e.g. inner loop takes O(35n) [35 opcodes per pass, not 35 statements though]
yes, that's the part that i'm asking about and hoping to find out
about. my question is about this part of the exercise question:
"You may assume, for example, that any C assignment statement always
requires less than c instructions, for some fixed constant c."
how would you carry out that part with the example loop -- just for one
loop is all that's needed.
the higher level aspect of the exercise question that you're talking
about Tom, i understand -- that's why i gave a very simple bit of code
and labelled it silly and stated "after the value of N has been
decided there's nothing variable about the number of times the loop is
run -- upper and lower bound the same -- one fixed, by N, value. but i
wanted to leave the question as unchanged as possible" -- there's no
need for the variable run aspect to get at what i'm asking about.
so basically how do you apply "You may assume, for example, that any C
assignment statement always requires less than c instructions, for some
fixed constant c." to
while( n < N ) {
total += x[ n % 3 ];
n++;
}
?
thanks, ben.
ben wrote: [snip]
how would you carry out that part with the example loop -- just for one loop is all that's needed.
I am not sure how this is meant -- plainly just looking for assignments
seems stupid. "Operation" is more meaningful
[snip]
so basically how do you apply "You may assume, for example, that any C assignment statement always requires less than c instructions, for some fixed constant c." to
while( n < N ) { total += x[ n % 3 ]; n++; }
Assignments: 2 per loop -> 2*c*N. Complete nonsense.
Operations:
n % 3 -> temp1
x + temp1 -> temp2
total + *temp2 -> total
or:
*temp2 -> temp3
total + temp3 -> total
n + 1 -> n
and of course
n < N -> stay in the loop
else -> leave the loop
or intermediate assignment of n<N -> temp0 and test afterwards.
Would be [5..7]*c*N.
IMO, the exercise should be stated in a clearer way.
Cheers
Michael
--
E-Mail: Mine is a gmx dot de address.
In article <38*************@individual.net>, Michael Mair
<Mi**********@invalid.invalid> wrote:
[snip]
Would be [5..7]*c*N.
right. great.
IMO, the exercise should be stated in a clearer way.
great. it was unclear to me also, partly for the 'assignment' reason
which i also thought was odd and partly not understanding exactly how
it should be broken down. so i wasn't sure if it was unclear to me
through my lack of knowledge, or due to the exercise being badly
written -- a bit of both was the answer to that.
thanks very much for the reply Michael,
ben.
ben wrote:
<snip>
In my opinion, this is a waste of time. Instruction count no longer
has much of a bearing on how much time a program takes to execute. To
use your example program, with N set to 0x4FFFFFFFu:
[mark@icepick ~]$ gcc -save-temps -Wall -O3 -o bar bar.c && time ./bar
&& wc -l bar.s
1968526669
real 0m16.683s
user 0m16.682s
sys 0m0.002s
44 bar.s <- lines of assembly
[mark@icepick ~]$ gcc -save-temps -Wall -O3 -march=pentium4 -o bar
bar.c && time ./bar && wc -l bar.s
1968526669
real 0m4.313s
user 0m4.312s
sys 0m0.001s
78 bar.s <- lines of assembly
As you can see, a change in an optimization setting roughly doubled the
assembly output, but resulted in a 4x speed improvement. I picked the
quickest of 3 trials each, for the record.
It may be an interesting academic exercise, but its real world
applicability is questionable. Probably not the answer you wanted, but
an answer nonetheless.
Mark F. Haigh mf*****@sbcglobal.net
In article <11**********************@g14g2000cwa.googlegroups .com>,
Mark F. Haigh <mf*****@sbcglobal.net> wrote:
[snip]
It may be an interesting academic exercise, but its real world applicability is questionable. Probably not the answer you wanted, but an answer nonetheless.
Mark,
well, it sounds like a good point. if you're right then it's a very
good answer and one i'd definitely want. the exercise is just from a
book. i don't have to do it as stated by the exercise. if the way
that's detailed isn't so useful and there's a much more useful way then
i'd prefer to learn and do it a way that's useful. i do intend to use
this stuff in the real world -- that's what i'm learning it for.
i'm just wondering though, if you're comparing operation counts of code
compiled using the same settings, and the different programmes are
written in a similar fashion (that is, for example, not comparing one
programme that unrolls a loop and another that doesn't -- code written
in the same style) then the number of operations is a reasonable way to
compare? basically, use the same base settings and style. (although
"code written in the same style" is probably pretty ambiguous)
further thought: different operations take different amounts of time
don't they? -- i think so. so that'd make operation counting less
useful. for instance i think accessing a word of memory that isn't in
the cpu's cache can take several clock cycles, whereas if that data is
in cache the access might take just one clock cycle. correct?
so basically you're saying timing is the way to measure algorithms?
thanks, ben
ben wrote: well, it sounds like a good point. if you're right then it's a very good answer and one i'd definitely want. the exercise is just from a book. i don't have to do it as stated by the exercise. if the way that's detailed isn't so useful and there's a much more useful way
then i'd prefer to learn and do it a way that's useful. i do intend to use this stuff in the real world -- that's what i'm learning it for.
Good decision. Can't remember who said "In theory, theory and practice
are the same, in practice, they're different."
i'm just wondering though, if you're comparing operation counts of
code compiled using the same settings, and the different programmes are written in a similar fashion (that is, for example, not comparing one programme that unrolls a loop and another that doesn't -- code
written in the same style) then the number of operations is a reasonable way
to compare? basically, use the same base settings and style. (although "code written in the same style" is probably pretty ambiguous)
These days, it would be unwise to manually unroll a loop in code
without timings to back it up. Compilers typically understand the
associated branching costs (whether cheap or expensive, cost of
mispredicted branches, etc) and pipeline scheduling better than most
people do.
further thought: different operations take different amounts of time don't they? -- i think so. so that'd make operation counting less useful.
Yes, although it depends on the processor architecture exactly how
different the numbers can be. Some instructions can be executed in
parallel. Some instructions may need to be internally broken down into
smaller micro-ops, each of which is individually executed.
for instance i think accessing a word of memory that isn't in the cpu's cache can take several clock cycles, whereas if that data
is in cache the access might take just one clock cycle. correct?
Worse, much of the time, unless you're dealing with older or embedded
processors. It can take hundreds or even thousands of cycles on newer
fast machines. Typically it's not one cycle from cache, either, and it
will depend on which cache it's coming from.
For the sake of illustration, let's revisit a previous topic-- loop
unrolling. Let's say you manually unrolled a bunch of loops all over
your code. Are you sure that this new bulk of code does not force some
other critical part of code out of cache, resulting in several hundred
cycle wastage every time it has to be re-fetched? Now the several
cycles you saved looks like nothing. Think of the compiler as your
friend that helps you find the sweet spot amongst many different
competing factors.
so basically you're saying timing is the way to measure algorithms?
Absolutely. First, optimize algorithmic complexity. Second, implement
it cleanly. Third, play with your compiler settings/hints and profile
it. Fourth and last, which is rarely needed, micro-optimize.
The fastest linked list in the world will get creamed by a slow hash or
tree if it's used inappropriately. Compilers do a better job of
optimizing if the code is implemented cleanly. Profiling shows you
where the bulk of _time_ is spent, so you don't go chasing false leads
and trying to optimize things that will give you a .00001% increase.
Mark F. Haigh mf*****@sbcglobal.net
Mark,
ok, thanks for the info
ben.