object file size is reduced after build

>We have a large code base in which we are fixing bugs. For every bug we
>change the code, either adding or deleting some part of it. Mostly we
>add code; very little is deleted.
>
>But the question is: after the build, the object file for the changed code
>is smaller than the object file for the unchanged code. As we increased
>the lines of code, the object file size should have increased, but here it
>is decreasing. Why?

The "line of code" is an incredibly silly measure of code complexity.
Except for preprocessor directives, many compilers will accept
putting all of the code on one line. (Although the compiler is not
required to accept long lines, many do anyway.)

Do not calculate the number of lines of code, especially not where
management can see it.

Dec 19 '07 #2

Kaz Kylheku

On Dec 17, 11:19 pm, jayapal <jayapal...@gmail.com> wrote:

Hi all,

We have a large code base in which we are fixing bugs. For every bug we
change the code, either adding or deleting some part of it. Mostly we
add code; very little is deleted.

But the question is: after the build, the object file for the changed code
is smaller than the object file for the unchanged code. As we increased
the lines of code, the object file size should have increased, but here it
is decreasing. Why?

I can add two lines of code that can drastically cut down the size of
an object file, provided that the compiler eliminates unreachable
basic blocks properly:

if (0) {

and, lower down:

}

:)
Maybe the code you are writing contains a lot of redundancy that the
compiler is able to recognize and eliminate. E.g. suppose you have two
functions that translate to exactly the same machine code. The
compiler can detect that and merge them into one function. (At the
object file level, one function can still be given two names).

This kind of squeezing could be done at the basic block level also,
not just on whole functions. Suppose you have a construct like:

if (condition()) {
    S1;
} else {
    S2;
}

S1 and S2 are different statements. Suppose you add something to S1
which makes it equivalent to S2. It means that the same logic is then
executed regardless of which way the condition goes:

if (condition()) {
    S2;
} else {
    S2;
}

The compiler could recognize the situation and simply reorganize the
code to:

condition(); /* necessary to call for any side effects */
S2;

And so adding lines to S1 which made it the same as S2 actually caused
the program to shrink.
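
To make that concrete, here is a minimal sketch in C. The names condition, before, after, after_merged, check_merge and the counter calls are made up for illustration, and whether a particular compiler really performs this merge depends on the compiler and its optimization settings. The sketch only shows that the longer source (after) is equivalent to the shorter merged form.

```c
#include <assert.h>

static int calls;            /* counts calls; stands in for a side effect */

/* Hypothetical predicate with a side effect, so the call cannot be dropped. */
static int condition(void)
{
    return ++calls % 2;
}

/* Before the edit: the two arms differ, so code for both is needed. */
int before(int x)
{
    if (condition()) {
        x = x * 2;
    } else {
        x = x * 2 + 1;
    }
    return x;
}

/* After the edit: one line was ADDED to the first arm, which makes
   both arms compute the same value. */
int after(int x)
{
    if (condition()) {
        x = x * 2;
        x = x + 1;
    } else {
        x = x * 2 + 1;
    }
    return x;
}

/* What an optimizer is permitted to reduce after() to. */
int after_merged(int x)
{
    condition();             /* still called for its side effect */
    return x * 2 + 1;
}

/* Checks that the longer source and the merged form agree. */
void check_merge(void)
{
    int x;
    for (x = -100; x <= 100; x++)
        assert(after(x) == after_merged(x));
}
```

Comparing the object code for before() and after() at a given optimization level (for example with a disassembler) is the way to see whether your own toolchain does this.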

Dec 19 '07 #3

Gordon Burditt wrote:
....

The "line of code" is an incredibly silly measure of code complexity.

OK. So what would you recommend to replace it, subject to the following
constraints:
1) It must be at least as easy to calculate as LOC.
2) It must be at least as free of subjective bias as LOC.
3) It must be at least as accurate a measure of code complexity as LOC.

From what you've said, constraint 3 should be easy to meet; how about
the other two?

Dec 19 '07 #4

James Kuyper said:

Gordon Burditt wrote:
...
>The "line of code" is an incredibly silly measure of code complexity.

OK. So what would you recommend to replace it, subject to the following
constraints:
1) It must be at least as easy to calculate as LOC.

That is unreasonable. If we want even a slight improvement on LOC, we
should be prepared to pay a little something for that improvement. Not
much, but a little. See below.

2) It must be at least as free of subjective bias as LOC.
3) It must be at least as accurate a measure of code complexity as LOC.

From what you've said, constraint 3 should be easy to meet; how about
the other two?

LOC is basically a count of newlines. Counting semicolons would be a
marginal improvement (for C code, anyway), and is just as quick to
calculate (but, unlike LOC, it does mean you'll have to cut some code).

I would argue that a semicolon count would approximate the complexity far
more accurately than a newline count (although of course it is still
pitifully inadequate).

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Dec 19 '07 #5

>The "line of code" is an incredibly silly measure of code complexity.

OK. So what would you recommend to replace it, subject to the following
constraints:
1) It must be at least as easy to calculate as LOC.
2) It must be at least as free of subjective bias as LOC.
3) It must be at least as accurate a measure of code complexity as LOC.

From what you've said, constraint 3 should be easy to meet; how about
the other two?

How about 42? It's much easier to calculate, and is free of
subjective bias.

Dec 20 '07 #6

James Kuyper wrote:

Gordon Burditt wrote:
...
>The "line of code" is an incredibly silly measure of code complexity.

OK. So what would you recommend to replace it, subject to the following
constraints:
1) It must be at least as easy to calculate as LOC.
2) It must be at least as free of subjective bias as LOC.
3) It must be at least as accurate a measure of code complexity as LOC.

From what you've said, constraint 3 should be easy to meet; how about
the other two?

Ah, obviously the two following snippets have widely differing
complexity:

if (i = foo(baz)) goo(flimdiddle);

and

i = foo(baz);
if (i)
{
goo(flimdiddle);
}

and the second is worth 5 times as much.

--
Merry Christmas, Happy Hanukah, Happy New Year
Joyeux Noel, Bonne Annee.
Chuck F (cbfalconer at maineline dot net)
<http://cbfalconer.home.att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Dec 20 '07 #7

Gordon Burditt wrote:

>>The "line of code" is an incredibly silly measure of code complexity.
OK. So what would you recommend to replace it, subject to the following
constraints:
1) It must be at least as easy to calculate as LOC.
2) It must be at least as free of subjective bias as LOC.
3) It must be at least as accurate a measure of code complexity as LOC.

From what you've said, constraint 3 should be easy to meet; how about
the other two?

How about 42? It's much easier to calculate, and is free of
subjective bias.

No that does not meet constraint 3. Claiming that it does would imply
that LOC is completely and absolutely unrelated to code complexity.
Exaggerations to the contrary notwithstanding, that's simply not true.

Dec 20 '07 #8

James Kuyper said:

<snip>

I'll concede Richard Heathfield's point that the complexity of C code
can probably be measured somewhat more accurately by a count of
semicolons than by a count of newline characters, which is obviously
equally easy to calculate (but applicable only to C code). However, it's
only a small increment in accuracy. The only way to significantly
improve on LOC as a measure requires a much more complicated algorithm.

Here's a further suggestion that, IMHO, significantly improves on LOC
without requiring a great deal of complexity.

(a) pre-process the source - this saves headaches with comments and
conditional compilation and stuff;
(b) now get counting:

* for each semicolon, count 1.
* for each left parenthesis, count 1.
* for each operator, count 1.
* for each instance of 'if', count 1.
* for each instance of 'for' or 'while', count 2.
* for each instance of 'case', count 1.
* for each instance of 'continue', count 1.
* for each instance of 'break', count 1.
* for each instance of 'goto', count 5.
* for each instance of 'setjmp' or 'longjmp', count 10.

(Adjust figures to taste.)

While this is a fair bit more work to implement than "count semicolons",
it's still not too bad, and could be done in a few minutes by any
reasonably competent C programmer.

BTW if you want to know complexity density rather than absolute complexity,
divide by LOC (or perhaps by file size) at the end.
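
The counting scheme above can be sketched roughly as follows. This is only one possible reading of it: the function name complexity_score, the operator table, and the decision to skip string and character literals are assumptions, not part of the suggestion. The input is expected to be already preprocessed, as in step (a), so comments are not handled. The sketch is deliberately crude: a '*' in a declaration and the sign in an exponent such as 1e-5 are both counted as operators.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

struct weight { const char *text; int score; };

/* Keyword weights taken from the list above. */
static const struct weight keywords[] = {
    {"if", 1}, {"for", 2}, {"while", 2}, {"case", 1}, {"continue", 1},
    {"break", 1}, {"goto", 5}, {"setjmp", 10}, {"longjmp", 10}
};

/* Assumed operator set, longest first so "<<=" is not read as "<<", "=". */
static const char *const operators[] = {
    "<<=", ">>=",
    "++", "--", "->", "<<", ">>", "<=", ">=", "==", "!=", "&&", "||",
    "+=", "-=", "*=", "/=", "%=", "&=", "|=", "^=",
    "+", "-", "*", "/", "%", "<", ">", "=", "!", "~", "&", "|", "^", "?"
};

#define COUNT_OF(a) (sizeof (a) / sizeof (a)[0])

/* Returns the weighted count for a string of preprocessed C source. */
int complexity_score(const char *src)
{
    int score = 0;

    assert(src != NULL);

    while (*src != '\0') {
        unsigned char c = (unsigned char)*src;

        if (isalpha(c) || c == '_') {             /* identifier or keyword */
            size_t len = 1, i;
            while (isalnum((unsigned char)src[len]) || src[len] == '_')
                len++;
            for (i = 0; i < COUNT_OF(keywords); i++)
                if (strlen(keywords[i].text) == len &&
                    strncmp(src, keywords[i].text, len) == 0)
                    score += keywords[i].score;
            src += len;
        } else if (isdigit(c)) {                  /* number: skip it */
            while (isalnum((unsigned char)*src) || *src == '.' || *src == '_')
                src++;
        } else if (c == '"' || c == '\'') {       /* literal: skip it */
            src++;
            while (*src != '\0' && (unsigned char)*src != c) {
                if (*src == '\\' && src[1] != '\0')
                    src++;                        /* skip escaped character */
                src++;
            }
            if (*src != '\0')
                src++;                            /* closing quote */
        } else if (c == ';' || c == '(') {
            score++;
            src++;
        } else {                                  /* operator or other char */
            size_t i, len = 1;
            for (i = 0; i < COUNT_OF(operators); i++) {
                size_t n = strlen(operators[i]);
                if (strncmp(src, operators[i], n) == 0) {
                    score++;
                    len = n;
                    break;
                }
            }
            src += len;
        }
    }
    return score;
}
```

For example, complexity_score("count = 0;") gives 2: one for the operator and one for the semicolon.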

--
Richard Heathfield

Dec 20 '07 #9

Ben Bacarisse

James Kuyper <ja*********@verizon.net> writes:

Gordon Burditt wrote:
...
>The "line of code" is an incredibly silly measure of code complexity.

OK. So what would you recommend to replace it, subject to the
following constraints:
1) It must be at least as easy to calculate as LOC.
2) It must be at least as free of subjective bias as LOC.
3) It must be at least as accurate a measure of code complexity as LOC.

From what you've said, constraint 3 should be easy to meet; how about
the other two?

Relaxing (1), I'd suggest simply counting constructs that make a
choice. Add 1 for every "if", "while" (including "do while"), "for"
and "switch". One would want to count something (1?) for every
function body so as to encourage lots of decomposition.

To get slightly more complex, I'd be inclined to add one for every
"else" that was not an "else if" and thus also for every switch case
except the first (since the switch already counts one). In such a
scheme I'd count a "while" (and probably a "for") as 2. You'd need to
take a view on "&&" and "||" which are "if"s in all but name.

I once wrote a program that did something like this for student programs (I
wanted to encourage simple solutions) but that had a "compounding"
metric: each construct had a score >1 and nesting multiplied the
scores. Functions (as a syntactic form) were "free", of course, but
they carried the score of their "insides".
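
As a rough illustration of the simplest version of this (add 1 for every "if", "while", "for" and "switch"), here is a sketch in C. The function names choice_count and is_choice_keyword are invented for the example. It does not attempt the per-function count, the "else"/"case" refinements, or the compounding metric, and it assumes comments have already been stripped. A "do while" counts once because only the "while" is counted.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the len-character word at s is a counted keyword. */
static int is_choice_keyword(const char *s, size_t len)
{
    static const char *const words[] = { "if", "while", "for", "switch" };
    size_t i;

    for (i = 0; i < sizeof words / sizeof words[0]; i++)
        if (strlen(words[i]) == len && strncmp(s, words[i], len) == 0)
            return 1;
    return 0;
}

/* Counts the constructs that make a choice in comment-free C source. */
int choice_count(const char *src)
{
    int count = 0;

    assert(src != NULL);

    while (*src != '\0') {
        unsigned char c = (unsigned char)*src;

        if (isalpha(c) || c == '_') {             /* whole identifier */
            size_t len = 1;
            while (isalnum((unsigned char)src[len]) || src[len] == '_')
                len++;
            count += is_choice_keyword(src, len);
            src += len;
        } else if (isdigit(c)) {                  /* number: skip it */
            while (isalnum((unsigned char)*src) || *src == '.')
                src++;
        } else if (c == '"' || c == '\'') {       /* literal: skip it */
            src++;
            while (*src != '\0' && (unsigned char)*src != c) {
                if (*src == '\\' && src[1] != '\0')
                    src++;                        /* skip escaped character */
                src++;
            }
            if (*src != '\0')
                src++;                            /* closing quote */
        } else {
            src++;                                /* anything else */
        }
    }
    return count;
}
```

Matching whole identifiers matters here: a variable called "iffy" or "format" must not be counted as an "if" or a "for".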

--
Ben.

Dec 20 '07 #10

Ben Bacarisse wrote:

James Kuyper <ja*********@verizon.net> writes:

>Gordon Burditt wrote:
...
>>The "line of code" is an incredibly silly measure of code complexity.
OK. So what would you recommend to replace it, subject to the
following constraints:
1) It must be at least as easy to calculate as LOC.
2) It must be at least as free of subjective bias as LOC.
3) It must be at least as accurate a measure of code complexity as LOC.

From what you've said, constraint 3 should be easy to meet; how about
the other two?

Relaxing (1), I'd suggest simply counting constructs that make a
choice. Add 1 for every "if", "while" (including "do while"), "for"
and "switch". One would want to count something (1?) for every
function body so as to encourage lots of decomposition.

To get slightly more complex, I'd be inclined to add one for every
"else" that was not an "else if" and thus also for every switch case
except the first (since the switch already counts one). In such a
scheme I'd count a "while" (and probably a "for") as 2. You'd need to
take a view on "&&" and "||" which are "if"s in all but name.

I once wrote a program that did something like this for student programs (I
wanted to encourage simple solutions) but that had a "compounding"
metric: each construct had a score >1 and nesting multiplied the
scores. Functions (as a syntactic form) were "free", of course, but
they carried the score of their "insides".

There's actually a formal metric which is based upon an algorithm
similar to the one you describe. I can't remember the name, though the
word "Cocomo" comes up when I'm thinking about it.

I was required by my company to take a course in software estimation
which explained such things. I found it less than useful. The techniques
they taught required data collection that took a lot more time than I
could easily afford. They required collecting of enough data to
calibrate the coefficients in the (assumed-to-be linear) relationship
between code complexity and development time. They had the built-in
assumption that a single programmer with a known constant productivity
would be assigned to a given task without interruption until that task
was complete.

They required me to know so much about a program to estimate its
complexity, that the program had to be almost completely written before
I could do the calculation needed to estimate how long it would take to
write; at that point, multiplying the currently expended time by 1.5
would have been a far more accurate estimate, and much easier to calculate.

Those assumptions disagreed with so many different aspects of the
reality of my group that I've never found those techniques useful. I
suspect that those techniques were intended to be used at an
organizational level significantly higher than mine (I currently manage
1.5 other people; before layoffs mandated by NASA budget cuts I managed
a maximum of 2.5 people).

Dec 20 '07 #11

santosh

Ben Bacarisse wrote:

James Kuyper <ja*********@verizon.net> writes:

>Gordon Burditt wrote:
...
>>The "line of code" is an incredibly silly measure of code
complexity.

OK. So what would you recommend to replace it, subject to the
following constraints:
1) It must be at least as easy to calculate as LOC.
2) It must be at least as free of subjective bias as LOC.
3) It must be at least as accurate a measure of code complexity as
LOC.

From what you've said, constraint 3 should be easy to meet; how about
the other two?

Relaxing (1), I'd suggest simply counting constructs that make a
choice. Add 1 for every "if", "while" (including "do while"), "for"
and "switch". One would want to count something (1?) for every
function body so as to encourage lots of decomposition.

To get slightly more complex, I'd be inclined to add one for every
"else" that was not an "else if" and thus also for every switch case
except the first (since the switch already counts one). In such a
scheme I'd count a "while" (and probably a "for") as 2. You'd need to
take a view on "&&" and "||" which are "if"s in all but name.

I once wrote a program that did something like this for student programs (I
wanted to encourage simple solutions) but that had a "compounding"
metric: each construct had a score >1 and nesting multiplied the
scores. Functions (as a syntactic form) were "free", of course, but
they carried the score of their "insides".

Interesting! Do goto, break, etc. count as a negative score?

Dec 20 '07 #12

pete

Richard Heathfield wrote:

>
James Kuyper said:

<snip>

I'll concede Richard Heathfield's point that the complexity of C code
can probably be measured somewhat more accurately by a count of
semicolons than by a count of newline characters, which is obviously
equally easy to calculate (but applicable only to C code). However, it's
only a small increment in accuracy. The only way to significantly
improve on LOC as a measure requires a much more complicated algorithm.

Here's a further suggestion that, IMHO, significantly improves on LOC
without requiring a great deal of complexity.

(a) pre-process the source - this saves headaches with comments and
conditional compilation and stuff;
(b) now get counting:

* for each semicolon, count 1.
* for each left parenthesis, count 1.
* for each operator, count 1.
* for each instance of 'if', count 1.
* for each instance of 'for' or 'while', count 2.
* for each instance of 'case', count 1.
* for each instance of 'continue', count 1.
* for each instance of 'break', count 1.
* for each instance of 'goto', count 5.
* for each instance of 'setjmp' or 'longjmp', count 10.

(Adjust figures to taste.)

While this is a fair bit more work
to implement than "count semicolons",
it's still not too bad, and could be done in a few minutes by any
reasonably competent C programmer.

BTW if you want to know complexity density
rather than absolute complexity,
divide by LOC (or perhaps by file size) at the end.

I like the semicolon count.

This construct:

do {
} while(--count != 0);

translates into only two opcodes in Microchip PIC assembly,
and that has a very reduced instruction set.

(1)(decrement, skip next instruction if result is zero)
(2)(jump to loop start)

--
pete

Dec 20 '07 #13

pete said:

<snip>

This construct:

do {
} while(--count != 0);

translates into only two opcodes in Microchip PIC assembly,
and that has a very reduced instruction set.

(1)(decrement, skip next instruction if result is zero)
(2)(jump to loop start)

...which is a bit silly, since it should reduce to one: MOV count, 0

In any case, target language complexity is generally not the issue. What
we're interested in is how complex the source code is, and:

do {
} while(--count != 0);

is considerably more complex than:

count = 0;

Using the rough n' ready guide I posted earlier, your code would score one
for the semicolon, one for the left paren, one for --, one for !=, and two
for 'while', making a score of 6 for that fragment, compared to 2 for
count = 0; - and that seems to me to be a reasonable reflection of the
added source level complexity of the (pointless) do-loop.

--
Richard Heathfield

Dec 20 '07 #14

pete

Richard Heathfield wrote:

>
pete said:

<snip>

This construct:

do {
} while(--count != 0);

translates into only two opcodes in Microchip PIC assembly,
and that has a very reduced instruction set.

(1)(decrement, skip next instruction if result is zero)
(2)(jump to loop start)

...which is a bit silly, since it should reduce to one: MOV count, 0

For possibly large initial values of count:

do {
/*
** This comment is meant to
** represent an arbitrary amount of useful code
*/
} while(--count != 0);

the non-commented portion of what could actually be useful code
translates into only two opcodes in Microchip PIC assembly.

--
pete

Dec 20 '07 #15

CBFalconer wrote:

>
Ah, obviously the two following snippets have widely differing
complexity:

if (i = foo(baz)) goo(flimdiddle);

and

i = foo(baz);
if (i)
{
goo(flimdiddle);
}

and the second is worth 5 times as much.

Not a meaningful example. LOC is a statistical measure: like all such
measures, you can't apply it to small population sizes. I _think_ we
can agree a 5 MLOC programme is probably less complex than a 50 MLOC one?

And sure, you can artificially inflate linecounts - but your fingers
will start to complain doing that on even a thousand-line programme, let
alone a million-liner...
--
Mark McIntyre

CLC FAQ <http://c-faq.com/>
CLC readme: <http://www.ungerhu.com/jxh/clc.welcome.txt>

Dec 20 '07 #16

jameskuyper

Mark McIntyre wrote:
.....

Not a meaningful example. LOC is a statistical measure: like all such
measures, you can't apply it to small population sizes. I _think_ we
can agree a 5 MLOC programme is probably less complex than a 50 MLOC one?

And sure, you can artificially inflate linecounts - but your fingers
will start to complain doing that on even a thousand-line programme, let
alone a million-liner...

It would be relatively straightforward to modify one of those pretty-C
programs to perform the inflation for you. If you're being paid by the
line, it could be an investment well worth the effort. :-)

Dec 20 '07 #17

ja*********@verizon.net wrote:

Mark McIntyre wrote:
....
>Not a meaningful example. LOC is a statistical measure: like all such
measures, you can't apply it to small population sizes. I _think_ we
can agree a 5 MLOC programme is probably less complex than a 50 MLOC one?

And sure, you can artificially inflate linecounts - but your fingers
will start to complain doing that on even a thousand-line programme, let
alone a million-liner...

It would be relatively straightforward to modify one of those pretty-C
programs to perform the inflation for you. If you're being paid by the
line, it could be an investment well worth the effort. :-)

Wouldn't work - everyone in the firm would use it, and the PHBs would
just rebase the payments scale... :-(

--
Mark McIntyre

Dec 20 '07 #18

Richard Heathfield wrote:

.... snip ...
Here's a further suggestion that, IMHO, significantly improves on
LOC without requiring a great deal of complexity.

(a) pre-process the source - this saves headaches with comments
and conditional compilation and stuff;
(b) now get counting:

* for each semicolon, count 1.
* for each left parenthesis, count 1.
* for each operator, count 1.
* for each instance of 'if', count 1.
* for each instance of 'for' or 'while', count 2.
* for each instance of 'case', count 1.
* for each instance of 'continue', count 1.
* for each instance of 'break', count 1.
* for each instance of 'goto', count 5.
* for each instance of 'setjmp' or 'longjmp', count 10.

(Adjust figures to taste.)

A simpler method is to use a suitable reference compiler (say gcc
3.2.1) and compile for pure code (i.e. no debuggery, no symbol
names, etc.) on a given platform (say the X86). Also eliminate any
optimization. Now count bytes of generated code, ignoring
relocation tables.

--
Chuck F (cbfalconer at maineline dot net)

Dec 20 '07 #19

Mark McIntyre wrote:

CBFalconer wrote:

>Ah, obviously the two following snippets have widely differing
complexity:

if (i = foo(baz)) goo(flimdiddle);
and
i = foo(baz);
if (i)
{
goo(flimdiddle);
}

and the second is worth 5 times as much.

Not a meaningful example. LOC is a statistical measure: like all
such measures, you can't apply it to small population sizes. I
_think_ we can agree a 5 MLOC programme is probably less complex
than a 50 MLOC one?

And sure, you can artificially inflate linecounts - but your
fingers will start to complain doing that on even a thousand-line
programme, let alone a million-liner...

And you are claiming that you don't see the equivalent of my second
example every day or more?

--
Chuck F (cbfalconer at maineline dot net)

Dec 20 '07 #20

Default User

Mark McIntyre wrote:

ja*********@verizon.net wrote:

It would be relatively straightforward to modify one of those
pretty-C programs to perform the inflation for you. If you're being
paid by the line, it could be an investment well worth the effort.
:-)

Wouldn't work - everyone in the firm would use it, and the PHBs would
just rebase the payments scale... :-(

Well, duh, you don't TELL cow-orkers about it.

Brian

Dec 20 '07 #21

CBFalconer wrote:

Mark McIntyre wrote:
>CBFalconer wrote:

>>Ah, obviously the two following snippets have widely differing
complexity:

if (i = foo(baz)) goo(flimdiddle);
and
i = foo(baz);
if (i)
{
goo(flimdiddle);
}

and the second is worth 5 times as much.
Not a meaningful example. LOC is a statistical measure: like all
such measures, you can't apply it to small population sizes. I
_think_ we can agree a 5 MLOC programme is probably less complex
than a 50 MLOC one?

And sure, you can artificially inflate linecounts - but your
fingers will start to complain doing that on even a thousand-line
programme, let alone a million-liner...

And you are claiming that you don't see the equivalent of my second
example every day or more?

I'm saying that the examples were statistically meaningless. If you
sampled the codebase of any large programme at random, some samples will
be more spaced out than others, as a result of programmer preference,
developments in house style and probably editor technology. Two random
snippets won't tell you anything useful however.

By the way I write in style 2 pretty much exclusively, to improve
clarity. Don't you?

--
Mark McIntyre

Dec 20 '07 #22

jacob navia

Mark McIntyre wrote:

CBFalconer wrote:
>Mark McIntyre wrote:
>>CBFalconer wrote:

Ah, obviously the two following snippets have widely differing
complexity:

if (i = foo(baz)) goo(flimdiddle);
and
i = foo(baz);
if (i)
{
goo(flimdiddle);
}

and the second is worth 5 times as much.
Not a meaningful example. LOC is a statistical measure: like all
such measures, you can't apply it to small population sizes. I
_think_ we can agree a 5 MLOC programme is probably less complex
than a 50 MLOC one?

And sure, you can artificially inflate linecounts - but your
fingers will start to complain doing that on even a thousand-line
programme, let alone a million-liner...

And you are claiming that you don't see the equivalent of my second
example every day or more?

I'm saying that the examples were statistically meaningless. If you
sampled the codebase of any large programme at random, some samples will
be more spaced out than others, as a result of programmer preference,
developments in house style and probably editor technology. Two random
snippets won't tell you anything useful however.

By the way I write in style 2 pretty much exclusively, to improve
clarity. Don't you?

What is clarity?

It is being able to see more of the program on the screen at a given
time without having to scroll down or up...
Verbose programs that go on for lines and lines need constant scrolling to
read them.

With less verbosity and unnecessary white space, you can see
the whole function on the screen in a single page, which does much more
to improve things than a verbose line style.

Like anything, this rule can be perverted too, and you get obfuscated
code.

But

if (i=foo(baz)) goo(sss);

is perfectly clear and readable to me...
--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32

Dec 20 '07 #23

santosh

jacob navia wrote:

Mark McIntyre wrote:
>CBFalconer wrote:
>>Mark McIntyre wrote:
CBFalconer wrote:

Ah, obviously the two following snippets have widely differing
complexity:
>
if (i = foo(baz)) goo(flimdiddle);
and
i = foo(baz);
if (i)
{
goo(flimdiddle);
}
>
and the second is worth 5 times as much.
Not a meaningful example. LOC is a statistical measure: like all
such measures, you can't apply it to small population sizes. I
_think_ we can agree a 5 MLOC programme is probably less complex
than a 50 MLOC one?

And sure, you can artificially inflate linecounts - but your
fingers will start to complain doing that on even a thousand-line
programme, let alone a million-liner...

And you are claiming that you don't see the equivalent of my second
example every day or more?

I'm saying that the examples were statistically meaningless. If you
sampled the codebase of any large programme at random, some samples
will be more spaced out than others, as a result of programmer
preference, developments in house style and probably editor
technology. Two random snippets won't tell you anything useful
however.

By the way I write in style 2 pretty much exclusively, to improve
clarity. Don't you?

What is clarity?

It's a measure of how quickly and easily a programmer reasonably
competent in the language can understand the purport of a piece of
code.

It is being able to see more of the program on the screen at a given
time without having to scroll down or up...
Verbose programs that go on for lines and lines need constant scrolling
to read them.

Either extreme defeats clarity. Excessively terse code makes you waste
time figuring it out while excessively verbose code makes you waste
time ploughing through it.

With less verbosity and unnecessary white space, you can see
the whole function on the screen in a single page, which does much
more to improve things than a verbose line style.

Like anything, this rule can be perverted too, and you get obfuscated
code.

But

if (i=foo(baz)) goo(sss);

is perfectly clear and readable to me...

Yes. This particular construct is a common idiom and can be quickly
recognised. But imagine hundreds of thousands of lines like this, all
cramped together. Some vertical space is okay.

I'd prefer:

if (i = foo(baz))
    goo(flimdiddle);

Dec 20 '07 #24

>Ah, obviously the two following snippets have widely differing
>complexity:

if (i = foo(baz)) goo(flimdiddle);

and

i = foo(baz);
if (i)
{
goo(flimdiddle);
}

and the second is worth 5 times as much.

Not a meaningful example. LOC is a statistical measure: like all such
measures, you can't apply it to small population sizes. I _think_ we
can agree a 5 MLOC programme is probably less complex than a 50 MLOC one?

No: not when there are simple "LOC inflater" algorithms like
"replace every newline with 1 trillion newlines".

Remember, use of a metric for performance evaluations or payment
for work alters the result of the work next time. You need to
assume people will try to "cheat", especially if it's not illegal.

>And sure, you can artificially inflate linecounts - but your fingers
will start to complain doing that on even a thousand-line programme, let
alone a million-liner...

Not when you have a computer do it for you.

You might get a metric that is a little harder to cheat with
a definition like:
- Count the number of lines of code *after* running it
through GNU indent with specified command-line options.
- Lines that are empty or consist entirely of whitespace or
comments or parts of comments or empty statements do not count.
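
A rough sketch of the second rule in C, assuming the source has already been run through GNU indent (that step is not shown, and no particular indent options are implied). The function name count_code_lines is invented for the example, and handling of // comments and of string literals is an addition not spelled out in the rule above.

```c
#include <assert.h>
#include <stddef.h>

/* Counts lines holding something other than whitespace, comments,
   or bare semicolons (empty statements). */
int count_code_lines(const char *src)
{
    int lines = 0;
    int in_comment = 0;    /* inside a block comment, may span lines */
    int has_code = 0;      /* current line holds something that counts */
    char quote = '\0';     /* delimiter of an open literal, if any */
    size_t i;

    assert(src != NULL);

    for (i = 0; ; i++) {
        char c = src[i];

        if (c == '\n' || c == '\0') {              /* end of a line */
            lines += has_code;
            has_code = 0;
            quote = '\0';                          /* literals end here */
            if (c == '\0')
                break;
        } else if (in_comment) {
            if (c == '*' && src[i + 1] == '/') {
                in_comment = 0;
                i++;                               /* skip the '/' */
            }
        } else if (quote != '\0') {
            if (c == '\\' && src[i + 1] != '\0' && src[i + 1] != '\n')
                i++;                               /* skip escaped char */
            else if (c == quote)
                quote = '\0';
        } else if (c == '/' && src[i + 1] == '*') {
            in_comment = 1;
            i++;                                   /* skip the '*' */
        } else if (c == '/' && src[i + 1] == '/') {
            while (src[i + 1] != '\0' && src[i + 1] != '\n')
                i++;                               /* rest of line */
        } else if (c == '"' || c == '\'') {
            quote = c;
            has_code = 1;
        } else if (c != ' ' && c != '\t' && c != '\r' && c != ';') {
            has_code = 1;
        }
    }
    return lines;
}
```

Note that a line holding only a brace still counts under this rule, so the result depends heavily on the brace style the formatter is told to use.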

Dec 21 '07 #25

>>>The "line of code" is an incredibly silly measure of code complexity.

>>OK. So what would you recommend to replace it, subject to the following
constraints:
1) It must be at least as easy to calculate as LOC.
2) It must be at least as free of subjective bias as LOC.
3) It must be at least as accurate a measure of code complexity as LOC.

From what you've said, constraint 3 should be easy to meet; how about
the other two?

How about 42? It's much easier to calculate, and is free of
subjective bias.

No that does not meet constraint 3. Claiming that it does would imply
that LOC is completely and absolutely unrelated to code complexity.

I claim that LOC is completely and absolutely unrelated to code
complexity *written by a programmer who knows what your metric is
and who knows s/he will be paid according to the metric*.

>Exaggerations to the contrary notwithstanding, that's simply not true.

Dec 21 '07 #26

Ian Collins

jacob navia wrote:

With less verbosity and unnecessary white space, you can see
the whole function on the screen in a single page, which does much more
to improve things than a verbose line style.

If you have to scroll to see a whole function on any half decent
display, your function is too long.

--
Ian Collins.

Dec 21 '07 #27

santosh

Gordon Burditt wrote:

<snip about code complexity>

You might get a metric that is a little harder to cheat with
a definition like:
- Count the number of lines of code *after* running it
through GNU indent with specified command-line options.
- Lines that are empty or consist entirely of whitespace or
comments or parts of comments or empty statements do not count.

This is a good, simple way to quantify code volume, but it has nothing to
do with code complexity unless you claim volume and complexity are
directly proportional.

Complexity is far more subjective than a mere count of code statements
etc. What may be "complex" to one programmer would be routine stuff for
another.

Dec 21 '07 #28

Gordon Burditt wrote:
....

I claim that LOC is completely and absolutely unrelated to code
complexity *written by a programmer who knows what your metric is
and who knows s/he will be paid according to the metric*.

The times when I've been asked by upper management to give them line
counts, it had nothing to do with determining how much anyone would get
paid, and the code had not been written with the expectation that there
would be any such a connection.

In my project, how well the program works has a lot more to do with
anyone's pay level than how big it is. If anything, a large size might
count (VERY slightly) against the programmer, by implying that the
program was overly complex for the tasks it needed to perform.

How well the program works is, in turn, less important than meeting
deadlines, as long as the code does meet its minimum requirements by
the time the deadline arrives. Since the requirements are more easily
negotiable than deadlines, this means that meeting deadlines is the
single most important thing. I'm not happy with that fact, but I can
understand why meeting deadlines is a high priority, when several dozen
scientific teams from around the world are waiting for our programs to
work correctly before they can test whether theirs are working correctly.

Dec 21 '07 #29

On Fri, 21 Dec 2007 13:25:54 +0000, James Kuyper wrote:

How well the program works is, in turn, less important than meeting
deadlines,

How horribly true that is

>as long as the code does meet its minimum requirements by
the time the deadline arrives.

or thereabouts... :-(

You missed out the first priority, of course: coming in on budget...

Dec 21 '07 #30

Ian Collins wrote:

CBFalconer wrote:

.... snip ...

>
>Just to avoid misconceptions, here is my recommended bit:

if (i = foo(baz)) goo(flimdiddle);

which is quite obvious. goo is called if i is nonzero. There
are adequate blanks to separate ids from operators, etc.

All fine and dandy until you want to set a break/watch point on
that particular call to goo().

I don't. :-)

--
Merry Christmas, Happy Hanukah, Happy New Year
Joyeux Noel, Bonne Annee.
Chuck F (cbfalconer at maineline dot net)
<http://cbfalconer.home.att.net>
--
Posted via a free Usenet account from http://www.teranews.com

Dec 23 '07 #31

Richard

CBFalconer <cb********@yahoo.com> writes:

Ian Collins wrote:
>CBFalconer wrote:

... snip ...
>>
>>Just to avoid misconceptions, here is my recommended bit:

if (i = foo(baz)) goo(flimdiddle);

which is quite obvious. goo is called if i is nonzero. There
are adequate blanks to separate ids from operators, etc.

All fine and dandy until you want to set a break/watch point on
that particular call to goo().

I don't. :-)

Which is why it's the general consensus that your code layout is
atrocious and not to be used as a paragon of good style. Having multiple
statements on one line like this makes code very hard to read and
debug. This is never, ever a good idea.

    if (i = foo(baz))
        goo(flimdiddle);

Shows immediately at a casual glance without even reading the code in
detail that a logic condition can cause a subroutine call and the
maintainer might want to keep an eye there. But this has been discussed
ad infinitum.
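A small sketch of the two-line layout in compilable form. foo() and goo() are hypothetical stand-ins for the functions in the thread, goo_calls exists only so the effect can be observed, and the explicit != 0 is an assumption added to make the intentional assignment obvious to readers and compilers.

```c
int goo_calls;                                /* illustration only: counts goo() calls */

static int foo(int baz) { return baz * 2; }   /* hypothetical stand-in */
static void goo(int x) { (void)x; goo_calls++; } /* hypothetical stand-in */

int demo(int baz, int flimdiddle)
{
    int i;

    if ((i = foo(baz)) != 0)   /* assign first, then test explicitly */
        goo(flimdiddle);       /* own line: a breakpoint set here is hit
                                  only when the condition was true */
    return i;
}
```

With the call on its own line, a line-based debugger can stop on goo() without also stopping every time the condition is evaluated.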

Dec 23 '07 #32

On Fri, 21 Dec 2007 19:09:49 -0500, CBFalconer wrote:

Ian Collins wrote:
>CBFalconer wrote:

>>Just to avoid misconceptions, here is my recommended bit:

if (i = foo(baz)) goo(flimdiddle);

All fine and dandy until you want to set a break/watch point on that
particular call to goo().

I don't. :-)

Which is nice for you, but a bit of a nightmare for the poor maintenance
droid who has to disentangle your code. I'm all for compactness, but not
at the expense of either ease of maintenance or clarity for less expert
programmers. Writing dense code just for the sake of it smacks of, well,
elitism.

Dec 23 '07 #33

Mark McIntyre wrote:

CBFalconer wrote:
>Ian Collins wrote:
>>CBFalconer wrote:

Just to avoid misconceptions, here is my recommended bit:

if (i = foo(baz)) goo(flimdiddle);

All fine and dandy until you want to set a break/watch point on
that particular call to goo().

I don't. :-)

Which is nice for you, but a bit of a nightmare for the poor
maintenance droid who has to disentangle your code. I'm all for
compactness, but not at the expense of either ease of maintenance
or clarity for less expert programmers. Writing dense code just
for the sake of it smacks of, well, elitism.

Ah, but the debugger never crosses my mind, because I virtually
never use them. If it is of real use to the end user, he is quite
welcome to insert a <return> before the goo call and recompile.

My objective is to make the source code readable and sensible. It
is something like paragraphing prose. Every so often a totally
blank line appears to separate thoughts. More often the separation
is done by breaking into functions. The inline directive is very
handy for combining this with minimum function call overhead.
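As a rough illustration of that last point, here is a sketch of splitting a small step into its own function and marking it inline (C99). The names is_blank and skip_blanks are hypothetical, and inline is only a hint: the compiler remains free to emit an ordinary call.

```c
#include <stddef.h>

/* Small helper kept separate for readability; 'static inline' asks the
 * compiler to avoid call overhead, but does not guarantee it. */
static inline int is_blank(char c)
{
    return c == ' ' || c == '\t';
}

/* Return the index of the first non-blank character at or after i. */
size_t skip_blanks(const char *s, size_t i)
{
    while (is_blank(s[i]))
        i++;
    return i;
}
```

The loop stops at the terminating '\0' because is_blank() returns 0 for it.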

--
Merry Christmas, Happy Hanukah, Happy New Year
Joyeux Noel, Bonne Annee.
Chuck F (cbfalconer at maineline dot net)
<http://cbfalconer.home.att.net>


Dec 24 '07 #34

Richard

CBFalconer <cb********@yahoo.com> writes:

Mark McIntyre wrote:
>CBFalconer wrote:
>>Ian Collins wrote:
CBFalconer wrote:

Just to avoid misconceptions, here is my recommended bit:
>
if (i = foo(baz)) goo(flimdiddle);

All fine and dandy until you want to set a break/watch point on
that particular call to goo().

I don't. :-)

Which is nice for you, but a bit of a nightmare for the poor
maintenance droid who has to disentangle your code. I'm all for
compactness, but not at the expense of either ease of maintenance
or clarity for less expert programmers. Writing dense code just
for the sake of it smacks of, well, elitism.

Ah, but the debugger never crosses my mind, because I virtually
never use them. If it is of real use to the end user, he is quite
welcome to insert a <return> before the goo call and recompile.

My objective is to make the source code readable and sensible. It

Which, if you read the comments, you have failed to do.

Dec 24 '07 #35

Richard wrote:

CBFalconer <cb********@yahoo.com> writes:

>Mark McIntyre wrote:
>>CBFalconer wrote:
Ian Collins wrote:
CBFalconer wrote:
>
>Just to avoid misconceptions, here is my recommended bit:
>>
> if (i = foo(baz)) goo(flimdiddle);

....

>My objective is to make the source code readable and sensible. It

Which, if you read the comments, you have failed to do.

He read the comments; he just has his own inscrutable sense of what
makes code readable which is different from yours and mine.

Dec 24 '07 #36

Randy Howard

On Thu, 20 Dec 2007 07:07:51 -0600, James Kuyper wrote
(in article <HOtaj.28849$JW4.1654@trnddc05>):

I'll concede Richard Heathfield's point that the complexity of C code
can probably be measured somewhat more accurately by a count of
semicolons than by a count of newline characters, which is obviously
equally easy to calculate (but applicable only to C code). However, it's
only a small increment in accuracy. The only way to significantly
improve on LOC as a measure requires a much more complicated algorithm.

Even a pathetically bad C programmer (like one that thinks "C/C++" is a
language) could come up with

/*
 * int some_function(void)
 *
 * ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 * Calculate something, taking no arguments, and return 0 on
 * success, and 1 on failure.
 * ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 */
So, the measurement would either be completely worthless, and open to
abuse just like the newline one, or much smarter in how it measures.

Even so, the "complexity" of the code is not, imo, directly related to
the number of ; chars as actual ends of C code 'lines', as complexity
is far more than just the amount of it, but what each line contains.

--
Randy Howard (2reply remove FOOBAR)
"The power of accurate observation is called cynicism by those
who have not got it." - George Bernard Shaw

Dec 24 '07 #37

Randy Howard

On Thu, 20 Dec 2007 16:01:44 -0600, Default User wrote
(in article <5t*************@mid.individual.net>):

Mark McIntyre wrote:

>ja*********@verizon.net wrote:

>>It would be relatively straightforward to modify one of those
pretty-C programs to perform the inflation for you. If you're being
paid by the line, it could be an investment well worth the effort.
:-)

Wouldn't work - everyone in the firm would use it, and the PHBs would
just rebase the payments scale... :-(

Well, duh, you don't TELL cow-orkers about it.

The first rule of Code Club, is that we don't talk about Code Club.
--
Randy Howard (2reply remove FOOBAR)
"The power of accurate observation is called cynicism by those
who have not got it." - George Bernard Shaw

Dec 24 '07 #38

santosh

Randy Howard wrote:

On Thu, 20 Dec 2007 07:07:51 -0600, James Kuyper wrote
(in article <HOtaj.28849$JW4.1654@trnddc05>):

>I'll concede Richard Heathfield's point that the complexity of C code
can probably be measured somewhat more accurately by a count of
semicolons than by a count of newline characters, which is obviously
equally easy to calculate (but applicable only to C code). However,
it's only a small increment in accuracy. The only way to
significantly improve on LOC as a measure requires a much more
complicated algorithm.

Even a pathetically bad C programmer (like one that thinks "C/C++" is
a language) could come up with

/*
 * int some_function(void)
 *
 * ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 * Calculate something, taking no arguments, and return 0 on
 * success, and 1 on failure.
 * ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 */
So, the measurement would either be completely worthless, and open to
abuse just like the newline one, or much smarter in how it measures.

Presumably comments would be skipped.
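A minimal sketch of what "skipping comments" might look like for a semicolon count. The function name count_semicolons is hypothetical; the sketch also ignores semicolons inside string and character literals, does not handle preprocessor tricks or line continuations, and still counts empty statements.

```c
#include <stddef.h>

/* Count ';' characters that appear in code, ignoring those inside
 * comments, string literals and character literals. */
size_t count_semicolons(const char *src)
{
    enum { CODE, BLOCK_COMMENT, LINE_COMMENT, STRING, CHARLIT } state = CODE;
    size_t count = 0;

    for (const char *p = src; *p != '\0'; p++) {
        switch (state) {
        case CODE:
            if (p[0] == '/' && p[1] == '*') { state = BLOCK_COMMENT; p++; }
            else if (p[0] == '/' && p[1] == '/') { state = LINE_COMMENT; p++; }
            else if (*p == '"') state = STRING;
            else if (*p == '\'') state = CHARLIT;
            else if (*p == ';') count++;
            break;
        case BLOCK_COMMENT:
            if (p[0] == '*' && p[1] == '/') { state = CODE; p++; }
            break;
        case LINE_COMMENT:
            if (*p == '\n') state = CODE;
            break;
        case STRING:
        case CHARLIT:
            if (*p == '\\' && p[1] != '\0')
                p++;                          /* skip the escaped character */
            else if (*p == (state == STRING ? '"' : '\''))
                state = CODE;                 /* closing quote */
            break;
        }
    }
    return count;
}
```

A banner of semicolons inside a comment contributes nothing to the total, while a run of semicolons after a real statement is still counted in full.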

Dec 25 '07 #39

/*
 * int some_function(void)
 *
 * ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 * Calculate something, taking no arguments, and return 0 on
 * success, and 1 on failure.
 * ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 */
So, the measurement would either be completely worthless, and open to
abuse just like the newline one, or much smarter in how it measures.

Presumably comments would be skipped.

#include <stdio.h>

int main(void)
{
    printf("Hello, world\n");;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    return 0;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
}

My patch to this hole would be "run the source through GNU indent
with a specified and fixed list of options, remove comments, remove
empty lines and lines consisting only of whitespace, then count
lines". Since GNU indent knows how to parse a good portion of C,
this is getting a lot more complicated.

An immediate problem with this so-called "fix" is that you can insert
goto X; X:
before any statement that doesn't already have a label on it, provided
that you generate labels that aren't already in use. And you can do it
repeatedly.

Other things you can insert before a statement:

while (0);
if (VARIABLE) { ; } else { ; }
(where VARIABLE is declared, in scope, and of integer or pointer type).
do { ; } while (0);
for (;0;) { ; }

It's going to take fairly complicated logic (of the same level as a compiler
optimizer) to determine that code like:

int n;

if(n) { goto k22571; k22571: for(;0;) { while(0); } } else { do { ; } while (0); }

is completely useless.
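To make the point concrete, here is a sketch with two functions that compute the same result, one of them padded with the do-nothing statements listed above. The names plain, padded and pad1 are hypothetical; a formatter-plus-line-count metric would score padded() several lines higher even though it does no extra work.

```c
/* The honest version: one statement. */
int plain(int n)
{
    return n + 1;
}

/* Same behaviour, inflated with statements that have no effect. */
int padded(int n)
{
    goto pad1; pad1:           /* jump to the very next statement */
    while (0);                 /* loop that never runs */
    if (n) { ; } else { ; }    /* both branches are empty */
    do { ; } while (0);        /* runs an empty body once */
    for (;0;) { ; }            /* loop that never runs */
    return n + 1;
}
```

Both functions return the same value for every input; telling them apart automatically needs analysis closer to a compiler's than to a line counter's.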

Dec 25 '07 #40