C# is a good way to learn C

> Hi Olaf_Baeyens, You didn't say anything about my example of LoopTo(),
> so I presume that you don't care about it,
> but, at any rate, I recently fixed a bug in LoopTo().

Actually no time to check it all.

When I program in unmanaged C++ as I have done the last few months, I get a
lot of spare time waiting for the compiler to compile, so I can read and
react. But now since I am back to .NET both unmanaged C++ and C# it compiles
so fast that I don't have time anymore to read it all.

It is amazing the difference of using brain power when I program in C++ or
C#.
Programming unmanaged C++ is like doing your examination, when I come home
then I am burned up. But now that I work with C#, not only do I have a lot
of energy when I get home, I also can do more thinking about the program I
am busy with. I have a much broader view of what the program will do and
what direction I am going to take. Thus becoming far more productive.

You could say that C++ is hurting my eyes because I have to focus on the
very complicated keyword stuff and macros to make my program work. While
in C# the syntax is so easily readable that I can sit back and look at the
whole picture in a very relaxed state. Which is very odd since I have years
of C++ experience and very limited C# experience.

Programming C++ is like driving through a fog, you get tired very fast.

An example (I don't have time to make it the same example, just a quick
example)

This is the C# way:

override public System.String Text {
    set {
        Title.Text = value;
        Title.Invalidate(true);
    }
    get { return Title.Text; }
}

------------------
This is the managed C++ way.

public: __property virtual System::String __gc * get_Extension() {
    return m_pDataset->Extension.c_str();
}
public: __property virtual void set_Extension(System::String __gc *asDataSetExt) {
    m_pDataset->Extension=LPCTSTR((char*)(void*)...
}

------------------
And this is the unmanaged C++ way;

public: __declspec(property(get=get_Prefix, put=set_Prefix)) string Prefix;
public: virtual string get_Prefix() {
    return m_sScanFilePrefix;
}
public: virtual void set_Prefix(string aValue) {
    m_sScanFilePrefix=aValue;
}

-----------------

Jul 21 '05 #253

Jeff_Relf

Hi Olaf_Baeyens ( and Greg_Cox ),

It remains to be seen whether Visual_Studio itself, for example,
would ever run under the Dot_Net_Program.

Further, even if that happened,
it remains to be seen if it would become a success or not.
Coders could always use a debugger written in C instead.

Imagine a debugger running under the Dot_Net_Program ! ?

Remember, Dot_Net is an untested pig which forbids #define.
C goes back to 1970... Firefox is written in C++, Linux in C.

Jul 21 '05 #254

Jeff_Relf

Hi Olaf, Re: The code you showed me,

I didn't understand it.
I've found that it helps if you include a main() to test your examples.

Further, although I have Visual_Studio_2003,
and therefore I can compile and run managed code,
I'm not inclined to do so.

You told me: << It is amazing
the difference of using brain power when I program in C++ or C#.
Programming unmanaged C++ is like doing your examination,
when I come home then I am burned up.
But now that I work with C#,
not only do I have a lot of energy when I get home,
I also can do more thinking about the program I am busy with.
I have a much broader view of
what the program will do and what direction I am going to take.
Thus becoming far more productive.

You could say that C++ is hurting my eyes because I have to focus on
the very complicated keyword stuff and macros to make my program work.
While in C# the syntax is so easily readable that
I can sit back and look at the whole picture in a very relaxed state.
Which is very odd since
I have years of C++ experience and very limited C# experience.

Programming C++ is like driving through a fog,
you get tired very fast. >>

I'm glad you're enjoying the Dot_Net_Program,
but I won't use it because it doesn't allow #define.
Further, I've never seen anything in Dot_Net that interested me.

Jul 21 '05 #255

Jeff_Relf

Hi Greg_Cox, Re: VS's block_select and quick_macros, You told me: <<
I'll have to remember both of those. Thanks. >>

You're welcome, of course.
I had been wondering for a long time what debugger MicroSoft-proper used.
I'm very glad to hear it was Visual_Studio.

Jul 21 '05 #256

> I didn't understand it.
> I've found that it helps if you include a main() to test your examples.

I tried to show the readability between the different languages and
operation modes.
The C# syntax language doesn't make me go blind. ;-) Much easier to read,
and thus understand.

> I'm glad you're enjoying the Dot_Net_Program,
> but I won't use it because it doesn't allow #define.
> Further, I've never seen anything in Dot_Net that interested me.

It is your choice.
It has a lot of very interesting stuff. So many things, so little time.

Jul 21 '05 #257

Jeff_Relf

Hi Greg_Cox ( and Olaf_Baeyens ),

I used MS_C_6 for years, I'm using only MS_CPP_7_1 right now,
Visual_Studio_Net_2003... but I have no plans to use managed code.

I switched to C++ from C so that I could declare variables
without braces, {}, and do & referencing, instead of pointers.
For example: { func(); int Y, & X = Y ; func(); }

I just like the way it looked and porting my code was no big deal.

The C99 specs allows that stuff, but MicroSoft's C doesn't.
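
A minimal sketch of what C99 does and does not allow here, assuming a C99-conforming compiler. Declarations may follow statements inside a block, but references ("& X = Y") remain C++ only, so a pointer stands in for the reference. The names func, calls and mixed_declarations are illustrative, not from the post.

```c
/* Hypothetical stand-in for the func() used in the one-line example. */
static int calls = 0;
static void func(void) { calls++; }

/* C99 permits a declaration after a statement in the same block. */
int mixed_declarations(void)
{
    func();
    int y = 0;      /* declared after a statement: valid C99, rejected by C89 */
    int *x = &y;    /* C has no references; a pointer plays the role of "& X = Y" */
    *x = 42;        /* writes through to y */
    func();
    return y;
}
```

Calling mixed_declarations() returns the value written through the pointer and leaves calls incremented twice.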

Nevertheless, C++'s String, cout and the STL have never appealed to me.
I like to keep a closer eye on allocations/deallocations,
and I have my own way of handling dynamic arrays.

Jul 21 '05 #258

Linønut

Olaf Baeyens poked his little head through the XP firewall and said:

And this is the unmanaged C++ way;

public: __declspec(property(get=get_Prefix, put=set_Prefix)) string Prefix;
public: virtual string get_Prefix() {
return m_sScanFilePrefix;
};
public: virtual set_Prefix(string aValue) {
m_sScanFilePrefix=aValue;
}

That isn't C++, either.

--
When all you have is a hammer, everything looks like a nail.

Jul 21 '05 #259

> > And this is the unmanaged C++ way;
> >
> > public: __declspec(property(get=get_Prefix, put=set_Prefix)) string Prefix;
> > public: virtual string get_Prefix() {
> > return m_sScanFilePrefix;
> > };
> > public: virtual set_Prefix(string aValue) {
> > m_sScanFilePrefix=aValue;
> > }
>
> That isn't C++, either.

According to VC++ 2003 it is accepted. ;-)
But maybe not standard.

Jul 21 '05 #260

Linønut

Greg Cox poked his little head through the XP firewall and said:

In article <p4********************@comcast.com>,
li*********@bone.com says...
Greg Cox poked his little head through the XP firewall and said:
> That might be true if Microsoft hadn't released a .NET version of Visual
> C++. With that they can immediately reuse 95%+ of the existing code
> base and integrate any desired .NET features for the next version.

Yeah, the same way you can "integrate" C code into C++ projects.

Well, yea. What do you think C++ is? Or, in other words, do you think
that the formula handling code in Excel that is written in C++ has to be
completely rewritten to make Excel .NET compatible?

No. Except you softies sound like you are taking a risk leaving the C++
code and memory leaks in place.

--
When all you have is a hammer, everything looks like a nail.

Jul 21 '05 #261

Ernest Borgnine

Greg Cox wrote:

Given my eighteen years at Microsoft as direct experience, what you
describe sounds very likely how they will move Office into the future.

I don't think Microsoft is going to make it in the next 5 years.

I expect huge declines in their revenues as their Office product succumbs
to Open Office.

The Australians are already cutting their food supply.

Jul 21 '05 #262

chrisv

Sean Hederman wrote:

Oh, and chrisv, I agree wholeheartedly with you about Jeff, so...

Cheers Jeff.

And no, I'm not running away, it's not worth my time to try and educate
someone without the ability to see their mistakes. Carry on in your happy
world, assured that you're the greatest programmer the world has ever seen.
And those of us who can actually write *useful* and easily maintainable
programs will continue to get better jobs, whilst you sit in your safe
little 13 year tenure.

*plonk*

I knew you had potential. 8)

Jul 21 '05 #263

chrisv

Lee Marvin wrote:

Jeff_Relf wrote:

*plonk*

Jul 21 '05 #264

DM McGowan II

Tom Shelton wrote:

On 2005-04-06, Linønut <linønu*@bone.com> wrote:

tab poked his little head through the XP firewall and said:

I still think C++ is the ultimate language. Enough rope to shoot off your
foot and then build a new one. <grin> Why jack off with Java when you
can get pure programming.

That is funny. I have, at work, now beat C++ programmers, 4 times,
in getting the job done. And my solutions are still in place.
The C++ guys were finally laid off. They took too long to finish,
and spent all their time debating what is faster.

Big deal. They sound like doofuses. They probably don't even know how to
use the STL. You could probably beat them just as well if you used straight
C.

I have now also seen a C# guy kick the hell out of a C++ guy in building
GUI's to Databases. You know, the most common business stuff.

I don't do GUIs at the moment. Other people do it here, using Qt.

You must not be in the mainstream of coding, but a side application.

Nah, I just wrote the C++ socket library and an XDR packet library
implementation for our group, not to mention a unit test library, an NT
service base class, our audio library, a dictionary XML markup language library,
and a library of string, filename, and other odds and ends. All in C++.

Most of which exist as native .NET components... And would never have
to be written in C# because they already exist. This is what I meant by
Time-To-Market. With C#, you spend more time worrying about the problem
than writing socket libraries and service base classes...

Sockets - System.Net and System.Net.Sockets
Service base class - System.ServiceProcess.ServiceBase (along with a
whole bunch of classes to support the manipulation of services).
Unit Testing - NUnit (not part of .NET properly, but it is a freely
available product written entirely in C#)
String - System.String, System.Text, System.Text.RegularExpressions
FileName - System.IO, System.IO.Path
XML stuff - System.Xml. I'm not sure what yours does exactly, but I'm
sure there is something close or would be simpler to implement.

I'm not dissing you. I'm just pointing out that there is a reason that
VB and Java were more popular for business programming... Less bugs, and
less time to market.

Yeah, a side application, that's the right term.

There's more to programming than "kewl apps" and GUIs. In fact, the GUI
code is by far the easiest part of most applications.

Actually, I disagree :) I have always liked the non-gui parts the best.

Tom I totally agree...time to market is the key, whether your market is
the Accounting Department down the hall or Sally Customer on the other
side of the planet.

With Mono or .NET you can have your cake and eat it too. C# and the
Framework rocks!

Now if only Mr. Gates would kindly follow my suggestion and make Windows
Open Source...at least the kernel.

Jul 21 '05 #265

Ernest Borgnine

Jeff_Relf wrote:

int Swap_32 ( int X ) { uint_8_P P = ( uint_8_P ) & X + 4 ;
return * -- P << 24 | * -- P << 16 | * -- P << 8 | * -- P ; }
main() { printf( "%x", Swap_32( 0x84838281 ) ); }

Then John Henry he did hammah,

He did make his hammah soun',

Says now one more lick fore quittin' time,

An' I'll beat this steam drill down,

An' I'll beat this steam drill down.

http://www.ibiblio.org/john_henry/folk.html

Jul 21 '05 #266

Jeff_Relf

Hi Ernestine, Ya wrote: <>

When coding, it helps to know the order of execution,
Spooky apparently has no clue about that.

Unable to recognize a simple coder, he also has no clue about who I am.

Jul 21 '05 #267

Linønut

Greg Cox poked his little head through the XP firewall and said:

No. Except you softies sound like you are taking a risk leaving the C++
code and memory leaks in place.

To be accurate, I'm an ex-softie...

As to the rest, what are you talking about? Are you actually saying it
would be better to dump a million lines of existing C++ code with 99%+
of the bugs already fixed and rewrite from scratch rather than reuse the
code and spend resources fixing the remaining bugs?

I don't know. You tell me. You're the one "in the know".

I anticipate that, upon arriving at work today, my XP box will inform me
that I have a critical patch to install for Office.

--
When all you have is a hammer, everything looks like a nail.

Jul 21 '05 #268

In comp.os.linux.advocacy, Jeff_Relf
<Me@Privacy.NET>
wrote
on 12 Apr 2005 07:24:44 GMT
<Je************************@Cotse.NET>:

> Hi Ernestine, Ya wrote: <>
>
> When coding, it helps to know the order of execution,
> Spooky apparently has no clue about that.

I have more of a clue than you might think.

The HP PA Risc architecture, for instance (a Unix
workstation running HP/UX at one point which I had the
privilege of using for some years in my prior job --
nice machine, actually), had some very interesting ideas
of exactly when to execute an instruction, for example.
Basically, the processor had the option (and usually took
it) of executing the instruction following a conditional
branch, even though the branch was taken.

This presumably gave compiler optimizers fits.

http://h21007.www2.hp.com/dspp/tech/tech_TechDocumentDetailPage_IDX/1,1701,958!33!250,00.html

<excerpt>

pa-risc 1.1 architecture and instruction set reference manual
[...]
concept of delayed branching

All branch instructions exhibit the delayed branch
feature. This implies that the major effect of the branch
instruction, the actual transfer of control, occurs one
instruction after the execution of the branch. As a result,
the instruction following the branch (located in the delay
slot of the branch instruction) is executed before control
passes to the branch destination. The concept of delayed
branching is illustrated in Figure 42.

Execution of the delay slot instruction, however, may be
skipped ("nullified") by setting the "nullify" bit in the
branch instruction to 1.

delayed branching
program segment
Location Instruction Comment
100 STW r3, 0(r6) ; non-branch instruction
104 BLR r8, r0 ; branch to location 200
108 ADD r7, r2, r3 ; instruction in delay slot
10C OR r6, r5, r9 ; next instruction in linear code
; sequence
.. .
.. .
.. .
200 LDW 0(r3), r4 ; target of branch instruction

execution sequence
Location Instruction Comment
100 STW r3, 0(r6) ;
104 BLR r8, r0 ;
108 ADD r7, r2, r3 ; delay slot
; instruction is executed before
200 LDW 0(r3), r4 ; execution of
; target instruction

</excerpt>

Also, the compiler, when presented with your example (paraphrased):

int swap32(int v)
{
unsigned char * p = (unsigned char *) &v;
p += 4;
return (*--p << 24)|(*--p << 16)|(*--p << 8)|(*--p);
}

has the option of evaluating the expression either
left-to-right or right-to-left, at the whim of the compiler
designer(s). (Unless Kelsey tells me different -- he's more
familiar with the C/C++ specifications than I am. :-))
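
One way to sidestep the evaluation-order question is to give each decrement its own statement, so the reads happen in source order regardless of the compiler. The following is only a sketch; the name swap32_sequenced and the use of uint32_t are assumptions, not part of the posted code. Note that it reassembles the bytes from the highest address down, so on a little-endian machine it returns the original value rather than a swapped one.

```c
#include <stdint.h>

/* Walks the four bytes of v from the highest address down.
   Each decrement sits in its own statement, so the order is fixed. */
uint32_t swap32_sequenced(uint32_t v)
{
    const unsigned char *p = (const unsigned char *)&v + 4;
    uint32_t result;

    result  = (uint32_t)*--p << 24;   /* byte at address &v + 3 */
    result |= (uint32_t)*--p << 16;   /* byte at address &v + 2 */
    result |= (uint32_t)*--p << 8;    /* byte at address &v + 1 */
    result |= (uint32_t)*--p;         /* byte at address &v + 0 */
    return result;
}
```

Because the result depends on how the host stores integers, a caller that wants an unconditional byte swap should not rely on this form.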

However, some of the best optimization is done within
the compiler by reading the code *backwards* -- e.g., the
compiler might forbear a final store for a local variable
to save time, as the variable will just be discarded
later. In fact, the better compilers don't bother with
storing local variables at all, if they don't need to and
sufficient registers are available. There are a number of
other possibilities, some of which I've already alluded to.

I'm also familiar with some of the problems facing optimizers.
Briefly put, the code

int * selectAPointer(int * ap, int * bp, int * cp);

int doSomething()
{
    int a = 1, b = 2, c = 3, *p;
    int d1 = a+b+c;
    p = selectAPointer(&a,&b,&c);
    (*p)++;    /* increment the pointee; "*p ++" would advance p instead */
    int d2 = a+b+c;
    return d1 + d2;
}

would have to be very carefully handled, as p might very
well point to a, b, or c after selectAPointer() returns,
or it might point to somewhere completely different -- but
there's no way for the compiler to tell here. Therefore,
both d1 and d2 will probably have to be explicitly coded as
something along the lines of:

MOV a(SP), r1
ADD b(SP), r1
ADD c(SP), r1
MOV r1, d1(SP)

Without the selectAPointer() call, the compiler has
the option of *merging* d1 and d2, conceptually (using
a method called "common subexpression elimination"), and
potentially making the code more efficient -- if only
by 4 instructions and a now-redundant stackframe location.

It might also remove the last instruction, and keep
d1 in a register for the length of the code.
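
To make the aliasing concern concrete, here is a small sketch with one possible definition of selectAPointer(). That definition is hypothetical (the post only declares the function), as is the name sumDifference. With it in place the two sums really do differ, which is why the compiler cannot merge them without seeing into the call.

```c
/* Hypothetical definition; the post only gives the prototype. */
int *selectAPointer(int *ap, int *bp, int *cp)
{
    (void)ap;
    (void)cp;
    return bp;                  /* aliases the caller's b */
}

/* Returns d2 - d1, which is nonzero when the write through p
   changed one of a, b or c between the two sums. */
int sumDifference(void)
{
    int a = 1, b = 2, c = 3;
    int d1 = a + b + c;                     /* 6 */
    int *p = selectAPointer(&a, &b, &c);
    (*p)++;                                 /* modifies b through the alias */
    int d2 = a + b + c;                     /* 7, so it must be recomputed */
    return d2 - d1;
}
```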

I bring all this up to illustrate that your pattern of
linear thinking, generally desirable though it is, has
some drawbacks.

Be clear in your coding now, and your coding will be
clear to you later, when you reread it. :-) It will
also be clear to the compiler, when *it* reads it.

Of course, knowing the way you code, the code may very
well be clear to you already; just don't expect it to
make sense to the rest of us, without careful analysis.
The ideal, at least to a manager's thinking, is to shuffle
people around at need between projects, both to keep them
interested and to maximize utilization of resource --
this may require reading and modifying someone else's code.

There is at least one hacker [according to Steven Levy,
anyway] out there who writes extremely ingenious -- most
would say incomprehensible -- code with nary a comment.
I would subscribe to that notion myself, except that
most machine code is pretty much as one might see from
the foregoing: without the comments, one might as well be
looking at gobbledygook, although the machine "understands"
it perfectly well as long as it knows which registers to
modify and which memory locations to wiggle. PA Risc is
slightly unusual in its "delay slot", but is otherwise
more or less typical regarding machines, though it has
more registers (32 at the user level; I can't say I played
with the space registers all that much) than one's typical
Intel hardware, and is probably better designed from an
instruction standpoint, since it didn't have to worry about
8080 source code compatibility. And even then, the above
is *assembly*; the actual machine code would be little more
than a bunch of hex numbers or charge packets.

It was once common for system programmers to pore over
pages of printouts dumped from a console in old IBM
hardware, for example (fortunately, that was before
my time). Of course, that was also when an IBM machine
had all of maybe 8K or 16K of RAM.

I've also caught MS VC++ (version 5.0) in a bona fide
memory corruption condition. I'll admit I'm not sure who
created the bug (as it was running on NT3.5 or NT4, whose
memory allocation was slightly suspect) but the assembly
listing from a section of code I was working on at the
time was quite corrupted for the space of about 10 lines.
As Ronald Reagan once said, "trust but verify".

Compiler mangling is not the first thing that
comes to mind when one's code does the goofy, though --
most likely, it's one's code.

Unable to recognize a simple coder, he also has no clue about who I am.

There are more things in Heaven and Earth, Horatio, than are dreamt of
in your philosophies.
-- Hamlet, Act I, Scene 5

--
#191, ew****@earthlink.net
It's still legal to go .sigless.

Jul 21 '05 #269

[snips]

The Ghost In The Machine wrote:

typedef unsigned __int8 * uint_8_P ;
__int32 Swap_32 ( __int32 X ) { uint_8_P P = ( uint_8_P ) & X + 4 ;
return * -- P << 24 | * -- P << 16 | * -- P << 8 | * -- P ;
}

Hm...where to start?

[1] __int8 isn't exactly standard.
[2] The expression

* -- P << 24 | * -- P << 16 | * -- P << 8 | * -- P

can be evaluated in any order the compiler chooses, as '|' is
commutative (a|b == b|a) and associative (a|(b|c) == (a|b)|c).
(If overloaded, of course, things get interesting. I'd
frankly have to research it.) Of course, because of the side
effects, the commutativity sort of goes out the window, but
AFAIK (Kelsey, are you still with us or have you given up on
this thread? :-) ) expressions of this sort are undefined for
precisely this reason.

Sort of still here. As a simple rule of thumb, if code relies on order of
evaluation, it's broken. Further, a statement may modify the value of an
object at most once between sequence points or invoke undefined behavior;
the code above modifies P what, four times?

This sort of thing - Relf's raison d'etre if you will - is generally
referred to as "unwarranted friendliness with the compiler", or "my
compiler does it this way, so it's okay." This assumes the compiler will
never change (unwarranted assumption right there), that the code will never
move to another compiler (or another version of the same compiler, which
works differently) and that nothing else will ever change, such as building
the code with a compiler that splits operations across processors (or CPU
threads).

Jul 21 '05 #270

[snips]

Jeff_Relf wrote:

No... No... No... It has to follow the left to right standard.
There is no "right to left standard". MS may document that _their_ compiler
does it that way, but that's them. Nothing to do with standards.
You concluded: <<
Admittedly, you've, erm, impressed us with the readability of your code.
I think Kelsey has, too -- but I know whose code I'd prefer to maintain.
:-) >>

I wouldn't want you anywhere near my code.

Not much of a worry about that; you don't actually have any code, just
gibberish.

Jul 21 '05 #271

Jeff_Relf wrote:

Hi Eran_Kampf, Re: How I won't use C# because it doesn't have #define,
You told me: << Funny thing is, 10 years ago there were assembly developers
talking about C and C++ the same way you talk about C#. >>

C++ is as high_level as I want it to be because I can always add a DLL.

But C#, with its Program_Inside_a_Program, lack of #define
and UnGodly_Bloat are a straightjacket I refuse to wear.

Funny thing, that. Here the rest of the C++ world has agreed that #define
is largely pointless in C++ and even, in many cases, unnecessarily
dangerous, but you, of course, knowing better than almost the entire C++
coding world insist on doing things the wrong way.

Must be nice to live in your universe.

Jul 21 '05 #272

gerry

sorry to disagree Kelsey but the current language specifications for c , c++
, c# and java ALL agree with Relf on this point.
i tend to agree with your opinions regarding this type of coding , but they
are just opinions.

"Kelsey Bjarnason" <kb********@gmail.com> wrote in message
news:BKS6e.27945$7Q4.10982@clgrps13...

[snips]

Jeff_Relf wrote:
No... No... No... It has to follow the left to right standard.
There is no "right to left standard". MS may document that _their_

compiler does it that way, but that's them. Nothing to do with standards.
You concluded: <<
Admittedly, you've, erm, impressed us with the readability of your code. I think Kelsey has, too -- but I know whose code I'd prefer to maintain. :-) >>

I wouldn't want you anywhere near my code.

Not much of a worry about that; you don't actually have any code, just
gibberish.

Jul 21 '05 #273

In comp.os.linux.advocacy, Jeff_Relf
<Me@Privacy.NET>
wrote
on 12 Apr 2005 03:32:17 GMT
<Je************************@Cotse.NET>:

Hey Spooky ! ( Mr. Coders_Are_Friends_of_Mine_and_Relf_is_No_Coder )

Hederman has me plonked and you, knowing better,
are telling him I'm unemployed.
Oops. My bad. Nice to feel needed.

You think I'm lying when I repeatedly tell you that
only coding has paid my rent since the start of 1982 ?

I delivered code to paying customers 2 hours ago...
What about you, Mr. Coders_Are_Friends_of_Mine_and_Relf_is_No_Coder ! ? ?
I'm fulltime, mister.

At least you've figured out that #including "implementation.c"
is a bad thing. (I kid you not; one contractor we hired way back
when actually did exactly that in his code.)

And, No, I'm not being rhetorical here... I want an answer... Mr. Dodger.
See above.

You concluded: <<
So we use automated translators -- compilers -- to make our
wishes known, and sometimes the translation goes very badly awry
-- Jeff's "Swap_32()" being a case in point.
In Jeff's case it's probably because of a misuse of the language.

Well, Mr. Coders_Are_Friends_of_Mine_and_Relf_is_No_Coder,
do you think you could compile the code below and tell me the result ?

typedef unsigned char * uint_8_P ;
int Swap_32 ( int X ) { uint_8_P P = ( uint_8_P ) & X + 4 ;
return * -- P << 24 | * -- P << 16 | * -- P << 8 | * -- P ; }
main() { printf( "%x", Swap_32( 0x84838281 ) ); }

Sure. On my Linux/x86 I get 84838281. Nice no-op. (Or is it
supposed to swap only on big-endian hardware? Is it supposed
to even *run* on big-endian hardware?)

gcc version 3.3.5-20050130 (Gentoo Linux 3.3.5.20050130-r1,
ssp-3.3.5.20050130-1, pie-8.7.7.1)

I don't have my SPARC up at the moment or I'd try it on that as well.

Color me unimpressed.

Just so you know: I can code this a bunch of ways. Here's some examples
that actually *work*:

int Swap_32_Shifty(int X)
{
return (X & 0xff000000) >> 24
| (X & 0x00ff0000) >> 8
| (X & 0x0000ff00) << 8
| (X & 0x000000ff) << 24;
}

This one's simple enough but the "int" should probably be unsigned,
and assumes everything is 32-bit -- an increasingly problematic
issue as more 64-bit stuff comes online. Still, it at least
puts the bytes in the right place, unlike someone else's example.
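
A sketch of the unsigned variant hinted at above, assuming <stdint.h> is available. The name swap32_shift is illustrative. Using uint32_t pins the width to 32 bits and avoids shifting a value into the sign bit of a signed int.

```c
#include <stdint.h>

/* Same mask-and-shift idea, on a fixed-width unsigned type. */
uint32_t swap32_shift(uint32_t x)
{
    return (x & 0xff000000u) >> 24
         | (x & 0x00ff0000u) >> 8
         | (x & 0x0000ff00u) << 8
         | (x & 0x000000ffu) << 24;
}
```

Applying the function twice gives back the original value, which makes a convenient sanity check.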

int Swap_32_Union(int X)
{
union {
int x;
unsigned char c[sizeof(int)];
} u;

u.x = X;
int i;
int j;

for(i = 0, j = sizeof(int)-1; i < j; i++,j--)
{
unsigned char t = u.c[i];
u.c[i] = u.c[j];
u.c[j] = t;
}

return u.x;
}

This one's the safest of the bunch.

int Swap_32_Doubleswap(int X)
{
register char tmp;
tmp = *((unsigned char *) &X + 3);
*((unsigned char *) &X + 3) = *((unsigned char *) &X + 0);
*((unsigned char *) &X + 0) = tmp;
tmp = *((unsigned char *) &X + 2);
*((unsigned char *) &X + 2) = *((unsigned char *) &X + 1);
*((unsigned char *) &X + 1) = tmp;

return X;
}

Ugly but straightforward; I'm swapping two pairs of bytes. (For brownie
points: find the harmless bug.)

int Swap_32_Doubleswap_Local(int X)
{
register unsigned char tmp;
register unsigned char * P = (unsigned char *) &X;

tmp = P[3]; P[3] = P[0]; P[0] = tmp;
tmp = P[2]; P[2] = P[1]; P[1] = tmp;
return X;
}

A little prettier variant with a local variable holding the pointer.
If coded exactly according to spec (taking the location of a
parameter is, if I'm not mistaken, a bit of a no-no in C/C++):

int Swap_32_Doubleswap_Local_Proper(int X)
{
int Y = X;
register unsigned char tmp;
register unsigned char * P = (unsigned char *) &Y;

tmp = P[3]; P[3] = P[0]; P[0] = tmp;
tmp = P[2]; P[2] = P[1]; P[1] = tmp;
return Y;
}

It turns out Swap_32_Doubleswap and Swap_32_Doubleswap_Local generate
identical code under the optimizer (even with the harmless bug!);
Swap_32_Doubleswap_Local_Proper generates 4 extra lines.

This particularly goofy variant is nevertheless workable:

int Swap_32_Extremely_Goofy(int X)
{
int i;
int result = 0;
static int table[] = {
24,25,26,27,28,29,30,31,
16,17,18,19,20,21,22,23,
8,9,10,11,12,13,14,15,
0,1,2,3,4,5,6,7
};

for(i=0;i<32;i++)
{
if(X & 1) result |= (1<<table[i]);
X >>= 1;
}

return result;
}

though I doubt it's the most efficient.

Or one can simply use

htonl() / ntohl()

These will swap only on little-endian hardware. Of course, that
may be exactly what's desired, as network order is big-endian.
The big advantage to these are that they are predeclared (in
<netinet/in.h>), and therefore no coding at all is required on
the user's part.
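
As a rough illustration of what htonl() guarantees, the sketch below checks that the bytes of the converted value sit in memory most-significant first, whatever the host's byte order. It assumes a POSIX system, where <arpa/inet.h> also declares these functions; the helper name htonl_is_big_endian_in_memory is made up for this example.

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX) */
#include <stdint.h>
#include <string.h>

/* Returns 1 if htonl(v) is laid out in memory most-significant byte first. */
int htonl_is_big_endian_in_memory(uint32_t v)
{
    uint32_t net = htonl(v);
    unsigned char b[4];

    memcpy(b, &net, sizeof net);    /* inspect the raw bytes */
    return b[0] == (unsigned char)(v >> 24)
        && b[1] == (unsigned char)(v >> 16)
        && b[2] == (unsigned char)(v >> 8)
        && b[3] == (unsigned char)v;
}
```

ntohl() undoes the conversion, so ntohl(htonl(v)) gives back v on any host.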

For the record, here's a code analysis of the assembly coming out.
(The comments are mine. Note that this is GAS notation, which
is the reverse of what Intel likes.)

.globl Swap_32
.type Swap_32, @function
Swap_32:
pushl %ebp ; standard frame
movl %esp, %ebp ; crap
subl $4, %esp ; make room for the P
leal 12(%ebp), %eax ; load the parameter's addy, +4 ...
movl %eax, -4(%ebp) ; ... then stuff it into P
leal -4(%ebp), %eax ; ... and take &P into ax
decl (%eax) ; --P
movl -4(%ebp), %eax ; ... and take &P into ax again
; (dumb unoptimized code)
movzbl (%eax), %edx ; *P
sall $24, %edx ; << 24
leal -4(%ebp), %eax ; ... and take &P into ax AGAIN
decl (%eax) ; --P
movl -4(%ebp), %eax ; ... I sense a theme here
movzbl (%eax), %eax ; *P
sall $16, %eax ; << 16
orl %eax, %edx ; '|'
leal -4(%ebp), %eax ; ... getting monotonous
decl (%eax) ; --P
movl -4(%ebp), %eax ; ... three guesses
movzbl (%eax), %eax ; *P
sall $8, %eax ; << 8
orl %eax, %edx ; '|'
leal -4(%ebp), %eax ; ... again?
decl (%eax) ; --P
movl -4(%ebp), %eax ; ... finally, the last repetition
movzbl (%eax), %eax ; and the last bit to or in
orl %eax, %edx ; '|'
movl %edx, %eax ; and now return the result
leave ; ... to the caller
ret

If one compiles with -O, one gets:

.globl Swap_32
.type Swap_32, @function
Swap_32:
pushl %ebp ; standard frame
movl %esp, %ebp ; crap
leal 11(%ebp), %ecx ; CX = P = &X + 3; we've
; already decremented at this point
movzbl (%ecx), %eax ; *P
sall $24, %eax ; << 24
decl %ecx ; --P
movzbl (%ecx), %edx ; *P
sall $16, %edx ; << 16
orl %edx, %eax ; '|'
decl %ecx ; --P
movzbl (%ecx), %edx ; *P
sall $8, %edx ; <<8
orl %edx, %eax ; '|'
movzbl -1(%ecx), %edx ; *P
orl %edx, %eax ; '|', leaving result in AX
popl %ebp ; for the caller
ret

It's still wrong but now it's more compact. 18 lines.

Compare that code to the unoptimized Swap_32_Doubleswap():

.globl Swap_32_Doubleswap
.type Swap_32_Doubleswap, @function
Swap_32_Doubleswap:
pushl %ebp ; standard frame
movl %esp, %ebp ; crap
subl $4, %esp ; I declared this "register",
; but oh well; make room for
; it anyway
movzbl 11(%ebp), %eax ; get the 4th byte
movb %al, -1(%ebp) ; save it
movzbl 8(%ebp), %eax ; move first byte
movb %al, 11(%ebp) ; to the fourth byte
movzbl -1(%ebp), %eax ; get the saved byte
movb %al, 8(%ebp) ; and move it to first byte
movzbl 10(%ebp), %eax ; get the 3rd byte
movb %al, -1(%ebp) ; save it
movzbl 9(%ebp), %eax ; move second byte
movb %al, 10(%ebp) ; to the 3rd byte
movzbl -1(%ebp), %eax ; get the saved byte
movb %al, 9(%ebp) ; and move it to second byte
movl 8(%ebp), %eax ; return result
leave ; ... to the caller
ret

and the optimized variant:

.globl Swap_32_Doubleswap
.type Swap_32_Doubleswap, @function
Swap_32_Doubleswap:
pushl %ebp ; standard frame
movl %esp, %ebp ; crap
leal 8(%ebp), %edx ; (unsigned char *) &X
movzbl 3(%edx), %ecx ; get fourth byte
movzbl 8(%ebp), %eax ; get first byte
movb %al, 3(%edx) ; store first in fourth
movb %cl, 8(%ebp) ; store fourth in first
movzbl 2(%edx), %ecx ; get third byte
movzbl 1(%edx), %eax ; get second byte
movb %al, 2(%edx) ; store second in third
movb %cl, 1(%edx) ; store third in second
movl 8(%ebp), %eax ; fetch the result
popl %ebp ; ... and return it
ret

This one's only 14 lines.

Which one's likely to be more efficient, at least on GCC-derived systems?

Which one's more likely to actually function properly?

--
#191, ew****@earthlink.net
It's still legal to go .sigless.

Jul 21 '05 #274

In comp.os.linux.advocacy, Sean Hederman
<em*******@codingsanity.blogspot.com>
wrote
on Tue, 12 Apr 2005 00:28:43 +0200
<d3**********@ctb-nnrp2.saix.net>:

"The Ghost In The Machine" <ew***@sirius.athghost7038suus.net> wrote in
message news:ej************@sirius.athghost7038suus.net...
[Snip]
For the record:
32 = 0x00000020
536870912 = 0x20000000
538976288 = 0x20202020

Interesting glitch.
Yeah, I saw that when I ran it, I just assumed the result was a mistake in
Relfie's code. Couldn't be bothered to load another C++ compiler to check
this; Can you confirm it's okay on non-MS compilers? If so that's a pretty
shocking MS C++ bug, no matter how much Relf abuses the edges of the C++
standard.

Not sure if it's a bug or not. On GCC/x86 it would return 32.

[snip for brevity]
Uh...you sure you want him to be writing code in *three* languages? :-)
(He already mangles C/C++ -- which for purposes of this discussion
is pretty much one language, the way Jeff uses it! -- and, presumably,
Visual Basic, which is apparently an adjunct to his X.CPP.) ...
Actually, in his case it's probably more like trying to
squish as much code in his screen as will uncomfortably
fit, readability be damned.

Probably not a bad thing for him -- but to the rest of us,
it looks like parrot droppings.

...
Not sure it's anything near like Basic. It's probably more like
Relfic -- a language that nobody else has ever invented. :-)

Who woulda thunk that a major MS semi-fan like me would have so much in
common with a raving linux fundi ? ;-)

Ah, are you? Well, we can argue in another thread. :-)

I shouldn't have to spend hours poring over your code to
find out what it does. Good code is more like an essay.
Clear, commented, and well laid out. Yours is quite
clearly none of those things.

There's at least one court ruling (I'd have to find it) that
states that a certain subset of code (relating to crypto)
is free speech -- and any code could be construed as a medium of
communication both to computer and to developer. Certainly
good code should be clear to both; of course, code is always
clear to the computer (as the computer only reads machine
code anyway) but when was the last time you or I actually
wanted to pore over core dumps? :-)

So we use automated translators -- compilers -- to make our
wishes known, and sometimes the translation goes very badly
awry -- Jeff's "Swap_32()" being a case in point. In Jeff's
case it's probably because of a misuse of the language.

Does this mean I can sue Jeff for abusing my First Amendment
rights? Oh, bugger, I'm not American dammit. I'd be interested
in reading that ruling if you manage to find it though, sounds
interesting.

I'll have to find it. It's from the 9th Circuit, that much I remember.

--
#191, ew****@earthlink.net
It's still legal to go .sigless.


Jul 21 '05 #275

Jeff_Relf

Hi Spooky, Re: My Swap_32(), You wrote: <<
Not sure if it's a bug or not. On GCC/x86 it would return 32. >>

Don't you guys believe in hex ? !

0x84838281 is a much Much better test,
as each byte is labeled and has its high bit set.

By the way, on an x86, I should've reversed the order, like this:

typedef unsigned char * uint_8_P ;

int Swap_32 ( int X ) { uint_8_P P = ( uint_8_P ) & X ;
return * P << 24 | * ++ P << 16 | * ++ P << 8 | * ++ P ; }
main() {
// prints 0x84848484 ! ! MS_CPP_7_1 BUG
printf( "%x", Swap_32( 0x84838281 ) ); }

Assigning each byte to a global makes it work, like this:

typedef unsigned char * uint_8_P ; int _1, _2, _3, _4 ;

int Swap_32 ( int X ) { uint_8_P P = ( uint_8_P ) & X ;
return ( _1 = * P << 24 ) | ( _2 = * ++ P << 16 )
| ( _3 = * ++ P << 8 ) | ( _4 = * ++ P ); }

main() {
// prints 0x81828384 as it should.
printf( "%x", Swap_32( 0x84838281 ) ); }

Removing the assignments, but keeping the parens, fails:
return ( * P << 24 ) | ( * ++ P << 16 ) | ( * ++ P << 8 ) | ( * ++ P );

As I said before, MS_CPP fails with or without the optimizer,
and MicroSoft claims that the | operator is evaluated left to right:
http://msdn.microsoft.com/library/de....operators.asp

Operator |
Name Bitwise inclusive OR
Associativity Left to right

Jul 21 '05 #276

gerry wrote:

sorry to disagree Kelsey but the current language specifications for c ,
c++ , c# and java ALL agree with Relf on this point.
i tend to agree with your opinions regarding this type of coding , but
they are just opinions.

Actually, no, they don't.

He's correct about a very limited subset of the problem. However, he
completely fails to grasp the full scope of the problem, to wit, the
undefined behavior as a result of the multiple modifications involved
between sequence points, from which the associativity provides _no_
protection at all.

Jul 21 '05 #277

Jeff_Relf

Hi Kelsey_Bjarnason ( and Gerry ),

Re: This Swap_32_x86() working for gcc but not for MS_CPP_7_1:

typedef unsigned char * uint_8_P ;

int Swap_32_x86 ( int X ) { uint_8_P P = ( uint_8_P ) & X ;
return * P << 24 | * ++ P << 16 | * ++ P << 8 | * ++ P ; }
main() {
// prints 0x84848484 ! ! MS_CPP_7_1 BUG
printf( "%x", Swap_32_x86( 0x84838281 ) ); }

You claimed there's, << undefined behavior as a result of
the multiple modifications involved between sequence points,
from which the associativity provides no protection at all. >>

I don't know what you might be asserting here, if anything,
but I hope you have a lot of credible documents to support you.

MS_CPP_7_1's precedence is well defined.

MicroSoft claims that the | operator is evaluated left to right:
http://msdn.microsoft.com/library/de....operators.asp

Operator |
Name Bitwise inclusive OR
Associativity Left to right

Note that this works:

typedef unsigned char * uint_8_P ; int _1, _2, _3, _4 ;

int Swap_32_x86_NoBug ( int X ) { uint_8_P P = ( uint_8_P ) & X ;
return ( _1 = * P << 24 ) | ( _2 = * ++ P << 16 )
| ( _3 = * ++ P << 8 ) | ( _4 = * ++ P ); }

main() {
// prints 0x81828384 as it should.
printf( "%x", Swap_32_x86_NoBug( 0x84838281 ) ); }

Swap_32_x86() should work like this ( but doesn't ):
return * P << 24 | * ++ P << 16 | * ++ P << 8 | * ++ P ; }
1. * P -> Reg
2. Reg << 24 -> Reg
3. ++ P -> Reg2
4. * Reg2 -> Reg2
5. Reg2 << 16 -> Reg2
6. Reg | Reg2 -> Reg
7. ++ P -> Reg2
8. * Reg2 -> Reg2
9. Reg2 << 8 -> Reg2
10. Reg | Reg2 -> Reg
11. ++ P -> Reg2
12. * Reg2 -> Reg2
13. Reg | Reg2 -> Reg
14. Return Reg.
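
The 14 steps above describe an order that the single-expression form does not guarantee. A minimal sketch of code that does guarantee it, assuming each increment is given its own statement (the name load_big_first_32 and the use of uint32_t are assumptions for illustration, not code from this thread):

```c
#include <stdint.h>

/* Hypothetical helper: builds a 32-bit value from four bytes in memory,
   treating the first byte as the most significant one.
   Every change to p sits in its own statement, so the ';' after it is a
   sequence point and the order of reads and increments is fixed. */
uint32_t load_big_first_32(const unsigned char *p)
{
    uint32_t reg = (uint32_t)*p << 24;  /* steps 1-2 */
    ++p;                                /* step 3 */
    reg |= (uint32_t)*p << 16;          /* steps 4-6 */
    ++p;                                /* step 7 */
    reg |= (uint32_t)*p << 8;           /* steps 8-10 */
    ++p;                                /* step 11 */
    reg |= (uint32_t)*p;                /* steps 12-13 */
    return reg;                         /* step 14 */
}
```

The casts to uint32_t are there so that a byte with its high bit set, shifted left by 24, does not overflow a signed int.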

Jul 21 '05 #278

Jeff_Relf

Hi Spooky, Re: If you could try my Swap_32() in gcc,

You told me: << Sure. On my Linux/x86 I get 84838281. Nice no-op.
( Or is it supposed to swap only on big-endian hardware ?
Is it supposed to even *run* on big-endian hardware ? )
gcc version 3.3.5-20050130 (Gentoo Linux 3.3.5.20050130-r1,
ssp-3.3.5.20050130-1, pie-8.7.7.1)
I don't have my SPARC up at the moment or I'd try it on that as well. >>

I had in mind a program on a big-endian system
that had to swap all the ints inside a .WAV file.
But I lost track of what I was doing... damned alzheimers !

I should've called it Small_is_First_32(), not Swap_32()
Going the other way, where an x86 is reading Large_First ints, it'd be:

typedef unsigned char * uint_8_P ;

__inline int Large_is_First_32 ( int X ) { uint_8_P P = ( uint_8_P ) & X ;
return * P << 24 | * ++ P << 16 | * ++ P << 8 | * ++ P ; }
main() {
// prints 0x84848484 ! ! MS_CPP_7_1 BUG
printf( "%x", Large_is_First_32( 0x84838281 ) ); }

You told me: << Color me unimpressed. >>

At least I still remember how to rollover in my bed,
get to the store/bathroom, etc. ( ha ha ).

Re: Your return ( X & 0xff000000 ) >> 24 ...

Too ugly for me.

Re: Your unsigned char t = u.c[i]; u.c[i] = u.c[j]; u.c[j] = t;,

That one is Waaay ugly... yuck.

You showed: <<
int Swap_32_Doubleswap( int X ) { register char tmp;
tmp = *((unsigned char *) &X + 3);
*((unsigned char *) &X + 3) = *((unsigned char *) &X + 0);
*((unsigned char *) &X + 0) = tmp;
tmp = *((unsigned char *) &X + 2);
*((unsigned char *) &X + 2) = *((unsigned char *) &X + 1);
*((unsigned char *) &X + 1) = tmp; return X; } >>

That's ugly Spooky, what's the harm in declaring:

typedef unsigned char * uint_8_P ;

By the way, a speed_optimized compile of Large_is_First_32() is merely:

20 mov ecx,dword ptr [esp+4] ; Oops ! a MS_CPP_7_1 bug !
24 shr ecx,18h
27 mov eax,ecx
29 shl eax,8
2C or eax,ecx
2E shl eax,8
31 or eax,ecx
33 shl eax,8
36 or eax,ecx

But, if it's working code you're looking for, try this:

typedef unsigned char * uint_8_P ;
__inline int Large_is_First_32 ( int X ) {
uint_8_P P = ( uint_8_P ) & X ;
return * P << 24 | P [ 1 ] << 16 | P [ 2 ] << 8 | P [ 3 ]; }
main(){
rv = Large_is_First_32( 0x84838281 );
}

But Large_is_First_32() just becomes:
mov dword ptr [rv (40451Ch)],81828384h ; Ha !

using int rv = 0x84838281 ; rv = Large_is_First_32( rv );
Large_is_First_32() becomes:

00 mov eax,dword ptr [rv (403010h)]
05 xor ecx,ecx
07 mov ch,al
09 movzx edx,ah
0C movzx eax,byte ptr [rv+2 (403012h)]
13 or ecx,edx
15 movzx edx,byte ptr [rv+3 (403013h)]
1C shl ecx,8
1F or ecx,eax
21 shl ecx,8
24 or ecx,edx
26 mov dword ptr [rv (403010h)],ecx

Re: Swap_32_Doubleswap, optimized, 14 lines: <<
pushl %ebp ; standard frame
movl %esp, %ebp ; crap
leal 8(%ebp), %edx ; (unsigned char *) &X
movzbl 3(%edx), %ecx ; get fourth byte
movzbl 8(%ebp), %eax ; get first byte
movb %al, 3(%edx) ; store first in fourth
movb %cl, 8(%ebp) ; store fourth in first
movzbl 2(%edx), %ecx ; get third byte
movzbl 1(%edx), %eax ; get second byte
movb %al, 2(%edx) ; store second in third
movb %cl, 1(%edx) ; store third in second
movl 8(%ebp), %eax ; fetch the result
popl %ebp ; ... and return it
ret >>

You asked me: <<
Which one's likely to be more efficient, at least on GCC-derived systems ?
Which one's more likely to actually function properly ? >>

My Large_is_First_32() is named better, for one thing, it works for another.
And, most important of all... it's Easier_To_Read !

Further, it's only 12 lines of assembly, including fetching and storing rv.

What's more... I found a bug in MS_CPP_7_1.

Jul 21 '05 #279

Hi Jeff, I actually felt sharp enough to try to reverse engineer your
original code. :-)

So my big question is..... is the code below your original? Or was there
another one? Since I cannot find the original message anymore.

int Swap_32 ( int X ) { uint_8_P P = ( uint_8_P ) & X + 4 ;
return * -- P << 24 | * -- P << 16 | * -- P << 8 | * -- P ; }
Or was it something like this:
int Swap_32 ( int X ) { uint_8_P P = ( uint_8_P ) & X ;
return * P << 24 | * ++ P << 16 | * ++ P << 8 | * ++ P ; }

I didn't notice the -- and the &X + 4 part because of my dyslexia. I thought
I saw ++. and &X

And the second question: does this run on an Intel byte order or a Motorola
byte order?
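
The byte-order question can be checked at run time. A minimal sketch, assuming a 32-bit probe value whose first byte in memory is inspected (small_byte_is_first is a hypothetical name, not something posted in this thread):

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if the least significant byte of a 32-bit value is stored
   first in memory ("Intel" order), 0 otherwise. */
int small_byte_is_first(void)
{
    uint32_t probe = 1u;
    unsigned char first;
    memcpy(&first, &probe, 1);  /* copy only the first byte in memory */
    return first == 1;
}
```

Code that reads an int through an unsigned char pointer gives different results depending on what this function returns, which is why the pointer-based versions in this thread are tied to one byte order.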

So if this is your code, then be prepared.... put on your safety belts
because I have a lot of things to say. :-)
I will reply when I am at home. I need some time consuming tests to do.

I do admit I cannot make it to work on the VC++ 2003 so far, but I am pretty
sure that a real VC++ 2003 programmer would not choose to follow that way of
programming. ;-)

Jul 21 '05 #280

Jeff_Relf

Hi Olaf, Are you looking for my original code, or just code that fails ?

The previous code I showed assumed that the smallest byte came first in memory,
....which would only work if it were compiled/run on a big_endian system
( converting .WAV files was the example I had in mind ).

My latest code assumes that the int's largest byte comes first in memory.
Below is my latest Working code.
rv must be a global or the optimizer barfs.

typedef unsigned char * uint_8_P ; int rv = 0x84838281 ;

__inline int Large_is_First_32 ( int X ) {
uint_8_P P = ( uint_8_P ) & X ;
return * P << 24 | P [ 1 ] << 16 | P [ 2 ] << 8 | P [ 3 ] ; }

int main(){
rv = Large_is_First_32( rv );
}

00 mov eax,dword ptr [rv (403010h)]
05 xor ecx,ecx
07 mov ch,al
09 movzx edx,ah
0C movzx eax,byte ptr [rv+2 (403012h)]
13 or ecx,edx
15 movzx edx,byte ptr [rv+3 (403013h)]
1C shl ecx,8
1F or ecx,eax
21 shl ecx,8
24 or ecx,edx
26 mov dword ptr [rv (403010h)],ecx

This code fails ( but shouldn't, ¿ a MS_CPP_7_1 bug ):

typedef unsigned char * uint_8_P ; int rv = 0x84838281 ;

__inline int Large_is_First_32 ( int X ) {
uint_8_P P = ( uint_8_P ) & X ;
return * P << 24 | * ++ P << 16 | * ++ P << 8 | * ++ P ; }

int main(){
rv = Large_is_First_32( rv );
}

Kelsey has said that, optimized or not, my multiple * ++ P's were illegal,
but I can't imagine where MicroSoft declares that,
....or when, exactly, multiple * ++ P's shouldn't work.

You concluded: << a real VC++ 2003 programmer would not choose to follow that way of programming. ;-) >>

I write code for a living and for fun... I'm real enough.

Although this doesn't work ( ¿ a MS_CPP_7_1 bug ):
return * P << 24 | * ++ P << 16 | * ++ P << 8 | * ++ P ;

I just Love how it looks !

This works, but doesn't look nearly as good:
return * P << 24 | P [ 1 ] << 16 | P [ 2 ] << 8 | P [ 3 ] ; }

I'm all about looks and readability... not speed.
( But it's damn fast too, you must admit )

P.S. I know you don't care, but I changed my LoopTo() again, it's now:

#define LoopTo( StopCond ) \
while ( Ch && ( Ch = ( uchar ) * ++ P ) \
&& ! ( Ch2 = ( uchar ) P [ 1 ], ( StopCond ) ) )

Note that Ch2 is now undefined when Ch becomes null.
LoopTo() is used in HTM_TXT.CPP

Pass HTM_TXT.EXE a .HTML file and it spits out a .TXT file.
http://www.Cotse.NET/users/jeffrelf/HTM_TXT.EXE
http://www.Cotse.NET/users/jeffrelf/HTM_TXT.CPP
http://www.Cotse.NET/users/jeffrelf/HTM_TXT.VCPROJ

These are valid HTML tags: <! Comment --> <Alpha> </Alpha>
But, due to the leading space, < Alpha> is not.
Things like &Unknown are sent through untranslated, for obvious reasons.

Jul 21 '05 #281

Tim Smith

In article <Je************************@Cotse.NET>,
Jeff_Relf <Me@Privacy.NET> wrote:
[unreadable crap...I'll try to salvage what I can from his poor layout]

The code in question:

int Swap_32_x86 ( int X ) { uint_8_P P = ( uint_8_P ) & X ;
return * P << 24 | * ++ P << 16 | * ++ P << 8 | * ++ P ; }

Why it doesn't work: <<
undefined behavior as a result of
the multiple modifications involved between sequence points, from
which the associativity provides no protection at all. >>

Relf is not convinced: <<
I don't know what you might be asserting here, if anything,
but I hope you have a lot of credible documents to support you. >>

Read this:

<http://msdn.microsoft.com/library/de.../en-us/vclang/
html/_clang_c_sequence_points.asp>

In C and C++, precedence and associativity basically determine the shape
of the parse tree. They do *NOT* determine the order of evaluation of
things, other than indirectly (in that an operation can not be done
until the value of both operands is known).

For example, consider a()+b()*c(). The compiler can call those three
functions in any order it wants. It will have to then do the
multiplication before it does the addition.
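
A minimal sketch of that point, assuming three hypothetical functions that each record a letter when called (call_log, note, evaluate and last_call_order are illustration names only):

```c
#include <stddef.h>

/* Each function appends its own letter to call_log, so the log shows
   the call order the compiler happened to choose. */
static char call_log[4];
static size_t call_count;

static int note(char who, int value)
{
    if (call_count < 3) call_log[call_count++] = who;
    call_log[call_count] = '\0';
    return value;
}

int a(void) { return note('a', 2); }
int b(void) { return note('b', 3); }
int c(void) { return note('c', 4); }

int evaluate(void)
{
    call_count = 0;
    /* Precedence fixes the result: 2 + (3 * 4) is 14.
       It does not fix the order in which a, b and c are called. */
    return a() + b() * c();
}

/* Returns the three letters in the order the calls actually happened. */
const char *last_call_order(void) { return call_log; }
```

The value returned by evaluate() is the same on every conforming compiler; the string returned by last_call_order() is not something to rely on.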

--
--Tim Smith

Jul 21 '05 #282

Hurray for Peter Pumpkinhead

Tim Smith wrote:

For example, consider a()+b()*c(). The compiler can call those three
functions in any order it wants. It will have to then do the
multiplication before it does the addition.

What if a static global is set in a() that needs to be called in b(),
c() to get a correct calculation?

Jul 21 '05 #283

Bill

Jeff_Relf wrote:

return * P << 24 | * ++ P << 16 | * ++ P << 8 | * ++ P ;

}

1. &&,|| : Are short circuited
: have sequence points between operands
&,| : Are NOT short circuited
: DO NOT have sequence points between operands
2. From the ANSI/ISO C Standard 9899-1990,
section 6.3 "Expressions", paragraph 2:
"Between the previous and next sequence point an object shall
have its stored value modified at most once by the
evaluation of an expression. Furthermore, the prior value
shall be accessed only to determine the value to be stored."
3. Modifying a value more than once between sequence points is
undefined.
4. A compiler can do ANYTHING IT WANTS when behavior is undefined
5. You modify P more than once between sequence points
6. This is NOT a compiler bug
This is programmer error

Information on these topics is easily available from GOOGLE

Hope this helps
Bill

Jul 21 '05 #284

> Hi Olaf, Are you looking for my original code, or just code that fails ?

I try to look at the code that you tried to make it work in theory.

I had prepared a big mail with disassembler stuff. But I did not have enough
time at my company, so I wanted to complete it at home. :-) But I see that
you already discovered some of my points.

First of all the code assumes an Intel byte order, mine did not and would
work on C# and C++ including non-Intel.
Second, this --P thing was moving in the wrong direction. This way you
unnecessarily slow down any memory operations since memory tends to start
preloading the next memory byte in bursts. And you use the previous instead,
thus stalling any memory cycle.
On most processors/memories, if not all, memory reads are optimized to be read
from low to high in sequential order.

Interesting is that you also found the way I would have done it, which indeed
works on VC++ 2003.
But then again, the Intel byte order is not gone and you had to use a global
variable in your case to optimize it.
The problem with a global variable is that it might not reside in one of the
32-byte cache lines of your processor cache, so you might lose time
loading the memory into the processor cache. And you lose an additional 32 bytes
of processor cache memory for that one global variable, again slowing down
something else.

__inline int Large_is_First_32 ( int X ) {
uint_8_P P = ( uint_8_P ) & X ;
return * P << 24 | P [ 1 ] << 16 | P [ 2 ] << 8 | P [ 3 ] ; }

But your biggest bottleneck is this thing:

0C movzx eax,byte ptr [rv+2 (403012h)]
....
15 movzx edx,byte ptr [rv+3 (403013h)]

Compared to my code, you need additional memory access and worse you only
read one byte at a time, while the complete original variable of 32 bits
would have been aligned on a memory/4 boundary thus loaded on one pass in
memory.
Only your compiler optimizer discovered that it could reduce one byte read
in its first load.
Both these assembler instructions indicate that you need additional reads,
and also on memory locations that are not divisible by 4. So giving your
processor another stall.

You should really try my way and check the optimized results. You will lose
at least the 2 memory reads.
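
A minimal sketch of the shift-based approach, assuming unsigned 32-bit values and masks instead of the int and byte casts of the version posted in this thread (swap_32 here is an illustration, not that posted code):

```c
#include <stdint.h>

/* Reverses the four bytes of a 32-bit value using only shifts and masks.
   It never looks at how the value is laid out in memory, so the result
   is the same on any byte order. */
uint32_t swap_32(uint32_t x)
{
    return (x & 0x000000FFu) << 24
         | (x & 0x0000FF00u) << 8
         | (x & 0x00FF0000u) >> 8
         | (x & 0xFF000000u) >> 24;
}
```

Using an unsigned type avoids the arithmetic right shift (sar) that a signed int produces, so no sign bits have to be masked off afterwards.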

I don't think I need to post the message I was planning to. ;-)
Everything here has been said. :-)

Jul 21 '05 #285

Bill

Tim Wrote:

In C and C++, precedence and associativity basically determine the shape
of the parse tree. They do *NOT* determine the order of evaluation of
things, other than indirectly (in that an operation can not be done
until the value of both operands is known).

In the case of (&&, ||) the order of operation IS defined as left to
right and STOPS once the outcome is known (this is called short
circuiting)
SO for example

if ( (F(x) < 0) || (G(x) <0) )
Blah();

F(x) will be executed first.
If (F(x) < 0) is true we NEVER execute the second test

Thus G(x) is not guaranteed to run.
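
A minimal sketch of the short-circuit rule, assuming trivial stand-ins for F and G and a counter that records whether G ran (g_calls and g_calls_for are hypothetical names):

```c
static int g_calls;  /* counts how often G runs */

int F(int x) { return x; }
int G(int x) { ++g_calls; return x; }

/* Returns how many times G was called while the condition was evaluated. */
int g_calls_for(int x)
{
    g_calls = 0;
    if ((F(x) < 0) || (G(x) < 0)) {
        /* Blah(); */
    }
    return g_calls;
}
```

With a negative x the left test is already true, so G is never called; with a positive x the left test is false and G is called exactly once.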

-----------------------------------
Unfortunately for Relf, he didn't use these operators
He used the bitwise operators and they DON'T short circuit

Had he used the boolean operators his code MIGHT have worked.
Or at the least it would have defined behavior.

Hope this helps
Bill

Jul 21 '05 #286

Jeff_Relf

Howdy there Bill, Re: * ++ P << 16 | * ++ P << 8...,

That code Should work. Why... you ask ?
Because it's much more readable than this: P [ 1 ] << 16 | P [ 2 ] << 8
So it's a bug in both MS_CPP_7_1 and ISO_C.

Hope that helps.

P.S. This iterates MS_CPP's sequence points:
http://msdn.microsoft.com/library/de...nce_points.asp

Jul 21 '05 #287

Bill

Relf Wrote:

Howdy there Bill, Re: * ++ P << 16 | * ++ P << 8..., That code Should work. Why... you ask ?
Because it's much more readable than this: P [ 1 ] << 16 | P [ 2 ] << 8

Huh????
What does supposed READABILITY have to do with Code Correctness or
Undefined Behavior???
So it's a bug in both MS_CPP_7_1 and ISO_C.
That does not logically follow

....
P.S. This iterates MS_CPP's sequence points:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclang98/html/_pluslang_c.2b2b_.sequence_points.asp

I carefully looked over the list of sequence points and don't see one
that applies to your case
Which one do YOU think applies?

Bill

Jul 21 '05 #288

Jeff_Relf

Hi Olaf, I used a global instead of a constant or a local,
so that it didn't totally optimize away my code.
I've since found a better way to do that, this is the disassembly:

rv = Large_is_First_32( X );
* P << 24 | P [ 1 ] << 16 | P [ 2 ] << 8 | P [ 3 ] ;

1D xor ecx,ecx
1F mov ch,al
21 movzx edx,ah
24 mov dword ptr [esp+0Ch],eax ; Make the pointer
28 shr eax,10h
2B or ecx,edx
2D movzx edx,al
30 shl ecx,8
33 or ecx,edx
35 movzx eax,ah
38 shl ecx,8
3B or eax,ecx
3D mov dword ptr [rv (403018h)],eax ; Store rv, the Global int

vs. Your:

return (((byte) iOriginal)) << 24 | ((byte) (iOriginal >> 8)) << 16 |
((byte) (iOriginal >> 16)) << 8 | ((byte) (iOriginal >> 24));

90 mov eax,esi // eax=iOriginal
92 and eax,0FFh // (byte) iOriginal ( converting int to byte )
97 shl eax,18h // (byte) iOriginal<< 24
9a mov edx,esi // edx=iOriginal
9c sar edx,8 // iOriginal >> 8
9f and edx,0FFh // ((byte) (iOriginal >> 8))
a5 shl edx,10h // ((byte) (iOriginal >> 8)) << 16
a8 or eax,edx // ((byte) iOriginal)) << 24
// | ((byte) (iOriginal >> 8)) << 16
aa mov edx,esi // edx=iOriginal
ac sar edx,10h // iOriginal >> 16
af and edx,0FFh // ((byte) (iOriginal >> 16))
b5 shl edx,8 // ((byte) (iOriginal >> 16)) << 8
b8 or eax,edx // ((byte) iOriginal)) << 24
// | ((byte) (iOriginal >> 8)) << 16
// | ((byte) (iOriginal >> 16)) << 8
ba mov edx,esi // edx=iOriginal
bc sar edx,18h // iOriginal >> 24
bf and edx,0FFh // ((byte) (iOriginal >> 24)
c5 or eax,edx // ((byte) iOriginal)) << 24
// | ((byte) (iOriginal >> 8)) << 16
// | ((byte) (iOriginal >> 16)) << 8
// | ((byte) (iOriginal >> 24)
c7 mov ebx,eax // copy to the swap variable.

My code is faster, and more readable... sorry Olaf !
( And readability was my Only goal here... not speed )

This was the code I used:

#include <StdLib.H>
typedef unsigned char * uint_8_P ; int rv ;

__inline int Large_is_First_32 ( int X ) {
uint_8_P P = ( uint_8_P ) & X ;
return * P << 24 | P [ 1 ] << 16 | P [ 2 ] << 8 | P [ 3 ] ; }

int main(){ char * P = "84838281" ; register int X = strtoul( P, & P, 16 );
rv = Large_is_First_32( X );
return rv ;
}

The rv is there so that I can confirm that it works.
The strtoul() is there so that it doesn't optimize away my code.

Jul 21 '05 #289

In comp.os.linux.advocacy, Jeff_Relf
<Me@Privacy.NET>
wrote
on 13 Apr 2005 03:12:51 GMT
<Je************************@Cotse.NET>:

Hi Spooky, Re: My Swap_32(), You wrote: <<
Not sure if it's a bug or not. On GCC/x86 it would return 32. >>

Don't you guys believe in hex ? !

0x84838281 is a much Much better test,
as each byte is labeled and has its high bit set.
Well, ideally one would use several such...but for a single test
that's not a bad start.

By the way, on an x86, I should've reversed the order, like this:

typedef unsigned char * uint_8_P ;

int Swap_32 ( int X ) { uint_8_P P = ( uint_8_P ) & X ;
return * P << 24 | * ++ P << 16 | * ++ P << 8 | * ++ P ; }
main() {
// prints 0x84848484 ! ! MS_CPP_7_1 BUG
printf( "%x", Swap_32( 0x84838281 ) ); }
And you were going to encapsulate that in some sort of #ifdef, right?

Assigning each byte to a global makes it work, like this:

typedef unsigned char * uint_8_P ; int _1, _2, _3, _4 ;

int Swap_32 ( int X ) { uint_8_P P = ( uint_8_P ) & X ;
return ( _1 = * P << 24 ) | ( _2 = * ++ P << 16 )
| ( _3 = * ++ P << 8 ) | ( _4 = * ++ P ); }

main() {
// prints 0x81828384 as it should.
printf( "%x", Swap_32( 0x84838281 ) ); }
OK, who else wants to vote for this being the Ugliest Workaround
For A Possible Compiler Bug This Week? :-)

Removing the assignments, but keeping the parens, fails:
return ( * P << 24 ) | ( * ++ P << 16 ) | ( * ++ P << 8 ) | ( * ++ P );

As I said before, MS_CPP fails with or without the optimizer,
and MicroSoft claims that the | operator is evaluated left to right:
http://msdn.microsoft.com/library/de....operators.asp

Operator |
Name Bitwise inclusive OR
Associativity Left to right

Or just do the obvious fix:

return (P[0] << 24) | (P[1] << 16) | (P[2] << 8) | (P[3]);

How well does that one work for you?

--
#191, ew****@earthlink.net
It's still legal to go .sigless.

Jul 21 '05 #290

Tim Smith

In article <42**************@speakeasy.net>, Hurray for Peter Pumpkinhead wrote:

For example, consider a()+b()*c(). The compiler can call those three
functions in any order it wants. It will have to then do the
multiplication before it does the addition.

What if a static global is set in a() that needs to be called in b(),
c() to get a correct calculation?

Then you have to write it like this:

int temp = a();
whatever = temp + b()*c();
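
A minimal sketch of why the temporary helps, assuming a hypothetical global named scale that a() sets and that b() and c() read (none of these bodies come from the thread):

```c
static int scale;  /* hypothetical shared state, set by a() */

int a(void) { scale = 10; return 1; }
int b(void) { return 2 * scale; }
int c(void) { return 3 * scale; }

int compute(void)
{
    scale = 0;
    int temp = a();           /* the ';' guarantees a() has finished here */
    return temp + b() * c();  /* b and c may still run in either order */
}
```

Because a() completes before the second statement starts, compute() always returns 1 + 20 * 30, which is 601. Written as a() + b() * c() in one expression, b() or c() could run while scale is still 0.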

--
--Tim Smith

Jul 21 '05 #291

In comp.os.linux.advocacy, Hurray for Peter Pumpkinhead
<ja************@speakeasy.net>
wrote
on Wed, 13 Apr 2005 07:12:37 -0700
<42**************@speakeasy.net>:

Tim Smith wrote:
For example, consider a()+b()*c(). The compiler can call those three
functions in any order it wants. It will have to then do the
multiplication before it does the addition.

What if a static global is set in a() that needs to be called in b(),
c() to get a correct calculation?

Then one has unpredictable behavior and will be lucky to
get a result even remotely reproducible, and probably one
will get some rather subtle and oddball symptoms to hash
through before one gets at the root cause of the problem,
during porting of the application from a platform that
works to another platform essential to one's business plan.

In short: problem-in-waiting. :-)

Of course the workaround isn't too bad; either factor out
the global and have a setup() routine prior to evaluating
the expression in question, or put the result of a()
into a variable, and force the issue:

double aVal = a();
/* ... use ... */ aVal+b()*c(); /* ... */

I could see all of a(), b() and c() calling a setup()
routine if they all needed access to the same pointer
(setup() would create and/or return it to them), though;
so long as setup() does the same thing every time it won't
really matter *who* gets called first:

***

Something * staticValue;

void setup()
{
/* ... blah blah whatever blah blah ... */
staticValue = new Something(...);
/* ... blah blah whatever blah blah ... */
}

double a()
{
if(staticValue == 0) setup();
return staticValue->a();
}

double b()
{
if(staticValue == 0) setup();
return staticValue->b();
}

double c()
{
if(staticValue == 0) setup();
return staticValue->c();
}

***

This is admittedly not the best of code (thread problem?
*What* thread problem? :-) ) but at least it wouldn't depend
on an explicit order of evaluation.

--
#191, ew****@earthlink.net
It's still legal to go .sigless.

Jul 21 '05 #292

In comp.os.linux.advocacy, Bill
<sa********@hotmail.com>
wrote
on 13 Apr 2005 08:15:13 -0700
<11**********************@l41g2000cwc.googlegroups .com>:

Relf Wrote:
Howdy there Bill, Re: * ++ P << 16 | * ++ P << 8...,
That code Should work. Why... you ask ?
Because it's much more readable than this: P [ 1 ] << 16 | P [ 2 ] << 8

Huh????
What does supposed READABILITY have to do with Code Correctness or
Undefined Behavior???

Ah, remember -- it's the "Jeff-Relf-Readability Test". If he can
read it, no problem. If he can't, it's all your fault. :-)

So it's a bug in both MS_CPP_7_1 and ISO_C.
That does not logically follow

Make that "Jeff-Relf-Logic" as well.

...
P.S. This iterates MS_CPP's sequence points:
[URL repair]
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclang98/html/_pluslang_c.2b2b_.sequence_points.asp
I carefully looked over the list of sequence points and don't see one
that applies to your case
Which one do YOU think applies?

Interesting -- "An expression can modify an objects [sic] value only
once between consecutive sequence points". I think this explains
the behavior in part that he saw earlier, in that the declaration

unsigned char * P = (unsigned char *) &X + 4;
return *--P << 24 | *--P << 16 | *--P << 8 | *--P;

was for some reason returning 0x20202020 for input value 0x00000020;
apparently the four decrements were being collapsed.
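
One way to read the two wrong results reported in this thread (0x20202020 for input 0x00000020, and 0x84848484 for input 0x84838281) is that all the pointer updates happened before any byte was read, so the same byte was used four times. The sketch below only models that effect; it is a guess about the outcome, not a statement about what the compiler does internally, and same_byte_four_times is a hypothetical name:

```c
#include <stdint.h>

/* Models the observed output only: every one of the four reads sees
   the same byte, which is then placed in all four positions. */
uint32_t same_byte_four_times(unsigned char byte)
{
    uint32_t b = byte;
    return b << 24 | b << 16 | b << 8 | b;
}
```

Feeding it 0x20 gives 0x20202020 and feeding it 0x84 gives 0x84848484, matching both reported outputs.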

However, that does not explain why the modified form

unsigned char * P = (unsigned char *) &X;
return *P++ << 24 | *P++ << 16 | *P++ << 8 | *P;

actually *works*.

My brain is starting to hurt.

Bill

--
#191, ew****@earthlink.net
It's still legal to go .sigless.

Jul 21 '05 #293

> Hi Olaf, I used a global instead of a constant or a local,

so that it didn't totally optimize away my code.
I've since found a better way to do that, this is the disassembly:

rv = Large_is_First_32( X );
* P << 24 | P [ 1 ] << 16 | P [ 2 ] << 8 | P [ 3 ] ;

1D xor ecx,ecx
1F mov ch,al
21 movzx edx,ah
24 mov dword ptr [esp+0Ch],eax ; Make the pointer
28 shr eax,10h
2B or ecx,edx
2D movzx edx,al
30 shl ecx,8
33 or ecx,edx
35 movzx eax,ah
38 shl ecx,8
3B or eax,ecx
3D mov dword ptr [rv (403018h)],eax ; Store rv, the Global int
This is very close to something that I would have created if I would have
made it in assembler directly.
I assume that 'eax' starts with the X variable?
This is clearly created by an optimizer compiler.

I did check your code by hand, it works. :-)

Only the line:
24 mov dword ptr [esp+0Ch],eax ; Make the pointer
is obsolete, and in fact it is not storing the pointer, it is the actual X
value, maybe for future reference?

Try this = P[0] << 24 | P [ 1 ] << 16 | P [ 2 ] << 8 | P [ 3 ] ;
instead of this * P << 24 | P [ 1 ] << 16 | P [ 2 ] << 8 | P [ 3 ] ;
You might lose that additional instruction.
My code is faster, and more readable... sorry Olaf !
( And readability was my Only goal here... not speed )

Why sorry, if it works then it is a good job? :-)
And the fact that you adapt your coding style and prove that you are willing
to learn and adapt is a very good thing.

But the code I showed compiled in C# and with minor change (BYTE instead of
byte) also C++.
But regarding speed, it is really compiler dependent. I do not have an
optimizing compiler for C++ since I use VC++ 2003 Standard, so my generated
assembler code will be bigger.
I do not know if VC# 2003 Professional has a compiler optimizer???

Also note that the C# generated assembler and the unmanaged C++ generated
assembler was nearly exactly the same. My original point in the discussion
was to prove that C# was totally not slow compared to unmanaged C++ when you
look at the assembler level. I cannot compare with a compiler optimizer,
since I don't have one available.

So I would be interested to look at the optimized code from my function. Can
you give me that?
I mean, how did you determine that my code is slower? Did you test this with
your compiler?

Jul 21 '05 #294

Jeff_Relf

Hi Spooky, I'm saying that the following code should work, but doesn't:

Large_is_First_32 ( int X ) {
unsigned char * P = ( unsigned char * ) & X ;
return * P ++ << 24 | * P ++ << 16 | * P ++ << 8 | * P ; }

C_99 and MS_CPP should both be updated so that it will work, in my opinion,
because P [ 1 ] << 16 | P [ 2 ] << 8
is too ugly, too weird, and too unnecessary.

This is the code that works:

#include <StdLib.H>
typedef char * int_8_P ; typedef unsigned char * uint_8_P ; int rv ;

int Large_is_First_32 ( int X ) {
uint_8_P P = ( uint_8_P ) & X ;
return * P << 24 | P [ 1 ] << 16 | P [ 2 ] << 8 | P [ 3 ] ; }

int main(){ int_8_P P = "84838281" ;
int X = strtoul( P, & P, 16 ); rv = Large_is_First_32( X );
return rv ; } // rv is now 0x81828384

The optimizer inlines the code as follows:
rv = * P << 24 | P [ 1 ] << 16 | P [ 2 ] << 8 | P [ 3 ] ;

1D xor ecx,ecx
1F mov ch,al
21 movzx edx,ah
24 mov dword ptr [ esp + 0Ch ],eax
28 shr eax,10h
2B or ecx,edx
2D movzx edx,al
30 shl ecx,8
33 or ecx,edx
35 movzx eax,ah
38 shl ecx,8
3B or eax,ecx
3D mov dword ptr [ rv ( 403018h ) ],eax

Such fast and pretty code !

The strtoul() must be there or my code gets optimized away.
rv, a global, must be there to confirm that it works.

Jul 21 '05 #295

Jeff_Relf

Hi Olaf, Re: This code of mine:

uint_32 Big_First_32 ( int & X ) { uint_8_P B = ( uint_8_P ) & X ;
return * B << 24 | B [ 1 ] << 16 | B [ 2 ] << 8 | B [ 3 ]; }

00 xor edx,edx
02 mov dh,al
04 movzx ecx,ah
07 or edx,ecx
09 mov ecx,dword ptr [esp+18h]
0D shr ecx,10h
10 shl edx,8
13 movzx ebx,cl
16 or edx,ebx
18 shl edx,8
1B movzx ecx,ch
1E or edx,ecx

You told me: << This is very close to something that I would have created
if I would have made it in assembler directly.
I assume that 'eax' starts with the X variable ?
This is clearly created by an optimizer compiler. >>

Yes, hovering the mouse cursor over eax shows 0x84838281,
which is what I set X to at the beginning ( in that test ).

And Yes, it was optimized with MS_CPP's /Og switch, Global optimizations:
http://msdn.microsoft.com/library/de...ore_.2f.og.asp

You added: << I did check your code by hand, it works. :-)
Only the line:
24 mov dword ptr [esp+0Ch],eax ; Make the pointer
is obsolete and in fact it is not storing the pointer
it is the actual X value, maybe for future reference ? >>

Right, it's being placed on the stack, I should've realized that.
I'm not really into the disassembly, as you can tell.

I don't know why it was there,
because my latest code does it the other way around, like this:
09 mov ecx,dword ptr [esp+18h]

I had to add the & here: Big_First_32 ( int & X )
to stop the /Og switch from breaking my code in certain Bizarre cases,
....another bug with MS_CPP ?

The /Og switch is notorious for breaking my code like that,
I've stopped using it in my professional code.

Re: This idea of yours: P[0] << 24 instead of * P << 24,

That produces the exact same disassembly.

Re: If VC# 2003 Professional has an optimizer,

I don't know... I only use the CPP part.
I got my copy of VS_2003_Pro from a friend, it cost me nothing.
( Universities are like that )

You wrote: << Also note that the C# generated assembler
and the unmanaged C++ generated assembler was nearly exactly the same.
My original point in the discussion was
to prove that C# was totally not slow
compared to unmanaged C++ when you look at the assembler level.
I cannot compare with an optimizing compiler,
since I don't have one available.

So I would be interested to look at the optimized code from my function.
Can you give me that ?
I mean how did you determine that my code is slower ?
Did you test this with your compiler ? >>

Your code works, and it becomes:

uint_32 Swap_32 ( int X ) {
return uint_8( X ) << 24 | uint_8( X >> 8 ) << 16
| uint_8( X >> 16 ) << 8 | uint_8( X >> 24 ); }

1B movzx ecx,ah
1E mov ch,al
20 mov edx,eax
22 sar edx,10h
25 movzx edx,dl
28 sar eax,18h
2B shl ecx,8
2E or ecx,edx
30 movzx eax,al
33 shl ecx,8
36 or eax,ecx
38 mov dword ptr [rv (403018h)],eax

Adding Winsock2.H's htonl() to the mix,
Your C++ code is the fastest; this is the output:
.00401 Seconds, Sum 2147244176424960, Swap_32( 0 - 999,999 ).
.00469 Seconds, Sum 2147244176424960, Big_First_32( 0 - 999,999 ).
.00789 Seconds, Sum 2147244176424960, htonl( 0 - 999,999 ).

But, as I said before, readability was my only goal, not speed,
and I find my code to be more readable.

Here's the code that produced the above output:
http://www.Cotse.NET/users/jeffrelf/Kelsey.CPP
http://www.Cotse.NET/users/jeffrelf/Kelsey.VCPROJ

#pragma warning( disable: 4127 4244 4706 )
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <StdLib.H>
#include <stdio.h>
#include <IO.H>
#include <Winsock2.H>
#pragma comment( lib, "Ws2_32.LIB")

#define Loop( N ) int J = - 1, LLL = N ; while ( ++ J < LLL )
#define Tics ( QueryPerformanceCounter( ( Quad * ) & _Tics ), _Tics )
#define Secs ( _Secs = Tics / Secnd_Dub )

typedef char * int_8_P ;
typedef unsigned char uint_8 ; typedef uint_8 * uint_8_P ;
typedef unsigned __int32 uint_32 ;
typedef LARGE_INTEGER Quad ;

double Secnd_Dub, _Secs, Mark ; __int64 _Tics, Secnd ;

uint_32 Big_First_32 ( int & X ) { uint_8_P B = ( uint_8_P ) & X ;
return * B << 24 | B [ 1 ] << 16 | B [ 2 ] << 8 | B [ 3 ]; }

uint_32 Swap_32 ( int X ) {
return uint_8( X ) << 24 | uint_8( X >> 8 ) << 16
| uint_8( X >> 16 ) << 8 | uint_8( X >> 24 ); }

int main() {
QueryPerformanceFrequency( ( Quad * ) & Secnd ); Secnd_Dub = Secnd ;
FILE * fp = fopen( "AA.TXT", "w" );
const int Times = 1000 * 1000 ;

__int64 X = 0 ; Mark = Secs ;
{ Loop( Times ) X += Swap_32( J ); } double Dur = Secs - Mark ;

__int64 X2 = 0 ; Mark = Secs ;
{ Loop( Times ) X2 += Big_First_32( J ); } double Dur2 = Secs - Mark ;

__int64 X3 = 0 ; Mark = Secs ;
Loop( Times ) X3 += htonl( J ); double Dur3 = Secs - Mark ;

char SecStr [ 99 ] ;
sprintf( SecStr, "%1.5f" , Dur );
fprintf( fp, "%s Seconds, Sum %I64d, Swap_32( 0 - 999,999 ).\n"
, SecStr + ( * SecStr == '0' ), X );

sprintf( SecStr, "%1.5f" , Dur2 );
fprintf( fp, "%s Seconds, Sum %I64d, Big_First_32( 0 - 999,999 ).\n"
, SecStr + ( * SecStr == '0' ), X2 );

sprintf( SecStr, "%1.5f" , Dur3 );
fprintf( fp, "%s Seconds, Sum %I64d, htonl( 0 - 999,999 ).\n"
, SecStr + ( * SecStr == '0' ), X3 ); fclose( fp ); }
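
For anyone who wants to repeat the measurement without QueryPerformanceCounter or Winsock, here is a rough portable sketch using std::chrono. It is not the program that produced the numbers above, and the timings will differ by machine and compiler. swap_32 mirrors Swap_32 with unsigned arithmetic; sum_swapped, seconds_taken and timed_million_swaps are names invented for this example. The only figure carried over is the sum 2147244176424960 from the output above.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Same byte swap as Swap_32, written with unsigned arithmetic throughout.
inline std::uint32_t swap_32(std::uint32_t x) {
    return (x << 24) | (((x >> 8) & 0xFFu) << 16)
         | (((x >> 16) & 0xFFu) << 8) | (x >> 24);
}

// Hypothetical helper: sums swap_32(j) for j in [0, count).
inline std::uint64_t sum_swapped(std::uint32_t count) {
    std::uint64_t sum = 0;
    for (std::uint32_t j = 0; j < count; ++j) sum += swap_32(j);
    return sum;
}

// Hypothetical helper: runs fn once and returns the elapsed seconds.
template <typename Fn>
double seconds_taken(Fn fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Times one million swaps and checks the total against the sum reported above.
inline double timed_million_swaps() {
    std::uint64_t sum = 0;
    const double secs = seconds_taken([&] { sum = sum_swapped(1000 * 1000); });
    assert(sum == 2147244176424960ULL);
    return secs;
}
```

Storing the sum in a variable that is checked afterwards plays the same role as the global rv earlier: it keeps the optimizer from discarding the loop.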

Jul 21 '05 #296

Nice test. :-)

Right now I am checking out this IL Assembler.
It appears that the managed/unmanaged thing used in C++ is not native to C++,
but could be implemented in C# too if they would change the C# compiler.
It is actually built into the JIT itself.

The IL assembler is an object-oriented assembler based on a stack,
similar to Forth (if I remember correctly),
or the way HP scientific calculators work. You push 2 variables on the
stack and then say ADD, and the result is stored on that stack ready to be
used for the next operation.
It takes some getting used to. :-)
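
To make the push-push-ADD idea concrete, here is a toy stack machine in C++. It is only a sketch of the principle: the Op, Instr and run names are invented for this example and are not real IL opcodes or any .NET API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical two-instruction stack machine, loosely in the spirit of IL.
enum class Op { Push, Add };

struct Instr {
    Op op;
    std::int32_t operand;  // used by Push only
};

// Executes the program and returns the single value left on the stack.
inline std::int32_t run(const std::vector<Instr>& program) {
    std::vector<std::int32_t> stack;
    for (const Instr& in : program) {
        if (in.op == Op::Push) {
            stack.push_back(in.operand);
        } else {  // Op::Add: pop two values, push their sum
            assert(stack.size() >= 2);
            const std::int32_t b = stack.back(); stack.pop_back();
            const std::int32_t a = stack.back(); stack.pop_back();
            stack.push_back(a + b);
        }
    }
    assert(stack.size() == 1);
    return stack.back();
}
```

A program of Push 2, Push 3, Add leaves 5 on the stack, ready to be used by the next operation.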

But VC# 2003 clearly does not optimize that IL code, in either release or
debug builds, when I look at it with ILDASM.
Maybe the Professional version does that? So it is logical that the
executable code compiled by the JIT is also an almost one-to-one translation
of the C# instructions.

Now rumors are that the 64 bit JIT of .NET framework 2.0 has more time to do
things, so it creates a lot more optimized code. But the 32 bit version does
not.
The reason why the JIT is not optimizing that much is that it would take
a very long time to load the .NET program. So they chose less optimizing
to speed up the loading process.

I also discovered that there exists an NGEN program that pre-compiles a .NET
program, so it has a more optimized solution and thus also runs faster. But it
has some issues, so it is not used that much. But rumors also tell me that
this NGEN is actually used to compile the .NET framework assemblies during
the install procedure so they execute faster.

So a lot of interesting things to discover. :-)

Jul 21 '05 #297

Jeff_Relf

Hi Olaf, You told me: << Nice test. :-) >>

Thanks, and now that I've thought about it some more,
I think your method, given current constraints,
is the most readable, as well as the fastest:

uint_32 Swap_32 ( int X ) {
return uint_8( X ) << 24 | uint_8( X >> 8 ) << 16
| uint_8( X >> 16 ) << 8 | uint_8( X >> 24 ); }

But I still think the following code should be legal,
as it would be even more readable and would be quite optimizable.

uint_32 Big_First_32 ( int X ) { uint_8_P P = ( uint_8_P ) & X ;
return * P << 24 | * ++ P << 16 | * ++ P << 8 | * ++ P ; }
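
The reason that version is not legal today is that the operands of | are not sequenced relative to each other, so the three ++ P modifications of the same pointer have no defined order. Until the rules change, one way to keep the incrementing-pointer style is to give each increment its own statement. The sketch below is just one possible shape, with the made-up names big_first_from_bytes and check_big_first_from_bytes; it takes the bytes directly, so the result does not depend on the host's byte order.

```cpp
#include <cassert>
#include <cstdint>

// Packs four bytes, first byte most significant. Each p++ sits in its own
// full expression, so the order of the reads is well defined.
inline std::uint32_t big_first_from_bytes(const unsigned char* p) {
    std::uint32_t result = 0;
    for (int i = 0; i < 4; ++i) {
        result = (result << 8) | *p++;
    }
    return result;
}

// Quick check with the byte values from the earlier test.
inline void check_big_first_from_bytes() {
    const unsigned char bytes[4] = {0x81, 0x82, 0x83, 0x84};
    assert(big_first_from_bytes(bytes) == 0x81828384u);
}
```

A compiler that unrolls the four-iteration loop would have the same shifts and ORs to work with as in the one-line version.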

Re: Adding the IL Assembler to C#'s compiler
and possibly optimizing managed code on install,

That would certainly speed things up.

As long as you're making changes to the compiler like that,
I vote for #define and more sequence_points.

Jul 21 '05 #298