C / asm / long ints

fermineutron

A while back I tried to calculate factorials of large numbers using
arrays in C. The array-encoded integer arithmetic that I wrote in C
was very slow; it would take almost a second to multiply two array-encoded
integers. Recently I looked at large-precision libraries for C,
in particular GMP, but I was unable to get it to run with my compiler;
apparently some header files were not found. I was curious what the
best way is to interface C and asm so that I could write a similar library
in asm and use it in C.

Any ideas?

Oct 26 '06 #1


Richard Heathfield

fermineutron said:

A while back I tried to calculate factorials of large numbers using
arrays in C. The array-encoded integer arithmetic that I wrote in C
was very slow; it would take almost a second to multiply two array-encoded
integers. Recently I looked at large-precision libraries for C,
in particular GMP, but I was unable to get it to run with my compiler;
apparently some header files were not found. I was curious what the
best way is to interface C and asm so that I could write a similar library
in asm and use it in C.

Assembly language is not a magic wand you can wave to turn bad algorithms
into good ones. If your C routines weren't quick enough, your assembly
language routines are very unlikely to be quick enough either.

Choose Better Algorithms. Then implement them in C.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)

Oct 26 '06 #2

mensanator

fermineutron wrote:

A while back I tried to calculate factorials of large numbers using
arrays in C. The array-encoded integer arithmetic that I wrote in C
was very slow; it would take almost a second to multiply two array-encoded
integers. Recently I looked at large-precision libraries for C,
in particular GMP, but I was unable to get it to run with my compiler;
apparently some header files were not found. I was curious what the
best way is to interface C and asm so that I could write a similar library
in asm and use it in C.

Any ideas?

It would probably be easier to figure out how to make GMP
work on your system. And worth the effort.

Oct 26 '06 #3

Jack "Abram" Off

fermineutron wrote:

A while back I tried to calculate factorials of large numbers using
arrays in C. The array-encoded integer arithmetic that I wrote in C
was very slow

You probably could write assembly to do the manipulations that you do in
your multiply procedure "a little less slowly", but I suspect what you
*really* want is something borrowed from higher maths: the FFT multiply.

http://numbers.computation.free.fr/C...ithms/fft.html

Variations on this theme are found in all kinds of programming
situations where really big numbers are involved, and in your own
experiments you stumbled upon the itch that is scratched by this approach.

If approximate solutions to n! would be useful in your application,
Stirling gave us something in the 18th century that finds all kinds
of uses in computing today:
http://mathworld.wolfram.com/Stirlin...oximation.html
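
For reference, the approximation being pointed to is usually stated as below, together with its logarithmic form, which is the one that stays in floating-point range for large n. The worked figures for n = 10 are only an illustration added here, not something from the linked page.

```latex
n! \approx \sqrt{2\pi n}\,\left(\frac{n}{e}\right)^{n},
\qquad
\ln n! \approx n\ln n - n + \tfrac{1}{2}\ln(2\pi n)

% Worked example, n = 10:
\ln 10! \approx 23.0259 - 10 + 2.0702 = 15.0961,
\qquad
\ln(3628800) = 15.1044
```

So for n = 10 the estimate comes out at roughly 3.5987 x 10^6 against the exact 3628800, about 0.8% low; the relative error shrinks roughly like 1/(12n) as n grows.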

If you study computer science in a university setting, you will see
Stirling's formula in the first course of discrete maths, and the DFT
("Discrete Fourier Transform") will come back to haunt you a few times
in various algorithm and automata courses. (Or at least it *should*;
some schools appear to not actually teach Computer Science.)

Oct 26 '06 #4

Nishu

fermineutron wrote:

A while back I tried to calculate factorials of large numbers using
arrays in C. The array-encoded integer arithmetic that I wrote in C
was very slow; it would take almost a second to multiply two array-encoded
integers. Recently I looked at large-precision libraries for C,
in particular GMP, but I was unable to get it to run with my compiler;
apparently some header files were not found. I was curious what the
best way is to interface C and asm so that I could write a similar library
in asm and use it in C.

Some compilers support embedded asm, but it is a waste of time unless you
understand the underlying processor architecture, its instruction set,
how your compiler generates the corresponding asm for
your optimized C routine, and so on. It is a big time investment.
A less time-consuming method with an appreciable return is to
optimize the C routine itself; there are various techniques to do
so which are easily available on the net and also in some previous
threads in this group.

-Nishu

Oct 27 '06 #5

Chris Thomasson

"Nishu" <na**********@gmail.com> wrote in message
news:11**********************@m7g2000cwm.googlegroups.com...

fermineutron wrote:

[...]

>I was curious what the
best way is to interface C and asm so that I could write a similar library
in asm and use it in C.

[...]

You make your library's ABI follow the C calling convention for the processor
you're targeting. For x86 prams are passed on the stack left-to-right, and
SPARC passes prams mostly in registers, etc...

Here are two examples of how to use IA-32 assembly language to build a
library that has a strict C API, and an ABI that follows the C calling
convention for the IA-32:

http://appcore.home.comcast.net/

http://appcore.home.comcast.net/vzoom/refcount/
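
To make the calling-convention point concrete, here is a minimal sketch in C of the kind of routine one might later hand-code in assembly. The name `big_add` and the limb layout (32-bit limbs, least significant first) are assumptions for illustration and are not taken from the libraries linked above. The prototype is all a C caller needs to see; an assembly version would export the same symbol and follow whatever convention the target's C compiler uses for that prototype (for 32-bit x86 cdecl, arguments are typically pushed right-to-left and the caller cleans the stack, but the platform ABI document is the thing to check).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interface: this prototype is what a C caller would see in a
   header. An assembly implementation would export the same symbol and obey
   the platform's C calling convention for these four arguments. */
uint32_t big_add(uint32_t *dst, const uint32_t *a, const uint32_t *b, size_t n);

/* Portable C reference implementation: adds two n-limb numbers (least
   significant limb first) into dst and returns the carry out of the top
   limb (0 or 1). */
uint32_t big_add(uint32_t *dst, const uint32_t *a, const uint32_t *b, size_t n)
{
    uint32_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t sum = (uint64_t)a[i] + b[i] + carry;
        dst[i] = (uint32_t)sum;         /* low 32 bits of the limb sum */
        carry = (uint32_t)(sum >> 32);  /* carry into the next limb */
    }
    return carry;
}
```

Keeping a C version like this around gives a reference to test any assembly replacement against, and a fallback for platforms the assembly does not cover.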

Oct 27 '06 #6

Richard Bos

"Jack \"Abram\" Off" <ja*****@fraud.net> wrote:

fermineutron wrote:
A while back I tried to calculate factorials of large numbers using
arrays in C. The array-encoded integer arithmetic that I wrote in C
was very slow

You probably could write assembly to do the manipulations that you do in
your multiply procedure "a little less slowly", but I suspect what you
*really* want is something borrowed from higher maths: the FFT multiply.

http://numbers.computation.free.fr/C...ithms/fft.html

That's nice and fast, but it's a floating-point process and introduces
errors. All very well for floating point computations which are already
imprecise, but if you start out with a nice, exact integer array it's
probably not what you want.

Richard

Oct 27 '06 #7

Rod Pemberton

"fermineutron" <fr**********@yahoo.com> wrote in message
news:11**********************@m73g2000cwd.googlegroups.com...

A while back I tried to calculate factorials of large numbers using
arrays in C. The array-encoded integer arithmetic that I wrote in C
was very slow; it would take almost a second to multiply two array-encoded
integers. Recently I looked at large-precision libraries for C,
in particular GMP, but I was unable to get it to run with my compiler;
apparently some header files were not found. I was curious what the
best way is to interface C and asm so that I could write a similar library
in asm and use it in C.

Any ideas?

About the only realistic way you're going to get x86 assembly to outperform
highly optimized C, is to get a true x86 assembly expert, like Terje
Mathisen, to code it for you.

Most current C compilers are extremely efficient in generating assembly.
I've gone to great lengths to outcode C compilers for x86 in a couple
special situations. The best I could do was almost match the C compiler...
The problem for you is that the C optimizer can take full advantage of
extremely complicated situations and special instructions to generate the
best code. In fact, some C optimizers generate hundreds of trial
combinations. Most people just can't handle such complexity or convoluted
situations.

Are there ways to make your C code faster? Yes.

1) buy a new computer, a 2 GHz AMD is roughly 1000 times faster than a 500
MHz AMD x86 CPU
2) find a better algorithm, a brute force factorization may take a second or
two, many times slower than an elliptic curve factorization
3) switch from, say GCC, to a compiler which is known for more efficient
code, say OpenWatcom or Digital Mars
4) completely unroll any loops, this reduces branching which is always
expensive in assembly
5) completely unroll any loops, occasionally the loop size can be reduced by
one, depending on how the loop was coded
6) precompute as many operations as possible, even extremely large lookup
tables are much faster than computation
7) make an attempt to reduce the number of variables used in the
calculations
8) replace multiplications and divisions with additions, subtractions,
bitshifts
9) don't attempt to access integer data smaller than the largest assembly
integer type of the CPU (32-bits for 32-bit cpu, 64-bit for...)
10) play around with a decent number of compiler optimizations, usually a
small number of them will offer the most improvement
11) although C compilers are very good with optimization, they aren't
perfect. Forcing the use of a register can improve the code's speed
Rod Pemberton

Oct 27 '06 #8

Richard Heathfield

Rod Pemberton said:

"fermineutron" <fr**********@yahoo.com> wrote in message
news:11**********************@m73g2000cwd.googlegroups.com...
>A while back I tried to calculate factorials of large numbers using
arrays in C. The array-encoded integer arithmetic that I wrote in C
was very slow; it would take almost a second to multiply two array-encoded
integers.

<snip>

Are there ways to make your C code faster? Yes.

1) buy a new computer, a 2 GHz AMD is roughly 1000 times faster than a 500
MHz AMD x86 CPU

That ought to be unnecessary.

2) find a better algorithm

Of all your suggestions, this is the best. When the OP first posted on this
subject (factorial calculations using bignums), it was immediately evident
that his algorithms were suspect, since their implementations were very
very very much slower (by orders of magnitude, IIRC) than my own routines,
which I wrote with more regard to clarity than to speed.

<snip>

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)

Oct 27 '06 #9

CBFalconer

Chris Thomasson wrote:

... snip ...

You make your library's ABI follow the C calling convention for the
processor you're targeting. For x86 prams are passed on the stack
left-to-right, and SPARC passes prams mostly in registers, etc...

Is that rather hard on the babies in the prams? Child abuse? :-)

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Oct 27 '06 #10

Chris Thomasson

"CBFalconer" <cb********@yahoo.com> wrote in message
news:45***************@yahoo.com...

Chris Thomasson wrote:

... snip ...

You make your library's ABI follow the C calling convention for the
processor you're targeting. For x86 prams are passed on the stack
left-to-right, and SPARC passes prams mostly in registers, etc...

Is that rather hard on the babies in the prams? Child abuse? :-)

Whoa! I meant params of course!
lol

;^)

Oct 27 '06 #11

Rod Pemberton

"Richard Heathfield" <in*****@invalid.invalid> wrote in message
news:b4******************************@bt.com...

Rod Pemberton said:

"fermineutron" <fr**********@yahoo.com> wrote in message
news:11**********************@m73g2000cwd.googlegroups.com...
A while back I tried to calculate factorials of large numbers using
arrays in C. The array-encoded integer arithmetic that I wrote in C
was very slow; it would take almost a second to multiply two array-encoded
integers.

<snip>

Are there ways to make your C code faster? Yes.

1) buy a new computer, a 2 GHz AMD is roughly 1000 times faster than a 500
MHz AMD x86 CPU

That ought to be unnecessary.

2) find a better algorithm

Of all your suggestions, this is the best. When the OP first posted on this
subject (factorial calculations using bignums), it was immediately evident
that his algorithms were suspect, since their implementations were very
very very much slower (by orders of magnitude, IIRC) than my own routines,
which I wrote with more regard to clarity than to speed.

<snip>

2) find a better algorithm

Of all your suggestions, this is the best.

I read your post. I originally had the line: "I have to agree with
Healthfield." but thought better of it... ;)

A poor algorithm may be the _cause_ of his problems, but, the _best_
suggestion, IMO, is to continue to use C and not switch to assembly. He
could spend his lifetime attempting to get his assembly to outperform
optimized C.

All of the other methods work and are easy enough to implement too. If he
applies all of them, except for two that offer the most improvement:
1) finding a better algorithm
2) buying a new computer
he could still easily see a 300-900% improvement in speed.

One has to ask why he wants to compute large factorials. The usual
answer leads indirectly to something which could be used to break
encryption, or directly to the breaking of some encryption scheme. So, I
made the assumption that the OP wanted the absolutely fastest implementation
he could obtain without spending years coding assembly and without spending
a fortune. The easiest way is to spend some money on faster equipment.
You can buy an extremely fast PC for less than $2k USD, or a 16-core Tyan
Typhoon PSC for $10-16k USD.
Rod Pemberton

Oct 27 '06 #12

Richard Heathfield

Rod Pemberton said:

<snip>

One has to ask why he wants to compute large factorials.

Probably the same reason I do - for the hell of it.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)

Oct 27 '06 #13

Mark F. Haigh

Rod Pemberton wrote:
<snip>

>
About the only realistic way you're going to get x86 assembly to outperform
highly optimized C, is to get a true x86 assembly expert, like Terje
Mathisen, to code it for you.

<snip>

Agreed for the most part, but one can often realize speed gains of an
order of magnitude or more by using special-purpose assembly
instructions that current-generation compilers generally do not emit.
A compiler will generally optimize the hell out of loops dealing with
bits in unsigned integers, but will generally not (yet) understand that
the entire thing could be perhaps replaced by a popcntbd (POWER5
population count) or a bsr (Pentium 4 bit scan reverse) depending on
the loop, etc.

If such a thing is at the heart of something like a packet classifier
or image filter, even an assembly non-expert can effect a measurable
performance gain.
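
A rough sketch of the contrast, staying in C: the two loops below are the portable bit-twiddling versions, and the `_fast` variants use the GCC/Clang builtins `__builtin_popcount` and `__builtin_clz`, which those compilers may lower to a single instruction where the target has one. The function names are made up for this illustration, and whether the builtin actually beats the loop depends on the CPU and compiler flags, so it is worth measuring.

```c
#include <limits.h>

/* Portable population count: clears the lowest set bit on each pass. */
static unsigned popcount_loop(unsigned x)
{
    unsigned n = 0;
    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Portable index of the highest set bit (0-based). x must be nonzero. */
static unsigned highest_bit_loop(unsigned x)
{
    unsigned i = 0;
    while (x >>= 1)
        i++;
    return i;
}

/* Same result via a compiler builtin where available. */
static unsigned popcount_fast(unsigned x)
{
#if defined(__GNUC__)
    return (unsigned)__builtin_popcount(x);
#else
    return popcount_loop(x);
#endif
}

/* Highest set bit via count-leading-zeros. x must be nonzero, because
   __builtin_clz(0) is undefined. */
static unsigned highest_bit_fast(unsigned x)
{
#if defined(__GNUC__)
    return (unsigned)(sizeof x * CHAR_BIT - 1) - (unsigned)__builtin_clz(x);
#else
    return highest_bit_loop(x);
#endif
}
```

This keeps the source in C while still giving the compiler a chance to emit the special-purpose instruction, which is often a cheaper first step than writing the assembly by hand.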
Mark F. Haigh
mf*****@sbcglobal.net

Oct 27 '06 #14

Rod Pemberton

"Mark F. Haigh" <mf*****@sbcglobal.net> wrote in message
news:11**********************@i42g2000cwa.googlegroups.com...

Rod Pemberton wrote:
<snip>

About the only realistic way you're going to get x86 assembly to outperform
highly optimized C, is to get a true x86 assembly expert, like Terje
Mathisen, to code it for you.
<snip>

Agreed for the most part, but one can often realize speed gains of an
order of magnitude or more by using special-purpose assembly
instructions that current-generation compilers generally do not emit.
A compiler will generally optimize the hell out of loops dealing with
bits in unsigned integers, but will generally not (yet) understand that
the entire thing could be perhaps replaced by a popcntbd (POWER5
population count) or a bsr (Pentium 4 bit scan reverse) depending on
the loop, etc.

If such a thing is at the heart of something like a packet classifier
or image filter, even an assembly non-expert can effect a measurable
performance gain.

Although I have no problem with your point that there may be some situations
where assembly is better, I currently believe that most of the instructions
which aren't emitted by x86 compilers are left out because they aren't a good
choice for speed. Using BSR as an example of a special-purpose instruction
which could give performance gains was probably a bad example. The timings I
have access to indicate that BSR can be very slow and it is non-pairable on the
P4. It's not likely any compiler would use it in a situation requiring
speed, but they may use it to reduce code size.

Cycle timings for BSR reg,reg:
  386   10+3n
  486   6-103
  P4    7-73
Rod Pemberton

Oct 27 '06 #15

Mark F. Haigh

Rod Pemberton wrote:

"Mark F. Haigh" <mf*****@sbcglobal.net> wrote in message
news:11**********************@i42g2000cwa.googlegroups.com...
Rod Pemberton wrote:
<snip>

About the only realistic way you're going to get x86 assembly to outperform
highly optimized C, is to get a true x86 assembly expert, like Terje
Mathisen, to code it for you.

<snip>

Agreed for the most part, but one can often realize speed gains of an
order of magnitude or more by using special-purpose assembly
instructions that current-generation compilers generally do not emit.
A compiler will generally optimize the hell out of loops dealing with
bits in unsigned integers, but will generally not (yet) understand that
the entire thing could be perhaps replaced by a popcntbd (POWER5
population count) or a bsr (Pentium 4 bit scan reverse) depending on
the loop, etc.

If such a thing is at the heart of something like a packet classifier
or image filter, even an assembly non-expert can effect a measurable
performance gain.

Although I have no problem with your point that there may be some situations
where assembly is better, I currently believe that most of the instructions
which aren't emitted by x86 compilers are left out because they aren't a good
choice for speed. Using BSR as an example of a special-purpose instruction
which could give performance gains was probably a bad example. The timings I
have access to indicate that BSR can be very slow and it is non-pairable on the
P4.

The timings that I have access to indicate that bsr completes in a
single cycle on the newer Core 2 Intel chips (Family 6, Model 0xF), so
I will disagree with you on that. Whether or not this is a real
"Pentium 4" is open to question, I suppose, but that's starting to
drift off topic.

I just pulled bsr off the top of my head anyways, as I have personally
seen it dramatically decrease bitmap scanning times compared to a C
language bit-scanning loop.

Mark F. Haigh
mf*****@sbcglobal.net

Oct 28 '06 #16

stevenj

Richard Bos wrote:

"Jack \"Abram\" Off" <ja*****@fraud.net> wrote:
You probably could write assembly to do the manipulations that you do in
your multiply procedure "a little less slowly", but I suspect what you
*really* want is something borrowed from higher maths: the FFT multiply.

http://numbers.computation.free.fr/C...ithms/fft.html

That's nice and fast, but it's a floating-point process and introduces
errors. All very well for floating point computations which are already
imprecise, but if you start out with a nice, exact integer array it's
probably not what you want.

You can perform convolutions of integer arrays (i.e.
arbitrary-precision multiplies) *exactly* with a floating-point FFT as
long as each integer is stored in a floating-point number with much
larger precision, and this is precisely what is done in practice by
most arbitrary-precision arithmetic packages.

Steven

Oct 28 '06 #17

Richard Bos

st*****@alum.mit.edu wrote:

Richard Bos wrote:
"Jack \"Abram\" Off" <ja*****@fraud.net> wrote:
You probably could write assembly to do the manipulations that you do in
your multiply procedure "a little less slowly", but I suspect what you
*really* want is something borrowed from higher maths: the FFT multiply.
http://numbers.computation.free.fr/C...ithms/fft.html
That's nice and fast, but it's a floating-point process and introduces
errors. All very well for floating point computations which are already
imprecise, but if you start out with a nice, exact integer array it's
probably not what you want.

You can perform convolutions of integer arrays (i.e.
arbitrary-precision multiplies) *exactly* with a floating-point FFT as
long as each integer is stored in a floating-point number with much
larger precision, and this is precisely what is done in practice by
most arbitrary-precision arithmetic packages.

Myes. But are _much_ larger floating-point numbers really that much
faster than appropriately handled integers? Even if you don't have these
much larger FPs yet, and will have to emulate them? Don't get me wrong,
I see the value of the method, but I'm not so sure of its value to the
OP, who probably will have the same problems handling extended-precision
FPs that he has handling extended-size integers.

Richard

Oct 31 '06 #18

Richard Bos

"Chris Thomasson" <cr*****@comcast.net> wrote:

fermineutron wrote:

I was curious what the
best way is to interface C and asm so that I could write a similar library
in asm and use it in C.

You make your library's ABI follow the C calling convention for the processor
you're targeting. For x86 prams are passed on the stack left-to-right,

Oh, are they? That'll be news to the ISO C Standard, which allows
implementations to pull any trick they like to speed up function calls.

And that's important; if you want an assembly function to speed up the
program, you really don't want to use the slowest way available to call
that function.

Richard

Oct 31 '06 #19

Ancient_Hacker

fermineutron wrote:

A while back I tried to calculate factorials of large numbers using
arrays in C. The array-encoded integer arithmetic that I wrote in C
was very slow; it would take almost a second to multiply two array-encoded
integers.

Whoa! Did you just stuff one decimal digit per array element? Or one
bit? It's hard to make arithmetic that slow.

Try using an array of unsigned ints, keeping up to sizeof( int ) * 8
bits per element. The code should run many many times faster. For
instance if you were storing one decimal digit per element, using
32-bit ints will make it run at least 400 million times faster.
Really. You eliminate having to do a mod and divide per loop, and you
have 2^32/10 times fewer loops.

Or better yet find a working bignum library, there must be dozens of
them for C.
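
A minimal sketch of the limb-based approach, assuming 32-bit limbs stored least significant first; the function names `big_mul_small` and `big_factorial` are made up for this illustration. Since every factor of n! fits in one limb, the factorial only needs a bignum-times-small-integer multiply, with a 64-bit intermediate holding each partial product plus the running carry.

```c
#include <stddef.h>
#include <stdint.h>

/* Multiply the n-limb number in num[] (least significant limb first) by a
   small factor, in place. Returns the new limb count. The caller must make
   sure num[] has room for one extra limb. */
static size_t big_mul_small(uint32_t *num, size_t n, uint32_t factor)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* At most (2^32-1)^2 + (2^32-1), which fits in 64 bits. */
        uint64_t prod = (uint64_t)num[i] * factor + carry;
        num[i] = (uint32_t)prod;  /* low 32 bits stay in this limb */
        carry = prod >> 32;       /* high 32 bits move to the next limb */
    }
    if (carry)
        num[n++] = (uint32_t)carry;
    return n;
}

/* Compute k! into num[], which has room for cap limbs. Returns the limb
   count, or 0 if the buffer might overflow. The check is conservative: it
   insists on one spare limb before every multiply. */
static size_t big_factorial(uint32_t *num, size_t cap, uint32_t k)
{
    if (cap == 0)
        return 0;
    size_t n = 1;
    num[0] = 1;
    for (uint32_t f = 2; f <= k; f++) {
        if (n == cap)
            return 0;
        n = big_mul_small(num, n, f);
    }
    return n;
}
```

Each 32-bit limb carries about 9.6 decimal digits, so the array is correspondingly shorter than a one-digit-per-element layout, and the inner loop needs no division or modulo at all. Converting the result to decimal for printing is a separate step not shown here.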

Oct 31 '06 #20

stevenj

Richard Bos wrote:

Myes. But are _much_ larger floating-point numbers really that much
faster than appropriately handled integers? Even if you don't have these
much larger FPs yet, and will have to emulate them? Don't get me wrong,
I see the value of the method, but I'm not so sure of its value to the
OP, who probably will have the same problems handling extended-precision
FPs that he has handling extended-size integers.

Apparently yes. Look at the source code of many major
arbitrary-precision arithmetic packages and you will see that they use
floating-point FFTs.

You typically just use double precision, which has a 53-bit
significand. The question is, how many integer bits do you pack into
each double-precision element? For an FFT up to size 2^19 =
524288 or so, it is sufficient to use 12 bits per double if I remember
correctly (and this may be too conservative, as FFT roundoff errors
generally grow at most logarithmically with transform size).
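
The bit budget behind those figures can be sketched as simple arithmetic. This is only a necessary condition, not a rigorous roundoff bound: a convolution term sums up to n products of two b-bit values, so the exact result can need about log2(n) + 2b bits, and whatever is left of the 53-bit significand is the margin available to absorb FFT roundoff. The helper name `fft_headroom_bits` is hypothetical.

```c
/* Bits left in a double's 53-bit significand after holding the largest
   exact convolution term: n = 2^log2_n elements of bits_per_element bits
   each can sum to nearly n * 2^(2*bits_per_element). A result below zero
   means the exact value alone does not fit, before any roundoff. */
static int fft_headroom_bits(int log2_n, int bits_per_element)
{
    const int significand_bits = 53;  /* IEEE 754 double precision */
    return significand_bits - (log2_n + 2 * bits_per_element);
}
```

For the figures quoted above, `fft_headroom_bits(19, 12)` gives 10, i.e. about ten bits of slack for accumulated rounding error. How much slack is really needed depends on the FFT implementation, so real packages pick the packing from a proper error analysis or from testing.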

Steven

Oct 31 '06 #21

stevenj

Richard Bos wrote:

Don't get me wrong,
I see the value of the method, but I'm not so sure of its value to the
OP, who probably will have the same problems handling extended-precision
FPs that he has handling extended-size integers.

After going back and reading the original post, by the way, I'm not
sure anything can help the OP.

Remember, the OP was going to use the GMP library but he had problems
compiling it because "some header files were not found," so his
proposed solution was to rewrite GMP in assembly! Only he doesn't know
how to interface C and asm. We are not talking about a reasonable
person here.

Regards,
Steven G. Johnson

Oct 31 '06 #22

Chris Thomasson

"Richard Bos" <rl*@hoekstra-uitgeverij.nl> wrote in message
news:45****************@news.xs4all.nl...

"Chris Thomasson" <cr*****@comcast.net> wrote:

fermineutron wrote:

>I was curious what the
best way is to interface C and asm so that I could write a similar library
in asm and use it in C.

You make your library's ABI follow the C calling convention for the processor
you're targeting. For x86 prams are passed on the stack left-to-right,

Oh, are they?

Most of the time, they are. So, IMHO, I would define an ABI that does it
this way first...

If I designed my ABI to follow some other calling convention, then you, and
most other programmers could not use my library...

You can use it now because it sticks to the normal C calling convention for
most C compilers on the architectures I am targeting; why would I want to do
it another way?

Nov 2 '06 #23

CBFalconer

Chris Thomasson wrote:

"Richard Bos" <rl*@hoekstra-uitgeverij.nl> wrote in message
"Chris Thomasson" <cr*****@comcast.net> wrote:
fermineutron wrote:

I was curious what the best way is to interface C and asm so
that I could write a similar library in asm and use it in C.

You make your library's ABI follow the C calling convention for
the processor you're targeting. For x86 prams are passed on the
stack left-to-right,

Oh, are they?

Most of the time, they are. So, IMHO, I would define an ABI that
does it this way first...

If I designed my ABI to follow some other calling convention,
then you, and most other programmers could not use my library...

You can use it now because it sticks to the normal C calling
convention for most C compilers on the architectures I am
targeting; why would I want to do it another way?

Sounds as if your 'libraries' are available only in binary form.
This is the antithesis of portability. The calling conventions are
the province of the compiler, not the library. If you must hide
your source, rather than depending on copyright, consider using
cloaked source.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Nov 2 '06 #24

Jordan Abel

2006-11-02 <45***************@yahoo.com>,
CBFalconer wrote:

Sounds as if your 'libraries' are available only in binary form.
This is the antithesis of portability. The calling conventions are
the province of the compiler, not the library. If you must hide
your source, rather than depending on copyright, consider using
cloaked source.

I believe we were talking about libraries written and maintained in
a language other than C, though, namely x86 assembly language. In that
case, you can't rely on the C compiler to recompile it to different
calling conventions even if you DO have the source.

Nov 2 '06 #25

Chris Thomasson

"Jordan Abel" <ra****@random.yi.org> wrote in message
news:sl*******************@rlaptop.random.yi.org...

2006-11-02 <45***************@yahoo.com>,
CBFalconer wrote:

[...]

I believe we were talking about libraries written and maintained in
a language other than C, though, namely x86 assembly language.

Yup.

In that case, you can't rely on the C compiler to recompile it to different
calling conventions even if you DO have the source.

Indeed...

Nov 6 '06 #26

CBFalconer

Chris Thomasson wrote:

"Jordan Abel" <ra****@random.yi.org> wrote in message
news:sl*******************@rlaptop.random.yi.org...
2006-11-02 <45***************@yahoo.com>,
CBFalconer wrote:

[...]

I believe we were talking about libraries written and maintained in
a language other than C, though, namely x86 assembly language.

Yup.

In that case, you can't rely on the C compiler to recompile it to different
calling conventions even if you DO have the source.

Indeed...

I wrote nothing of what you quoted. Please take more care with
attributions.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Nov 6 '06 #27

Chris Thomasson

I wrote nothing of what you quoted. Please take more care with
attributions.

Sorry about that.

Nov 6 '06 #28

Keith Thompson

"Chris Thomasson" <cr*****@comcast.net> writes:

>I wrote nothing of what you quoted. Please take more care with
attributions.

Sorry about that.

Taking more care with attributions doesn't mean dropping them altogether.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 6 '06 #29

Jordan Abel

2006-11-06 <ln************@nuthaus.mib.org>,
Keith Thompson wrote:

>
Taking more care with attributions doesn't mean dropping them altogether.

Though he wouldn't have been the first on this group to have reached
that conclusion.

Nov 6 '06 #30

Keith Thompson

Jordan Abel <ra****@random.yi.org> writes:

2006-11-06 <ln************@nuthaus.mib.org>,
Keith Thompson wrote:
>>
Taking more care with attributions doesn't mean dropping them altogether.

Though he wouldn't have been the first on this group to have reached
that conclusion.

Unfortunately, that's true.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 6 '06 #31

av

On Tue, 31 Oct 2006 13:06:36 GMT, Richard Bos wrote:

"Chris Thomasson" <cr*****@comcast.net> wrote:

fermineutron wrote:

>I was curious what the
best way is to interface C and asm so that I could write a similar library
in asm and use it in C.

You make your library's ABI follow the C calling convention for the processor
you're targeting. For x86 prams are passed on the stack left-to-right,

Oh, are they? That'll be news to the ISO C Standard, which allows
implementations to pull any trick they like to speed up function calls.

And that's important; if you want an assembly function to speed up the
program, you really don't want to use the slowest way available to call
that function.

As I see it, it is wrong not to have a standard way to call a function
in assembly.

Nov 11 '06 #32

Harald van Dĳk

av wrote:

On Tue, 31 Oct 2006 13:06:36 GMT, Richard Bos wrote:

"Chris Thomasson" <cr*****@comcast.net> wrote:

fermineutron wrote:

I was curious what the
best way is to interface C and asm so that I could write a similar library
in asm and use it in C.

You make your library's ABI follow the C calling convention for the processor
you're targeting. For x86 prams are passed on the stack left-to-right,

Oh, are they? That'll be news to the ISO C Standard, which allows
implementations to pull any trick they like to speed up function calls.

And that's important; if you want an assembly function to speed up the
program, you really don't want to use the slowest way available to call
that function.

As I see it, it is wrong not to have a standard way to call a function
in assembly.

Why? What drawbacks are there to using nonstandard ways to call
functions in assembly, when those functions themselves are already not
covered by the C standard?

Nov 11 '06 #33
