Calling a function when the number of parameters isn't known till runtime

John Friedland

[apologies if anyone thinks this is off-topic. I originally
cross-posted to comp.lang.asm.x86, but I'm now re-trying just c.l.c
after discovering that c.l.a.x86 is moderated]

My problem: I need to call (from C code) an arbitrary C library
function, but I don't know until runtime what the function name is,
how many parameters are required, and what the parameters are. I can
use dlopen/whatever to convert the function name into a pointer to
that function, but actually calling it, with the right number of
parameters, isn't easy.

As far as I can see, there are only two solutions:

1) This one is portable. If you know in advance that no function will
require more than, say, 2 parameters, then you can do something simple
like:

switch(nparams) {
case 0: (*user_func)(); break;
case 1: (*user_func)(P1); break;
case 2: (*user_func)(P1, P2); break;
}

2) If you don't know the maximum number of parameters, then there
seems to be no way to do this, short of writing assembler code. I
guess this would look something like this:

- the C code pushes all the parameters onto a local stack
- the C code calls an assembler routine, passing in the function
address, the local stack address, and maybe the number of parameters
- the assembler routine sets up a stack frame and does an indirect
call to the user's C code
- the assembler routine clears up the stack frame and returns

Any thoughts? Have I missed anything/does this make sense? If I have
to go for (2), I'll start with Linux/gcc/x86 and move on to other
systems if necessary. Does anyone know of any web resources that could
help me do this? I've got no idea how to do this at the moment.

Thanks -

John

Jul 11 '06 #1

Ancient_Hacker

John Friedland wrote:

[apologies if anyone thinks this is off-topic. I originally
cross-posted to comp.lang.asm.x86, but I'm now re-trying just c.l.c
after discovering that c.l.a.x86 is moderated]

My problem: I need to call (from C code) an arbitrary C library
function, but I don't know until runtime what the function name is,
how many parameters are required, and what the parameters are. I can
use dlopen/whatever to convert the function name into a pointer to
that function, but actually calling it, with the right number of
parameters, isn't easy.

As far as I can see, there are only two solutions:

1) This one is portable. If you know in advance that no function will
require more than, say, 2 parameters, then you can do something simple
like:

switch(nparams) {
case 0: (*user_func)(); break;
case 1: (*user_func)(P1); break;
case 2: (*user_func)(P1, P2); break;
}

Not good enough, as the parameters might be of different sizes.

You're in luck if you're calling Windows Win32 API functions, as almost
all (all?) parameters are 32 bits. But in general they won't be, so
something like this might work:

#define pushbyte(b) __asm{ push byte ptr b }
#define pushword(b) __asm{ push word ptr b }

#define popoff(c) __asm{ add sp,c }

void CallFunc( FuncPtr TheFunctionAddress, char * ParamInfo, unsigned
long int Params[] )
{
int i; int Len = 0; /* total bytes pushed, accumulated across all params */

for( i = 0; i < strlen( ParamInfo ); i++ ) {
switch( ParamInfo[i] ) {
case 'b': pushbyte( Params[i] ); Len += 1; break;
case 'w': pushword( Params[i] ); Len += 2; break;
case 'l': pushlong( Params[i] ); Len += 4; break;
case 'a': pushaddr( Params[i] ); Len += 4; break;
default: printf("bad param descriptor: !! %c", ParamInfo[i] );
}
}
(*TheFunctionAddress)(); /* call once, after every param has been pushed */
popoff( Len );
}

You'll probably have to step thru this code several times to watch the
pushing and popping action until you get it all just right.

Jul 11 '06 #2

Eric Sosman

John Friedland wrote On 07/11/06 10:54,:

[apologies if anyone thinks this is off-topic. I originally
cross-posted to comp.lang.asm.x86, but I'm now re-trying just c.l.c
after discovering that c.l.a.x86 is moderated]

My problem: I need to call (from C code) an arbitrary C library
function, but I don't know until runtime what the function name is,
how many parameters are required, and what the parameters are. I can
use dlopen/whatever to convert the function name into a pointer to
that function, but actually calling it, with the right number of
parameters, isn't easy.

A function pointer makes it easy to handle the unknown
name, but handling the unknown "signature" is another matter.
A function call in C is static in the sense that the code to
gather the arguments and retrieve the returned value is built
at compile time and cannot be changed at run time. Some
functions can accept variable numbers and even variable types
of arguments, but any particular call to a function has only
a fixed number of arguments of known types.

As far as I can see, there are only two solutions:

1) This one is portable. If you know in advance that no function will
require more than, say, 2 parameters, then you can do something simple
like:

switch(nparams) {
case 0: (*user_func)(); break;
case 1: (*user_func)(P1); break;
case 2: (*user_func)(P1, P2); break;
}

This will work (after a small correction). Note, though,
that the argument types must be known; you cannot extend this
approach to "anonymous" arguments. You could use a fancier
dispatching method than simply the argument count, but you'll
still need to write different calls for different signatures.

The small correction is that you need to convert the
function pointer to match the actual type of the function
when you call it, and a function's type includes information
about its argument list. You'd need something like

void (*user_func)(void) = ...;
...
switch (nparams) {
case 0: (*user_func)(); break;
case 1: (*(void (*)(int))user_func)(P1); break;
case 2: (*(void (*)(int,int))user_func)(P1,P2); break;
...

This is a case where a few typedefs are a distinct aid
to readability. Using them (and dropping the unnecessary
albeit harmless dereference operator), you'd have

typedef void (*Fptr0)(void);
typedef void (*Fptr1)(int);
typedef void (*Fptr2)(int, int);
Fptr0 user_func = ...;
...
switch (nparams) {
case 0: user_func(); break;
case 1: ((Fptr1)user_func)(P1); break;
case 2: ((Fptr2)user_func)(P1, P2); break;
...

2) If you don't know the maximum number of parameters, then there
seems to be no way to do this, short of writing assembler code. I
guess this would look something like this:

- the C code pushes all the parameters onto a local stack
- the C code calls an assembler routine, passing in the function
address, the local stack address, and maybe the number of parameters
- the assembler routine sets up a stack frame and does an indirect
call to the user's C code
- the assembler routine clears up the stack frame and returns

... for suitable values of "something like." The mechanisms
for passing arguments to functions are platform-specific. Some
platforms use a stack that's popped by the caller, others use a
stack popped by the callee. Some pass their arguments in CPU
registers instead of on a stack, to the extent possible. Some
use different strategies for different argument types: integers
and pointers in the A0,A1,... registers and floating-point
values in F0,F1,... Some use different strategies when calling
variadic functions than when calling functions with fixed-length
parameter lists. Some call struct-valued functions differently
than int-valued functions. In short, this approach requires an
intimate knowledge of the details of subroutine linkage on the
various platforms you want to support.

Any thoughts? Have I missed anything/does this make sense? If I have
to go for (2), I'll start with Linux/gcc/x86 and move on to other
systems if necessary. Does anyone know of any web resources that could
help me do this? I've got no idea how to do this at the moment.

Approach #1 is practical, if you can live with a predetermined
set of "signatures" and the set is not too large. Approach #2 is
flexible, but requires a lot of work that will need to be re-done
each time the code encounters a new machine (or sometimes, even
a new compiler). Personally, I'd stick with #1 and resort to #2
only in the very direst circumstances; it will turn out to be far
more expensive.

In fact, before resorting to #2 I'd take a step back and
ponder for a while. You're looking for a way to build a function
call at run-time, something that really can't be done in C (even
#1 is just choosing from among a bunch of pre-built calls). You
are presumably doing this not for the sheer enjoyment, but in
pursuit of a solution to a wider problem. Perhaps it's time to
re-examine the wider problem and see whether there's a more C-
friendly way to approach it, or even whether it ought to be
tackled with a different language altogether.

--
Er*********@sun.com

Jul 11 '06 #3

John Friedland

On 11 Jul 2006 08:33:23 -0700, "Ancient_Hacker" <gr**@comcast.net>
wrote:

>switch(nparams) {
case 0: (*user_func)(); break;
case 1: (*user_func)(P1); break;
case 2: (*user_func)(P1, P2); break;
}

Not good enough, as the parameters might be of different sizes.

Forgot to say - I know that all Pn are the same integer type. I don't
know what 'user_func' does, but I can specify the integer type for Pn.
But, even so, I don't like this solution.

>something like this might work:

#define pushbyte(b) __asm{ push byte ptr b }
#define pushword(b) __asm{ push word ptr b }

#define popoff(c) __asm{ add sp,c }

void CallFunc( FuncPtr TheFunctionAddress, char * ParamInfo, unsigned
long int Params[] )
{
int i; int Len = 0; /* total bytes pushed, accumulated across all params */

for( i = 0; i < strlen( ParamInfo ); i++ ) {
switch( ParamInfo[i] ) {
case 'b': pushbyte( Params[i] ); Len += 1; break;
case 'w': pushword( Params[i] ); Len += 2; break;
case 'l': pushlong( Params[i] ); Len += 4; break;
case 'a': pushaddr( Params[i] ); Len += 4; break;
default: printf("bad param descriptor: !! %c", ParamInfo[i] );
}
}
(*TheFunctionAddress)(); /* call once, after every param has been pushed */
popoff( Len );
}

Wow. Have you ever tried this? I'll give it a go with gcc -S and see
if it makes sense.

Thanks -

John

Jul 11 '06 #4

John Friedland

On Tue, 11 Jul 2006 11:51:35 -0400, Eric Sosman <Er*********@sun.com>
wrote:

The small correction is that you need to convert the
function pointer to match the actual type of the function
when you call it, and a function's type includes information
about its argument list. You'd need something like

void (*user_func)(void) = ...;
...
switch (nparams) {
case 0: (*user_func)(); break;
case 1: (*(void (*)(int))user_func)(P1); break;
case 2: (*(void (*)(int,int))user_func)(P1,P2); break;
...

I was hoping that declaring user_func as 'void (*user_func)()' would
get around this, but I have to admit that I don't fully understand the
implications of an empty parameter list in a declaration.

>Personally, I'd stick with #1 and resort to #2
only in the very direst circumstances; it will turn out to be far
more expensive.

Yes, I agree that #2 is a nightmare, but my option is to tell users
that they can only write these functions with a maximum of N
parameters. What do I make N? Most of the time, it's probably going to
be 3 or 4 max, but then someone will come along and say that it
doesn't work with 500 or 1000 parameters, and I'll have to add more of
the calls and distribute new code. Easier or harder than writing a bit
of assembler for each new supported platform?

>Perhaps it's time to
re-examine the wider problem and see whether there's a more C-
friendly way to approach it, or even whether it ought to be
tackled with a different language altogether.

I'm surprised that it's not more common. I have what's essentially a
scripting language, which is implemented in C and C++. The user can
call 'external' functions (which must be written in C), so the user's
(script) names the library and the function, and provides the
parameters. I process the script at runtime and I have to call the
user's C library code. The lack of this sort of dynamic calling seems
to be a significant oversight in C. While googling I did find that
this was apparently considered while the language was being developed,
but it seems that it got too complicated and the overhead was too
much.

John

Jul 11 '06 #5

Ancient_Hacker

another way, guaranteed to work, if you can accept a slight delay each
time a new previously unseen function has to be called:

WriteOutGlueCode( "/tmp/t.c" );
system( "cc -shared -fPIC -o /tmp/t.so /tmp/t.c" );
h = dlopen( "/tmp/t.so", RTLD_NOW );

... where WriteOutGlueCode is left as an exercise for the reader. And you
pass the parameters in and out through a struct or (ugh) global
variables. And you insert the function name in the file names, of
course, if you need more than one function at a time.

Jul 11 '06 #6

Flash Gordon

John Friedland wrote:

[apologies if anyone thinks this is off-topic. I originally
cross-posted to comp.lang.asm.x86, but I'm now re-trying just c.l.c
after discovering that c.l.a.x86 is moderated]

Even moderated groups give responses eventually ;-)

However, cross posting between groups covering different languages is
rarely a good thing.

My problem: I need to call (from C code) an arbitrary C library
function, but I don't know until runtime what the function name is,
how many parameters are required, and what the parameters are.

If it could be literally any function, then standard C can't handle it.
With standard C you could do a look up in to an array of function
pointers, or a switch to call the appropriate function etc, but all
these methods require you to enumerate the possibilities in your C
source files.

I can
use dlopen/whatever to convert the function name into a pointer to
that function,

Well, if this is the *nix dlopen function, asking for advice on a *nix
group such as comp.unix.programmer might be a good idea.

but actually calling it, with the right number of
parameters, isn't easy.

In fact, standard C provides no mechanism for doing this.

As far as I can see, there are only two solutions:

1) This one is portable. If you know in advance that no function will
require more than, say, 2 parameters, then you can do something simple
like:

switch(nparams) {
case 0: (*user_func)(); break;
case 1: (*user_func)(P1); break;
case 2: (*user_func)(P1, P2); break;
}

Yes, that is what you would have to do in standard C.

2) If you don't know the maximum number of parameters, then there
seems to be no way to do this, short of writing assembler code. I
guess this would look something like this:

- the C code pushes all the parameters onto a local stack
- the C code calls an assembler routine, passing in the function
address, the local stack address, and maybe the number of parameters
- the assembler routine sets up a stack frame and does an indirect
call to the user's C code
- the assembler routine clears up the stack frame and returns

Any thoughts? Have I missed anything/does this make sense? If I have
to go for (2), I'll start with Linux/gcc/x86 and move on to other
systems if necessary. Does anyone know of any web resources that could
help me do this? I've got no idea how to do this at the moment.

I think you will have to use techniques beyond what standard C has to
offer. Try comp.unix.programmer and the Linux and/or gcc groups to see
if they can provide any non-C-standard methods, if not either rethink
your problem or try the assembler approach. We can't help here since we
only deal with standard C and this is not possible in standard C (I'm
not complaining about the question being asked, since asking if
something can be done in standard C is not unreasonable for this group).
--
Flash Gordon, living in interesting times.
Web site - http://home.flash-gordon.me.uk/
comp.lang.c posting guidelines and intro:
http://clc-wiki.net/wiki/Intro_to_clc

Jul 11 '06 #7

John Friedland

On Tue, 11 Jul 2006 17:29:02 +0100, John Friedland <jd*@nospam.org>
wrote:

>I'm surprised that it's not more common. I have what's essentially a
scripting language, which is implemented in C and C++. The user can
call 'external' functions (which must be written in C), so the user's
(script) names the library and the function, and provides the
parameters. I process the script at runtime and I have to call the
user's C library code.

After writing this (that's the great thing about writing down your
problems), it occurred to me that lots of people must have had exactly
the same problem. So, I looked up the Python/ctypes sources to see how
they do it. It turns out that they use Red Hat's libffi; from the
README:

>Some programs may not know at the time of compilation what arguments
are to be passed to a function. For instance, an interpreter may be
told at run-time about the number and types of arguments used to call
a given function. Libffi can be used in such programs to provide a
bridge from the interpreter program to compiled code.

libffi uses assembler code, and supports a number of different
platforms. It's currently part of gcc, and the best place to get
stand-alone sources may be via Python. Links, for anyone who comes
this way in the future:

http://sourceforge.net/project/showf...group_id=71702
http://sourceware.org/libffi/

John

Jul 11 '06 #8

Eric Sosman

John Friedland wrote On 07/11/06 12:29,:

On Tue, 11 Jul 2006 11:51:35 -0400, Eric Sosman <Er*********@sun.com>
wrote:

> The small correction is that you need to convert the
function pointer to match the actual type of the function
when you call it, and a function's type includes information
about its argument list. You'd need something like

void (*user_func)(void) = ...;
...
switch (nparams) {
case 0: (*user_func)(); break;
case 1: (*(void (*)(int))user_func)(P1); break;
case 2: (*(void (*)(int,int))user_func)(P1,P2); break;
...

I was hoping that declaring user_func as 'void (*user_func)()' would
get around this, but I have to admit that I don't fully understand the
implications of an empty parameter list in a declaration.

The empty parameter list means "This function takes a
fixed number of arguments of fixed types, but we don't know
how many or of what kinds." That description (as far as it
goes) covers each of your usages individually, but I'm not
at all clear whether it covers them all collectively. A
true language lawyer might be able to settle the issue, but
instead of waiting for the pettifogging to settle down I'd
just opt for the squeaky-clean approach. It might not be
strictly necessary, but it's certainly prudent.

>>Personally, I'd stick with #1 and resort to #2
only in the very direst circumstances; it will turn out to be far
more expensive.

Yes, I agree that #2 is a nightmare, but my option is to tell users
that they can only write these functions with a maximum of N
parameters. What do I make N? Most of the time, it's probably going to
be 3 or 4 max, but then someone will come along and say that it
doesn't work with 500 or 1000 parameters, and I'll have to add more of
the calls and distribute new code. Easier or harder than writing a bit
of assembler for each new supported platform?

Hold your horses! Do I understand that you get to
prescribe the allowable function signatures? If so, your
problem simply vanishes: Tell the writers of the functions
that they'll receive one argument, namely, a pointer to
the start of an array containing the "real" arguments:

void (*user_func)(int *) = ...;
...
int *args = malloc(nparams * sizeof *args);
if (args == NULL) ...
for (i = 0; i < nparams; ++i)
args[i] = ...;
user_func (args);
free (args);

If you like, you can "decorate" this pattern by passing
an argument count, a few `const' qualifiers, and/or some
sort of indications of different argument types. But as
long as you get to dictate the function signature, you might
as well make it easy on yourself ...

>>Perhaps it's time to
re-examine the wider problem and see whether there's a more C-
friendly way to approach it, or even whether it ought to be
tackled with a different language altogether.

I'm surprised that it's not more common. I have what's essentially a
scripting language, which is implemented in C and C++. The user can
call 'external' functions (which must be written in C), so the user's
(script) names the library and the function, and provides the
parameters. I process the script at runtime and I have to call the
user's C library code. The lack of this sort of dynamic calling seems
to be a significant oversight in C. While googling I did find that
this was apparently considered while the language was being developed,
but it seems that it got too complicated and the overhead was too
much.

No one language can be alle Dinge to tout le monde.

--
Er*********@sun.com

Jul 11 '06 #9

Roberto Waltman

John Friedland wrote:

>...I have what's essentially a
scripting language, which is implemented in C and C++. The user can
call 'external' functions (which must be written in C), so the user's
(script) names the library and the function, and provides the
parameters. I process the script at runtime and I have to call the
user's C library code. The lack of this sort of dynamic calling seems
to be a significant oversight in C.

Can you use a calling convention other than the trivial one shown so
far?

If so, you could call all functions with a single parameter: an array
of tagged structs, or a parameter count and a tagged struct array. (Or
a TLV array, or a TTLV array.)

You could supply a few utility functions to help the called functions
decode and validate the params, to circumvent the "this is too
complicated" complaints that are sure to come.

Some quick ad-lib coding:
#define MAX_PARAMS ???

enum tags
{
IS_NOTHING,
IS_A_CHAR,
IS_A_INT,
...
IS_A_POINT3D,
...
IS_A_HEFTY_THING
...
};

union param_val
{
char char_val;
int int_val;
...
struct point3d point3d_val;
...
struct hefty_thing *hefty_ptr;
...
};

struct tagged_param
{
enum tags tag;
union param_val val;
};

struct tagged_param param_array[MAX_PARAMS];

void some_func()
{
param_array[0].tag = IS_A_INT;
param_array[0].val.int_val = 5;

param_array[1].tag = IS_A_POINT3D;
param_array[1].val... = ...

...

param_array[5].tag = ... ;
param_array[5].val... = ... ;

param_array[6].tag = IS_NOTHING; /* belt and suspenders */

user_function_with_5_params(param_array);
}
Roberto Waltman

[ Please reply to the group,
return address is invalid ]

Jul 11 '06 #10

Roberto Waltman

Roberto Waltman <us****@rwaltman.net> wrote:

>Can you use a calling convention other than the trivial one shown so
far?

I see that Eric Sosman just beat me on this approach. ;)

What I posted is very simplistic, search for TTLV (Tag (and, not or,)
Type, Length, Value) to make this mechanism really robust.
Roberto Waltman

[ Please reply to the group,
return address is invalid ]

Jul 11 '06 #11

Fred Kleinschmidt

"John Friedland" <jd*@nospam.org> wrote in message
news:ee********************************@4ax.com...

On Tue, 11 Jul 2006 11:51:35 -0400, Eric Sosman <Er*********@sun.com>
wrote:

> The small correction is that you need to convert the
function pointer to match the actual type of the function
when you call it, and a function's type includes information
about its argument list. You'd need something like

void (*user_func)(void) = ...;
...
switch (nparams) {
case 0: (*user_func)(); break;
case 1: (*(void (*)(int))user_func)(P1); break;
case 2: (*(void (*)(int,int))user_func)(P1,P2); break;
...

I was hoping that declaring user_func as 'void (*user_func)()' would
get around this, but I have to admit that I don't fully understand the
implications of an empty parameter list in a declaration.

>>Personally, I'd stick with #1 and resort to #2
only in the very direst circumstances; it will turn out to be far
more expensive.

Yes, I agree that #2 is a nightmare, but my option is to tell users
that they can only write these functions with a maximum of N
parameters. What do I make N? Most of the time, it's probably going to
be 3 or 4 max, but then someone will come along and say that it
doesn't work with 500 or 1000 parameters, and I'll have to add more of
the calls and distribute new code. Easier or harder than writing a bit
of assembler for each new supported platform?

The above paragraph implies that you have some control
over how the function looks. That is, the library you are interfacing
with is not some unknown third-party library, but is in fact written by
your users. If you can specify to them that they must write their
function with at most N parameters, then you could also specify
that they must write it with exactly two parameters: an array of
integers, and the number of integers in the array.

>
>>Perhaps it's time to
re-examine the wider problem and see whether there's a more C-
friendly way to approach it, or even whether it ought to be
tackled with a different language altogether.

I'm surprised that it's not more common. I have what's essentially a
scripting language, which is implemented in C and C++. The user can
call 'external' functions (which must be written in C), so the user's
(script) names the library and the function, and provides the
parameters. I process the script at runtime and I have to call the
user's C library code. The lack of this sort of dynamic calling seems
to be a significant oversight in C. While googling I did find that
this was apparently considered while the language was being developed,
but it seems that it got too complicated and the overhead was too
much.

John

--
Fred L. Kleinschmidt
Boeing Associate Technical Fellow
Technical Architect, Software Reuse Project

Jul 11 '06 #12

John Friedland

On Tue, 11 Jul 2006 13:49:53 -0400, Eric Sosman <Er*********@sun.com>
wrote:

Hold your horses! Do I understand that you get to
prescribe the allowable function signatures? If so, your
problem simply vanishes: Tell the writers of the functions
that they'll receive one argument, namely, a pointer to
the start of an array containing the "real" arguments:

Hmmm. Yes, I think you're absolutely right. As long as the call looks
good/natural in the script source, what does it matter if the
user/implementor has to extract the args from an array? I don't think
it matters at all.

I could also mix this with the hard-wired approach for low parameter
numbers. Maybe all foreign function calls with, say, 8 or less
parameters would end up with the user as a normal list of function
parameters, and a call with more than 8 parameters would end up as a
call with an array.

Thanks -

John

Jul 11 '06 #13

websnarf

John Friedland wrote:

[apologies if anyone thinks this is off-topic. I originally
cross-posted to comp.lang.asm.x86, but I'm now re-trying just c.l.c
after discovering that c.l.a.x86 is moderated]

My problem: I need to call (from C code) an arbitrary C library
function, but I don't know until runtime what the function name is,
how many parameters are required, and what the parameters are. I can
use dlopen/whatever to convert the function name into a pointer to
that function, but actually calling it, with the right number of
parameters, isn't easy.

As far as I can see, there are only two solutions:

1) This one is portable. If you know in advance that no function will
require more than, say, 2 parameters, then you can do something simple
like:

switch(nparams) {
case 0: (*user_func)(); break;
case 1: (*user_func)(P1); break;
case 2: (*user_func)(P1, P2); break;
}

I don't understand the comments by others in this thread. This is
obviously correct *AS WRITTEN* so long as you declare user_func as
follows:

void (* user_func) (...);

That is to say, the number of parameters is determined at runtime.
The problem is that you have to decide on a convention for determining
the number of parameters the function has received. If you want to
dictate this by the parameters themselves (typical) then you are going
to have at least *1* parameter, meaning case 0 above should not be
possible. Otherwise you could pass the number of parameters in a
static or something, but that probably defeats the purpose.

2) If you don't know the maximum number of parameters, then there
seems to be no way to do this, short of writing assembler code.

Yes, but at compile time you in a meta-sense do know the maximum number
of parameters by simply examining all the call-sites to see this for
yourself. How or why exactly you need to know this with more
flexibility is not exactly clear to me.

Is it your intention, for example, to somehow magically pass all the
elements of an array as parameters to a function, where the array
length is determined at runtime? Obviously this would require lots of
platform-specific code at the call-site, but would also call into
question what it is that you are doing -- why not just pass in the
array with its length as two parameters?

[...] I guess this would look something like this:

- the C code pushes all the parameters onto a local stack

This is typically very complicated, especially on x86 calling
conventions. The first few parameters are typically thrown into
registers, except for floating point parameters which get pushed into
the FPU stack, and then overflow goes into the hardware stack. For 32
bit x86 systems, passing in a 64-bit integer might split it over two
registers (but I don't recall for sure) and I don't remember the
convention for passing in structs that are smaller than 64 bits.

- the C code calls an assembler routine, passing in the function
address, the local stack address, and maybe the number of parameters

The number of parameters is not explicitly passed in. The amount of
stack consumed might be, however.

- the assembler routine sets up a stack frame and does an indirect
call to the user's C code
- the assembler routine clears up the stack frame and returns

Any thoughts? Have I missed anything/does this make sense? If I have
to go for (2), I'll start with Linux/gcc/x86 and move on to other
systems if necessary. Does anyone know of any web resources that could
help me do this? I've got no idea how to do this at the moment.

I need a clearer picture of what you are trying to do to know how to
help you. Unfortunately, in C, knowing what your parameters are either
has to be explicitly described, or is implicitly, but
deterministically described at run time, in some run-time specific
manner. On x86, matters have been highly complicated because of
passing parameters into registers and a separate floating point stack.
So you can't just isolate the parameter data from inside the function.
And all this ignores the difference between a static and an extern
function (though I think that calling through a pointer forces the
function to have extern calling convention characteristics).

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Jul 11 '06 #14

Ben Pfaff

we******@gmail.com writes:

This is obviously correct *AS WRITTEN* so long as you declare
user_func as follows:

void (* user_func) (...);

That's not a valid declaration. It's a syntax error.
--
"C has its problems, but a language designed from scratch would have some too,
and we know C's problems."
--Bjarne Stroustrup

Jul 11 '06 #15

John Friedland

On 11 Jul 2006 14:10:21 -0700, we******@gmail.com wrote:

>2) If you don't know the maximum number of parameters, then there
seems to be no way to do this, short of writing assembler code.

Yes, but at compile time you in a meta-sense do know the maximum number
of parameters by simply examining all the call-sites to see this for
yourself. How or why exactly you need to know this with more
flexibility is not exactly clear to me.

This program is a compiler/interpreter for a scripting language. The
program is compiled and supplied to an end-user. At runtime, the
end-user runs the compiler, supplying a script as input. The script
names a foreign (C) function, and some parameters, which must be
executed by the runtime environment. This is fairly common - tcl,
Python, and so on, do this.

So, there is never a 'compilation' which is aware of the function to
be called, or the number of parameters to it; it all has to be handled
at runtime. C can't do this, except via the various hacks elsewhere in
the thread. There is the va_list mechanism, of course, but even this
doesn't help - it can't be used 'dynamically', and so is just as
'static' as an ordinary function call.

John

Jul 11 '06 #16

Peter Nilsson

Ben Pfaff wrote:

we******@gmail.com writes:
This is obviously correct *AS WRITTEN* so long as you declare
user_func as follows:

void (* user_func) (...);

That's not a valid declaration. It's a syntax error.

C99 supports it for macro definitions, but not for function
declarations. It's valid C++, which is probably why many bundled C
compilers support it as an extension.

But since 'trailing' variadic function arguments are subject to the
default argument promotions, the OP may as well use the non-prototype
form...

void (*user_func)();

...noting the same caveats about promotable arguments, and that this
won't work when calling variadic functions.
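
A minimal sketch of the switch-based dispatch, for illustration only.
Instead of relying on the non-prototype form (which C23 no longer
treats as "unspecified arguments"), it stores the callee in a generic
function pointer and casts it back to an explicit prototype for each
arity. The names generic_fn and call_with_longs are hypothetical, and
the sketch assumes every argument and the return value is a long.

```c
#include <assert.h>
#include <stddef.h>

/* Generic function-pointer type used only for storage; it must be cast
   back to the callee's real type before the call is made. */
typedef void (*generic_fn)(void);

/* Hypothetical dispatcher: calls fn with nargs long arguments taken
   from args[]. Returns 0 on success, -1 if the arity is unsupported. */
static int call_with_longs(generic_fn fn, size_t nargs, const long *args,
                           long *result)
{
    assert(fn != NULL && result != NULL);
    switch (nargs) {
    case 0: *result = ((long (*)(void))fn)();                       return 0;
    case 1: *result = ((long (*)(long))fn)(args[0]);                return 0;
    case 2: *result = ((long (*)(long, long))fn)(args[0], args[1]); return 0;
    default: return -1;   /* more cases would be needed for higher arities */
    }
}

/* Example callees whose real types match the casts above. */
static long forty_two(void)      { return 42; }
static long negate(long a)       { return -a; }
static long add(long a, long b)  { return a + b; }
```

The caller is still responsible for picking the case that matches the
callee's real signature; calling through a mismatched type is undefined
behaviour.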

--
Peter

Jul 11 '06 #17

Adam Warner

On Tue, 11 Jul 2006 21:38:23 +0100, John Friedland wrote:

On Tue, 11 Jul 2006 13:49:53 -0400, Eric Sosman <Er*********@sun.com>
wrote:

> Hold your horses! Do I understand that you get to
prescribe the allowable function signatures? If so, your
problem simply vanishes: Tell the writers of the functions
that they'll receive one argument, namely, a pointer to
the start of an array containing the "real" arguments:

Hmmm. Yes, I think you're absolutely right. As long as the call looks
good/natural in the script source, what does it matter if the
user/implementor has to extract the args from an array? I don't think
it matters at all.

It only matters if the approach is too slow. This calling convention will
almost certainly force the C compiler to pass all arguments via memory.
This is likely to be significantly slower upon register-rich architectures.

This is a core issue to anyone implementing a high performance language
upon a C-based environment.
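
For reference, the array convention under discussion might look roughly
like the sketch below. This is an assumption-laden illustration, not
Eric's code: the names script_fn, sum_all and call_foreign are
hypothetical, the arguments are assumed to be longs, and an explicit
count is passed alongside the array pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Every foreign function shares this one signature: the number of
   "real" arguments plus a pointer to the start of the array that
   holds them. */
typedef long (*script_fn)(size_t argc, const long *argv);

/* Example foreign function: extracts its arguments from the array. */
static long sum_all(size_t argc, const long *argv)
{
    long total = 0;
    for (size_t i = 0; i < argc; i++)
        total += argv[i];
    return total;
}

/* The runtime needs only one call site, whatever the argument count. */
static long call_foreign(script_fn fn, size_t argc, const long *argv)
{
    assert(fn != NULL);
    return fn(argc, argv);
}
```

Because every argument sits in the array, the callee reads each one
from memory, which is the cost described above.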

I could also mix this with the hard-wired approach for low parameter
numbers. Maybe all foreign function calls with, say, 8 or less
parameters would end up with the user as a normal list of function
parameters, and a call with more than 8 parameters would end up as a
call with an array.

Another portable way: Choose a fixed argument convention that is quite
efficient upon your architecture of choice. I would choose to optimise for
the Linux x86-64 ABI <http://www.x86-64.org/documentation/abi.pdf> because
its C calling convention is well designed (passing quite a few arguments
in registers) and the architecture is growing in popularity.

Figure 3.4 on page 20 (also see Appendix A - Linux Conventions) for the
x86-64 ABI shows that six integer arguments are passed in registers. As a
first approximation you might choose to benchmark this fixed six argument
calling convention:

arg 1: The number of arguments being passed (0 or more)
arg 2: Your 1st argument (if relevant)
arg 3: Your 2nd argument (if relevant)
arg 4: Your 3rd argument (if relevant)
arg 5: Your 4th argument (if relevant)
arg 6: A pointer to an array for 5+ arguments (if relevant)

This should pass up to four of your original input values in registers
upon x86-64 Linux. The downside is that you're always passing six
arguments (and other architectures may stack allocate them). You might
choose a lower fixed number of arguments if passing up to four input
values is rare.

Another caveat: If you're passing pointers via this fixed argument
convention the integer arguments will need to be of sufficient size to
cast a pointer to an integer argument and back. Perhaps use union
arguments for increased portability and ease of use. Ditto for floats
(you might want float-specific arguments so the floats will be passed in
floating point registers if supported by the ABI).
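
A rough sketch of the fixed six-argument convention with union
arguments follows. The names value, fixed6_fn, sum_ints and call_fixed6
are hypothetical, and how a union is actually passed (register or
stack) is decided by the ABI, so this shows only the shape of the
interface, not a performance guarantee. It does not include separate
float-specific arguments.

```c
#include <assert.h>
#include <stddef.h>

/* One slot that can hold an integer, a double or a pointer without
   any integer/pointer casts. */
typedef union {
    long   i;
    double d;
    void  *p;
} value;

/* arg 1: count; args 2-5: first four values; arg 6: array for the rest. */
typedef value (*fixed6_fn)(size_t nargs, value a1, value a2, value a3,
                           value a4, const value *rest);

/* Example callee: sums the integer member of every argument it is given. */
static value sum_ints(size_t nargs, value a1, value a2, value a3, value a4,
                      const value *rest)
{
    value first[4] = { a1, a2, a3, a4 };
    value out;
    out.i = 0;
    for (size_t k = 0; k < nargs; k++)
        out.i += (k < 4) ? first[k].i : rest[k - 4].i;
    return out;
}

/* Caller side: always passes six arguments, padding unused slots with 0. */
static value call_fixed6(fixed6_fn fn, size_t nargs, const value *args)
{
    value pad[4];
    assert(fn != NULL);
    for (size_t k = 0; k < 4; k++) {
        if (k < nargs)
            pad[k] = args[k];
        else
            pad[k].i = 0;
    }
    return fn(nargs, pad[0], pad[1], pad[2], pad[3],
              nargs > 4 ? args + 4 : NULL);
}
```

There is a single call site and no switch on the argument count; the
callee checks nargs to decide which slots are meaningful.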

This approach also eliminates the switch statement. Most C compilers are
very bad at optimising for switch statements. They invariably include a
bounds check at every switch and only jump to the relevant code via a
single indirect jump opcode. Current superscalar architectures are likely
to mispredict the jump every single time the code to jump to changes.

Regards,
Adam

Jul 12 '06 #18

Adam Warner

On Tue, 11 Jul 2006 21:38:23 +0100, John Friedland wrote:

On Tue, 11 Jul 2006 13:49:53 -0400, Eric Sosman <Er*********@sun.com>
wrote:

> Hold your horses! Do I understand that you get to
prescribe the allowable function signatures? If so, your
problem simply vanishes: Tell the writers of the functions
that they'll receive one argument, namely, a pointer to
the start of an array containing the "real" arguments:

Hmmm. Yes, I think you're absolutely right. As long as the call looks
good/natural in the script source, what does it matter if the
user/implementor has to extract the args from an array? I don't think
it matters at all.

I could also mix this with the hard-wired approach for low parameter
numbers. Maybe all foreign function calls with, say, 8 or less
parameters would end up with the user as a normal list of function
parameters, and a call with more than 8 parameters would end up as a
call with an array.

Another portable way: Choose a fixed argument convention that is quite
efficient upon your architecture of choice. I would choose to optimise for
the Linux x86-64 ABI <http://www.x86-64.org/documentation/abi.pdf> because
its C calling convention is well designed (passing quite a few arguments
in registers) and the architecture is growing in popularity.

Figure 3.4 on page 20 (also see Appendix A - Linux Conventions) for the
x86-64 ABI shows that six integer arguments are passed in registers. As a
first approximation you might choose to benchmark this fixed six argument
calling convention:

arg 1: The number of arguments being passed (0 or more)
arg 2: Your 1st argument (if relevant)
arg 3: Your 2nd argument (if relevant)
arg 4: Your 3rd argument (if relevant)
arg 5: Your 4th argument (if relevant)
arg 6: A pointer to an array for 5+ arguments (if relevant)

This should pass up to four of your original input values in registers
upon x86-64 Linux. The downside is that you're always passing six
arguments (and other architectures may stack allocate them). You might
choose a lower fixed number of arguments if passing up to four input
values is rare.

Another caveat: If you're passing pointers via this fixed argument
convention the integer arguments will need to be of sufficient size to
cast a pointer to an integer argument and back. Perhaps use union
arguments for increased portability and ease of use. Ditto for floats
(you might want float-specific arguments so the floats will be passed in
floating point registers if supported by the ABI).

This approach also eliminates the switch statement. Most C compilers are
very bad at optimising for switch statements. They invariably include a
bounds check at every switch and only jump to the relevant code via a
single indirect jump opcode. Current superscalar architectures are likely
to mispredict this jump every single time the resulting address to jump to
changes.

Regards,
Adam

Jul 12 '06 #19
