Script Assembler in C++: What operations should I implement?

I wish to create my own assembly language for script. For now it is
mostly for fun and for the sake of the learning, but I am also creating
a game engine where I want this system in. Once the assembler is done I
will develop a simple script language/compiler which uses this
assembler. What operators do you suggest I implement in it?

Yours,
Morten Aune Lyrstad
Jul 22 '05 #1
That should be operations, not operators. Sorry! :-)
Jul 22 '05 #2
That should be which operations, not what operators... Sorry! ;-)

Yours,
Morten Aune Lyrstad
Jul 22 '05 #3
Morten Aune Lyrstad wrote:

I wish to create my own assembly language for script. For now it is
mostly for fun and for the sake of the learning, but I am also creating
a game engine where I want this system in. Once the assembler is done I
will develop a simple script language/compiler which uses this
assembler. What operators do you suggest I implement in it?


First of all you should think about what type of 'virtual CPU' you
want to implement. Then start thinking about what operations this
virtual CPU should support.

For starters: I recommend a stack machine (that is: no registers,
all arithmetic works on a stack by popping the operands from the stack,
performing the operation, and pushing the result back onto the stack).

So obvious operations are:
push
pop
cmp (pop top and (top+1) of stack, test them for equality,
push true (=1) or false (=0) onto stack)
jump
jump_if_zero (pop top of stack, jump if it equals 0)
load (from memory address specified at top of stack, push value onto stack)
store (store *top at memory address specified with (top+1), pop both from stack)
add
subtract
multiply
divide
...

A statement like
i = j + 2

will translate to

push <address_of_i>
push <address_of_j>
load
push 2
add
store
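
To make the sequence above concrete, here is a minimal C++ sketch of an interpreter that executes it. The names `Op`, `Instr`, `run` and `assign_i_j_plus_2`, the flat `int` memory, and the choice of address 0 for i and address 1 for j are all assumptions made for illustration; the thread does not prescribe an encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical opcode names; the thread only lists the operations informally.
enum class Op { Push, Load, Store, Add };

struct Instr {
    Op op;
    int arg;  // only used by Push
};

// Runs a straight-line program against a flat memory of ints.
// Addresses are plain indices into `memory` (an assumption of this sketch).
void run(const std::vector<Instr>& program, std::vector<int>& memory) {
    std::vector<int> stack;
    auto pop = [&stack]() {
        assert(!stack.empty() && "stack underflow");
        int v = stack.back();
        stack.pop_back();
        return v;
    };
    for (const Instr& in : program) {
        switch (in.op) {
        case Op::Push:
            stack.push_back(in.arg);
            break;
        case Op::Load: {  // replace the address on top with the value stored there
            int addr = pop();
            stack.push_back(memory.at(static_cast<std::size_t>(addr)));
            break;
        }
        case Op::Store: {  // top is the value, the element below it is the address
            int value = pop();
            int addr = pop();
            memory.at(static_cast<std::size_t>(addr)) = value;
            break;
        }
        case Op::Add: {
            int rhs = pop();
            int lhs = pop();
            stack.push_back(lhs + rhs);
            break;
        }
        }
    }
}

// i = j + 2, with i at address 0 and j at address 1.
std::vector<Instr> assign_i_j_plus_2() {
    return {
        {Op::Push, 0},   // push <address_of_i>
        {Op::Push, 1},   // push <address_of_j>
        {Op::Load, 0},   // load
        {Op::Push, 2},   // push 2
        {Op::Add, 0},    // add
        {Op::Store, 0},  // store
    };
}
```

Starting from a memory of {0, 40}, running `assign_i_j_plus_2()` leaves 42 at address 0 and j untouched.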

Once you have that, other operations will come to mind as you need them.
E.g. you will find that you often generate the sequence

....
push 0
cmp

where the sequence push 0, cmp could easily be replaced with a dedicated
cmp_zero

or with various jump types (jump_if_zero, jump_if_not_zero, jump_if_positive,
jump_if_negative, etc.).
It's a lot of fun to design your own CPU and figure out how well it can be
programmed.
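
One possible way to implement such a family of jumps in C++ is to share a single helper that differs only in the test applied to the popped value. This is just a sketch: the `Jump` enum and the `next_pc` function are invented names, and it assumes that every conditional jump pops the value it tests.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical jump kinds, mirroring the variants named above.
enum class Jump { Always, IfZero, IfNotZero, IfPositive, IfNegative };

// Returns the program counter to continue at: `target` if the jump is taken,
// otherwise the instruction following `pc`. Conditional jumps pop the tested value.
std::size_t next_pc(Jump kind, std::size_t pc, std::size_t target, std::vector<int>& stack) {
    if (kind == Jump::Always) {
        return target;  // unconditional: the stack is not touched
    }
    assert(!stack.empty() && "stack underflow");
    int v = stack.back();
    stack.pop_back();

    bool taken = false;
    switch (kind) {
    case Jump::IfZero:     taken = (v == 0); break;
    case Jump::IfNotZero:  taken = (v != 0); break;
    case Jump::IfPositive: taken = (v > 0);  break;
    case Jump::IfNegative: taken = (v < 0);  break;
    case Jump::Always:     break;  // handled above
    }
    return taken ? target : pc + 1;
}
```

With this shape, adding another jump type costs one enum value and one line in the switch, which keeps experimenting with the instruction set cheap.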

--
Karl Heinz Buchegger
kb******@gascad.at
Jul 22 '05 #4
Wow; a great reply! Thank you very much!

Many great tips, plus good explanations. I love this! I was indeed
thinking of creating a stack machine. Your answer will help me start
off. Great!

Thanks,
Morten Aune Lyrstad
Jul 22 '05 #5
Morten Aune Lyrstad wrote:

Wow; a great reply! Thank you very much!

Many great tips, plus good explanations. I love this! I was indeed
thinking of creating a stack machine. Your answer will help me start
off. Great!


Another starting point: Search the Web for resources on the programming
language 'Forth'. It is a very simple programming language which
makes the stack principle its core element. You can get lots of
ideas on possible operations and how to implement them from this
language.
--
Karl Heinz Buchegger
kb******@gascad.at
Jul 22 '05 #6
Would you suggest that I make a push and pop for each different
datatype? Or should I make a more generic system?
Jul 22 '05 #7
Morten Aune Lyrstad wrote:

Would you suggest that I make a push and pop for each different
datatype? Or should I make a more generic system?


I think it is easier (and faster too) to create multiple
data stacks: one for each data type. You won't have many:
integers, floating point and strings
(integers and floating point could be collected in one data
type by storing integers as floating point).
Then you need to duplicate the set of operations for each data
type, and of course you will need operations for transferring
the top-of-stack element from one stack to another (He, he:
this is where the fun begins. Introduce a new 'hardware
concept' and see what 'assembly' instructions you need to
support it).

The 'assembly' instructions themselves get simpler if they
don't need to figure out what's actually on the stack and
react accordingly. That's something the compiler can do
by generating different instructions. On the other hand,
the instruction set gets bigger, but that's not a big
deal :-)
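
A rough C++ sketch of the multiple-stack idea, assuming just an integer stack and a floating point stack (strings are left out for brevity). The `Machine` struct and the instruction names `push_i`, `add_f`, `i_to_f` and so on are made up for this example and are not taken from any existing VM.

```cpp
#include <cassert>
#include <vector>

// Pops and returns the top element of any of the typed stacks.
template <typename T>
T pop(std::vector<T>& stack) {
    assert(!stack.empty() && "stack underflow");
    T v = stack.back();
    stack.pop_back();
    return v;
}

// One data stack per type.
struct Machine {
    std::vector<int> ints;
    std::vector<double> floats;

    void push_i(int v) { ints.push_back(v); }
    void push_f(double v) { floats.push_back(v); }

    // Typed arithmetic: each instruction knows its stack, so it needs no runtime type check.
    void add_i() {
        int rhs = pop(ints);
        int lhs = pop(ints);
        ints.push_back(lhs + rhs);
    }
    void add_f() {
        double rhs = pop(floats);
        double lhs = pop(floats);
        floats.push_back(lhs + rhs);
    }

    // Transfer instructions move the top element from one stack to the other.
    void i_to_f() { floats.push_back(static_cast<double>(pop(ints))); }
    void f_to_i() { ints.push_back(static_cast<int>(pop(floats))); }  // truncates toward zero
};
```

For an expression such as 3 + 1.5, the compiler would emit push_i 3, i_to_f, push_f 1.5, add_f: the decision about which add to use is made at compile time, not while the script runs.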

--
Karl Heinz Buchegger
kb******@gascad.at
Jul 22 '05 #8
Great idea, thanks again! :-)
Jul 22 '05 #9
The comparison operations (like cmp) do not pop the values they test
from the stack; am I right?
Jul 22 '05 #10
Let me rephrase that: How do I determine which operations pop from the
stack, and which do not?
Jul 22 '05 #11
Morten Aune Lyrstad wrote:
Let me rephrase that: How do I determine which operations pop from the
stack, and which do not?


All pop except those that push.

And may I suggest you take it to comp.programming? I see no C++ in this
discussion...
Jul 22 '05 #12
Morten Aune Lyrstad wrote:

The comparison operations (like cmp) do not pop the values they test
from the stack; am I right?


Sure it does.
Everything comes from the stack and results go back to the stack.
Why should cmp know about variables? It is the job of the code prior
to cmp to arrange everything correctly on the stack.
Where else should you get them from?

(Hint: In the future, whenever you think of a number, think of an expression.)

So how does

if( i + 5 == j * 7 )
k = 8

get translated?

Well, e.g. it could look like this:

push <address_of_i>
load
push 5
add ; i + 5 is evaluated

push <address_of_j>
load
push 7
multiply ; j * 7 is evaluated

; at this time the stack contains (in that pop-ing order)
; the result of j * 7
; the result of i + 5

cmp ; cmp will pop those 2 results from
; the stack, compare them and push
; 0 if not equal
; 1 if equal

jmp_zero label_1 ; jmp_zero pops a value from the stack
; if this value equals 0, then the jump is taken
push <address_of_k>
push 8
store
label_1: .....

Every 'assembler' instruction is simple. It is the combination of instructions
which gives the power.
The reason for this: flexibility!
In this specific example: who says that the result of cmp must be used for deciding
whether to branch? The result of a comparison can equally well be assigned to a variable:

k = ( i == 5 )

push <address_of_k>
push <address_of_i>
load
push 5
cmp
store

Now k will contain 1 if i equals 5, or 0 otherwise.
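
For the C++ side, here is a small sketch of how cmp and jmp_zero might be executed, using the if( i + 5 == j * 7 ) k = 8 translation from above. It assumes a program counter that indexes into a vector of instructions and jump targets that were already resolved to indices; `Op`, `Instr`, `run` and `if_program`, as well as the addresses 0, 1 and 2 for i, j and k, are illustrative choices only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Op { Push, Load, Store, Add, Mul, Cmp, JmpZero };

struct Instr {
    Op op;
    int arg;  // Push: the value; JmpZero: index of the target instruction
};

void run(const std::vector<Instr>& program, std::vector<int>& memory) {
    std::vector<int> stack;
    auto pop = [&stack]() {
        assert(!stack.empty() && "stack underflow");
        int v = stack.back();
        stack.pop_back();
        return v;
    };
    std::size_t pc = 0;
    while (pc < program.size()) {
        const Instr& in = program[pc];
        ++pc;  // default: continue with the next instruction
        switch (in.op) {
        case Op::Push: stack.push_back(in.arg); break;
        case Op::Load: {
            int addr = pop();
            stack.push_back(memory.at(static_cast<std::size_t>(addr)));
            break;
        }
        case Op::Store: {
            int value = pop();
            int addr = pop();
            memory.at(static_cast<std::size_t>(addr)) = value;
            break;
        }
        case Op::Add: { int rhs = pop(); int lhs = pop(); stack.push_back(lhs + rhs); break; }
        case Op::Mul: { int rhs = pop(); int lhs = pop(); stack.push_back(lhs * rhs); break; }
        case Op::Cmp: {  // pops both operands, pushes 1 if equal, 0 otherwise
            int rhs = pop();
            int lhs = pop();
            stack.push_back(lhs == rhs ? 1 : 0);
            break;
        }
        case Op::JmpZero:  // pops the tested value, jumps if it is 0
            if (pop() == 0) pc = static_cast<std::size_t>(in.arg);
            break;
        }
    }
}

// if (i + 5 == j * 7) k = 8, with i, j, k at addresses 0, 1, 2.
std::vector<Instr> if_program() {
    return {
        {Op::Push, 0}, {Op::Load, 0}, {Op::Push, 5}, {Op::Add, 0},  // indices 0..3: i + 5
        {Op::Push, 1}, {Op::Load, 0}, {Op::Push, 7}, {Op::Mul, 0},  // indices 4..7: j * 7
        {Op::Cmp, 0},                                               // index 8
        {Op::JmpZero, 13},                                          // index 9: skip to label_1
        {Op::Push, 2}, {Op::Push, 8}, {Op::Store, 0},               // indices 10..12: k = 8
    };                                                              // index 13 is label_1 (end)
}
```

With i = 9 and j = 2 both sides evaluate to 14, so k becomes 8; with i = 1 and j = 1 the jump is taken and k keeps its old value.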

--
Karl Heinz Buchegger
kb******@gascad.at
Jul 22 '05 #13
I get it; thanks again! :-)
Jul 22 '05 #14
You seem to have some knowledge about this. What do you think about this
as a translated 'for' loop? My virtual machine can handle this now. It
should be the equivalent of

for (int i = 0; i < 10; i++)
{
}

-------------------- SAGA code --------------------
push [addr0] ;\
push 10 ;} Store 10 into the first variable
store ;/
push [addr1] ;\
push 0 ;} Store 0 into the second variable
store ;/
loopStart:
push [addr1] ;} Load the second variable
load ;/
push [addr0] ;} Load the first variable
load ;}
lessi ; Compare integer. Is A less than B?
jmpzero endLabel ; No, exit loop
push [addr1] ; Push address of second variable
push [addr1] ;} Load the second variable
load ;/
push 1 ;} Add 1 to the value
addi ;/
store ; Store into the second variable
jmp loopStart ; Start loop over
endLabel:
end ; End function
------------------ SAGA code end ------------------

I really appreciate your help.

Yours,
Morten Aune Lyrstad
Jul 22 '05 #15
On Sat, 08 Jan 2005 02:31:21 +0100, Morten Aune Lyrstad
<mo****@wantsno.spam> wrote:
You seem to have some knowledge about this. What do you think about this
as a translated 'for' loop? My virtual machine can handle this now. It
should be the equivalent of

for (int i = 0; i < 10; i++)
{
}


[x86 assembler code snipped]

My opinion is that compiler-specific implementation details, and
assembly programming in general, are both seriously off-topic in this
newsgroup.

--
Bob Hairgrove
No**********@Home.com
Jul 22 '05 #16
Bob Hairgrove wrote:
On Sat, 08 Jan 2005 02:31:21 +0100, Morten Aune Lyrstad
<mo****@wantsno.spam> wrote:

You seem to have some knowledge about this. What do you think about this
as a translated 'for' loop? My virtual machine can handle this now. It
should be the equivalent of

for (int i = 0; i < 10; i++)
{
}

[x86 assembler code snipped]

My opinion is that compiler-specific implementation details, and
assembly programming in general, are both seriously off-topic in this
newsgroup.

--
Bob Hairgrove
No**********@Home.com


Perhaps; but you should read the entire thread before making a judgment.
First of all: This has nothing to do with x86 assembly or assembly
programming at all. It is a part of a virtual machine system I am
writing in C++. Second: this is a continuation of a thread I have had
here for quite some time, and since I have been getting help from a specific
person here, I can't just skip to another newsgroup.
Jul 22 '05 #17
"Morten Aune Lyrstad" <mo****@wantsno.spam> wrote...
Perhaps; but you should read the entire thread before making a judgment.
First of all: This has nothing to do with x86 assembly or assembly
programming at all. It is a part of a virtual machine system I am writing
in C++. Second: this is a continuation of a thread I have had here for
quite some time, and since I have been getting help from a specific person
here, I can't just skip to another newsgroup.


You seem to have been seriously misled.

When you want to discuss what you're implementing, it's whatever you
are implementing that defines the topic of your postings, not what
language you're implementing it in. So, just like in your case of
some virtual machine, databases, cad, text editors, client-server
systems, OS, simulations of any kind, etc., are OFF-TOPIC here, even
if they are implemented _in_ C++. Since you're not talking _about_
C++ _itself_, you're not on topic in comp.lang.c++. Comprehend?

When you want to converse with a particular person about an off-topic
subject, you take your conversation elsewhere, _usually_ to personal
e-mail. Since you have been getting help from that particular person,
he will not feel disconnected when you take it elsewhere, you just
have to let him know as well. Of course, such continuation needs to
be agreed upon by both parties.

Victor
Jul 22 '05 #18
Ok, ok, I yield; jeez! :-)
Jul 22 '05 #19
So where should I go then? Is there some place that is a bit more open
than this?
Jul 22 '05 #20
"Morten Aune Lyrstad" <mo****@wantsno.spam> wrote in message
news:KD******************@news4.e.nsc.no...
So where should I go then? Is there some place that is a bit more open
than this?

comp.programming is very general, and was already suggested by Victor.
And what about comp.compilers ?

Each forum has a goal and etiquette, and one has to respect this.
I agree that some "you're OT" comments here can be (excessively) brutal,
but C++ is found in all fields of programming. Without these reminders,
this NG would soon be overloaded with too many off-topic threads.

Cheers,
Ivan
--
http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form
Jul 22 '05 #21
Morten Aune Lyrstad wrote:

You seem to have some knowledge about this.

Actually I like compiler construction and have built various
virtual CPUs for it. It's all about practice.

What do you think about this
as a translated 'for' loop? My virtual machine can handle this now. It
should be the equivalent of

for (int i = 0; i < 10; i++)
{
}

-------------------- SAGA code --------------------
push [addr0] ;\
push 10 ;} Store 10 into the first variable
store ;/
push [addr1] ;\
push 0 ;} Store 0 into the second variable
store ;/
loopStart:
push [addr1] ;} Load the second variable
load ;/
push [addr0] ;} Load the first variable
load ;}
lessi ; Compare integer. Is A less than B?
jmpzero endLabel ; No, exit loop
push [addr1] ; Push address of second variable
push [addr1] ;} Load the second variable
load ;/
push 1 ;} Add 1 to the value
addi ;/
store ; Store into the second variable
jmp loopStart ; Start loop over
endLabel:
end ; End function
------------------ SAGA code end ------------------

I really appreciate your help.


Hmm. In the source code, 10 is a compile-time constant, not a variable.
So I would have expected something like this:

push [addr1] ;\
push 0 ;} Store 0 into the second variable
store ;/
loopStart:
push [addr1] ;} Load the second variable
load ;/
push 10 ;} Load the constant
lessi ; Compare integer. Is A less than B?
jmpzero endLabel ; No, exit loop
push [addr1] ; Push address of second variable
push [addr1] ;} Load the second variable
load ;/
push 1 ;} Add 1 to the value
addi ;/
store ; Store into the second variable
jmp loopStart ; Start loop over
endLabel:
end ; End function

Other than that: looks good.
Now we are back to your original question: 'What instructions should
a virtual CPU have'. Looking at your code, you will easily see a need
for an additional instruction: inci (increment integer). The sequence
push 1
addi
can easily be replaced by
inci ; increment top of stack element

Other improvements: You will often see the sequence
push [some_addr]
load
which can easily be replaced with
load_from [some_addr]

and so on, and so on.
This is what I meant earlier: By looking at the generated assembly
one can often see possible improvements and adjust the instruction
set to include them. But beware: If you see such patterns, make
sure that they are frequent enough that it really pays to introduce
a new instruction. E.g. I once created a language which included
elements for doing 3D-graphics programming. In the instruction set
I had instructions for creating points, lines, edges, faces, lights,
and so on. I thought for a while about also including a special instruction
for calculating the Pythagorean distance ( sqrt(x*x+y*y+z*z) ) between 2
points. But I noticed that it was used far too infrequently to be worth
keeping.
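
Replacements like these could also be applied automatically by a small pass over the generated code. The following C++ sketch shows one way to do it; `peephole`, `Op` and `Instr` are hypothetical names, and the pass assumes that no jump targets the second instruction of a fused pair (a real assembler would have to check that, and would have to remap jump targets because the code gets shorter).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Op { Push, Load, Store, AddI, IncI, LoadFrom };

struct Instr {
    Op op;
    int arg;  // Push: value or address; LoadFrom: address; unused otherwise
};

// Rewrites two-instruction patterns into the dedicated single instructions:
//   push 1, addi    -> inci
//   push addr, load -> load_from addr
std::vector<Instr> peephole(const std::vector<Instr>& in) {
    std::vector<Instr> out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        const bool has_next = i + 1 < in.size();
        if (has_next && in[i].op == Op::Push && in[i].arg == 1 && in[i + 1].op == Op::AddI) {
            out.push_back({Op::IncI, 0});
            ++i;  // skip the addi that was folded in
        } else if (has_next && in[i].op == Op::Push && in[i + 1].op == Op::Load) {
            out.push_back({Op::LoadFrom, in[i].arg});
            ++i;  // skip the load that was folded in
        } else {
            out.push_back(in[i]);
        }
    }
    assert(out.size() <= in.size());  // the pass only ever shrinks the code
    return out;
}
```

Applied to the six-instruction increment from the loop body (push [addr1], push [addr1], load, push 1, addi, store), this yields four instructions: push [addr1], load_from [addr1], inci, store. Counting how often each rule fires over real scripts is a cheap way to check whether a new instruction is frequent enough to pay off.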

--
Karl Heinz Buchegger
kb******@gascad.at
Jul 22 '05 #22
