473,407 Members | 2,326 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes and contribute your articles to a community of 473,407 developers and data experts.

Compilers - 7: Instructions

11,448 Expert 8TB
Greetings,

Introduction

This part of the article is one week late; I apologize for that; my excuse is:
bizzy, bizzy, bizzy; I attended a nice course and I had to lecture a bit and
there simply was no time left for writing.

we've come a long way; the previous articles discussed the following aspects
of compilers:

1: a general introduction outlining the overall structure of compilers and
the languages they compile;

2: tokenizers (or lexical analyzers) that take care of transforming a character
input stream to a token input stream;

3: formal grammars that describe the syntax of a (programming) language and
form a guideline for the implementation of language parsers;

4: all the necessary bookkeeping needed by several parts of the compiler; the
tables define functions, operators and other predefined symbols;

5: parsers that take care whether or not a token stream obeys a certain syntax.
Parsers closely cooperate with the code generator part of the compiler;

6: interpreters that execute or evaluate the generated instruction sequence.

Instructions

The last part of this article discusses the instructions themselves. Every
instruction implements this interface:

Expand|Select|Wrap|Line Numbers
  1. public interface Instruction extends Serializable {
  2.  
  3.     public void execute(Interpreter interpreter) throws InterpreterException;
  4.     public void initialize(String token, int arity);
  5. }
  6.  
The second method is only used by the code generator when the instruction is
constructed. The first method is invoked by the interpreter. That's all the
interpreter is interested in: to invoke every instruction in a sequence of
instructions.

Here's the top part of the Instruction class hierarchy:

Expand|Select|Wrap|Line Numbers
  1. Instruction
  2.     |
  3.     +-------+- AbstractInstruction
  4.         |
  5.         +-------+- AbstractFunctionInstruction
  6.         |    |
  7.         |    +----------- FunctionInstruction
  8.         |    |
  9.         |    +----------- UserInstruction
  10.         |
  11.         +--------- QuoteInstruction
  12.  
The AbstractInstruction class implements the name and arity of a particular
instruction. The arity of the instruction equals the number of arguments on
the stack taken by that particular instruction; e.g. the arity of the '+'
operator equals 2 and the name equals "+".

Below the AbstractInstruction there are two subtypes of instructions: the
AbstractFunctionInstruction and the QuoteInstruction. The first subclass
implements ordinary built-in functions as well as user defined functions.

The QuoteInstruction implements special function type instruction. An example
of such an instruction is the IfInstruction: the arity equals 3 and its name
is "if". The allows us to type things like this:

Expand|Select|Wrap|Line Numbers
  1. if (x, y, z)
  2.  
If expression 'x' is not equal to zero the value of the 'if' function equals
'y', otherwise the value equals 'z'. As you could have read in a previous
part of this article, the generated instructions for x, y and z are passed to
the IfInstruction as is, i.e. the 'if' instruction passed the sequence of
instructions that make up expression 'x' to the interpreter to be executed.
If the top element of the stack is non-zero, the instruction sequence for 'y'
is passed to the interpreter, othewise the instruction sequence for expression
'z' is passed to the interpreter.

The FunctionInstruction is the base class for ordinary built-in functions
such as 'sin', 'cos', 'log' etc. The UserInstruction handles all user defined
functions.

Functions

Operators are implemented as FunctionInstructions. There's nothing strange
about it; compare this:

Expand|Select|Wrap|Line Numbers
  1. add(x, y)
  2. x+y
  3.  
It's just the notation in the source code that is a bit 'strange', i.e. the '+'
operator doubles as the 'add' function; both are the same.

A function pops its arguments from the stack and computes its value. For the '+'
operator the operands are added and the result is pushed back on the stack. For,
say, the 'sin' function, one operand is popped from the stack, the sine is
computed and the result is pushed back on the stack again.

Instead of implementing all that pushing and popping in every class the
AbstractFunctionInstruction takes care of it all. See the source code attached
with the previous part of this article.

The following functions are implemented:

+ - ! (unary operators)
< <= == != >= >
+ -
* /
^
sin cos tan
exp log
max min abs

All functions have well known semantics.

Stack functions

On top of those 'normal' functions, other functions are available that are able
to explicitly manipulate the data stack.

The following functions are implemented:

arg(n) : move the n-th element from the top to the top of the stack;
top(n) : copy the n-th element from the top to the top of the stack;
clear() : clear the entire stack;
depth() : push the number of stack elements on the stack.

You can play with these functions using the software available in the attachment
of the previous part of this article. If you ruin the stack no harm is done, i.e.
the interpreter signals an incorrect stack and throws an exception which will
be printed and execution simply stops. If you want to use the Interpreter class
in you own software you can print a message and start evaluating another sequence
of instructions. Feel free to play with it.

Lists

Our little language supports simple lists, e.g. the '+' function/operator can
be invoked on simple double operands as in, say "3+4" but one or both of its
operands can also be lists. If more than one operand is a list all lists have
to be 'simple' which means: one dimensional lists of the same length; here
are a few examples:

Expand|Select|Wrap|Line Numbers
  1. {3, 4} + 5         // {8, 9}
  2. {3, 4} + {5, 6}        // {8, 10}
  3. {3, 4} + {5, 6, 7}    // error
  4. {3, 4} + {5, {6, 7}}    // error
  5. min({3, 6}, {5, 4})    // {3, 4}
  6. max({3, 6}, 5)        // {5, 6}
  7.  
The AbstractFunctionInstruction takes care of it all; if at least one of the
operands is a list then all lists that represent the operand(s) of the function
need to be simple and of the same length. The AbstractFunctionInstruction passes
the double operands to the function and collects the results. Finally it builds
a list out of the results again.

Some functions bypass this mechanism, i.e. they handle list operands themselve.
Remember the 'listfunc' reserved word? User functions defined using the 'listfunc'
reserved word are marked such that the mechanism described above is bypassed.
There are also a few built-in functions that handle lists:

append(x, y) // append y to x
aslist(n) // store the top n stack elements in a list
concat(x, y) // catenate x and y in a new list
head(x) // first element of list x
tail(x) // all but the first element of list x
join(x, y) // join the two lists x and y
size(x) // number of elements in list x

Again, feel free to play with those functions; it's fun. See if you can find what
append, concat and join really do and how they differ.

Quote instructions

One of the QuoteInstructions has been mentioned above already; there are three
of those instructions defined:

if(x, y, z) // if (x) then y else z
while(x, y) // 0 or the last result of y
generate(x, y) // while (x) append y to a list

Please do play a bit with the included software (see previous article part)
and if you feel adventurous use the already implemented classes as a blueprint
for your own functions. If won't be much of an adventure. Don't forget to alter
the .properties files accordingly.

Concluding remarks

Given all the software described in this large article it is a breeze to
implement, say, an expression evaluator. Just a few lines of code will do the
job. This interpreter can do much more than just that but I leave it to your
imagination to come up with great ideas that can put this little system to
good work. Even just printing out the generated instructions (try it!) can
be an educational experience: you'll see infix to postfix translation for free.

Here's that evaluator method:

Expand|Select|Wrap|Line Numbers
  1. // evaluate the expression, throws InterpreterExcetpion if anything's wrong
  2. public double evaluate(String expression) throws InterpreterException {
  3.  
  4.     // parse the string
  5.     Code code= new Parser().parse(new StringReader(expression));
  6.  
  7.     // do the evaluation and get the stack top
  8.     Value value= new Interpreter().executeSub(code);
  9.  
  10.     // convert the value back to a double
  11.     return value.getDbl();
  12. }
  13.  
I hope people have actually read this all. If you did: congratulations, you
must've learned quite a bit and I hope the included software can be of any
use for you.

The next tip will be about text processing and querying. I haven't actually
implemented anything yet but I'll come up with something.

kind regards,

Jos
Jul 4 '07 #1
0 4034

Sign in to post your reply or Sign up for a free account.

Similar topics

17
by: Ozo | last post by:
What would be the C++ compiler producing the fastest code for Windows XP Pro (32-bit)? I have to choose between these two: - Visual C++ 2005 compiler - Intel C++ Compiler 9.0 for Windows My...
4
by: nutty | last post by:
Hi, In my code I use two implicit casts in a row. VC compiled it in all version I have. 7.1 and 8.0, but it seems not to be standard. Comeau in strict mode and gcc don't accept it. Of course,...
0
by: JosAH | last post by:
Greetings, this week's compiler article is all about bookkeeping; boring, I admit it, but we need it for our Tokenizer and Parser(s). Two weeks ago I showed the Tokenizer class code. It uses a...
0
by: JosAH | last post by:
Greetings, welcome back at the sequel of the parsers article chapter. This part is dedicated to the ExpressionParser, the largest parser class for our little language. This class parses a...
0
by: JosAH | last post by:
Greetings, this week's article part discusses the parsers used for our little language. We will implement the parsers according to the grammar rules we defined in the second part of this...
1
by: JosAH | last post by:
Greetings, welcome back again. The part of the articles rounds up the discussion on the Interpreter and disusses some code uploaded in the attachment. The Interpreter again The Interpreter...
0
by: JosAH | last post by:
Greetings, last week we completed an important part of our compiler. We can parse and compile source text. Source texts contain a list of expressions and function definitions. Either compilation...
0
by: JosAH | last post by:
Greetings, welcome back to the second part of this week's tip. This article part shows you the Tokenizer class. Let's do it in small parts: public class Tokenizer { // default size of a...
2
by: justinanthonyhamilton | last post by:
Hello, everyone. I recently took a class on Data Structures in C++ (using D.S. Malik's C++ Programming: Program Design Including Data Structures), and while I learned a good about specific data...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.