Due 11:59pm Monday Aug 26
The goal of this project is to setup the environment and to learn how to parse and translate mathematical expressions into x86_64 assembly code. Our approach will be simple and the objective is to highlight some of the key aspects and ideas behind a compiler.
For this first project, we are going to keep things very simple and take very small steps. Make sure that you understand everything correctly before going to the next step.
At the end of the project, we will be able to parse mathematical expressions with single digit numbers, addition, substraction, multiplication, division, and parentheses. Our parser will generate an intermediate representation in the form of an Abstract Syntax Tree (AST). The definition of the AST is the one used during the lecture.
In parallel, we will implement a generator that converts the AST into x86_64 code.
Here is a small example of what we will be able to generate by the end of the project:
[info] Running project1.Runner 2+3 ============= AST ================ Plus(Lit(2),Lit(3)) ================================== ============ OUTPUT ============== .text .global entry_point entry_point: push %rbp # save stack frame for C convention mov %rsp, %rbp # beginning generated code movq $2, %rbx movq $3, %rcx addq %rcx, %rbx movq %rbx, %rax # end generated code # %rax contains the result mov %rbp, %rsp # reset frame pop %rbp ret ================================== Result: 5
The first step of this assignment is to set up your environment as described in the Getting Started - Tools page. If you have any trouble installing the tools for the course, ask the TAs for help.
Note that all of the prerequisites (even IDEs like Eclipse) are installed on the CS Department Linux machines, and you are free to use an X server such as xming to do the project remotely or follow these instructions to access the lab machines remotely. Alternatively you may do the project on your own machine and then upload your results to your home directory on data.cs.purdue.edu and turn in your results using turnin.
The project has been designed and tested for Linux/Mac OS. If you have only Windows installed on your laptop consider running Linux in a VM, or use the lab machines for the project. (However it should be working on Windows as well)
If you use remote access to work on your project, please use one of the lab machines pod1-1 to pod1-20 with the suffix cs.purdue.edu (e.g. pod1-1.cs.purdue.edu)
Download the skeleton file here .
tar xzvf proj1.tar.gz cd proj1
You can use your editor or IDE of choice. The Scala website has instructions for a range of alternatives.
A description of the different files is provided below. The two files you need to complete are Parser.scala and Generator.scala. Each parser that we are going to implement builds on top of another, so you need to implements them in order by following the comments in the code. We will implement two different generators using two different strategies. It is recommended to implement the StackASMGenerator class when it is suggested in the Parser.scala file, and implement the RegASMGenerator at the end.
The Scala Build Tool (sbt) is an open source build tool for Scala and Java projects. These files contain the information necessary to compile this project. You should not have to modify them.
To use sbt, launch a terminal and go to the project directory (proj1) and type 'sbt'. It will lauch an sbt console. You can run the program from there:
run "arg1" "arg2" // run main program with arguments arg1 and arg2 run "1+3*(5-8)" test // run all the tests application (in src/test)
This file defines multiple classes that are used to generate the code and run it on your machine. Nothing needs to be modified, but it is recommanded to read it and have an idea of what is happening behind the scenes.
As we are generating assembly code which is OS dependent, we are using GCC to do the heavy lifting for us. The boostrap file is a generic C file that is calling a function entry_point and is printing the result in stdout. Our compiler will generate the file gen/gen.s and will be assembled and bootstrapped by gcc:
gcc bootstrap.c gen.s -o out
out will then print the result of our compiled expression.
NOTE: the assembly/compilation and running steps are executed through the code. (see Util.scala#L88 and Main.scala#L39)
The main function is defined in this file. During this project, you will be implementing many different parsers in order to test them through the main function. You will need to modify the parser or generator you want to use.
This class contains the definition of our intermediate language. This is the language we are using in class; please refer to the lecture for more information.
The class SimpleParser defined in this file is the basic parser that we are going to use for this project. It is reading a stream of characters one at a time and can extract a single digit number getNum or a single letter name getName. It does not handle whitespace.
We are keeping the parser very simple at the beginning to focus on the important concepts. Later on, we will improve it to handle multiple digits numbers and more expressions. We will also support whitespace.
The rest of the file is the definition of each parser we are building, starting from single number expressions to the full language we want to target. What you have to do for this project is highlighted with TODOs in the code. We encourage you to read the comments and code carefully before proceeding.
In this file, we are defining some generators that can convert our AST into x86_64 code.
We consider two different ways of generating the assmbly code, each of which has pros and cons: Stack-based code, which is not very efficient but can handle arbitrarily complex expressions; and Register-based code, which is efficient but can not handle arbitrarily complex expressions. We will see later that compilers use a hybrid approach.
These files contain some unit tests for the first parsers. You will have to write your own tests for the others. There are some functions given to you in order to make the implementation easier.
You should turnin the proj1 directory. Please run an 'sbt clean' and './cleanall.sh' before submitting.
To turn in your project, make sure that you are in the directory that contains proj1 directory. Then type the following:
turnin -c cs502 -p proj1 proj1
To verify your turned-in work, do:
turnin -c cs502 -p proj1 -v
Your project will be tested against a set of unit tests. The weights of each task will be distributed as follow: