The Itti language will resemble nice features from several well known computer languages, but have a set of quirks that are uncommon. You will not have an extensive library for input and output (or any other operation for that matter).
The lexical structure of Itti is defined as follows:
- Ignore white space (blanks, tabs, new lines, and form feeds).
- An identifier is a string made up of letters, digits, and underscores that does not begin with a digit. The maximum number of characters you should pay attention to are the first 32. Ignore everything after that. (Token name is IDENTIFIER.)
- A character constant is enclosed in single quotes. (Token name is CCONSTANT.)
- A string constant is enclosed in double quotes. (Token name is SCONSTANT.)
- Special characters should be treated as C does using a backslash.
- An integer is a sequence of digits 0, ..., 9. (Token name is ICONSTANT.)
- A floating point number has 64 bits (i.e., a C double) and the following
- Mantissa only: 123., 123.3, 0.3, .3
- Mantissa and exponent: 0.123d42, 1.23d-3, 0.001d+10
- Keywords should processed specially:
- begin (Token name BBEGIN.)
- Operators (and punctuation) that need to be recognized are the following:
-> ARROW = ASSIGN : COLON , COMMA && DAND || DBAR == DEQ >= GE > GT [ LBRACKET <= LE ( LPAREN < LT - MINUS % MOD ~= NE ~ NOT + PLUS ] RBRACKET ) RPAREN ; SEMI / SLASH * STAR // COMMENT . PERIOD
Tokens do not cross new lines. If lex cannot produce a string or number, you do not either. An error message is appropriate if you truncate something, however. An underflow (a floating point number that rounds off to 0 instead of whatever it really is in infinite precision arithmetic), however, is not an error, and should be treated as 0.
Defining It by Example
The easiest way to see what the language really is, is to study an example. Below is an example to solve the heat equation on a patch. With a little work, it could even be a set of patches that were chosen using an adaptive mesh refinement method and could be turned into a parallel programming code. Oh, no... It's a partial differential equation solver in a compiler course!!! (Now who let this into a compiler course?)
Comments aside, find the bugs and report them to me.
// This is the Fall 2002 target code for CS 441G. // Your compiler should be able to lex, parse, and generate code for the // entire file by the end of the course. // Watch out for inconsistencies in this file and report them. begin program AMRsolve; begin class Grid; // Declarations for class int imin, imax; // first and last indices in dimension i int jmin, jmax; // first and last indices in dimension j double data[imin:imax, jmin:jmax]; // actual data begin procedure Make( int i0, i1, j0, j1; ); // Set bounds imin = i0; imax = i1; jmin = j0; jmax = j1; // Allocate data in memory (initialized to 0) allocate data; end procedure; // Make begin procedure SetInterior( double val; ) int i, j; // loop variables i = imin+1; it val == 0.0 -> return; else -> j = jmin+1; it data[i,j] = val; j = j + 1; loop j < jmax; ti; i = i + 1; loop i < imax; ti; end procedure; // SetInterior begin function int Iterate( int iters; double rnorm; ); int itr; // loop variable double nrm; // norm of data itr = 0; it // loop variables only defined inside of outermost it...ti int i, j; itr < iters -> // first case of itr loop i = imin+1; nrm = 0.; it i < imax -> j = jmin+1; it j < jmax -> data[i,j] = 2.5d-1 * ( data[i-1,j] + data[i+1,j] + data[i,j-1] + data[i,j+1] ); nrm = nrm + data[i,j] * data[i,j]; j = j + 1; loop j < jmax; ti; // j loop i = i + 1; loop i < imax; ti; // i loop nrm <= rnrm -> // second case of itr loop return itr; loop itr <= itrmax; ti; // itr loop end function; // Iterate begin procedure Destroy(); destroy data; // Remove data from memory imin = 0; imax = 0; jmin = 0; jmax = 0; // never hurts end procedure; // Destroy end class; // Grid // The main program, actually is hidden right here. Grid u; // Declare int i; // Number of iterations run u.Make( 0, 127, 32, 255); u.SetInterior( 2.0 ); i = u.Iterate( 100, 1.0d-3 ); printf( "%d iterations\n", i ); // Pass this straight through to C end program; // AMRsolve
Gee, this program is really ugly coding... :-(
Note: declarations can occur right after either a begin or an it statement. Both constructs define a block and have a matching end or ti to end the block. After the block ends, all of the variables declared in the block become undefined and their memory should be freed automatically.
The Symbol Table
The symbol table will be a pain in the neck all semester. Just when you think that you have it just right, someone in your group will have an inspiration and need a modification or addition. So be really flexible and defensive in designing your symbol table.Start by entering data such as an identifier name, constant value, the scope, and the memory location (if appropriate). Keep all attributes of a name in a separate record. Also, think of a fast way to access the symbol table. My favorite is a combination of hashing the symbols and then using a linked list to get to the right symbol table entry.
Your lexer can help. For many tokens, there is the token value (e.g., ICONSTANT) and a token subvalue (e.g., the binary value of the string in yytext) that can be stored in the global variable yylval. You should take advantage of this as often as possible in the early stages of compiling.
Finally, Itti allows for variable to be redefined inside of blocks. Watch out for the nested loop variables and other equally unsavory variables. Your symbol table must resolve entries correctly so that the right instance of a variable is always used in the right place.
The Translation Output
You will produce C code that resembles assembly language. Only very simple operations are allowed.