Language Description

Introduction

The F04 language will be similar to standard algorithmic languages, but will not follow any one in particular. You will not have an extensive library for input, output, system calls, or memory allocation.

The lexigraphical structure of the language is defined as follows:

Operator

Token name
= ASSIGN
: COLON
, COMMA
// COMMENT
&& DAND
/ DIVIDE
|| DOR
== DEQ
>= GEQ
> GT
[ LBRACKET
<= LEQ
( LPAREN
< LT
- MINUS
% MOD
* MULTIPLY
. PERIOD
~= NE
~ NOT
. PERIOD
+ PLUS
] RBRACKET
) RPAREN
; SEMI

Tokens do not cross new lines. If lex cannot produce a string or number, you do not either. An error message is appropriate if you truncate something, however. An underflow (a floating point number that rounds off to 0 instead of whatever it really is in infinite precision arithmetic), however, is not an error, and should be treated as 0.

Defining It by Example

The easiest way to see what the language really is, is to study an example. Our example solves the heat equation on a patch. With a little work, it could even be a set of patches that were chosen using an adaptive mesh refinement method and could be turned into a parallel programming code. Oh, no... Who let a partial differential equation solver into a compiler course!!!

// This is the Fall 2004 target code for CS 441G.
// Your compiler should be able to lex, parse, and generate code for the
// entire file by the end of the course.

// Watch out for inconsistencies in this file and report them.

begin program AMRsolve;

    begin class Grid;

        // Variable declarations for class
        int    imin, imax;        // first and last indices in dimension i
        int    jmin, jmax;        // first and last indices in dimension j
        double    data[imin:imax, jmin:jmax];    // actual data

        begin procedure Make( int i0, i1, j0, j1; );

            // Local variable
            int    j;

            // Check and set bounds
            if i0 <= 0 || i1 < i0 || j0 <=0 || j1< j0 then
                    printf("Error in array bounds in call to Grid.Make()\n");
                    return;
            else
                    imin = i0;  imax = i1;  jmin = j0; jmax = j1;
            end if

            // Zero data
            begin do int i = i0 : i <= i1 : i = i+1;
                begin do j = j0 : j <= j1;
                    data[i,j] = 0.;
                    j = j + 1;
                end do;
            end do;

        end procedure; // Make

        begin procedure SetInterior( double val; )

            int    i, j;            // loop variables

            i = imin+1;
            begin do i = imin + 1 : i < imax : i = i + 1;
                begin do j = jmin + 1 : j < jmax : j = j + 1;
                    data[i,j] = val;
                end do;
            end do;

        end procedure; // SetInterior

        begin function int Iterate( int iters; double rnorm; );

            double    nrm;          // Norm of data
            int       i, j;         // Utility variables
            nrm = rnorm + 1.0e0;

            // Loop at most iters times while the error norm is greater than rnorm
            begin do int itr = 1 : i <= iters : i = i + 1 while nrm > rnrm;

                begin do i = imin : i <= imax : i = i + 1;
                    begin do j = jmin : j <= jmax : j = j + 1;
                        data[i,j] = 2.5e-1 * ( data[i-1,j] + data[i+1,j]
                                                + data[i,j-1] + data[i,j+1] );
                        nrm = nrm + data[i,j] * data[i,j];
                    end do;
                end do;

            end do;

            return itr;

        end function; // Iterate

        begin procedure Destroy();

            imin = 0; imax = 0; jmin = 0; jmax = 0;    // Never hurts

        end procedure; // Destroy

    end class; // Grid
     

    // Main program follows:

    Grid    u;    // Declare
    int     i;    // Number of iterations run

    u.Make( 0, 127, 32, 255);
    u.SetInterior( 2.0 );
    i = u.Iterate( 100, 1.0e-3 );
    printf( "%d iterations\n", i );        // Pass this straight through to C

end program; // AMRSolve

Notes:

  1. Declarations can occur right after a begin statement. The declarations are only valid within the block. After the block end, all of the variables declared in that block become undefined and memory should be freed automatically.
  2. Not all of the key words have been used (e.g., scanf and some of the logical operators).
  3. scanf and printf are special cases that will be discussed in class.
  4. A comment is the text folloing the // to the end of the line, as in C++. The text should be thrown away by the lexer unless you want to save it for some reason (e.g., complete program reconstruction from your parse tree).

Class Examples

The class put together a number of simple 5-10 line examples:

arnaoudova.f04
berry.f04
brumm.f04
chao.f04
delgoda.f04
drerup.f04
funken.f04
gilbert.f04
greathouse.f04
hays.f04
kaneko.f04
meredith.f04
pang.f04
porter.f04
quinn.f04
yang.f04

The Symbol Table

The symbol table will be a pain in the neck all semester. Just when you think that you have it just right, you will have an inspiration and need a modification or addition. So be really flexible and defensive in designing your symbol table.

Start by entering data such as an identifier name, constant value, the scope, and the memory location (if appropriate). Keep all attributes of a name in a separate record. Also, think of a fast way to access the symbol table. My favorite is a combination of hashing the symbols and then using a linked list to get to the right symbol table entry.

Your lexer can help. For many tokens, there is the token value (e.g., ICONSTANT) and a token subvalue (e.g., the binary value of the string in yytext) that can be stored in the global variable yylval. You should take advantage of this as often as possible in the early stages of compiling.

Finally, our language allows for variable to be redefined inside of blocks. Watch out for the nested loop variables and other equally unsavory variables. Your symbol table must resolve entries correctly so that the right instance of a variable is always used in the right place.