PA4 — The Semantic Analyzer

PA4 is due 10/25 at 11:59pm Central.

For Vanderbilt's F23 semester, you must complete this assignment in Python.

You may work in a team of up to two people for this assignment. You may work in a team for any or all subsequent programming assignments. You do not need to keep the same teammates. The course staff are not responsible for finding you a willing team. If you want to work on a team, you must register your group on the autograder before submitting!

Goal

For this assignment, you will write a semantic analyzer. Among other things, this involves traversing the abstract syntax tree and the class hierarchy. You will reject all Cool programs that do not comply with the Cool type system (producing an error message). For valid Cool programs, you will produce an annotated abstract syntax tree, class map, implementation map, and parent map.

You will also write additional code to deserialize the AST produced by the parser stage and to serialize the class map, implementation map, parent map, and annotated AST produced by your semantic analysis.

Specification

You must create three artifacts:

  1. A program that takes a single command-line argument (e.g., file.cl-ast). That argument will be an ASCII text Cool abstract syntax tree file (as described in PA3). Your program must either indicate that there is an error in the input (e.g., a type error) or emit file.cl-type, a serialized Cool abstract syntax tree, class map, implementation map, and parent map. If your program is called checker, invoking checker file.cl-ast should yield the same output as the reference compiler, cool --type file.cl. Your program will consist of one or more Python files.
  2. A plain ASCII text file called readme.txt describing your design decisions and choice of test cases. See the grading rubric. A few paragraphs should suffice.
  3. Testcases good.cl, bad1.cl, bad2.cl and bad3.cl. The first should pass the semantic analysis stage. The remaining three should yield semantic analysis errors.

Error Reporting

To report an error, write the string

ERROR: line_number: Type-Check: message

to standard output and terminate the program. You may write whatever you want in the message, but it should be fairly indicative. Example erroneous input:

class Main inherits IO {
 main() : Object {
   out_string("Hello, world.\n" + 16777216) -- adding string + int !?
 } ;
} ;

Example error report output:

ERROR: 3: Type-Check: arithmetic on String Int instead of Ints

Line Number Error Reporting

The typing rules do not directly specify the line numbers on which errors are to be reported. As of v1.11, the Cool reference compiler uses these guidelines (possibly surprising ones are italicized):

Remember that you do not have to match the English prose of the reference compiler's error messages at all. You just have to get the line number right.

Semantic checks are unordered — if a program contains two or more errors, you may indicate whichever you like. You can infer from this that all of our test cases will contain at most one error.

The .cl-type File Format

If there are no errors in the input file.cl-ast, your program should create file.cl-type and serialize the class map, implementation map, parent map, and annotated AST to it (in that order).

The class and implementation maps are described in the Cool Reference Manual.

A .cl-type file consists of four sections:

  1. The class map.
  2. The implementation map.
  3. The parent map.
  4. The annotated AST.

Simply output the four sections in order, one after the other.

We will now describe exactly what to output for the class and implementation maps. The general idea and notation (one string per line, recursive descent) are the same as in PA3.

Detailed .cl-type Example

Now that we've formally defined the output specification, we can look at a worked example. Here's the example input we will consider:

class Main inherits IO {
  my_attribute : Int <- 5 ; 
  main() : Object { 
    out_string("Hello, world.\n") 
  } ;
} ; 

Resulting .cl-type class map output with comments:

.cl-type class map comment
class_map
6 number of classes
Bool note: includes predefined base classes
0
IO
0
Int
0
Main
1 our Main has 1 attribute
initializer
my_attribute named "my_attribute"
Int with type Int
2 initializer expression line number
Int initializer expression type (see above: this is an expression annotated with a type)
-- do not emit these expression type annotations for the PA4c Checkpoint!
integer initializer expression kind
5 which integer constant is it?
Object
0
String
0

Resulting .cl-type implementation map output with comments:

.cl-type implementation map comment
implementation_map
6 six classes
Bool first is Bool
3 - it has three methods
abort - first is abort()
0 -- abort has 0 formal arguments
Object -- name of parent class from which Bool inherits abort()
0 -- abort's body expression starts on line 0
Object -- abort's body expression has type Object
internal -- abort's body is an internal kind of expression (i.e., a system call; see above)
Object.abort -- extra detail on abort's body expression
copy - second of Bool's three methods is copy()
0 -- copy has 0 formal arguments
Object -- name of parent class from which Bool inherits copy()
0 -- copy's body expression starts on line 0
SELF_TYPE -- copy's body expression has type SELF_TYPE
internal -- copy's body is an internal kind of expression (i.e., a system call; see above)
Object.copy -- extra detail on copy's body expression
... many lines skipped ...
Main another class is Main
8 - it has 8 methods
... many lines skipped ...
main - one of Main's methods is main()
0 -- main has 0 formal arguments
Main -- the name of the class where Main.main() is defined
4 -- the body expression of Main.main starts on line 4
SELF_TYPE -- the body expression of Main.main has type SELF_TYPE
self_dispatch -- the body of Main.main() is a self_dispatch kind of expression
... many lines skipped ...

Finally, the resulting .cl-type parent map output with comments:

.cl-type parent map comment
parent_map
5 there are five classes with parents (Object is the sixth class)
Bool Bool's parent ...
Object ... is Object.
IO IO's parent ...
Object ... is Object.
Int Int's parent ...
Object ... is Object.
Main Main's parent ...
IO ... is IO.
String String's parent ...
Object ... is Object.

Writing the rote code to output a .cl-type text file given an AST may take a bit of time but it should not be too complex; our reference implementation does it in 35 lines, and cleaves closely to the structure given above. Reading in the AST is similarly straightforward; our reference implementation does it in 171 lines.

Commentary

You can do basic testing as follows:

$ cool --parse file.cl
$ cool --out reference --type file.cl
$ python3 my-main.py file.cl-ast
$ diff -b -B -E -w file.cl-type reference.cl-type

You should implement all of the typing rules in the Cool Reference Manual. There are also a number of other rules and corner cases you have to check (e.g., no class can inherit from Int, you cannot redefine a class, you cannot have an attribute named self, etc.). They are sprinkled throughout the manual. Check everything you possibly can.

Starter Resources

Zach Karas has graciously followed along with the video guides below to create some helpful starter code. While you are welcome to start with this code, it is incomplete and will require effort to fully function. It is ultimately your responsibility to create functioning code for the autograder.

Video Guides

NOTE: Some of these video guides are from a previous offering of a similar course at the University of Virginia. The assignment for this semester has changed slightly. While they are still relevant, you are responsible for completing the assignment according to this course's grading rubric.

A number of Video Guides are provided to help you get started on this assignment on your own. The Video Guides are walkthroughs in which the instructor manually completes and narrates, in real time, the first part of this assignment — including a submission to the grading server. They include coding, testing and debugging elements.

If you are still stuck, you can post on the forum, approach the TAs, or approach the professor. The use of online instructional content outside of class weakly approximates a flipped classroom model. Click on a video guide to begin, at which point you can watch it fullscreen or via Youtube if desired.

PA4c — Checkpoint
PA4 — Expressions
PA4 Primer
PA4 Python Deserialization
PA4 Python Examples
(Some students report the audio is broken in the other video. This is a re-recording)

PA4c — Checkpoint

PA4c is a checkpoint for PA4 to encourage you to start early. This is not required for credit, but you will get unlimited submissions on the autograder so you can test early and often.

For PA4c you can submit to the autograder an early version of PA4 that does the following:

Thus, you should build the class hierarchy and check everything related to that. For example:

Q: What's the exact list of errors I have to check for in PA4c?

A: No such list is provided! Part of the assignment is thinking up all possible checks that do not involve expressions.

What to Turn In For PA4c

If you decide to complete PA4c (highly recommended), you can submit your source code to the autograder:

  1. main.py — the main portion of your implementation, which we will execute with python3 main.py some_input.cl-ast.
  2. In addition, you can submit up to 20 additional *.py files (e.g., to import).

What to Turn In For PA4

You must turn in the following files to the autograder:

  1. readme.txt — your README file describing your implementation as well as a list of type checking rules you thought up.
  2. good.cl — a novel positive testcase
  3. bad1.cl — a novel negative testcase
  4. bad2.cl — a novel negative testcase
  5. bad3.cl — a novel negative testcase
  6. The source code of your implementation, including
    • main.py, the main implementation. We will execute your submission using python3 main.py some_test.cl-ast and expect to have it produce some_test.cl-type as output.
    • Optionally, you can provide up to 20 more *.py files if you decide to split up your implementation.

Grading Rubric

PA4 Grading (out of 110 points):