For Vanderbilt's F23 semester, you must complete this assignment in
Python.
You may work in a team of up to two people for this assignment. You may work in a
team for any or all subsequent programming assignments. You do not need to
keep the same teammates. The course staff are not responsible for finding
you a willing team.
If you want to work on a team, you must register your group on the
autograder before submitting!
Goal
For this assignment, you will write a semantic analyzer. Among other
things, this involves traversing the abstract syntax tree and the class
hierarchy. You will reject all Cool programs that do not comply with the
Cool type system (producing an error message). For valid Cool programs,
you will produce an annotated abstract syntax tree, class map,
implementation map, and parent map.
You will also write additional code to deserialize the AST produced by
the parser stage and to serialize the class map, implementation map,
parent map, and annotated AST produced by your semantic analysis.
Specification
You must create three artifacts:
A program that takes a single command-line argument (e.g.,
file.cl-ast). That argument will be an ASCII text
Cool abstract syntax tree file (as described in PA3).
Your program must either indicate that there is an error in the
input (e.g., a type error) or emit file.cl-type, a
serialized Cool
abstract syntax tree, class map, implementation map, and parent map.
If your program is called checker, invoking checker
file.cl-ast should yield the same output as the reference
compiler, cool --type
file.cl. Your program will consist of
one or more Python files.
A plain ASCII text file called readme.txt describing your
design decisions and choice of test cases. See the grading rubric. A few
paragraphs should suffice.
Testcases good.cl, bad1.cl, bad2.cl and
bad3.cl. The first should pass the semantic analysis stage. The
remaining three should yield semantic analysis errors.
Error Reporting
To report an error, write the string
ERROR: line_number: Type-Check: message
to standard output and terminate the
program. You may
write whatever you want in the message, but it should be fairly indicative.
Example erroneous input:
class Main inherits IO {
main() : Object {
out_string("Hello, world.\n" + 16777216) -- adding string + int !?
} ;
} ;
Example error report output:
ERROR: 3: Type-Check: arithmetic on String Int instead of Ints
Line Number Error Reporting
The typing rules do not directly specify the line numbers on which errors
are to be reported. As of v1.11, the Cool reference compiler uses these
guidelines (possibly surprising ones are italicized):
Errors related to parameter-less method main in class
Main: always line 0
Inheritance cycle: always line 0
Other inheritance type problem: inherited type identifier location
self or SELF_TYPE used in wrong place: self
(resp. SELF_TYPE) identifier (resp. type) location
Redefining a feature: (second) feature location
Redefining a formal or class: (second) identifier location
Other attribute problems: attribute location
Redefining a method and changing types: (second) type location
Other problems with redefining a method: method location
Method body type does not conform: method name identifier
location
Attribute initializer does not conform: attribute name
identifier location
Errors with types of arguments to relational/arithmetic operations:
location of relational/arithmetic operation expression
Errors with types of while / if subexpression(s):
location of (enclosing) while or if expression
(not the location of the conditional expression)
Errors with case expression (e.g., lub): location of
case expression
Errors with conformance in let: location of let
expression (not location of initializer)
Errors in blocks: location of (beginning of) block expression
Errors in actual arguments: location of method invocation expression
(not the location of any particular actual argument)
Assignment does not conform: assignment expression location
(not right-hand-side location)
Unknown identifier: location of identifier
Unknown method: location of method name identifier
Unknown type: location of type
Remember that you do not have to match the English prose of the reference
compiler's error messages at all. You just have to get the line number
right.
Semantic checks are unordered — if a program contains two
or more errors, you may indicate whichever you like. You can infer from
this that all of our test cases will contain at most one error.
The .cl-type File Format
If there are no errors in the input file.cl-ast, your program should
create file.cl-type and serialize the
class map, implementation map, parent map, and annotated AST to it (in
that order).
The class
and implementation maps are described in the Cool Reference Manual.
A .cl-type file consists of four sections:
The class map.
The implementation map.
The parent map.
The annotated AST.
Simply output the four sections in order, one after the other.
We will now describe exactly what to output for the class and
implementation maps. The general idea and notation (one string per line,
recursive descent) are the same as in PA3.
The Class Map
Output class_map \n.
Output the number of classes and then \n.
Output each class in turn (in ascending alphabetical order):
Output the name of the class and then \n.
Output the number of attributes and then \n.
Output each attribute in turn (in order of appearance, with
inherited attributes from a superclass coming first):
Output no_initializer \n and then the attribute name
\n and then the type name \n.
or Output initializer \n and then the
attribute name \n and then the type name \n and then the
initializer expression.
The Implementation Map
Output implementation_map \n.
Output the number of classes and then \n.
Output each class in turn (in ascending alphabetical order):
Output the name of the class and then \n.
Output the number of methods for that class and then \n.
Output each method in turn (in order of appearance, with
inherited or overridden methods from a superclass coming first;
internal methods are defined to appear in ascending alphabetical order):
Output the method name and then \n.
Output the number of formals and then \n.
Output each formal's name only:
Output the name and then \n
If this method is inherited from a parent class and not
overriden, output the name of the ultimate parent class that
defined the method body expression and then \n. Otherwise, output
the name of the current class and then \n.
Output the method body expression.
The Parent Map
Output parent_map \n.
Output the number of parent-child inheritance relations and then
\n. This number is equal to the number of classes minus one (since
Object has no parent).
Output each child class in turn (in ascending
alphabetical order):
Output the name of the child class and then \n.
Output the name of the child class's parent and then \n.
The Annotated AST
With two exceptions, the annotated AST format is
identical to the normal AST from PA3.
The first change involves expressions. To output an
Expression:
Output the line number of the expression and then a newline (as in
PA3).
Output the name of type associated with the
expression and then a newline.
For example, the expression 3+x is associated with the type
Int.
This is new to PA4. It should be done for PA4, but not for PA4c.
Output the name of the expression and then a newline and then any
subparts.
The second change is a new kind of expression, internal,
used to represent the bodies of predefined methods. Internal expressions
are those that are handled by the run-time system — you might think of
them as part of the standard library. You output Internal
Expressions (including the type annotation, as above) as follows:
0 \n type \n internal \n Class.method \n
The valid kinds of internal expressions (i.e., the values for
Class.method) are:
They are formally defined in the Cool Reference Manual.
Note that you must output
information about all classes and methods defined in the program as well
as all base classes (and their methods). Do not just print out "classes
actually used" or "methods actually called" or something like that.
Output all classes and methods — no optimizations or shortcuts!
Detailed .cl-type Example
Now that we've formally defined the output specification, we can
look at a worked example. Here's the example input we will consider:
class Main inherits IO {
my_attribute : Int <- 5 ;
main() : Object {
out_string("Hello, world.\n")
} ;
} ;
Resulting .cl-type class map output with comments:
.cl-type class map
comment
class_map
6
number of classes
Bool
note: includes predefined base
classes
0
IO
0
Int
0
Main
1
our Main has 1 attribute
initializer
my_attribute
named "my_attribute"
Int
with type Int
2
initializer expression line number
Int
initializer expression type (see
above: this is an expression annotated with a type)
-- do not emit these expression type annotations for the PA4c Checkpoint!
integer
initializer expression kind
5
which integer constant is it?
Object
0
String
0
Resulting .cl-type implementation map output with
comments:
.cl-type implementation map
comment
implementation_map
6
six classes
Bool
first is Bool
3
- it has three methods
abort
- first is abort()
0
-- abort has 0 formal arguments
Object
-- name of parent class from
which Bool inherits abort()
0
-- abort's body expression starts
on line 0
Object
-- abort's body expression
has type Object
internal
-- abort's body is
an internal kind of expression (i.e., a system call; see above)
Object.abort
-- extra detail on
abort's body expression
copy
- second of Bool's three
methods is copy()
0
-- copy has 0 formal arguments
Object
-- name of parent class from
which Bool inherits copy()
0
-- copy's body expression starts
on line 0
SELF_TYPE
-- copy's body
expression has type SELF_TYPE
internal
-- copy's body is
an internal kind of expression (i.e., a system call; see above)
Object.copy
-- extra detail on
copy's body expression
... many lines skipped ...
Main
another class is Main
8
- it has 8 methods
... many lines skipped ...
main
- one of Main's methods is
main()
0
-- main has 0 formal arguments
Main
-- the name of the class where
Main.main() is defined
4
-- the body expression of Main.main starts on line 4
SELF_TYPE
-- the body expression
of Main.main has type SELF_TYPE
self_dispatch
-- the body
of Main.main() is a self_dispatch kind of expression
... many lines skipped ...
Finally, the resulting
.cl-type parent map output with
comments:
.cl-type parent map
comment
parent_map
5
there are five classes with
parents (Object is the sixth class)
Bool
Bool's parent ...
Object
... is Object.
IO
IO's parent ...
Object
... is Object.
Int
Int's parent ...
Object
... is Object.
Main
Main's parent ...
IO
... is IO.
String
String's parent ...
Object
... is Object.
Writing the rote code to output a .cl-type text file given an AST
may take a bit of time but it should not be too complex; our reference
implementation does it in 35 lines, and cleaves closely to the structure
given above. Reading in the AST is similarly straightforward; our reference
implementation does it in 171 lines.
You should implement all of the typing rules in the Cool Reference Manual.
There are also a number of other rules and corner cases you have to check
(e.g., no class can inherit from Int, you cannot redefine a class, you
cannot have an attribute named self, etc.). They are sprinkled throughout
the manual. Check everything you possibly can.
Starter Resources
Zach Karas has graciously followed along with the video guides below
to create some helpful starter code. While you are welcome to start
with this code, it is incomplete and will require effort to fully
function. It is ultimately your responsibility to create functioning
code for the autograder.
NOTE: Some of these video guides are from a previous
offering of a similar course at the University of Virginia. The
assignment for this semester has changed slightly. While they are
still relevant, you are responsible for completing the assignment
according to this course's grading rubric.
A number of Video Guides are provided to help you get started on this
assignment on your own. The Video Guides are walkthroughs in which the
instructor manually completes and narrates, in real time, the
first part of this assignment — including a submission to the
grading server. They include coding, testing and debugging elements.
If you are still stuck, you can post on the forum,
approach the TAs, or approach the professor. The use of online
instructional content outside of class weakly approximates a
flipped classroom
model. Click on a video guide to begin, at which point you can watch
it fullscreen or via Youtube if desired.
PA4c — Checkpoint
PA4 — Expressions
PA4 Primer
PA4 Python Deserialization
PA4 Python Examples
(Some students report the audio is broken in the other
video. This is a re-recording)
PA4c — Checkpoint
PA4c is a checkpoint for PA4 to encourage you to start early.
This is not required for credit, but you will get unlimited submissions
on the autograder so you can test early and often.
For PA4c you can submit to the autograder an early version
of PA4 that does the following:
Reads in the .cl-ast file given as a command-line argument.
You do not need to use a parser generator to read in the
.cl-ast file — its format was specifically chosen to make it
easy to read with just some mutually-recursive procedures. It should
take you (much) fewer than 150 lines to read in the .cl-ast
file.
Does every bit of typechecking and semantic analysis possible
without typechecking expressions.
Thus you should not annotate types in initializer
expressions in the class map.
Prints out error messages as normal.
Outputs only the class map to .cl-type if there are
no errors.
You can use the --class-map command-line argument to
get the reference compiler to produce the class map after
typechecking (for comparison).
Thus, you should build the class hierarchy and check everything related to
that. For example:
Check to see if a class inherits from Int (etc.).
Check to see if a class inherits from an undeclared class.
Check for cycles in the class hierarchy.
Check for duplicate method or attribute definitions in the same class.
Check for a child class that redefines a parent method but changes
the parameters.
Check for a missing method main in class Main.
Check for self and SELF_TYPE mistakes in classes
and methods.
This list is not exhaustive -- read the Cool Reference Manual
carefully and find everything you might check for without typechecking
expressions.
Basically, you'll look at classes, methods, and attibutes (but not
method bodies, which you'll do for the full submission).
Q: What's the exact list of errors I have to check for in PA4c?
A: No such list is provided! Part of the assignment is thinking up all
possible checks that do not involve expressions.
What to Turn In For PA4c
If you decide to complete PA4c (highly recommended), you can submit your
source code to the autograder:
main.py — the main portion of your implementation,
which we will execute with python3 main.py
some_input.cl-ast.
In addition, you can submit up to 20 additional *.py
files (e.g., to import).
What to Turn In For PA4
You must turn in the following files to the autograder:
readme.txt — your README file describing your
implementation as well as a list of type checking rules you thought
up.
good.cl — a novel positive testcase
bad1.cl — a novel negative testcase
bad2.cl — a novel negative testcase
bad3.cl — a novel negative testcase
The source code of your implementation, including
main.py, the main implementation. We will execute
your submission using python3 main.py
some_test.cl-ast and expect to have it produce
some_test.cl-type as output.
Optionally, you can provide up to 20 more *.py files
if you decide to split up your implementation.
Grading Rubric
PA4 Grading (out of 110 points):
83 points — for autograder tests
Each missed test removes points, to a minimum of 0/83, even if
there are more tests than total points.
10 points — for a clear description in your
readme.txt file
10 — thorough discussion of design decisions (e.g., handling
of the class hierarchy, case and new and dispatch)
and choice of test cases; a few paragraphs of coherent English
sentences should be fine
5 — vague or hard to understand; omits important details
0 — little to no effort, or submitted an RTF/DOC/PDF file
instead of plain TXT
12 points — 3 points each for valid and novel good.cl, bad1.cl,
bad2.cl and bad3.cl files
3 — wide range of test cases added, stressing most Cool features
and three error conditions, novel files
2 — added some tests, but the scope not sufficiently broad
0 — little to no effort, or course files resubmitted as
tests
5 points — for code cleanliness
5 — code is mostly clean and well-commented
3 — code is sloppy and/or poorly commented in places
0 — little to no effort to organize and document code