In this third, you will implement the typer for the Alpine compiler.
To get the lab files, you have 2 options:
-
pull this repository if you already cloned it last week. Otherwise, you can clone it by running the following command:
$ git pull
or
$ git clone https://github.com/epfl-lara/compiler2024-labs-public.git
-
Download the zip file on Moodle
Then take your current alpine
project, i.e., where you implemented the interpreter and the parser, and:
- copy the
typing/
directory from this week (either from zip or repo) into youralpine
project at this place:src/main/scala/alpine/typing
- copy the
Main.scala
file from this week (either from zip or repo) into youralpine
project at this place:src/main/scala/Main.scala
- copy the
driver/
directory from this week (either from zip or repo) into youralpine
project at this place:src/main/scala/alpine/driver
- copy the new test files by copying the
test/typing
directory from this week (either from zip or repo) into youralpine
project at this place:src/test/scala/alpine/typing
- move the interpreter tests from
archive/test/evaluation
back tosrc/test/scala/alpine/evaluation
.
Note: the InterpreterTests.scala
contain an old behavior that is not relevant anymore. You can either delete it or pull the new changes from the repository.
Edit: Please copy the symbols/Types.scala
from the latest version on the repo. The old labs contain an outdated version of this file.
Your project directory structure should look like something like this:
alpine/
├── archive/ <----- YOU CAN DELETE THIS DIRECTORY
│ ├── test/
│ │ ├── evaluation/
│ │ │ ├── InterpreterTest.scala
├── src/
│ ├── main/
│ │ ├── scala/
│ │ │ ├── alpine/
│ │ │ │ ├── driver/ <----- COPY FROM THIS WEEK FILES (replace the current one)
│ │ │ │ ├── evaluation/
│ │ │ │ │ ├── Interpreter.scala
│ │ │ │ ├── parsing/
│ │ │ │ │ ├── Parser.scala
│ │ │ │ │ ├── ...
│ │ │ │ ├── typing/ <----- COPY FROM THIS WEEK FILES
│ │ │ │ │ ├── Typer.scala
│ │ │ │ │ ├── ...
│ │ │ │ ├── util/
│ │ │ │ │ ├── ...
│ │ │ ├── Main.scala <----- COPY FROM THIS WEEK FILES (replace the current one)
│ ├── test/
│ │ ├── scala/
│ │ │ ├── alpine/
│ │ │ │ ├── evaluation/ <----- MOVE BACK FROM ARCHIVE
│ │ │ │ │ ├── InterpreterTest.scala
│ │ │ │ ├── parsing/
│ │ │ │ │ ├── ...
│ │ │ │ ├── typing/ <----- COPY FROM THIS WEEK FILES
│ │ │ │ │ ├── TyperTest.scala
│ │ │ │ ├── util/
To submit your work, go to this week assignment on Moodle and upload the following file:
src/main/scala/alpine/typing/Typer.scala
Let's recall the global idea of a simple compiler's pipeline:
Source code -> Lexer -> Parser -> Type checking -> Assembly generation
During this lab, you will implement the type checking phase of the Alpine compiler. Note that it does three things:
- It checks that the program is well-typed, i.e. that the types of the expressions are consistent with the types of the variables, the types of the functions (including the fact that the function is called with the right number of arguments and that the types of the arguments are consistent with the types of the parameters), and the types of the constants.
- It also infers the types of expressions that do not have a type ascription, i.e. it computes the type of each expression and stores it in the AST.
- It also perform “name resolution”: every variable and function call should be resolved to a unique definition (this is the
referredEntity
field from the interpreter lab.) For example, if the program contains two function definitionsfun f(x: Int) -> Int { ... }
andfun f(x: Float) -> Float { ... }
, then the callf(42)
should be resolved to one of the two definitions. The choice of the definition depends on the type of the argument42
.
The type checker supports all the features described in the language description (i.e., in this file), EXCEPT the Type declarations. This means, your typer does not have to handle the constructs of the form
type T = Int
type ListInt = #nil | #cons(head: Int, tail: ListInt)
Specifically, the subset of the language that the typer will handle does NOT include recursive types.
The typer will be implemented in the Typer.scala
file. It will visit the AST and return the computed/discovered types after having checked their consistence.
To solve the typing problem, we will generate a set of constraints and solve them using the provided Solver
.
Here are the available constraints that the solver can handle:
Equal(lhs: Type, rhs: Type, origin: Origin)
: the left hand side and the right hand side should have the same type.Subtype(lhs: Type, rhs: Type, origin: Origin)
: the left hand side should be a subtype of the right hand side.Apply(function: Type, inputs: List[Type.Labeled], output: Type, origin: Origin)
: the function should be a function type with the given inputs and output types.Member(lhs: Type, rhs: Type, member: String | Int, selection: Expression, origin: Origin)
: the left hand side should have a member with the typerhs
, i.e., something of thelhs.member: rhs
.selection
is the expression of the member selection.Overload(name: Tree, candidates: List[EntityReference], tpe: Type, origin: Origin)
: this constraint is responsible of the name analysis mentioned above. The name should be the name of entity (e.g., a function names) that can refer to multiple entities. All those entities are passed incandidates
. This constraint indicates to the solver that one of the entity in the list should have the typetpe
. TheSolver
will then pick the right entity if it exists. This is how theTyper
will resolve the ambiguity of the function callf(42)
mentioned above.
The origin is the AST node that generated this constraint (this is to show a meaningful error to the user in case the constraint cannot be solved).
All of those constraints are stored in an instance of the ProofObligations
class that will be passed to the solver.
To add new constraints you can use:
context.obligations.constrain(e, type)
to add an equality constraint that the expressione
should have typetype
. This function also returns the currently inferred type ofe
.context.obligations.add(constraint)
to add a constraintconstraint
to the obligations (any type of constraint, unlike the other possible function).
When infering types, it is not always possible to know the type of an expression when encountering it. For example, let us take the following code:
let a = 1
let x = if (a <= 0) { #person(name: "Jean") } else { #person(name: "Jean", age: a) }
In this case, when typing the if
expression, we cannot fix the type of the then
and else
branches. This is because we do not know the type of x
(we are inferring it), and we only know that the type of an if
expression is a supertype of the types of the then
and else
branches.
So to infer the type of x
, we assign it a fresh type variable Solver
is then responsible to find a type for
To generate a fresh type variable, please use the freshTypeVariable()
method.
We have provided the implementations of the visit
function of identifiers and literals, as well as for the pattern matching. You will have to implement the visit
functions for the other AST nodes.
def visitBooleanLiteral(e: ast.BooleanLiteral)(using context: Typer.Context): Type =
context.obligations.constrain(e, Type.Bool)
def visitIntegerLiteral(e: ast.IntegerLiteral)(using context: Typer.Context): Type =
context.obligations.constrain(e, Type.Int)
def visitFloatLiteral(e: ast.FloatLiteral)(using context: Typer.Context): Type =
context.obligations.constrain(e, Type.Float)
def visitStringLiteral(e: ast.StringLiteral)(using context: Typer.Context): Type =
context.obligations.constrain(e, Type.String)
In each of these methods, we add a new constraint that the visited AST node is of type Type.Bool
, Type.Int
, Type.Float
or Type.String
respectively.
def visitIdentifier(e: ast.Identifier)(using context: Typer.Context): Type =
bindEntityReference(e, resolveUnqualifiedTermIdentifier(e.value, e.site))
In this method, we make calls to:
-
resolveUnqualifiedTermIdentifier
to resolve the identifier to a list of possible entities. This step will find all the entities that the identifier can refer to. For example, if the identifier isf
, it will return all the functions that are namedf
. -
bindEntityReference(e: ast.Tree, candidates: List[EntityReference])
to bind the AST node to the resolved entities- if there is no candidates, then the type of
e
isType.Error
- if there is a single candidate, then the type of
e
is the type of the candidate. Moreover, it links the AST node to the candidate using theproperties.treeToReferredEntity
map. - if there is more than a single candidate, then a new fresh type variable
$\tau$ is created and the AST node is linked to the type variable$\tau$ . Moreover, a newOverload
constraint is added to the obligations to say that the type ofe
is the type of one of the candidates. Also, a constraint is added to say that the type ofe
is$\tau$ .
- if there is no candidates, then the type of
Note that there is some useful functions in the Typer
class that you can use to generate fresh type variables, get the type of an entity, check if a type is a subtype of another, etc.
In programming, the Scope of a binding is defined as follows:
[The ] scope is "the portion of source code in which a binding of a name with an entity applies".
For example, in the following code:
fun f(x: Int) -> Int {
let y = 42 {
x + y
}
}
let main = let z = 1 {
f(z)
}
the scope of the binding of y
is the body of the let
binding, so:
{
x + y
}
and the scope ot the binding of x
is the body of the function f
, so:
{
let y = 42 {
x + y
}
}
Inside SBT, you can use the run -t <input file>
command to run the type checker on a file. You can also add the --type-inference
flag to show the solving part of the type checker (so including the constraints and the solution).
Let us recall that overloading is the ability to have multiple functions with the same name but different types. For example, in the following code:
fun f(x: Int) -> Int { x }
fun f(x: Float) -> Float { x }
the function f
is overloaded. When the type checker encounters a call to f
, it should resolve the call to one of the two functions. This is what the Overload
constraint is for.
Overloading is resolved by the Solver
class. For every Overload
constraint and their candidates, the solver forks itself and tries to solve the other constraints with each candidate (i.e., type checking the program assuming the candidate is the right one). If one of the forks fails, the solver backtracks and tries with the next candidate. If all the forks fail, the solver reports an error. And if one of the forks succeeds, the solver returns the solution.
If you run the compiler with the --trace-inference
flag with a piece of code that contains some overload, you will see that the solver tries to solve the constraints with each candidate.
You can see some calls to memoizedUncheckedType
. Its role is to store the type of a tree node in an unchecked map. This means the type is temporary, but not checked yet.
This is used for recursive functions: the type of the function is stored in this map, so that the body can be typechecked while "knowing" the type of the function itself (for possible recursive calls). This is also used for recursive types, but we do not handle them in this lab.
This is also used for memoization, i.e., not recomputing the type of a tree multiple times.
Below are listed hints about the visit functions you have to implement, along with the inference rules of the alpine type system.
where
For each field of the record, you should visit recursively the expression of the field to get field types.
Note: Math mode can be buggy on GitHub Markdown if the above rule is not properly displayed. You can download the .html file and open it in your browser to see it properly! Or open the Markdown file inside your editor. All \char"23
should be displayed as #
.
Two constraints should be addeed:
-
Apply
: the application is valid. -
e
: should have the type$\tau$
This means that if
You should get the type of the function and of the labeled arguments. If the output type of the function is not known, you should generate a fresh type variable for it. Otherwise,
The idea is simlar to the visitApplication
method. Keep in mind that infix and prefix applications do not have labels and their number of arguments is fixed (1 for prefix and 2 for infix).
Prefix applications:
Infix applications:
Check the checkInstanceOf
function.
The type of a conditional is then
branch and the type of the else
branch. The condition must be a Boolean
. Generate fresh variables if needed.
Note: you should consider the trivial case as well. If the two branches have the same type, then there is no need for a fresh variable.
You must assign a new scope name using the assignScopeName
method, and visit the binding
and the body with this new scope name.
You should assign a new scope name for that lambda and in that scope, you should get the type of the inputs (using computedUncheckedInputTypes
)
In the same manner as the application, get the output type of the lambda or generate a fresh variable, and add the following constraints:
e
should be of typeType.Arrow
.- the body should be a subtype of the output type of the lambda.
where the first
When encoutering a specified type, you should call evaluateTypeTree
.
You should evaluate the ascription
field of the AscribedExpression
using evaluateTypeTree
and depending on the ascription. Then add needed constraints, depending on the ascription sort:
Metatypes are the types of types. For example, in an alpine program, the type Int
in let x : Int = 42 { ... }
has the type MetaType[Int]
. This is useful to type check an expression where a type is expected, for example in a type ascription or arrow types.
You should lookup the type identifier in the current scope:
- if there is no type with that name, return
Type.Error
and report an error. - if there is a single type with that name, return the type corresponding to the meta-type (
Type.Meta
). - if there is more than a single type with that name, return a
Type.Error
and report an ambiguous use of the type.
Check the provided functions.
You should return the type of the record type, which is a Type.Record
with proper fields.
Note that
You should return the type of the arrow type, which is a Type.Arrow
with proper inputs and output.
Note that
On the same principle as the visitRecord
method and visitValuePattern
, you should add a constraint for the type of the pattern.
where
You should get the unchecked type of the function and memoize it. You should then type check the body and it must be a subtype of the output type of the function using checkInstanceOf
.
The function is provided, but here is the inference rule:
A selection can be performed on a record with an identifier or an integer. You should add a Member
constraint.
For the integer selector:
For the identifier selector: