Function Reference in SAMPL - A Design Mistake and the Fix

2018-06-19

Background

In most functional programming languages, a function IS a value. Therefore, it can be easily passed as a parameter for a function. For example, this can be done easily in JS:

function f(s) {
  return parseInt(s);
}

function test(f, a) {
  return f(a);
}

const ignoreMe = test(f, "3");

However, functions do not have first-class support on JVM. Although Java 8 and Kotlin have good and reasonable support for FP, function reference still need a special syntax like ::test in Kotlin.

Initial Design Mistake

When I'm first designing , I forgot there was an issue with this. SAMPL is functional, so I choose not to use the :: in Java and Kotlin. Instead, you simply write the name of the function as if it is a normal variable.

Later, when I fininished writing a Turing Machine simulator in SAMPL and try to compile it, I found a bug in code translation. Since in SAMPL function reference is the same as a normal variable, neither the environment nor the type decorated AST contains any information about the expression. Therefore, during compilation, the compiler has no idea whether this variable is an actual variable or a function.

You may ask: is it possible to tell from the string of the variable? Well, that's not possible:

const iAmAFunction = 3
function iAmAConstant(): number = 5

Although if the identifier has some generics in it, you are sure that it is a function, that's not a reliable indicator.

You may ask: what about adding the environment? Well, the problem is the environment also has no idea of what's going on. Although the environment can distinguish a function type and an identifier type, it can't distinguish different types of functions:

const fun1 = (a: number) => a;
function fun2(a: number): number {
  return a;
}

In Kotlin, class member functions and lambda functions are very different. Class members functions can only be referenced with the :: syntax while lambda functions can be referenced simply by its variable name.

This is very bad and it needs to be fixed. You may have already discovered that there exists a hack: wrap everything inside a lambda. However, lambda expression has some performance cost. Also, by doing so, the generated code for the most common use will be very ugly:

function test(a: number): number {
  return a + 3;
}
function abc(): number {
  return test(3);
}

Now consider what will the generated code for the body of abc be in this case. test is a function expression, so it will be wrapped inside a lambda and we will get:

function test(a: number): number {
  return a + 3;
}
function abc(): number {
  return ((_temp0: number) => test(_temp0))(3); // :-(
}

This is not a good fix and it defeats the purpose of generating efficient and readable Kotlin code.

Later Fix

If you followed carefully, you may find that the problem exists both in the environment and in the type-decorated AST. Firstly, the environment does not have enough information, so the problem propagates to the AST. Finally, the compiler is dealing with the AST without the environment, which demonstrates the problem.

Initially, the type-checking environment records:

  • declaredTypes, which contains all defined types;
  • typeDefinitions, which contains all the variant/struct type definitions in scope;
  • typeEnv, which contains all the mapping from name to type.

Therefore, I first split typeEnv to classFunctionTypeEnv and normalTypeEnv. This fix also surprisingly fixed another ugly design: now we can clearly now whether a variable has associated generic type parameter or not. In other words, all types in normalTypeEnv will not contain polymorphic functions, and types in classFunctionTypeEnv may legitimately have some.

With this additional information, type checking for variables can give some additional information. By testing which type bucket a variable is in, we can tell whether the variable is a class function or a normal one. The second fix is to add an additional isClassFunction flag for the variable expression in type-decorated AST. With this additional information, the compiler can now generate efficient code with functions on JVM.

One More Thing

Second Alpha of SAMPL is released today. Now the error messages contain line numbers so it's easier to debug now. Still, I welcome your participation!