Note 5, Programming Language Concepts (sestoft@dina.kvl.dk) 2001-03-06
----------------------------------------------------------------------

Curried functions
-----------------

A two-argument SML function, such as plus, may be defined in two ways:
either as a function that takes a pair of arguments:

    fun plusp (x, y) = x+y;

or as a function that takes one argument x and returns a new function
which takes an argument y and then returns a result:

    fun plusc x y = x+y;

The two functions have these types:

    plusp : int * int -> int
    plusc : int -> int -> int

The plusc function is said to be a Curried version of the plusp
function (named after Haskell B Curry, a logician). The advantage of
the curried form is that it can be partially applied, to produce a new
function. The function which adds 7 to its argument can be defined
simply by providing only the first argument, like this:

    val addseven = plusc 7;

The function addseven has type int -> int.


A higher order functional language
----------------------------------

A higher-order functional language is one in which functions are
values, just like integers and booleans. That is, the value of a
variable may be a function, and a function may take functions as
arguments and may return a function as a result.

It is straightforward to extend our first-order functional language to
a higher-order one. The concrete and abstract syntaxes already allow
the function part e in a call

    Call(e, earg)

to be an arbitrary expression e; it need not be a function name.

In the interpreter eval (file fun/hofun.sml) one needs to accommodate
the possibility that an expression evaluates to a function, and that a
variable may be bound to a function (not only to an integer). A value
of function type is a closure, as in the first-order language.
Hence the possible values, and the variable environments, are described
by these mutually recursive type declarations:

    datatype value =
        Int of int
      | RClo of string * string * expr * vfenv
    withtype vfenv = (string, value) env

where the four components of a closure are the function name, the
function's parameter name, the function's body, and an environment
binding the function's free variables.

The only difference between the higher-order interpreter and the
first-order one is in the handling of function calls. A call

    Call(efun, earg)

is evaluated by evaluating efun to a closure RClo(f, x, e, fenv),
evaluating earg to a value v, and then evaluating e in fenv extended
with a binding of x to v and of f to the closure.


Type inference with parametric polymorphic types
------------------------------------------------

Consider an SML program with higher-order functions, such as this one:

    let fun tw (g : int -> int) (y : int) = g (g y) : int
    in let fun doubl (y : int) = 2 * y : int
       in tw doubl 11 end
    end

The function tw takes as argument a function g : int -> int and a
value y : int and applies g twice to y, as in g (g y), producing an
integer. The function doubl multiplies its argument by 2. Type
checking of this program succeeds with the type int as result.

The type ascribed to tw above is

    tw : (int -> int) -> (int -> int)

which says that tw can be applied to a function of type int -> int,
and then will give a function that can be applied to an int, and will
then provide an int as result. With a modest extension of the
abstract syntax, our micro-SML type checker (file fun/tychk.sml) might
even have come up with this result.

However, this is just one of infinitely many possible types for tw.
So far we have considered only monomorphic type inference, where every
variable, parameter, expression and function is assigned just one
(simple) type.
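As a quick check of the monomorphic typing above, the annotated program can be transcribed into full Standard ML. This is only a sketch: the top-level names twDoubl and result are introduced here for illustration and do not appear in the note's micro-SML.

```sml
(* tw and doubl with explicit monomorphic type annotations. *)
fun tw (g : int -> int) (y : int) : int = g (g y)
fun doubl (y : int) : int = 2 * y

(* Partial application of the curried tw yields a function int -> int. *)
val twDoubl : int -> int = tw doubl

(* Full application computes doubl (doubl 11). *)
val result : int = tw doubl 11
```

With these annotations, applying tw to a function on booleans would be rejected by the type checker, because tw has been fixed to the single type (int -> int) -> (int -> int).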

Type inference using type variables
-----------------------------------

If we do not have explicit types on function declarations, we can
instead make a preliminary `guess' about the type of a program
variable (or expression or function). We represent the `guess' by a
type variable, which can stand for any type. Note that we now have
type variables (used during type inference) in addition to program
variables (used in the object-language program).

During subsequent type checking of the uses of the program variable,
the type variable may be instantiated to a concrete type, thus
obtaining a concrete type for the program variable.

Now the (meta-language) type of (object-language) types is:

    datatype typ =
        TypI                    (* integers *)
      | TypB                    (* booleans *)
      | TypF of typ * typ       (* (argumenttype, resulttype) *)
      | TypV of typevar         (* type variable *)

Using type variables and unification, we can do type inference for a
(higher-order) functional language without explicit types. For
instance, we can find that this example program

    let fun tw g y = g (g y)
    in let fun doubl y = 2 * y
       in tw doubl 11 end
    end

is well-typed and has type int. The type (int -> int) -> (int -> int)
for tw is inferred automatically from its use, as follows:

First a type variable 'a is guessed for g and a type variable 'b is
guessed for y. Then, because g is applied to y, type 'a is
instantiated to the function type 'b -> 'c, so the result of (g y)
must have type 'c. But since g is applied to (g y), type variables 'b
and 'c must be identical. Also, the result type of g (g y) must be
'c, and therefore 'b. Hence the type of tw must be

    tw : ('b -> 'b) -> ('b -> 'b)

Then in the body of the outermost let, we apply (tw doubl 11), which
should be read as ((tw doubl) 11), so 'b must be int.
Hence the type of tw in this case is

    tw : (int -> int) -> (int -> int)

However, if we had not applied tw to doubl and 11, then we would
still have a type for it, namely tw : ('b -> 'b) -> ('b -> 'b), but
that type contains the type variable 'b that has not become
instantiated during type checking. If that type variable is used
nowhere else in the program, this indicates that the function may be
used at many different types: regardless of how the type variable is
instantiated, a valid type for the function is obtained. Indeed, in
the case of tw, another valid type instance would be

    tw : (bool -> bool) -> (bool -> bool)

e.g. in this program:

    let fun tw g y = g (g y)
    in let fun neg b = if b then false else true
       in tw neg false end
    end

Since the function may have many different types, the type is said to
be polymorphic (Greek: `many forms'), and since the type variable may
be considered a kind of parameter for enumerating the possible types,
the type is said to be parametrically polymorphic. (Object-oriented
languages are sometimes said to have polymorphism in method calls
because of the dynamic dispatch, but that has nothing to do with
parametric polymorphic types).

A polymorphic type is represented by a type scheme, which is a list of
type variables together with a type in which those type variables
occur. In the case of tw, the list of type variables contains just
'b, so the type scheme for tw is

    (['b], ('b -> 'b) -> ('b -> 'b))

Since this type scheme may be read: for all ways to instantiate 'b,
the type is ('b -> 'b) -> ('b -> 'b), this type is often written

    \/ 'b. ('b -> 'b) -> ('b -> 'b)

where \/ denotes the universal quantifier `forall', an upside-down
capital A, used in logic.

In general, a type scheme is a pair (tvs, t) where tvs is a list of
type variables and t is a type. A non-polymorphic type t is
represented by a type scheme of the form ([], t) where the list of
type variables is empty.
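A type scheme of this form can be represented directly in SML. The following is a minimal sketch, not the representation used in the course's type checker: type variables are simplified to strings (the note does not show the definition of typevar), and the names typescheme, substitute and instantiate are introduced here purely for illustration.

```sml
(* Object-language types; type variables are plain strings in this sketch. *)
datatype typ =
    TypI                    (* integers *)
  | TypB                    (* booleans *)
  | TypF of typ * typ       (* (argumenttype, resulttype) *)
  | TypV of string          (* type variable, simplified *)

(* A type scheme (tvs, t): the generalized type variables and a type. *)
type typescheme = string list * typ

(* Replace type variables in t according to the association list subst. *)
fun substitute (subst : (string * typ) list) (t : typ) : typ =
    case t of
        TypI => TypI
      | TypB => TypB
      | TypF (t1, t2) => TypF (substitute subst t1, substitute subst t2)
      | TypV a =>
          (case List.find (fn (b, _) => a = b) subst of
               SOME (_, t') => t'
             | NONE => TypV a)

(* Specialize a scheme by giving one type per generalized variable. *)
fun instantiate ((tvs, t) : typescheme) (args : typ list) : typ =
    substitute (ListPair.zip (tvs, args)) t

(* The scheme (['b], ('b -> 'b) -> ('b -> 'b)) for tw. *)
val twScheme : typescheme =
    (["b"], TypF (TypF (TypV "b", TypV "b"), TypF (TypV "b", TypV "b")))

val twAtInt  = instantiate twScheme [TypI]
val twAtBool = instantiate twScheme [TypB]
```

Here twAtInt and twAtBool correspond to the two instances of tw discussed above. A type inference algorithm would normally supply fresh type variables rather than fixed types such as TypI when it instantiates a scheme.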

A type scheme may be instantiated (or specialized) by systematically
replacing all occurrences in t of the type variables from tvs by other
types.

When x is a program variable (such as tw) with a polymorphic type
represented by a type scheme (tvs, t), then type inference will create
a fresh type instance for every use of the program variable in the
program.

How do we discover which type variables can safely be generalized?
The requirement is that the type variable is not used elsewhere. For
example, in the (ill-typed) program below, the type of x in f should
be constrained to be the same as that of y in g, because the
comparison (x=y) requires its arguments to have the same type:

    let g y = let f x = (x=y)
              in f 1 & f false end
    in g 2 end

Type inference proceeds as follows: we guess a type 'a for y in g,
then guess a type 'b for x in f, and then realize that 'a must be
equal to 'b because the values are compared by (x=y). Thus a
plausible type for f is 'b -> bool. Can we generalize 'b in this
type, obtaining the type scheme \/'b. 'b -> bool for f? No, because
that would allow us to apply f to a boolean in the let-body. That
would not be good, because we can apply g to an integer in the outer
let-body (as shown), and that would require us to compare integers and
booleans, something our type system was not supposed to allow.

The essential observation is that we cannot generalize type variable
'b (or 'a, which is the same) in the type of f because 'b is bound
further out, namely in the declaration of g.

Didier Rémy (1992) devised a way to efficiently decide whether a type
variable can be generalized. With every type variable we associate a
binding level, where the outermost binding level is zero, and the
binding level increases whenever we enter the right-hand side of a
let-binding. When equating two type variables, we reduce the binding
level to the lowest (outermost) of their binding levels.
When generalizing a type in a let-binding, we generalize only those
type variables whose binding level is greater than the binding level
of the let-body.

In the above example, type variable 'a for y has level 1, and type
variable 'b for x has level 2. When equating (unifying) the two, we
set the level of 'b to 1, and hence we do not generalize 'b (or 'a) in
the inner let-body (which is at level 1).

In the kind of parametric polymorphic types discussed here, we
generalize types to type schemes only at let-bindings, and require a
function to have a monomorphic type in its own body. This (limited)
style of parametric polymorphism is called let-polymorphism or
ML-polymorphism, or Hindley-Milner polymorphism, after Hindley and
Milner, who independently discovered this idea in 1968 and 1977.

ML type inference has a very high complexity: it is complete for
DEXPTIME, deterministic exponential time (Kfoury, Tiuryn, Urzyczyn
1990; Henglein), which means that in theory it is quite hopeless. But
in practice, programmers write programs with rather non-complex types,
and ML type inference is very fast.

Some literature on type inference:

    Damas and Milner: Principal type schemes for functional programs
    (1982) presents algorithm W, the original type inference algorithm
    for ML.

    Michael Schwartzbach: Polymorphic Type Inference, 27 pages, March
    1995; is a very readable presentation, at
    http://www.daimi.au.dk/~mis/

    Hancock: Polymorphic type checking (chapters 8 and 9 in Peyton
    Jones: The implementation of functional programming languages,
    Prentice-Hall 1987).

    The binding level technique for efficient generalization is due to
    Didier Rémy: Extension of ML type system with a sorted equational
    theory on types, INRIA Rapport de Recherche 1766, October 1992.


Higher-order functions in Java
------------------------------

A function closure is similar to a Java object containing a method;
the object's fields bind the free variables of the method (function).
To simulate higher-order functions in Java, one can describe a
parameter of function type by a Java interface Fun, and pass as the
corresponding argument an object of a class (typically an anonymous
inner class) that implements the interface Fun.

This programming style is not widely used. Traditional
object-oriented programming styles attempt to keep the data and the
functions working on those data together, where functional languages
keep them separate. However, the so-called Visitor design pattern in
object-oriented programming provides a systematic way to obtain this
separation between data (class hierarchy) and operations (encoded as
so-called visitors).

Java's lack of parametric polymorphic types restricts the generality
of higher-order functions expressible in Java. One might define
rather general higher-order functions by using subtyping and type
casts to and from Object, but then one loses static type checking and
gets dynamic type checking.


Eager and lazy evaluation in functional programming languages
-------------------------------------------------------------

In a function call such as f(e), one may evaluate the argument
expression e eagerly, to obtain a value v before evaluating the
function body. That's what we are used to.

Alternatively, one might evaluate e lazily, that is, postpone
evaluation of e until we have seen that the value of e is really
needed. If it is not needed, we never evaluate e. If it is, then we
evaluate e and remember the result until the evaluation of function f
is complete.

The distinction between eager and lazy makes a big difference in a
program such as this:

    let loop n = loop n
    in let f x = 1
       in f (loop(2)) end
    end

where the evaluation of loop(2) would never terminate. For this
reason, the entire program would never terminate in an eager language.
In a lazy language, however, we do not evaluate loop(2) until we have
found that f needs the value of loop(2).
And in fact f does not need it (because x does not appear in the body
of f), so in a lazy language the above program would terminate with
the result 1.

For a less artificial example, note that in an eager language one
cannot define a function that works like if-then-else. An attempt
would be

    let myif b v1 v2 = if b then v1 else v2

but we cannot use that to define factorial recursively:

    let myif b v1 v2 = if b then v1 else v2
    in let fac n = myif (n=0) 1 (n * fac(n-1))
       in fac 3 end
    end

because eager evaluation of the third argument to myif would go into
an infinite loop. Thus it is important that the built-in if-then-else
construct is not eager.

Our small functional language is eager. That is because the
interpreter (function eval in fun/fun.sml) evaluates the argument
expressions of a function before evaluating the function body, and
because the meta-language SML is strict.

Most real-world programming languages (C, C++, Java, C#, Pascal, Ada,
Lisp, Scheme, APL, ...) are eager. An exception is Algol 60, whose
call-by-name parameter passing mechanism provides a form of lazy
evaluation.

Some modern functional languages have lazy evaluation, most notably
Haskell. This provides for concise programs, extreme modularization,
and very powerful and general functions, especially when working with
lazy data structures. For instance, one may define an infinite list
of the prime numbers, or an infinite tree of the possible moves in a
two-player game (such as chess), and if properly done, this is even
rather efficient.

Lazy languages require a rather different programming style than eager
ones. They are studied primarily at Chalmers University (Gothenburg),
Oregon Graduate Institute, Yale University, University of Kent at
Canterbury, and Microsoft Research Cambridge (where the main
developers of the GHC, the Glasgow Haskell Compiler, reside).
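In an eager language such as SML, the loop and myif examples above can be made to terminate by delaying arguments explicitly: each delayed argument is wrapped in a function of type unit -> 'a (often called a thunk) and is only run when its value is needed. The following is a sketch under that assumption, transcribed into full SML; the names one and six are introduced here for illustration. Note that this simulates call-by-name rather than true lazy evaluation, since a thunk is re-run on every use instead of having its result remembered.

```sml
(* A function that never terminates when called. *)
fun loop (n : int) : int = loop n

(* f ignores its delayed argument, so the thunk is never run. *)
fun f (x : unit -> int) : int = 1
val one = f (fn () => loop 2)

(* myif receives both branches as thunks and runs only the chosen one. *)
fun myif (b : bool) (v1 : unit -> 'a) (v2 : unit -> 'a) : 'a =
    if b then v1 () else v2 ()

(* The recursive call is now evaluated only when n is not 0. *)
fun fac n = myif (n = 0) (fn () => 1) (fn () => n * fac (n - 1))

val six = fac 3
```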

All lazy languages are purely functional (no updatable variables)
because it is nearly impossible to understand side effects in
combination with the hard-to-predict evaluation order of a lazy
language.


The lambda calculus: the simplest possible functional language
--------------------------------------------------------------

(This is an aside; do not waste too much time on it)

Anonymous functions such as SML's

    fn x => 2 * x

are called lambda abstractions by theoreticians, and are written

    \x. 2*x

where the backslash (\) should be the Greek lowercase letter lambda.
The lambda calculus is the prototypical functional language, invented
by the logician Alonzo Church in the 1930's to analyse fundamental
concepts of computability. The pure untyped lambda calculus allows
just three kinds of expressions e:

    * Variables:            x
    * Lambda abstractions:  \x. e
    * Applications:         e1 e2

The three kinds of expression are evaluated as follows:

    * A variable x may be bound by an enclosing lambda abstraction, or
      may be free (unbound).

    * A lambda abstraction (\x. e) is a function. To apply it to an
      argument expression e2, as in ((\x. e) e2), substitute the
      argument e2 for x in e, and then evaluate the resulting
      expression.

    * A function application (e1 e2) denotes the application of
      function e1 to argument e2.

Thus an abstract syntax for the pure untyped lambda calculus could
look like this:

    datatype lam =
        Var of string
      | Lam of string * lam
      | App of lam * lam

This may seem to be a very restricted and rather useless language, but
Church showed that the lambda calculus can compute precisely the same
functions as Turing Machines (invented by the mathematician Alan
Turing in the 1930's), and both formalisms can compute precisely the
same functions as an idealized computer with unbounded storage.
Indeed, `computable' formally means `computable by the lambda calculus
(or by a Turing Machine)'.
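The abstract syntax and the evaluation rules above can be turned into a small evaluator. The sketch below makes two assumptions that are not part of the note: it uses call-by-name evaluation, and it only handles closed terms without reducing under lambda abstractions. Under those assumptions every substituted argument is closed, so the naive substitution shown here cannot capture variables. The function names subst and eval and the example terms are chosen for illustration.

```sml
datatype lam =
    Var of string
  | Lam of string * lam
  | App of lam * lam

(* subst x e2 e: replace the free occurrences of x in e by e2.
   Assumes e2 is closed, so no renaming of bound variables is needed. *)
fun subst x e2 e =
    case e of
        Var y => if x = y then e2 else e
      | Lam (y, body) => if x = y then e else Lam (y, subst x e2 body)
      | App (e1, e3) => App (subst x e2 e1, subst x e2 e3)

(* Call-by-name evaluation: the argument is substituted unevaluated,
   and bodies of lambda abstractions are not evaluated. *)
fun eval e =
    case e of
        App (e1, e2) =>
          (case eval e1 of
               Lam (x, body) => eval (subst x e2 body)
             | e1' => App (e1', e2))
      | _ => e

(* Example terms: \x. x  and  \x. \y. x *)
val identity = Lam ("x", Var "x")
val konst = Lam ("x", Lam ("y", Var "x"))

(* ((\x. \y. x) (\x. x)) (\z. z) evaluates to \x. x *)
val result = eval (App (App (konst, identity), Lam ("z", Var "z")))
```

An evaluator for open terms, or one that reduces under lambda abstractions, would need capture-avoiding substitution that renames bound variables where necessary.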

In fact, it is fairly easy to encode numbers, lists, trees, arrays,
objects, iteration, and recursion in the pure untyped lambda calculus.
Recursion can be encoded using one of the so-called Y combinators.
This is the recursion combinator for call-by-name evaluation:

    Y = \h.(\x. h(x x)) (\x. h(x x))

This is a recursion operator for call-by-value evaluation:

    Yv = \h.(\x.(\a.h (x x) a)) (\x.(\a.h (x x) a))

One can define a non-recursive variant of, say, the factorial
function, and then make it recursive using the Y combinator:

    fac0 = \fac. \n. if n=0 then 1 else n * fac(n-1)

then fac = Y fac0 since

    fac 2
    = Y fac0 2
    = (\h.(\x. h(x x)) (\x. h(x x))) fac0 2
    = (\x. fac0(x x)) (\x. fac0(x x)) 2
    = fac0 ((\x. fac0(x x)) (\x. fac0(x x))) 2
    = if 2=0 then 1 else 2 * (((\x. fac0(x x)) (\x. fac0(x x))) (2-1))
    = 2 * (((\x. fac0(x x)) (\x. fac0(x x))) (2-1))
    = 2 * (((\x. fac0(x x)) (\x. fac0(x x))) 1)
    = 2 * fac0 ((\x. fac0(x x)) (\x. fac0(x x))) 1
    = 2 * (if 1=0 then 1 else 1 * ((\x. fac0(x x)) (\x. fac0(x x))) (1-1))
    = 2 * (1 * ((\x. fac0(x x)) (\x. fac0(x x))) (1-1))
    = 2 * (1 * ((\x. fac0(x x)) (\x. fac0(x x))) 0)
    = 2 * (1 * fac0 ((\x. fac0(x x)) (\x. fac0(x x))) 0)
    = 2 * (1 * (if 0=0 then 1 else 0 * ((\x. fac0(x x)) (\x. fac0(x x))) (0-1)))
    = 2 * (1 * 1)
    = 2

For the sake of illustration we here assumed that we can use
arithmetic on integers in the lambda calculus, although this was not
included in the syntax above.

In fact, the natural numbers (non-negative integers with addition,
subtraction, multiplication, and test for zero) can be encoded as
so-called Church numerals as follows (and also in a number of other
ways):

    zero  is  \f.\x. x
    one   is  \f.\x. f x
    two   is  \f.\x. f (f x)
    three is  \f.\x. f (f (f x))

and so on. Then successor (+1), addition and multiplication may be
defined as follows:

    succ is  \m.\f.\x. f (m f x)
    add  is  \m.\n.\f.\x. m f (n f x)
    mul  is  \m.\n.\f.\x. m (n f) x

Some of these encodings are possible only in the untyped lambda
calculus, so the absence of type restrictions is important. In
particular, the pure *simply typed* lambda calculus cannot encode
unbounded iteration or recursion.

There is a rich literature on the lambda calculus, not least Henk
Barendregt's comprehensive monograph `The Lambda Calculus',
North-Holland ca 1984.

Several different evaluation strategies are possible for the untyped
lambda calculus. To experiment with some encodings and evaluation
strategies, you may try http://www.dina.kvl.dk/~sestoft/lamreduce/
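
The Church numeral encodings for zero, one, two, three, succ, add and mul happen to be typable in SML, so they can also be tried out directly as curried SML functions. The following is a sketch for experimentation only; the helper toInt, which converts a Church numeral to an SML int, and the names four, five and six are introduced here and are not part of the note.

```sml
(* Church numerals as curried functions: n f x applies f to x n times. *)
fun zero  f x = x
fun one   f x = f x
fun two   f x = f (f x)
fun three f x = f (f (f x))

(* Successor, addition and multiplication, as in the encodings above. *)
fun succ m f x = f (m f x)
fun add m n f x = m f (n f x)
fun mul m n f x = m (n f) x

(* toInt (hypothetical helper): count the applications of f from 0. *)
fun toInt n = n (fn k => k + 1) 0

val four = toInt (succ three)
val five = toInt (add two three)
val six  = toInt (mul two three)
```

The numerals are written with fun rather than val so that each definition stays polymorphic and can be used at whatever instance toInt, add or mul requires.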