Proceedings of the 2002 ACM SIGPLAN Haskell Workshop (Haskell '02): Pittsburgh, Pennsylvania, USA; October 3, 2002
ISBN: 1581136056, 9781581136050

Table of contents:
Template meta-programming for Haskell / Tim Sheard, Simon Peyton Jones --
A formal specification of the Haskell 98 module system / Iavor S. Diatchki, Mark P. Jones, Thomas Hallgren --
A recursive do for Haskell / Levent Erkök, John Launchbury --
Eager Haskell: resource-bounded execution yields efficient iteration / Jan-Willem Maessen --
Functional reactive programming, continued / Henrik Nilsson, Antony Courtney, John Peterson --
Testing monadic code with QuickCheck / Koen Claessen and John Hughes --
Haddock, a Haskell documentation tool / Simon Marlow --
A lightweight implementation of generics and dynamics / James Cheney, Ralf Hinze --
Techniques for embedding postfix languages in Haskell / Chris Okasaki.


Lambada, Haskell as a better Java

Erik Meijer
University of Utrecht
mailto:[email protected]

Sigbjorn Finne
Microsoft Research
mailto:[email protected]

Abstract

The Lambada framework provides facilities for fluid interoperation between Haskell (currently both Hugs and GHC, using non-Haskell 98 extensions) and Java. Using Lambada, we can call Java methods from Haskell, and have Java methods invoke Haskell functions. The framework rests on the Java Native Interface (JNI). The Lambada release includes a tool for generating IDL from Java .class files (using reflection), which is fed into our existing HDirect tool to generate Haskell-callable stubs.

1 Introduction

It goes without saying that the ability to interact with other languages is of vital importance for the long-term survival of any programming language, in particular for niche languages such as Haskell or ML. Java is an interesting partner for interoperation for a number of reasons:

• Java comes with a large set of stable, and usually well-documented and well-designed, libraries and APIs.

• The interaction with Java via the Java Native Interface (JNI) [8, 4] effectively allows us to script the underlying JVM. This makes bidirectional interoperability with Java very flexible and vendor independent.

• Because of Java's platform independence, we hope that Lambada gains wider acceptance than our previous work.

Notwithstanding all these advantages, using JNI in its most primitive form is tedious and error prone. Using raw JNI is like assembly programming on the JVM. Our mission is to make interoperation between Java and Haskell via JNI as convenient as programming in Java directly. The architecture of Lambada borrows heavily from our previous work on interfacing Haskell and COM [6, 2, 3], in particular the binding for Automation [7]. We refer the reader to those papers for a discussion of the rationale behind the object encoding used for Lambada.

2 The Java Native Interface

Simply said, JNI provides a number of invocation functions to initialize and instantiate a Java Virtual Machine (JVM), and defines a COM interface to the JVM (the JavaVM interface) and to individual threads that run inside a particular JVM (the JNIEnv interface). The JNIEnv interface contains 228 methods for interacting with objects that live in its associated thread.

2.1 Calling C From Java and vice versa

To explain the way C and Java interact via JNI we use a slightly perverse version of Hello World!, where a Java method calls a C procedure that calls back into Java to print Hello World!. To use foreign functions in Java we declare a method as native. Class Hello imports a native method cayHello from the HelloImpl DLL (a dynamic-link library (DLL) is an executable file that acts as a shared library of functions on the Win32 platform), and defines a normal static Java method sayHello that will print the greeting on the standard output:

  class Hello {
    native void cayHello ();
    static void sayHello () {
      System.out.println("Hello World!");
    }
    static {
      System.loadLibrary ("HelloImpl");
    }
  }

In Java we cannot just call entries of an arbitrary DLL; Java dictates what native methods should look like. For this example, HelloImpl.dll should have an entry Java_Hello_cayHello with the following signature (throughout this paper we specify the C signatures for the different JNI procedures using IDL attributes such as [in] and [out]):

  JNIEXPORT void JNICALL Java_Hello_cayHello
    ([in]JNIEnv* env, [in]jobject this)

In general, each native method implementation gets an additional JNIEnv argument, and a jobject object reference (or a jclass class object for static methods). The JNIEnv argument is an interface pointer through which the native method can communicate with Java, for instance to access fields of the object, or to call back into the JVM. The jobject (or jclass) reference corresponds to the this pointer (or class object) in Java.

Procedure Java_Hello_cayHello first gets the class object for the Hello class, then gets the methodID for the static void sayHello() method, and finally calls it:

  jclass cls = (*env) -> GetObjectClass (env, this);
  jmethodID mid = (*env) -> GetStaticMethodID (env, cls, "sayHello", "()V");
  (*env) -> CallStaticVoidMethod (env, cls, mid);

In the above example the initiative came from the Java side. It is also possible to start on the C side. The JNI_CreateJavaVM function takes an initialization structure and loads and initializes a JVM instance. It returns both a JavaVM interface pointer to the JVM and a JNIEnv interface pointer to the main thread running in that JVM instance:

  jint JNI_CreateJavaVM
    ( [out]JavaVM** pvm
    , [out]JNIEnv** penv
    , [in]JavaVMInitArgs* args
    );

To call sayHello in this manner, we first load and initialize a JVM using JNI_CreateJavaVM to obtain a JNIEnv pointer, then call the sayHello method exactly as we did before, except that we get the class object via the FindClass function, and finish by destroying the JVM:

  JNIEnv* env; JavaVM* jvm; JavaVMInitArgs args;
  new_InitArgs (&args);
  JNI_CreateJavaVM (&jvm, &env, &args);
  jclass cls = (*env) -> FindClass (env, "Hello");
  jmethodID mid = (*env) -> GetStaticMethodID (env, cls, "sayHello", "()V");
  (*env) -> CallStaticVoidMethod (env, cls, mid);
  (*jvm) -> DestroyJavaVM (jvm);

Even though we have left out many details (including dealing with errors), it is clear that interacting with Java at this level of abstraction is quite mind-numbing. In particular, the fact that we have to thread the JNIEnv pointer through every call is something we would like to get rid of. In addition, the encoding of the result type in method names (such as CallStaticVoidMethod) and the distinction between static and instance methods are things we want to hide from the user.

3 Essential FFI in Four Easy Steps

We will introduce some basic JNI concepts that underpin our JNI binding for Haskell while reviewing the basics of the Hugs/GHC foreign function interface. For a more thorough explanation of the FFI we refer to our ICFP'99 paper [2].

3.1 Importing Function Pointers

We can import the Win32 function

  HMODULE LoadLibraryA([in,string]char*);

by giving the following foreign import declaration:

  type CString = Addr
  type HMODULE = Int

  foreign import stdcall "kernel32" "LoadLibraryA"
    primLoadLibrary :: CString -> IO HMODULE

This foreign declaration defines a Haskell function primLoadLibrary that takes the address of a null-terminated string containing the filename of an executable module, dynamically loads the module into memory, and returns a handle to the requested module. As its name suggests, function primLoadLibrary is rather primitive. Using the functions marshallString and freeCString

  marshallString :: String -> IO CString
  freeCString :: CString -> IO ()

we can easily write a friendlier version that takes a normal Haskell string as its argument:

  loadLibrary :: FilePath -> IO HMODULE
  loadLibrary = \p ->
    do{ s <- marshallString p
      ; h <- primLoadLibrary s
      ; freeCString s
      ; return h
      }

A pointer to an entry of a loaded module is obtained by importing the Win32 function GetProcAddress:

  type FARPROC = Addr

  foreign import stdcall "kernel32" "GetProcAddress"
    primGetProcAddress :: HMODULE -> CString -> IO FARPROC

Again we can easily construct a Haskell-friendly version of primGetProcAddress that takes a normal Haskell string as its argument:

  getProcAddress :: HMODULE -> String -> IO FARPROC
  getProcAddress = \h -> \n ->
    do{ pn <- marshallString n
      ; p <- primGetProcAddress h pn
      ; freeCString pn
      ; return p
      }

A function pointer obtained this way is turned into a Haskell-callable function with a foreign import dynamic declaration; for JNI_CreateJavaVM:

  foreign import dynamic mkCreateJavaVM
    :: FARPROC -> IO (JavaVM -> JNIEnv -> Addr -> IO Int)

To write a flexible version of createJavaVM we need several helper functions. Function newJavaVMInitArgs creates a default JVM initialization structure; it comes with a corresponding free routine freeJavaVMInitArgs:

  newJavaVMInitArgs :: IO Addr
  freeJavaVMInitArgs :: Addr -> IO ()

Function newResultAddr allocates space for a return value; it also comes with a corresponding free routine freeResultAddr:

  newResultAddr :: IO Addr
  freeResultAddr :: Addr -> IO ()

And finally, a function deref to shorten a pointer indirection:

  deref :: Addr -> IO Addr

Using these helper functions, we can complete the tedious code for createJavaVM:

  type JavaVM = Addr
  type JNIEnv = Addr

  createJavaVM :: FilePath -> IO (JavaVM, JNIEnv)
  createJavaVM = \p ->
    do{ h <- loadLibrary p
      ; f <- getProcAddress h "JNI_CreateJavaVM"
      ; createVM <- mkCreateJavaVM f
      ; args <- newJavaVMInitArgs
      ; pvm <- newResultAddr
      ; penv <- newResultAddr
      ; createVM pvm penv args
      ; jvm <- deref pvm
      ; env <- deref penv
      ; freeResultAddr penv
      ; freeResultAddr pvm
      ; freeJavaVMInitArgs args
      ; return (jvm, env)
      }
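For comparison, the import side of this machinery is much shorter in the FFI that was later standardised; the following is a sketch under the text's simplifications (HMODULE as Int, stdcall on Win32), with withCString as the standard allocate/use/free helper playing the role of marshallString and freeCString:

  import Foreign.C.String (CString, withCString)

  type HMODULE = Int  -- a real binding would use Ptr ()

  foreign import stdcall "LoadLibraryA"
    primLoadLibrary :: CString -> IO HMODULE

  loadLibrary :: FilePath -> IO HMODULE
  loadLibrary p = withCString p primLoadLibrary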

In the opposite direction, we can export the Haskell function sayHello :: JNIEnv -> Jobject -> IO () by using a function mkSayHello. The latter function is defined using a foreign export dynamic declaration:

  foreign export dynamic mkSayHello
    :: (JNIEnv -> Jobject -> IO ()) -> FARPROC

Thus mkSayHello is a Haskell function that takes any Haskell function of type JNIEnv -> Jobject -> IO () and at run time creates a C function pointer that, when called, will call the exported Haskell function:

  sayHello :: JNIEnv -> Jobject -> IO ()
  sayHello = \env -> \this -> do{ putStr "Hello From Haskell!" }

  sayHello_ :: FARPROC
  sayHello_ = mkSayHello sayHello

(The double hash operator ##, used below, is just reverse monadic composition: f ## g = \a -> do{ b <- f a; g b }.)

The JNIEnv method RegisterNatives dynamically registers new native methods with a running JVM. The RegisterNatives entry takes a reference to a class object in which the native methods will be registered, and an array of JNINativeMethod structures that contain the names, types and implementations of the native methods:

  typedef struct
  { [string]char* name;
    [string]char* signature;
    [ptr]FARPROC fnPtr;
  } JNINativeMethod;

  jint RegisterNatives
    ( [in]JNIEnv* env
    , [in]jclass clazz
    , [in,size_is(nMethods)]JNINativeMethod* meths
    , [in]jint nMethods
    );

To be able to call RegisterNatives in Haskell through a JNIEnv pointer, we must first dereference the JNIEnv pointer, fetch the 215-th entry in the function table and coerce that into a Haskell-callable function. Then we pass the resulting function the required arguments to compute the result:

  getProcAt :: Int -> Addr -> IO FARPROC

  foreign import dynamic mkRegisterNatives
    :: FARPROC -> IO (JNIEnv -> Jclass -> Addr -> Int -> IO Int)

  registerNatives :: JNIEnv -> Jclass -> Addr -> Int -> IO ()
  registerNatives = \env -> \clazz -> \meths -> \n ->
    do{ f <- (deref ## getProcAt 215) env
      ; g <- mkRegisterNatives f
      ; g env clazz meths n
      ; return ()
      }

Assuming a function marshallJNINativeMethods that builds an array of JNINativeMethod structures from a Haskell list of name/signature/function-pointer triples

  marshallJNINativeMethods
    :: [(String, String, FARPROC)] -> IO Addr

it is not hard to define a function registerHello that registers the Haskell function sayHello as the native implementation of a Java method void sayHello ():

  registerHello :: JNIEnv -> Jclass -> IO ()
  registerHello = \env -> \clazz ->
    do{ p <- marshallJNINativeMethods
               [("sayHello", "()V", sayHello_)]
      ; registerNatives env clazz p 1
      }

Each JNI entry Call<res>Method (one for every possible result type <res>) gives rise on the Haskell side to a primitive function that still threads the JNIEnv pointer explicitly:

  call<res>Method :: String -> String -> Addr
                  -> ForeignObject -> JNIEnv -> IO <res>
  call<res>Method = \name -> \sig -> \a -> \this -> \env ->
    do{ cls <- ...
      ; ...
      }

By fetching the JNIEnv pointer inside the call instead, it can be hidden from the caller:

  call<res>Method :: String -> String -> Addr -> Jobject -> IO <res>
  call<res>Method = \name -> \sig -> \args -> \this ->
    do{ env <- ...
      ; ...
      }

Function marshallString :: String -> IO CString takes a Haskell string and returns a pointer to a null-terminated array of characters. The primitive string marshall function marshallString itself is defined in terms of two more primitive functions. Function marshallByRefChar :: Char -> [Addr -> IO ()] returns a (in this case singleton) list of functions that write a character at the indicated address. Function writeAt takes a list of Addr -> IO () functions and a start address, and calls each function in the list to write its value at subsequent addresses:

  writeAt :: [Addr -> IO ()] -> Addr -> IO ()

To marshall a string, we allocate enough memory to hold the string, generate a list of marshall functions for each character in the string, and then write each one into the allocated memory:

  marshallString :: String -> IO Addr
  marshallString = \s ->
    do{ let cs = s ++ [chr 0]
      ; p <- malloc (length cs)
      ; writeAt (concat (map marshallByRefChar cs)) p
      ; return p
      }

Function freeString is defined in terms of free :: Addr -> IO (), which deallocates a memory block that was previously allocated by a call to malloc:

  freeString :: Addr -> IO ()
  freeString = \s -> do{ free s }
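Neither writeAt nor incrAddr is defined in the text; a minimal sketch of writeAt, assuming incrAddr :: Addr -> Addr advances an address by one storage slot (incrAddr is used later on), would be:

  writeAt :: [Addr -> IO ()] -> Addr -> IO ()
  writeAt fs p = sequence_ (zipWith ($) fs (iterate incrAddr p))

Each writer in the list is applied to the corresponding address in the sequence p, incrAddr p, incrAddr (incrAddr p), and so on.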

In general, we have a family of functions marshall<type>, marshallByRef<type>, and free<type> for each <type> that we can marshall from Haskell to Java:

  marshall<type> :: <type> -> IO Addr
  marshallByRef<type> :: <type> -> [Addr -> IO ()]
  free<type> :: Addr -> IO ()

At this moment bells should start to ring and alarms should go off: “ad-hoc overloading”, this is exactly what Haskell type classes were invented for! So we define a new type class GoesToJava with three methods marshall, marshallByRef and free. To be able to overload free as well, we add an extra phantom argument to the free function in the class:

  class GoesToJava a where
  { marshall :: a -> IO Addr
  ; marshallByRef :: a -> [Addr -> IO ()]
  ; free :: a -> Addr -> IO ()
  }

and make all types that can be marshalled from Haskell to Java (Char, Int, Integer, Double, Float, Jobject, String, ...) an instance of the GoesToJava class:

  instance GoesToJava <type> where
  { marshall = marshall<type>
  ; marshallByRef = marshallByRef<type>
  ; free = \a -> free<type>
  }

The marshallByRef instances for Double and Integer return a two-element list whose second entry is \a -> do{ return () }, to ensure that the marshall function will allocate enough memory to hold a double or a long.

We continue to overload GoesToJava for tuples, to encode Java methods with zero or more than one argument. This would be impossible if we used curried functions to represent multiple-argument Java methods in Haskell.

  instance GoesToJava () where
  { marshall = \() -> do{ malloc 0 }
  ; marshallByRef = \() -> []
  ; free = \() -> \p -> do{ free p }
  }

  instance (GoesToJava a, GoesToJava b) => GoesToJava (a,b) where
  { marshall = \(a,b) ->
      do{ let ab = marshallByRef a ++ marshallByRef b
        ; p <- malloc (length ab)
        ; writeAt ab p
        ; return p
        }
  ; marshallByRef = \(a,b) -> marshallByRef a ++ marshallByRef b
  ; free = \(a,b) -> \p ->
      do{ (deref ## free a) p
        ; (deref ## free b) (incrAddr p)
        ; free p
        }
  }
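Spelling the scheme out for one base type, say Int, the instance is just the per-type family members plugged into the class (marshallInt, marshallByRefInt and freeInt are the assumed primitive functions from the family above):

  instance GoesToJava Int where
  { marshall = marshallInt
  ; marshallByRef = marshallByRefInt
  ; free = \a -> freeInt   -- the Int argument is a phantom; only its type matters
  }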

By using the GoesToJava class we don't have to marshall the arguments of a JNI call into an array ourselves anymore; we can simply write (again qualifying the previous version of call<res>Method with Prim):

  call<res>Method :: GoesToJava a
                  => String -> String -> a -> Jobject -> IO <res>
  call<res>Method = \name -> \sig -> \a -> \this ->
    do{ p <- marshall a
      ; r <- this # Prim.call<res>Method name sig p
      ; free a p
      ; return r
      }

The result type of a method call can be overloaded in exactly the same way, using a class ComesFromJava:

  class ComesFromJava b where
  { callMethod :: GoesToJava a
               => String -> String -> a -> Jobject -> IO b
  }

  instance ComesFromJava <res> where
  { callMethod = call<res>Method
  }

At this point we have reduced the complexity of JNI calls quite significantly when compared with the most primitive form, but there is still ample room for improvement. The next thing that we are going to attack is the explicit passing of type descriptor strings, which at the moment is rather unsafe.

5.5 Generating type descriptors

The problem with the current version of function callMethod is that there is no connection between the value of the type descriptor string, which denotes the Java type of a field or method, and the corresponding Haskell types of the argument and result of the callMethod function. For example, the following definition is accepted by the Haskell type-checker:

  foo :: () -> Jobject -> IO Int
  foo = \() -> \obj -> do{ obj # callMethod "foo" "(I)V" () }

Unfortunately, it leads to a runtime error, because the foo method is called as if it had Java type void foo(int) instead of the correct type int foo() (whose type descriptor string is "()I").

We will generate type descriptors by a family of functions descriptor<type> :: <type> -> String (this trick is used in the Haskell module Dynamic as well, and is attributed to the legendary hacker Lennart Augustsson). As we did in

the previous sections, we define a Haskell type class, in this case called Descriptor, that witnesses for which <type>s we can obtain a descriptor:

  class Descriptor a where
  { descriptor :: a -> String
  }

  instance Descriptor <type> where
  { descriptor = descriptor<type>
  }

The type descriptor for the primitive JNI type Bool is "Z":

  descriptorBool :: Bool -> String
  descriptorBool = \z -> "Z"

The other basic types are mapped as follows: Byte ↦ "B", Char ↦ "C", Short ↦ "S", Int ↦ "I", Integer ↦ "J", Float ↦ "F", Double ↦ "D". Type descriptors for method arguments are formed by concatenating the type descriptors of the individual arguments:

  instance Descriptor () where
  { descriptor = \() -> ""
  }

  instance (Descriptor a, Descriptor b) => Descriptor (a,b) where
  { descriptor = \(a,b) -> descriptor a ++ descriptor b
  }

Type descriptors for methods are formed by enclosing the type descriptor of the arguments in parentheses, followed by the type descriptor of the result type, with the exception that V is used for the void return type:

  methodDescriptor :: (Descriptor a, Descriptor b)
                   => a -> b -> String
  methodDescriptor = \a -> \b ->
    concat [ "(", descriptor a, ")"
           , if null (descriptor b) then "V" else descriptor b
           ]

To deploy function methodDescriptor we need to jump one more hurdle. In order to build the method descriptor string, we need to know the type of the result of a call. Since the methodDescriptor function does not need the value of its arguments to compute a type descriptor, we can recursively build the right method descriptor from a call which will take that descriptor as an argument:

  callMethod :: ( Descriptor a, GoesToJava a
                , Descriptor b, ComesFromJava b
                ) => String -> a -> Jobject -> IO b
  callMethod = \name -> \a -> \this ->
    let{ sig = methodDescriptor a (unsafePerformIO mb)
       ; mb = this # Prim.callMethod name sig a
       } in mb

There are several things to note about the above code. First of all, it intimately relies on lazy evaluation: function methodDescriptor only depends on the types of its arguments and not on their values. Secondly, the call to unsafePerformIO is there solely to strip the IO off the result type of the callMethod function, which is completely safe.
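As a worked check of the descriptor machinery just defined (using the instances above, with Int ↦ "I" and Double ↦ "D"):

  descriptor (3 :: Int, 4.0 :: Double)           -- "ID"
  descriptor ()                                  -- ""
  methodDescriptor (3 :: Int, 4.0 :: Double) ()  -- "(ID)V"

so a call taking an (Int, Double) argument tuple and returning () is issued at the Java type void m(int, double), exactly as the JVM expects.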

5.6 Encoding Inheritance using phantom types

The callMethod function as we have reached it by now is not yet as safe and flexible as it could be. In particular, we do not reflect the fact that Java classes satisfy an inheritance relation. We can however encode inheritance using the same trick we used for COM components [6]. For the Java root class Object we define a phantom type Object_ that contains the low-level untyped Jobject pointer, and immediately hide Object_ using a type synonym Object (we will see in just a minute why):

  data Object_ a = Object Jobject
  type Object a = Object_ a

Type Object_ is called a phantom type because its type parameter a is not used in its right-hand side. For every class <sub> that extends its superclass <super>, we define a new (empty) phantom type <sub>_ and a type synonym <sub> that expresses that <sub> extends <super>:

  data <sub>_ a
  type <sub> a = <super> (<sub>_ a)

The definition of <sub> expands to Object_ (... (<sub>_ a) ...), which precisely expresses the inheritance relationship between Object, the intermediate classes, and <sub>. An instance of the Java class <c> is represented by a value of type <c> (), which is still the same old Jobject pointer, but now guarded by a little type harness. The nice thing about this is that the Haskell type checker prevents us from using a method of a class <c> on an instance of a class for which <c> is not a superclass. Of course, when actually calling a method we use the Jobject pointer that is hidden inside the Object wrapper:

  callMethod :: ( Descriptor a, GoesToJava a
                , Descriptor b, ComesFromJava b
                ) => String -> a -> Object o -> IO b
  callMethod = \name -> \a -> \(Object this) ->
    Prim.callMethod name a this
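Instantiating the scheme for the AWT classes Component (which extends Object) and Container (which extends Component) gives:

  data Component_ a
  type Component a = Object (Component_ a)

  data Container_ a
  type Container a = Component (Container_ a)

Now Container () expands to Object_ (Component_ (Container_ ())), so a Container value can be passed wherever a Component a — and hence an Object b — is expected, but not the other way around.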

5.6.1 Class descriptors

Using the above encoding of Java classes into Haskell types, we can generate type descriptor strings for Java classes by defining a function descriptor<c> for each Java class <c>:

  descriptor<c> :: <c> () -> String
  descriptor<c> = \_ -> "L<c>;"

  instance Descriptor (<c> ()) where
  { descriptor = descriptor<c>
  }

A class descriptor is the fully qualified Java class name with all .-s replaced by /-s; for instance, the class descriptor for the Java class Object is java/lang/Object.

For methods that take objects as arguments, JNI expects the class descriptor of the formal parameter and not of the actual argument. This means, for example, that if we call the method Component add(Component comp) of the Java class Container with an argument whose run-time class is Frame (a subclass of Container), we still must pass the class descriptor Ljava/awt/Component; instead of Ljava/awt/Frame;. The correct version of add silently casts its argument to a Component () just before making the actual call:

  add :: Component a -> Container b -> IO (Component a)
  add = \comp -> \this ->
    do{ this # callMethod "add" (cast comp :: Component ()) }

Inside function add we use the function cast :: a -> b to change the first argument of type Component a (which could be any subclass of Component) to exactly Component (), so that we get the correct method descriptor. Function add itself remains completely type safe.

The above situation is the only case where we have to explicitly insert an upcast in Haskell; the question remains how to deal with the occasional need for downcasting. In Java, it is possible to unsafely cast between arbitrary classes, which might raise a dynamic ClassCastException. Using the Haskell cast to do an illegal arbitrary cast will either result in a ClassCastException or other exception being thrown in Java (which is checked using the JNI entry ExceptionOccurred), or a call to one of the underlying JNI entries will signal an error (usually by returning NULL). The Haskell wrapper will catch these and propagate the errors to the Haskell level by raising a userError when returning. In other words, using cast to simulate arbitrary Java casting in Haskell is no more safe or unsafe than arbitrary casting in Java.
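The paper does not show cast's definition; under the encoding above, one way to realise it is simply to transfer the raw Jobject to a fresh type harness (a sketch, not necessarily Lambada's actual implementation):

  cast :: Object_ a -> Object_ b
  cast (Object jo) = Object jo

Since every class synonym expands to an application of Object_, this single definition covers uses such as cast comp :: Component ().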

5.7 Dealing with interfaces

Besides classes, Java also supports interfaces. The role of interfaces in Java is quite similar to that of type classes in Haskell. Saying that a Java class implements an interface is like making an instance declaration in Haskell. To deal with interfaces in Lambada we introduce an empty Haskell type class for each Java interface, possibly extending one or more base interfaces, for example:

  class ImageObserver a
  class MenuContainer a
  class Serializable a

  class EventListener a
  class EventListener a => WindowListener a

For each method in an interface, we generate a function that requires its argument to satisfy the class constraint; e.g. for the MenuContainer interface we define:

  getFont :: MenuContainer a => () -> a -> IO (Font ())
  getFont = \() -> \this ->
    do{ this # callMethod "getFont" () }

For each Java class <c> that implements an interface <i>, we make the corresponding Haskell type an instance of the Haskell type class that corresponds to the Java interface <i>. For instance, the Java class Component implements the interfaces ImageObserver, MenuContainer, and Serializable, so in Haskell we write:

  instance ImageObserver (Component a)
  instance MenuContainer (Component a)
  instance Serializable (Component a)

Note that we have to be careful not to write Component (), since every subclass of Component implements the interfaces that Component implements. Since Component is an instance of MenuContainer, a call c # getFont () is well-typed whenever c has type Component (), Container (), Window (), Frame (), or any subclass of these.

Some methods expect interface ‘instances’ as arguments:

  void addWindowListener(WindowListener)

To deal with this situation, we have to be able to generate descriptor strings for interfaces. For each interface <i> we define a new type that uniquely identifies that interface, and make it an instance of Descriptor. Of course, an interface satisfies its own interface constraint (in reality we have to do name mangling, because in Haskell data types and type classes live in the same name space, but here no confusion can arise):

  data <i>
  descriptor<i> :: <i> -> String
  descriptor<i> = \_ -> "L<i>;"

  instance Descriptor <i> where
  { descriptor = descriptor<i>
  }

  instance <i> <i>

Now we can (safely) coerce any object reference that implements a certain interface to that interface and obtain the correct descriptor.
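Instantiating this scheme for WindowListener, with the name mangling made explicit (the mangled names are illustrative; java/awt/event/WindowListener is the interface's real qualified name):

  data WindowListenerI

  descriptorWindowListenerI :: WindowListenerI -> String
  descriptorWindowListenerI = \_ -> "Ljava/awt/event/WindowListener;"

  instance Descriptor WindowListenerI where
  { descriptor = descriptorWindowListenerI
  }

  instance EventListener WindowListenerI
  instance WindowListener WindowListenerI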

5.8 Static methods and class objects

Up to now we have ignored static methods. A static or class method is a method that does not act on a class instance, but on the class object itself.

Calling a static method, say void Foo(), on a class object cls, given a JNIEnv pointer env, is quite similar to calling an instance method, except that the method is not invoked on an object instance but on its class object:

1. Look up the methodID for the static method via its name and type description, using the JNI method GetStaticMethodID:

     mid <- env # getStaticMethodID cls "Foo" "()V"

2. Call the method on the class object via the methodID:

     env # callStaticVoidMethod cls mid

Overloading the argument and result types exactly as before gives a function

  callStaticMethod :: ( Descriptor a, GoesToJava a
                      , Descriptor b, ComesFromJava b
                      ) => String -> a -> Jclass -> IO b

Although in Java we can call a static method on an object (which the compiler translates to the right class), the recommended style is to call it on the class directly. By requiring the latter style, we can distinguish the types of callStaticMethod and callMethod. This enables us to introduce overloading to hide the distinction between calling static and instance methods:

  class JNI j where
  { call :: ( Descriptor a, GoesToJava a
            , Descriptor b, ComesFromJava b
            ) => String -> a -> j -> IO b
  }

For instance method calls we can immediately use callMethod:

  instance JNI (Object a) where
  { call = callMethod
  }

For static methods we define a new phantom type ClassObject, such that ClassObject <c> uniquely encodes the class object of a Java class <c>. Type ClassObject just wraps a type harness around an untyped Jclass pointer:

  data ClassObject a = ClassObject Jclass

The JNI instance declaration for static methods peels off the type harness and calls callStaticMethod:

  instance JNI (ClassObject a) where
  { call = \n -> \a -> \(ClassObject clazz) ->
      clazz # callStaticMethod n a
  }

So now we can call static methods on class objects, but how do we obtain class objects in the first place? The JNIEnv entry findClass takes a descriptor for a Java class <c> and returns the class object for <c>:

  getClassObject :: Descriptor (Object o)
                 => Object o -> ClassObject (Object o)
  getClassObject = \this -> unsafePerformIO $
    do{ env <- ...
      ; cls <- env # findClass (descriptor this)
      ; return (ClassObject cls)
      }

  classObject :: Descriptor (Object o) => ClassObject (Object o)
  classObject = let{ c = getClassObject (stripObject c) } in c

Here stripObject has type ClassObject a -> a; its result is never evaluated, and it serves only to tie the types of c and getClassObject together via lazy evaluation.
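As a hypothetical usage sketch, calling the static method long currentTimeMillis() of java.lang.System would then look as follows, assuming an encoding data System_ a; type System a = Object (System_ a) with a Descriptor instance as in Section 5.6.1, and with x # f = f x as reverse application, as used throughout the paper:

  currentTime :: IO Integer
  currentTime =
    (classObject :: ClassObject (System ())) # call "currentTimeMillis" ()

The descriptor "()J" is generated from the types, and the JNI instance for ClassObject dispatches to callStaticMethod.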

5.9 Constructing object instances

The distinction between instances (subclasses of java.lang.Object) and class objects (subclasses of java.lang.Class) can be confusing at first, especially because we can also call non-static methods on class references. One such example is constructor functions to create class instances. Without using overloading to generate method signatures, we can combine the steps to create a new object into a single function as follows:

  new :: GoesToJava a => String -> Jclass -> a -> IO (Object o)
  new = \sig -> \clazz -> \a ->
    do{ cid <- clazz # getMethodID "<init>" sig
          -- constructors are named "<init>" at the JVM level
      ; obj <- clazz # newObject cid a
      ; return (Object obj)
      }

Using overloading to generate the constructor signature, we can then define for each class <c> a convenient constructor function:

  new_<c> :: GoesToJava a => a -> IO (<c> ())
  new_<c> = new ...

6 Simulating anonymous inner classes

Haskell functions/actions are represented in Java as objects of class Function, which store the actual exported Haskell function pointer as an int in a private field f, and implement a native method call that takes an Object array of arguments and an Object representing the this pointer (or class object), and returns an Object as result:

  class Function {
    private int f;
    Function(int f){ this.f = f; }
    public native Object call (Object[] args, Object self);
    ...
  }

A typical situation is that we have to create an event handler that implements the WindowListener interface. To facilitate the construction of such a handler in Haskell, we define a Java class WindowHandler that has a private Function object field for each of its methods, which are initialized by the constructor. Each method in the WindowHandler class simply delegates its work to the appropriate Function object:

  class WindowHandler implements WindowListener {
    private Function activated; ...;
    WindowHandler (Function activated, ...) {
      this.activated = activated; ...
    }
    void windowActivated(WindowEvent e){
      this.activated.call (new Object [] {e}, this);
    }
    ...
  }

The JNI framework dictates that the native implementation of the call method is a function that takes the JNIEnv and the Jobject this-pointer (as all native methods do), a Jobject argument that represents the argument array, the explicit this argument of the instance in which the call was made, and that returns a Jobject:

  call :: JNIEnv -> Jobject -> (Jobject -> Jobject -> IO Jobject)

Function call uses the this pointer of the Function instance to retrieve the function pointer from the private field f, and then calls that as a function of Haskell signature Jobject -> Jobject -> IO Jobject:

  foreign import dynamic importFunction
    :: FARPROC -> IO (Jobject -> Jobject -> IO Jobject)

  call = \env -> \this -> \args -> \t ->
    do{ f <- ...               -- fetch the function pointer from field f
      ; g <- importFunction f
      ; g args t
      }

In the other direction, exporting a Haskell function of this low-level type yields a FARPROC that can be stored in a newly constructed Function object:

  foreign export dynamic exportFunction
    :: (Jobject -> Jobject -> IO Jobject) -> IO FARPROC

  new_Function :: Int -> IO (Function ())

All this is quite low-level and error-prone, so let's use a type class to tidy up and do the grungy work for us. First of all, we extend the ComesFromJava class with a new member function unmarshallArray that unmarshalls zero or more values from a Jobject pointer representing an Object array into a tuple of corresponding Haskell values, and a new member function unmarshall that unmarshalls a single Jobject pointer into its corresponding Haskell value:

  class ComesFromJava b where
  { callMethod :: GoesToJava a
               => String -> String -> a -> Object o -> IO b
  ; unmarshall :: Jobject -> IO b
  ; unmarshallArray :: Jarray -> IO b
  }

For basic types we simply use the unmarshaller for that type, or first get the single element from the array and unmarshall that:

  instance ComesFromJava <res> where
  { unmarshall = \o -> do{ unmarshall<res> o }
  ; unmarshallArray = \args ->
      args # (getArrayElement 0 ## unmarshall)
  }

For pairs, triples, etc., we recursively unmarshall the elements from the array one at a time:

  instance (ComesFromJava a, ComesFromJava b)
        => ComesFromJava (a,b) where
  { unmarshall = unmarshallArray
  ; unmarshallArray = \args ->
      do{ a <- args # (getArrayElement 0 ## unmarshall)
        ; b <- args # (getArrayElement 1 ## unmarshall)
        ; return (a,b)
        }
  }

Function marshallFun takes a high-level Haskell callback function (of type a -> Object o -> IO b), turns it into a low-level function of type Jobject -> Jobject -> IO Jobject, and then dynamically exports it using exportFunction:

  marshallFun :: (ComesFromJava a, GoesToJava b)
              => (a -> Object o -> IO b) -> IO FARPROC
  marshallFun = \f -> exportFunction $ \args -> \this ->
    do{ a <- unmarshallArray args
      ; b <- f a (Object this)
      ; marshall b
      }

Finally, functions themselves are made marshallable, so that a Haskell callback can be passed wherever a Function object is expected:

  instance (ComesFromJava a, GoesToJava b)
        => GoesToJava (a -> Object o -> IO b) where
  { marshall = \f ->
      do{ a <- marshallFun f
        ; ...
        }
  }

Using this machinery we can define new_WindowHandler, which takes seven Haskell functions of type Event -> Object o -> IO () that handle each of the seven possible Window events:

  new_WindowHandler
    ( \event -> \this -> do{ this # activated event }
    , ...
    )

This nearly exactly matches the corresponding code we would write in Java:

  new WindowHandler () {
    void activated (Event e) { ...; }
    ...
  }

7 Tool support

To reiterate the introduction of this paper, Java is an interesting partner for a foreign function interface because there are a lot of Java classes out there. However, even with the layers of abstraction we have introduced on top of JNI, manually coding the Haskell-callable stubs for the Java APIs we are interested in is not always practical. Lambada provides tool support in concert with HDirect [3]. HDirect is IDL-based, so provided we can derive an IDL specification from a Java class or interface, HDirect will take care of generating valid Haskell stubs. We do this by providing a tool that uses the Java Reflection API to generate IDL interfaces (annotated with a couple of custom attributes) corresponding to Java classes or interfaces. Provided with that information, HDirect is then able to automatically generate the Haskell-callable stubs we have described in this paper. HDirect is also capable of generating Java-callable stubs for Haskell-implemented classes/methods. At the Lambada implementation level, HDirect was also used to generate the low-level JNI stubs outlined in Section 3.

8 Related work

Claus Reinke was a remarkably early adopter of JNI; as early as 1997 he proposed to use JNI to interoperate between Haskell and Java [10]. Reinke's system provides only one-way interop between Haskell and Java using the basic low-level JNI methods, i.e. he does not support calling back from Java to Haskell. The most important contribution of our work over his is our sophisticated use of overloading and other advanced language features to hide the idiosyncrasies of JNI.

The integration that Lambada offers between Haskell and Java is very close to the tight integration offered by MLj [1], with the important exception that Lambada is implemented completely within Haskell, whereas MLj needs many language changes.

Acknowledgements

We would like to thank Sheng Liang for inviting us to explain Lambada inside the lion's cave at JavaSoft, Martijn Schrage for being a model Lambada beta-tester, and Conal Elliott and Simon Peyton Jones for proofreading the paper during pleasant visits of the first author to Microsoft. Sérgio de Mello Schneider suggested the name Lambada.

References

[1] Nick Benton and Andrew Kennedy. Interlanguage Working Without Tears: Blending SML with Java. In Proceedings of ICFP'99.
[2] Sigbjorn Finne, Erik Meijer, Daan Leijen, and Simon Peyton Jones. Calling Hell from Heaven and Heaven from Hell. In Proceedings of ICFP'99.
[3] Sigbjorn Finne, Erik Meijer, Daan Leijen, and Simon Peyton Jones. H/Direct: A Binary Foreign Language Interface for Haskell. In Proceedings of ICFP'98.
[4] Rob Gordon. Essential JNI: Java Native Interface. Prentice Hall, 1999.
[5] Simon Peyton Jones, Simon Marlow, and Conal Elliott. Stretching the Storage Manager: Weak Pointers and Stable Names in Haskell. In Proceedings of IFL'99, LNCS.
[6] Simon Peyton Jones, Erik Meijer, and Daan Leijen. Scripting COM Components from Haskell. In Proceedings of ICSR5, 1998.
[7] Daan Leijen, Erik Meijer, and Jim Hook. Haskell as an Automation Controller. LNCS 1608, 1999.
[8] Sheng Liang. The Java Native Interface. Addison-Wesley, 1999.
[9] Alastair Reid. Malloc Pointers and Stable Pointers: Improving Haskell's Foreign Language Interface. http://www.cs.utah.edu/~reid/writing.html, September 1994.
[10] Claus Reinke. Towards a Haskell/Java Connection. In Proceedings of IFL'98, LNCS 1595.
[11] Julian Seward, Simon Marlow, Andy Gill, Sigbjorn Finne, and Simon Peyton Jones. Architecture of the Haskell Execution Platform (HEP), Version 6. http://www.haskell.org/ghc/docs/papers/hep.ps.gz, July 1999.

Writing High-Performance Server Applications in Haskell
Case Study: A Haskell Web Server

Simon Marlow
Microsoft Research Ltd., Cambridge
[email protected]

14th August 2000

Abstract

Server applications, and in particular network-based server applications, place a unique combination of demands on a programming language: lightweight concurrency and high I/O throughput are both important. This paper describes a prototype web server written in Concurrent Haskell, and presents two useful results: firstly, a conforming server could be written with minimal effort, leading to an implementation in less than 1500 lines of code, and secondly, the naive implementation produced reasonable performance. Furthermore, making minor modifications to a few time-critical components improved performance to a level acceptable for anything but the most heavily loaded web servers.

1 Introduction

The Internet has spawned its own application domain: multithreaded server applications, capable of interacting with hundreds or thousands of clients simultaneously, are becoming increasingly important. Examples include FTP (File Transfer Protocol), E-Mail transport, DNS (name servers), Usenet News, chat servers, distributed file-sharing, and the most popular of all: HTTP servers.

HTTP (HyperText Transfer Protocol [1]) is the protocol used to transport web pages over the Internet. The HTTP protocol is essentially transaction-based. The client first opens a connection to the server and sends a request message. The server interprets the request and, if the request refers to a valid document on the server, replies to the client, sending it the contents of the document. In early versions of the HTTP protocol, the connection between client and server would be closed at this point, requiring the client to open a new connection for each document request. The latest revision of the HTTP protocol allows the connection to remain open for further transfers (known as a keep-alive connection).

The nature of the HTTP protocol imposes certain demands on a server implementation:

• Concurrency is essential, since the server cannot allow itself to be tied up by a slow client, or a client requesting a large document from the server. The choice of concurrency model has the largest impact on the server's performance, as we will discuss in Section 2. The server should be able to handle a large number of clients simultaneously transferring documents without performance being affected.

• For small requests, the server must be able to process the request quickly to avoid tying up resources for too long. This means the latency for a new connection should be as low as possible. This aspect of a server's performance can be measured by firing small requests at it at an increasing rate; at some point, the server's performance will start to drop off as the rate increases beyond the latency and a backlog of connections in progress builds up. Another advantage of low latency is that the server is more resistant to denial-of-service attacks. If it can throw out bogus requests as quickly as possible, then the overall server performance will be impacted less when the server is flooded with requests from a malevolent client.

• Fault tolerance is as important as performance: the server should be able to recover gracefully from errors, and never crash (or if it does, arrange that it restarts itself so that there is a minimal interruption in service). It should also be possible to re-configure the server without taking it down.

So why write a web server in Haskell? Firstly, because we can! It's no bad thing for Haskell if we can compete effectively in important application domains such as web serving. Secondly, Haskell has plenty of relevant attributes to bring to the party:

• Concurrent Haskell [5] provides a lightweight concurrency model that helps to provide the low latency and low overhead for multiple simultaneous clients that is essential for good server performance.

• Recent extensions to support exceptions [6] provide useful facilities for coping with runtime errors.

• Asynchronous exceptions [3] allow concise implementations of important features such as timeouts, as we shall see in Section 4. Asynchronous exceptions are also useful for allowing the server to respond to external stimulus, such as when the administrator wishes to make changes to the server's configuration (see Section 6).

• We used a wide range of libraries in constructing the web server, including a networking library, a parsing combinator library, an HTML generation library and a POSIX system interface library. None of these libraries are specified as part of the Haskell standard as yet, but all are distributed with the GHC compiler.

This paper describes our web server implementation, focussing on the parts which required extensions to Haskell, in particular the concurrency and exception support. The server implementation performs well, as we shall show in Section 7. It is also reliable: we tested it as an alternate server for the haskell.org site, where it ran for a week, collected about 2000 hits during that time, and kept within a memory footprint of 3M. This test uncovered one bug in the HTTP protocol implementation, which has since been fixed. We hope to replace the main haskell.org web server (currently Apache) with the Haskell implementation at some point in the future.

2 Concurrency Model

The choice of concurrency model is crucial in the design of a web server. In this section we briefly examine the common models in use, list their advantages and disadvantages, and compare them with Concurrent Haskell's approach.

2.1 Separate Processes

This is the model used by Apache, where each new connection is handled by a new process on the machine. A single top-level process monitors incoming connections and spawns a new worker process to talk to each client.

Advantages:

• Simple to implement.

• Takes advantage of multiple processors automatically.

Disadvantages:

• Very heavy-weight, in terms of the startup and shutdown cost, context-switch overhead, and memory overhead for each process. (Apache maintains a cache of spare processes in order to reduce the start-up time for new connections.)

• Interprocess communication is hard. Interprocess communication is required for certain aspects of a web server's operation. For example, there is normally a single log file which records transaction events (see Section 5), and hence multiple threads must cooperate for access to the log file.

2.2 Operating-System Threads

Operating system threads may map onto processes, in which case the overhead will be similar to that of separate processes (modern operating systems will share page tables between forked processes in any case), or they may be implemented as light-weight kernel threads, in which case the overhead will be lower.

Advantages:

• Interprocess communication is easier.

• Takes advantage of multiple processors automatically.

Disadvantages:

• Fairly heavy-weight, although possibly lighter than separate processes.

• Writing multithreaded code is harder and more error-prone than single-threaded code.

2.3 Monolithic Process with I/O multiplexing

Another approach is to implement the desired concurrency directly, foregoing any time-sharing facilities provided by the operating system itself. The single requirement for this approach is to be able to multiplex several I/O channels. The existing methods for multiplexing I/O in a single process include:

• Use POSIX's select() (or equivalently poll()) function. These functions test multiple file descriptors simultaneously, returning information on which of the descriptors are available for reading or writing. The idea is that the application then performs any available reads and writes (using non-blocking I/O), and then returns to call select() on the list of open file descriptors again. This approach suffers from the problem that select() is O(n), where n is the number of file descriptors being tested, because the application must build up a list of length n to pass to select() and the OS must traverse this list to build up the results.

• Asynchronous I/O and POSIX real-time signals. These methods are both ways of alleviating the linear behaviour of select(). They are both relatively new features and only a few operating systems support them. Any implementation which uses select() can be converted to use asynchronous I/O or real-time signals with relatively little effort.

Advantages of these methods:

• Very fast, especially the latter methods.

Disadvantages:

• A web server is an inherently multi-threaded application, so programming without the concurrency abstraction is bound to be painful.

There exist several web servers which use these methods, and they are currently the fastest servers around.

2.4 User-space threads

User-space threads are essentially an implementation of a thread abstraction inside a single process (or possibly on top of a small number of operating system threads; see later). The programmer gets to write his/her application using the concurrency primitives provided by the language, and the user-space threads implementation will provide the low-level time-sharing and I/O multiplexing support. Several implementations of user-space POSIX threads exist for Unix. Concurrent Haskell (or at least the implementation in GHC) is also an instance of this model; the Haskell runtime system runs in a single operating system process and multiplexes many Haskell threads. To support multiple Haskell threads performing I/O simultaneously, the runtime system may choose between the I/O multiplexing options described in the previous section. However this is implemented, the choice is invisible to the programmer. In a way, user-space threads provide the best of both worlds. The concurrency is lightweight, and the programmer doesn’t need to be concerned with the details of I/O multiplexing. There is one disadvantage, though: a user-space threads package won’t normally be able to take advantage of multiple processors on the host machine. The GHC development team are currently working on an implementation of Concurrent Haskell that doesn’t suffer from this deficiency, by using a small number of operating-system threads to share the load.

3 Structure of the web server

3.1 The main loop

The main loop is strikingly simple:

  acceptConnections :: Config -> Socket -> IO ()
  acceptConnections conf sock = do
    (handle, remote) <- accept sock
    forkIO ( (talk conf handle remote `finally` hClose handle)
               `catch` (\e -> logError e) )
    acceptConnections conf sock

acceptConnections takes a server configuration and a listening socket, and waits for new connection requests on the socket. When a connection request is received, a new worker thread is forked with forkIO, and the main loop goes back to waiting for connections.

The worker thread calls talk (the definition of talk is given in the next section), which is the main function for communicating in HTTP with a client. The interesting part here is what happens if an exception is raised during talk. The finally combinator allows strict sequencing to be specified, independent of exceptions:

  finally :: IO a -> IO b -> IO a

This combinator behaves much like finally in Java. It performs its first argument, then performs its second argument (even if the first argument raised an exception), then returns the value of the first argument (or re-raises the exception). In the main loop above, we're using finally to ensure that the socket to the client is properly closed down if we encounter an error of any kind, including a bug in our code. Although the runtime system will automatically close files which are determined to be unused, it is beneficial to close them down as early as possible in order to free up the resources associated with the file.

The call to talk is also enclosed in a catch combinator (this catch differs from the standard IO.catch in Haskell; here we are referring to Exception.catch from GHC's Exception library):

  catch :: IO a -> (Exception -> IO a) -> IO a

This combinator performs its first argument, and if an exception is raised, passes it to the second argument (the exception handler); otherwise it returns the result. In contrast to finally, catch specifies an action to be performed only when an exception is raised, whereas finally specifies an action which is always to be executed. The above code uses catch to catch any errors and log them to the error log file, which we describe in Section 5.

3.2 HTTP protocol implementation

Serving a request is a simple pipeline:

1. read the request from the socket,
2. parse the request,
3. generate the response,
4. send the response back to the client,
5. if the connection is to be kept alive, return to step 1.

Reading the request from the socket is performed by getRequest:

  getRequest :: Handle -> IO [String]

which takes the file handle representing the socket on which communication with the client is taking place, and returns a list of strings, each one being a single line of the request. A typical request looks something like this:

  GET /index.html HTTP/1.1
  Host: www.haskell.org
  Date: Wed May 31 11:08:40 GMT 2000

The first line gives the command (GET in this case), the name of the object requested, and the version of the HTTP protocol being used by the client. Subsequent lines, termed headers, give additional information, and are mostly optional. The server is required to ignore any headers it doesn't understand.

The next stage is to parse the request into a Request structure:

  data Request = Request {
      reqCmd     :: RequestCmd,
      reqURI     :: ReqURI,
      reqHTTPVer :: HTTPVersion,
      reqHeaders :: [RequestHeader]
    }
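For the example request above, the parser would yield a Request value along these lines (the constructors of RequestCmd and the header representation are not shown in the paper, so these names are purely illustrative):

  Request { reqCmd     = GET
          , reqURI     = AbsPath "/index.html"
          , reqHTTPVer = HTTP_1_1
          , reqHeaders = [ Host "www.haskell.org"
                         , Date "Wed May 31 11:08:40 GMT 2000" ]
          }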

  parseRequest :: Config -> [String]
               -> Either Response Request

If the request cannot be parsed, parseRequest returns a Left Response: this indicates failure, and the response will in most cases be a “Bad Request” response, but may be something more specific.

The next stage generates a response:

  genResponse :: Config -> Request -> IO Response

A failure response contains generated HTML describing the error in the body (the HereItIs body type is for this purpose). In the common case where the response body consists of an entire file verbatim, the respBody component of the Response structure doesn't contain the entire file body as a string; rather, it contains just the path to the file. This is so that we can use more efficient methods for sending the file to the client than simply converting the contents to and from a String.

The final step is to send the response to the client:

  sendResponse :: Config -> Handle -> Response -> IO ()

Pulling all this together, the top-level talk function looks something like this:

  talk :: Config -> Handle -> HostAddress -> IO ()
  talk conf handle haddr = do
    strs <- getRequest handle
    case parseRequest conf strs of
      Left resp ->
        sendResponse conf handle resp
      Right req -> do
        resp <- genResponse conf req
        sendResponse conf handle resp

4 Timeouts

To avoid being tied up by slow or bogus clients, the server needs a general timeout combinator:

  timeout :: Int  -- timeout in seconds
          -> IO a -- action to run
          -> IO a -- action to run on timeout
          -> IO a

The application timeout t a b should behave as follows: a is run until it either completes, or t seconds pass. If it completes in time t, timeout returns the result immediately; otherwise a is terminated with an exception and b is executed. If a (or b, in the case of a timeout) raises an exception, then the exception will be propagated by timeout. The timeout function has no other side effects, so timeouts can be nested arbitrarily.

Thanks to asynchronous exceptions [3], we can implement a timeout combinator with the above properties. Note that because the action a can be terminated with an asynchronous exception at any time, it should be exception safe; that is, it must be sure not to leave any mutable data structures in an inconsistent state or leak any resources. In fact, all our code should be written to be exception-safe, because exceptions like stack overflow and heap overflow are delivered asynchronously. There are two primitives which are useful in writing exception-safe code:

  block   :: IO a -> IO a
  unblock :: IO a -> IO a

where block a executes a with asynchronous exceptions blocked; that is, any thread wishing to raise an asynchronous exception in the current thread must wait until exceptions are unblocked again. Similarly, unblock a unblocks asynchronous exceptions during the execution of a. Applications of block and unblock can be arbitrarily nested. Here's an example of acquiring a lock, where the lock is represented by an MVar m, such that the lock will always be released safely if we receive an exception:

  block (do a <- takeMVar m
            unblock (...) `catch` (\e -> do putMVar m a; throw e)
            putMVar m a
        )

Use of combinators such as finally (described in the previous section) and bracket:

  bracket :: IO a -> (a -> IO b) -> (a -> IO c) -> IO c

are also helpful in writing exception-safe code. For example, a simpler way of writing the above locking sequence is

  bracket (takeMVar m) (putMVar m) (...)

The full story on asynchronous exceptions in Haskell, including the implementation of the timeout combinator above, can be found in [3].
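The paper's actual implementation is given in [3]; the following is only a rough sketch in terms of today's Control.Exception (which subsumes the block/unblock primitives), using an assumed private Timeout exception type and ignoring the delivery races a production version must mask against:

  import Control.Concurrent (forkIO, killThread, myThreadId, threadDelay)
  import Control.Exception (Exception, bracket, catch, throwTo)

  data Timeout = Timeout deriving Show
  instance Exception Timeout

  timeout :: Int -> IO a -> IO a -> IO a
  timeout t action onTimeout =
    (bracket
       (do me <- myThreadId
           -- watchdog thread: after t seconds, interrupt the caller
           forkIO (do threadDelay (t * 1000000)
                      throwTo me Timeout))
       killThread          -- always stop the watchdog, however we exit
       (\_ -> action))
    `catch` (\Timeout -> onTimeout)

Because the watchdog is killed on every exit path and only the Timeout exception is caught, other exceptions from the action (or the handler) propagate, as the specification requires.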

5 Logging

A web server normally produces log files listing all the requests made and certain information about the responses sent by the server. Each entry in the log normally records:

• the time the request was received,
• the requestor's address,
• the URL requested,
• the response code (either success or some kind of failure),
• the number of bytes transferred,
• the time taken for the request to complete,
• the client software's type & version,
• the referring URL.

The format of log entries is configurable, and may include other fields from the request or response. There exist standard log entry formats produced by the popular servers, and software is available which processes the log files to produce reports. For this reason, we decided that the Haskell web server should be able to produce compatible logs.

A worker thread causes a log entry to be recorded by calling the function logAccess:

  logAccess :: Request
            -> Response
            -> HostAddress
            -> TimeDiff
            -> IO ()

passing the request, the response, the address of the client, and the time difference between the request being received and completion of the response.

The actual writing of log entries to the file is done by a separate thread. Worker threads communicate with the logging thread via a global bounded channel, by calling logAccess. The logging thread removes items from the channel and manufactures log entries, which are then written to a log file. Placing the logging in a separate thread is a good idea for several reasons:

• It helps reduce the total load on the system, because the worker threads can finish, and hence be garbage collected, before the log entry has been written.

• It means that a thread serving multiple requests can proceed immediately with the next request, without waiting for the log entry of the first request to be written.

• The logging thread can batch multiple requests and write them out in one go, which is likely to be more efficient than writing them one at a time.

The logging thread is designed to be fault tolerant: if it receives an exception of any kind, it attempts to restart itself by re-opening the log file for writing and continuing with the next log request. This behaviour is (ab)used by the main loop, which needs to restart the logging thread whenever it receives a request to re-read the log file.

5.1 Error logging

Logging of server errors is handled in a similar way to logging of requests. A separate thread writes log entries to an error log file, taking log requests from a global channel. Exception handlers are scattered around the main request/response handling code, which catch exceptions and log them with an informative message indicating where the error occurred, before passing the exception on to be handled at the top level. The error logging thread also restarts itself if it receives an exception (and logs this event to the log file).

5.2 Global variables

In the above description of the logging threads, we mentioned that communication between a worker thread and a logging thread was via a “global channel”. How does one define a global channel in Haskell? This is one instance of global mutable variables, which we make use of in several places in the web server. A global MVar, for example, can be declared as:

  global_mvar :: MVar String
  global_mvar = unsafePerformIO newEmptyMVar

While global variables are very convenient in certain cases, there are a few points to bear in mind:

• This use of unsafePerformIO really is unsafe, because the program now behaves differently if occurrences of global_mvar are replaced with their values, namely (unsafePerformIO newEmptyMVar). In order to stop this happening, we have to circumvent any optimisations in the compiler which replace names with their values. In GHC, this amounts to adding the pragma

    {-# NoInline global_mvar #-}

  somewhere in the source code.

• Care must be taken to give a type signature for the global variable, and not to declare global mutable variables with polymorphically-typed contents. Type safety is in danger if this rule is broken, because the contents of a polymorphically-typed variable could be extracted and used at any type.

• In a concurrent program, it is important to use MVars instead of just IORefs when multiple threads may have access to the variable (an IORef is a plain mutable variable, whereas an MVar adds synchronisation).

If we observe these rules, however, global mutable variables are a useful concept. We use global mutable variables in the web server in the following places:

• To store the command line options. The program is only started once, so this is a write-once mutable variable.

• To store the channels by which the worker threads can communicate with the logging threads.

• To store the ThreadIds of the logging threads, so that the main thread can send them an exception to restart them.

A cleaner, but less efficient, alternative to using global variables would be to use implicit parameters [2]. We haven't investigated this route as yet.
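A minimal sketch of the global-channel pattern described above (logChan, enqueueLog and loggingThread are illustrative names; the paper's channel is bounded and carries structured log records rather than plain strings):

  import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
  import System.IO (IOMode (AppendMode), hFlush, hPutStrLn, openFile)
  import System.IO.Unsafe (unsafePerformIO)

  -- the global channel, subject to all the caveats listed above
  {-# NOINLINE logChan #-}
  logChan :: Chan String
  logChan = unsafePerformIO newChan

  -- called by worker threads: enqueue and return immediately
  enqueueLog :: String -> IO ()
  enqueueLog = writeChan logChan

  -- the logging thread owns the file handle and drains the channel forever
  loggingThread :: FilePath -> IO ()
  loggingThread path = do
    h <- openFile path AppendMode
    let loop = do entry <- readChan logChan
                  hPutStrLn h entry
                  hFlush h
                  loop
    loop

The logging thread would be started once at server startup with forkIO (loggingThread path), wrapped in the restart-on-exception behaviour described above.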

6 Run-time configuration

Our web server is configured by editing a text file, in a similar way to other popular web servers. The syntax of the configuration file is similar to Apache's. When the server starts up, it parses the configuration file and, if no errors are found, immediately starts serving requests.

In the interests of high availability, a web server should preferably also be run-time configurable. For example, when new content is placed on the server, the administrator somehow needs to inform the server that the new content is available and where to find it. It is occasionally necessary to change certain options, or tweak security settings, on a running server. To take the server down and restart it with the new configuration would be unsatisfactory, because the site would be off-line during the restart. So a running server should be able to re-read its configuration file without interrupting operations.

But what about transactions that are already in progress? Should they see the new configuration immediately? In our server, we take the approach that the new configuration should only take effect for new connections, and existing connections should be allowed to continue using the old settings. This approach avoids a number of problems with changing configuration settings while a request is in progress: for example, if the security settings are changed such that a file being transmitted is no longer available to the client that requested it, should the transfer be terminated? If that behaviour is desired, a sledgehammer solution is to restart the server altogether when any security settings change.

Implementing run-time configuration updates in the Haskell web server turned out to be straightforward: as we've already seen, the key functions in the inner pipeline all take an argument of type Config, which contains the current configuration. The configuration is passed into the worker thread when it is created, so when the configuration changes, all we need to do is ensure that any new threads receive the new configuration.

The approach we took is to send the main thread an exception when it should re-read the configuration file. This gives us the option of having several ways to force a configuration change:

• A signal on Unix-like operating systems. This is the traditional way to kick a process into re-reading its configuration file, and consists of sending the process a signal from the command line. This method is implemented in our web server as follows: the incoming signal causes a new thread to start, which immediately sends an exception to the main thread. The main thread catches the exception and re-reads the configuration file (a sketch of this restart loop follows the list).

• Implementing a proprietary HTTP command, which the administrator can use to reconfigure the server on-line and remotely. Secure authorisation would certainly be needed if this method were to be used.

• Any other type of inter-process communication provided by the host operating system.
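As a minimal sketch of that restart loop, using today's Control.Exception API rather than the exception library the paper used, and with a hypothetical Config type, parseConfig parser, and acceptLoop function:

import Control.Exception (Exception, handle)

data Config = Config                   -- stand-in for the real Config type
data Reconfigure = Reconfigure deriving Show
instance Exception Reconfigure

parseConfig :: FilePath -> IO Config   -- hypothetical configuration parser
parseConfig _ = return Config

acceptLoop :: Config -> IO ()          -- hypothetical accept/serve loop
acceptLoop _ = return ()

-- Serve requests until a Reconfigure exception arrives, then re-read
-- the configuration file and start again with the new settings.
mainLoop :: FilePath -> IO ()
mainLoop confFile = do
  conf <- parseConfig confFile
  handle (\Reconfigure -> mainLoop confFile) (acceptLoop conf)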

7

Performance Results

In this section we present our preliminary performance results for the Haskell web server.

7.1

Performance tweaks

We made several tweaks to the initial implementation of the server to remove some of the larger performance bottlenecks.

• We replaced the naive file transfer code, which used getContents and hPutStr, with a version which does I/O directly to and from an array of bytes. GHC's IOExts library provides simple primitives for doing this.

• By default, GHC's scheduler context switches about 5000 times a second. We reduced this to something more reasonable, 50 times/sec, which made a substantial difference to the results. The reason is that GHC's scheduler currently does a select on every context switch to determine which I/O bound threads can be woken up. As discussed in Section 2, select is O(n), so reducing the number of times we do it is a win when the system becomes heavily loaded.

• Tweaking the garbage collection settings had some effect, in particular increasing the allocation area size. GHC by default increases its heap usage in line with the program's demands, but giving the program more memory from the outset is usually a win, and was in this case.

• Reading a request turned out to be expensive, due to an inefficient implementation of hGetLine in GHC's I/O library. Rewriting this function improved performance by 10% or so.

• GHC's I/O library uses a system of finalizers to ensure that the buffers and other resources associated with a file descriptor are freed when the program releases the file handle. Finalizers are normally run in a thread by themselves, but this turned out to be expensive for the web server, since most connections give rise to two finalizers: one for the handle to the socket itself, and one for the file being transferred. We changed the finalisation mechanism to batch finalizers in a single thread after each garbage collection, which led to a small overall performance improvement.

Note that only one of these tweaks, namely the optimisation of the file transfer code, was made to the web server code itself; the rest were tweaks to GHC's runtime system and libraries. Indeed, the web server has been a useful source of insight into performance bottlenecks in GHC's concurrency and I/O support.
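(In a modern GHC, the scheduler and garbage-collector tweaks above need no code changes at all: assuming the server binary is called hws (a hypothetical name), it can be started as hws +RTS -C0.02 -A16m -RTS, where -C sets the minimum time between context switches and -A sets the initial allocation area size.)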

Figure 1: Connection latency results (reply rate (1/s) plotted against request rate (1/s), for request rates between 300 and 900 per second).

7.2

Connection latency

These measurements were made using httperf [4], a tool which can be used for generating requests at a specific rate. It is used primarily for determining the rate of connection requests a web server can sustain before performance starts to drop off.

In this test, the server machine was a single-processor PII/450 running Linux 2.2. The client was a separate machine on a local 100Mbit ethernet connection. The total number of requests sent was 4000 in each test. All the requests were for the same 1k file. The timeout on the client was set at 1 second.

A graph of reply rate against requests issued per second is given in Figure 1. This shows clearly how the server keeps up with the client until the request rate rises above the rate that the server can handle without accumulating a backlog (about 710 requests/second), at which point performance begins to decrease sharply. Why does performance decrease so dramatically? Two possible factors are:

• As connections in progress accumulate on the server, the O(n) behaviour of select() as used by GHC's scheduler comes into play.

• As the number of threads in the system increases, the cost of garbage collection also increases. Garbage collection is necessarily O(n) in the number of live threads, since it must traverse the active thread queues to determine which threads are live.

The server doesn’t currently limit the number of of connections in progress, and in fact at a rate of 850 connections/sec the number of concurrent connections observed on the server was over 700 during the test. Setting a limit on the number of concurrent connections would help to flatten the graph after the drop-off point. On the same hardware, Apache (the most commonly used web server software) tops out at 950 requests/second, and the drop off is less sharp. One reason for the shallower drop off is that Apache limits the number of active connections to 256 by default, with any incoming connections over the limit being simply refused by the operating system. To put these figures into perspective, the most heavily loaded web servers on the net (eg. http://www.yahoo.com/) take an average of about 5000 hits/second, with peaks of probably 10000 hits/second. These sites use collections of identically configured servers with a load-balancing arrangement to spread the requests between the available machines. However, for most sites on the net the performance turned in by our Haskell Web Server is more than adequate, and there’s still plenty of opportunities for improvement: we haven’t really made any attempt to optimise the code of the server itself, beyond fixing the slow file transfer.

8

Conclusions

The primary result presented here is that we constructed a web server in Haskell which conforms to the HTTP/1.1 standard (and more) in less than 1500 lines of Haskell (not including library code), and the resulting server performs admirably in real-world conditions. Furthermore, it is fault-tolerant and runs in a constant, and small, amount of memory over a long period of time.

In order to achieve this, we had to make use of a number of extensions to Haskell, the main ones being concurrency and exceptions. We also made use of a large amount of library code, all of which is part of GHC's library collection. The libraries we used include a networking library, a parsing combinator library, an HTML generation library and a POSIX interface library.

References

[1] R. Fielding et al. Hypertext transfer protocol - HTTP/1.1, June 1999. RFC 2616.

[2] J. Lewis, M. Shields, E. Meijer, and J. Launchbury. Implicit parameters: Dynamic scoping with static types. In 27th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL'00), Boston, Massachusetts, January 19-21, 2000.

[3] S. Marlow, S. P. Jones, and A. Moran. Asynchronous exceptions in Haskell. In 4th International Workshop on High-Level Concurrent Languages, Montreal, Canada, September 2000.

[4] D. Mosberger and T. Jin. httperf - a tool for measuring web server performance. http://www.hpl.hp.com/personal/David_Mosberger/httperf.html. Hewlett-Packard Research Labs.

[5] S. L. Peyton Jones, A. Gordon, and S. Finne. Concurrent Haskell. In Proc. POPL'96, pages 295-308, St. Petersburg, Florida, Jan. 1996. ACM Press.

[6] S. L. Peyton Jones, A. Reid, T. Hoare, S. Marlow, and F. Henderson. A semantics for imprecise exceptions. In Proc. PLDI'99, volume 34(5) of ACM SIGPLAN Notices, pages 25-36. ACM Press, May 1999.

Haskell Server Pages Functional Programming and the Battle for the Middle Tier Erik Meijer & Danny van Velzen Utrecht University mailto:[email protected] mailto:[email protected] http://www.cs.uu.nl/~erik

Abstract

Haskell Server Pages (HSP) provide an easy way to create dynamic web pages and simplify the task of building middle tier components. This article gives an overview of HSP from a programmer's perspective. It includes examples of HSP in action and gives a precise description of translating HSP scripts into plain Haskell.

1

Overview

Over the next few years, virtually every major application will be completely web-based, or at a minimum have a web front end. Such distributed applications are often structured using a three-tier model, which consists of:

• User-Services Tier: The front-end client that communicates with the user through some user interface, eg textual, graphical, telephone- or voice-driven.

• Business-Services Tier: A collection of software components that encode business logic, and process the information flow between the user- and data-services tiers.

• Data-Services Tier: The back-end database server that stores business data.

From an application programmer's point of view all the action is in encoding the business logic and providing the glue between the user-services tier and the data-services tier.

The business-services or middle tier is usually built from software components that live in a special runtime environment (such as COM+, EJB, or Corba Component Services) [29] that provides services for:

• Distributed transaction management.

• Security services to control object creation and use.

• Management and pooling of processes and threads, object instances, database connections, and other resources.

Internet companies such as eBay, Amazon, Barnes and Noble, and AOL must have their business up and running 24 hours a day, 365 days a year. For instance, when the eBay site went down for 48 hours recently, they lost over five million dollars.

Viewed in this light it is remarkable that middle tier components are written using weakly typed languages such as Visual Basic [28], JavaScript [4] or Perl [13]. Such languages provide no guarantees that programs don't crash or give incorrect answers. Server redundancy, process protection etc. are just band-aid measures that don't solve the real problem.

Our research program is targeted at providing languages, tools and libraries for fast and safe construction of middle tier components. To communicate between the middle tier and the data access layer we have developed HaskellDB [15]. To make effective use of the middle tier runtime we have added support for accessing classic COM [14] and Java [22] to Haskell, and we are currently building a new scripting language called Mondrian [25] and a new back end for GHC that compiles to Java and the new Microsoft .NET runtime.

To communicate between the presentation layer and the middle tier we have implemented HaskellScript [23] for writing client-side DHTML scripts and HaskellCGI [21], a library for writing server-side CGI scripts. Experience with the CGI library, however, showed that embedding HTML or XML fragments using combinators in Haskell is rather awkward. The reason is that documents often contain more plain text than markup elements. Writing long string literals in Haskell using quotes and literal gaps1 is not the most convenient way to denote string content in XML fragments.

1 A little known feature of the Haskell language, formed by two backslashes enclosing white space. Literal gaps allow long strings to be spread across different lines.

In this paper we present Haskell Server Pages (HSP) as a new way to generate dynamic content using Haskell. The most important contributions of HSP are that

• XML fragments are first class values: XML fragments have the same status as ordinary Haskell expressions, so we can put them in lists, pass them as arguments to and return them as results from functions, etc. This allows many common document processing patterns to be captured in small highly reusable libraries [11].

• XML fragments can be pattern-matched: Constructing XML fragments is just one side of the coin. In many cases we need to transform one XML document into another. Being able to do this in a higher-order, polymorphic functional language is much more convenient than

using separate languages such as XSL [35] or frameworks like the Document Object Model (DOM) [6].

• PCDATA fragments need not be quoted: Since we write XML fragments using concrete XML syntax, PCDATA strings need not be quoted. This simplifies the construction of large documents.

• Haskell values can be recursively embedded inside XML fragments: We can escape from the XML level back to Haskell and embed any Haskell value that is an instance of the class IsChild (or, for attributes, IsValue). Since IO is an instance of these classes, it is even possible to embed commands inside documents.

• HSP is easily translated into pure Haskell: As we show in this paper, HSP can be implemented as a simple preprocessor to Haskell. The amount of syntactic sugar is comparable to that of the do-notation or list-comprehensions, and is defined by just a handful of rules.

A popular route to implementing domain-specific languages is by embedding into Haskell [10], and HSP is no exception. However, a necessary condition for this to work is that all the features of the domain-specific language can be mapped to a corresponding feature of Haskell. In the case of HSP, Haskell falls short in two areas [24, 31]:

• Proper pattern-matching on XML fragments requires the use of a Glushkov automaton to direct the matching of regular expressions. Because we are piggybacking on Haskell's more primitive pattern-matching mechanisms, we are forced to introduce some over-specification in certain places.

• Guaranteeing validity of XML documents against their DTDs requires a sophisticated type system that supports type indexed rows. HSP does guarantee well-formedness of XML fragments, but unfortunately the type system needed for validating XML cannot be simulated in Haskell.

Even well-formedness and simple pattern-matching on documents alone is already a much stronger property than most other competing languages can offer, but there are many occasions where we do want to match and type arbitrary XML. For this we are developing a new language XMLambda [24] which has XML documents as its basic data types. In the meantime, HSP remains an excellent example of the Pareto principle2; we get 80 percent of the functionality of XMLambda for 20 percent of the cost.

In the rest of this paper we will first give an overview of the current state of affairs in generating documents in the middle tier (section 2) and explain why they fall short. Next we give an informal introduction to HSP (section 3) and show in detail how HSP scripts can be translated into pure Haskell (section 4). We end the paper by describing how HSP applications are integrated into the middle tier runtime environment (section 5).

2 Dr. Joseph Juran (of total quality management fame) formulated the Pareto Principle after expanding on the work of Wilfredo Pareto, a nineteenth century economist and sociologist. It is a shorthand name for the phenomenon that in any population which contributes to a common effect, a relative few of the contributors account for the bulk of the effect.

2

Web-based applications

In most 3-tier applications the work of the middle tier is usually split recursively into three tiers again. Client applications run inside a browser and submit requests to a web server using HTTP. The 'presentation layer' on the server transforms the request and passes it on to the 'business layer' which will perform some computation by interacting with the 'data layer'. The results from the 'business layer' are then transformed into HTML or XML by the 'presentation layer' and returned as the response to the client.

2.1

Server Pages

The most popular way of generating HTML (or XML) responses in the middle tier is by using so called server pages. A server page is a special HTML page that contains embedded scripts. Server pages exist for many languages, such as JSP (Java Server Pages) [26], PSP (Python Server Pages) [1], PHP (PHP: Hypertext Preprocessor) [17], XSP (Extensible Server Pages) [20], and ASP (Active Server Pages) [9]. The latter is quite interesting as it is a language independent framework. Any scripting language that supports the ActiveX Scripting Architecture can be used to write ASP scripts [16, 3]. Here is the inevitable “Hello World” example written as a server page in Visual Basic. The Response.Write "Hello World!" statement writes the string "Hello World" on the Response output stream that is sent to the client:

<html>
<head>
<title>Example</title>
</head>
<body>
<% Response.Write "Hello World!" %>
</body>
</html>

Sometimes <%= e %> is used as an abbreviation for <% Response.Write e %>. The choice of primitive functions and the tags that delimit code differ for the various flavors of server pages. For example PHP uses echo instead of Response.Write, and <?php and ?> as tags, but also supports ASP-style <% and %>. Server pages are typically implemented using a process called page compilation, which transforms a hybrid HTML/script page into a single script that will generate the required HTML. For instance, the example above would be translated into a script that writes each line of the original server page to the Response stream:

Response.Write "<html>"
Response.Write "<head>"
Response.Write "<title>Example</title>"
Response.Write "</head>"
Response.Write "<body>"
Response.Write "Hello World!"
Response.Write "</body>"
Response.Write "</html>"

It is important to note that the translation is not applied recursively inside code fragments.

This simple semantics can make server pages rather confusing, especially if control structures are used to generate HTML. As the next example shows, code fragments embedded inside <% and %> need not be syntactically complete phrases themselves:

<html>
<head>
<title>Example</title>
</head>
<body>
<% If Domain = "nl" Then %>
Hallo Wereld!
<% Else %>
Hello World!
<% End If %>
</body>
</html>

Server pages force you to think backwards from the fully explicit generated code to some level of HTML mixed with script to make sure that the generated code is syntactically correct:

Response.Write "<html>"
Response.Write "<head>"
Response.Write "<title>Example</title>"
Response.Write "</head>"
Response.Write "<body>"
If Domain = "nl" Then
Response.Write "Hallo Wereld!"
Else
Response.Write "Hello World!"
End If
Response.Write "</body>"
Response.Write "</html>"

2.2

Server Pages break the principle of Abstraction

The semantics of server pages is not compositional, that is, to understand what a server page means, we need to do a complete page translation and inspect the resulting code. Naive server pages provide just a very thin layer of veneer that hides a few calls to Response.Write, but they do not make HTML fragments into first class citizens. Hence they break Tennent’s principle of abstraction [32, 27] that says that values of a syntactically relevant domain can be given a name. An abstraction mechanism that does not satisfy this basic principle is rather useless! As an example server page that shows the lack of abstraction, we will generate an HTML table of 16x16 cells, where the background color of each cell is determined by the cell’s coordinates via the (unspecified) function GenColor. The ASP page contains two nested loops, the outer one generates the table rows and the inner generates the cells:









<table border="1">
<% For y = 1 To 16 %>
<tr>
<% For x = 1 To 16 %>
<td bgcolor=<%= GenColor(x,y) %>>(<%= x %>,<%= y %>)</td>
<% Next %>
</tr>
<% Next %>
</table>

Note that it is possible to use script inside attributes as well. Suppose we want to lift out the inner loop into a separate procedure GenData to make the page more orderly.
Note that it is possible to use script inside attributes as well. Suppose we want to lift out the inner loop into a separate procedure GenData to make the page more orderly.





<table border="1">
<% For y = 1 To 16 %>
<tr>
<% GenData(y) %>
</tr>
<% Next %>
</table>

To do so, we cannot merely copy and paste the inner loop into the body of the GenData procedure. We have to keep in mind that we are inside an ASP page and enclose the procedure header and footer inside <% %> braces.

<% Sub GenData(y) %>
<% For x = 1 To 16 %>
<td bgcolor=<%= GenColor(x,y) %>>(<%= x %>,<%= y %>)</td>
<% Next %>
<% End Sub %>

The downside of this is that we cannot take this fragment out of the ASP page and reuse it inside an ordinary Visual Basic module. Inside Visual Basic, the only way to produce HTML is by using Response.Write to write the HTML as a string. Hence to truly abstract the inner loop, we must first manually apply the page compilation algorithm (in Visual Basic & denotes string concatenation), and turn things inside out to write HTML strings to the Response object. Instead of code embedded inside HTML, we now have unstructured strings that represent HTML embedded inside code:

Sub GenData(y)
  For x = 1 To 16
    Response.Write "<td bgcolor=" & GenColor(x,y) & ">"
    Response.Write "(" & x & "," & y & ")</td>"
  Next
End Sub

The weakness of ASP and almost all other naive server page frameworks based on page translation is that they don’t really mix HTML and script; they don’t make HTML into a first class citizen. As a result it becomes hard to use the procedural abstraction mechanisms of the scripting language to structure a page. Before we can do so, we first need to translate the server page into its underlying sequence of Response.Write statements and then use procedural abstraction to name that sequence of statements.

3

Informal introduction to Haskell Server Pages

The main idea of Haskell Server Pages is to treat HTML (or in fact XML) fragments as ordinary expressions. Inside HTML fragments, strings are not escaped, so we need to escape code fragments instead. For this we reuse the <% %> tags. Inside embedded expressions we can of course again introduce HTML fragments, and inside those we can again embed normal expressions by escaping them inside <% %>, etc. etc. The meaning of HSP scripts is compositional, and thus easy to understand. Just replace the expression between the <% %> brackets by its value. Attribute values need to be surrounded by double quotes already, hence we can allow any atomic expression to occur in an attribute value position without the need for an escape mechanism. The following HSP code generates the same 16x16 table as the ASP code we have seen above (the types TABLE, TR and TD are all synonyms for Element):

table :: TABLE
table =
  <table border="1">
    <% mkRows cells %>
  </table>

cells :: [[(Int,Int)]]
cells = [ [ (x,y) | x <- [1..16] ] | y <- [1..16] ]

mkRows :: [[(Int,Int)]] -> [TR]
mkRows = map $ \cs ->
  <tr><% mkColumns cs %></tr>

mkColumns :: [(Int,Int)] -> [TD]
mkColumns = map $ \c ->
  <td bgcolor=(genColor c)><% c %></td>

In this example, and in many of the examples that follow, we use right-associative function application $ to eliminate parentheses. HSP is also implemented using page translation. Instead of translating a page into a sequence of imperative statements, HTML fragments are translated into Haskell expressions using the 'smart' constructor mkElement :: Name -> Attributes -> [Child] -> Element and the overloaded mkChild :: IsChild a => a -> Child and mkValue :: IsValue a => a -> Value functions (a detailed description of the translation is given in section 4):

table = mkElement "TABLE" [("border", Value "1")] $
  [mkChild (mkRows cells)]

mkRows = map $ \cs ->
  mkElement "TR" [] $ [mkChild $ mkColumns cs]

mkColumns = map $ \c ->
  mkElement "TD" [("bgcolor", mkValue $ genColor c)] $ [mkChild c]

By using HSP, programmers can write concrete HTML in their programs. This shields them from the irrelevant details of using some (arbitrary) encoding of HTML in Haskell, and, most importantly, string literals need not be escaped. At the same time, they have all the power of Haskell's abstraction mechanisms to define and structure functions over such HTML fragments.

3.1

Embedded commands

So far we have only dealt with embedded expressions. What about embedded side-effecting commands? Note that since in Haskell side-effecting computations are just normal values in the IO monad, we need not leave the expression world in order to create them, ie embedded commands are already paid for. The following example generates an HTML list that records the current date (assuming the existence of the three side-effecting functions getDay :: IO WeekDay, getMonth :: IO Month and getYear :: IO Year): today =
  <ul>
    <li>Day: <% getDay %></li>
    <li>Month: <% getMonth %></li>
    <li>Year: <% getYear %></li>
  </ul>
When translated into Haskell, the mkChild function leaves embedded commands untouched (in this particular case mkChild getDay evaluates to ChildIO (fmap (Leaf . show) getDay)):

today = mkElement "UL" []
  [ ChildElement $ mkElement "LI" [] [ Leaf "Day: ", mkChild getDay ]
  , ChildElement $ mkElement "LI" [] [ Leaf "Month: ", mkChild getMonth ]
  , ChildElement $ mkElement "LI" [] [ Leaf "Year: ", mkChild getYear ]
  ]

The function run :: Element -> IO Element executes the commands embedded in a document in depth first order.
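The paper does not give run's definition; the following is a sketch consistent with the representation of section 4.2.1 (execution of ValueIO attribute commands is omitted for brevity):

-- Execute every embedded command, depth first, yielding a command-free
-- document tree.
run :: Element -> IO Element
run (Node name attrs children) =
  fmap (Node name attrs) (mapM runChild children)
  where
    runChild :: Child -> IO Child
    runChild (ChildIO io)     = io >>= runChild     -- run the command
    runChild (ChildElement e) = fmap ChildElement (run e)
    runChild (ChildList cs)   = fmap ChildList (mapM runChild cs)
    runChild leaf             = return leaf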

3.2

Pattern-matching

Being able to construct HTML fragments using concrete syntax is just one side of the coin. Often we also want to do pattern-matching on HTML or XML fragments using concrete syntax. As an example of using pattern-matching in HSP, we will develop a simple e-mail component that accepts e-mail messages in some XML format and translates them into the standard RFC822 format [12, 19]. Our program will translate the following XML message

<MSG>
  <FROM>[email protected]</FROM>
  <RCPT>
    <TO>[email protected]</TO>
    <TO>[email protected]</TO>
  </RCPT>
  <SUBJECT>HSP</SUBJECT>
  <BODY>
    <P>HSP is cool, isn't it?</P>
    <P>It sure beats ASP!</P>
  </BODY>
</MSG>

into the ASCII Internet message

From: [email protected]
To: [email protected],[email protected]
Subject: HSP

HSP is cool, isn't it?
It sure beats ASP!

The toRFC822 function does a match on the top-level structure to get to the individual fields of which the message is composed (the types MSG and RFC822 are synonyms for Element respectively RFC822, but express the intent that their values are valid MSG trees respectively valid RFC822 messages):

toRFC822 :: MSG -> RFC822
toRFC822 = \
  <MSG>
    <FROM><% from %></FROM>
    <RCPT><% rcpts %></RCPT>
    <SUBJECT><% subject %></SUBJECT>
    <BODY><% paras %></BODY>
  </MSG> ->
    concat [ "From: " ++ from
           , crlf
           , "To: " ++ intersperse "," (stripTOs rcpts)
           , crlf
           , "Subject: " ++ subject
           , crlf
           , crlf
           , intersperse crlf (stripPs paras)
           , crlf
           ]

To build the To header and the message body, function toRFC822 uses two functions stripTOs and stripPs that remove the <TO> respectively the <P> tags:

stripTOs :: [TO] -> [String]
stripTOs = map $ \<TO><% to %></TO> -> to

stripPs :: [P] -> [String]
stripPs = map $ \<P><% p %></P> -> p

HSP will match a single embedded child pattern either to the pattern [Leaf p] of a singleton list of a Leaf child, or to p itself. All other child patterns are translated into a finite list. Hence the pattern that appears in the toRFC822 function is translated by the HSP preprocessor into the following normal Haskell pattern:

Node "MSG" []
  [ ChildElement (Node "FROM"    [] [Leaf from])
  , ChildElement (Node "RCPT"    [] rcpts)
  , ChildElement (Node "SUBJECT" [] [Leaf subject])
  , ChildElement (Node "BODY"    [] paras)
  ]

This convention might perhaps look odd at first sight, but in practice it turns out to be very convenient to circumvent the inability to express the regular expression matching that is needed to do 'real' pattern-matching on XML documents.

4

Embedding and Translation

Now that we have seen a number of examples of using HSP, it is time to get down to the nitty gritty details of the HSP implementation. 4.1

Syntax

Syntactically, HSP simply adds XML fragments as possible atomic Haskell expressions and patterns.

aexp → qvar
     | gcon
     | literal
     | ( exp )
     | ...
     | xml

xml → <e attrs> child1 ... childn </e>
    | <e attrs/>

Inside XML fragments, strings (PCDATA in XML lingo) need not be enclosed inside double quotes, but Haskell code needs to be escaped using <% %> brackets.

child → PCDATA
      | xml
      | <% exp %>

Attributes are simple name = value pairs, where the value can be any atomic Haskell expression.

attrs → name = aexp ... name = aexp

As usual, pattern syntax is a close copy of the expression syntax. In our case we extend Haskell’s apat production with a new alternative for XML patterns xpat . The productions for the nonterminal xpat are the same as those for xml but replacing exp everywhere by pat and aexp by apat .

4.2

Semantics

4.2.1

Representation

Our current implementation assumes a universal document representation for XML fragments that closely follows the concrete grammar given above:

type Name = String

data Element = Node Name Attributes [Child]

data Child = Leaf String
           | ChildElement Element
           | ChildList [Child]
           | ChildIO (IO Child)

The only surprises are perhaps the ChildList and ChildIO constructors. The ChildIO constructor encapsulates an embedded command that, when executed, will produce a child document. The ChildList constructor is necessary because embedded child-expressions might return a list of child elements themselves, as in the following simple example:

<p>the <% adjectives %> fox</p>

In the translation from HSP to plain Haskell we will use a 'smart' constructor for Element that flattens its children to remove ChildList. Since HSP attributes can be any Haskell expression, we also introduce a ValueIO constructor to encapsulate an embedded command, that when executed will return a Value.

type Attributes = [(Name,Value)]

data Value = Value String
           | ValueIO (IO Value)

Note that nothing prevents us from using one of the more strongly typed representations that have been proposed for representing HTML or XML documents [34, 8, 33] in Haskell. As we have argued before, the Haskell type system is fundamentally too weak to truly embed the XML type system. We have chosen the simplest possible solution over more complicated, but nonetheless partial ones.
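The paper leaves the definition of the flattening helper open; a minimal sketch of a flatten :: [Child] -> [Child] consistent with this contract (only Leaf, ChildElement, and ChildIO children survive):

flatten :: [Child] -> [Child]
flatten = concatMap f
  where
    f (ChildList cs) = flatten cs   -- splice nested lists in place
    f c              = [c]          -- all other children pass through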

4.2.2

Translation

The semantics of HSP is defined by a handful of simple translation functions that map HSP scripts into pure Haskell. The types of the translation functions show the source and target grammar productions. For the target, we also indicate the required type of the resulting phrase.

Function mkElement is the 'smart' constructor we promised above that flattens its children to a list of Element, Leaf, and ChildIO children by calling the function flatten :: [Child] -> [Child]:

mkElement :: Name -> Attributes -> [Child] -> Element
mkElement = \tag -> \attributes -> \children ->
  Node tag attributes (flatten children)

Function X[[ ]] translates XML fragments into applications of the mkElement function on the recursive translations of the attributes and children:

X[[ ]] ∈ xml → aexp :: Element
X[[<e as> c1 ... cn </e>]] = mkElement "e" A[[as]] [C[[c1]], ..., C[[cn]]]

The translation function A[[ ]] for attributes translates each value using V[[ ]]:

A[[ ]] ∈ attrs → aexp :: Attributes
A[[n1 = e1 ... nn = en]] = [("n1", V[[e1]]), ..., ("nn", V[[en]])]

The translation function for attribute values V[[ ]] inserts the overloaded function mkValue when the value is not a string literal:

V[[ ]] ∈ string → aexp :: Value
V[["s"]] = Value "s"

V[[ ]] ∈ aexp → aexp :: Value
V[[e]] = mkValue e

Function C[[ ]] translates a PCDATA child into a corresponding Haskell string:

C[[ ]] ∈ PCDATA → expr :: Child
C[[s]] = Leaf "s"

If we were really pedantic we should also define how XML entities (such as &gt;) that can occur in PCDATA strings are translated into Haskell. A normal XML child is translated recursively and wrapped as a ChildElement:

C[[ ]] ∈ xml → expr :: Child
C[[x]] = ChildElement X[[x]]

An embedded Haskell expression wraps a mkChild around the translation of the expression so that it has the required Child type:

C[[ ]] ∈ <% exp %> → expr :: Child
C[[<% e %>]] = mkChild E[[e]]

The E[[ ]] translation function removes all nested XML fragments in an embedded Haskell expression:

E[[ ]] ∈ expr → expr
E[[e[x]]] = e[X[[x]]]

The overloaded function mkChild of the IsChild type class transforms values into children of XML elements.

class IsChild a where { mkChild :: a -> Child }

Assuming overlapping instances, we provide standard instances for any type that is an instance of Show in conjunction with those for Char, String, [] and IO:

instance Show a => IsChild a where
  { mkChild a = Leaf (show a) }

instance IsChild Char where
  { mkChild c = (Leaf . dropQuotes . show) c }

instance IsChild String where
  { mkChild s = (Leaf . dropQuotes . show) s }

instance IsChild a => IsChild [a] where
  { mkChild as = ChildList (map mkChild as) }

instance IsChild a => IsChild (IO a) where
  { mkChild ma = ChildIO (fmap mkChild ma) }

The instances of class IsValue are similar to those of the IsChild class, so we won't repeat them here.

class IsValue a where { mkValue :: a -> Value }

Having seen the translation of expressions, the translation of patterns should pose no problems. The only thing to note is the special cases when an element has exactly one embedded Haskell child: depending on the form of the embedded pattern, it is matched either as a singleton Leaf child or as the entire child list.

XP[[ ]] ∈ xpat → apat :: Element
XP[[<e as> <% p %> </e>]] = Node "e" AP[[as]] [Leaf P[[p]]]  -- p matches a single Leaf child
XP[[<e as> <% p %> </e>]] = Node "e" AP[[as]] P[[p]]         -- p matches the whole child list
XP[[<e as> c1 ... cn </e>]] = Node "e" AP[[as]] [CP[[c1]], ..., CP[[cn]]]

Patterns occurring in child position are straightforward as well. Leaf and element children introduce their respective constructors:

CP[[ ]] ∈ PCDATA → pat :: Child
CP[[s]] = Leaf "s"

CP[[ ]] ∈ xpat → pat :: Child
CP[[x]] = ChildElement XP[[x]]

Embedded Haskell patterns are recursively translated using P[[ ]]:

CP[[ ]] ∈ <% pat %> → pat :: Child
CP[[<% p %>]] = P[[p]]

P[[ ]] ∈ pat → pat
P[[p[x]]] = p[XP[[x]]]

The translation of attribute patterns uses an auxiliary translation function VP[[ ]] for values:

AP[[ ]] ∈ pattrs → apat :: Attributes
AP[[n1 = p1 ... nn = pn]] = ("n1", VP[[p1]]) : ... : ("nn", VP[[pn]]) : _

Function VP[[ ]] takes care of the special case of a literal string pattern occurring as an attribute value:

VP[[ ]] ∈ string → apat :: Value
VP[["s"]] = Value "s"

VP[[ ]] ∈ apat → apat :: Value
VP[[p]] = p

5

The HSP runtime

To use HSP to write dynamic web pages, we need to provide interaction with a web server and the middle tier runtime environment. To make HSP really useful we must not tie ourselves too closely to a particular web server. The HSP runtime provides a number of components (Request, Response, Application, and Session) that assist in handling HTTP requests and responses, and in maintaining (session and application) state across different invocations of an HSP application (a set of HSP scripts). Our choice is influenced by the ASP built-in objects, but similar objects are available in one form or another in most other server-side scripting frameworks (Java Servlets, CGI, Apache) [5].

We pass the HSP runtime components as implicit arguments [18] to scripts. In this way we do not need to explicitly pass object references, but nevertheless they show up in the type of the functions that use them. We don't provide public access to the real objects, so we can enforce constraints on the use of these objects. The programmer simply cannot bind the implicit arguments to a value (except for undefined, but that is the price we have to pay for having full recursion and partial functions).

In section 5.1, we describe several events that the HSP runtime will fire and how to handle these in HSP. For instance when the first page in an application is about to be served, the HSP runtime will first call the event handler onApplicationStart :: (?a :: Application) => IO (). When this event occurs, only the Application object is available, which is precisely captured in the handler's type.

Request

The Request object contains the information passed from the client to the server through an HTTP request. Function getParameter gives access to the query string, while getServerVariable gives access to the environment variables set by the server for this request.

getParameter :: (?q :: Request, Read a) => String -> [a]
getServerVariable :: (?q :: Request) => String -> String

For example we can make the call getServerVariable "HTTP_USER_AGENT" to learn which browser has sent the request. And if the client request was .../example.hsp?language=haskell&version=98 then in the script example.hsp we could ask getParameter "version" to read the value of the version parameter in the query string.

Application

The Application object stores information that is accessible by the application as a whole, ie that is shared by all sessions. Access to the Application object is automatically synchronized.

getApplication :: (?a :: Application, Read a) => String -> IO a
setApplication :: (?a :: Application, Show a) => String -> a -> IO ()

A typical use of the application object is to count the total number of times a page has been visited (see the example in section 5.2).

Session

The Session object stores information for each individual user of the application. A Session object is created whenever a new user enters the application, and is destroyed when the server terminates the whole application, the session times out (in a web application you can never know when or whether the user will return), or is explicitly abandoned by the script. The functions getSession and setSession implement variables with session-wide scope and lifetime. A typical use for the Session object is to store user preferences (eg with/without frames) or shopping carts in e-commerce applications.

getSession :: (?s :: Session, Read a) => String -> IO a
setSession :: (?s :: Session, Show a) => String -> a -> IO ()

The sessionID function returns the unique ID that the server has generated for this session. The server usually stores this information in a cookie, so things will break if the client does not accept cookies.

sessionID :: (?s :: Session) => String

If the user does not request another page within TimeOut minutes, the session is terminated by the server. Scripts can explicitly ask to end the current session (after the script has terminated of course) by calling the abandon method.

setTimeOut :: (?s :: Session) => Int -> IO ()
getTimeOut :: (?s :: Session) => IO Int
abandon    :: (?s :: Session) => IO ()
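As a small illustration of these operations, a user preference can be stored per session; the key "frames" and both helper names below are made up for the example:

-- Remember whether this user wants the framed version of the site.
setFramesPref :: (?s :: Session) => Bool -> IO ()
setFramesPref b = setSession "frames" b

-- Read the preference back in a later request of the same session.
getFramesPref :: (?s :: Session) => IO Bool
getFramesPref = getSession "frames"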

5.1

HSP Application

A HSP application consists of a virtual directory and all its subdirectories. The root-directory should contain a file Global.hsp that contains definitions for four callback functions.

The onApplicationStart function is called before the first new session is created, ie before the first call to onSessionStart. At that moment, only the Application object is available. The type of onApplicationStart enforces this constraint.

onApplicationStart :: (?a :: Application) => IO ()

When the application quits, the HSP runtime invokes the onApplicationEnd (after the last call to onSessionEnd). At that moment, only the Application object is available; again, the type of onApplicationEnd enforces this constraint.

onApplicationEnd :: (?a :: Application) => IO ()

When the server creates a new session, it first invokes the onSessionStart function before serving the page. As the type of onSessionStart shows, all the built-in objects are available at that time.

onSessionStart :: ( ?a :: Application
                  , ?s :: Session
                  , ?q :: Request ) => IO ()

When a session is abandoned or times out, the runtime calls the onSessionEnd function. As the type shows, only the Application and Session objects are available at that time.

onSessionEnd :: ( ?a :: Application
                , ?s :: Session ) => IO ()

5.2

A complete HSP application

As an example of a complete HSP application, we will program a page that maintains two page counters. One counter counts the total number of visitors that have accessed the page since the application was started; the other counter counts the number of times an individual user of the application has accessed the page. The two counters are initialized in the onApplicationStart and onSessionStart functions in the Global.hsp file.

module Global ( onSessionStart
              , onSessionEnd
              , onApplicationStart
              , onApplicationEnd ) where

onSessionStart     = do{ setSession "Count" 0 }
onApplicationStart = do{ setApplication "Count" 0 }

The functions onApplicationEnd and onSessionEnd don't do anything interesting so we have omitted their definitions. The page uses two helper functions countPage and countSession, that are in turn defined using a helper function count:

count = \getter -> \setter ->
  do{ n <- getter
    ; setter (n+1)
    ; return (n+1)
    }

countPage :: (?a :: Application) => IO Int
countPage = count (getApplication "Count") (setApplication "Count")

countSession :: (?s :: Session) => IO Int
countSession = count (getSession "Count") (setSession "Count")

The code for the page itself uses a small embedded script to display the page counter as a sequence of gifs for each digit.

page :: ( ?a :: Application , ?s :: Session ) => Element
page =
  <html>
    <head><title>Counter</title></head>
    <body>
      This page has been accessed <% countPage %> times;
      you have visited it <% countSession %> times.
    </body>
  </html>

page :: Application -> Session -> Request -> Response -> IO ()
page = \a -> \s -> \q -> \r ->
  ... in do{ p <- ... }

module EdisonPrelude where

data Maybe2 a b = Just2 a b | Nothing2

class Eq a => Hash a where
  hash :: a -> Int
  -- forall x,y :: a. (x == y) implies (hash x == hash y)

class Hash a => UniqueHash a
  -- no new methods, just a stronger invariant
  -- forall x,y :: a. (x == y) iff (hash x == hash y)

class UniqueHash a => ReversibleHash a where
  unhash :: Int -> a
  -- forall x :: a. unhash (hash x) == x

Figure 1: The EdisonPrelude module.

module BreadthFirst where

import EdisonPrelude
import qualified SimpleQueue as Q

data Tree a = Empty | Node a (Tree a) (Tree a)

breadthFirst :: Tree a -> [a]
breadthFirst t = bfs (Q.single t)
  where bfs q = case Q.lview q of
                  Just2 (Node x l r) q' -> x : bfs (Q.snoc (Q.snoc q' l) r)
                  Just2 Empty q'        -> bfs q'
                  Nothing2              -> []

Figure 2: Sample program using Edison.

2

General Organization

Each family of abstractions is implemented as a class hierarchy and each data structure is implemented as a Haskell module. For the operations in each class and module, I have attempted to choose names that are as standard as possible. This means that operations for different abstractions frequently share the same name (empty, null, size, etc.). It also means that in many cases I have reused names from the Prelude. Therefore, Edison modules should nearly always be imported qualified. The one Edison module that is typically imported unqualified is the EdisonPrelude, shown in Figure 1, which defines a few utility types in the Maybe family used by every other Edison module, as well as a few classes related to hashing. When importing Edison modules, I recommend renaming each module using the as keyword. See, for example, the sample program in Figure 2, where the imported module SimpleQueue has been renamed locally as Q. This both reduces the overhead of qualified names and makes substituting one module for another as painless as possible. If I wanted to replace SimpleQueue with a fancier implementation such as HoodMelvilleQueue, I could do so merely by modifying the import line. Such substitutions are further facilitated by the convention

that related data structures should use the same type name. For example, most implementations of sequences define a type constructor named Seq. The sample program in Figure 2 also illustrates another important point about Edison—although each abstraction is defined in terms of type classes, all the operations on each data structure are also available directly from the data structure's module. If we wanted to access methods such as single through the type class instead, we could change the line

import qualified SimpleQueue as Q

to

import qualified Sequence as Q    -- import class
import SimpleQueue (Seq)          -- import instance

and then use a type annotation somewhere inside the breadthFirst function to indicate that the intermediate queues are of type Seq a. Note that, because I am selectively importing only the type constructor Seq from SimpleQueue, I do not bother importing it qualified.

3

Sequences

The sequence abstraction is usually viewed as a hierarchy of ADTs including lists, queues, deques, catenable lists, etc. However, such a hierarchy is based on efficiency rather than functionality. For example, a list supports all the operations that a deque supports, even though some of the operations may be inefficient. Hence, in Edison, all sequence data structures are defined as instances of the single Sequence class:

class (Functor s, MonadPlus s) => Sequence s

As expressed by the context, all sequences are also instances of Functor, Monad, and MonadPlus. In addition, all sequences are expected to be instances of Eq and Show, although this is not enforceable in Haskell.1 Figure 3 summarizes all the methods on sequences.

Sequences are currently the most populated abstraction in Edison. There are eight basic implementations of sequences: ordinary lists, join lists, simple queues [1], banker's queues [9], random-access stacks [6], random-access lists [7], Braun trees [4, 8], and binary random-access lists [9], plus two sequence adaptors, which are representations of sequences parameterized by other representations of sequences. One adds an explicit size field to an existing implementation of sequences and the other reverses the orientation of an existing implementation of sequences so that adding an element to the left of the sequence actually adds the element to the right of the underlying sequence.

4

Collections

The collection abstraction includes sets, bags, and priority queues (heaps). Collections are defined in Edison as a set of eight classes, organized in the hierarchy shown in Figure 4. These classes make essential use of multi-parameter type classes, as in [11]. All collections assume at least an equality relation on elements, and many also assume an ordering relation. The use of multi-parameter type classes allows any particular instance to assume further properties as necessary (such as hashability). The hierarchy contains a root class, CollX, together with seven subclasses satisfying one or more of three orthogonal sub-properties:

• Uniqueness. Each element in the collection is unique (i.e., no two elements in the collection are equal). These subclasses, indicated by the name Set, represent sets rather than bags.

• Ordering. The elements have a total ordering and it is possible to process the elements in non-decreasing order. These subclasses, indicated by the Ord prefix, typically represent either priority queues (heaps) or sets/bags implemented as binary search trees.

• Observability. An observable collection is one in which it is possible to view the elements in the collection. The X suffix indicates lack of observability. This property is discussed in greater detail below.

1 Enforcing this condition would require being able to write constraints like ∀a. Eq a => Eq (s a) inside class contexts.

Sequence Methods

Constructors: empty, single, cons, snoc, append, fromList, copy, tabulate
Destructors: lview, lhead, ltail, rview, rhead, rtail
Observers: null, size, toList
Concat and reverse: concat, reverse, reverseOnto
Maps and folds: map, concatMap, foldr, foldl, foldr1, foldl1, reducer, reducel, reduce1
Subsequences: take, drop, splitAt, subseq
Predicate-based operations: filter, partition, takeWhile, dropWhile, splitWhile
Index-based operations: inBounds, lookup, lookupM, lookupWithDefault, update, adjust, mapWithIndex, foldrWithIndex, foldlWithIndex
Zips and unzips: zip, zip3, zipWith, zipWith3, unzip, unzip3, unzipWith, unzipWith3

Figure 3: Summary of methods for the Sequence class.

[Figure 4: The collection class hierarchy, with typical methods for each class. The root class CollX c a (assuming Eq a) has subclasses OrdCollX c a (additionally assuming Ord a), SetX c a, and Coll c a; these combine pairwise into OrdColl c a, OrdSetX c a, and Set c a, and all three combine into OrdSet c a. Typical methods per class: CollX: empty, insert, union, delete, null, size, member, count; OrdCollX: deleteMin, unsafeInsertMin, filterLT; SetX: intersect, difference, subset, subsetEq; Coll: toSeq, lookup, fold, filter; OrdColl: minElem, foldr, foldl, toOrdSeq; Set: insertWith, unionWith, intersectWith; OrdSetX and OrdSet: no new methods.]

Collection Methods

Constructors:
  CollX: empty, single, insert, insertSeq, union, unionSeq, fromSeq
  OrdCollX: unsafeInsertMin, unsafeInsertMax, unsafeFromOrdSeq, unsafeAppend
  Set: insertWith, insertSeqWith, unionl, unionr, unionWith, unionSeqWith, fromSeqWith
Destructors:
  OrdColl: minView, minElem, maxView, maxElem
Deletions:
  CollX: delete, deleteAll, deleteSeq
  OrdCollX: deleteMin, deleteMax
Observers:
  CollX: null, size, member, count
  Coll: lookup, lookupM, lookupAll, lookupWithDefault, toSeq
  OrdColl: toOrdSeq
Filters and partitions:
  OrdCollX: filterLT, filterLE, filterGT, filterGE, partitionLT_GE, partitionLE_GT, partitionLT_GT
  Coll: filter, partition
Set operations:
  SetX: intersect, difference, subset, subsetEq
  Set: intersectWith
Folds:
  Coll: fold, fold1
  OrdColl: foldr, foldl, foldr1, foldl1

Figure 5: Summary of methods for the collection classes.

Figure 5 summarizes all the methods on collections. Note that neither OrdSetX nor OrdSet add any new methods, which is why there is no explicit dependency between these classes in the hierarchy. These classes serve as mere abbreviations for the combinations of OrdCollX/SetX and OrdColl/Set, respectively.

As with sequences, the hierarchy of collections is determined by functionality rather than efficiency. For example, the member function is included in the root class of the hierarchy even though it is inefficient for many implementations, such as heaps.

Because collections encompass a wide range of abstractions, there is no single name that is suitable for all collection type constructors. However, most modules implementing collections will define a type constructor named either Bag, Set, or Heap. Edison currently supports one implementation of sets (unbalanced binary search trees), four implementations of heaps (leftist heaps [5], skew heaps [13], splay heaps [9], and lazy pairing heaps [9]), and one heap adaptor

that maintains the minimum element of a heap separate from the rest of the heap. This heap adaptor is particularly useful in conjunction with splay heaps.

4.1

Observability

Note that the equality relation defined by the Eq class is not necessarily true equality. Very often it is merely an equivalence relation, where equivalent values may be distinguishable by other means. For example, we might consider two binary search trees to be equal if they contain the same elements, even if their shapes are different. Because of this phenomenon, implementations of observable collections (i.e., collections where it is possible to inspect the elements) are rather constrained. Such an implementation must retain the actual elements that were inserted. For example, it is not possible in general to represent an observable bag as a finite map from elements to counts, because even if we know that a given bag contains, say, three elements from some equivalence class, we do not necessarily know which three. On the other hand, implementations of non-observable collections have much greater freedom to choose abstract representations of each equivalence class. For example, representing a bag as a finite map from elements to counts works fine if we never need to know which representatives from an equivalence class are actually present. As another example, consider the UniqueHash class defined in the Edison Prelude. If we know that the hash function yields a unique integer for each equivalence class, then we can represent a collection of hashable elements simply as a collection of integers. With such a representation, we can still do many useful things like testing for membership—we just can’t support functions like fold or filter that require the elements themselves, rather than the hashed values.2

4.2

Unsafe Operations

Ordered collections support a number of operations with names like unsafeInsertMin and unsafeFromOrdSeq. These are important special cases with preconditions that are too expensive to check at runtime. For example, unsafeFromOrdSeq converts a sorted sequence of elements into a collection. In contrast to fromSeq, which converts an unsorted sequence into a collection, unsafeFromOrdSeq can be implemented particularly efficiently for data structures like binary search trees. The behavior of these operations is undefined if the preconditions are not satisfied, so the unsafe prefix is intended to remind the programmer that these operations are accompanied by a proof obligation. The one place where I have violated this convention is in the Set class, where there is a whole family of operations with names like insertWith and unionWith. These functions take a combining function that is used to resolve collisions. For example, when inserting an element into a set that already contains that element, the combining function is called on the new and old elements to determine which element will remain in the new set.3 The combining functions typically return one element or the other, but they can also combine the elements in non-trivial ways. These combining functions are required to satisfy the precondition that, given two equal elements, they return a third element that is equal to the other two.
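For example, a combining function that always keeps the old element turns insertWith into an insert-if-absent; a sketch, assuming Edison's Set class and an insertWith of type (a -> a -> a) -> a -> c a -> c a, with the new element passed first:

-- keepOld satisfies the stated precondition: given two equal elements
-- it returns one of them, which is equal to both.
keepOld :: a -> a -> a
keepOld _new old = old

insertIfAbsent :: Set c a => a -> c a -> c a
insertIfAbsent = insertWith keepOld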

5

Associative Collections

The associative-collection abstraction includes finite maps, finite relations, and priority queues where the priority is distinct from the element. Associative collections are defined in Edison as a set of eight classes, organized in the hierarchy shown in Figure 6. Notice that this hierarchy mirrors the hierarchy for collections, but with the addition of Functor as a superclass of every associative collection. Like collections, associative collections depend heavily on multi-parameter type classes.

2 In fact, we can even support fold and filter if the hashing function is reversible, but this is relatively uncommon.

3 Such a combining function is useful only when nominally equal elements are distinguishable in other ways—that is, when the "equality" relation is really an equivalence relation. However, this is extremely common.

[Figure 6: The associative-collection class hierarchy, with typical methods for each class. The root class AssocX m k (assuming Eq k and Functor (m k)) has subclasses OrdAssocX m k (additionally assuming Ord k), FiniteMapX m k, and Assoc m k; these combine pairwise into OrdAssoc m k, OrdFiniteMapX m k, and FiniteMap m k, and all three combine into OrdFiniteMap m k. Typical methods per class: AssocX: empty, insert, union, delete, null, size, lookup, map, fold, filter; OrdAssocX: minElem, deleteMin, unsafeInsertMin, foldr, foldl, filterLT; FiniteMapX: insertWith, unionWith, intersectWith, difference, subset; Assoc: toSeq, mapWithKey, foldWithKey, filterWithKey; OrdAssoc: minElemWithKey, foldrWithKey, toOrdSeq; FiniteMap: unionWithKey, intersectWithKey; OrdFiniteMapX and OrdFiniteMap: no new methods.]

The operations on associative collections are similar to the operations on collections. The differences arise from having a separate key and element, rather than just an element. One significant implication of this separation is that many of the methods move up in the hierarchy, because elements are always observable for associative collections even though keys may not be. Figure 7 summarizes all the methods on associative collections. Edison currently supports two implementations of finite maps (association lists and Patricia trees [10]). Because collections and associative collections are so similar, it is tempting to merge them into one class hierarchy, either by defining collections to be associative collections whose elements are of the unit type or by defining associative collections to be collections whose elements are pairs of type Association k a, where the ordering on associations is inherited from the keys only. For example, Peyton Jones [11] follows this latter approach. Edison rejects both approaches, however, because both carry unacceptable performance penalties. The former requires extra space for the unnecessary unit values and the latter injects at least one extra level of indirection into every key access. The implementor is free to define any particular implementation of a data structure in one of these ways, trading a small performance penalty for reduced development costs, but it would be wrong for the design of the library to mandate that every implementation of an abstraction pay these penalties.
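The pairs encoding that Edison rejects is easy to state; a sketch of the Association type it would require, with equality and ordering inherited from the keys only:

data Association k a = Association k a

instance Eq k => Eq (Association k a) where
  Association k1 _ == Association k2 _ = k1 == k2

instance Ord k => Ord (Association k a) where
  compare (Association k1 _) (Association k2 _) = compare k1 k2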

6

Testing

Each abstraction in Edison has an associated test suite implemented under QuickCheck [2]. To support both this testing and any testing of applications built on top of Edison, every Edison data structure is defined to be an instance of the Arbitrary class. This class is used by QuickCheck to generate random versions of each data structure, which are then passed to the routines that check the desired invariants, such as

cons x xs == append (single x) xs

The QuickCheck test suite is a relatively new addition to Edison. Compared to the old test suite, I estimate that the QuickCheck test suite took less than 25% of the effort to develop, and provides much better coverage as well! I highly recommend using QuickCheck in any application with a relatively well-understood specification.
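For instance, that invariant can be stated as a QuickCheck property over a concrete sequence module; a sketch, assuming the Arbitrary instances just described and comparing sequences via toList:

import Test.QuickCheck (quickCheck)
import qualified SimpleQueue as Q

-- The cons/append invariant from the text, checked on SimpleQueue.
prop_cons :: Int -> Q.Seq Int -> Bool
prop_cons x xs =
  Q.toList (Q.cons x xs) == Q.toList (Q.append (Q.single x) xs)

main :: IO ()
main = quickCheck prop_cons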

7

Commentary

There are many places where the design of Haskell has influenced the design of Edison in non-obvious ways. In addition, there are several places where Edison runs up against limits in the design of Haskell.

7.1

Fixity

Because qualified infix symbols are fairly ugly, Edison avoids infix symbols as much as possible. For example, the sequence catenation function is named append instead of ++.

7.2

Error handling

Because Haskell has no good way to recover from errors, Edison avoids signalling errors if there is any reasonable alternative. For many functions, it is easy to avoid an error by returning the Maybe type (or something similar), but sometimes, as with the head function on lists and the corresponding lhead function on sequences, this approach is just too painful. For lhead of an empty sequence, there really is no choice but to signal an error, but other times there is a reasonable alternative. For example, Edison defines both ltail of the empty sequence and take of a negative argument to return the empty sequence even though the corresponding Prelude functions would signal errors in both cases.

Associative-Collection Methods

Constructors:
  AssocX: empty, single, insert, insertSeq, union, unionSeq, fromSeq
  OrdAssocX: unsafeInsertMin, unsafeInsertMax, unsafeFromOrdSeq, unsafeAppend
  FiniteMapX: insertWith, insertWithKey, insertSeqWith, insertSeqWithKey, unionl, unionr, unionWith, unionSeqWith, fromSeqWith, fromSeqWithKey
  FiniteMap: unionWithKey, unionSeqWithKey
Destructors:
  OrdAssocX: minView, minElem, maxView, maxElem
  OrdAssoc: minViewWithKey, minElemWithKey, maxViewWithKey, maxElemWithKey
Deletions:
  AssocX: delete, deleteAll, deleteSeq
  OrdAssocX: deleteMin, deleteMax
Observers:
  AssocX: null, size, member, count, lookup, lookupM, lookupAll, lookupWithDefault, elements
  Assoc: toSeq, keys
  OrdAssoc: toOrdSeq
Modifiers:
  AssocX: adjust, adjustAll
Maps and folds:
  AssocX: map, fold, fold1
  OrdAssocX: foldr, foldl, foldr1, foldl1
  Assoc: mapWithKey, foldWithKey
  OrdAssoc: foldrWithKey, foldlWithKey
Filters and partitions:
  AssocX: filter, partition
  OrdAssocX: filterLT, filterLE, filterGT, filterGE, partitionLT_GE, partitionLE_GT, partitionLT_GT
  Assoc: filterWithKey, partitionWithKey
Set-like operations:
  FiniteMapX: intersectWith, difference, subset, subsetEq
  FiniteMap: intersectWithKey

Figure 7: Summary of methods for the associative-collection classes.

7.3

Map

It may be surprising that the collection hierarchy does not include a map method. In fact, Edison includes a utility function

map :: (Coll cin a, CollX cout b) => (a -> b) -> (cin a -> cout b)
map f xs = fold (insert . f) empty xs

but this function is not a method, so there is no hope of substituting something more efficient for a particular implementation of collections. But how could this operation be implemented more efficiently? For example, it is tempting to implement map on a binary search tree by the usual map function for trees. However, besides limiting map to the special case where cin and cout are identical, this implementation is incorrect. There is no guarantee that f preserves the ordering of elements, so the result would not in general be a valid binary search tree. Many Edison data structures can and do support a function unsafeMapMonotonic that assumes that f preserves ordering, leaving this fact as a proof obligation for the user, but this function is not general enough to deserve to be a method.
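To see why the obvious tree map is unsafe, consider a bare binary search tree (a hypothetical standalone type, not Edison's internals). The traversal below is exactly the usual tree map; it is correct only under the extra assumption that f preserves ordering, which is precisely the proof obligation unsafeMapMonotonic shifts to the user:

data BST a = Leaf | Node (BST a) a (BST a)

-- Rebuilds the tree shape unchanged; if f is not monotonic, the result
-- is no longer a valid binary search tree.
unsafeMapMonotonicBST :: (a -> b) -> BST a -> BST b
unsafeMapMonotonicBST _ Leaf         = Leaf
unsafeMapMonotonicBST f (Node l x r) =
  Node (unsafeMapMonotonicBST f l) (f x) (unsafeMapMonotonicBST f r)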

7.4

Defaults

Haskell supports default implementations of methods, but Edison makes almost no use of this language feature. The difficulty is that there is very often more than one implementation that could play this role. For example, consider the insertSeq method for inserting a sequence of elements into a collection. There are at least two equally good "default" implementations of this method: the first inserts each element of the sequence into the collection, and the second converts the sequence into a collection and then unions this new collection with the old one. Arbitrarily designating one of these implementations as the default would simply lead to performance bugs in which the implementor forgets to override the default method, thinking that the other implementation has been chosen as the default. The solution in Edison is to provide, for each family of abstractions, a separate module containing all these myriad default implementations, with names like insertSeqUsingFoldr and insertSeqUsingUnion. Then, each data structure module contains a set of definitions of the form

insertSeq = insertSeqUsingFoldr

for those methods for which a default implementation is appropriate.
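A sketch of the two candidate defaults, written on lists standing in for the collection and sequence abstractions (insertL, unionL, and fromSeqL below are stand-ins, not Edison names):

-- insert each element of the sequence into the collection
insertSeqUsingFoldr :: [a] -> [a] -> [a]
insertSeqUsingFoldr xs c = foldr insertL c xs

-- build a collection from the sequence, then union it with the old one
insertSeqUsingUnion :: [a] -> [a] -> [a]
insertSeqUsingUnion xs c = unionL (fromSeqL xs) c

insertL  :: a -> [a] -> [a]
insertL  = (:)

unionL   :: [a] -> [a] -> [a]
unionL   = (++)

fromSeqL :: [a] -> [a]
fromSeqL = id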

7.5

Limitations on Contexts

Haskell’s restrictions on the form of type contexts occasionally prove too restrictive. For example, the root of the associative-collection class hierarchy is defined as

class (Eq k, Functor (m k)) => AssocX m k

but the (m k) in the Functor context is not allowed — at least, not in Haskell 98. An unsatisfying workaround is to simply delete the Functor part of the context and add a map method to AssocX. Similarly, it would be useful to be able to define collections based on hashing, as in

newtype HashColl c a = H (c Int)

instance (UniqueHash a, CollX c Int) => CollX (HashColl c) a where
  single = single . hash
  ...

but the Int in the CollX c Int context is not allowed. For further discussion of Haskell’s limitations on contexts, see [12].
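A minimal sketch of the first workaround (the method name mapA is hypothetical, chosen here to avoid clashing with the Prelude's map):

class Eq k => AssocX m k where
  emptyA :: m k a
  mapA   :: (a -> b) -> m k a -> m k b  -- replaces the Functor (m k) superclass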

8

Final Words

Haskell programmers, indeed functional programmers in general, too often reach for lists when an ADT would be more appropriate. Without Edison or some similar library, I fear this trend will continue indefinitely. A library like Edison will only be successful if it is embraced by the community. I welcome community involvement at every level from design to implementation. I am especially eager for user feedback, and I repeat my earlier invitation for anybody to submit new implementations of the Edison abstractions.

Acknowledgements Thanks to Simon Peyton Jones for many discussions about the design of Edison. Thanks also to Ralf Hinze and Sigbjorn Finne, who have each contributed to the Edison infrastructure. Finally, thanks to Koen Claessen and John Hughes for their wonderful QuickCheck tool.

References

[1] F. Warren Burton. An efficient functional implementation of FIFO queues. Information Processing Letters, 14(5):205–206, July 1982.

[2] Koen Claessen and John Hughes. QuickCheck: a lightweight tool for random testing of Haskell programs. In ACM SIGPLAN International Conference on Functional Programming, September 2000.

[3] William R. Cook. Interfaces and specifications for the Smalltalk-80 collection classes. In Conference on Object-Oriented Programming Systems, Languages, and Applications, pages 1–15, October 1992.

[4] Rob R. Hoogerwoord. A logarithmic implementation of flexible arrays. In Conference on Mathematics of Program Construction, volume 669 of LNCS, pages 191–207. Springer-Verlag, July 1992.

[5] Donald E. Knuth. Searching and Sorting, volume 3 of The Art of Computer Programming. Addison-Wesley, 1973.

[6] Eugene W. Myers. An applicative random-access stack. Information Processing Letters, 17(5):241–248, December 1983.

[7] Chris Okasaki. Purely functional random-access lists. In Conference on Functional Programming Languages and Computer Architecture, pages 86–95, June 1995.

[8] Chris Okasaki. Three algorithms on Braun trees. Journal of Functional Programming, 7(6):661–666, November 1997.

[9] Chris Okasaki. Purely Functional Data Structures. Cambridge University Press, 1998.

[10] Chris Okasaki and Andy Gill. Fast mergeable integer maps. In Workshop on ML, pages 77–86, September 1998.

[11] Simon Peyton Jones. Bulk types with class. In Glasgow Workshop on Functional Programming, July 1996.

[12] Simon Peyton Jones, Mark Jones, and Erik Meijer. Type classes: an exploration of the design space. In Haskell Workshop, June 1997.

[13] Daniel D. K. Sleator and Robert E. Tarjan. Self-adjusting heaps. SIAM Journal on Computing, 15(1):52–69, February 1986.

[14] Alexander Stepanov and Meng Lee. The standard template library. Technical report, Hewlett-Packard, 1995.

Combinator Parsers: From Toys to Tools S. Doaitse Swierstra Department of Computer Science Utrecht University P.O. Box 80.089 3508 TB UTRECHT, the Netherlands

[email protected] ABSTRACT We develop, in a stepwise fashion, a set of parser combinators for constructing deterministic, error-correcting parsers. The only restriction on the grammar is that it is not left recursive. Extensive use is made of lazy evaluation, and the parsers constructed "analyze themselves". Our new combinators may be used to construct large parsers that are suitable for use in practical compilers.

Categories and Subject Descriptors D.1.1 [Programming techniques]: Applicative (functional) programming; D.3.3 [Programming languages]: Language Constructs and Features; D.3.4 [Programming languages]: Processors—parsing, translator writing systems and compiler generators; D.3.4 [Programming languages]: Language Classification—Applicative (functional) languages

General Terms parser combinators, error correction, deterministic, partial evaluation, parser generators, program analysis, lazy evaluation, polymorphism

1. INTRODUCTION There exist many different implementations of the basic parser combinators; some use basic functions [3], whereas others make use of a monadic formulation [4]. Parsers constructed with such conventional parser combinators have two disadvantages: when the grammar gets larger parsing gets slower and when the input is not a sentence of the language they break down. In [7] we presented a set of parser combinators that did not exhibit such shortcomings, provided the grammar had the so-called LL(1) property; this property makes it possible to decide how to proceed during top-down parsing by looking at the next symbol in the input. For many grammars an LL(1) equivalent grammar may be constructed through left factoring, but unfortunately the resulting grammars often bear little resemblance to what the language designer had in mind. Extending such transformed grammars with functions for semantic processing is cumbersome and the elegance offered by combinator-based parsers is lost. To alleviate this problem we set out to extend our previous

combinators in a way that enables the use of longer look-ahead sequences. The new and completely different implementation is both efficient and deals with incorrect input sequences. The only remaining restriction is that the encoded grammar is neither directly nor indirectly left-recursive: something which can easily be circumvented by the use of appropriate chain-combinators; we do not consider this to be a real shortcoming, since usually the use of such combinators expresses the intention of the language designer better than explicit left-recursive formulations. The final implementation has been used in the construction of some large parsers. The cost added for repairing errors is negligible. In Section 2 we recapitulate the conventional parser combinators and investigate where the problems mentioned above arise. In Section 3 we present different basic machinery which adds error correction; the combinators resulting from this are still very short and may be used for small grammars. In Section 4 we show how to extend the combinators with the (demand driven) computation of look-ahead information. In this process we minimize the number of times that a symbol is inspected. Finally we present some further extensions in Section 5, discuss efficiency in Section 6, and draw conclusions in Section 7.

2. CONVENTIONAL PARSER COMBINATORS In Figure 1 we present the basic interface of the combinators together with a straightforward implementation. We will define new implementations and new types, but always in such a way that already constructed parsers can be reused with these new definitions with minimal changes. To keep the presentation as simple as possible, we assume all inputs to be sequences of Symbols. Parsers constructed using these combinators perform a depth-first search through all possible parse trees, and return all ways in which a parse can be found, an idea already found in [1]. Note that we have taken a truly "functional" approach in constructing the result of a sequential composition. Instead of constructing a value of the more complicated type (b, a) out of the two simpler types b and a, we have chosen to construct a value of a simpler type a out of the more complicated types b -> a and b. Based on these basic combinators more complicated combinators can be constructed. For examples of the use of such combinators, and the definition of more complicated combinators, see [3, 5, 6] and the web site for our combinators (see www.cs.uu.nl/groups/ST/Software/UU Parsing).

infixl 3 <|>  -- choice combinator
infixl 4 <*>  -- sequential combinator

type Symbol   = ...
type Input    = [Symbol]
type Parser a = Input -> [(a, Input)]

succeed :: a                            -> Parser a
symbol  :: Symbol                       -> Parser Symbol
(<|>)   :: Parser a -> Parser a         -> Parser a
(<*>)   :: Parser (b -> a) -> Parser b  -> Parser a
parser  :: Parser a -> Input            -> Result a

infixl 3 <$>
-- a derived combinator using the interface
(<$>) :: (b -> a) -> Parser b -> Parser a
f <$> p = succeed f <*> p

-- straightforward implementation
succeed v input = [ (v, input) ]
symbol a (b:bs) = if a == b then [(b, bs)] else []
symbol a []     = []
(p <|> q) input = p input ++ q input
(p <*> q) input = [ (pv qv, rest) | (pv, qinput) <- p input
                                  , (qv, rest  ) <- q qinput ]

Figure 1: The basic combinators

As an example of how to construct parsers using these combinators and of what they return, consider (for Symbol we take Int):

p = (    symbol 3
    <|>  (+) <$> symbol 3 <*> symbol 4
    )

parser p [3, 4, 5]?
> [(3, [4, 5]), (7, [5])]

The main shortcoming of the standard implementation is that when the input cannot be parsed the parser returns an empty list, without any indication about where in the input things are most likely to be wrong. As a consequence the combinators in this form are unusable for any input of significant size. From modern compilers we expect even more than just such an indication: the compiler should correct simple typing mistakes by deleting superfluous closing brackets, inserting missing semicolons etc. Furthermore it should do so while providing proper error messages. A second disadvantage of parsers constructed in this way is that parsing gets slow when productions have many alternatives, since all alternatives are tried sequentially at each branching point, thus causing a large number of symbol comparisons.


This effect becomes worse when a naive user uses the combinators to describe very large grammars, as in:

foldr1 (<|>) (map symbol [1..1000])

Here on average 500 comparisons are needed in order to recognize a symbol. Such parsers may easily be implicitly constructed by the use of more complicated derived combinators, without the user actually noticing. A further source of potential inefficiency is caused by non-determinism. When many alternatives may recognize strings with a common prefix, this prefix will be parsed several times, with usually only one of those alternatives eventually succeeding. So for highly "non-deterministic" grammars the price paid may be high, and even turn out to be exponential in the size of the input. Although it is well known how to construct deterministic automata out of non-deterministic ones, this knowledge is not used in this implementation, nor may it easily be incorporated. We now start our description of a new implementation that solves all of the problems mentioned.

3. ERROR CORRECTION 3.1 Continuation-Based Parsing If we extend the combinators from the previous section to keep track of the farthest point in the input that was reached, the parser returns that value only after backtracking has

been completed. Unfortunately, we have by then lost all context information which might enable us to decide on the proper error correcting steps. So we will start by converting our combinators into a form that allows us to work on all possible alternatives concurrently, thus changing from a depth-first to a breadth-first exploration of the search space. This breadth-first approach might be seen as a way of making many parsers work in parallel, each exploring one of the possible routes to be taken. As a first step we introduce the combinators in Figure 2, which are constructed using a continuation-based style. As we will see this will make it possible to provide information about how the parsing processes are progressing before a complete parse has been constructed. For the time being we ignore the result to be computed, and simply return a boolean value indicating whether the sentence belongs to the language or not. The continuation parameter r represents the rest of the parsing process, which is to be called when the current parser succeeds. It can be seen as encapsulating a stack of unaccounted-for symbols from the right hand sides of partially recognized productions, against which the remaining part of the input is to be matched. We have again defined a function parse that starts the parsing process. Its continuation parameter is the function null, which checks whether the input has indeed been consumed totally when the stack of pending symbols has been depleted.

3.2

Parsing histories

An essential design decision now is not just to return a final result, but to combine this with the parsing history, thus enabling us to trace the parsing steps that led to this result. We consider two different kinds of parsing steps:

Ok steps, that represent the successful recognition of an input symbol

Fail steps, that represent a corrective step during the parsing process; such a step corresponds either to the insertion into or the deletion of a symbol from the input stream

data Steps result = Ok   (Steps result)
                  | Fail (Steps result)
                  | Stop result

getresult :: Steps result -> result
getresult (Ok   l) = getresult l
getresult (Fail l) = getresult l
getresult (Stop v) = v
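For instance (a toy trace, assumed here for illustration), successfully recognising two symbols and then stopping yields:

trace :: Steps Bool
trace = Ok (Ok (Stop True))

result :: Bool
result = getresult trace   -- True, after peeling off the two Ok steps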

For the combination of the result and its parsing history we do not simply take a cartesian product, since this pair can only be constructed after having reached the end of the parsing process and thus having access to the final result. Instead, we introduced a more intricate data type, which allows us to start producing tracing information before parsing has completed. Ideally, one would like to select the result with the fewest Fail steps, i.e., that sequence that corresponds to the one with a minimal editing distance to the original input. Unfortunately this will be a very costly operation, since it implies that at all possible positions in the

input all possible corrective steps have to be taken into consideration. Suppose e.g. that an unmatched then symbol is encountered, and that we want to find the optimal place to insert the missing if symbol. In this case there may be many points where it might be inserted, and many of those points are equivalent with respect to editing distance to some correct input. To prevent a combinatorial explosion we take a greedy approach, giving preference to the parsing with the longest prefix of Ok steps. So we define an ordering between the Steps, based on longest successful prefixes of Ok steps:

best :: Steps rslt -> Steps rslt -> Steps rslt
(Ok   l)   `best` (Ok   r)   = Ok   (l `best` r)
(Fail l)   `best` (Fail r)   = Fail (l `best` r)
l@(Ok _)   `best` (Fail _)   = l
(Fail _)   `best` r@(Ok _)   = r
l@(Stop _) `best` _          = l
_          `best` r@(Stop _) = r

There is an essential observation to be made here: when there is no preference between two sequences based on their first step, we postpone the decision about which of the operands to return, while still returning information about the first step in the selected result.

3.3

Error-correcting steps

Let us now discuss the possible error-correcting steps. We have to take such a step when the next symbol in the input is different from any symbol we expect, or when we expect at least one more symbol and the input is exhausted. We consider two possible correcting steps:

• pretend that the symbol was there anyway, which is equivalent to inserting it in the input stream

• delete the current input symbol, and try again to see whether the expected symbol is present

In both of these cases we report a Fail step. If we add this error recovery to the combinators defined before, we get the code in Figure 3. Note that if any input is left at the end of the parsing process, it is deleted, resulting in a number of failing steps (Fail (Fail (... (Stop True)))). This may seem superfluous, but it is needed to indicate that not all input was consumed. The operator ||, which was used before to find out whether at a branching point at least one of the alternatives finally led to success, has been replaced by the best operator, which selects the "best" result. It is here that the change from a depth-first to a breadth-first approach is made: the function || only returns a result after at least its first operand has been completely evaluated, whereas the function best returns its result in an incremental way. It is the function getresult at the top level that is actually driving the computation, by repeatedly asking for the constructor at the head of the Steps value.
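The effect of this incrementality can be seen in a small assumed example: both traces below start with Ok, so best can emit that Ok immediately, before deciding which operand ultimately wins:

leftSteps, rightSteps :: Steps Bool
leftSteps  = Ok (Fail (Stop True))
rightSteps = Ok (Ok   (Stop True))

chosen :: Steps Bool
chosen = leftSteps `best` rightSteps
-- the head constructor Ok is available at once; the Fail-versus-Ok
-- comparison is only made when the next step is demanded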

3.4

Computing semantic results

The combinators as just defined are quite useless, because the added error correction makes the parser always return True. We now have to add two more things:

1. the computation of a result, as done in the original combinators

2. the generation of error messages, indicating what corrective steps were taken.

Both these components can be handled by accumulating the results computed thus far in extra arguments to the parsing functions.

type Result a = Bool
type Parser   = (Input -> Bool) -> (Input -> Bool)

succeed  = \ r input -> r input
symbol a = \ r input -> case input of
                          (b:bs) -> a == b && r bs
                          []     -> False
(p <*> q) = \ r input -> p (q r) input
(p <|> q) = \ r input -> p r input || q r input

parse p input = p null input -- null checks for end of input

Figure 2: The continuation-based combinators

type Result a = Steps Bool

symbol a = \ r input ->
  case input of
    (b:bs) -> if a == b
              then Ok (r bs)
              else Fail (       r input        {- insert the symbol a -}
                         `best` symbol a r bs  {- delete the symbol b -}
                        )
    []     -> Fail (r input)                   {- insert the symbol a -}

succeed   = \ r input -> r input
(p <|> q) = \ r input -> p r input `best` q r input
(p <*> q) = \ r input -> p (q r) input

parse p = getresult . p (foldr (const Fail) (Stop True))

Figure 3: Error correcting parsers
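As a small assumed usage example of the Figure 3 combinators (with Symbol again taken to be Int): a result is now always produced, and corrective Fail steps show up in the trace for ill-formed input:

p34 = symbol 3 <*> symbol 4

good = parse p34 [3, 4]   -- True; the underlying trace is Ok (Ok (Stop True))
bad  = parse p34 [3, 5]   -- also True, but the trace contains Fail steps
                          -- recording the corrective actions taken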

3.4.1

Computing a result

Top-down parsers maintain two kinds of stacks: • one for keeping track of what still is to be recognized (here represented by the continuation) • one for storing “pending” elements, that is, elements of the right hand side of productions that have been recognized and are waiting to be used in a reduction (which in our case amounts to the application of the first element to the second). Note that our parsers (or grammars if you prefer), although this may not be realized at first sight, are in a normal form in which each right-hand side alternative has length at most 2: each occurrence of a combinator introduces an (anonymous) non-terminal. If the length of a right hand side is

larger than 2, the left-associativity of <*> determines how normalization is defined. So there is an element pending on the stack for each recognized left operand of some parser whose right hand side part has not been recognised yet. We decide to represent the stack of pending elements with a function too, since it may contain elements of very different types. The types of the stack containing the reduced items and of the continuation now become:

type Stack a b       = a -> b
type Future b result = b -> Input -> Steps result

Together this gives us the following new definition of the type Parser:

type Parser a = forall b result
              . Future b result   -- the continuation
             -> Stack a b         -- the stack of pending values
             -> Input
             -> Steps result

This is a special type that is not allowed by the Haskell98 standard, since it contains type variables b and result that are not arguments of the type Parser. By quantifying with

the forall construct we indicate that the type of the parser does not depend on these type variables, and it is only through passing functions that we link the type to its environment. This extension is now accepted by most Haskell compilers. So the parser that recognises a value of type a combines this value with the stack of previously found values, which results in a new stack of type b, which in its turn is passed to the continuation as the new stack. The interesting combinator here is the one taking care of sequential composition, which now becomes:

(p <*> q) r stack input = p (q r) (stack .) input

When pv is the value computed by the parser p and qv the value computed by the parser q, the value passed on to r will be:

((stack .) pv) qv = (stack . pv) qv = stack (pv qv)

which is exactly what we would expect. Finally we have to adapt the function parse such that it transforms the constructed result to the desired result and initializes the stack (id):

parse p = getresult (p (\ st inp -> foldr (const Fail) (Stop st) inp) id)

We will not give the new versions of the other combinators here, since they will show up in almost the same form in Figure 4.

3.4.2

Error reporting

Note that at the point where we introduce the error-correcting steps we cannot be sure whether these corrections will actually be on the chosen path in the search tree, and so we cannot directly add the error messages to the result: keep in mind that it is a fundamental property of our strategy that we may produce information about the result without actually having made a choice yet. Including error messages with the Fail constructors would force us to prematurely take a decision about which path to choose. So we decide to pass the error messages in an accumulating parameter too, only to be included in the result at the end of the parsing process. In order to make it possible for users to generate their own error messages (say in their own language) we return the error messages in the form of a data structure, which we make an instance of Show (see Figure 4, in which also the previous modifications have been included).

4. INTRODUCING LOOK-AHEAD In the previous section we have solved the first of the problems mentioned, i.e. we made sure that a result will always be returned, together with a message about what error correcting steps were taken. In this section we solve the two remaining problems (backtracking and sequential selection), which both have to do with the low efficiency, in one sweep. Thus far the parsers were all defined as functions about which we cannot easily get any further information. An example of such useful information is the set of symbols that may be recognized as first-symbols by a parser, or whether the parser may recognize an empty sequence. Since we cannot obtain this information from the parser itself, we decide to compute this information separately, and to tuple this information with a parser that is constructed using this information.

4.1

Tries

To see what such information might look like we first introduce yet another formulation of the basic combinators: we construct a trie structure representing all possible sentences in the language of the represented grammar (see Figure 5). This is exactly what we need for parsing: all sentences of the language are grouped by their common prefix in the trie structure. Thus it becomes possible, once the structure has been constructed, to parse the language in linear time. For a while we forget again about computing results and error messages. Each node in the trie represents the tails of sentences with a common prefix, which in its turn is represented by the path to the root in the overall structure representing the language. A Choice node represents the non-empty tails by a mapping of the possible next symbols to the tries representing their corresponding tails. An End node represents the end of a sentence. The :|: nodes correspond to nodes that are both a Choice node (stored in the left operand) and an End node (stored in its right operand)2. Notice that the language ab|ac is represented by:

Choice [('a', Choice [('b', End), ('c', End)])]

in which the common prefix has been factored out. In this way the cost associated with the backtracking of the parser has now been moved to the construction of the Sents structure. For the language a*b|a*c constructing the trie is a non-terminating process. Fortunately lazy evaluation takes care of this problem, and the merging process only proceeds far enough for recognising the current sentence. A shortcoming of this approach, however, is that it introduces a tremendous amount of copying, because of the way sequential composition has been modelled. We do not only use the structure to make the decision process deterministic, but also to represent the stack of symbols still to be recognized. Furthermore we may have succeeded in parsing in linear time, but this is only possible because we have shifted the work to the construction of the trie structure. In the sequel we will show how we can construct the equivalent of the trie structure by combining precomputed trie-structure building blocks.

2 We could have encoded this using a slightly different structure, but this would have resulted in a more elaborate program text later.

type Parser a = forall b result
              . Future b result -> Stack a b -> Errs -> Input -> Steps result

type Errs = Errors -> Errors   -- error messages are accumulated as a function

data Errors = Deleted  Symbol String Errors
            | Inserted Symbol String Errors
            | NotUsed  String

instance Show Errors where
  show (Deleted s pos e)  = "\ndeleted "  ++ show s ++ " before " ++ pos ++ show e
  show (Inserted s pos e) = "\ninserted " ++ show s ++ " before " ++ pos ++ show e
  show (NotUsed "")       = ""
  show (NotUsed pos)      = "\nsymbols starting at " ++ pos ++ " were discarded "

eof = " end of input"
position ss = if null ss then eof else show (head ss)

symbol a
  = let pr = \ r st e input ->
               case input of
                 (b:bs) -> if a == b
                           then Ok (r (st b) e bs)
                           else Fail (       pr r st (e . Deleted b (position bs)) bs
                                      `best` r (st a) (e . Inserted a (show b)) input)
                 []     -> Fail (r (st a) (e . Inserted a eof) input)
    in pr

succeed v = \ r stack errors input -> r (stack v) errors input
p <|> q   = \ r stack errors input -> p r stack errors input `best` q r stack errors input
p <*> q   = \ r stack errors input -> p (q r) (stack .) errors input

parse p input
  = getresult (p (\ v errors inp ->
                    foldr (const Fail)
                          (Stop (v, errors (NotUsed (position inp))))
                          inp)
                 id
                 id
                 input)

Figure 4: Correcting and error reporting combinators

4.2

Merging the different approaches

Compare the two different approaches taken:

• continuation-based parsers that compute a value, and work on all alternatives in parallel

• the parser that interprets a trie data structure, and inspects each symbol in the input only once

In our final solution we will merge these two approaches. We will compute Sents fragments on which we base the decision how to proceed with parsing, and use the continuation-based parsers to actually accept the symbols. Since the information represented in the new data structure closely resembles the information stored in a state of an LR(0) automaton, we will use that terminology. So instead of building the complete Sents structure, we will construct a similar structure which may be used to select the parser to continue with. Before proceeding, let us consider the following grammar fragment:

    succeed (\ x y -> x) <*> symbol 'a' <*> symbol 'b'
<|> succeed (\ x y -> y) <*> symbol 'a' <*> symbol 'c'

The problem that arises here is what to do with the parsers preceding the respective symbol 'a' occurrences: we can only decide which one to take after a symbol 'b' or 'c' has been encountered, because only then will we have a definite answer about which alternative to take. This problem is solved by pushing such actions inside the trie structure, to a point where the merging of the different alternatives has stopped: at that point we are again working on a single alternative and can safely perform the postponed computations. As a result of this, the actual parsing and the computation of the result may run out of phase.

We now discuss the full version of the structure we have defined to do the bookkeeping of our look-ahead information (see the definition in Figure 6), and the actions to be taken once a decision has been made. The p components of the Look structure are the parsing functions that actually accept symbols and do the semantic processing. We will discuss the different alternatives in reverse order:

type Parser = Sents

data Sents = Choice [(Symbol, Sents)]
           | Sents :|: Sents   -- left is Choice, right is End
           | End

combine xss@(x@(s,ct):xs) yss@(y@(s',ct'):ys)
  = case compare s s' of
      LT -> x : combine xs yss
      GT -> y : combine xss ys
      EQ -> (s, ct <|> ct') : combine xs ys
combine [] cs' = cs'
combine cs []  = cs

symbol a = Choice [(a, End)]
succeed  = End

l@(Choice as) <|> r = case r of
                        (Choice bs) -> Choice (combine as bs)
                        (p :|: q)   -> (l <|> p) :|: q
                        End         -> l :|: End
l <|> r@(Choice _)  = r <|> l
_ <|> _             = error "ambiguous grammar"

Choice cs <*> q = Choice [ (s, h <*> q) | (s, h) <- cs ]

parse :: Sents -> Input -> Bool
parse (Choice cs) (a:as) = or [ parse future as | (s, future) <- cs, s == a ]
parse (p :|: q)   inp    = parse p inp || parse q inp
parse End         inp    = null inp
parse _           _      = False

Figure 5: Sents, the trie structure representing all possible sentences

merge_ch :: Look p -> Look p -> Look p
l@(Shift p pcs) `merge_ch` right
  = case right of
      Shift q qcs      -> Shift (p `bestp` q) (combine pcs qcs)
      (left :|: right) -> (l `merge_ch` left) :|: right
      (Reduce p)       -> l :|: (Reduce p)
      (Found _ c)      -> l `merge_ch` c
Found _ c `merge_ch` r     = c `merge_ch` r
l `merge_ch` r@(Shift _ _) = r `merge_ch` l
_ `merge_ch` _             = error "ambiguous grammar"

(P p) `bestp` (P q) = P (\ r st e input -> p r st e input `best` q r st e input)

Figure 6: Computing the look-ahead information

possible parser that applies at this point is the first component of this construct. It corresponds to a non-merged node in the trie structure. As such it also marks the end of a trie fragment out of which the original trie structure may be reconstructed without any merging needed. The (Look p) part is used when the structure is merged with further alternatives. This is the case when that other alternative contains a path similar to the path that leads to this Found node. As one can see in the function `merge_ch`, this Found constructor is removed when the structure is merged with other alternatives.

Reduce p: this indicates that in using the look-ahead information we have encountered all symbols in the right hand side of a production (we have reached a reduce state in LR terminology), and that the parser p is the parser to be used.

(:|:): this corresponds to the situation where we either may continue with using further symbols to make a decision, or we will have to use information about the followers of this nonterminal. This will be the only place where we continue with a possibly non-deterministic parsing process. It corresponds to a shift-reduce conflict in an LR(0) automaton.

Shift p [(s, Look p)]: this corresponds to a shift state, in which we need at least one more symbol in order to decide how to proceed. The parser p contained in this alternative is the error correcting parser to be called when the next input symbol is not a key in the next table of this Shift point, and thus no symbol can be shifted without taking corrective steps.

Before giving the description of how to construct such a look-ahead structure we will first explain how it is going to be used (see Figure 7). In order to minimize the interpretive overhead associated with inspecting the look-ahead data structures, we pair each such structure with a function that (1) uses the input sequence to locate the parsing function to be called, and (2) then calls this function with the input sequence. In a proper Haskell implementation this implies that the constructors used in the look-ahead structure are being "compiled away" as a form of partial or staged evaluation. Note that the functions constructed in this way (of type RealParser) will be the real parsers to be called: the look-ahead structures merely play an intermediate role in their construction, and may be discarded as soon as the functions have been constructed. The first argument to a RealParser is the continuation, the second one the accumulated stack, the third one the error messages accumulated thus far, the fourth one its input sequence, and its result will finally be a Steps sequence, containing the parsing result at the very end. The function mkparser interprets a Look structure and pairs it with its corresponding RealParser. The function mkparser constructs a function choose that is used in the resulting

RealParser to select (choose input) a RealParser p, making use of the current input. Once selected, this parser p is then called (p r st e input). So the function choose, which is the result of the homomorphism over the Look structure, has type Input -> RealParser a. We will now discuss how this selection process takes place, again taking the alternatives into account from bottom to top:

Found: no further selection is needed, so we return the function that, given the rest of the input, returns the parser contained in this alternative.

Reduce: this alternative can be dealt with just as the Found alternative; no further symbols of the input have to be inspected.

(:|:): in this case we return a parser that is going to choose dynamically between the two possible alternatives: either we reduce by calling the parser contained in this alternative (p), or we continue with the parser located by using further look-ahead information (css).

Shift: this alternative is the most interesting one. We are dealing with the case where we have to inspect the next input symbol. Since performing a linear search here may be very expensive, we first construct a binary search tree out of the table, and partially evaluate the function pFind with respect to this constructed search tree. The returned function is now used in the constructed function for the continuation of the selection process (and if this fails returns the error correcting parser p). We give the code for the additional functions in Figure 9. They speak for themselves, and are not really important for understanding the overall selection process. We have ensured that this searching function is only computed once and is included in the RealParser. The interpretation overhead associated with all the table stuff is thus only performed once. In this respect our combinators really function as a parser generator.

Having come thus far, we can now describe how the Look structures can be constructed for the different basic combinators. The code for the basic combinators is given in Figure 8. The code for the <|> combinator is quite straightforward: we construct the trie structure by merging the two trie structures, as described before, and invoke mkparser in order to tuple it with its associated RealParser. The code for the <*> combinator is a bit more involved:

Found: we replace the parser already present, which may be selected without further look-ahead, with the sequential composition (`fwby`) of the two RealParsers.

Reduce: apparently further information about the followers of this node is available from the context (viz. the right hand side operand of the sequential composition), and we use this to provide the badly needed information about what symbols may follow. So we replace this Reduce node with the trie structure of the right hand side parser, but with all elements in it replaced

newtype RealParser a
  = P (forall b result
      . (b -> Errs -> Input -> Steps result)
     -> (a -> b)
     -> Errs
     -> Input
     -> Steps result)

data Parser a = Parser (Look (RealParser a)) (RealParser a)

mkparser :: Look a -> Parser a

cata_Look (sem_Shift, sem_Or, sem_Reduce, sem_Found)
  = let r = \ c -> case c of
              (Shift p csr)    -> sem_Shift p [ (s, r ch) | (s, ch) <- csr ]
              (left :|: right) -> sem_Or (r left) (r right)
              (Reduce p)       -> sem_Reduce p
              (Found p cs)     -> sem_Found p (r cs)
    in r

map_Look f
  = cata_Look ( \ qp csr     -> Shift (f qp) csr
              , \ left right -> left :|: right
              , \ qp         -> Reduce (f qp)
              , \ qp csr     -> Found (f qp) csr
              )

mkparser cs
  = let choose
          = cata_Look
              ( \ p css -> let locfind = pFind (tab2tree css)
                           in \ inp -> case inp of
                                         []     -> p
                                         (s:ss) -> case locfind s of
                                                     Just cp -> cp ss
                                                     Nothing -> p
              , \ css p -> (p `bestp`) . css
              , \ p     -> const p
              , \ p cs  -> const p
              )
              cs
    in Parser cs (P (\ r st e input -> let (P p) = choose input
                                       in p r st e input))

Figure 7: Constructing parsers out of Look-ahead structures

symbol a
  = let pr  = ...   -- as before
        pr' = \ r -> \ st e (s:ss) -> Ok (r (st s) e ss)
    in mkparser (Found (P pr) (Shift (P pr) [ (a, Reduce (P pr')) ]))

succeed v = mkparser (End (P (\ r -> \ st e input -> r (st v) e input)))

(Parser cp _) <|> (Parser cq _) = mkparser (cp `merge_ch` cq)

(Parser cp _) <*> ~(Parser csq qp)
  = mkparser $ cata_Look
      ( \ pp csr -> Shift (pp `fwby` qp) csr
      , \ csr rp -> csr `merge_look` rp
      , \ pp     -> map_Look (pp `fwby`) csq
      , \ pp csr -> Found (pp `fwby` qp) csr
      )
      cp
  where (P p) `fwby` (P q) = P (\ r st e i -> p (q r) (st .) e i)

Figure 8: The final basic combinators

by a parser prefixed with the reducing parser from the original left hand side node3.

(:|:): we merge both alternatives.

Shift: we update the error correcting parser, and postfix all parsers contained in the choice structure with the fact that they are followed by the second parser.

This definition seems to be horrendously costly, but again we are saved by lazy evaluation. Keep in mind that these Look structures are only being used in the function mkparser, and are only inspected for the branches until a Found or Reduce node is reached. If the grammar is LL(1) this will only be one step! As soon as mkparser has done its job the whole structure may be garbage collected. In the code for symbol we see two local functions:

• pr: the original error correcting parser

• pr': this function is only called when a function constructed by the aforementioned choose has indeed discovered that the expected symbol is present. In this case the check that it is present and the error correcting behaviour can be skipped. So all we have to do is take a single symbol from the input, incorporate it into the result, and continue parsing. We record the successful step by adding an Ok step.

If we do not make use of look-ahead information we have to apply the function pr, and this is the function that is contained in the first Found construct. When this parser is merged with other parsers, this wrapper is removed and the parser will only be called after it has been decided that it will succeed: hence the occurrences of pr' in the rest of the text. Only when the test somehow fails to find a proper look-ahead do we use the old pr again, in order to incorporate corrective steps. For the sake of completeness we include the additional functions used in Figure 9.

3 Strictly speaking this is only needed when the reduce node actually is the right operand of a :|: construct, indicating the existence of a now resolvable shift-reduce conflict.

5. EXTENSIONS In the full set of combinators, we have included some further extensions. Computation of a full look-ahead may be costly, e.g. when the choice structures that are computed become very large and are not used very often. In such cases one may want to use a non-deterministic approach. For this purpose dynamic versions are also available that have the efficiency of the backtracking approach given before. We also note that the process of passing a value and error messages around can be extended to incorporate the accumulation of any further needed information; examples of such kinds of information are the name of the file being parsed, a line number, an environment in which to locate specific identifiers, etc. In that case the state should at least

be able to store error messages and recognized symbols. Extra combinators have been introduced in the library to update the state. The production version of our combinators contains numerous further small improvements. As an example of such a subtle improvement, consider the code of the function choose. Once it cannot locate the next symbol in the shift table, it resorts to the error correcting version. This parser will try all alternatives, and compare all those results. But we know for sure that the first step of each result will be a Fail, and thus that the first step of the selected result will be too. So instead of first finding out what is the best way to fail, and only then reporting that parsing failed, it is better to immediately report a fail step and to remove the first fail step from the actual result; it is quite likely that we are dealing with a shift/reduce conflict and have gone over to the dynamic comparison of the two alternatives, and that, since we fail, the other alternative will succeed. Since the fact that repair is possible may be discovered during the selection in a choice structure, we have to make sure that no backtracking takes place over this prefix if we want to guarantee that all correct symbols are examined only once. In the production version this adds one further accumulating parameter to the construction of the actual parsers. Sometimes, once an erroneous situation has been detected, a better correction can be produced by taking not only the future but also some of the past into account. We have produced versions of our combinators that do so, and which base the decision about how to proceed on comparing sequences of parsing steps, instead of taking the greedy approach and looking just ahead far enough to see a difference. By making different choices for the function best we can incorporate different error correction strategies. In a sequential composition we always incorporate the call to the second parser in the trie structure of the first one. In general it is undecidable in our approach whether this is really needed for getting a deterministic parser; it may be used to resolve a shift/reduce conflict, but computing whether such a conflict may occur in the (possibly infinite) trie structure may lead to a non-terminating computation. An example of this we have seen before with the grammar for the language a*b|a*c; in our approach it is not possible to discover that, when building the trie structure, we have reached a situation equivalent to the root after taking the 'a' branch. We may however take an approximate approach, in which we try to find out whether we can be sure it is not needed to create an updated version of the trie structure. As an example of such a situation consider:

number = foldr1 (<|>) (map symbol [1..1000])
plus   = succeed (+) <*> number <*> number

In this case an (unneeded) copy is made of the first number parser, in order to incorporate the call to +, and then once more a copy of this structure is made in order to incorporate the second number parser. Since we can immediately see that a number cannot be empty, and all alternatives are disjoint and lead in one step to a Found node, we can postpone the updating process, and create a parser using the `fwby` operator immediately.

data BinSearchTree a v
  = Node (BinSearchTree a v) a v (BinSearchTree a v)
  | Nil

tab2tree tab = tree
  where (tree, []) = sl2bst (length tab) tab

sl2bst 0 list         = (Nil, list)
sl2bst 1 ((a,v):rest) = (Node Nil a v Nil, rest)
sl2bst n list
  = let ll = (n - 1) `div` 2
        rl = n - 1 - ll
        (lt, (a,v):list1) = sl2bst ll list
        (rt, list2)       = sl2bst rl list1
    in (Node lt a v rt, list2)

pFind Nil = \ i -> Nothing
pFind (Node left r v right)
  = let findleft  = pFind left
        findright = pFind right
    in \ i -> case compare i r of
                EQ -> Just v
                LT -> findleft i
                GT -> findright i

Figure 9: The additional functions for Braun tree construction and inspection

Another problem that occurs is that the number of different possible error corrections may get quite high, and worse, that they are all equivalent. If an operator is missing between two operands there are usually quite a number of candidates to be inserted, all resulting in a single failing step. In the approach given this would imply that the rest of the input is parsed once for each of these different possible corrections, trying to find out whether there really is no difference. In contrast to generalised LR parsing we do not have access to an explicit representation of the grammar, and especially we cannot compare the functions that represent the different states; as a consequence we cannot discover that we are comparing many parallel but equivalent parses. In the library there are facilities for limiting such indefinite comparisons by specifying different insertion and deletion costs for symbols and by limiting the distance over which such comparisons are made. In this way fine-tuning of the error correction process is possible without interfering with the rest of the parsing process. Finally we have included basic parsers for ranges of symbols, thus making the combinators also quite usable for describing lexers [2]. By defining additional combinators that extend the basic machinery we may deal with ambiguous grammars too. As an example consider a simple lexer defined as follows:

many p = let manyp = many p
         in succeed (:) <*> p <*> manyp <|> succeed []

letter = 'a' `upto` 'z'

identifier = (:) <$> letter <*> many letter

token []     = succeed []
token (s:ss) = succeed (:) <*> symbol s <*> token ss

lexsym = identifier <|> token "if" <|> ...

In this case we have two problems:

1. we want to use a greedy approach when recognising identifiers, taking as many letters as possible into account

2. the string "if" should be parsed as a token and not as an identifier

We may solve this problem by introducing a new combinator:

step v n = mkparser (End (P (\ r st e input -> Step n (r (st v) e input))))

and changing the code above into:

many n p = let manyp = many n p
           in succeed (:) <*> p <*> manyp <|> step [] n

identifier = (:) <$> letter <*> many 2 letter

token []     = step [] 1
token (s:ss) = succeed (:) <*> symbol s <*> token ss

If we have seen the text "if", and there are no further steps possible with cost 0 (e.g. further letters), then we choose the token alternative since 1 < 2.

6. EFFICIENCY It is beyond the scope of this paper to provide a detailed discussion about the efficiency and space usage of the parsers constructed in this way. We notice however the following:

• it follows directly from the construction process that the number of constructed parsers can be equivalent to the number of LR(0) states in a bottom-up parser, so there is no space leak to be expected from that; this is however only the case if the programmer has performed left factorisation by hand. If not, the system will perform the left factorisation, but since it cannot identify equality of parsers it will do so by merging two identical copies of the same look-ahead information.

• the number of times a symbol is inspected may be more than once: where a bottom-up parser would have a shift-reduce conflict we basically pursue both paths until a difference is found; if the grammar is LR(1) this implies we inspect one symbol twice in such a situation. We feel that in all practical circumstances this is to be preferred over the construction of full LR(1) parsers, where the number of states easily explodes. It is the relatively low number of states of the LR(0) automaton, and thus of our approach, that makes the LALR(1) parsing handled by Yacc preferable to LR(1). Besides that, our approach has the advantage of smoothly handling longer look-aheads when needed. Lazy evaluation nicely takes care of the parallel pursuit for success that would be a nightmare to encode in C or Java.

• using the decision trees means that at each choice point we have a complexity that is logarithmic in the number of possible next symbols. The simple parser combinators are all linear in that number, a fact that really starts to hurt when dealing with productions that have very many alternatives.

7. CONCLUSIONS We have shown that it is possible to analyze grammars and construct parsers that are both efficient and that correct errors. The overhead for the error correction is only a few reductions per symbol in the absence of errors: adding the Ok step and removing it again. When comparing the current approach with the one for the LL(1) grammars, we see that we have now included all look-ahead information in one single data structure, thus getting a more uniform approach. Furthermore the decision about whether to insert or delete a symbol is made in a more local way, using precise look-ahead, especially in the case of follower symbols. This in general makes parsers continue longer in a more satisfactory way; previously a parser might prematurely decide to insert a sequence of symbols to complete a program, and throw away the rest of the input as being not needed. When comparing the parsers coded using our library with those written in Yacc, the inputs look much nicer. Reduce-reduce conflicts resulting from the incorporation of semantic actions do not occur, since a sufficient context is always taken into account. We conclude by referring to the title of this paper, claiming that parser combinators have finally reached the adult state and are the tool to be chosen when one wishes to construct a parser: already they were nice, but now they have become useful.

8. ACKNOWLEDGMENTS I want to thank Pablo Azero, Sergio Mello Schneider, Han Leushuis and David Barton for their willingness to be exposed to an almost unending sequence of variants of the library. I thank Johan Jeuring and Pim Kars and anonymous referees for commenting on the paper.

9. REFERENCES

[1] W. H. Burge. Recursive Programming Techniques. Addison-Wesley, 1975.

[2] M. M. T. Chakravarty. Lazy lexing is fast. In A. Middeldorp and T. Sato, editors, Fourth Fuji International Symposium on Functional and Logic Programming: FLOPS 99, number 1722 in LNCS, pages 68–84, 1999.

[3] J. Fokker. Functional parsers. In J. Jeuring and E. Meijer, editors, Advanced Functional Programming, number 925 in LNCS, pages 1–52, 1995.

[4] G. Hutton and E. Meijer. Monadic parser combinators. Journal of Functional Programming, 8(4):437–444, 1998.

[5] J. Jeuring and S. Swierstra. Lecture notes on grammars and parsing. http://www.cs.uu.nl/people/doaitse/Books/GramPars2000.pdf.

[6] S. Swierstra and P. Azero Alcocer. Fast, error correcting parser combinators: a short tutorial. In J. Pavelka, G. Tel, and M. Bartosek, editors, SOFSEM'99: Theory and Practice of Informatics, 26th Seminar on Current Trends in Theory and Practice of Informatics, volume 1725 of LNCS, pages 111–129, November 1999.

[7] S. Swierstra and L. Duponcheel. Deterministic, error-correcting combinator parsers. In J. Launchbury, E. Meijer, and T. Sheard, editors, Advanced Functional Programming, volume 1129 of LNCS-Tutorial, pages 184–207. Springer-Verlag, 1996.

Typed Logical Variables in Haskell Koen Claessen

[email protected]

Peter Ljunglöf

[email protected]

August 9, 2000 Abstract

We describe how to embed a simple typed functional logic programming language in Haskell. The embedding is a natural extension of the Prolog embedding by Seres and Spivey [16]. To get full static typing we need to use the Haskell extensions of quantified types and the ST-monad.

1 Introduction Over the last ten to twenty years, there have been many attempts to combine the

flavours of logic and functional programming [3]. Among these, the most well-known ones are the programming languages Curry [4], Escher [13], and Mercury [14]. Curry and Escher can be seen as variations on Haskell, where logic programming features are added. Mercury can be seen as an improvement of Prolog, where types and functional programming features are added. All three are completely new and autonomous languages. Defining a new programming language has the drawback that the developer must build a new compiler and the user must learn a new language. A different approach, which has gained a lot of popularity over the last couple of years, is to embed a new language in another language, called the host language. Haskell has been shown to be extremely well-suited for this purpose in various areas [8, 2, 1]. The embedding approach has another obvious advantage: programs in the embedded language are first-class citizens in the host language, and can therefore be generated by a program in the host language. Our aim is to embed logic programming features in Haskell. To this end, several different approaches have been taken, notably by Seres and Spivey, who embed a language of predicates over terms in Haskell [16, 17], and by Hinze, who shows how to describe backtracking efficiently and elegantly [5, 6]. Our approach combines these ideas and adds something new: the terms of the embedded program are, in contrast to Seres' and Spivey's approach, typed. The resulting embedded language has several limitations, however. First of all, there are some syntactic drawbacks, in the sense that some predicate definitions cannot be described as elegantly in Haskell as in Prolog, because of the special syntax that Prolog

has for logical variables and unification pattern matching. Second, real implementations of logic programming have many specialised search strategies. The rest of the paper is organized as follows. In Section 2, we present a summary of Seres' and Spivey's work on embedding Prolog in Haskell. In Section 3, we generalize their work to use a monad. In Section 4, we show how we can use the ST-monad to deal with user-defined datatypes in a typed setting. In Section 5 we conclude.

2 Embedding Prolog in Haskell Silvija Seres and Mike Spivey have embedded traditional logic programming in a functional framework [16, 17]. This section is essentially a summary of their embedding.

2.1 Logical Variables The embedding consists of a datatype Term for terms, a type Pred of predicates, a unification predicate (≐), the connectives (∧, ∨) and the existential quantifier (∃):

(≐)      :: Term -> Term -> Pred
(∧), (∨) :: Pred -> Pred -> Pred
(∃)      :: (Term -> Pred) -> Pred

The type Term for terms could be anything, but must have the possibility of being an uninstantiated variable. Here we use a type of binary trees, with Var i being a reference to the variable with the unique identifier i:

data Term = Var Id | Atom String | Nil | Term ::: Term
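For example (an illustrative value, assuming Id is an integer identifier such as type Id = Int), the term hello ::: X, a cons cell whose tail is the uninstantiated variable X, is represented as:

exampleTerm :: Term
exampleTerm = Atom "hello" ::: Var 0   -- Var 0: a reference to variable 0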

In the embedding we use lists to simulate backtracking, but this is a rather arbitrary choice; there are lots of other possibilities. So we define a type Backtr, which in this case is equivalent to lists. One thing which we use is that all the possible variants for this search type are monads, more specifically monads with both a zero (written as mzero) and a plus (written as (+++)). The type Pred of predicates is a function that takes a computation state and gives a stream of new states. A computation state State holds the current values of the logical variables (called substitutions), and a stream of uninstantiated variables.

type Backtr a = [a]
type Pred     = State -> Backtr State
type State    = (Subst, [Term])
type Subst    = [(Id, Term)]
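As an illustration of how these types fit together, the classic append predicate could be written as follows; this is our sketch in the spirit of the embedding, not code taken from Seres and Spivey:

appendP :: Term -> Term -> Term -> Pred
appendP p q r =
      (p ≐ Nil ∧ q ≐ r)
    ∨ (∃) (\ x -> (∃) (\ a -> (∃) (\ b ->
          p ≐ (x ::: a) ∧ r ≐ (x ::: b) ∧ appendP a q b)))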

The unification predicate (≐) uses a standard unification function unify :: Term -> Term -> Subst -> Backtr Subst, which takes two terms and a substitution and returns either failure (represented by the zero of the backtracking monad) or a new substitution (represented by the unit):

(≐) :: Term -> Term -> Pred
a ≐ b = \ (sub, vs) -> do sub' <- unify a b sub
                          return (sub', vs)

class Free s a | a -> s where
  free :: LP s a

Note that we use the non-standard Haskell extension of multiple parameter type classes with functional dependencies here [11]. This is possible because all instantiations of a will contain an s. With this we can define the free predicate for each of our types:

instance Free s (Atom s) where
  free = VarA `liftM` newLPRef Nothing

instance Free s (List s a) where
  free = VarL `liftM` newLPRef Nothing

Overloading free has the extra advantage that we can create free pairs, triples, etc. as well:

  instance (Free s a, Free s b) => Free s (a,b) where
    free = liftM2 (,) free free

4.4 Unification

Unifying two instances of a term datatype poses the same problem as with free, so we simply take the same solution; we overload unification.⁴ To minimize the work every time a new datatype is declared, we split the work the unification algorithm has to do into two parts: variables and constructors. Therefore, we introduce a type class Unify containing two operators: isVar, used to check if a term happens to be a variable, and unify, used to unify two terms which are both not variables.

  class Unify s a | a -> s where
    isVar :: a -> Maybe (Var s a)
    unify :: a -> a -> LP s ()

The unification operator (=:=) is defined in terms of these operations. Before we define it, however, we introduce a helper function unifyVar, which unifies a variable and a constructor:

  unifyVar :: Unify s a => Var s a -> a -> LP s ()
  unifyVar ref a =
    do mb <- readLPRef ref
       case mb of
         Nothing -> writeLPRef ref (Just a)
         Just b  -> a =:= b

The unification algorithm can now be implemented as follows.

³ This could be very much simplified if we had subtyping as in the language O'Haskell [15].
⁴ There exists a general polytypic solution to this problem [9, 10], which takes a similar approach.

  (=:=) :: Unify s a => a -> a -> LP s ()
  a =:= b =
    case (isVar a, isVar b) of
      (Just var1, Just var2)
        | var1 == var2 -> true
      (Just var, _)    -> unifyVar var b
      (_, Just var)    -> unifyVar var a
      _                -> unify a b

It first deals with the special cases where at least one argument is a variable, and hands the other cases over to the user-defined operation unify. The instances for Unify of atoms and lists look like this:

  instance Unify s (Atom s) where
    isVar (VarA var) = Just var
    isVar _          = Nothing
    unify (Atom s1) (Atom s2) | s1 == s2 = true
    unify _         _                    = false

  instance Unify s a => Unify s (List s a) where
    isVar (VarL var) = Just var
    isVar _          = Nothing
    unify Nil        Nil         = true
    unify (a ::: as) (b ::: bs)  = a =:= b /\ as =:= bs
    unify _          _           = false

Note that the definitions of isVar look almost identical.
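Putting free and (=:=) together, a predicate can allocate fresh variables and constrain them. The following is a hedged usage sketch of our own, not an example from the paper:

  -- create a free pair of atoms and unify its first component;
  -- free resolves via the pair instance, (=:=) via the Atom instance
  example :: LP s (Atom s, Atom s)
  example = do (x,y) <- free
               x =:= Atom "a"
               return (x,y)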

4.5 Getting out of the LP-Monad

To be able to extract a result from the LP-monad we have to define a run function that simply calls the run function of the ST-monad:

  runLP :: (forall s . LP s a) -> [a]
  runLP m = runST (lower m)

This function can only be used when a does not depend on s. Both the Atom and the List types (as well as any type for logical variables) depend highly on s, because they can contain a variable constructor in the term. So, we define a conversion that takes away all variables from a term. To do this, we define a helper function which converts the contents of a variable. This only succeeds if the variable is actually instantiated.

  variable :: (a -> LP s b) -> Var s a -> LP s b
  variable convert var = do Just a <- readLPRef var
                            convert a

Here is how we can convert atoms and lists. The list conversion function takes as a parameter the conversion function for its elements.

  atom :: Atom s -> LP s (Atom ())
  atom (Atom s)   = return (Atom s)
  atom (VarA var) = variable atom var

  list :: (a -> LP s b) -> List s a -> LP s (List () b)
  list elt Nil         = return Nil
  list elt (a ::: as)  = liftM2 (:::) (elt a) (list elt as)
  list elt (VarL var)  = variable (list elt) var

Note that the result types use () as the state type, to indicate that these datatypes do not contain any variables. We might be tempted to overload these functions, but we do not do this, because it is not clear that we always want to convert the datatypes to their variable-less counterparts. For example, Haskell already has a perfectly fine list datatype, and here is how we could convert to it:

  list' :: (a -> LP s b) -> List s a -> LP s [b]
  list' elt Nil        = return []
  list' elt (a ::: as) = liftM2 (:) (elt a) (list' elt as)
  list' elt (VarL var) = variable (list' elt) var

Similarly, we might convert atoms directly to strings.

4.6 The Return of the Example

The only thing that we need to change in our standard example is the types for the predicates. The code is exactly the same as in Section 3.2, but the predicates now have the following types:

  path      :: Atom s -> Atom s -> LP s (List s (Atom s))
  neighbour :: Atom s -> LP s (Atom s)

We get a type error if we try to evaluate the predicate path (Atom "a") (Atom "b" ::: Nil), whereas without the ST-monad the call would just fail. We can try to find the paths between a and e by using an adapted solve function, which is now implemented using runLP.

  Main> solve (path (Atom "a") (Atom "e") >>= list atom)
  a ::: b ::: c ::: d ::: e ::: Nil
  a ::: b ::: c ::: e ::: Nil
  a ::: b ::: d ::: e ::: Nil
  a ::: d ::: e ::: Nil
  no (more) solutions

  Main> solve (path (Atom "a") (Atom "e") >>= list' atom)
  [a,b,c,d,e]
  [a,b,c,e]
  [a,b,d,e]
  [a,d,e]
  no (more) solutions
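The paper only names the adapted solve function; a minimal sketch consistent with the session above could look as follows (the Show constraint and the trailing message are our assumptions):

  solve :: Show a => (forall s . LP s a) -> IO ()
  solve p = do mapM_ print (runLP p)
               putStrLn "no (more) solutions"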

5 Conclusions and Future Work

We have succeeded in embedding a simple typed functional logic language in Haskell, without extending the language with anything other than already well-accepted features, such as multiple-parameter type classes (without overlapping instances), local universal quantification in types, and the ST-monad.

It takes some work to add a datatype to be used in the embedding. Some of this has to do with the fact that every datatype has to contain a special constructor for variables. One way of solving this is to define recursive datatypes using explicit fixpoints. This takes away some of the work when implementing a new datatype. However, the types get a lot more complicated and become unmanageable when dealing with more complicated types than regular datatypes. Another big help would be a polytypic programming tool [9, 7].

The resulting language is rather naive in several ways. First of all, the syntax of the programs is often more clumsy than the way you could write it in a dedicated logic programming language. This could be solved by adding some syntactic sugar. Second, the search strategies are not as fancy as the ones one can find in some implementations of logic programming. Some of these could be implemented in our embedding, such as breadth-first searching and some features which make it possible to unify more cyclic structures. Others require a meta-level view of the program, such as indexing in most Prolog implementations. Third, common implementations of Prolog have some interesting and practically useful extensions. Some of these are assert/retract in most Prolog implementations, and bb_get/bb_put in SICStus Prolog, which are used to handle global variables that survive a failure. None of these are part of our embedding today, but we believe that most of them are not too hard to add.

Efficiency is often not the first aim of an embedding, but is sometimes desirable. Hinze's implementation of backtracking is quite fast [5], but the unification algorithm we implemented could probably be done better. For example, we could use optimizations in the spirit of the well-known union-find algorithm.

The techniques we used have been shown quite useful in implementing embeddings of other domains as well. For example, in ongoing unrelated work, we have used the ST-monad to implement an embedded language for describing state transition diagrams over variables of arbitrary types, using a similar method to the one we describe here.

For future work, we want to implement many of these proposed improvements and extensions. By doing this, we hope to find a good basis for a nice semantics for logic programming and its well-known extensions. Also, we would like to investigate how we can make the embedding more practical, by using it in more realistic programs.

References

[1] Koen Claessen. An Embedded Language Approach to Hardware Description and Verification. Licentiate Thesis, Chalmers University of Technology, Gothenburg, Sweden, September 2000.
[2] Conal Elliott. An embedded modeling language approach to interactive 3D and multimedia animation. IEEE Transactions on Software Engineering, June 1999.
[3] Michael Hanus. The integration of functions into logic programming: From theory to practice. Journal of Logic Programming, 19&20:583-628, 1994.
[4] Michael Hanus. Curry: An integrated functional logic language, February 2000. The current report describing the language Curry, http://www.informatik.uni-kiel.de/~curry/report.html.
[5] Ralf Hinze. Prological features in a functional setting - axioms and implementations. In Third Fuji Int. Symp. on Functional and Logic Programming, pages 98-122, April 1998.
[6] Ralf Hinze. Deriving monad transformers. Technical Report IAI-TR-99-1, Institut für Informatik III, Universität Bonn, January 1999.
[7] Ralf Hinze. Polytypic functions over nested datatypes. Discrete Mathematics and Theoretical Computer Science, 3(4):159-180, 1999.
[8] Paul Hudak. Building domain-specific embedded languages. ACM Computing Surveys, 28(4):196, December 1996.
[9] Patrik Jansson. Functional Polytypic Programming. PhD thesis, Chalmers University of Technology and Göteborg University, June 2000.
[10] Patrik Jansson and Johan Jeuring. Polytypic unification. Journal of Functional Programming, 8(5):527-536, September 1998.
[11] Mark P. Jones. Type classes with functional dependencies. In 9th European Symposium on Programming, Berlin, Germany, 2000. Springer-Verlag LNCS 1782.
[12] John Launchbury and Simon Peyton Jones. Lazy functional state threads. In Proc. ACM Programming Languages Design and Implementation, Orlando, 1994. ACM.
[13] John W. Lloyd. Programming in an integrated functional and logic programming language. Journal of Functional and Logic Programming, 3, March 1999.
[14] The Mercury project homepage. http://www.cs.mu.oz.au/research/mercury/.
[15] Johan Nordlander. Reactive Objects and Functional Programming. PhD thesis, Dept. of Computer Science, Chalmers University of Technology, 1999.
[16] Silvija Seres and Michael Spivey. Embedding Prolog in Haskell. In Haskell Workshop, Paris, France, September 1999.
[17] Silvija Seres, Michael Spivey, and Tony Hoare. Algebra of logic programming. In Int. Conf. Logic Programming, November 1999.

Debugging Haskell by Observing Intermediate Data Structures

Andy Gill
Oregon Graduate Institute
[email protected]
http://www.cse.ogi.edu/~andy

Abstract

Haskell has long needed a debugger. Although there has been much research into the topic of debugging lazy functional programs, no robust tool has yet come from the Haskell community that can help debug full Haskell - until now. This paper describes a portable debugger for full Haskell, building only on commonly implemented extensions. It is based on the concept of observation of intermediate data structures, rather than the more traditional stepping and variable examination paradigm used by imperative debuggers.

1 Introduction

Debuggers allow you to see inside your program while running, and help you understand both the flow of control and the internal data and structures that are being created, manipulated and destroyed. The art of debugging is viewing your program through this portal, letting you locate the difference between what the computer has been told to do, and what the programmer thinks the computer should be doing.

When debugging an imperative program using traditional debugging technology (like gdb or Visual Studio) the programmer might step through some suspect code using sample test data, stopping and examining internal structures at keypoints. Haskell programs can have an imperative veneer, using the IO monad, and it should be possible to use typical debugging technology for such parts of a Haskell program. But when debugging other parts of Haskell, we cannot straightforwardly use the same debugging technology to render internal information, because many of the hooks that are used to provide the user with debugging facilities do not map neatly across to the lazy functional world.

• There are no variables to observe changing during execution.
• The concept of sequences of actions or executing a specific line number does not exist.
• Any closure has two parents, the static one (that built the closure and gives context), and the dynamic one (that first evaluated the closure). A stack trace becomes a parent tree.
• When a function is called, its arguments might not yet be evaluated. Should the debugger do extra evaluations?

In this paper, we argue that the analog to breakpointing and examining variables for a functional program is observing intermediate data structures as they are passed between functions. This argument can be considered a generalization of the "debugging via dataflow" idea proposed by Sinclair [7].

Consider this Haskell function:

  natural :: Int -> [Int]
  natural = reverse
          . map (`mod` 10)
          . takeWhile (/= 0)
          . iterate (`div` 10)

The first step to understanding this listful function is to run the function with some example data.

  Main> natural 3408
  [3,4,0,8]

This tells us what the function does, but not how the function works. To understand this function, we need to visualize the hidden intermediate structures behind the function, and see inside the pipeline of (lazy) intermediate lists. ($ is a combinator for infix application.)

  natural 3408
  => reverse . map (`mod` 10) . takeWhile (/= 0) . iterate (`div` 10) $ 3408
  => reverse . map (`mod` 10) . takeWhile (/= 0) $ (3408 : 340 : 34 : 3 : 0 : _)
  => reverse . map (`mod` 10) $ (3408 : 340 : 34 : 3 : [])
  => reverse $ (8 : 0 : 4 : 3 : [])
  => (3 : 4 : 0 : 8 : [])

Displaying steps like this gets garrulous quickly. Yet the critical information - the intermediate structures - can be concisely expressed.

  -( after iterate (`div` 10)
       3408 : 340 : 34 : 3 : 0 : _ )
  -( after takeWhile (/= 0)
       3408 : 340 : 34 : 3 : [] )
  -( after map (`mod` 10)
       8 : 0 : 4 : 3 : [] )
  -( after reverse
       3 : 4 : 0 : 8 : [] )

We want to build a portable debugger (in the form of a Haskell library) that lets Haskell users get concise data structure information, like the information displayed above, about the structures in their Haskell programs. Even though our debugger answers only this one question - what are the contents of specific intermediate structures - because structures in Haskell are both rich and regular, even this simple question can be the basis of a powerful debugging tool.

Our overall debugging system is as follows:

• We provide a Haskell library that contains combinators for debugging. (Taking this form allows the user to debug full Haskell.)
• The frustrated Haskell programmer uses these debugging combinators to annotate their code, and re-runs their Haskell program.
• The execution of the Haskell program runs as normal; there are no behavioral changes because of the debugging annotations.
• The structures that have been marked for observation are displayed on the user's console on termination of their program.

Other versions of the debugging library allow for other debugging setups, like offline observations of data-structures.

2 Debugging Combinators

We introduce our new debugging combinator in terms of an improvement of the current state of the art in full Haskell debugging, which is using an unsafe function called trace.

2.1 trace - A Reprise

All current Haskell implementations come with this (non-standard) function, which has the type:

  trace :: String -> a -> a

The semantics of trace is to print (as a side effect) the first argument, and return the second argument. There are three main problems with using trace for debugging.

The first problem with trace is the incomprehensibleness of its output. Augustsson and Johnsson had a variation of trace in their LML compiler [1]. Their conclusion about trace was that it was generally difficult to understand the "mish-mash" of output from different instances of trace. This is partly because the strictness of the first argument of trace might itself trigger other traces, and partly due to the unintuitive ordering of lazy evaluation. The "mish-mash" problem could perhaps be tackled using a post-processor on the output.

The second problem with trace is that inserting it into Haskell code tends to be invasive, changing the structure of the code. For example, consider a variant of sum, which displays its own execution using trace.

  tracing_sum xs = trace message res
    where
      res     = sum xs
      message = "sum " ++ show xs ++ " = " ++ show res

Running tracing_sum using Hugs gives:

  Main> tracing_sum [1,2,3]
  sum [1,2,3] = 6
  6
  Main>

We have observed the behavior of sum, but needed to make non-trivial code changes to do so.

The third problem is that trace changes the strictness of the things it is observing, because trace is hyper-strict in its first argument. Consider a tracing version of fst.

  tracing_fst pair = trace message res
    where
      res     = fst pair
      message = "fst " ++ show pair ++ " = " ++ show res

Using this version of fst is problematic, because of the strictness of tracing_fst.

  Main> tracing_fst (99,undefined :: Int)
  fst (99,
  Program error: {undefined}
  Main>

2.2 Introducing observe

The function trace can be really useful for debugging Haskell, but the bona fide shortcoming is that trace is at too low a level. Building combinator libraries is a common way to build on low-level primitives, giving interfaces that are both friendlier and more intuitive.

What form could a higher level debugging combinator take? Using the example in the introduction as evidence, we argue that it should take the form of a function that allows us to observe data structures in a transparent way. As a way of achieving this, consider the Haskell fragment:

  consumer . producer

Imagine if the Prelude function id remembered its argument. We could insert strategically placed id's, and id would tell us what got passed from the producer to the consumer.

  consumer . id . producer

We argue that a higher level combinator for debugging should take this form, both passing an argument transparently, and observing and remembering it. To facilitate multiple observations in one program, we use a string argument, which is a label used only for identification purposes. The type of our principal debugging combinator is

  observe :: (Observable a) => String -> a -> a

In the above (point-free) example, we could write:

  consumer . observe "intermediate" . producer

This has identical semantics to consumer . producer, but the observe squirrels away the data structure that gets drawn through it, putting it into some persistent structure for later perusal. As far as the execution of the Haskell program is concerned, observe (with a label) is just a version of id. Notice that observe can be used to observe any expression, not just the intermediate values inside a point-free pipeline; we will see examples of both styles later.

observe has a type class restriction on the object being observed. This does not turn out to be as big a problem as might be thought. We provide instances for all the Haskell98 base types (Int, Bool, Float, etc), as well as many containers (List, Array, Maybe, Tuples, etc). We will return to the specifics of this restriction in Section 5.2, because the type class mechanism provides the framework that enables observe to work.

How does observe compare with respect to the three weaknesses of trace?

• trace sometimes produced a "mish-mash" of output. In our system, we provide renderings, using a pretty-printer, of the specific observations made by observe. This is possible because observe provides a structured way of looking at Haskell objects.
• Unlike advanced uses of trace, minimal code changes are required to observe an intermediate structure.
• Finally and critically, the strictness of the observed structure is not changed, because observe does not do any evaluation of the object it is observing. Observation of an infinite list, or a list full of bottom, is perfectly valid, as we shall see shortly.
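For contrast with tracing_sum above, here is a hedged sketch of the same observation made with observe; observing_sum is our name, not an example from the paper:

  -- observe the input list without changing sum's strictness
  observing_sum :: [Int] -> Int
  observing_sum xs = sum (observe "sum's argument" xs)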

3 Examples of using observe

Now we look at several examples of observe being used, before explaining how to implement observe in Section 5.

3.1 Observing a finite list

As a first example consider:

  ex1 :: IO ()
  ex1 = print ((observe "list" :: Observing [Int]) [0..9])

We use the Observing type synonym to allow us to be explicit about what type we think we are observing.

  type Observing a = a -> a

However, using this explicit typing is optional. We could have equally well written

  ex1 = print (observe "list" [0..9])

This definition however relies on the default mechanism choosing an Int or Integer list. Typically the type of observe is fully determined by its context, but we sometimes include the type signature with our examples to make explicit to the reader what type is being observed.

If we run this IO action inside the debugging context (explained in Section 6.1), we would make the observation

  -- list
  0 : 1 : 2 : 3 : 4 : 5 : 6 : 7 : 8 : 9 : []

We have successfully observed an intermediate data structure, without changing the value or semantics of the final Haskell program.

3.2 Observing an intermediate list

observe can be used partially applied, which is the typical use scenario when observing inside a point-free pipeline.

  ex2 :: IO ()
  ex2 = print
      . reverse
      . (observe "intermediate" :: Observing [Int])
      . reverse
      $ [0..9]

This observe makes the following observation

  -- intermediate
  9 : 8 : 7 : 6 : 5 : 4 : 3 : 2 : 1 : 0 : []

3.3 Observing an infinite list

Both the lists we have observed so far were finite. As an example of an observation on an infinite list, consider:

  ex3 :: IO ()
  ex3 = print (take 10 (observe "infinite list" [0..]))

Here we observe an infinite list, starting at 0, which has the first 10 elements taken from it, and is then printed. Running this example allows us to make the observation

  -- infinite list
  0 : 1 : 2 : 3 : 4 : 5 : 6 : 7 : 8 : 9 : _

We can see that 0 to 9 have been evaluated, but the tail of the 10th cons has not been evaluated, rendered using the notation "_". If more of the list were extracted, we would see more cons cells, etc.

3.4 Observing lists with unevaluated elements

So what about unevaluated elements of the list? What if we were to take the length of a finite list?

  ex4 :: IO ()
  ex4 = print (length (observe "finite list" [1..10]))

This gives the observation

  -- finite list
  _ : _ : _ : _ : _ : _ : _ : _ : _ : _ : []

What if the elements were bottom?

  ex5 :: IO ()
  ex5 = print (length ((observe "finite list" :: Observing [()])
                       [ error "oops!" | _ <- [1..10] ]))

Again the elements are never evaluated by length, so the calls to error are never triggered, and we get the same observation as for ex4.

Returning to the natural example from the introduction, we can observe each of its intermediate structures:

  natural :: Int -> [Int]
  natural = observe "natural"
          $ (observe "after reverse"   :: Observing [Int])
          . reverse
          . (observe "after map"       :: Observing [Int])
          . map (`mod` 10)
          . (observe "after takeWhile" :: Observing [Int])
          . takeWhile (/= 0)
          . (observe "after iterate"   :: Observing [Int])
          . iterate (`div` 10)

Running natural 3408 now also records each stage:

  -- natural
  { \ 3408 -> 3 : 4 : 0 : 8 : [] }
  -- after reverse
  3 : 4 : 0 : 8 : []
  -- after map
  8 : 0 : 4 : 3 : []
  -- after takeWhile
  3408 : 340 : 34 : 3 : []
  -- after iterate
  3408 : 340 : 34 : 3 : 0 : _

Now there is nothing tying together the data that share the same pipeline, apart from manual observations. There is no guarantee (because of lazy evaluation) that the data will be ordered like this example. In order to allow individual pipelines to have a way of tying observations together, we provide another combinator, observations. We have now left the Haskell98 camp, because we are using rank-2 polymorphism: observations passes a local version of observe, allowing a scoped version of observe to be used when debugging.

At this point, we are getting diminishing returns, because we have made a number of changes to the code to get to use these combinators. Notice we can't just return a local observe, but need to wrap it inside the constructor, Observer, because observe must have a fully polymorphic type. The example outputs…

  -- natural
  { \ 123 -> 1 : 2 : 3 : [] }
  -- after reverse
  1 : 2 : 3 : []
  -- after map
  3 : 2 : 1 : []
  -- after takeWhile
  123 : 12 : 1 : []
  -- after iterate
  123 : 12 : 1 : 0 : _

This is a more structured record of what happened.

4.5 Summary of using observe

We have seen many examples of observe successfully observing internal, sometimes intermediate, structures. It is both general and flexible, working in many different practical settings, such as: observing how functions are used, observing state inside monads, and observing IO actions.

5 How does observe work?

We have demonstrated that observe can be used as a powerful debugging tool, but we still need to answer the question of how to implement observe in a portable way. This section introduces this new mechanism. Take as an example this Haskell fragment.

  ex12 = let pair = (Just 1,Nothing)
         in print (fst pair)

What steps has pair gone through in the Haskell execution? All expressions start as unevaluated thunks.

  … pair = •                -- start

First, print is hyper-strict in its argument, so it starts the evaluation of the expression "(fst pair)". This causes pair to be evaluated via fst, returning a tuple with two thunks inside it.

  … pair = ( • , • )        -- after step 1

Now the fst function returns the first component of the tuple, and this element is further evaluated by print.

  … pair = ( Just • , • )   -- after step 2

And finally, the thunk inside the Just constructor is evaluated, giving

  … pair = ( Just 1 , • )   -- after step 3

This evaluation can be illustrated diagrammatically, showing the three evaluation steps that this structure went through:

  •  =(1)=>  ( • , • )  =(2)=>  ( Just • , • )  =(3)=>  ( Just 1 , • )

Here each • represents an unevaluated thunk: step (1) produces the tuple constructor with two thunks, step (2) fills in the first thunk of that constructor with the Just constructor (itself containing one thunk), and step (3) fills in that remaining thunk with the integer 1.

We can now explain the key ideas behind the implementation of observe.

• We automatically insert side-effecting functions in place of the labeled arrows in the diagram above, which both return the correct result on the evaluation to weak head normal form, and also inform a (potentially offline) agent that the reduction has taken place. All thunks (including internal thunks) are therefore replaced with functions that, when evaluated, trigger the informative side effect.
• We use the type class mechanism as a vehicle for this systematic (runtime) rewriting.

Next, we examine the details of both of these ideas.
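The first idea can be illustrated with a tiny sketch of our own (announce is not HOOD's observer, just a demonstration of the mechanism):

  import System.IO.Unsafe (unsafePerformIO)

  -- report a message at the moment this value is demanded (evaluated
  -- to WHNF), then behave exactly like the original value
  announce :: String -> a -> a
  announce msg x = unsafePerformIO (putStrLn msg >> return x)

Wrapping every sub-thunk of a constructor in something like announce is, in essence, what the worker function below automates.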

is sys-

5.1 Communicating the Shape of Data Structures

We need to give enough information to our viewer to allow it to rebuild a local copy of our observed structure. What information might these side-effecting functions send?

• What evaluation happened (path location)
• What the evaluation reduced to ((:), 3, Nothing, etc)

So, in the example above we would pass the following information via our side-effecting function.

  Name                                      Location
  The tuple constructor with two children   root
  The Just constructor with one child       first thunk inside root
  The integer 1                             first thunk inside the first thunk of the root

This information is enough to recreate the observed structure! We start with an unevaluated thunk at the root. We then accept the first step (the tuple constructor with two children), giving ( • , • ). We then accept information about the first thunk of this constructor (the Just constructor with one child), giving ( Just • , • ). Finally, we accept information about its thunk, giving ( Just 1 , • ). By default, if we know nothing about a thunk, it stays rendered as unevaluated. We now look at how to insert our message passing functions into our data structures.

5.2 Inserting intermediate observations

We use a worker function, observer, to both tell our (potentially offline) agent about reductions happening, and place further calls to new instances of observer on all the sub-thunks. One possible type for such a function is:

  observer :: (Observable a) => [Int] -> String -> a -> a

The [Int] is used to represent the path from the root, as seen in the above example. observe can be defined in terms of this function.

  observe = observer []

Let us consider the generic case for observer, over a pseudo-constructor. This also acts as an informal semantics for observe.

  data Cons = Cons ty1 … tyn

  observer path label (Cons v1 … vn) =
    unsafePerformIO $ do
      { send "Cons" path label
      ; return (let y1 = observer (1:path) label v1
                    …
                    yn = observer (n:path) label vn
                in Cons y1 … yn)
      }

We can notice a number of things about the function from this pseudocode.

• observer is strict in its constructor argument. This is not a contradiction of the claim that observe does not affect the strictness of what it is observing, in the same way that forall xs :: [a] . foldr (:) [] xs = xs. For observer to look at its constructor argument, it must itself be in the process of being evaluated to WHNF.
• The only place observer can get stuck (evaluate to bottom) is when invoking send. There is a (reasonable) presumption that this will not block or fail.
• The path is used in a strict fashion (assuming send is strict).
• observe can change the space behavior of programs, because it loses any sharing in its replication.

If we assume that the path is a constant string, and send does not get stuck, simple equational reasoning can show that

  forall (cons :: Cons) . cons = observe "lab" cons

for any cons of the above form.

• Strict fields just re-trigger evaluation of already evaluated things.
• We can consider base types (Int, Integer, etc) to be large enumerated types, and capture them by the above claim about constructors in general.

Functions are captured by a different instance:

  observer path label fn arg =
    unsafePerformIO $ do
      { send "->" path label
      ; return (let arg' = observer (1:path) label arg
                    res' = observer (2:path) label (fn arg')
                in res')
      }

This is a simplification (because observer actually needs to generate a unique reference for each function invocation) but does capture the behavior as far as the Haskell evaluation is concerned. Again, we use reasoning like that above to claim that

  forall fn arg . fn arg = observe "lab" fn arg

5.3 The Observable Class

We use the type class mechanism to implement the various repeated calls to the worker function, observer, as and when a structure gets evaluated. We have a class Observable, and for each observable Haskell object, we have an instance of this class.

  class Observable a where
    observer :: a -> ObserveContext -> a

Reusing our diagram from Section 5.1 above, we have 3 calls to observer:

  observer []    "label" (,)
  observer [1]   "label" (Just •)
  observer [1,1] "label" 1

The first call uses the 2-tuple instance of Observable, the second uses the Maybe instance, and the third uses the Int instance. Each call is also given a context, which contains information about where this thunk is in relation to its parent node.

In our implementation, we use a combinator, send, to capture the common idioms used when writing instances of observer. The Observable instance for 2-tuples is:

  instance (Observable a, Observable b) => Observable (a,b) where
    observer (a,b) = send "," (return (,) << a << b)

send has the type

  send :: String -> MonadObserver a -> Parent -> a

MonadObserver is a lazy state monad that both counts the total number of sub-thunks this constructor has, and provides a unique context for the sub-thunks. Parent is simply a name for the context. Several examples of real instances are included in the Appendix.

6 The Haskell Object Observation Debugger

We have implemented these ideas, incorporating them into a full-scale debugging tool we call the Haskell Object Observation Debugger. We give a short overview of the tool here. A user manual is available online. In essence, HOOD is used as follows:

• The user is responsible for importing the Observe library, which exports several debugging functions, including observe, and adding strategic observes to their code.
• Using the Observe library produces an internal trace of what was observed.
• At the termination of the running of the code being debugged, some code in the Observe library recreates the structures, much like was done in Section 5.1, and the structures are displayed to the user.

6.1 The Observe library

The Observe library is an implementation of the observe combinator, some supporting combinators, and many instances for various Haskell types. Observe provides:

  Base Types:    Int, Bool, Float, Double, Integer, Char, ()
  Constructors:  (Observable a) => [a] and (Maybe a)
                 (Observable a, Observable b) => (a,b) and (Array a b) and (Either a b)
                 (...) => 3-tuple, 4-tuple, 5-tuple
  Functions:     (Observable a, Observable b) => (a -> b)
  IO Monad:      (Observable a) => IO a
  Extensions:    Exceptions (error, etc) -- with GHC and STG Hugs

In order to do debugging, you need to be inside a debugging mode. When this mode is turned on, the trace logfile is created, and the system is ready for receiving observations. When the mode is turned off, the trace logfile is closed. We provide a combinator that helps with these operations.

  runO :: IO a -> IO ()

This turns on observations, runs the provided IO action, turns off observations, and returns. In a Haskell program with main, you might write

  main = runO $ do .. rest of program ..

To help with interactive use, we provide two extra combinators.

  printO :: (Show a) => a -> IO ()
  printO expr = runO (print expr)

  putStrO :: String -> IO ()
  putStrO expr = runO (putStr expr)

These are provided for convenience. For example, in Hugs you might write

  Module> printO (observe "list" [0..9])

Because this version of print starts the observations, you can use it at the Hugs prompt, and make observations on things at the command line level.

Though Observe.lhs is itself fairly portable (needing only unsafePerformIO and IORef) we also provide versions of Observe.lhs for specific compilers. Classic Hugs98 uses rank-2 polymorphism in one place of the implementation, and uses MVars to allow debugging of concurrent programs. GHC and STG Hugs also use extended versions that provide extra functionality for observing Exceptions and handling threaded execution. Catching, observing and rethrowing exceptions allows you to observe exactly where in your data structures an error is raised, and perhaps later can also be used for debugging programs that blackhole.

In the Appendix we give code fragments from the Observe library, which include many more examples of instances for the Observable class. If a user wants to observe their own structures, then they need to provide their own instances. However, as can be seen, this is quite straightforward (a sketch of such an instance follows after the caveats below).

There are a couple of important caveats about having observe as a function provided by a library, rather than a separate compilation/interpretation mode.

• observe is referentially transparent with regard to the execution of the Haskell program, but observe is not referentially transparent with regard to possible observations it might make. Compiler optimizations might move observe around, changing what is observed. Here is an example problem:

    let v = observe "label" … in … v … v …

  This might be transformed into

    … observe "label" … … observe "label" … …

  This does not turn out to be a problem in practice. This transformation and other problematic transformations, though technically valid, change the sharing behavior of the program. Compilers do not like to change these sorts of properties without fully understanding the ramifications of doing so. Furthermore, the worst that can happen is a single structure is observed a number of times. If this occurred, it should be obvious what is happening. This glitch with observe turns out not to be a problem in GHC, Classic Hugs and STG Hugs. If any other Haskell compiler has a problem with inappropriate sharing of observe, this can be fixed, even by adding a special case to the sharing optimization. It is a lot easier to add special cases than a whole debugger!

• Hugs does not re-evaluate top level updatable values, called Constant Applicative Forms (CAFs), between specific invocations of expressions at the command line prompt. This is a good thing in general, but it also means that if you want to observe a structure inside a CAF, you need to reload the offending CAF each time you want to observe it. This is just a minor annoyance in practice; perhaps a Hugs flag turning off caching of CAFs between expression evaluations could be added.
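Here is the promised sketch of a user-supplied instance, written in the style of the Appendix and assuming the send and (<<) helpers shown there; Tree is a hypothetical user datatype, not part of the library:

  data Tree a = Leaf | Node (Tree a) a (Tree a)

  instance Observable a => Observable (Tree a) where
    observer Leaf         = send "Leaf" (return Leaf)
    observer (Node l x r) = send "Node" (return Node << l << x << r)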

6.2 Using the HOOD browser

We have an extension to the released version of HOOD, that includes a browser that allows dynamic viewing of structures.

In this new version, a modified version of the Observe library puts the tracing information into a file called observe.xml. Though it might seem that XML is a poor choice for an intermediate format, off the shelf compression tools result in a surprisingly good quality of compression (around 90%), which gives significantly better footprint size than a straightforward binary format, and we have plans for a future version that uses a pre-compressed trace, or pipes the trace directly between program and browser.

The browser reads the XML file, and allows the user to browse the structures that were observed. To demonstrate our browser tool, take the example observation on foldl, from Section 4.1. We use runO inside main to turn on and off the observation machinery.

  main :: IO ()
  main = runO ex9

  ex9 :: IO ()
  ex9 = print ((observe "foldl (+) 0 [1..4]"
                  :: Observing ((Int -> Int -> Int) -> Int -> [Int] -> Int))
               foldl (+) 0 [1..4])

This produces the file called observe.xml. We now start our browser - the details are implementation dependent, but this can be done directly using a JVM, or from inside Netscape or Internet Explorer. After the browser is started, it offers the user a list of possible observations to look at.

This shows us we have loaded 65 "events" (observation steps). We only have one observation ("foldl (+) 0"), and we choose to display it after evaluation.

This display uses colors to give information beside the raw text. We use purple for base types, blue for constructors, black for syntax, and yellow highlighting for the last expression changed. (Note: this picture shows an alternative possible syntax for rendering functional values.)

This viewer has the ability to step forwards and backwards through the observation, seeing what part of the observation was evaluated (demanded) in what order. Though in many cases we are not interested in this information, it sometimes is invaluable. For example, if we step back a few steps during our perusal of the foldl example, we see a strange thing.

We use a (red) '?' to signify an expression that has been entered (someone has requested its evaluation), but has not yet reached weak head normal form. We can see we have a number of question marks, which correspond to a rather nasty chain of enters as a consequence of a lazy accumulating parameter, a well-known strictness bug.

This dynamic viewing of how structures and functions are used inside real contexts can bring a whole new level of understanding of what goes on when we evaluate functional programs, and could serve as a useful pedagogical tool.

7 Related Work

There are two previous pieces of work that use the explicit observing of intermediate structures in a debugging aid.

• Hawk, a microprocessor architecture specification embedded language, has a function called probe [4].

    probe :: Filename -> Signal a -> Signal a

  probe works exactly like observe on the Signal level, where Signals are just lazy lists. However, probe is strict in the contents of the signal, so it can change the semantics of a signal. Encouragingly, probe has turned out to be extremely useful in practice.

• The stream-based debugger in [9] let the user observe lazy streams as they are evaluated. The information gathering mechanism was completely different. Their stream-based debugger used a primitive (isWHNF :: a -> Bool) to make sure that they never cause extra evaluation when displaying structures. We expect that we could emulate all the behavior of this debugger (and more) in our new browser.

The work in this paper was undertaken because of the success stories told by both these projects, and the hope that our generalization of both will be useful in practice when debugging Haskell programs.

A complete description of other attempts to build debugging tools for lazy functional languages is not possible due to space limitations. Here is a short summary of the techniques; for more details about writing debuggers for Haskell, Watson's thesis [10] is a great starting point. There are two basic approaches to instrumenting Haskell code:

• The first is where code is transformed to insert extra (side-effecting) functions that record specific actions, like entering functions and evaluating structures. The transformation can be done inside the compiler (and therefore compiler specific) or done as a preprocessing pass (complicating the compilation mechanism). In practice, such transformations turn out to be tied to specific compilers. One example of tracing via transformations is the work by Sparud [8], in his trace option for the nhc compiler.

• The second approach to gathering debugging information is augmenting a reduction engine to gather the relevant information, and is completely compiler specific. One example of such a reduction engine is the work by Nilsson [5], who modified the G-machine reduction engine.

Using the raw debugging information gathered to help debug Haskell programs is a difficult problem, partly for the reasons already mentioned in the introduction. One important debugging strategy is algorithmic (or declarative) debugging [6]. Algorithmic debuggers compare the result of specific chunks of computations (like function calls) with what the programmer intended. By asking the programmer (or an oracle) about expectations, the debugger can home in on a bug's location. observe can be used to perform a manual version of algorithmic debugging.

8 Conclusions & Future Work

All previous work on debuggers for Haskell has only been implemented for subsets of Haskell, and is therefore of limited use for debugging real Haskell programs. This paper combats the need for debugging real Haskell by using a portable library of debugging combinators, and develops a surprisingly rich debugging system using them.

There is work to be done with building semantics for observe. The semantics given in [3] would be a good place to start.

This debugging system could be made even more useful if the Observable class restriction was removed. It would be conceivable to have a compiler flag where Observable is passed silently everywhere, and therefore can be used without type restrictions, provided we supply a default instance for Observable. Alternatively, a reflection interface might be used to look at constructors in a polymorphic way, allowing the type class restriction to be totally eliminated.

The first version of Hood has been released, and is available from the web page. A future version will include the graphical browser. The source code (including a copy of the graphical browser) is available from the same CVS repository as GHC and Hugs. HOOD has a web page:

  http://www.haskell.org/hood

Acknowledgements

The idea for using Haskell type classes and unsafe operations to observe intermediate data structures arose from a conversation between Simon Marlow and the author in 1992, when we were both graduate students at Glasgow. Thanks Simon! Thank you also Magnus Carlsson, Tim Sheard, Richard Watson, and the anonymous referees, all of whom gave useful comments and suggestions.

References

[1] Augustsson, L., Johnsson, T. (1989) The Chalmers Lazy ML Compiler. The Computing Journal, 32(2):127-139.
[2] Claessen, K. and Hughes, J. (2000) QuickCheck: A Lightweight Tool for Random Testing of Haskell Programs. In ICFP 2000, Montreal, Canada.
[3] Launchbury, J. (1993) A Static Semantics for Lazy Functional Programs. Proc. ACM Principles of Programming Languages, Charleston.
[4] Launchbury, J., Lewis, J. and Cook, B. (1999) On embedding a microarchitectural design language within Haskell. In ICFP 99.
[5] Nilsson, H. (1998) Declarative Debugging for Lazy Functional Languages. PhD thesis. Department of Computer and Information Science, Linköping University, Sweden.
[6] Shapiro, E. (1982) Algorithmic Program Debugging. MIT Press.
[7] Sinclair, D. (1991) Debugging by Dataflow - Summary. In Proceedings of the 1991 Glasgow Workshop on Functional Programming, Portree, Isle of Skye, pp. 347-351.
[8] Sparud, J. (1995) A Transformational Approach to Debugging Lazy Functional Programs. PhD thesis. Department of Computer Science, Chalmers University of Technology, Göteborg, Sweden.
[9] Sparud, J. and Sabry, A. (1997) Debugging Reactive Systems in Haskell. Haskell Workshop, Amsterdam.
[10] Watson, R. (1997) Tracing Lazy Evaluation by Program Transformation. PhD thesis. School of Multimedia and Information Technology, Southern Cross University, Australia.

Appendix A - Haskell Code from Observe.lhs

  class Observable a where
    observer :: a -> Parent -> a

  type Observing a = a -> a

  -- The base types
  instance Observable Int     where { observer = observeBase }
  instance Observable Bool    where { observer = observeBase }
  instance Observable Integer where { observer = observeBase }
  instance Observable Float   where { observer = observeBase }
  instance Observable Double  where { observer = observeBase }
  instance Observable Char    where { observer = observeBase }

  instance Observable ()      where { observer = observeOpaque "()" }

  observeBase :: (Show a) => a -> Parent -> a
  observeBase lit cxt = seq lit $ send (show lit) (return lit) cxt

  observeOpaque :: String -> a -> Parent -> a
  observeOpaque str val cxt = seq val $ send str (return val) cxt

  -- The constructors
  instance (Observable a, Observable b) => Observable (a,b) where
    observer (a,b) = send "," (return (,) << a << b)

… Int, which gives the number of elements in a finite bound, and
• inBounds :: a -> Bounds a -> Bool, which checks for membership in the set defined by a bound.

3.4 Operations on Finite Data Fields

Sometimes it is desirable to force the evaluation of all elements in a data field. There are, for instance, parallel algorithms whose efficiency depends on the compile-time knowledge of which computations to perform. This is similar to strictness declarations for functions, which sometimes are necessary to ensure efficient execution. To this end, we have defined three data field evaluators, all of type

  (Pord a, Ix a, Eval a) => Datafield a b -> Datafield a b

that evaluate their respective arguments to different degrees. hstrictTab, for instance, evaluates all elements in a hyperstrict fashion (i.e., to the innermost constructor). foldlDf, of type

  (Pord a, Ix a, Eval a) => (b -> c -> b) -> b -> Datafield a c -> b

is the data field equivalent to foldl for lists. It reduces its data field argument in the order given by the enumeration of its bound. The reduction only includes the values indexed by elements in the domain of the corresponding partial function (note that the bound may overapproximate this domain: see [11] for details). As for lists, there are various versions of data field folds [11]. The operations in this section are only meaningful for finite data fields and will yield a runtime error if applied to an infinite data field.
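As a hedged usage sketch of foldlDf (domainSize is our name, not one from the library), one can count the defined elements of a finite data field:

  -- fold over the domain, ignoring values and counting occurrences
  domainSize :: (Pord a, Ix a, Eval a) => Datafield a b -> Int
  domainSize = foldlDf (\n _ -> n + 1) 0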

3.5 Forall-abstraction

Data Field Haskell provides a form of φ-abstraction, with the following syntax (described in the metasyntax of the Haskell report [23]):

  forall apat1 … apatn -> exp

Thus, the syntax is analogous to λ-abstraction in Haskell and includes such features as pattern-matching (which is convenient when defining multidimensional data fields). Type inference works in the same way as for λ-abstraction, although the identifiers being abstracted over must be instances of the Pord and Ix classes. The semantics of forall-abstraction is

forall x -> t = datafield (\x -> t) b

where the bound b is a function of the form of t. The limited space prohibits a detailed account of how b is computed: the exact rules are found in [1, 11]. Here, we give an informal description supported by representative examples. First, if a!x occurs in a strict position in the body of forall x -> ... then bounds a should constrain the bounds of forall x -> .... Thus,

  bounds (forall x -> a!x + b!x + 17) = (bounds a) `meet` (bounds b)

The principle generalises to forall-abstraction over tuples, which should have product bounds where each component constrains the respective variable in the tuple. Thus,

  bounds (forall (x,y) -> a!x * b!y) = (bounds a) × (bounds b)

so this expression yields the outer product of a and b with the expected bounds. For conditionals, any of the branches could be taken for any value of x. Thus, the bounds from the branches should be joined. Moreover, the conditional is strict in the condition; thus,

  bounds (forall x -> if a!x then b!x else c!x) = (bounds a) `meet` ((bounds b) `join` (bounds c))

Multidimensional arrays are important in array languages, and they often provide convenient syntax to select subarrays from matrices. In order to generalise this feature to data fields, components of product bounds of multidimensional data fields occurring in forall-abstraction can constrain the bound of the abstraction. Thus, if bounds a = b1 × b2, we have³

  bounds (forall x -> a!(1,x)) = b2

(selection of row one), and

  bounds (forall x -> a!(x,x)) = b1 `meet` b2

(main diagonal). This feature can be combined with forall-abstraction over tuples, like

  bounds (forall (x,y) -> a!(y,x)) = b2 × b1

("data field transpose"). If the bound of a is a sparse multidimensional bound, then the smallest enclosing product bound is first computed and the above then applies. Finally, we allow translations of bounds w.r.t. linear offsets, e.g., if bounds a = 1<:>5 then

  bounds (forall x -> a!(x+1)) = 0<:>4

Sparse bounds are translated similarly, and this feature combines with the others. If none of the previous cases apply (e.g., forall x -> a!(f x)), then the bound universe will result. The "compute bounds first" evaluation order of forall-abstraction gives data fields a lazy flavour. For instance, one may define a two-dimensional data field with finitely many infinitely long columns; rows are then still finite data fields.

³ A more exact bound would be if (inBounds 1 b1) then b2 else empty, but the current version of Data Field Haskell does not compute this.

3.6 For-abstraction

for-abstraction provides a convenient syntax to define data fields by cases. It essentially defines a data field from a list of pairs of bounds and expressions, and can be thought of as a "parallel case" where the different bounds provide the cases. The syntax is

  for pat in { e1 -> e1' ; … ; en -> en' }

with semantics

  (forall pat -> if inBounds pat (e1) then e1'
                 else if …
                 else if inBounds pat (en) then en'
                 else outofBounds)

restricted to the bound (e1) `join` (e2) `join` … `join` (en).
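As a hedged illustration of for-abstraction (step is our own example, and the bound expressions use the dense-bound notation l<:>u adopted in this section's examples):

  -- a piecewise data field: 0 on 0..4, and i-4 on 5..9
  step = for i in { 0<:>4 -> 0 ; 5<:>9 -> i - 4 }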

4 A Simple Example

The limited space only allows a short example; see [1, 12, 19] for more examples. Consider the linear equation system Ax = b, where A is an n × n lower-triangular matrix. (1) gives the classical forward-solving algorithm for computing x:

  x_i = \left( b_i - \sum_{j=1}^{i-1} a_{ij} x_j \right) / a_{ii}, \quad i = 1, \dots, n    (1)

This algorithm can be more or less directly expressed in Data Field Haskell:

  dfSum = foldlDf (+) 0

  fsolv a b = forall i ->
    (b!i - dfSum (for j in 1<:>(i-1) -> a!(i,j) * (fsolv a b)!j)) / a!(i,i)

Note how "dfSum (for j in 1<:>(i-1) -> ...)" corresponds to "\sum_{j=1}^{i-1} …". What is the bound of fsolv a b? It will be constrained by the bound of b, and the bounds with respect to i derived from dfSum (...) and a!(i,i). If bounds a = b1 × b2, then the latter bounds are b1 and b1 `meet` b2, respectively, and we obtain

  bounds (fsolv a b) = (bounds b) `meet` b1 `meet` b1 `meet` b2

If (bounds b) = b1 = b2 = 1<:>n, then bounds (fsolv a b) = 1<:>n as expected. The bound of the data field being summed over, finally, is given by the constraints on j: thus, it equals 1<:>(i-1) `meet` b2 `meet` bounds (fsolv a b). With bounds b, b1, and b2 as above this equals 1<:>(i-1).

Interestingly, the code above works also for sparse a: a sparse version of a dense matrix can be created with the very generic function sparsify defined below:

  sparsify x = x … predicate (\i -> x!i /= 0)

If x has a finite bound, then sparsify x will have a finite sparse bound. If fsolv is given a sparse a, the current version of Data Field Haskell will first create the bounds b1 and b2 by projecting bounds a as indicated in Fig. 3, and then the above works as before. Note that this leads to loose approximations: in particular, for each i, the bounds for the summed data field really only need to contain the j in 1<:>(i-1) where a!(i,j) is defined. It is possible to define a more complex scheme for deriving constraints of bounds arising from the use of sparse multidimensional data fields, which yields exactly this: the details can be found in [18]. However, Data Field Haskell does not yet use this scheme.

Fig. 3. The two one-dimensional projections (b1 and b2) of a sparse, two-dimensional bound (bounds a).

5 Implementation

Our implementation of Data Field Haskell is based on the NHC compiler [25], which implements Haskell v. 1.3. The execution mechanism is graph reduction, which is performed by a variant of the G-machine. Our implementation consists of:

• modifications to the front-end in order to parse and type-check forall- and for-abstractions,
• automatic derivation of instances for the new type class Pord, and for the Eval class, which has been slightly modified [11],
• a program transformation of intermediate code with forall- and for-abstractions into intermediate code without forall- and for-abstractions,
• the abstract data types for Datafield and Bounds, implemented in Haskell, and
• simple exception handling (used to implement outofBounds), implemented mostly by modifications to the back-end.

Portability and development time were deemed more important than execution speed; thus we have strived to make most of the implementation in Haskell itself. We have not implemented any advanced optimizations. The front-end modifications are quite straightforward, as is the automatic derivation of instances for the Pord and Eval classes. for- and forall-abstractions are translated into intermediate code that uses the datafield function to build data fields. In this process, calls to join and meet are also introduced. These operations obey the following equations, and we perform the corresponding simplification of expressions for bounds in the translation:

  universe `meet` x = x        x `meet` universe = x
  empty    `join` x = x        x `join` empty    = x

The implementation of the abstract data types for data fields and bounds was not entirely straightforward to do in Haskell. The problem is Bounds a. Ideally, one would define this as an algebraic data type with constructors for the different kinds of bounds. However, product bounds do not fit into this scheme, since they require that a is a tuple type. It would indeed be possible to define a type PBounds_n a1 ... an = ... that includes product bounds, but this type could then not be used for bounds over non-tuple types, and one would have to use different types for bounds and data fields over tuple types and non-tuple types. Overloading the operations on data fields and bounds through the class system does not work, since the type constructors Bounds and PBounds_n have different arities. Pattern-matching in type declarations, like

  data PBounds_n (a1,...,an) = ...

would make it possible to define a constructor class for bounds, but this is not allowed in Haskell. Thus, we have reverted to a low-level implementation of data fields and bounds, done in Haskell but with incorrect types. The implementation has some similarities with how dictionaries are used to implement overloading in Haskell. Coercion functions, which are manually given (incorrect) function types, are used as interfaces between the Datafield and Bounds types and their implementations.

Sparse bounds and tabulated data fields are represented by an abstract data type for sets, which is based on balanced binary trees. If n is the number of elements stored in the tree, then membership tests (and lookups) are done in time O(log n); unions, intersections, enumerations, and folds in time O(n); and the size is calculated in time O(log² n).

The production of ordinary error values in Haskell results in immediate termination. outofBounds must be handled in a less strict fashion, since data fields represent partial functions where the bounds may overapproximate the partial function domain, and certain operations should only be performed over the elements in this domain. Thus, it must be possible to just skip occurrences of outofBounds, rather than terminating directly when it appears. We wanted the implementation of this to be reasonably efficient. Therefore we have introduced a simple exception handling mechanism. On the Haskell kernel level, a function handle is introduced that adheres to the following:

  handle x y = y    -- if x evaluates to outofBounds
  handle x y = x    -- otherwise

isoutofBounds can now be defined as:

  isoutofBounds x = handle (seq x False) True

handle is implemented by catching exceptions, and outofBounds is implemented by throwing them.
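A hedged sketch mirroring isoutofBounds above shows how handle turns definedness into an ordinary Bool (isDefined is our name, not the library's):

  -- True iff evaluating x to WHNF does not raise outofBounds
  isDefined :: a -> Bool
  isDefined x = handle (seq x True) False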

  < n2 : n1 : S, G, HANDLE : C, D, E >                 =>  < n1 : S, G, EVAL : REMOVEHANDLER : C, D, (n2, S, C, D) : E >
  < S, G, REMOVEHANDLER : C, D, (n, S', C', D') : E >  =>  < S, G, C, D, E >
  < S, G, FAIL : C, D, (n, S', C', D') : E >           =>  < n : S', G, EVAL : C', D', E >

Fig. 4. State transitions for HANDLE, REMOVEHANDLER and FAIL.

The exception handling was implemented by modifying the G-machine of NHC. The basic G-machine, as described in [24], has four-tuples <S, G, C, D> as states. Here, S is a stack of node names, G is the graph, C is the sequence of G-code being executed, and D is the dump, a stack of pairs of code sequences and stacks. The G-machine of NHC adheres to this scheme, although its instruction set and low-level representations are somewhat different. Our modified G-machine has five-tuples <S, G, C, D, E> as states. The new component E, the exception stack, consists of quadruples (n, S, C, D) of a node name, a stack, a code sequence and a dump. (S, C, D) saves the current state when the handling of an exception is set up, and n points to the node to be evaluated on failure. We also need three new instructions: HANDLE, REMOVEHANDLER, and FAIL. The code generated for outofBounds is simply

FAIL

and the code for handle x y is

HANDLE

The idea is to abort the evaluation of x if FAIL is executed, restore the machine state to what it was before the evaluation of x began, and evaluate y. The semantics of the instructions as transitions of the modified G-machine is shown in Figure 4. The description above is for exception handling in the basic G-machine. Our actual solution for the G-machine of NHC is slightly different, due to the internal details of this G-machine, but the basic idea is the same. See [11].

6 Related Work

There is a wealth of collection-oriented languages, and it is impossible to give a full account here. An excellent survey of collection-oriented languages up to around 1990 is found in [27]. Array and data parallel languages like Fortran 90, HPF [15], and *lisp [29] have been important sources of inspiration for Data Field Haskell. The language closest to Data Field Haskell is probably FIDIL [26], whose implicit intersection rule corresponds to the propagation of bounds from strict positions below a forall-abstraction. The arrays in FIDIL resemble data fields also in other respects; for instance, they can have a wider variety of shapes than traditional array bounds. Examples of functional data parallel and array languages are Connection Machine Lisp [28], Id [4], Sisal [6], NESL [2], Data Parallel Haskell [10], and pH [21]. These languages are intended for direct parallel implementation, whereas Data Field Haskell targets collection-oriented programming in general, with more emphasis on expressiveness than efficiency. Haskell itself [23] is to some extent collection-oriented through its set of collective list operations, and it has been suggested for data parallel programming [22]. FISh [13] is an imperative array language, which shares some features with Data Field Haskell, such as advanced polymorphism. It is, however, restricted to regular arrays and certain recursion patterns, which enables the generation of good code but makes it less suitable for specification of sparse or dynamic algorithms. A survey of the research in parallel functional programming is found in [9].

"Bulk types", like the ones provided by the STL C++ library [20], provide generic collection-orientation and are similar in this respect to data fields. Peyton Jones [14] has used the class system of Haskell to define bulk types. Bulk types do not provide any particular support for multidimensional structures, and there is no counterpart to forall-abstraction and implicit derivation of bounds for expressions.

7 Conclusions and Further Research

We have defined and implemented Data Field Haskell, a Haskell dialect where data fields replace arrays. Data fields are designed with the abstract view of indexed structures as partial functions in mind. This leads to the view of bounds as set representations, and to the design of forall-abstraction, which is inspired by lambda-abstraction. The intention has been to create a language that supports collection-oriented programming at a very high level. Although our initial inspiration comes from array and data parallel programming, we believe that the data field concept

is general enough to support collection-oriented programming in a variety of applications. Data Field Haskell is designed for expressiveness rather than speed. We believe this is the right place to start, and then investigate how restricted sublanguages can be given an efficient implementation and how performance-enhancing features like mutable data fields could be introduced. Parallel implementations are also certainly possible. The efficiency of our current implementation can also be greatly improved. We have furthermore found some cases of forall-abstraction where it would be natural to have a tighter bound. We plan to upgrade our implementation to Haskell 98; in this process, we may fix some of the current deficiencies. Another desirable feature is elemental intrinsics overloading, which refers to the ability in some array languages to apply certain "scalar" operators to arrays with the meaning that the operator is applied to each element. For data fields, it would be natural to resolve this overloading into forall-expressions, e.g., to resolve a+b into forall x -> a!x + b!x, provided that a and b have the proper data field type. To some extent this is possible to do within the class system of Haskell, but the resulting overloading has certain restrictions and is also likely to lead to inefficiencies. We are investigating another scheme for elemental intrinsics overloading that is less restricted, but it is still only defined for explicitly typed languages [30]. An obvious goal is to extend this scheme to implicitly typed languages. The low-level representation of data fields and bounds is somewhat unsatisfactory, since it hurts the portability of the implementation. If Haskell's algebraic type declarations allowed pattern matching on type parameters then it would be possible to define classes for bounds and data fields. We could then do away with the low-level representations. This would also make it possible for users to define their own types of bounds. The formal data field model [18] was specifically designed to support the development of abstract data types for bounds and data fields, and the ability to define new types of bounds would be an important enhancement of the language.
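To make the class-system route concrete, here is a minimal sketch in plain Haskell. It assumes a simplified function-based Field type in place of the actual data field representation; the names Field and (!) are ours, and under Haskell 98 the Num instance would additionally require Eq and Show instances.

-- A simplified stand-in for data fields: total functions from indices
-- to values. The real representation also carries bounds.
newtype Field i a = Field (i -> a)

(!) :: Field i a -> i -> a
Field f ! x = f x

-- Elemental overloading via the class system: a+b behaves like
-- forall x -> a!x + b!x.
instance Num a => Num (Field i a) where
  Field f + Field g = Field (\x -> f x + g x)
  Field f * Field g = Field (\x -> f x * g x)
  negate (Field f)  = Field (negate . f)
  abs (Field f)     = Field (abs . f)
  signum (Field f)  = Field (signum . f)
  fromInteger n     = Field (const (fromInteger n))

The restrictions mentioned above show up immediately: fromInteger must produce a constant field with no meaningful bounds, and operators such as == cannot be lifted this way at all, since their result type Bool is fixed by the class.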

References

1. Data Field Haskell homepage. http://www.it.kth.se/labs/paradis/dfh/.
2. Guy E. Blelloch. Programming parallel algorithms. Comm. ACM, 39(3), March 1996.
3. Walter S. Brainerd, Charles H. Goldberg, and Jeanne C. Adams. Programmer's Guide to FORTRAN 90. Programming Languages. McGraw-Hill, 1990.
4. Kattamuri Ekanadham. A perspective on Id. In Boleslaw K. Szymanski, editor, Parallel Functional Languages and Compilers, chapter 6, pages 197-253. Addison-Wesley, 1991.
5. A. D. Falkoff and K. E. Iverson. The Design of APL. IBM Journal of Research and Development, pages 324-333, July 1973.
6. John T. Feo, David C. Cann, and Rodney R. Oldehoeft. A report on the Sisal language project. J. Parallel Distrib. Comput., 10:349-366, 1990.
7. Tom R. Halfhill. Sun reveals secrets of "magic". Microprocessor Report, pages 13-17, August 1999.
8. Per Hammarlund and Björn Lisper. On the relation between functional and data parallel programming languages. In Proc. Sixth Conference on Functional Programming Languages and Computer Architecture, pages 210-222. ACM Press, June 1993.
9. Kevin Hammond and Greg Michaelson, editors. Research Directions in Parallel Functional Programming. Springer-Verlag, 1999.
10. Jonathan M. D. Hill. Data Parallel Haskell: Mixing old and new glue. Tech. Rep. 611, Queen Mary and Westfield College, December 1992.
11. Jonas Holmerin. Implementing data fields in Haskell. Technical Report TRITA-IT R 99:04, Dept. of Teleinformatics, KTH, Stockholm, November 1999. ftp://ftp.it.kth.se/Reports/paradis/DFH-report.ps.gz.
12. Jonas Holmerin and Björn Lisper. Development of parallel algorithms in Data Field Haskell. Accepted to Euro-Par 2000, 2000.
13. C. Barry Jay and P. A. Steckler. The functional imperative: shape! In Chris Hankin, editor, Proc. 7th European Symposium on Programming, volume 1381 of Lecture Notes in Comput. Sci., pages 139-153, Lisbon, Portugal, March 1998. Springer-Verlag.
14. Simon Peyton Jones. Bulk types with class. In Electronic Proceedings of the 1996 Glasgow Functional Programming Workshop, Ullapool, July 1996.
15. Charles H. Koelbel, David B. Loveman, Robert S. Schreiber, Guy L. Steele, Jr., and Mary E. Zosel. The High Performance Fortran Handbook. Scientific and Engineering Computation. MIT Press, Cambridge, MA, 1994.
16. Björn Lisper. Data parallelism and functional programming. In Guy-René Perrin and Alain Darte, editors, The Data Parallel Programming Model: Foundations, HPF Realization, and Scientific Applications, Vol. 1132 of Lecture Notes in Comput. Sci., pages 220-251, Les Menuires, France, March 1996. Springer-Verlag.
17. Björn Lisper. Data fields. In Proc. Workshop on Generic Programming, Marstrand, Sweden, June 1998. http://wsinwp01.win.tue.nl:1234/WGPProceedings/.
18. Björn Lisper and Per Hammarlund. The data field model. Submitted. Preliminary version available as Tech. Rep. TRITA-IT R 99:02, Dept. of Teleinformatics, KTH, Stockholm, 2000.
19. Björn Lisper and Jonas Holmerin. Development and verification of parallel algorithms in the data field model. In Sergei Gorlatch and Christian Lengauer, editors, Proc. 2nd Int. Workshop on Constructive Methods for Parallel Programming, pages 115-130, Ponte de Lima, Portugal, July 2000.
20. David R. Musser and Atul Saini. STL Tutorial and Reference Guide. Addison-Wesley, Reading, MA, 1996.
21. Rishiyur S. Nikhil, Arvind, James E. Hicks, Shail Aditya, Lennart Augustsson, Jan-Willem Maessen, and Y. Zhou. pH language reference manual, version 1.0. Technical Report CSG-Memo-369, Massachusetts Institute of Technology, Laboratory for Computer Science, January 1995.
22. John T. O'Donnell. Data parallelism. In Hammond and Michaelson [9], chapter 7, pages 191-206.
23. John Peterson, Kevin Hammond, Lennart Augustsson, Brian Boutel, Warren Burton, Joseph Fasel, Andrew D. Gordon, John Hughes, Paul Hudak, Thomas Johnsson, Mark Jones, Erik Meijer, Simon L. Peyton Jones, Alastair Reid, and Philip Wadler. Report on the programming language Haskell: A non-strict purely functional language, version 1.4, April 1997. http://www.haskell.org/definition/.
24. Simon L. Peyton Jones. The Implementation of Functional Programming Languages. Prentice-Hall International Series in Computer Science. Prentice Hall, 1987.
25. Niklas Röjemo. Garbage Collection, and Memory Efficiency, in Lazy Functional Languages. PhD thesis, Department of Computing Science, Chalmers University of Technology, Gothenburg, Sweden, 1995.
26. Luigi Semenzato and Paul Hilfinger. Arrays in FIDIL. In Lenore M. R. Mullin, Michael Jenkins, Gaétan Hains, Robert Bernecky, and Guang Gao, editors, Arrays, Functional Languages, and Parallel Systems, chapter 10, pages 155-169. Kluwer Academic Publishers, Boston, 1991.
27. Jay M. Sipelstein and Guy E. Blelloch. Collection-oriented languages. Proc. IEEE, 79(4):504-523, April 1991.
28. Guy L. Steele and W. D. Hillis. Connection Machine LISP: Fine grained parallel symbolic programming. In Proc. 1986 ACM Conference on LISP and Functional Programming, pages 279-297, Cambridge, MA, 1986. ACM.
29. Thinking Machines Corporation, Cambridge, MA. Getting Started in *Lisp, June 1991.
30. Claes Thornberg. Towards Polymorphic Type Inference with Elemental Function Overloading. Licentiate thesis, Dept. of Teleinformatics, KTH, Stockholm, May 1999. Research Report TRITA-IT R 99:03.

Pattern Guards and Transformational Patterns

Martin Erwig
Oregon State University
[email protected]

Simon Peyton Jones
Microsoft Research Ltd, Cambridge
[email protected]

Abstract

We propose three extensions to patterns and pattern matching in Haskell. The first, pattern guards, allows the guards of a guarded equation to match patterns and bind variables, as well as to test boolean conditions. For this we introduce a natural generalisation of guard expressions to guard qualifiers. A frequently occurring special case is that a function should be applied to a matched value, and the result of this is to be matched against another pattern. For this we introduce a syntactic abbreviation, transformational patterns, that is particularly useful when dealing with views. These proposals can be implemented with very modest syntactic and implementation cost. They are upward compatible with Haskell; all existing programs will continue to work. We also offer a third, much more speculative proposal, which provides the transformational-pattern construct with additional power to explicitly catch pattern match failure. We demonstrate the usefulness of the proposed extensions by several examples; in particular, we compare our proposal with views, and we also discuss the use of the new patterns in combination with equational reasoning.

1 Introduction

Pattern matching is a well-appreciated feature of languages like ML or Haskell; it greatly simplifies the task of inspecting values of structured data types and facilitates succinct function definitions that are easy to understand. In its basic form, pattern matching tries to identify a certain structure of a value to be processed by a function. This structure is specified by a pattern, and if it can be recovered in a value, corresponding parts of the value are usually bound to variables. These bindings are exploited on the right-hand side of the definition. There are numerous proposals for extending the capabilities of patterns and pattern matching; in particular, the problems with pattern matching on abstract data types have stimulated a lot of research [19, 16, 3, 12, 4, 6, 11]. Other aspects have also been subject to extensions and generalisations of pattern matching [8, 1, 9, 7, 17]. All these approaches differ in what they can be used for, in their syntax, and in their properties, which makes it almost impossible to use two or more different approaches at the same time. Moreover, among all these different approaches there is no clear winner, although so-called views seem to be the most prominent and favourite extension. Therefore, a consolidation of pattern matching at a more fundamental level deserves attention. An extension should be simple enough so that its use is not prohibited by a complex syntax, and it should be powerful enough to express most of the existing approaches. In this paper we present a proposal for an elementary extension of patterns and pattern matching that naturally extends Haskell's current pattern matching capabilities. The design is influenced by the following goals:

• Conservative Extension. Programs that do not use the new feature should not need to be changed and should have unchanged semantics.

• Simplicity. We shall not introduce (yet another) more or less complex sub-language for specifying new kinds of patterns, for introducing pattern definitions, and so on. Instead, a minor extension to the syntax with a simple semantics should be aimed at.

• Expressiveness. It should be possible to express pattern matching on abstract data types. In particular, views [19, 3, 4, 11], and two kinds of active patterns [12, 6] should be covered.

• Efficient and Simple Implementation. The use of the new patterns should not be penalised by longer running times. Moreover, only minimal changes to an existing language should be needed. This facilitates the easy integration of the new concept into existing language implementations and supports a broad evaluation of the concept.

The remainder of this paper is structured as follows: we motivate the need for more powerful pattern matching in Section 2 and present our proposal in Sections 3 and 4. Syntax and semantics are defined in Section 5, and the implementation is discussed in Section 6. A detailed comparison with views is performed in Section 7. In Section 8 we then discuss the use of the new patterns with equational reasoning. A further extension of the expressiveness of our proposal is described in Section 9. Related work is discussed in Section 10, and finally, conclusions are given in Section 11.

2 The need for more powerful pattern matching

In the current version of Haskell pattern matching is not just a straight, one-step process because guards can be used to constrain further the selection of function equations. However, no (additional) bindings can be produced in this second step. This is a somewhat non-orthogonal design, and the extension we propose essentially generalises this aspect.

Consider the following Haskell function definition.

filter p [] = []
filter p (y:ys)
  | p y       = y : filter p ys
  | otherwise = filter p ys

The decision of which right-hand side to choose is made in two stages: first, pattern matching selects a guarded group, and second, the boolean-valued guards select among the right-hand sides of the group. In these two stages, only the pattern-matching stage can bind variables, but only the guards can call functions. It is well known that this design gives rise to a direct conflict between pattern-matching and abstraction, as we now discuss.

2.1 Abstract data types

Consider an abstract data type of sequences, which offers O(1) access to both ends of the sequence (see, for example, [10]):

nil   :: Seq a
lcons :: a -> Seq a -> Seq a
lview :: Seq a -> Maybe (a, Seq a)
rcons :: Seq a -> a -> Seq a
rview :: Seq a -> Maybe (Seq a, a)

Since sequences are realized as an abstract data type, their representation is not known, and this prohibits the use of pattern matching. The functions lview and rview provide two views of the sequence, one as a left-oriented list and the other as a right-oriented list, and thus reveal to some degree a representation of sequences (that can be different, though, from their actual implementation). This means that pattern matching against this representation is now principally possible. However, it generally leads to less clear definitions. For example, a function to filter such a sequence would have to use a case expression to scrutinise the result of, say, lview:

filtSeq :: (a->Bool) -> Seq a -> Seq a
filtSeq p xs = case (lview xs) of
                 Nothing -> nil
                 Just (y,ys) | p y       -> lcons y (filtSeq p ys)
                             | otherwise -> filtSeq p ys

This is much less satisfactory than the list version of filter, which used pattern-matching directly. Actually, it is possible to write filtSeq in a more equational way:

filtSeq :: (a->Bool) -> Seq a -> Seq a
filtSeq p xs
  | isJust lv && p y = lcons y (filtSeq p ys)
  | isJust lv        = filtSeq p ys
  | otherwise        = nil
  where lv          = lview xs
        Just (y,ys) = lv

The auxiliary function isJust is taken from the standard library Maybe:

isJust :: Maybe a -> Bool
isJust (Just x) = True
isJust Nothing  = False

The idea here is that the guard isJust lv checks that the lview returns a Just value, while the (lazily-matched) pattern Just (y,ys) is only matched if y or ys is demanded. So now filtSeq is more "equational", but it is hardly clearer than before. Alternatively, the result of lview can be matched by an auxiliary function:

filtSeq :: (a->Bool) -> Seq a -> Seq a
filtSeq p xs = filtSeq' p (lview xs)

filtSeq' :: (a->Bool) -> Maybe (a,Seq a) -> Seq a
filtSeq' p Nothing       = nil
filtSeq' p (Just (y,ys)) = if p y then lcons y (filtSeq p ys)
                                  else filtSeq p ys

A well-known approach to reconcile pattern matching and abstract data types is the views proposal; we will consider views in detail in Section 7.

2.2 Matching that involves several arguments

As another example, suppose we have an abstract data type of finite maps, with a lookup operation:

lookup :: FiniteMap -> Int -> Maybe Int

The lookup returns Nothing if the supplied key is not in the domain of the mapping, and (Just v) otherwise, where v is the value that the key maps to. Now consider the following definition:

clunky env var1 var2
  | ok1 && ok2 = val1 + val2
  | otherwise  = var1 + var2
  where m1  = lookup env var1
        m2  = lookup env var2
        ok1 = isJust m1
        ok2 = isJust m2
        Just val1 = m1
        Just val2 = m2

Much as with filtSeq, the guard ok1 && ok2 checks that both lookups succeed, using isJust to convert the maybe types to booleans. The (lazily matched) Just patterns extract the values from the results of the lookups, and bind the returned values to val1 and val2, respectively. If either lookup fails, then clunky takes the otherwise case and returns the sum of its arguments. This is certainly legal Haskell, but it is a tremendously verbose and un-obvious way to achieve the desired effect. Is it any better using case expressions?

clunky env var1 var2
  = case lookup env var1 of
      Nothing   -> fail
      Just val1 -> case lookup env var2 of
                     Nothing   -> fail
                     Just val2 -> val1 + val2
  where fail = var1 + var2

This is a bit shorter, but hardly better. Worse, if this was just one equation of clunky, with others that follow, then the thing would not work at all. That is, suppose we have

clunky' env (var1:var2:vars)
  | ok1 && ok2 = val1 + val2
  where m1 = lookup env var1
        ... as before
clunky' env [var1] = ... some stuff
clunky' env []     = ... more stuff

Now, if either of the lookups fail, we want to fall through to the second and third equations for clunky'. If we write the definition in the form of a case expression, we are forced to make the latter two equations for clunky' into a separate definition and call it in the right-hand side of fail. This is precisely why Haskell provides guards at all, rather than relying on if-then-else expressions: if the guard fails, we fall through to the next equation, whereas we cannot do that with a conditional. What is frustrating about this is that the solution is so tantalisingly near at hand! What we want to do is to pattern-match on the result of the lookup. We can do it like this:

clunky' env vars@(var1:var2:_) = help (lookup env var1) (lookup env var2) vars
  where help (Just v1) (Just v2) vars   = v1 + v2
        help _         _         [var1] = ... some stuff
        help _         _         []     = ... more stuff

Now we do get three equations, one for each right-hand side, but it is still clunky. In a big set of equations it becomes hard to remember what each Just pattern corresponds to. Worse, we cannot use one lookup in the next. For example, suppose our function was like this:

clunky'' env var1 var2
  | ok1 && ok2 = val2
  | otherwise  = var1 + var2
  where m1  = lookup env var1
        m2  = lookup env (var2 + val1)
        ok1 = isJust m1
        ok2 = isJust m2
        Just val1 = m1
        Just val2 = m2

Notice that the second lookup uses val1, the result of the first lookup. To express this with a help function requires a second helper function nested inside the first. Dire stuff.

2.3 Summary

In this section we have shown that Haskell's pattern-matching capabilities are unsatisfactory in certain situations. The first example relates to the well-known tension between pattern-matching and abstraction. The clunky example, however, was a little different — there, the matching involved two arguments (env and var1), and did not arise directly from data abstraction. There is no fundamental issue of expressiveness: we can rewrite any set of pattern-matching, guarded equations as case expressions. Indeed, that is precisely what the compiler does when compiling equations! So should we worry at all? Yes, we should. The reason that Haskell provides guarded equations is because they allow us to write down the cases we want to consider, one at a time, mostly independently of each other — the "equational style". This structure is hidden in the case version. In the case of clunky, two of the right-hand sides are really the same (fail). Furthermore, nested case expressions scale badly: the whole expression tends to become more and more indented. In contrast, the equational (albeit verbose) definition using isJust has the merit that it scales nicely to accommodate multiple equations. So we seek a way to accommodate the equational style despite a degree of abstraction.

3 A proposal: pattern guards

Our initial proposal is simple: Instead of being a boolean expression, a guard is a list of qualifiers, exactly as in a list comprehension. That is, the only syntax change is to replace exp by quals in the syntax of guarded equations. Here is how we would write clunky:

clunky env var1 var2
  | Just val1 <- lookup env var1
  , Just val2 <- lookup env var2
  = val1 + val2
... other equations for clunky

The guard of the first equation consists of two generators: the equation applies only if both lookups return Just values, binding val1 and val2 for the right-hand side; if either match fails, we fall through to the remaining equations, exactly as with boolean guards. Note also that a value bound by one qualifier, such as val1, can be used in later qualifiers, which is just what clunky'' needed.
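To spell out how the three kinds of qualifiers combine, here is a small, self-contained instance of the proposal in the syntax just given. The function addLookup and the use of the Prelude's association-list lookup in place of FiniteMap are assumptions of this sketch.

-- One guard mixing two generators, a let qualifier, and a boolean
-- qualifier; if any of them fails, we fall through to the next equation.
addLookup :: [(Int,Int)] -> Int -> Int -> Int
addLookup env var1 var2
  | Just val1 <- lookup var1 env
  , Just val2 <- lookup var2 env
  , let s = val1 + val2
  , s > 0
  = s
addLookup _ var1 var2 = var1 + var2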

4 A further proposal: transformational patterns

Pattern guards allow the programmer to call an arbitrary function and pattern-match on the result. In the important special case addressed by views, these calls take a very stylised form, and this motivates us to propose some special syntax, transformational patterns, in support. Here is how we might write filtSeq, using a transformational pattern:

filtSeq :: (a->Bool) -> Seq a -> Seq a
filtSeq p (Just (y,ys))!lview
  | p y       = lcons y (filtSeq p ys)
  | otherwise = filtSeq p ys
filtSeq p Nothing!lview = nil

The transformational pattern (Just (y,ys))!lview means informally "apply lview and match against Just (y,ys)". The expression to the right of the "!" is called the pattern action. Transformational patterns are simply syntactic sugar for an equivalent form using pattern guards, but they are notationally a little more concise. Furthermore, they are quite like views: "match Just (y,ys) against the lview view of the argument". Since the function in a transformational pattern can refer to any variables that are in scope in, or bound by, the where clause, we can write clunky as:

clunky env (Just val1)!(lookup env) (Just val2)!(lookup env) = val1 + val2
... other equations for clunky

This gives transformational patterns just a little more power than views, at the cost of a somewhat ad hoc flavour. To summarise, transformational patterns help to keep function equations single-lined, which greatly enhances readability and understanding of function definitions containing several equations. Moreover, transformational patterns are particularly useful when simulating views, see Section 7, and with equational reasoning, see Section 8.

5 Syntax and semantics

Based on the Haskell 98 Report [14], we need two small changes to the syntax to integrate pattern guards and transformational patterns: first, a guard is no longer given just by an expression but by a list of qualifiers, and second, an atomic pattern can be a pattern extended by an expression:

gd   →  | qual1 , . . . , qualn        Pattern Guard
apat →  . . . | apat!aexp              Transformational Pattern

We define the semantics of pattern guards and transformational patterns by a series of equations that relate them to "ordinary" case expressions of Haskell. We start with the reduction of pattern guards to nested case expressions. For this, we first unfold multiple guards in matches to nested case expressions. This is done to keep the further translation manageable because guards themselves can be lists of qualifiers. Hence, we replace rule (c) by

(c') case v of { p | g1 -> e1 . . . | gn -> en where { decls }; _ -> e′ }
     = case e′ of { y ->
         case v of {
           p -> let decls in
                case () of {
                  () | g1 -> e1 ;
                  _  -> . . . case () of { () | gn -> en ; _ -> y } . . . };
           _ -> y }}
     where y is a completely new variable

The construction case () of { () -> ... } indicates that these case expressions do not perform pattern matching, but are just used to look at the guards. Next we expand the list of qualifiers of each guard into a nested case expression.

(s) case () of { () | q1 , . . . , qn -> e; _ -> e′ }
    = case e′ of { y ->
        case () of {
          () | q1 -> . . . case () of { () | qn -> e; _ -> y } . . . ;
          _  -> y }}
    where y is a completely new variable

The next three equations explain how qualifiers are resolved: (t) boolean guards are transformed into conditionals, (u) a local declaration can be just moved into the body, and (v) generators are again transformed into case expressions where the generating expression is scrutinised and matched against the binding pattern.

(t) case () of { () | e0 -> e; _ -> e′ } = if e0 then e else e′

(u) case () of { () | let decls -> e; _ -> e′ } = let decls in e

(v) case () of { () | (p0 <- e0) -> e; _ -> e′ } = case e0 of { p0 -> e; _ -> e′ }

It remains to reduce transformational patterns to pattern guards. This is done by the following equation:

(w) case v of { p!f -> e; _ -> e′ } = case v of { x | (p <- f x) -> e; _ -> e′ }
    where x is a completely new variable
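As a hand-worked instance of these rules (the function firstPos is a made-up example of ours; find is the standard Data.List function), rule (w) first turns the pattern action into a generator, and rule (v) then compiles the generator into an ordinary case expression:

import Data.List (find)

-- Proposed syntax:   firstPos (Just x)!(find (>0)) = x
-- After rule (w):    firstPos v | Just x <- find (>0) v = x
-- After rule (v), where the default branch stands for the overall
-- match-failure continuation e′:
firstPos :: [Int] -> Int
firstPos v = case find (>0) v of
               Just x -> x
               _      -> error "pattern match failure"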

6 Implementation

The standard technology used by compilers for generating efficient matching trees from sets of equations can be adapted straightforwardly to accommodate pattern guards and transformational patterns. Currently, pattern guards are fully implemented in GHC, and transformational patterns are not yet implemented. The efficiency issue is a little more pressing than with pure pattern matching, because the access functions, called in the transformational pattern or the pattern guard, may be arbitrarily expensive. For example, consider

data AbsInt = Pos Int | Neg Int

absInt :: Int -> AbsInt
absInt n = if n>=0 then Pos n else Neg (-n)

f :: Int -> Int
f (Pos n)!absInt = n+1
f (Neg n)!absInt = -(n+1)

This is reasonably concise. But how many times is absInt called? In this case, it is pretty clear that it need only be called once. But what about this:

g ((Pos a)!absInt : as) ((Pos b)!absInt : bs) = ...
g ((Neg a)!absInt : as) ((Neg b)!absInt : bs) = ...
g [] [] = []
g _  _  = ...

Now it gets harder to tell! In general, it may be necessary to know the pattern-match compilation algorithm used by the compiler in order to reason precisely about how many times absInt will be called. Nevertheless, it is not difficult to extend a pattern matching algorithm with knowledge about transformational patterns so that in cases like the above (when only a function and not a complex expression is used as a pattern action) one can ensure a translation into nested case expressions in which, at each argument position, each pattern action is invoked only once. Without any changes to the pattern matching algorithm, pattern guards allow us to express the sharing explicitly in some cases:

f n | Pos n' <- v = n'+1
    | Neg n' <- v = -(n'+1)
  where v = absInt n

7 Comparison with views

Thus we might prefer to write (using views)

multC (Polar r1 t1) (Polar r2 t2) = pole (r1*r2) (t1+t2)

rather than (using transformational patterns)

multC (Polar r1 t1)!polar (Polar r2 t2)!polar = Polar (r1*r2) (t1+t2)

One might argue, though, that the latter accurately indicates that there may be some work involved in matching against a view, compared to ordinary pattern matching. With transformational patterns we can also safely use the Polar constructor (see also Section 8).

7.3 Summary

We believe that the pattern-guard and transformational-pattern proposal

• is much simpler to specify and implement than views;
• gets some expressiveness that is simply inaccessible to views;
• successfully reconciles pattern matching with data abstraction, albeit with a slightly less compact notation than views;

8 Equational reasoning

Given, say, a view Half of Int whose constructor Half halves its argument, and a function

f :: Half -> Int
f (Half i) = i+1

we could try to reason, by replacing equals for equals, that f (Half 8) = 9, which, however, is not true because f (Half 8) = 5 due to the computational part of Half. The solution proposed by Burton and Cameron [3] is to forbid the use of view constructors, such as Half, in expressions. This works well, but one always has to be aware of the status of a constructor and whether it is a view constructor or not. With transformational patterns we would have to define a plain data type together with a function performing the desired computation of the constructor Half.

data Half = Half Int

half i = Half (i `div` 2)

Now when we use Half in equations, nothing harmful can happen because all computation is made explicit. For example, it is valid to conclude f (Half 8) = 9 since Half is just a data type constructor performing no computation on its argument at all. This is because the above definition for f essentially does not use the view in its original sense. To make use of the view computations we have to give a different definition for f:

f :: Half -> Int
f (Half i)!half = i+1

But now it is evident that we just cannot use an equation like f (Half 8)!half = 9 because (Half 8)!half is not an expression. Hence, the additional requirement made by Burton and Cameron is implicitly given in our approach just by the syntax of transformational patterns. On the other hand, it is possible to use transformational patterns in equational reasoning. To explain this it is helpful to

recall how patterns are used, for example, in the transformation of an expression f e. This happens in two steps: first, the structure of e is examined (this is either obvious because e is an application of a constructor or it is given in a precondition of the current transformation, for example, something like e = x:xs). Then the equation for f that matches the structure of e is determined, say f (x:xs) = e', and f e is substituted by e'. Now transformational patterns fit into this scheme as follows. Suppose f contains an equation f p!c = e'. Then when you can make the assumption p = c e in a proof, you can replace the expression f e by e'. Of course, as with other patterns, one has to choose the first possible match to get a sound transformation. We illustrate this by a small example. Suppose we have defined selection sort with the help of a function min' :: Ord a => [a] -> (a,[a]) that extracts a minimum from a list.

sort :: Ord a => [a] -> [a]
sort [] = []
sort (m,r)!min' = m : sort r

Now we would like to prove the correctness of sort. Using the prelude function all we can define a predicate for sorted lists as follows.

sorted [] = True
sorted (x:xs) = all (x<=) xs && sorted xs
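A definition of min' consistent with its type, given here only to make the example self-contained (this particular implementation is our assumption), is:

-- Extract a minimum element and the remaining list; only ever called
-- on non-empty lists, since sort handles [] separately.
min' :: Ord a => [a] -> (a, [a])
min' [x]    = (x, [])
min' (x:xs) = let (m, r) = min' xs
              in if x <= m then (x, xs) else (m, x:r)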

9 Catching pattern match failure

Consider a graph decomposition operation

match :: Node -> Graph -> (Context,Graph)

which is defined so that for all nodes v contained in g the following law holds: embed (match v g) = g. Now with a function suc that simply projects onto the third component of a context we can give a highly concise definition of depth-first search:

dfs :: [Node] -> Graph -> [Node]
dfs []     _               = []
dfs (v:vs) (c,g)!(match v) = v : dfs (suc c++vs) g
dfs (_:vs) g               = dfs vs g

The arguments of dfs are a list of nodes to be visited and the graph to be searched, and the result is a list of nodes in depth-first order. Note that we do not need a data structure for remembering the nodes that we have already seen — by repeatedly removing contexts from the graph we rather "forget" (in the graph) the nodes that have been visited so far. If we then try to revisit a node, this leads in match to a match failure, causing dfs to try the third equation, which simply ignores the current node.

9.1 Semantics

Note that catching pattern matching failure is not possible with transformational patterns when they are just reduced to pattern guards. Therefore, we have to provide an independent semantics definition. One possible way to go is to define pattern matching within a Haskell version that accounts for exceptions. A proposal for exceptions was made in [13], giving a precise semantics together with an efficient implementation. Pattern match failure could be defined in that context to raise a Fail exception, and pattern matching would have to catch Fail exceptions to select function equations. In that proposal, catching exceptions leads, in general, to non-determinism. To see this, consider the expression e1 + e2, and assume that e1 and e2 result in two different exceptions. Now what should be the result of e1 + e2? If we avoid fixing the evaluation order, all we can do is either to define + to make a non-deterministic choice or to return the set of all exceptions raised anywhere within e1 and e2. The latter proposal was made in [13]. Even when dealing with exception sets, checking for a particular exception re-introduces non-determinism. However, in a framework where Fail is the only exception one could also think of just checking whether an exception has occurred at all or not. This eliminates non-determinism to a large degree, but it might still be the case that one and the same program could diverge or not depending on, for example, the platform or the larger context in which it was compiled. Again, consider expressions like bot + Fail. Another possibility is to define the more general behaviour of transformational patterns within the current Haskell framework. The problem we face here is that a pattern-match failure within the pattern action must not yield ⊥ since this cannot be caught in the case expression containing the transformational pattern. We can cope with this by performing a source-code transformation e^M on pattern actions to wrap all possible results with Just, and add a default case that returns Nothing. Then we perform pattern matching against Nothing in the case rule dealing with transformational patterns. The corresponding case equation is easy to

give:

(w') case v of { p!f -> e; _ -> e′ }
     = case e′ of { y ->
         case f^M v of {
           Nothing -> y;
           Just x  -> case x of { p -> e; _ -> y }}}
     where x and y are completely new variables

It remains to be shown how pattern actions can be lifted into the Maybe type. We give a definition that follows the structure of expressions. Since only case expressions are a source of possible pattern match failure, wrapping Just and adding Nothing happens just there. In all other cases, lifting is just recursively passed through (or ignored). In particular, constants, constructors, and variables remain unchanged. For application, abstraction, and case expressions we obtain:

(e1 e2)^M = e1^M e2^M
(λx.e)^M = λx.e^M
(case e of { pi -> ei })^M = case e of { pi -> Just ei ; _ -> Nothing }

The problem with this approach is that it does not work well with separate compilation, in particular, with precompiled libraries: in general, we do not have access to the definition of a function that is used in a pattern action and that lives in a separately compiled module. In that case the above scheme breaks down because we do not know the function's source code and we therefore cannot apply the source-code transformation.
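To make the lifting concrete, consider a pattern action written with an explicit case; its lifted version wraps the branch result in Just and adds the Nothing default. The names hd and hdM are ours, chosen only for this sketch:

-- Original pattern action: fails on the empty list.
hd :: [a] -> a
hd ys = case ys of (x:_) -> x

-- The e^M-lifted version: pattern match failure becomes an
-- observable Nothing instead of a runtime error.
hdM :: [a] -> Maybe a
hdM ys = case ys of { (x:_) -> Just x; _ -> Nothing }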

Summary

In this section we have sketched a more speculative development of the pattern-matching idea. We regard it as debatable whether the additional power of these extended transformational patterns is worth the cost in terms of semantic complications, or loss of separate compilation. However, first-class pattern-matching failure would very much obviate the need for pattern guards because Fail can be used to “step back” into a function’s pattern matching process. Our earlier proposals, of pattern guards and transformational patterns, described in Sections 3 and 4, do not involve any such semantic or compilation complications. 10

Related work

One of the first extensions to pattern matching was the lawful types of Miranda [18, 16]: in this approach the programmer is allowed to add equations to a data type definition that act as rewrite rules to transform data type values into a canonical representation. The approach has two main problems: first, in many applications different representations are needed to use data types and pattern matching conveniently (see, for example, the polar vs. cartesian representation of complex values [3, 12] or differently rooted trees to represent sets [6]), and Miranda laws prevent this. Second, laws cause problems with equational reasoning [16]. The most prominent and most widely accepted extension to pattern matching seems to be the view mechanism which

was first proposed by Phil Wadler [19] and that was later adapted by several others [3, 4, 11]. With views one can have as many different representations of a data type as needed. For each such representation, called view, two functions in and out must be defined that map from the (main) data type into the view type and vice versa. Views do not suffer from the first restriction of Miranda laws, but the view transformations must be inverses of each other, and this sometimes either leads to partial definitions or causes problems with equational reasoning due to ambiguities. The reasoning problems were first solved by Burton and Cameron [3] who restrict the use of view constructors only to patterns. Hence, the out function is not needed anymore, and the in function need not be injective. This has been adopted by all view proposals that were made since then. Okasaki [11] has defined the view concept (the proposal of [4]) for Standard ML. He pays special attention to the interaction of view transformations with stateful computations that are possible in ML. We have demonstrated that views can be easily simulated by transformational patterns. In [12] active destructors were introduced that allow the definition and use of patterns, called active patterns, that might perform computations to produce bindings: c p match q where r = e defines an active destructor c p that can be used as a pattern in place of q. During the matching process e is evaluated using bindings produced by q and producing new bindings in r that can finally be used by p. Active destructors extend the capabilities of views, but they require even more syntactic overhead. In particular, a new notation is needed for the typing of patterns. Active destructors can, to some degree, perform computations like pattern guards and transformational patterns, but they cannot access other variables bound in the same function equation, which we consider a highly useful feature.2 Active destructors can also be simulated by pattern guards and transformational patterns. For example, define a function c q = e and use the transformational pattern p!c in place of the active destructor c p. The “p as f” construction introduced in [2] is also similar to transformational patterns: a pattern matching function f can be converted into a pattern p so that it can be composed with other patterns. This is used to express pattern matching on union types. The goal of the active patterns introduced in [6]3 was to enable the matching of specific representations of data type values. Whereas views always map a value in one view type to a canonical representation, active patterns allow the selection of an arbitrary one. The idea is that specialised constructors can reorganise data type values before they are matched. This reorganisation is intended to yield a representation that suits the current function definition best. With regard to these preprocessing capabilities, active patterns are similar to transformational patterns and also to active destructors. However, active patterns are more general than 2 Active destructors allow a very limited and rather ad-hoc way of passing additional parameters into pattern functions; this is described in [12] only for Haskell-specific arithmetic n + k-patterns and requires yet another extension to the typing notation. 3 The work of [12] and [6] was performed independently leading to the homonym.

active destructors because their computing functions have access to other bindings of the pattern, and active patterns are less general than active destructors and than pattern guards and transformational patterns because the argument and result type must be the same. Just as views and laws and the other approaches mentioned so far were motivated by combining pattern matching and ADTs, there are some other, more limited, approaches that are also driven by specific applications: context patterns [9] give direct access to arbitrarily deeply nested sub-parts of terms; they are very similar to other tree transforming languages (for example, [8, 15]). In particular, they only work for algebraic, free data types, and computations on matched values are not possible. The abstract value constructors presented in [1] provide a facility to abbreviate terms and allow the use of these abbreviations as expressions as well as within patterns. Again, no computations are possible on the matched values. In contrast, pattern abstractions [7] do allow a very limited form of computation; the aim is to generalise pattern matching only as far as static analyses, such as checking overlapping patterns or exhaustiveness, are still decidable. A different route to pattern matching is taken by Tullsen [17], who considers patterns as functions of type a -> Maybe b. This allows the treatment of patterns as first-class objects; in particular, it is possible to write pattern combinators. Although the semantics of patterns can be simplified considerably by that approach, the use of patterns in the language is rather clumsy even after the introduction of some syntactic sugar through so-called pattern binders. Let us finally compare the described extensions with our proposal from a general point of view. Whereas it is quite easy and straightforward to use a pattern guard or a transformational pattern in a function definition (just put it there!), the use of an active destructor (or of a view, or of any other proposal) requires the definition of such a pattern at some other place before it can be used. In many cases this is prohibitive either because adding an additional declaration is not justified by, say, only one application or because it is just faster or shorter not to use that concept. For example, the definition of the function last based on a reverse view of lists (see [19]) requires some effort to define the view type and the view transformations, whereas it can be immediately written using pattern guards:

last xs | (x:_) <- reverse xs = x
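Returning to Tullsen's functional patterns mentioned above, here is a minimal sketch of the idea; the combinator names are ours, not those of [17]:

-- Patterns as functions that may fail.
type Pat a b = a -> Maybe b

-- Sequential composition of patterns.
(.>.) :: Pat a b -> Pat b c -> Pat a c
(p .>. q) x = p x >>= q

-- The pattern (x:xs) as a first-class value.
cons :: Pat [a] (a, [a])
cons (y:ys) = Just (y, ys)
cons _      = Nothing

-- A composed pattern matching lists with at least two elements.
cons2 :: Pat [a] (a, (a, [a]))
cons2 = cons .>. \(y, ys) -> fmap ((,) y) (cons ys)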