6/10/2015, Martin Odersky, LAMP, EPFL

Programming Language Abstractions for Semi-Structured Data
Martin Odersky, Sebastian Maneth, Burak Emir
EPFL

Scala and XML
The project studies language constructs and implementation techniques for processing XML data in a general-purpose programming language.
It is based on the recently released Scala programming language (scala.epfl.ch).
Scala unifies functional and object-oriented programming. Both idioms have a lot to offer, and new applications will require a combination of the two.

Example 1: Distributed programming and web services
Immutable data are essential for achieving robustness and efficiency of applications in the face of replication and partial failure.
Example 2: XML processing
Conciseness and safety are helped by:
- pattern matching over trees
- regular expression patterns and types
- tree transformer combinators
(Design principle: fusion instead of agglutination.)

Design Aim
You should not have the impression that you are programming either functionally or in an object-oriented style.
Three examples of how this is achieved:
1. Modules are objects.
2. Pattern matching over class hierarchies.
3. XML processing.

1. Modules are Objects
Traditional modules and objects have complementary strengths:
- Modules are good at abstraction, e.g. abstract types in signatures.
- Objects are good at composition, e.g. inheritance, recursion, and dynamic composability, because objects are first-class.
Idea: Identify
    Object    = Module
    Interface = Signature
    Class     = Functor
Consequence: Objects and interfaces need to contain type members. Furthermore, type members can be either abstract or concrete.
(Papers on this at FOOL 10 and ECOOP 2003.)

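The identification above can be sketched in present-day Scala syntax. This is only an illustration, not code from the talk: the names SetSig, IntListSet, Distinct and IntDistinct are hypothetical. A trait plays the role of a signature with abstract type members, an object plays the role of a module that makes them concrete, and an abstract class plays the role of a functor from one module to another.

```scala
// Illustrative sketch only; every name here is hypothetical.
object ModulesAsObjects {

  // Interface = Signature: abstract type members plus operations over them.
  trait SetSig {
    type Elem                                   // abstract type member
    type Repr                                   // abstract representation type
    def empty: Repr
    def insert(s: Repr, x: Elem): Repr
    def contains(s: Repr, x: Elem): Boolean
  }

  // Object = Module: the type members are made concrete.
  object IntListSet extends SetSig {
    type Elem = Int
    type Repr = List[Int]
    def empty: Repr = Nil
    def insert(s: Repr, x: Elem): Repr = if (s.contains(x)) s else x :: s
    def contains(s: Repr, x: Elem): Boolean = s.contains(x)
  }

  // Class = Functor: given any module `sets`, it provides a new operation.
  abstract class Distinct {
    val sets: SetSig
    // Counts distinct elements using only the operations of the signature.
    def count(xs: List[sets.Elem]): Int =
      xs.foldLeft((sets.empty, 0)) { case ((seen, n), x) =>
        if (sets.contains(seen, x)) (seen, n) else (sets.insert(seen, x), n + 1)
      }._2
  }

  // Applying the "functor" to the IntListSet module.
  object IntDistinct extends Distinct {
    val sets: IntListSet.type = IntListSet
  }

  def main(args: Array[String]): Unit =
    println(IntDistinct.count(List(1, 2, 2, 3)))
}
```

Because `sets` is declared with the singleton type `IntListSet.type`, the path-dependent type `IntDistinct.sets.Elem` is known to be `Int` at the call site, while `Distinct` itself never sees the representation.
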
2. Pattern Matching over Class Hierarchies
How are data decomposed?
- OO approach: through virtual member access.
- Functional approach: through pattern matching over algebraic data types.
The two are complementary with respect to extensibility:
- OO: easy to add new kinds of data with a fixed method interface.
- Functional: easy to add new kinds of processors over a fixed data type.
How can we get extensibility in both directions?

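The two decomposition styles can be put side by side in a small sketch. It is an illustration under assumptions, not code from the talk: the expression types and the names Neg and show are hypothetical. In the OO variant a new kind of data (Neg) is added without touching existing classes; in the functional variant a new processor (show) is added without touching the data type.

```scala
// Illustrative sketch only; all names are hypothetical.
object Decomposition {

  // OO approach: data are decomposed through virtual member access.
  object OO {
    trait Exp { def eval: Int }
    class Num(x: Int) extends Exp { def eval: Int = x }
    class Plus(l: Exp, r: Exp) extends Exp { def eval: Int = l.eval + r.eval }
    // Easy: a new kind of data implementing the fixed method interface.
    class Neg(e: Exp) extends Exp { def eval: Int = -e.eval }
  }

  // Functional approach: pattern matching over a fixed algebraic data type.
  object Functional {
    sealed trait Exp
    case class Num(x: Int) extends Exp
    case class Plus(l: Exp, r: Exp) extends Exp

    def eval(e: Exp): Int = e match {
      case Num(x)     => x
      case Plus(l, r) => eval(l) + eval(r)
    }
    // Easy: a new processor over the fixed data type.
    def show(e: Exp): String = e match {
      case Num(x)     => x.toString
      case Plus(l, r) => "(" + show(l) + " + " + show(r) + ")"
    }
  }

  def main(args: Array[String]): Unit = {
    println(new OO.Plus(new OO.Num(1), new OO.Neg(new OO.Num(3))).eval)
    println(Functional.show(Functional.Plus(Functional.Num(1), Functional.Num(2))))
  }
}
```

The hard direction in each style is the easy one in the other: adding a `show` method to the OO variant means editing every class, and adding `Neg` to the functional variant means editing every match.
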
Case Classes and Pattern Matching
Idea: Allow pattern matching over constructors of classes in a class hierarchy.

    trait Base {
      trait Exp
      case class Num(x: Int) extends Exp
      // Evaluates the expression forms known to Base.
      def eval(e: Exp): Int = e match {
        case Num(x) => x
      }
    }

    trait BasePlus extends Base {
      case class Plus(l: Exp, r: Exp) extends Exp
      // Handles the new form and delegates all others to Base.
      override def eval(e: Exp): Int = e match {
        case Plus(l, r) => eval(l) + eval(r)
        case _ => super.eval(e)
      }
    }

Full code reuse is possible, and the pattern is easy to set up.
Static type safety can be achieved by refining this pattern (see FOOL 11).

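A runnable version of the Base/BasePlus pattern, written in present-day Scala syntax, might look as follows. The traits Base and BasePlus follow the slide; the second extension BaseNeg, the object Lang, and the wrapper object are hypothetical additions that show how independently written extensions could be mixed together.

```scala
// Sketch under assumptions: BaseNeg and Lang are hypothetical, not from the talk.
object CaseClassExtension {

  trait Base {
    trait Exp
    case class Num(x: Int) extends Exp
    def eval(e: Exp): Int = e match {
      case Num(x) => x
    }
  }

  trait BasePlus extends Base {
    case class Plus(l: Exp, r: Exp) extends Exp
    override def eval(e: Exp): Int = e match {
      case Plus(l, r) => eval(l) + eval(r)
      case _          => super.eval(e)   // fall back to the inherited cases
    }
  }

  // A second, independent extension (hypothetical).
  trait BaseNeg extends Base {
    case class Neg(e: Exp) extends Exp
    override def eval(e: Exp): Int = e match {
      case Neg(inner) => -eval(inner)
      case _          => super.eval(e)
    }
  }

  // Mixin composition: both extensions are reused unchanged.
  // Unknown forms flow from BaseNeg to BasePlus to Base via super.eval.
  object Lang extends BasePlus with BaseNeg

  def main(args: Array[String]): Unit = {
    import Lang._
    println(eval(Plus(Num(1), Num(2))))
    println(eval(Neg(Plus(Num(1), Num(2)))))
  }
}
```

Note that an expression form no trait handles reaches `Base.eval` and fails at run time with a `MatchError`; that gap is what the statically type-safe refinement mentioned on the slide is meant to close.
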
3. Representing XML Documents
On an abstract level, an XML document is simply a tree.
We use and extend standard software to convert between external documents and trees.
Trees are pure data; no methods are attached to nodes.
Trees should reflect the application domain, rather than being a generic "one size fits all" structure such as DOM.
This means we want different kinds of data types for different kinds of tree nodes.
[Figure: tree diagram with the nodes BookList, Header, Book*, Publisher, Date, Title, Author*, Abstract, Keyword*]

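A domain-specific tree of this kind could be modelled with one case class per node type. The sketch below is an assumption throughout: the nesting (Header holds Publisher and Date; Book holds Title, Author*, Abstract, Keyword*) is inferred from the node names on the slide, the field names and String payloads are hypothetical, and the sample values are made-up placeholders.

```scala
// Sketch only: nesting, field names and sample data are assumptions.
object BookListModel {
  // Leaf nodes carry text only; no methods are attached to nodes.
  case class Publisher(name: String)
  case class Date(value: String)
  case class Title(text: String)
  case class Author(name: String)
  case class Abstract(text: String)
  case class Keyword(word: String)

  // Inner nodes; a starred child (e.g. Author*) becomes a List.
  case class Header(publisher: Publisher, date: Date)
  case class Book(title: Title,
                  authors: List[Author],
                  summary: Abstract,
                  keywords: List[Keyword])
  case class BookList(header: Header, books: List[Book])

  // Placeholder document, invented purely for illustration.
  val sample: BookList = BookList(
    Header(Publisher("Example Press"), Date("2004")),
    List(
      Book(Title("A Sample Book"),
           List(Author("First Author"), Author("Second Author")),
           Abstract("Placeholder abstract."),
           List(Keyword("xml"), Keyword("trees")))
    )
  )

  def main(args: Array[String]): Unit =
    println(sample.books.map(_.title.text))
}
```

Each node type gets its own static type, in contrast to a generic DOM node, so a `Book` without a `Title` cannot be constructed at all.
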
Parsing XML Trees in Java
How can trees be decomposed?
In an object-oriented language:
- Type tests and type casts: ugly and inefficient.

    if (node instanceof Header) {
      Header header = (Header)node;
      Publisher pub = (Publisher)header;
    } else if (node instanceof Book)...

- Visitors: heavyweight, hard to extend.

    node.visit(new Visitor() {
      void visitHeader() {... }
      void visitBook() {... }
    });

In a functional language:
- Recursive pattern match over trees. Problem again: extensibility.

Parsing XML Trees in Scala
In Scala, we can represent XML data as instances of case classes and use pattern matching to access their elements. For example:

    entry match {
      case Header(pub, date) => …
      case Book(title, info) => …
    }

In general, we need to match on sequences of XML trees. This is done by extending Scala's pattern matching to regular expressions.

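A self-contained sketch of both kinds of match follows. It is an illustration under assumptions: the field types of Header and Book, the Entry trait, and the functions describe and summarize are hypothetical. It also does not reproduce the regular-expression pattern extension described on the slide; it uses only the trailing sequence wildcard `_*` that current Scala supports.

```scala
// Sketch only: field types and function names are assumptions.
object EntryMatching {
  sealed trait Entry
  case class Header(pub: String, date: String) extends Entry
  case class Book(title: String, info: List[String]) extends Entry

  // Matching a single tree node, as in the slide's example.
  def describe(entry: Entry): String = entry match {
    case Header(pub, date) => s"header: $pub, $date"
    case Book(title, info) => s"book: $title (${info.length} info items)"
  }

  // Matching a sequence of trees: one Header followed by any number of entries.
  def summarize(entries: List[Entry]): String = entries match {
    case List(Header(pub, _), rest @ _*) =>
      val books = rest.count {
        case _: Book => true
        case _       => false
      }
      s"$pub: $books book(s)"
    case _ => "no header"
  }

  def main(args: Array[String]): Unit = {
    val doc = List(Header("Example Press", "2004"), Book("A", Nil), Book("B", List("x")))
    doc.foreach(e => println(describe(e)))
    println(summarize(doc))
  }
}
```

The pattern `List(Header(pub, _), rest @ _*)` corresponds roughly to the content model "Header followed by anything"; richer content models such as `Book*` in the middle of a sequence are where full regular-expression patterns would be needed.
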