Advanced Programming Andrew Black and Tim Sheard Lecture 11 Parsing Combinators.

Slides:



Advertisements
Similar presentations
0 PROGRAMMING IN HASKELL Chapter 5 - List Comprehensions.
Advertisements

Haskell Lets review some of the Haskell concepts you have been learning on your own. The answers are included, but try it yourself first.
Chapter 2 Syntax A language that is simple to parse for the compiler is also simple to parse for the human programmer. N. Wirth.
Layered Combinator Parsers with a Unique State Pieter Koopman Rinus Plasmeijer Nijmegen, The Netherlands.
0 PROGRAMMING IN HASKELL Chapter 10 - Declaring Types and Classes.
Exercises – don’t use built in functions for these as we want the practice Write a recursive function to add up all the numbers in a list "flatten" a list.
© M. Winter COSC 4P41 – Functional Programming Abstract data types (ADTs) An ADT is a data type together with some functions to manipulate elements.
0 LECTURE 5 LIST COMPREHENSIONS Graham Hutton University of Nottingham.
What is a Parser? A parser is a program that analyses a piece of text to determine its syntactic structure  3 means 23+4.
0 PROGRAMMING IN HASKELL Chapter 5 - List Comprehensions.
Cse536 Functional Programming 1 6/23/2015 Lecture #17, Dec. 1, 2004 Todays Topics – Higher Order types »Type constructors that take types as arguments.
Advanced Programming Andrew Black and Tim Sheard Lecture 4 Intro to Haskell.
Yu-Chen Kuo1 Chapter 2 A Simple One-Pass Compiler.
0 PROGRAMMING IN HASKELL Chapter 3 - Types and Classes.
Comp 205: Comparative Programming Languages Functional Programming Languages: More Lists Recursive definitions List comprehensions Lecture notes, exercises,
Design patterns (?) for control abstraction What do parsers, -calculus reducers, and Prolog interpreters have in common?
0 PROGRAMMING IN HASKELL Typeclasses and higher order functions Based on lecture notes by Graham Hutton The book “Learn You a Haskell for Great Good” (and.
Functional Programming Lecture 4 - More Lists Muffy Calder.
Compiler Phases: Source program Lexical analyzer Syntax analyzer Semantic analyzer Machine-independent code improvement Target code generation Machine-specific.
CS 461 – Oct. 7 Applications of CFLs: Compiling Scanning vs. parsing Expression grammars –Associativity –Precedence Programming language (handout)
COMP Parsing 2 of 4 Lecture 22. How do we write programs to do this? The process of getting from the input string to the parse tree consists of.
0 PROGRAMMING IN HASKELL Some first steps Based on lecture notes by Graham Hutton The book “Learn You a Haskell for Great Good” (and a few other sources)
Lesson 3 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
COP 4620 / 5625 Programming Language Translation / Compiler Writing Fall 2003 Lecture 3, 09/11/2003 Prof. Roy Levow.
Lee CSCE 314 TAMU 1 CSCE 314 Programming Languages Haskell: Types and Classes Dr. Hyunyoung Lee.
1-Nov-15 Haskell II Functions and patterns. Data Types Int + - * / ^ even odd Float + - * / ^ sin cos pi truncate Char ord chr isSpace isUpper … Bool.
Functional Programming guest lecture by Tim Sheard Parsing in Haskell Defining Parsing Combinators.
Advanced Functional Programming Tim Sheard 1 Lecture 14 Advanced Functional Programming Tim Sheard Oregon Graduate Institute of Science & Technology Lecture.
0 PROGRAMMING IN HASKELL Chapter 9 - Higher-Order Functions, Functional Parsers.
Advanced Functional Programming Tim Sheard 1 Lecture 6 Functional Programming Tim Sheard & Mark Jones Monads & Interpreters.
© M. Winter COSC 4P41 – Functional Programming Programming with actions Why is I/O an issue? I/O is a kind of side-effect. Example: Suppose there.
CPS 506 Comparative Programming Languages Syntax Specification.
Parsec Parsing. Parsec Parsec one of the standard libraries for building libraries. It is a combinator parser A parser parses a sequence of elements to.
Syntax The Structure of a Language. Lexical Structure The structure of the tokens of a programming language The scanner takes a sequence of characters.
Com Functional Programming Lexical Analysis Marian Gheorghe Lecture 15 Module homepage Mole & ©University of Sheffieldcom2010.
1 CS 457/557: Functional Languages Lists and Algebraic Datatypes Mark P Jones Portland State University.
Advanced Functional Programming Tim Sheard 1 Lecture 17 Advanced Functional Programming Tim Sheard Oregon Graduate Institute of Science & Technology Lecture:
Workshop: Towards Highly Portable Software Jakarta, 21 – 23 January 2003 Diselenggarakan oleh Universitas IndonesiaUniversitas Indonesia Part 1 : Programming.
0 PROGRAMMING IN HASKELL Based on lecture notes by Graham Hutton The book “Learn You a Haskell for Great Good” (and a few other sources) Odds and Ends,
Functional Programming Lecture 3 - Lists Muffy Calder.
© M. Winter COSC 4P41 – Functional Programming Some functions id :: a -> a id x = x const :: a -> b -> a const k _ = k ($) :: (a -> b) -> a -> b.
Set Comprehensions In mathematics, the comprehension notation can be used to construct new sets from old sets. {x2 | x  {1...5}} The set {1,4,9,16,25}
Set Comprehensions In mathematics, the comprehension notation can be used to construct new sets from old sets. {x2 | x  {1...5}} The set {1,4,9,16,25}
What is a Parser? A parser is a program that analyses a piece of text to determine its syntactic structure  3 means 23+4.
Announcements Today: the Parsing Domain-specific Language
Parsing 2 of 4: Scanner and Parsing
String is a synonym for the type [Char].
Types CSCE 314 Spring 2016.
Chapter 3 Lexical Analysis.
PROGRAMMING IN HASKELL
Functions and patterns
Interpreters Study Semantics of Programming Languages through interpreters (Executable Specifications) cs7100(Prasad) L8Interp.
A lightening tour in 45 minutes
Syntax Analysis Sections :.
PROGRAMMING IN HASKELL
PROGRAMMING IN HASKELL
PROGRAMMING IN HASKELL
Lecture 4: Lexical Analysis & Chomsky Hierarchy
Types and Classes in Haskell
PROGRAMMING IN HASKELL
CSCE 314: Programming Languages Dr. Dylan Shell
Haskell Types, Classes, and Functions, Currying, and Polymorphism
PROGRAMMING IN HASKELL
PROGRAMMING IN HASKELL
CSCE 314: Programming Languages Dr. Dylan Shell
Functions and patterns
PROGRAMMING IN HASKELL
PROGRAMMING IN HASKELL
CSCE 314: Programming Languages Dr. Dylan Shell
PROGRAMMING IN HASKELL
Presentation transcript:

Advanced Programming Andrew Black and Tim Sheard Lecture 11 Parsing Combinators

Type of a Parser data Parser a = Parser (String -> [(a,String)]) A function inside a data definition. The output is a list of successful parses. This type can be made into a monad –A monad is the sequencing operator in Haskell. Also be made into a Monad with zero and (++) or plus.

Defining the Monad Technical details, can be ignored when using combinators instance Monad Parser where return v = Parser (\inp -> [(v,inp)]) p >>= f = Parser (\inp -> concat [applyP (f v) out | (v,out) <- applyP p inp]) instance MonadPlus Parser where mzero = Parser (\inp -> []) mplus (Parser p) (Parser q) = Parser(\inp -> p inp ++ q inp) instance Functor Parser where... where applyP undoes the constructor applyP (Parser f) x = f x Note the comprehension syntax

Typical Parser Because the parser is a monad we can use the Do syntax. do { x1 <- p1 ; x2 <- p2 ;... ; xn <- pn ; f x1 x2... Xn }

Running the Parser Running Parsers papply :: Parser a -> String -> [(a,String)] papply p = applyP (do {junk; p}) junk skips over white space and comments. We'll see how to define it later

Simple Primitives applyP :: Parser a -> String -> [(a,String)] applyP (Parser p) = p item :: Parser Char item = Parser (\inp -> case inp of "" -> [] (x:xs) -> [(x,xs)]) sat :: (Char -> Bool) -> Parser Char sat p = do {x <- item; if p x then return x else mzero} ? papply item "abc" [('a',"bc")]

Examples ? papply item "abc" [('a',"bc")] ? papply (sat isDigit) "123" [('1',"23")] ? parse (sat isDigit) "abc" []

Useful Parsers char :: Char -> Parser Char char x = sat (x ==) digit :: Parser Int digit = do { x <- sat isDigit ; return (ord x - ord '0') } lower :: Parser Char lower = sat isLower upper :: Parser Char upper = sat isUpper

Examples char x = sat (x ==) ? papply (char 'z') "abc" [] ? papply (char 'a') "abc" [('a',"bc")] ? papply digit "123" [(1,"23")] ? papply upper "ABC" [('A',"BC")] ? papply lower "ABC" []

More Useful Parsers –letter :: Parser Char –letter = sat isAlpha Can even use recursion –string :: String -> Parser String –string "" = return "" –string (x:xs) = – do {char x; string xs; return (x:xs) } Helps define even more useful parsers –identifier :: Parser String –identifier = do {x <- lower – ; xs <- many alphanum – ; return (x:xs)} What do you think many does?

Examples ? papply (string "tim") "tim is red" [("tim"," is red")] ? papply identifier "tim is blue" [("tim"," is blue")] ? papply identifier "x5W3 = 12" [("x5W3"," = 12")]

Choice -- 1 parser or another Note that the ++ operator (from MonadPlus) gives non- deterministic choice. –instance MonadPlus Parser where – (Parser p) ++ (Parser q) – = Parser(\inp -> p inp ++ q inp) Sometimes we’d like to prefer one choice over another, and take the second only if the first fails We don’t we need an explicit sequencing operator because the monad sequencing plays that role.

Efficiency force :: Parser a -> Parser a force p = Parser (\ inp -> let x = applyP p inp in (fst (head x), snd (head x)) : (tail x) ) Deterministic Choice (+++) :: Parser a -> Parser a -> Parser a p +++ q = Parser(\inp -> case applyP (p `mplus` q) inp of [] -> [] (x:xs) -> [x])

Example –? papply (string "x" +++ string "b") "abc" –[] –? papply (string "x" +++ string "b") "bcd" –[("b","cd")]

Sequences (more recursion) many :: Parser a -> Parser [a] many p = force (many1 p +++ return []) many1 :: Parser a -> Parser [a] many1 p = do {x <- p ; xs <- many p ; return (x:xs)} sepby :: Parser a -> Parser b -> Parser [a] p `sepby` sep = (p `sepby1` sep) +++ return [] sepby1 :: Parser a -> Parser b -> Parser [a] p `sepby1` sep = do { x <- p ; xs <- many (do {sep; p}) ; return (x:xs) }

Example ? papply (many (char 'z')) "zzz234" [("zzz","234")] ? papply (sepby (char 'z') spaceP) "z z z 34" [("zzz"," 34")]

Sequences separated by operators chainl :: Parser a -> Parser (a -> a -> a) -> a -> Parser a chainl p op v = (p `chainl1` op) +++ return v chainl1 :: Parser a -> Parser (a -> a -> a) -> Parser a p `chainl1` op = do {x <- p; rest x } where rest x = do {f <- op; y <- p; rest (f x y)} +++ return x ? papply (chainl int (return (+)) 0) "1 3 4 abc" [(8,"abc")]

Tokens and Lexical Issues spaceP :: Parser () spaceP = do {many1 (sat isSpace); return ()} comment :: Parser () comment = do{string "--"; many (sat p); return ()} where p x = x /= '\n' junk :: Parser () junk = do {many (spaceP +++ comment); return ()} A Token is any parser followed by optional white space or a comment token :: Parser a -> Parser a token p = do {v <- p; junk; return v}

Using Tokens symb :: String -> Parser String symb xs = token (string xs) ident :: [String] -> Parser String ident ks = do { x <- token identifier ; if (not (elem x ks)) then return x else zero } nat :: Parser Int nat = token natural natural :: Parser Int natural = digit `chainl1` return (\m n -> 10*m + n)

Example ? papply (token (char 'z')) "z 123" [('z',"123")] ? papply (symb "tim") "tim is cold" [("tim","is cold")] ? papply natural "123 abc" [(123," abc")] ? papply (many identifier) "x d3 23" [(["x"]," d3 23")] ? papply (many (token identifier)) "x d3 23" [(["x", "d3"],"23")]

More Parsers int :: Parser Int int = token integer integer :: Parser Int integer = (do {char '-’ ; n <- natural ; return (-n)}) +++ nat

Example: Parsing Expressions data Term = Add Term Term | Sub Term Term | Mult Term Term | Div Term Term | Const Int addop:: Parser(Term -> Term -> Term) addop = do {symb "+"; return Add} +++ do {symb "-"; return Sub} mulop:: Parser(Term -> Term -> Term) mulop = do {symb "*"; return Mult} +++ do {symb "/"; return Div}

Constructing a Parse tree expr :: Parser Term addop :: Parser (Term -> Term -> Term) mulop :: Parser (Term -> Term -> Term) expr = term `chainl1` addop term = factor `chainl1` mulop factor = (do { n <- token digit ; return (Const n)}) +++ (do {symb "(“ ; n <- expr ; symb ")“ ; return n}) ? papply expr "5 abc" [(Const 5,"abc")] ? papply expr " " [(Sub (Add (Const 4) (Const 5))(Const 2),[])]

Array Based Parsers type Subword = (Int,Int) newtype P a = P (Array Int Char -> Subword -> [a]) unP (P z) = z emptyP :: P () emptyP = P f where f z (i,j) = [() | i == j] notchar :: Char -> P Char notchar s = P f where f z (i,j) = [z!j | i+1 == j, z!j /= s] charP :: Char -> P Char charP c = P f where f z (i,j) = [c | i+1 == j, z!j == c]

anychar :: P Char anychar = P f where f z (i,j) = [z!j | i+1 == j] anystring :: P(Int,Int) anystring = P f where f z (i,j) = [(i,j) | i <= j] symbol :: String -> P (Int,Int) symbol s = P f where f z (i,j) = if j-i == length s then [(i,j)| and [z!(i+k) == s!!(k-1) | k <-[1..(j-i)]]] else []

Combinators infixr 6 ||| (|||) :: P b -> P b -> P b (|||) (P r) (P q) = P f where f z (i,j) = r z (i,j) ++ q z (i,j) infix 8 <<< ( c) -> P b -> P c (<<<) f (P q) = P h where h z (i,j) = map f (q z (i,j)) infixl 7 ~~~ (~~~) :: P(b -> c) -> P b -> P c (~~~) (P r) (P q) = P f where f z (i,j) = [f y | k <- [i..j], f <- r z (i,k), y <- q z (k,j)]

run :: String -> P b -> [b] run s (P ax) = ax (s2a s) (0,length s) s2a s = (array bounds (zip [1..] s)) where bounds = (1,length s) instance Monad P where return x = P(\ z (i,j) -> if i==j then [x] else []) (>>=) (P f) g = P h where h z (i,j) = concat[ unP (g a) z (k,j) | k <- [i..j], a <- f z (i,k)]

Examples p1 = do { symbol "tim"; c <- anychar ; symbol "tom"; return c} ex4 = run "tim5tom" p1 ex5 = run "timtom" p1 Main> ex4 "5" Main> ex5 ""

Exercise in class Write a parser for regular expressions