1
Making Mashups with Marmite
Jeff Wong, Jason I. Hong
Carnegie Mellon University
2
The Big Picture Problem
Lots of content out there on the web
– But not always in a form amenable to your needs
– Ex. Easy to get a list of hotels in San Jose, not so easy to sort by distance to convention center
Two observations:
– In many cases, all of the data and services people need already exist, but are not connected together
– Unlikely that a web site can predict all possible needs
3
A Solution: Mashups
Rapidly growing community of users creating “mashups” combining content from multiple web sites
– Ex. Housingmaps.com
– Ex. MySpace child predators
– Ex. Friendster locations
– Ex. Most popular videos on YouTube, Yahoo Video, …
ProgrammableWeb.com statistics
– ~1500 mashups created since April 2005
– 356 open web-based APIs available
9
But Creating Mashups is Hard
Requires lots of skill to create a mashup
– Ex. Housingmaps creator has PhD in computer science
– Ex. MySpace child predator list took months
Requires programming expertise in many areas
– Web crawling
– Text parsing
– Pattern matching
– Databases
– HTML
10
Marmite End-User Programming for Mashups
Main idea: make it easy to create web mashups
Use a dataflow approach connecting small operators
– Inspired by Unix pipes and Apple’s Automator
Example:
– Get all events from Upcoming.org
– Filter out events that are too old
– Put them all onto a map
Runs inside of a standard web browser
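The events example above can be sketched as a chain of small operators, each consuming the previous one's output. This is a minimal Python illustration, not Marmite's code: the sample rows, field names, and the `run_flow` helper are assumptions, and the "map" is only a list of marker tuples.

```python
from datetime import date

# Hypothetical event rows standing in for data fetched from an events site.
EVENTS = [
    {"name": "Jazz Night", "date": date(2007, 4, 20), "lat": 40.44, "lon": -79.99},
    {"name": "Old Fair", "date": date(2007, 1, 5), "lat": 40.45, "lon": -79.95},
    {"name": "Art Walk", "date": date(2007, 5, 2), "lat": 40.46, "lon": -79.93},
]

def get_events(_rows):
    """Source operator: ignores its input and emits event rows."""
    return list(EVENTS)

def drop_old_events(cutoff):
    """Build a processor operator that keeps events on or after `cutoff`."""
    def operator(rows):
        return [row for row in rows if row["date"] >= cutoff]
    return operator

def to_map_markers(rows):
    """Sink operator: turn rows into (label, lat, lon) markers for a map."""
    return [(row["name"], row["lat"], row["lon"]) for row in rows]

def run_flow(operators):
    """Feed each operator's output into the next, like a Unix pipe."""
    data = []
    for operator in operators:
        data = operator(data)
    return data

markers = run_flow([get_events, drop_old_events(date(2007, 4, 1)), to_map_markers])
```

Reordering or swapping entries in the list passed to `run_flow` changes the mashup without touching any operator, which is the appeal of the pipe-style design.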
11
Set of Operators
12
Data Flow View
13
Data View
14
Using Marmite (Envisioned)
Extract content from one or more web pages
– names, addresses, dates, phone #, URLs
Process it in a data flow manner
– filtering out values or adding metadata
– integrating with other data sources (similar to a database join operation)
Direct the output to a variety of sinks
– databases, map services, text files, visualizations, web pages, or source code that can be further edited
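The "similar to a database join" step could look roughly like the inner join below. This is a sketch under assumptions: `join_rows`, the column names, and the sample rows are all hypothetical, and the slides do not say how Marmite would integrate sources.

```python
def join_rows(left, right, key):
    """Inner join: merge each left row with every right row sharing `key`."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for row in left:
        for match in index.get(row[key], []):
            joined.append({**row, **match})
    return joined

# Two hypothetical data sources that share a "venue" column.
events = [
    {"venue": "Hall A", "event": "Jazz Night"},
    {"venue": "Hall Z", "event": "Mystery Show"},
]
venues = [{"venue": "Hall A", "address": "1 Main St"}]

combined = join_rows(events, venues, "venue")
```

Rows with no match on the other side ("Hall Z" here) are dropped, which is the usual inner-join behavior.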
15
Marmite
Motivation and Examples
Features and Design Rationale
User Evaluation
16
Features and Design Rationale
Conducted a series of quick evaluations to understand design space and potential problems
– Automator
– Lo-fi prototypes
17
Automator
18
Informal Automator Evaluation
Had three novices try three simple web-based tasks
– Warm-up task
– Traverse a set of web pages
– Download a set of images
Some findings:
– Some difficulties knowing how to start and what to do next
– Little feedback about state of system between operations
– Difficult to iterate due to network speed issues
19
Lo-Fi Prototypes
6 paper prototypes with 20 participants
20
Design Solutions
Problem: how to start and what to do next
Solution: suggest next actions
– Weak data typing to find types (addresses, numbers, etc.)
– Filter operators to only show relevant ones
– Suggest operators that might be applicable
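One way to picture weak data typing driving suggestions: guess a type for each column, then offer only the operators that accept a type present in the table. The patterns and the operator catalogue below are hypothetical and deliberately loose; the slides do not describe Marmite's actual rules.

```python
import re

# Hypothetical, deliberately loose patterns; real detection would need more care.
TYPE_PATTERNS = [
    ("number", re.compile(r"^-?\d+(\.\d+)?$")),
    ("url", re.compile(r"^https?://\S+$")),
    ("address", re.compile(r"^\d+\s+\w+.*\b(St|Ave|Rd|Blvd)\b", re.IGNORECASE)),
]

# Hypothetical operator catalogue: operator name -> input type it accepts.
OPERATORS = {"Geocode": "address", "Fetch page": "url", "Filter by value": "number"}

def guess_type(values):
    """Return the first type whose pattern matches every value, else 'text'."""
    for type_name, pattern in TYPE_PATTERNS:
        if values and all(pattern.match(value) for value in values):
            return type_name
    return "text"

def suggest_operators(table):
    """Suggest operators whose input type matches some column's guessed type."""
    column_types = {name: guess_type(values) for name, values in table.items()}
    found = set(column_types.values())
    return sorted(name for name, wanted in OPERATORS.items() if wanted in found)

table = {
    "where": ["12 Main St", "34 Oak Ave"],
    "price": ["10", "12.50"],
    "title": ["Jazz Night", "Art Walk"],
}
suggestions = suggest_operators(table)
```

Because the typing is weak, a wrong guess only hides or shows a suggestion; it never blocks the user from picking another operator.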
22
Design Solutions
Problem: little feedback about state of system between operations
Solution: link data flow and data view together
– Many systems take program-centric view (ex. Automator) or data-centric view (ex. spreadsheets)
– Use hybrid data flow / data view, showing an operation and its effects together
– Data view usually “spreadsheet”, other views possible too (for example, maps)
25
Design Solutions
Problem: difficult to iterate due to network speeds
Solution: cache data, let people “replay” data
– Reload, pause, play
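The caching idea can be sketched as a wrapper that fetches once and replays the stored rows on later runs. `CachingSource` and `slow_fetch` are hypothetical names, and "pause" is not modeled here.

```python
class CachingSource:
    """Wrap a slow fetch function so repeated runs replay cached rows."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable that does the slow network work
        self._cache = None       # rows from the last fetch, if any
        self.fetch_count = 0     # how many times the fetch function really ran

    def play(self):
        """Return cached rows if present; otherwise fetch once and cache."""
        if self._cache is None:
            self.reload()
        return list(self._cache)

    def reload(self):
        """Discard the cache and fetch fresh rows."""
        self._cache = list(self._fetch())
        self.fetch_count += 1
        return list(self._cache)

def slow_fetch():
    # Stand-in for a network request; a real source would call a web service.
    return [{"title": "Jazz Night"}, {"title": "Art Walk"}]

source = CachingSource(slow_fetch)
first = source.play()
second = source.play()    # replayed from cache, no second fetch
source.reload()           # explicit refresh runs the fetch again
```

Downstream operators can then be tweaked and re-run against `play()` without waiting on the network each time.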
26
Other Design Findings
Screen real estate issues
– Collapsible operators, leaving a readable label
27
Extracting Generic Content
Can’t have pre-defined extractor operators for every possible web site
– Need a more general way of extracting data from pages
Developed a generic wizard UI for selecting links
– Content from that set could be extracted via other operators
– Uses Solvent (MIT), an XPath-based algorithm for finding patterns in web pages
Finds “groups” of related web content based on how HTML is structured
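To illustrate grouping by HTML structure, the sketch below buckets links by the tag path that leads to them, so links sitting in the same kind of place land in one group. It is not Solvent's algorithm, only a simplified stand-in; the page snippet is invented, and it must be well-formed because `xml.etree.ElementTree` is an XML parser.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# A small, well-formed snippet; real pages would need a proper HTML parser.
PAGE = """
<html><body>
  <ul>
    <li><a href="/event/1">Jazz Night</a></li>
    <li><a href="/event/2">Art Walk</a></li>
  </ul>
  <div><a href="/about">About</a></div>
</body></html>
"""

def group_links_by_path(markup):
    """Group <a> elements by the tag path leading to them from the root."""
    groups = defaultdict(list)

    def walk(element, path):
        path = path + "/" + element.tag
        if element.tag == "a":
            groups[path].append(element.get("href"))
        for child in element:
            walk(child, path)

    walk(ET.fromstring(markup), "")
    return dict(groups)

groups = group_links_by_path(PAGE)
```

Here the two event links share the path `/html/body/ul/li/a` and form one group, while the "About" link falls into a separate group.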
28
Marmite
29
Operators
Operators have input types
– Operator uses this to guess which columns it wants
Operators have output types
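A rough picture of an operator using its input type to guess a column, assuming the columns already carry type labels. `OperatorSpec`, `pick_input_column`, and the type names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class OperatorSpec:
    name: str
    input_type: str    # type of column the operator consumes
    output_type: str   # type of column the operator produces

def pick_input_column(spec, column_types):
    """Return the first column whose type matches the operator's input type."""
    for column, type_name in column_types.items():
        if type_name == spec.input_type:
            return column
    return None  # no suitable column; the user would have to choose manually

geocode = OperatorSpec("Geocode", input_type="address", output_type="latlon")
columns = {"title": "text", "where": "address", "price": "number"}
chosen = pick_input_column(geocode, columns)
```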
30
Implementation
JavaScript (for underlying code) and Extensible Binding Language (XBL) for UI
Operators currently in JavaScript
– Ideally could be scriptable in any programming language
– Currently ~15 operators
31
Marmite
Motivation and Examples
Features and Design Rationale
User Evaluation
32
Evaluation
Informal user study with 6 people
– 2 novices
– 2 people with spreadsheet experience (formulas)
– 2 people with programming experience
Tasks (in increasing difficulty)
– Warmup task showing how to retrieve a set of addresses and how to geocode an address
– Search for and filter out events further than a week away
– Compile a list of events from two event services and plot them on a map
– Recreate the housingmaps site
33
Results
Three people able to complete all tasks in ~1 hour
– First two users confused about suggested actions (automatically popped up, made manual for other 4 users)
– Novice made some progress, not able to finish all tasks
Able to re-create housingmaps in ~15 minutes
34
Marmite
35
More Results
Biggest barrier was understanding the data flow
– Did not understand input and output concept
– Applied operators as one-off, did not realize that it was a static representation of flow
– Did not understand data flow and data view were linked
36
Future Directions
Short-term
– Better screen-scraping operators
– More operators
– Better connection with web services (WSDL and REST)
– Better help for starting a data flow
Long-term
– Intelligence analysis
– Better visualizations
– Location-based services
37
Conclusions
Marmite, a tool for creating web-based mashups
– Extract content from one or more web pages
– Process it in a data flow manner
– Direct the output to a variety of sinks
Hybrid data flow / data view
User evaluation shows some promising results
Jeff Wong, Jason Hong, Making Mashups with Marmite: Re-purposing Web Content through End-User Programming, CHI 2007
41
Marmite
42
Types of Operators
Sources
– Add data into Marmite by querying databases, extracting information from web pages, and so on.
Processors
– Modify, combine, or delete existing rows. Example operators include geocoding (converting street addresses to latitude and longitude) and filtering. Processor operators might add or remove columns as well.
Sinks
– Redirect the flow of data out of Marmite. Examples include showing data on a map, saving it to a file, or writing it to a web page.
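The three roles can be sketched as plain functions: a source that adds rows, processors that add a column or delete rows, and a sink that sends the result out as CSV text. Everything here is an assumption for illustration; in particular `FAKE_GEOCODER` is a made-up lookup table standing in for a real geocoding service.

```python
import csv
import io

# Hypothetical lookup table standing in for a real geocoding service.
FAKE_GEOCODER = {"12 Main St": (40.0, -80.0), "34 Oak Ave": (41.0, -81.0)}

def source_rows():
    """Source: add rows into the flow (here, hard-coded sample data)."""
    return [
        {"title": "Jazz Night", "address": "12 Main St"},
        {"title": "Art Walk", "address": "34 Oak Ave"},
        {"title": "Pop-up", "address": "Nowhere"},
    ]

def geocode(rows):
    """Processor: add lat/lon columns; rows that cannot be geocoded get None."""
    out = []
    for row in rows:
        lat, lon = FAKE_GEOCODER.get(row["address"], (None, None))
        out.append({**row, "lat": lat, "lon": lon})
    return out

def keep_located(rows):
    """Processor: delete rows that have no coordinates."""
    return [row for row in rows if row["lat"] is not None]

def sink_csv(rows):
    """Sink: send the flow out as CSV text (writing a file works the same way)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["title", "address", "lat", "lon"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = sink_csv(keep_located(geocode(source_rows())))
```

Swapping `sink_csv` for a map or web-page sink would leave the source and processors unchanged.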