
Making Mashups with Marmite

Jeff Wong, Jason I. Hong
Carnegie Mellon University

The Big Picture Problem
  • Lots of content out there on the web
    • But not always in a form amenable to your needs
    • Ex. Easy to get a list of hotels in San Jose, not so easy to sort by distance to convention center
  • Two observations:
    • In many cases, all of the data and services people need already exist, but they are not connected together
    • Unlikely that a web site can predict all possible needs
A Solution: Mashups
  • Rapidly growing community of users creating “mashups” combining content from multiple web sites
    • Ex. Housingmaps.com
    • Ex. MySpace child predators
    • Ex. Friendster locations
    • Ex. Most popular videos on YouTube, Yahoo Video, …
  • ProgrammableWeb.com statistics
    • ~1500 mashups created since April 2005
    • 356 open web-based APIs available
But Creating Mashups is Hard
  • Requires lots of skill to create a mashup
    • Ex. Housingmaps creator has PhD in computer science
    • Ex. MySpace child predator list took months
  • Requires programming expertise in many areas
    • Web crawling
    • Text parsing
    • Pattern matching
    • Databases
    • HTML
Marmite: End-User Programming for Mashups
  • Main idea: make it easy to create web mashups
  • Use a dataflow approach connecting small operators
    • Inspired by Unix pipes and Apple’s Automator
  • Example:
    • Get all events from Upcoming.org
    • Filter out events that are too old
    • Put them all onto a map
  • Runs inside of a standard web browser
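
The three-step example above can be sketched as a chain of small operators, in the spirit of a Unix pipe. This is only an illustration in Python, not Marmite's code: the sample rows, the function names (`get_events`, `drop_older_than`, `to_map_markers`, `run_flow`), and the cutoff date are all assumptions, and the source returns canned data rather than querying Upcoming.org.

```python
from datetime import date
from functools import reduce

# Hypothetical sample rows standing in for an Upcoming.org query result.
EVENTS = [
    {"name": "Jazz Night", "date": date(2007, 4, 20),
     "address": "5000 Forbes Ave, Pittsburgh, PA"},
    {"name": "Old Meetup", "date": date(2007, 1, 5),
     "address": "4400 Fifth Ave, Pittsburgh, PA"},
]


def get_events():
    """Source: emit rows (canned data here instead of a network call)."""
    return list(EVENTS)


def drop_older_than(cutoff):
    """Processor factory: keep only rows dated on or after the cutoff."""
    def op(rows):
        return [r for r in rows if r["date"] >= cutoff]
    return op


def to_map_markers(rows):
    """Sink stand-in: build marker labels that a map widget could display."""
    return [f'{r["name"]} @ {r["address"]}' for r in rows]


def run_flow(source, *operators):
    """Feed the source's rows through each operator in order."""
    return reduce(lambda rows, op: op(rows), operators, source())


markers = run_flow(get_events, drop_older_than(date(2007, 4, 1)), to_map_markers)
print(markers)
```

Each operator takes a table of rows and returns a new one, so operators can be reordered or swapped without touching the others.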
Using Marmite (Envisioned)
  • Extract content from one or more web pages
    • names, addresses, dates, phone #, URLs
  • Process it in a data flow manner
    • filtering out values or adding metadata
    • integrating with other data sources (similar to a database join operation)
  • Direct the output to a variety of sinks
    • databases, map services, text files, visualizations, web pages, or source code that can be further edited
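
The "similar to a database join" step can be illustrated with a small inner join over rows. This is a sketch under assumptions: the hotel names, zip codes, distances, and the helper `join_on` are invented for illustration and say nothing about how Marmite integrates data sources internally.

```python
def join_on(left, right, key):
    """Inner join two lists of row dicts on a shared column name."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for row in left:
        for match in index.get(row[key], []):
            # Columns from the left row win if both sides share a name.
            joined.append({**match, **row})
    return joined


# Hypothetical data: one table extracted from a page, one from another service.
hotels = [{"hotel": "Hotel A", "zip": "95113"},
          {"hotel": "Hotel B", "zip": "95110"}]
distances = [{"zip": "95113", "miles_to_center": 0.4}]

result = join_on(hotels, distances, "zip")
print(result)
```

Rows without a match on the other side (Hotel B here) drop out, which is the usual inner-join behavior; a real tool might instead keep them with empty cells.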
Marmite
  • Motivation and Examples
  • Features and Design Rationale
  • User Evaluation
Features and Design Rationale
  • Conducted a series of quick evaluations to understand design space and potential problems
    • Automator
    • Lo-fi prototypes
Informal Automator Evaluation
  • Had three novices try three simple web-based tasks
    • Warm-up task
    • Traverse a set of web pages
    • Download a set of images
  • Some findings:
    • Some difficulties knowing how to start and what to do next
    • Little feedback about state of system between operations
    • Difficult to iterate due to network speed issues
Lo-Fi Prototypes
  • 6 paper prototypes with 20 participants
Design Solutions
  • Problem: how to start and what to do next
  • Solution: Suggest next actions
    • Weak data typing to find types (addresses, numbers, etc)
    • Filter operators to only show relevant ones
    • Suggest operators that might be applicable
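
One way to picture weak data typing driving suggestions: guess a loose type for each column from its values, then list only the operators whose required type is present. The regular expressions, the operator names, and the sample table below are assumptions made for this sketch, not Marmite's actual rules.

```python
import re

# Assumed, deliberately loose patterns ("weak" typing).
PATTERNS = {
    "number": re.compile(r"^-?\d+(\.\d+)?$"),
    "url": re.compile(r"^https?://\S+$"),
    "address": re.compile(r"^\d+\s+\S.*,\s*[A-Za-z .]+,\s*[A-Z]{2}$"),
}

# Hypothetical operators mapped to the column type each one needs.
OPERATORS = {"Geocode": "address", "Numeric filter": "number", "Fetch page": "url"}


def guess_type(values):
    """Return the first type whose pattern matches every value, else 'text'."""
    for name, pattern in PATTERNS.items():
        if values and all(pattern.match(v.strip()) for v in values):
            return name
    return "text"


def suggest(table):
    """List operators that have at least one column of the type they need."""
    types = {col: guess_type(vals) for col, vals in table.items()}
    return sorted(op for op, needs in OPERATORS.items() if needs in types.values())


table = {
    "where": ["5000 Forbes Ave, Pittsburgh, PA", "4400 Fifth Ave, Pittsburgh, PA"],
    "price": ["1200", "950.50"],
    "title": ["Sunny 2BR", "Studio near campus"],
}
print(suggest(table))
```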
Design Solutions
  • Problem: little feedback about state of system between operations
  • Solution: link data flow and data view together
    • Many systems take program-centric view (ex. Automator) or data-centric view (ex. spreadsheets)
    • Use hybrid data flow / data view, showing an operation and its effects together
    • Data view usually “spreadsheet”, other views possible too (for example, maps)
Design Solutions
  • Problem: difficult to iterate due to network speeds
  • Solution: cache data, let people “replay” data
    • Reload, pause, play
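
The caching idea can be sketched as a source wrapper that fetches once and replays the stored rows until the user asks for a reload. The class name `CachedSource`, its methods, and the stand-in `slow_fetch` function are hypothetical; pause and play are left out to keep the sketch short.

```python
class CachedSource:
    """Wrap a slow fetch function so later runs replay cached rows."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._rows = None
        self.fetch_count = 0  # how many times the slow fetch actually ran

    def rows(self):
        """Replay cached rows, fetching only if nothing is cached yet."""
        if self._rows is None:
            self.reload()
        return list(self._rows)

    def reload(self):
        """Discard the cache and fetch fresh rows."""
        self._rows = list(self._fetch())
        self.fetch_count += 1
        return list(self._rows)


def slow_fetch():
    # Stand-in for a network request.
    return [{"title": "Event A"}, {"title": "Event B"}]


source = CachedSource(slow_fetch)
first = source.rows()
second = source.rows()   # served from the cache, no second fetch
print(source.fetch_count)
```

Downstream operators can then be edited and re-run against `source.rows()` without waiting on the network each time.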
Other Design Findings
  • Screen real estate issues
    • Collapsible operators, leaving a readable label
Extracting Generic Content
  • Can’t have pre-defined extractor operators for every possible web site
    • Need a more general way of extracting data from pages
  • Developed a generic wizard UI for selecting links
    • Content from that set could be extracted via other operators
    • Uses Solvent (MIT), an XPath-based algorithm for finding patterns in web pages
      • Finds “groups” of related web content based on how HTML is structured
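
To give a feel for grouping links by HTML structure, the sketch below buckets every link by the chain of tag names leading to it, so links that sit in the same kind of position fall into one group. This is a much simplified stand-in and is not Solvent's algorithm; the class `LinkGrouper` and the sample page are assumptions, and the sketch does not handle unclosed void elements such as `<br>`.

```python
from collections import defaultdict
from html.parser import HTMLParser


class LinkGrouper(HTMLParser):
    """Group <a> hrefs by the chain of ancestor tag names (an XPath-like key)."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.groups = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.groups["/".join(self.stack)].append(href)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; tolerates minor nesting errors.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


PAGE = """
<html><body>
<div><a href="/about">About</a></div>
<ul>
<li><a href="/event/1">Event 1</a></li>
<li><a href="/event/2">Event 2</a></li>
</ul>
</body></html>
"""

parser = LinkGrouper()
parser.feed(PAGE)
largest = max(parser.groups.values(), key=len)
print(largest)
```

A wizard could present each group to the user and let them pick the one that holds the content they want.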
Operators
  • Operators have input types
    • Operator uses this to guess which columns it wants
  • Operators have output types
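
A minimal sketch of how declared input types could let an operator guess its columns: for each type it wants, take the first unused column of that type. The `Operator` class, its field names, and the column types shown are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Operator:
    name: str
    input_types: list
    output_types: list

    def pick_columns(self, column_types):
        """Guess one column per declared input type (first unused match wins)."""
        chosen, used = [], set()
        for wanted in self.input_types:
            match = next(
                (c for c, t in column_types.items() if t == wanted and c not in used),
                None,  # no suitable column; the user would have to choose
            )
            chosen.append(match)
            if match is not None:
                used.add(match)
        return chosen


geocode = Operator("Geocode", input_types=["address"],
                   output_types=["latitude", "longitude"])
columns = {"title": "text", "where": "address", "price": "number"}
picked = geocode.pick_columns(columns)
print(picked)
```

The declared output types could serve the same purpose for the next operator in the flow, since its guess would run over the columns this one produces.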
Implementation
  • JavaScript for the underlying code and Extensible Binding Language (XBL) for the UI
  • Operators currently in JavaScript
    • Ideally could be scriptable in any programming language
    • Currently ~15 operators
Evaluation
  • Informal user study with 6 people
    • 2 novices
    • 2 people with spreadsheet experience (formulas)
    • 2 people with programming experience
  • Tasks (in increasing difficulty)
    • Warmup task showing how to retrieve a set of addresses and how to geocode an address
    • Search for and filter out events further than a week away
    • Compile a list of events from two event services and plot them on a map
    • Recreate the housingmaps site
Results
  • Three people able to complete all tasks in ~1 hour
    • First two users confused about suggested actions (automatically popped up, made manual for other 4 users)
    • Novice made some progress, not able to finish all tasks
  • Able to re-create housingmaps in ~15 minutes
More Results
  • Biggest barrier was understanding the data flow
    • Did not understand input and output concept
    • Applied operators as one-off, did not realize that it was a static representation of flow
    • Did not understand data flow and data view were linked
Future Directions
  • Short-term
    • Better screen-scraping operators
    • More operators
    • Better connection with web services (WSDL and REST)
    • Better help for starting a data flow
  • Long-term
    • Intelligence analysis
    • Better visualizations
    • Location-based services
Conclusions
  • Marmite, a tool for creating web-based mashups
    • Extract content from one or more web pages
    • Process it in a data flow manner
    • Direct the output to a variety of sinks
  • Hybrid data flow / data view
  • User evaluation shows some promising results

Jeff Wong, Jason Hong, Making Mashups with Marmite: Re-purposing Web Content through End-User Programming, CHI 2007

Types of Operators
  • Sources
    • Add data into Marmite by querying databases, extracting information from web pages, and so on.
  • Processors
    • Modify, combine, or delete existing rows. Example operators include geocoding (converting street addresses to latitude and longitude) and filtering. Processor operators might add or remove columns as well.
  • Sinks
    • Redirect the flow of data out of Marmite. Examples include showing data on a map, saving it to a file, or publishing it to a web page.
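
As a rough illustration of a processor that adds and removes columns, the sketch below geocodes rows against a lookup table and then drops the address column. The table `FAKE_GAZETTEER`, its placeholder coordinates, and both function names are invented for this example; a real geocoding operator would call an external service.

```python
# Hypothetical lookup table standing in for a real geocoding service.
# The coordinates are placeholders, not real locations.
FAKE_GAZETTEER = {"1 Example St": (40.0, -80.0)}


def geocode(rows, column="address"):
    """Processor: add latitude/longitude columns; delete rows it cannot resolve."""
    out = []
    for row in rows:
        coords = FAKE_GAZETTEER.get(row[column])
        if coords is None:
            continue
        out.append({**row, "latitude": coords[0], "longitude": coords[1]})
    return out


def drop_column(rows, column):
    """Processor: remove one column from every row."""
    return [{k: v for k, v in row.items() if k != column} for row in rows]


rows = [{"name": "A", "address": "1 Example St"},
        {"name": "B", "address": "unknown"}]
located = drop_column(geocode(rows), "address")
print(located)
```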