Mashups… …Recycling Data
As a simple example…
  Click on
  Videos that are uploaded individually over time are collected on one site
  They are re-organized
  The original location is referenced with a URL
  Or, look at and at the embedded blogs http://rogerking.me
What is a mashup?
  No rigid definition; sometimes called “remixing”
  The focus is on information, data, and, to a lesser extent, services
  A mashup takes information and services and brings them together in a way that, hopefully, creates a sum bigger than its parts
  Systems used to develop mashups include Flash Builder, Ruby on Rails, WordPress, and content management systems like Joomla and Drupal
The key component
  The web is becoming more interconnected and more “processed”
  Websites are more interconnected, and fewer sites are simply one-way downloads of data
  A mashup can be a website or a desktop app, or a combination of the two
Data sharing
  Data includes news, commentaries, videos, etc.
  As well as a growing use of warehoused and mined data
  Farmed from TV broadcasts, Vimeo, YouTube, etc.
  Images off web pages
  Text and ideas from blogs and websites
  Printed news that originates on paper or online
  Podcasts
  Spam and targeted Twitter and Facebook
How data is recycled
  Links
  Publicly available APIs
  Web services
  RSS and other feeds
  Web-based office document bases that are partially shared
  http
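As a rough illustration of recycling data through RSS feeds, the sketch below merges two feeds into one newest-first list while keeping a link back to each item's original location. The feed contents, the example.com-style URLs, and the function names `parse_feed` and `merge_feeds` are all assumptions made for this example; a real mashup would download the feeds over HTTP instead of using inline strings.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical feed contents (inline so the sketch runs without a network).
FEED_A = """<rss version="2.0"><channel><title>Site A</title>
<item><title>Old post</title><link>http://a.example/1</link>
<pubDate>Mon, 01 Jan 2024 10:00:00 GMT</pubDate></item>
<item><title>New post</title><link>http://a.example/2</link>
<pubDate>Wed, 03 Jan 2024 09:30:00 GMT</pubDate></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>Site B</title>
<item><title>Middle post</title><link>http://b.example/7</link>
<pubDate>Tue, 02 Jan 2024 12:00:00 GMT</pubDate></item>
</channel></rss>"""


def parse_feed(xml_text):
    """Turn one RSS 2.0 document into a list of plain dictionaries."""
    channel = ET.fromstring(xml_text).find("channel")
    source = channel.findtext("title", default="")
    items = []
    for item in channel.findall("item"):
        items.append({
            "source": source,  # which site the item came from
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),  # original location
            "published": parsedate_to_datetime(item.findtext("pubDate")),
        })
    return items


def merge_feeds(*feeds):
    """Combine items from several feeds, newest first."""
    merged = [entry for feed in feeds for entry in parse_feed(feed)]
    return sorted(merged, key=lambda e: e["published"], reverse=True)


if __name__ == "__main__":
    for entry in merge_feeds(FEED_A, FEED_B):
        print(entry["published"].date(), entry["source"], entry["title"], entry["link"])
```

The merged list is the "re-organized" view; the `link` field is what lets the mashup reference the original site rather than copy it.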
Problems
  Lots of unprocessed information that is simply being repeated over and over, making the Web even bigger and harder to use
  Lots of violations of copyrights, etc.
  A sense of anonymity that leads to information that is unintentionally or deliberately incorrect, mean-spirited, etc.
Important
  Reuse can be automated
  Or it can be manual
  Or it can be manual and conceptual
  … imagine if the semantic web actually existed…
Examples
  Uses police data
  Locates crimes on Google Maps
  Provides for filtering by users
  Satellite data from Microsoft and Yahoo
  Focused on maps from different sources
  Information from many sources - emergencies
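The crime-map example can be sketched in a few lines: take incident records, let the user filter them by type and by distance from a point, and turn the survivors into map markers. Everything here is an assumption for illustration: the record fields, the sample coordinates, and the marker dictionary format are hypothetical, and no real police feed or mapping API is called.

```python
from math import asin, cos, radians, sin, sqrt

# Hypothetical incident records, standing in for a police data feed.
CRIMES = [
    {"type": "burglary", "lat": 40.000, "lon": -105.000, "date": "2024-01-02"},
    {"type": "theft", "lat": 40.010, "lon": -105.000, "date": "2024-01-03"},
    {"type": "burglary", "lat": 41.000, "lon": -105.000, "date": "2024-01-04"},
]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def filter_crimes(records, kinds=None, center=None, radius_km=None):
    """Keep records matching the user's chosen types and search radius."""
    result = []
    for rec in records:
        if kinds is not None and rec["type"] not in kinds:
            continue
        if center is not None and radius_km is not None:
            if haversine_km(center[0], center[1], rec["lat"], rec["lon"]) > radius_km:
                continue
        result.append(rec)
    return result


def to_markers(records):
    """Convert records to generic marker dictionaries for a map layer."""
    return [
        {"position": (r["lat"], r["lon"]), "label": f'{r["type"]} ({r["date"]})'}
        for r in records
    ]


if __name__ == "__main__":
    nearby = filter_crimes(CRIMES, kinds={"burglary"}, center=(40.0, -105.0), radius_km=5)
    print(to_markers(nearby))
```

Keeping the filter separate from the marker conversion means the same filtered records could feed maps from different providers.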
More examples
  Pulls in reverse phone data
  Pulls in address data
  … tries to sell you more data!
  Non-public and partially public mashups
  Used by companies, government agencies, etc. to keep people up to date on internal news
  A one-stop way of staying in the loop
  Insiders have special logins to see special information
  Try
Tailoring
  Often a mashup gives the user control over content and form
  What news feeds
  What “skin” to use
  What format to use –
  Shows a growing trend toward smart search engines
  Events happening locally
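One way to picture tailoring is a preferences record that selects which feeds to show, which skin to apply, and which output format to produce. The sketch below is only illustrative: the item fields, the preference keys (`feeds`, `skin`, `format`), and the function name `tailor` are assumptions, not part of any particular mashup platform.

```python
import json
from html import escape

# Hypothetical items already collected by the mashup.
ITEMS = [
    {"feed": "news", "title": "City budget vote", "link": "http://news.example/1"},
    {"feed": "events", "title": "Local concert", "link": "http://events.example/1"},
]


def tailor(items, prefs):
    """Render the items a user asked for, in the form the user asked for."""
    chosen = [i for i in items if i["feed"] in prefs["feeds"]]
    if prefs["format"] == "json":
        return json.dumps(chosen)
    if prefs["format"] == "html":
        rows = "".join(
            f'<li><a href="{escape(i["link"])}">{escape(i["title"])}</a></li>'
            for i in chosen
        )
        # The "skin" is applied here as a CSS class name (an assumption).
        return f'<ul class="{escape(prefs["skin"])}">{rows}</ul>'
    # Fall back to plain text.
    return "\n".join(f'{i["title"]} <{i["link"]}>' for i in chosen)


if __name__ == "__main__":
    prefs = {"feeds": {"events"}, "skin": "dark", "format": "html"}
    print(tailor(ITEMS, prefs))
```

Because content selection and rendering are separate steps, adding another format or skin does not change how the data is gathered.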
Relationship to topics already covered
  Locating, translating, and integrating heterogeneous data
  The semantic web
  Web 2.0/3.0
  Unconventional data management systems
  Website development technologies
  Information visualization
  Manipulating data on client side
Challenges
  This is creating even more varied, heterogeneous data sources – what we need is some consolidation of technologies for information integration that includes creating mashups
  Making large-grained semantic use of web page content
  The semantic web… ?
  More focused: widgets, small web components that can be copied and pasted
  Making them legal, safe, secure, private, and accurate – and profitable
  Accessing the large bulk of web data – most is still hidden
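The integration challenge shows up even with two sources: each names its fields differently and encodes dates differently, so a mashup needs a small adapter per source before anything can be combined. The following is a minimal sketch under assumed record shapes; the field names, the sample values, and the common schema (`title`, `link`, `when`, `kind`) are all hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical records from two sources with different shapes.
VIDEO_RECORD = {"videoTitle": "Campus tour", "url": "http://video.example/v/1",
                "uploaded": 1704103200}  # Unix timestamp, seconds
BLOG_RECORD = {"headline": "Notes on mashups", "permalink": "http://blog.example/p/9",
               "posted": "2024-01-02T08:00:00+00:00"}  # ISO 8601 string


def from_video(rec):
    """Adapter: video-site record to the common schema."""
    return {
        "title": rec["videoTitle"],
        "link": rec["url"],
        "when": datetime.fromtimestamp(rec["uploaded"], tz=timezone.utc),
        "kind": "video",
    }


def from_blog(rec):
    """Adapter: blog record to the common schema."""
    return {
        "title": rec["headline"],
        "link": rec["permalink"],
        "when": datetime.fromisoformat(rec["posted"]),
        "kind": "blog",
    }


def integrate(videos, blogs):
    """Normalize both sources and return one list, oldest first."""
    unified = [from_video(v) for v in videos] + [from_blog(b) for b in blogs]
    return sorted(unified, key=lambda item: item["when"])


if __name__ == "__main__":
    for item in integrate([VIDEO_RECORD], [BLOG_RECORD]):
        print(item["when"].isoformat(), item["kind"], item["title"])
```

Every new source costs another hand-written adapter, which is the consolidation problem the slide points at.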
More challenges
  Creating more effective use of user-supplied data
  Using mashups more aggressively than just as information sources – collaborations and group coordination
  Larger-grained software technology – right now, it is as tough as building any website with dynamic pages
  Would be nice to make their construction quick and cheap
  Fluidity is what we need
  Supporting content manipulation on client side
  Manipulating images, video, audio, etc.
A related technology: portals
  Perhaps just an improvement in development technology
  But a portal is less dynamic, does not generally support client-side manipulation of data, and usually does not support the manipulation of advanced forms of data
Other legal issues
  There is a legal notion of “fair use”
  There is a growing trend toward considering information unprotected
  Sometimes very valuable information is being stolen… remember BitTorrent and file sharing…?
  Entire books
  Marketing information
  Growing use of peer-to-peer technology to evade centralized detection
Lastly There is another kind of mashup…