Multimedia Search Bradley Horowitz Head of Technology Development Yahoo! Search and Marketplace November, 2005.

Presentation transcript:

2 Yahoo! Search Vision
Enable people to find, use, share and expand all human knowledge
Find: Enable people to find what they are looking for
Use: Search not for the sake of searching, but to achieve a purpose
Share: Sharing knowledge with people you connect with and connecting to people who you share knowledge with
Expand

3 FUSE
fuse (fyooz) verb: fused, also fusing, fuses. To become mixed or united by melting together
fusion (fyoo-zhən) noun: A reaction in which nuclei combine to form more massive nuclei with the simultaneous release of energy
Knowledge Fusion: Enable people to find, use, share and expand all human knowledge

14 How it works…
Standard Crawl / Index / Serve paradigm
Automated crawlers spider the web and encounter multimedia content. (Feeds augment the crawl…)
Metadata is generated and indexed; content is fetched and a “summary” (thumbnail) is extracted and stored on Yahoo servers
Queries are processed and results are served
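The crawl / index / serve flow above can be sketched in a few lines. This is a minimal illustration, not Yahoo's implementation: the `CRAWLED` data, the `make_summary` placeholder (which truncates bytes rather than producing a real thumbnail) and all function names are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical stand-in for crawler output: URL -> (metadata text, raw bytes).
CRAWLED = {
    "http://example.com/dog.gif": ("dog playing fetch", b"GIF89a-dog-bytes"),
    "http://example.com/cat.jpg": ("cat on sofa", b"JPEG-cat-bytes"),
    "http://example.com/dogs.jpg": ("two dog friends", b"JPEG-dogs-bytes"),
}


def make_summary(content: bytes) -> bytes:
    """Placeholder for thumbnail extraction: keep only the first 8 bytes."""
    return content[:8]


def build_index(crawled):
    """Index step: map each metadata term to URLs and cache one summary per URL."""
    inverted = defaultdict(set)
    summaries = {}
    for url, (metadata, content) in crawled.items():
        for term in metadata.lower().split():
            inverted[term].add(url)
        summaries[url] = make_summary(content)
    return inverted, summaries


def serve(query, inverted, summaries):
    """Serve step: return (url, summary) for media matching every query term."""
    terms = query.lower().split()
    if not terms:
        return []
    matches = set.intersection(*(inverted.get(t, set()) for t in terms))
    return [(url, summaries[url]) for url in sorted(matches)]


INDEX, SUMMARIES = build_index(CRAWLED)
RESULTS = serve("dog", INDEX, SUMMARIES)
```

Only the summary is stored, never the full media object, which is why the thumbnail cache dominates storage in this kind of design.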

15 Challenges: relevance & ranking
What is the “best” match, i.e. the best image for a given query?
Techniques like link-flux or PageRank are not as meaningful in a multimedia context
Clickstream analysis is a useful tool
–Unlike web results, users can consume content directly on the search results page (SRP)…
–Therefore clicks are meaningful “votes”
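One way to treat clicks as “votes” is to rank results for a query by click-through rate. The sketch below is only an illustration under stated assumptions: the talk does not describe a formula, and the smoothing prior (`prior_ctr`, `prior_weight`) and the sample counts are invented here so that items with very few impressions do not outrank well-tested ones.

```python
def smoothed_ctr(clicks, impressions, prior_ctr=0.05, prior_weight=20):
    """Click-through rate pulled toward an assumed prior when data is sparse."""
    return (clicks + prior_ctr * prior_weight) / (impressions + prior_weight)


def rank_by_clicks(stats):
    """Order result URLs for one query by smoothed CTR, best first."""
    return sorted(stats, key=lambda url: smoothed_ctr(*stats[url]), reverse=True)


# Hypothetical (clicks, impressions) per image for a single query.
STATS = {
    "a.jpg": (30, 1000),
    "b.jpg": (1, 2),
    "c.jpg": (120, 1000),
}

RANKED = rank_by_clicks(STATS)
```

With these numbers the raw rate of `b.jpg` (1 click in 2 views) would put it first, but after smoothing the heavily viewed `c.jpg` ranks above it.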

16 Challenges (v. Web Search)
Crawling multimedia data can be difficult
URLs are often dynamically generated, often intentionally obfuscated
Huge storage implications (thumbnail cache)
Multimedia objects are binary, and hence “opaque” v. self-describing web pages; thus dependence on heuristic means of deriving metadata
Relevance is difficult to determine

17 Metadata Extraction
Any and all means…
Syntactic stuff: size, dimensions, format, duration, frames per second, etc.
Object name: “dog.gif”
ALT tags
Embedded metadata (headers): ID3 tags, EXIF information, etc.
Analysis:
–Color / black & white
–Acoustic Fingerprinting
–Speech Recognition
–Speaker Identification
–OCR
–Face Recognition
–Object Recognition
–Indoor / outdoor
–Music / Speech
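The cheapest of these signals, the object name and ALT text, can be turned into index terms with simple string handling. The following is a rough sketch under assumptions: the function name, the tokenization rule (runs of letters only) and the sample URL are illustrative, not taken from the talk.

```python
import posixpath
import re
from urllib.parse import unquote, urlparse


def terms_from_media(url, alt_text=""):
    """Derive a format and keyword terms from a media URL and its ALT text."""
    path = unquote(urlparse(url).path)            # drops the query string
    stem, ext = posixpath.splitext(posixpath.basename(path))
    words = re.findall(r"[a-z]+", f"{stem} {alt_text}".lower())
    return {
        "format": ext.lstrip(".").lower(),
        "terms": list(dict.fromkeys(words)),      # de-duplicate, keep order
    }


EXAMPLE = terms_from_media(
    "http://example.com/pets/Golden_Retriever-01.GIF?size=large",
    alt_text="A dog in the park",
)
```

A dynamically generated or obfuscated URL would yield few useful terms here, which is one reason the surrounding page context matters.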

18 Problems with automated Metadata Extraction
Techniques are computationally expensive, very difficult to compute at web scale
Results are noisy, prone to error…
Some of these drawbacks can be mitigated by using context
Biggest problem is that the techniques do not extract the most valuable level of information

19 A long digression on metadata

20 ESP Game

21 ESP Game

22 ESP Game

23 Great fun!

24 ESP Game

25 Over 10m photos tagged…

FUSE Case Study: Flickr

27 What’s important about Flickr?
“An online photo sharing community…”
Filled with high-quality, timely, topical photographic content...
Richly annotated and indexed, searchable, browsable and navigable…
Tens of thousands of distribution partners…
Hundreds of applications written against the Flickr platform…
Content entirely user-generated.
Metadata entirely user-generated.
Each “deal” brokered by Flickr users…
by 1000s of Flickr developers

28 Flickr

29 Photos from my Contacts

30 Photos from Nathan

31 Photos from Nathan via RSS

32 Photos from Nathan via RSS

33 Tagging

34 Tagging / cats

35 Tagging / interestingness

36 Tagging / clusters

37 Flickr Flash Widget

38 Inline on a blog…

39 Flickr publishing

40 Flickr as a service…

41 Culture of Participation
1 creators
10 synthesizers
100 consumers

42 Mass Media Micro Media My Media

43 Mass Media as artifact…
Negroponte: “Being Digital” - bits v. atoms
Current system of media production & consumption is largely an artifact of “atom-based” distribution
Media has been relatively difficult for individuals to create, and virtually impossible for individuals to distribute at scale
Led to a high-stakes economic model
Mass media has flourished, in part, because micro-media wasn’t viable…

44 The Tale of the Head & the Tail
Popular content… everything else
Postulates
Area under the curve of the “long tail” is significant…
Economics around this “tail” content often invert
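The first postulate can be checked with a small calculation. The sketch below assumes, purely for illustration, that popularity follows a Zipf-like curve in which the item at rank r draws attention proportional to 1/r. The talk does not specify a curve, and the catalogue size and head cutoff are invented.

```python
def tail_share(num_items, head_size, exponent=1.0):
    """Fraction of total attention that falls outside the top `head_size` items,
    assuming popularity proportional to 1 / rank**exponent (an assumption)."""
    weights = [1.0 / rank ** exponent for rank in range(1, num_items + 1)]
    return sum(weights[head_size:]) / sum(weights)


# Hypothetical catalogue: one million items, with the top 1,000 as the "head".
SHARE = tail_share(1_000_000, 1_000)
```

Under these assumptions the tail beyond the top 1,000 items accounts for a little under half of all attention, even though each individual tail item is rarely viewed.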

45 Mass v. Micro Media
Mass Media
–Appeals to large audiences
–Controlled and manicured distribution
–Expensive to produce
–Studio model, high-stakes economics
Micro Media
–May have limited audience
–Uncontrolled, unmoderated distribution
–Cheap to produce
–Different, developing economic model
My Media
–Ability to consume from both head and tail
–Ability to create and share “my” content

46 Content Acquisition Strategy
Explicit Feed Relationships
Comprehensive Media Crawl
Media RSS

47 Media RSS 1.0
Simple extension to the wildly popular RSS format
Builds on the “podcasting” movement, and extends to other media data types
Designed for “grass-roots” publishing, enabling audiences for content previously unavailable through traditional channels
Working with the community “to get something done,” in partnership and collaboration
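Because Media RSS is an extension of RSS, a consumer can read it with an ordinary XML parser. The sketch below is illustrative only: the sample feed is invented, and the namespace URI `http://search.yahoo.com/mrss/` is an assumption that should be matched against whatever the actual feed declares. It uses the `media:content` and `media:thumbnail` elements.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI; use the one declared by the feed being read.
MEDIA_NS = {"media": "http://search.yahoo.com/mrss/"}

# Hypothetical feed with a single video item.
FEED = """<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <title>Example feed</title>
    <item>
      <title>Dog at the beach</title>
      <link>http://example.com/videos/dog</link>
      <media:content url="http://example.com/videos/dog.mov"
                     type="video/quicktime" duration="42"/>
      <media:thumbnail url="http://example.com/thumbs/dog.jpg"/>
    </item>
  </channel>
</rss>"""


def parse_media_items(feed_xml):
    """Return one dict per <item> with its title, media URL, type and thumbnail."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        content = item.find("media:content", MEDIA_NS)
        thumb = item.find("media:thumbnail", MEDIA_NS)
        items.append({
            "title": item.findtext("title", default=""),
            "media_url": content.get("url") if content is not None else None,
            "media_type": content.get("type") if content is not None else None,
            "thumbnail": thumb.get("url") if thumb is not None else None,
        })
    return items


ITEMS = parse_media_items(FEED)
```

Items without media elements still parse; their media fields are simply `None`, so plain RSS feeds pass through unchanged.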

48 Mass Media vs. Mass Media + Micro Media = My Media

49 Numa

50 Yahoo Video Search
Straightforward video search à la Image Search
Yahoo is extremely sensitive to rights holders
–24-hour take-down policy
–Working in partnership and collaboration with studios and networks to serve their needs
–Leveraging existing business models and objectives: we drive traffic to your property
–Great tool for monitoring and dealing with infringing content
Extremely successful, more innovation in the works
Video Search is not a technology problem per se, but a “business problem” that Yahoo is well-poised to address.

51 Thanks!