The Economy of Distributed Metadata Authoring


The Economy of Distributed Metadata Authoring
by Stefano Mazzocchi

The presentation will sketch the differences between data creation and metadata creation, outlining the impact of these differences on the economy of distributed content creation and consumption. It will be shown how these economic effects might impact both semantically enhanced distributed technologies and the communities of users of these technologies. Finally, it will be suggested how the economic and social projections can be used as a metric for the feasibility of proposed technologies that involve highly distributed environments.

Experts' Workshop - Perspectives on networked knowledge spaces
25/26 October 2002, Sankt Augustin, Germany
Organised by: MARS Exploratory Media Lab at the Fraunhofer Institut für Medienkommunikation

What is Metadata?
Metadata is information about information.

Classic Examples of Metadata
Keywords
Author
Date of creation/modification
Address/Identifier

More Provocative Examples
Punctuation in text
Layout on a page
Font size/weight/style in text
Commentary audio tracks on a DVD

General Metadata Properties
Metadata is data about data, but it’s still data.
Metadata should be semantically orthogonal: data should be understandable even without metadata.

Markup and Metadata
Markup languages can be seen as metadata-driven languages.
Markup syntax is designed to keep data and metadata orthogonal.
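The orthogonality of data and markup can be illustrated with a small sketch. It uses Python's standard `html.parser` module to split a marked-up string into its character data and the tags wrapped around it. The `MarkupSplitter` class, the `split_markup` helper, and the sample string are illustrative assumptions, not part of the original presentation.

```python
from html.parser import HTMLParser


class MarkupSplitter(HTMLParser):
    """Separate character data from the markup (metadata) wrapped around it."""

    def __init__(self):
        super().__init__()
        self.text_parts = []   # the data
        self.annotations = []  # (tag, attributes) pairs: the metadata

    def handle_starttag(self, tag, attrs):
        # Each opening tag is recorded as one piece of metadata.
        self.annotations.append((tag, dict(attrs)))

    def handle_data(self, data):
        # Character data is collected untouched.
        self.text_parts.append(data)


def split_markup(document):
    """Return (plain_text, annotations) for a marked-up string."""
    parser = MarkupSplitter()
    parser.feed(document)
    parser.close()
    return "".join(parser.text_parts), parser.annotations


# Hypothetical sample input.
sample = '<p>Metadata is <em class="key">information</em> about information</p>'
text, annotations = split_markup(sample)
print(text)
print(annotations)
```

With the tags stripped, the sentence still reads on its own, which is the sense in which the data stays understandable without the metadata.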

The Importance of Metadata
Key to semantic analysis
Key to multidimensional augmentation of information
Key to relating pieces of information to one another
In short: key to more powerful data mining

Types of Metadata
Human-authored
Automatically inferred

Human Authoring (1)
In-process: data and metadata are created at the same time.
Out-of-process: metadata is added after the data has been created.

Human Authoring (2)
By the data author: data and metadata are written by the same person.
By another author: data and metadata are created by different people.

Automatic Inference
Recognition of patterns and trends in data
Semantic assumption of data-metadata correlations

Types of Automatic Inference
Heuristic: an algorithm performs analysis on the data set (artificial reproduction of intelligent behavior).
Transparent: mechanically extracted information is transparently associated with metadata that humans produced through their own semantic analysis.

Transparent Inference Examples
Google’s PageRank
Amazon’s related items
NEC’s CiteSeer

Google
PageRank is the system that ranks the pages found by a query against Google’s database.
It works by analysing hyperlink topology.
Metadata is inferred from the hyperlinks contained in each page.
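To make the idea of ranking by hyperlink topology concrete, here is a minimal sketch of the textbook power-iteration form of PageRank. It is a simplification, not Google's production system; the damping factor of 0.85, the iteration count, the function name `pagerank`, and the four-page toy web are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages by link topology.

    links maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the same small baseline rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on in equal shares to its link targets.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical toy web: every other page links to "c", nobody links to "d".
toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

No one labels page "c" as important by hand. Its high rank falls out of links that authors created for their own reasons, which is what makes the inference transparent.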

Amazon
Relations between items are inferred by analysing what other users have bought.
A user buying two products is taken as a sign that the items are related.
Simply by buying, users collectively fill in product metadata about these relations.
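The co-purchase idea can be sketched in a few lines. This is an assumed, simplified model that counts how often two items share a basket; it is not Amazon's actual algorithm, and the function names and sample baskets are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(baskets):
    """Count how often each unordered pair of items is bought together."""
    counts = Counter()
    for basket in baskets:
        # Sorting makes ("a", "b") and ("b", "a") the same key.
        for first, second in combinations(sorted(set(basket)), 2):
            counts[(first, second)] += 1
    return counts


def related_items(item, counts, top=3):
    """Return the items most often bought together with `item`."""
    scores = Counter()
    for (first, second), count in counts.items():
        if first == item:
            scores[second] += count
        elif second == item:
            scores[first] += count
    return [other for other, _ in scores.most_common(top)]


# Hypothetical purchase history, one list per order.
baskets = [
    ["xml book", "rdf book"],
    ["xml book", "rdf book", "dvd"],
    ["xml book", "dvd"],
    ["rdf book", "ontology book"],
    ["xml book", "rdf book"],
]
counts = co_purchase_counts(baskets)
print(related_items("xml book", counts))
```

The buyers never write any metadata explicitly. The relation data is a by-product of purchases they would have made anyway.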

CiteSeer
A digital library of IT papers
Ranks search results by analysing the citation topology
Bibliographies become the source of relevance metadata
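As a rough illustration of bibliographies acting as relevance metadata, the sketch below orders search hits by how many other papers cite them. Plain citation counting is an assumption made for brevity; the presentation does not describe CiteSeer's actual ranking, and the paper identifiers are hypothetical.

```python
from collections import Counter


def citation_counts(bibliographies):
    """Count incoming citations.

    bibliographies maps each paper to the list of papers it cites.
    """
    counts = Counter()
    for paper, cited in bibliographies.items():
        for target in set(cited):
            if target != paper:  # ignore self-citations
                counts[target] += 1
    return counts


def rank_results(matching_papers, counts):
    """Order search hits by citation count, most cited first."""
    return sorted(matching_papers, key=lambda paper: (-counts[paper], paper))


# Hypothetical library of four papers and their reference lists.
bibliographies = {
    "p1": ["p3"],
    "p2": ["p1", "p3"],
    "p3": [],
    "p4": ["p3", "p1"],
}
counts = citation_counts(bibliographies)
print(rank_results(["p1", "p2", "p3"], counts))
```

Authors write reference lists as part of normal paper writing, so the ranking signal costs them no extra authoring effort.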

The Issues with Metadata
The quality of metadata heavily influences the quality of all search/retrieval systems.

First Law of Metadata Quality
Artificial intelligence is just that: artificial!
So: for a system that feels smart to humans, you need human-created metadata.

First Law of Metadata Quantity
The more high-quality metadata, the better.
But: the more human-created metadata, the more expensive the authoring process gets.

Metrics
To estimate the value of a proposed technological solution, a metric is required.
Economic feasibility is one possible metric.

Consequences
All current markup-based semantic web solutions (RDF, topic maps, ontologies) are economically infeasible.
The best semantic solutions are those based on transparent inference.

Suggestions (1)
Plan for the impact of metadata authoring costs on technology decisions.
Don’t underestimate the importance of how the system feels to its users.
Think about what can be inferred transparently, without requiring heuristics.

Suggestions (2)
Make every effort to give an instant return on the investment in metadata authoring.
Don’t ask too much.
Be smart but not smarter.

Thanks!