Content analysis and CERN Roman Chyla. Artificial intelligence Natural language processing Web of data Content analysis.

Presentation transcript:

Content analysis and CERN Roman Chyla

Artificial intelligence Natural language processing Web of data Content analysis

Semantic Web

Information extraction

?

A lot to do…

Semantic dictionary
Link between infinite and finite domains
Must be prepared (or at least revised) by humans
– Purposeful
– Incomplete
– Constantly changing
Very expensive to create/maintain
– Solution? Use existing data!

Basic principles
Keep it simple, stupid (I didn't want to believe it could work, it was too simple!)
You can't get it 100% right
Dictionary ~ Universal semantic language
– Not really a language, but a taxonomy (not even an ontology)
– Lacks expressiveness
– Still very vague (but that is a feature, not a bug!)
– Cannot infer from facts
BUT it is:
– Simple to maintain
– Ready to change and evolve, ready to accommodate other resources
– Language independent
– Problem of research question
– Problem of universal and domain specific taxonomy
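A minimal sketch of what "taxonomy, not ontology" and "language independent" could look like in practice, assuming a dictionary that only stores broader-concept links and labels. The concept ids, labels, and the helper path_to_root are all hypothetical and are not taken from Seman.

```python
# Hypothetical mini-taxonomy: concept id -> broader concept id (None at the root).
# Only "broader" links are stored, so nothing can be inferred beyond the hierarchy.
BROADER = {
    "particle": None,
    "boson": "particle",
    "higgs_boson": "boson",
}

# Labels in any language point at the same concept id (language independence).
LABELS = {
    "higgs boson": "higgs_boson",   # English
    "boson de higgs": "higgs_boson",  # French
    "boson": "boson",
}


def path_to_root(concept):
    """Return the chain of concepts from `concept` up to the taxonomy root."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = BROADER[concept]
    return path


print(path_to_root(LABELS["boson de higgs"]))
```

Adding a new resource then amounts to adding rows to the two tables, which is one way to read "simple to maintain".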

Word sense disambiguation
Homonyms are an obvious problem
… and Seman can work with many definitions at the same time (think of 3 people and their definitions of one word)
Possible solutions:
– Disambiguation by harvested definitions
– Rules
– Neural network (supervised learning)
– If problems are few, humans can decide
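One way to picture "disambiguation by harvested definitions" is a simple word-overlap heuristic: pick the sense whose definition shares the most words with the surrounding text. The sketch below is an illustration under that assumption, not the method used in Seman; the two senses and their definitions are made up for the example.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


# Hypothetical harvested definitions for one homonym.
DEFINITIONS = {
    "cat (animal)": "small domesticated carnivorous mammal kept as a pet",
    "cat (Unix command)": "command that reads files and prints their content to standard output",
}


def disambiguate(context, definitions):
    """Return the sense whose definition overlaps most with the context words."""
    context_words = tokenize(context)
    return max(
        definitions,
        key=lambda sense: len(tokenize(definitions[sense]) & context_words),
    )


print(disambiguate("use cat to print the files to the terminal output", DEFINITIONS))
```

When the overlap is tied or zero, this sketch just returns the first sense, which is the kind of case the slide suggests leaving to rules or to a human.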

cat

So what I want to do…
Prepare another semantic dictionary for HEP (using whatever I can) and for English in general (UDC + existing Seman)
Differentiate HEP core and non-core
Search corrections (did you mean?)
Search results categorization/facets
Identify entities, data elements… make them available (this is mainly an IE task)
Identification of topics (metrics of similarity between a document and „known characteristics“)
Keywording – identification of statistically significant occurrences of concepts (not words)
Come up with faster ways to enrich the taxonomy
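The "did you mean?" item above can be sketched with fuzzy matching of a query against the terms of the semantic dictionary. This is only an illustration using Python's standard difflib module; the term list, the cutoff value, and the function name suggest are assumptions, not part of the original talk.

```python
import difflib

# Hypothetical HEP terms taken from a semantic dictionary.
DICTIONARY_TERMS = ["boson", "hadron", "lepton", "quark", "neutrino"]


def suggest(query, terms=DICTIONARY_TERMS, cutoff=0.6):
    """Return the closest dictionary term, or None if the query is already
    a known term or nothing is similar enough."""
    query = query.lower()
    if query in terms:
        return None  # nothing to correct
    matches = difflib.get_close_matches(query, terms, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(suggest("bosson"))
```

Because the candidates come from the dictionary rather than from raw text, a suggestion always points at a known concept, which also helps with faceting the results.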

Semantic dictionary Did you mean? IE engine (Bibclassify)

Thank you for your attention. Questions?