IMA 4/26/2004Advanced Math Search1 ADVANCED MATH SEARCH: ISSUES & TECHNIQUES Abdou Youssef The George Washington University And The National Institute.

Slides:



Advertisements
Similar presentations
Pseudo-Relevance Feedback For Multimedia Retrieval By Rong Yan, Alexander G. and Rong Jin Mwangi S. Kariuki
Advertisements

Content-Based Image Retrieval
UCLA : GSE&IS : Department of Information StudiesJF : 276lec1.ppt : 5/2/2015 : 1 I N F S I N F O R M A T I O N R E T R I E V A L S Y S T E M S Week.
Compiler Principle and Technology Prof. Dongming LU Mar. 28th, 2014.
Extraction of text data and hyperlink structure from scanned images of mathematical journals Ann Arbor, March 19, 2002 Masakazu Suzuki (Kyushu University)
The Experience Factory May 2004 Leonardo Vaccaro.
Information Retrieval in Practice
1 CS 430: Information Discovery Lecture 22 Non-Textual Materials 2.
Image Search Presented by: Samantha Mahindrakar Diti Gandhi.
Visual Web Information Extraction With Lixto Robert Baumgartner Sergio Flesca Georg Gottlob.
Connecting with Computer Science, 2e
INFORMATION RETRIEVAL WEEK 1 AND 2
© Anselm SpoerriInfo + Web Tech Course Information Technologies Info + Web Tech Course Anselm Spoerri PhD (MIT) Rutgers University
Senior Project – I.D. Math & Computer Science jsMath Equation Editor Dana Cartwright Advisors – Prof. Cervone & Prof. Striegnitz Editor Design -
1 Calculemus Autumn School Approaches On Integration 10/28/02 Sabina Petride.
Overview of Search Engines
Chapter 14 Introduction to HTML
ACAT 2008 Erice, Sicily WebDat: Bridging the Gap between Unstructured and Structured Data Jerzy M. Nogiec, Kelley Trombly-Freytag, Ruben Carcagno Fermilab,
Knowledge Science & Engineering Institute, Beijing Normal University, Analyzing Transcripts of Online Asynchronous.
Visualization By: Simon Luangsisombath. Canonical Visualization  Architectural modeling notations are ways to organize information  Canonical notation.
Introduction to Mathematical Computing © Francis J. Wright, 2010 Goldsmiths' Company Mathematics Course 2010.
DOG I : an Annotation System for Images of Dog Breeds Antonis Dimas Pyrros Koletsis Euripides Petrakis Intelligent Systems Laboratory Technical University.
IMA 8/11/2006Advanced Math Search1 Relevance Ranking and Hit Packaging in Math Search Abdou Youssef The George Washington University And The National Institute.
CONTI’2008, 5-6 June 2008, TIMISOARA 1 Towards a digital content management system Gheorghe Sebestyen-Pal, Tünde Bálint, Bogdan Moscaliuc, Agnes Sebestyen-Pal.
Copyright 2000, Media Cybernetics, L.P. Array-Pro ® Analyzer Software.
_______________________________________________________________________________________________________________ E-Commerce: Fundamentals and Applications1.
Systems of Linear Equation and Matrices
A Generative and Model Driven Framework for Automated Software Product Generation Wei Zhao Advisor: Dr. Barrett Bryant Computer and Information Sciences.
CS621 : Seminar-2008 DEEP WEB Shubhangi Agrawal ( )‏ Jayalekshmy S. Nair ( )‏
1 The BT Digital Library A case study in intelligent content management Paul Warren
Chapter 7 Web Content Mining Xxxxxx. Introduction Web-content mining techniques are used to discover useful information from content on the web – textual.
Funded by: European Commission – 6th Framework Project Reference: IST WP 2: Learning Web-service Domain Ontologies Miha Grčar Jožef Stefan.
Defining Text Mining Preprocessing Transforming unstructured data stored in document collections into a more explicitly structured intermediate format.
The string data type String. String (in general) A string is a sequence of characters enclosed between the double quotes "..." Example: Each character.
1 HTML John Sum Institute of Technology Management National Chung Hsing University.
Chapter 2 Architecture of a Search Engine. Search Engine Architecture n A software architecture consists of software components, the interfaces provided.
CS315 – Link Analysis Three generations of Search Engines Anchor text Link analysis for ranking Pagerank HITS.
MATLAB
Semantic Network as Continuous System Technical University of Košice doc. Ing. Kristína Machová, PhD. Ing. Stanislav Dvorščák WIKT 2010.
Formal Verification Lecture 9. Formal Verification Formal verification relies on Descriptions of the properties or requirements Descriptions of systems.
Scientific Applications of XML Arvind Hulgeri, Shantanu Godbole
Thesaurus.maths.org: Connecting Mathematics Mike Pearson University of Cambridge, England Igor Podlubny Tech Univ of Kosice, Slovakia Vera Oláh J. Bolyai.
1 CS 430: Information Discovery Lecture 22 Non-Textual Materials: Informedia.
The Evolving Digital Mathematics Library: A Mathematics Librarian’s Perspective Timothy W. Cole University of Illinois at Urbana-Champaign 8 Dec
Database Systems Design, Implementation, and Management Coronel | Morris 11e ©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or.
Structure of IR Systems INST 734 Module 1 Doug Oard.
Team Members Dilip Narayanan Gaurav Jalan Nithya Janarthanan.
March 31, 1998NSF IDM 98, Group F1 Group F Multi-modal Issues, Systems and Applications.
Daniel A. Keim, Hans-Peter Kriegel Institute for Computer Science, University of Munich 3/23/ VisDB: Database exploration using Multidimensional.
BOĞAZİÇİ UNIVERSITY DEPARTMENT OF MANAGEMENT INFORMATION SYSTEMS MATLAB AS A DATA MINING ENVIRONMENT.
JISC/NSF PI Meeting, June Archon - A Digital Library that Federates Physics Collections with Varying Degrees of Metadata Richness Department of Computer.
NA-MKM 2004 Phoenix MKM and the NIST DLMF Dan Lozier National Institute of Standards and Technology Gaithersburg, MD
A Constraint Language Approach to Grid Resource Selection Chuang Liu, Ian Foster Distributed System Lab University of Chicago
Intelligent Database Systems Lab N.Y.U.S.T. I. M. 1 Mining knowledge from natural language texts using fuzzy associated concept mapping Presenter : Wu,
Enabling Task Centered Knowledge Support through Semantic Markup Rob Jasper Mike Uschold Boeing Phantom Works.
A Portrait of the Semantic Web in Action Jeff Heflin and James Hendler IEEE Intelligent Systems December 6, 2010 Hyewon Lim.
Nov 9, 2005SIAM Johns Hopkins University1 Math on the Web and the DLMF Project Daniel W Lozier National Institute of Standards and Technology Gaithersburg,
Selected Semantic Web UMBC CoBrA – Context Broker Architecture  Using OWL to define ontologies for context modeling and reasoning  Taking.
Selecting Relevant Documents Assume: –we already have a corpus of documents defined. –goal is to return a subset of those documents. –Individual documents.
Scientific Markup Languages Birds of a Feather Brief Overview of MathML Timothy W. Cole Mathematics Librarian & Professor of Library.
COMP 412, FALL Type Systems C OMP 412 Rice University Houston, Texas Fall 2000 Copyright 2000, Robert Cartwright, all rights reserved. Students.
Information Retrieval in Practice
Search Engine Architecture
Introduction Multimedia initial focus
Software for scientific calculations
Case-Based Reasoning System for Bearing Design
Tomás Murillo-Morales and Klaus Miesenberger
Databases Continued 10/18/05.
Ann Arbor, March 19, 2002 Masakazu Suzuki (Kyushu University)
Presentation transcript:

IMA 4/26/2004Advanced Math Search1 ADVANCED MATH SEARCH: ISSUES & TECHNIQUES Abdou Youssef The George Washington University And The National Institute of Standards and Technology (DLMF)

IMA 4/26/2004Advanced Math Search2 Outline Context of our Math Search Project The Project’s Short-Term Goals Where we are: A Demo Issues faced Goals and Issues for the Longer Term

IMA 4/26/2004Advanced Math Search3 Context of our Math Search Project The Digital Library of Mathematical Functions (DLMF) at NIST Web+Book Replacement of the Abramowitz and Stegun Handbook Special functions, Analysis, Functions of Number Theory, Combinatorial Analysis, Numerical Methods, Statistical Methods, … DLMF:Mostly Equations – Need Math Search Support from NIST and NSF

IMA 4/26/2004Advanced Math Search4 Short-Term Goals Build a math search system that 1. Understands math symbols & structures 2. Returns equations directly, not just hit-titles 3. Highlights matched equations in documents 4. Understands dialects (Latex, Mathematica, Maple) 5. Provides different search modes (TOC, Index, Free-style search, and Menu-driven search)

IMA 4/26/2004Advanced Math Search5 Where we Are Demo of the Search System (Not available online)

IMA 4/26/2004Advanced Math Search6 Sample Queries: understanding math, eq. search & highlighting FormEntry int_0^infinity sin((1/3)t^3+xt) int sin((1/3)t^3+xt) sqrt(Ai^2+Bi^2)  ( -$+$) Gamma(lambda-$+$) J ν or J 0 J_nu or J_0 Ai and J Ai and BesselJ

IMA 4/26/2004Advanced Math Search7 Sample Queries: different dialects BesselJ(nu,z) BesselJ(nu, ) BesselJ(,zeta) JacobiP(n,alpha,beta,x) JacobiP(, alpha,, ) LaguerreL(,, x) LegendreP[, mu, ] LegendreP[,,sqrt(z)]

IMA 4/26/2004Advanced Math Search8 Search Modes One extreme: (static and limited) Table of contents(thematic, coarse-grained) Index(alphabetical, fine-grained) Opposite extreme: (dynamic and unlimited) Free-style search Another mode:(a middle ground) Menu-driven search based on an ontology constrained/standard vocabulary Hybrids of the above

IMA 4/26/2004Advanced Math Search9 Issues Faced Recognizing and Indexing Math Symbols and Structures Pre- MathML & XPath/XQuery Highlighting Matched Equations (GIF Images) inside HTML Documents Development of a Query Language that is Intuitive, Natural, Rich, and Consistent Obtaining/Deriving Metadata for Equations Development of a Math Taxonomy/Ontology Suitable for Menu-Driven Search

IMA 4/26/2004Advanced Math Search10 Techniques: for Handling Math Symbols and Structures For Recognizing and Indexing Math Symbols and Structures, TexSNize: 1. Textualization of math symbols 2. Scoping of the various parts of terms/exprs 3. Normalization of the orders of parts TexSNize the Contents Offline before Indexing TexSNize each Query before the search

IMA 4/26/2004Advanced Math Search11 Techniques: for Equation Search and Highlighting Create a data model that logically decouples equations from their native documents. Assign a unique ID to each equation For Returning Equations directly: algorithm that uses a hit list of equation IDs to generate online a document containing the equations For Highlighting Equations: Use the IDs of matched equations to locate the latter in a to-be-displayed document add coloring HTML markup to doc before display

IMA 4/26/2004Advanced Math Search12 Goals for the Longer Term We Embarked on Building a 2 nd Generation Math Search System Based on Content MathML+XPath/XQuery More Precise/Expressive Query Language Higher resolution search Term-occurrence search Predicate search Search with term substitution Similarity Search (for Sci. Data Mining)

IMA 4/26/2004Advanced Math Search13 Examples of Future Query Types Queries specifying subparts sin x in a denominator x-y in a 3 rd row of a matrix Predicate queries z k, where k is an integer that ranges from –4 to 4 Term-substitution queries g(ω)=z 2 + z +1, where z =e iω Abstraction support and similarity search x^2 + y^2 =1 whatever x and y

IMA 4/26/2004Advanced Math Search14 Candidate Syntax /(… sin x 2πx …) z^$k where integer($k) & abs($k)<5 x-y in matrix[2,3] x^2 in matrix [$k,$j] where abs($k-$j) < 2 $A where matrix($A) & (forall $k) (forall $j 0 $S where set($S) & ( ∃ $x in $S): integer($x) & $x>0 x<1 in condition(set)

IMA 4/26/2004Advanced Math Search15 Issues that Need to Be Faced under C-MathML & XPath/XQuery Canonical Normal Forms of Contents Math equivalences: ab/c: (a*b)/c or a*(b/c) Notational equivalences: Distributed definitions Uniform Symbolic Notation Standard Ontologies Development of Metadata Automated extrapolation of metadata Manual (by authors and communities)

IMA 4/26/2004Advanced Math Search16 Issues to Be Faced (Contd.) What Users Need/Want/Prefer What modes of search? What kinds of information? definitions, equations, theorems, proofs, proof techniques, step-by-step evaluations, themes, theories, expositions, etc.? What granularity of retrieval unit? What interactive features? Definition of terms, plotting of matched functions, computing of matched functions? Use of Knowledge of Users’ Needs/Preferences Better design of the search UI Better relevance-ranking of search results

IMA 4/26/2004Advanced Math Search17 We Are Barely Scratching the Surface The Possibilities Are Endless Search + Automated Reasoning Search + Computing + Visualization