
Advanced CQL and Profiling
Mike Taylor

1. Esoteric CQL features:
   – Word anchoring
   – Proximity
   – Relation modifiers
   – Boolean modifiers
2. Profiling
3. Prefix mapping
4. Defining relations

CQL features: esoterica

"You are not expected to understand this."
– comment in the Unix Version 6 source code.

The point is that new users are not required to understand these features, and may happily use CQL for many years, perhaps forever, without needing to.

CQL esoterica: word anchoring

A word beginning with ^ must occur at the start of its field.
A word ending with ^ must occur at the end of its field.

dinosaur   – matches "the complete dinosaur"
dinosaur^  – also matches
^dinosaur  – does not match
the        – matches "the complete dinosaur"
^the       – also matches
the^       – does not match
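
The anchoring rules above can be sketched in a few lines of Python. This is only an illustration for single-word terms against a whitespace-tokenised field; the function name `matches_anchored` is hypothetical, and a real server would apply its own tokenisation and normalisation.

```python
def matches_anchored(term: str, field: str) -> bool:
    """Check a single-word term, optionally anchored with ^, against a field.

    Assumption: the field is split on whitespace and compared case-insensitively.
    """
    words = field.lower().split()
    anchored_start = term.startswith("^")
    anchored_end = term.endswith("^")
    word = term.strip("^").lower()

    if word not in words:
        return False
    if anchored_start and words[0] != word:
        return False
    if anchored_end and words[-1] != word:
        return False
    return True


title = "The Complete Dinosaur"
for query in ["dinosaur", "dinosaur^", "^dinosaur", "the", "^the", "the^"]:
    print(query, matches_anchored(query, title))
```

The six queries are the ones from the slide, run against the same title.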

CQL esoterica: proximity

The prox boolean, by default, requires its operands to be next to each other, in either order:

cervical prox vertebra
  – equivalent to "cervical vertebra" or "vertebra cervical"
(cervical or dorsal) prox vertebra
  – equivalent to "cervical vertebra" or "dorsal vertebra" or "vertebra cervical" or "vertebra dorsal"
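
One way to see the equivalence is to enumerate the phrases that default prox accepts. The sketch below assumes each operand has already been reduced to a list of alternative single words; `expand_prox` is a hypothetical helper, not part of any CQL toolkit.

```python
def expand_prox(left_terms, right_terms):
    """List the two-word phrases that default prox accepts.

    Default prox means adjacent words in either order, so every
    left/right pairing is emitted forwards and then backwards.
    """
    forward = [f"{a} {b}" for b in right_terms for a in left_terms]
    backward = [f"{b} {a}" for b in right_terms for a in left_terms]
    return forward + backward


# (cervical or dorsal) prox vertebra
print(expand_prox(["cervical", "dorsal"], ["vertebra"]))
```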

CQL esoterica: proximity II

Modifiers can generalise the semantics of proximity:

cervical prox/distance<=5/ vertebrae
  – within five words of each other
cervical prox/distance=0/unit=sentence vertebrae
  – within the same sentence
cervical prox/distance>0/unit=paragraph vertebrae
  – in different paragraphs
cervical prox/ordered vertebrae
  – in the specified order: exactly equivalent to "cervical vertebrae"
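
A minimal sketch of how the distance and ordered modifiers might be evaluated for the word unit. It assumes that the distance between two words is the difference of their positions, so adjacent words are at distance 1; the sentence and paragraph units are not modelled. The function name `prox` and its parameters are illustrative only.

```python
import operator

# Comparison symbols that may follow "distance" in a prox modifier.
COMPARATORS = {
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
    ">=": operator.ge,
    ">": operator.gt,
}


def prox(left, right, text, relation="<=", distance=1, ordered=False):
    """Return True if `left` and `right` occur in `text` at an acceptable distance.

    Assumptions: whitespace tokenisation, case-insensitive comparison,
    word unit only, distance = difference in word positions.
    """
    words = text.lower().split()
    compare = COMPARATORS[relation]
    left_positions = [i for i, w in enumerate(words) if w == left.lower()]
    right_positions = [i for i, w in enumerate(words) if w == right.lower()]

    for i in left_positions:
        for j in right_positions:
            if i == j:
                continue
            if ordered and j < i:
                continue  # right operand must come after the left one
            if compare(abs(j - i), distance):
                return True
    return False


text = "cervical and dorsal vertebrae"
print(prox("cervical", "vertebrae", text))              # default: adjacent only
print(prox("cervical", "vertebrae", text, distance=5))  # prox/distance<=5
```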

CQL esoterica: relation modifiers

Modifiers can refine the semantics of relations:

title =/stem dig
  – finds dig, digging, dug, etc.
title any/relevant "dinosaur bird reptile"
  – finds sauropods, avian, crocodile, snake, etc.
author =/fuzzy tailor
  – finds Mike Taylor
phoneNumber exact/fuzzy " "
  – finds

CQL esoterica: relation modifiers II

Relation modifiers can be overloaded to specify extra information about the term that the relation joins to the index:

createdDate >/isoDate " :45:00"
  – the term is in ISO 8601 format.
location within/geom.polygon "(12,46) (15,52)"
  – the term indicates a polygon of two points (i.e. a straight line) rather than the corners of a rectangle.

CQL esoterica: boolean modifiers

Modifiers can refine the semantics of boolean operators. We've already seen some examples of this in proximity.

cervical prox/distance<=5/ vertebrae
  – within five words of each other
cervical or/exclusive vertebrae
  – one or the other, but not both.
denenberg or/rel.mean "information retrieval"
denenberg or/rel.sum "information retrieval"
denenberg or/rel.max "information retrieval"
  – average, total or maximum relevance of operands
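
The effect of these modifiers can be sketched with plain sets and numbers. This assumes each operand yields a set of record identifiers and a per-record relevance score between 0 and 1; the names `or_exclusive` and `RELEVANCE_COMBINERS` are hypothetical, and real engines are free to score differently.

```python
def or_exclusive(left_ids, right_ids):
    """or/exclusive: records matched by one operand or the other, but not both."""
    return left_ids ^ right_ids  # symmetric difference of the two result sets


# How the relevance of the two operands might be combined for one record.
RELEVANCE_COMBINERS = {
    "rel.mean": lambda a, b: (a + b) / 2,
    "rel.sum": lambda a, b: a + b,
    "rel.max": max,
}

print(or_exclusive({1, 2, 3}, {3, 4}))
for name, combine in RELEVANCE_COMBINERS.items():
    print(name, combine(0.25, 0.75))
```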

Profiling CQL

For simple searching, it suffices to use common indexes. Semantic interoperability requires more precise behaviour.

This lesson was learned in the Z39.50 world and resulted in the invention of profiles: specifications of the subset of the full standard that is needed to support a particular application.

The classic example in Z39.50 is the Bath Profile for bibliographic searching. Similarly, we define a Bath Profile for CQL searching.

Profiles and context sets

A profile is not the same thing as a context set!

A context set is merely a bag of indexes (and relation modifiers and boolean modifiers) that may be used in any application. A profile provides a palette of indexes drawn from several context sets.

The distinction is similar to that between XML namespaces and XML Schemas. Schemas depend on namespaces, and may use several. CQL profiles depend on context sets, and may use several.

Example: the Bath Profile

See

Bath searches may use any of the following indexes:

dc.creator                  bath.personalName
dc.title                    bath.corporateName
dc.subject                  bath.conferenceName
cql.anywhere                bath.uniformTitle
dc.identifier               bath.issn
dc.date                     rec.id
bath.keyTitle               bath.geographicName
dc.format                   bath.notes
dc.language                 bath.topicalSubject
bath.possessingInstitution  bath.genreForm
bath.name
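
As a rough illustration of a profile being a palette drawn from several context sets, the index list above can be held as a set and grouped by prefix. The helper names below are hypothetical, and the check is only a sketch of how an application might reject indexes that fall outside its profile.

```python
from collections import defaultdict

# The indexes listed on the slide for Bath Profile searches.
BATH_PROFILE_INDEXES = {
    "dc.creator", "dc.title", "dc.subject", "dc.identifier",
    "dc.date", "dc.format", "dc.language",
    "cql.anywhere", "rec.id",
    "bath.personalName", "bath.corporateName", "bath.conferenceName",
    "bath.uniformTitle", "bath.issn", "bath.keyTitle",
    "bath.geographicName", "bath.notes", "bath.topicalSubject",
    "bath.possessingInstitution", "bath.genreForm", "bath.name",
}


def indexes_by_context_set(indexes):
    """Group qualified index names by their context-set prefix."""
    grouped = defaultdict(set)
    for qualified in indexes:
        prefix, _, name = qualified.partition(".")
        grouped[prefix].add(name)
    return dict(grouped)


def outside_profile(used, profile=BATH_PROFILE_INDEXES):
    """Return the indexes in `used` that the profile does not include."""
    return sorted(set(used) - profile)


print(sorted(indexes_by_context_set(BATH_PROFILE_INDEXES)))
print(outside_profile(["dc.title", "dc.custardDepth"]))
```

Grouping by prefix shows the one profile drawing on four context sets: bath, cql, dc and rec.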

Existing and possible profiles

Explicit CQL profiles have been created for some applications:
  – Bath Profile for bibliographic data
  – Zthes profile for hierarchical thesaurus navigation

Profiles are in development (or unwritten) for others:
  – Google-like structureless searching
  – Simple metadata searching with the Dublin Core
  – CCG for collectable card games
  – Music: musicalKey, arranger, duration, etc.
  – GILS (Global Information Locator Service)
  – ... your application goes here!

CQL esoterica: prefix mapping

So far, we have been free and easy with index prefixes such as dc. But how do we know what they mean?

Why should dc mean Dublin Core rather than Deep Custard?
  dc.custardDepth <= 20

Why should bath mean the Bath Profile for bibliographic searching instead of plumbing supplies?
  bath.capacityInGallons > 45

CQL esoterica: prefix mapping II

Prefixes are just convenient, easy-to-type abbreviations. The real identifier of a context set is its URI. For example, the Dublin Core context set is
  info:srw/cql-context-set/1/dc-v1.1
but we map that URI to a prefix for convenience.

This is exactly like XML namespaces: they are identified by URIs, but the URIs do not appear in the names of elements or attributes. Short prefixes are used instead.

CQL esoterica: prefix mapping III

In XML, a prefix is associated with a namespace using:

In CQL, a prefix is associated with a context set using: >prefix= and the rest of the query follows.

The following queries are exactly equivalent:
  >dc=info:srw/cql-context-set/1/dc-v1.1 dc.title=fish
  >yx=info:srw/cql-context-set/1/dc-v1.1 yx.title=fish

Most applications will have established default mappings.

CQL esoterica: prefix mapping IV

It is possible to establish the context set from which indexes with no explicit prefix are taken by omitting the prefix= part from the mapping:
  > title=baron and side=sinister

So the following queries are exactly equivalent:
  >info:srw/cql-context-set/1/dc-v1.1 title=fish
  >yx=info:srw/cql-context-set/1/dc-v1.1 yx.title=fish

CQL esoterica: prefix mapping V

Finally... Finally! :-)

Prefix mappings can be stacked up:
  >dc = info:srw/cql-context-set/1/dc-v1.1
  >bath=
  >rec=info:srw/cql-context-set/2/rec-1.0
  rec.created < and dc.title=ecology and bath.conferenceName=dinosaur

(Yes, this is all one query.)
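
The prefix-mapping slides can be summarised in a small sketch that strips the leading mappings from a query and resolves an index name to a context-set URI. It is deliberately limited: it assumes URIs contain no whitespace, handles only a single index=term clause after the mappings, and uses hypothetical names (`split_mappings`, `resolve_clause`) rather than any real CQL parser's API.

```python
import re

# One leading mapping: ">" then an optional "prefix =" then a URI without spaces.
MAPPING = re.compile(r"\s*>\s*(?:(\w+)\s*=\s*)?(\S+)\s*")


def split_mappings(query):
    """Peel stacked prefix mappings off the front of a query.

    Returns (mappings, rest). A mapping with no prefix is stored
    under the key None and acts as the default context set.
    """
    mappings = {}
    pos = 0
    while (m := MAPPING.match(query, pos)) is not None:
        mappings[m.group(1)] = m.group(2)
        pos = m.end()
    return mappings, query[pos:]


def resolve_clause(query):
    """Resolve a single 'index=term' clause to (context-set URI, index name, term)."""
    mappings, clause = split_mappings(query)
    index, term = (part.strip() for part in clause.split("=", 1))
    prefix, dot, name = index.rpartition(".")
    uri = mappings[prefix if dot else None]
    return uri, name, term


for q in [
    ">dc=info:srw/cql-context-set/1/dc-v1.1 dc.title=fish",
    ">yx=info:srw/cql-context-set/1/dc-v1.1 yx.title=fish",
    ">info:srw/cql-context-set/1/dc-v1.1 title=fish",
]:
    print(resolve_clause(q))
```

All three queries resolve to the same URI, index name and term, which is the sense in which the slides call them exactly equivalent.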

CQL esoterica: prefix mapping VI

Don't try this at home.

Defining relations

CQL has a feature where any word can act as a relation. For example, the query:
  foo bar baz
is interpreted as index-name foo, relation bar, term baz, even though there is no relation bar.

This is a misfeature. It prevents the obvious interpretation of this query as a phrase search or AND search.

If your profile needs a new relation, consider defining it as a relation modifier on one of the existing relations instead.
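
A toy classifier makes the consequence concrete. It handles three-token queries only and is a simplification of the real grammar; the function name `classify` and the tuple it returns are invented for this sketch.

```python
import re

BOOLEANS = {"and", "or", "not", "prox"}
SYMBOLIC_RELATIONS = {"=", "==", "<>", "<", ">", "<=", ">="}
WORD = re.compile(r"[A-Za-z][\w.]*$")


def classify(query):
    """Classify a three-token query as a boolean or as a search clause.

    Any word in the middle position that is not a boolean is taken
    to be a relation, whether or not such a relation is defined.
    """
    tokens = query.split()
    if len(tokens) != 3:
        raise ValueError("this sketch only handles three-token queries")
    left, middle, right = tokens
    base, *modifiers = middle.split("/")
    if base.lower() in BOOLEANS:
        return ("boolean", left, base.lower(), modifiers, right)
    if base in SYMBOLIC_RELATIONS or WORD.match(base):
        return ("search-clause", left, base, modifiers, right)
    raise ValueError(f"cannot interpret {middle!r}")


print(classify("foo bar baz"))       # bar is read as a relation
print(classify("title =/stem dig"))  # existing relation plus a modifier
```

The second call shows the recommended alternative: new behaviour expressed as a modifier on an existing relation, which leaves the middle position unambiguous.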

Thanks for listening!