

1 Query Languages

2 Keyword-Based Querying
• Single Word Queries
• Context Queries
  • Phrase
  • Proximity
• Boolean Queries
  • OR, AND, BUT
• Natural Language

3 Pattern Matching
• … allow the retrieval of pieces of text that have some property (match a pattern).
• A pattern is a set of syntactic features that must occur in a text segment:
  • Words
  • Prefixes
  • Suffixes
  • Substrings
  • Ranges
  • Allowing errors (note edit distance)
  • Regular expressions
    • Union: a|b
    • Concatenation: ab
    • Repetition: a*
    • Example: (DNA | microbe) (question | problem) (ε | s)
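
The regular-expression example and the "allowing errors" item above can be sketched in Python. This is only an illustration under stated assumptions: the slide's pattern is translated into Python `re` syntax (the empty alternative becomes an optional `s?`), and "errors" are measured with plain Levenshtein edit distance. The names `PATTERN`, `find_matches`, and `edit_distance` are hypothetical and not part of any system named in the slides.

```python
import re

# Assumed translation of the slide's pattern into Python regex syntax:
# union -> "|", concatenation -> adjacency, optional plural -> "s?".
PATTERN = re.compile(r"\b(DNA|microbe) (question|problem)s?\b")


def find_matches(text):
    """Return every substring of text that matches PATTERN."""
    return [m.group(0) for m in PATTERN.finditer(text)]


def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]
```

A query "allowing errors" would then accept any word whose `edit_distance` to the query word is at most some small threshold.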

4 Structural queries
• Form-like fixed structure
  • Given example: Mail archive
  • Other examples: Log file, …
• Hypertext structure
• Search by content and structure
• Hierarchical structure
  • Intermediate level of flexibility
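
A form-like query over a mail archive might look like the following sketch. Everything here is assumed for illustration: the `Mail` record, its three fields, and the `form_query` helper are hypothetical, and matching is a simple case-insensitive substring test per field.

```python
from dataclasses import dataclass


@dataclass
class Mail:
    # Hypothetical fixed structure: every message has the same fields.
    sender: str
    subject: str
    body: str


def form_query(archive, **field_terms):
    """Return the mails in which every named field contains its term
    (case-insensitive substring match), like filling in a search form."""
    results = []
    for mail in archive:
        if all(term.lower() in getattr(mail, field).lower()
               for field, term in field_terms.items()):
            results.append(mail)
    return results
```

For example, `form_query(archive, sender="tompa", subject="PAT")` restricts each term to one field, which a plain keyword query over the whole text cannot express.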

5 Hierarchical models
• PAT expressions
  • PAT is a text searching system
  • Developed at University of Waterloo
  • Commercially available through Open Text Corporation in Waterloo
  • See "PAT expressions: an algebra for text search" by Airi Salminen and Frank W. Tompa, Acta Linguistica Hungarica 41, 1-4 (1992-93), 1994, 277-306
    http://db.uwaterloo.ca/~fwtompa/publications.html
  • http://db.uwaterloo.ca/OED/search/expl-pat.html
  • Match a string and return the string and suffix, to the end of the document.

6
• PAT interprets text as a set of suffix strings
• For example, indexing every word in this sentence yields the 12 strings:
    For example, indexing every word in this sentence yields the 12 strings:
    example, indexing every word in this sentence yields the 12 strings:
    indexing every word in this sentence yields the 12 strings:
    every word in this sentence yields the 12 strings:
    word in this sentence yields the 12 strings:
    in this sentence yields the 12 strings:
    this sentence yields the 12 strings:
    sentence yields the 12 strings:
    yields the 12 strings:
    the 12 strings:
    12 strings:
    strings:
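
The suffix strings listed above can be generated with a few lines of Python. This is a minimal sketch that assumes words are separated by whitespace; `word_suffixes` is a hypothetical helper name, and a real index would store start positions rather than copies of the text.

```python
def word_suffixes(text):
    """Return one suffix string per word: the text from that word
    to the end, in order of word position."""
    words = text.split()
    return [" ".join(words[i:]) for i in range(len(words))]


SENTENCE = ("For example, indexing every word in this sentence "
            "yields the 12 strings:")

for suffix in word_suffixes(SENTENCE):
    print(suffix)
```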

7 PAT search example
    >> water
    1: 48442 matches
    >> pr sample.7
    192807323,..sper.dr- water + a&lenis.dhfa&acu.goj voracious: s..
    520790341,..e took to the water, disappeared, leaving it on the low under ba..
    145798504,..nced from the water like a carp. 1843 Paget..
    549737948,..4 The 1929 water ski champion, Herr Pribitzer of the water-sk..
    190797617,..ngsley Water-Bab. iii. 116 Dark hovers under swirl..
    549099801,..ating-oil..of water-white and odorless qualities...
    549623784,.. the maddest *Waterloo-Crackers. 1851 Mayhe..
Source: http://db.uwaterloo.ca/OED/search/expl-pat.html

8
    >> a..z
    2: 60343111 matches
    >> pr sample
    555709177,..e Christopher as my owne, I will he be put unto the schoale. </T..
    290164101,..ir slangy off-colour jokes. 1972 G. Bl..
    10053096,../D> Compl. Fam.-Piece ii. iii. 388 Amber Pear..
    97073359,..> in Cott. Hom. 201 &Th.e muchele delit of &th.ine swe..
    58277014,.. specially in knowledge (as the seraphim in love); a conventiona..
    194517420,..mplative, and nonverbal. 1957 Wh..
    408029625,..us widths and patterns. 1833 J. Bennett..
    481205743,..design of the SEAC and DYSEAC. 1960 Gregory..
    440450458,..III. 558/2 The domain of Sonata was for a long while almost m..
    502535403,..ed comprises..two Gatling guns, and six *torpedo tubes or torped..

9
    >> "to be or"
    3: 458 matches
    >> pr sample.5
    454233240,.. set upright; to be or become erect. Of hair, spines, etc.: cf...
    562398537,.., liable to be or capable of being withheld...
    94031003,..> ): i.e. to be (or make it) a matter of death of capital pu..
    192510576,..Of the voice: to be or to become husky. 1922..
    407435097,..7 A Sealer to be ordeyned &amp. sworne to stryke the Cloth &a..
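
The sessions above suggest that a query matches any suffix string that begins with the query text: "water" also hits "Waterloo-Crackers", and "to be or" also hits "to be ordeyned". A rough way to reproduce that behaviour is a sorted list of word-start positions searched with binary search. This is a sketch under assumptions, not PAT's actual index: the case-insensitive comparison, the word-boundary rule, and the names `build_index` and `prefix_search` are all chosen here for illustration.

```python
import bisect


def build_index(text):
    """Return (lowercased text, word-start positions sorted by the
    suffix string that begins at each position)."""
    lowered = text.lower()
    starts = [i for i in range(len(lowered))
              if lowered[i].isalnum()
              and (i == 0 or not lowered[i - 1].isalnum())]
    starts.sort(key=lambda i: lowered[i:])
    return lowered, starts


def prefix_search(lowered, starts, query):
    """Return the positions (in text order) whose suffix string
    begins with query, compared case-insensitively."""
    q = query.lower()
    # Truncating sorted suffixes to a common length keeps them sorted,
    # so binary search on the truncated keys is valid.
    keys = [lowered[i:i + len(q)] for i in starts]
    lo = bisect.bisect_left(keys, q)
    hi = bisect.bisect_right(keys, q)
    return sorted(starts[lo:hi])
```

Because only word starts are indexed, a query for "water" would find "Waterloo" but not "underwater", since no indexed suffix string begins inside that word.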

10 Query Protocols
• Z39.50
• WAIS
• CCL
• CD-RDx
• SFQL

