1
Project Description 198:541
2
Query Processing Project
1. Exact query answering using standard indexes
2. Advanced query processing
   - Multidimensional
   - Text data
   - Top-k query model
3
Implementation Details
- Choice of C++ or Java
- Data storage
  - You do not have to implement disk storage… but you can.
  - You can use a DBMS for storage, but you have to implement your own indexes.
  - You can simulate disk record IDs (rids) with a hash table that maps each rid to its full tuple.
  - For the purpose of this project, a main-memory implementation is fine, but it might be easier for you to have something more persistent.
- Single or multiple tables
  - You can have joins in the advanced query processing part of the project.
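The rid-by-hash-table idea above can be sketched as follows. This is a minimal main-memory illustration, not a required design: the class name `RidStore`, its methods, and the choice of `String[]` as the tuple type are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for disk storage: rid -> full tuple.
class RidStore {
    private final Map<Integer, String[]> tuples = new HashMap<>();
    private int nextRid = 0;

    // Stores a tuple and returns the rid that indexes should keep.
    int insert(String... fields) {
        int rid = nextRid++;
        tuples.put(rid, fields);
        return rid;
    }

    // Simulates a disk access: fetches the full tuple for a rid (null if absent).
    String[] fetch(int rid) {
        return tuples.get(rid);
    }

    public static void main(String[] args) {
        RidStore store = new RidStore();
        int rid = store.insert("XML Path Indexes", "2004");
        System.out.println(rid + " -> " + String.join(", ", store.fetch(rid)));
    }
}
```

Indexes then store only rids, and the full tuple is looked up in the store once a query has narrowed down the candidates.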
4
Step 1: Finding Data
- You should find a dataset
  - Multiple attributes (3-4 minimum)
  - At least 1000 data points
- Domain
  - Numeric values
  - Some text fields if you want to look into IR techniques
- Find data on which you can ask meaningful queries (exact and advanced)
- Sources:
  - Census data
  - Weather statistics
  - Bibliographical data
  - Sales data (Amazon)…
5
Step 2: Exact Query Processing
- Deciding on meaningful indexes for your application
- Bulk loading indexes (type is data and query dependent)
  - B+ tree
  - Hash tables
- Answering exact queries
  - Single-attribute
  - Multi-attribute (merging single-attribute results)
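As a rough sketch of the multi-attribute case, the example below builds one index per attribute and answers a conjunctive exact query by merging the two rid lists. Several things here are assumptions for illustration only: the attribute names (venue, year), the class and method names, and the use of `java.util.TreeMap` as a stand-in for a B+ tree, which the project expects you to implement yourself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ExactQueryDemo {
    // Hash index on a string attribute: value -> rids in ascending order.
    static Map<String, List<Integer>> buildHashIndex(String[] values) {
        Map<String, List<Integer>> index = new HashMap<>();
        for (int rid = 0; rid < values.length; rid++) {
            index.computeIfAbsent(values[rid], k -> new ArrayList<>()).add(rid);
        }
        return index;
    }

    // Ordered index on a numeric attribute; TreeMap only stands in for a B+ tree.
    static TreeMap<Integer, List<Integer>> buildOrderedIndex(int[] values) {
        TreeMap<Integer, List<Integer>> index = new TreeMap<>();
        for (int rid = 0; rid < values.length; rid++) {
            index.computeIfAbsent(values[rid], k -> new ArrayList<>()).add(rid);
        }
        return index;
    }

    // Merges two ascending rid lists, keeping only rids present in both.
    static List<Integer> intersect(List<Integer> a, List<Integer> b) {
        List<Integer> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            int x = a.get(i), y = b.get(j);
            if (x == y) { out.add(x); i++; j++; }
            else if (x < y) { i++; }
            else { j++; }
        }
        return out;
    }

    // Answers "venue = v AND year = y" from the two single-attribute results.
    static List<Integer> query(Map<String, List<Integer>> byVenue,
                               TreeMap<Integer, List<Integer>> byYear,
                               String venue, int year) {
        List<Integer> left = byVenue.getOrDefault(venue, Collections.emptyList());
        List<Integer> right = byYear.getOrDefault(year, Collections.emptyList());
        return intersect(left, right);
    }

    public static void main(String[] args) {
        String[] venues = {"VLDB", "SIGMOD", "VLDB", "VLDB"};
        int[] years = {2004, 2004, 2003, 2004};
        System.out.println(query(buildHashIndex(venues), buildOrderedIndex(years), "VLDB", 2004));
    }
}
```

Because rids are appended in increasing order while the indexes are built, each rid list is already sorted and the merge is a single linear pass.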
6
Step 3: Advanced Query Processing
Numeric Data
- Multidimensional indexes
  - Multidimensional range query processing
- Skyline queries
  - Find the best undominated tuples in the data set
  - Related: maximize a function of the attribute values
- Top-k query processing, nearest-neighbor queries
  - Smart index accesses based on preferred result values
- Join optimization using specific join indexes
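To make the skyline and top-k notions above concrete, here is a baseline sketch that uses full scans rather than indexes. It assumes that smaller attribute values are better and that top-k ranks by a weighted sum. Both are illustrative choices, as are the class and method names. An index-based solution would aim to avoid scanning every tuple.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class NumericQueryDemo {
    // p dominates q if p is no worse in every dimension and strictly better in one
    // (assumption: smaller values are better).
    static boolean dominates(double[] p, double[] q) {
        boolean strictlyBetter = false;
        for (int d = 0; d < p.length; d++) {
            if (p[d] > q[d]) return false;
            if (p[d] < q[d]) strictlyBetter = true;
        }
        return strictlyBetter;
    }

    // Naive O(n^2) skyline: keep every point that no other point dominates.
    static List<double[]> skyline(List<double[]> points) {
        List<double[]> result = new ArrayList<>();
        for (double[] p : points) {
            boolean dominated = false;
            for (double[] q : points) {
                if (dominates(q, p)) { dominated = true; break; }
            }
            if (!dominated) result.add(p);
        }
        return result;
    }

    // Weighted-sum score; lower is better under the same assumption.
    static double score(double[] p, double[] weights) {
        double s = 0;
        for (int d = 0; d < p.length; d++) s += weights[d] * p[d];
        return s;
    }

    // Top-k by full scan, using a max-heap bounded to k entries.
    static List<double[]> topK(List<double[]> points, double[] weights, int k) {
        Comparator<double[]> byScore = Comparator.comparingDouble(p -> score(p, weights));
        PriorityQueue<double[]> heap = new PriorityQueue<>(byScore.reversed());
        for (double[] p : points) {
            heap.add(p);
            if (heap.size() > k) heap.poll(); // drop the current worst candidate
        }
        List<double[]> out = new ArrayList<>(heap);
        out.sort(byScore); // best first
        return out;
    }
}
```

The skyline does not depend on any weights, whereas the top-k answer changes with the scoring function, which is why the two are listed as related but distinct query types.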
7
Step 3: Advanced Query Processing
Numeric and Text Data
- IR techniques for text-only queries
  - Inverted list indexes
  - Exact queries
  - Top-k queries (tf.idf scores)
- Text and value queries
  - Exact queries: find articles written in 2004 with “XML Path Indexes” in their abstract
  - Top-k queries
    - Exact matching on text, ranking on numeric value
    - Exact matching on numeric values, ranking on text
    - Ranking on both numeric values and text
    - More research-oriented
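A text-only top-k query over inverted lists might look like the sketch below. It assumes a simple tf.idf variant (raw term frequency times ln(N/df)), lowercase tokenization on non-alphanumeric characters, and ties broken by document id. These choices, and the class name `TextTopK`, are illustrative rather than prescribed by the project.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TextTopK {
    // Inverted lists: term -> (docId -> term frequency in that document).
    private final Map<String, Map<Integer, Integer>> postings = new HashMap<>();
    private int numDocs = 0;

    // Tokenizes the text, updates the inverted lists, and returns the new doc id.
    int add(String text) {
        int docId = numDocs++;
        for (String term : text.toLowerCase().split("[^a-z0-9]+")) {
            if (term.isEmpty()) continue;
            postings.computeIfAbsent(term, t -> new HashMap<>()).merge(docId, 1, Integer::sum);
        }
        return docId;
    }

    // Returns up to k doc ids ranked by the summed tf.idf of the query terms.
    List<Integer> topK(String query, int k) {
        Map<Integer, Double> scores = new HashMap<>();
        for (String term : query.toLowerCase().split("[^a-z0-9]+")) {
            Map<Integer, Integer> list = postings.get(term);
            if (list == null) continue; // term occurs in no document
            double idf = Math.log((double) numDocs / list.size());
            for (Map.Entry<Integer, Integer> e : list.entrySet()) {
                scores.merge(e.getKey(), e.getValue() * idf, Double::sum);
            }
        }
        List<Integer> docs = new ArrayList<>(scores.keySet());
        docs.sort((a, b) -> {
            int c = Double.compare(scores.get(b), scores.get(a)); // higher score first
            return c != 0 ? c : Integer.compare(a, b);            // tie: lower doc id first
        });
        return new ArrayList<>(docs.subList(0, Math.min(k, docs.size())));
    }
}
```

Combining this with numeric predicates, for example restricting to articles from 2004, could be done by intersecting the scored doc ids with the rids returned by a numeric index before ranking.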