Beyond Basic Faceted Search Ben-Yitzhak, et al. Fahimeh Fakour CS 572 Summer 2010.

Introduction
1. Importance and Significance
2. Background Information
3. Objective
4. Related Work
5. Approach and Solutions
6. Enhancements
7. Contributions
8. Pros & Cons
7/7/2010 Beyond Basic Faceted Search

1. Importance and Significance
Too much information
Transactions

1. Importance and Significance (cont)
Categories, lists, and the human mind

2. Background Information
Research done at the IBM and Yahoo! research labs
Facets, buckets, and categories
  – Navigate multiple paths for different ordering
Free text queries
List of matching documents with count

3. Objective
Extend traditional facet
  – Beyond numbers
  – Numbers, Words
Search & index correlated documents
Similarity to OLAP: multi-dimensional data

4. Related Work
Multifaceted search
  – Lexical subsumption
  – Synsets and hypernyms
  – RawSugar social tagging
Online Analytical Processing (OLAP)
  – Multi-dimensional data
  – Aggregation of data: Cube
  – N-dimensional “group by”
Exciting new technique
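
The N-dimensional “group by” idea can be illustrated with a small sketch. This is not taken from the paper or from any OLAP product; the function name `cube_counts` and the sample rows are made up for illustration. It computes a count for every subset of the chosen dimensions, which is what a cube aggregates.

```python
from collections import Counter
from itertools import combinations

def cube_counts(rows, dims):
    """Count rows for every subset of dims (the 2^N group-bys of a cube)."""
    cube = {}
    for r in range(len(dims) + 1):
        for group in combinations(dims, r):
            # Group rows by the values they take on this subset of dimensions.
            counts = Counter(tuple(row[d] for d in group) for row in rows)
            cube[group] = dict(counts)
    return cube

# Hypothetical sample data: three items described by two dimensions.
rows = [
    {"color": "red", "size": "M"},
    {"color": "red", "size": "S"},
    {"color": "blue", "size": "S"},
]
cube = cube_counts(rows, ("color", "size"))
print(cube[("color",)])          # counts per color
print(cube[("color", "size")])   # counts per (color, size) pair
```

A plain facet count corresponds to one single-dimension group-by; the cube generalizes this to all combinations of dimensions.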

5. Approach & Solutions
5.1. Technologies: Lucene & Solr
5.2. Data Model
5.3. Facet hierarchy: Forest
5.4. Creating the facet paths
5.5. Running the facet query
5.6. Example

5.1. Technologies: Lucene & Solr
Posting element: docID, offset, payload
  – Payload: byte array of additional info (runtime accessible)
Matching document processing

5.2. Data Model
Taxonomy: hierarchical relationships among facets
  – Predefined taxonomy
  – Acquired/learned through documents
Facet-path forest
  – Tree: top-level facet

5.3. Facet hierarchy: Forest
Find facet hierarchies
Map documents to that hierarchy

5.4. Creating the facet paths
Posting element for the document for each prefix of P_i
Add path to taxonomy index
Encode all k paths related to this document
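
A minimal sketch of these indexing steps, in plain Python rather than Lucene. The class `TaxonomyIndex`, the helper names, and the `Facet$...` token layout (borrowed from the example slide) are assumptions for illustration, not the paper's or Lucene's actual API. Each prefix of a facet path gets a posting for the document, and each distinct path gets an ordinal in the taxonomy index.

```python
from collections import defaultdict

class TaxonomyIndex:
    """Hypothetical stand-in: maps each facet path to an ordinal number."""
    def __init__(self):
        self.ordinals = {}

    def add_path(self, path):
        # Assign the next free ordinal the first time a path is seen.
        if path not in self.ordinals:
            self.ordinals[path] = len(self.ordinals)
        return self.ordinals[path]

def prefixes(path):
    """All non-empty prefixes of a facet path, shortest first."""
    return [tuple(path[:i]) for i in range(1, len(path) + 1)]

def index_document(doc_id, paths, taxonomy, postings):
    """Add postings for every prefix of every path; return the doc's ordinals."""
    ordinals = set()
    for path in paths:
        for prefix in prefixes(path):
            ordinals.add(taxonomy.add_path(prefix))
            token = "Facet$" + "$".join(prefix)
            if doc_id not in postings[token]:
                postings[token].append(doc_id)
    # Stand-in for the encoded per-document path information.
    return sorted(ordinals)

taxonomy = TaxonomyIndex()
postings = defaultdict(list)
doc1_ordinals = index_document("doc1", [("clothing", "children's")], taxonomy, postings)
doc2_ordinals = index_document("doc2", [("clothing", "women's")], taxonomy, postings)
print(postings["Facet$clothing"])             # ['doc1', 'doc2']
print(postings["Facet$clothing$children's"])  # ['doc1']
```

Indexing every prefix is what lets a query on a parent facet such as clothing also match documents tagged only with a child facet.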

5.5. Running the facet query
Terms:
  – Faceted query: string + taxonomy subtrees
  – Faceted result set: ranked list of documents matching query + counters
Lucene: use the Taxonomy Index function to determine ordinal number of paths
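
The two terms above can be sketched as a toy function. This is an illustrative assumption, not Lucene code: the scoring is a simple term-frequency count, the function name `faceted_search` is made up, and paths are kept as tuples instead of ordinals. It returns a ranked list of matching documents plus counters for every facet path under the requested taxonomy subtrees.

```python
from collections import Counter

def faceted_search(query, docs, requested_roots):
    """Return (ranked doc ids, counters per facet path under requested roots)."""
    terms = query.lower().split()
    scored = []
    for doc_id, doc in docs.items():
        words = doc["text"].lower().split()
        score = sum(words.count(t) for t in terms)  # toy relevance score
        if score > 0:
            scored.append((score, doc_id))
    # Highest score first; ties broken by doc id for a stable order.
    ranked = [doc_id for _, doc_id in sorted(scored, key=lambda s: (-s[0], s[1]))]

    counters = Counter()
    for doc_id in ranked:
        seen = set()
        for path in docs[doc_id]["paths"]:
            if path[0] not in requested_roots:
                continue
            # A document counts once for each prefix of each of its paths.
            for i in range(1, len(path) + 1):
                seen.add(path[:i])
        counters.update(seen)
    return ranked, dict(counters)

# Hypothetical documents with free text and facet paths.
docs = {
    "doc1": {"text": "red winter coat", "paths": [("clothing", "children's"), ("color", "red")]},
    "doc2": {"text": "blue coat coat", "paths": [("clothing", "women's"), ("color", "blue")]},
    "doc3": {"text": "red scarf", "paths": [("clothing", "accessories"), ("color", "red")]},
}
ranked, counters = faceted_search("coat", docs, {"clothing"})
print(ranked)
print(counters)
```

Only the clothing subtree is counted here because it is the only requested root; the color facet is ignored for this query.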

5.6. Example
Clothing: All seasons, Women’s, Accessories, Children’s, Coats, Winter, Coats
Price: $30-$40 ($30-$35, $36-$40)
Color: Red, Blue
Facet$clothing: doc1, doc2
Facet$clothing$children’s: doc1

6. Enhancements

6.1. Business Intelligence
Qualitative rather than quantitative
  – Best sellers rather than number of books published by author
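
The best-sellers example can be made concrete with a short sketch. The field names (`author`, `copies_sold`), the function `aggregate_facet`, and the sample numbers are hypothetical. The point is only that aggregating a per-document value over a facet can rank facet values differently than a plain document count.

```python
from collections import defaultdict

def aggregate_facet(docs, facet, value_field):
    """Per facet value, return both the document count and the summed value."""
    counts = defaultdict(int)
    totals = defaultdict(int)
    for doc in docs:
        key = doc[facet]
        counts[key] += 1
        totals[key] += doc[value_field]
    return dict(counts), dict(totals)

# Hypothetical catalog: author A has more books, author B sells more copies.
books = [
    {"author": "A", "copies_sold": 100},
    {"author": "A", "copies_sold": 50},
    {"author": "B", "copies_sold": 900},
]
counts, totals = aggregate_facet(books, "author", "copies_sold")
top_by_count = max(counts, key=counts.get)
top_by_sales = max(totals, key=totals.get)
print(top_by_count, top_by_sales)
```

With these numbers the count-based facet puts author A first, while the sales-based aggregation puts author B first.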

6.2. Dynamic Facets: Welcome to the real world
Not always independent data
Example:
  – Running shorts
  – Different sizes per color
  – Location & price

6.2. Dynamic Facets: Solution
Use tree over the data
Manufacturer: Arthur’s Sports
Model: Excalibur
Type: Running Shorts
  Color: red, Color: black
    Size: medium
      Store: SJ, Price: $15
      Store: NY, Price: $20
  Color: blue
    Size: small
      Store: SJ, Price: $15
      Store: NY, Price: $20

6.2. Dynamic Facets: Solution (cont)
Manufacturer: Arthur’s Sports
Model: Galahad
Type: Running Shorts
  Store: SJ
    Color: black, white; Size: medium, large; Price: $12
    Color: blue; Size: small; Price: $20
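
A rough sketch of why a tree over the data helps, using the Excalibur shorts. The node layout (`attrs`, `children`) and the function names are assumptions for illustration, not the paper's encoding. A flat set of facet values wrongly matches "blue, medium", while the tree only matches combinations that occur together on one root-to-leaf path.

```python
def flatten(node, out=None):
    """Collapse the tree into independent facet -> set-of-values (loses correlation)."""
    out = {} if out is None else out
    for key, values in node["attrs"].items():
        out.setdefault(key, set()).update(values)
    for child in node.get("children", []):
        flatten(child, out)
    return out

def flat_match(flat, wanted):
    return all(value in flat.get(key, set()) for key, value in wanted.items())

def tree_match(node, wanted, inherited=None):
    """True if some root-to-leaf path satisfies every wanted facet value."""
    seen = dict(inherited or {})
    seen.update(node["attrs"])
    children = node.get("children", [])
    if not children:
        return all(value in seen.get(key, set()) for key, value in wanted.items())
    return any(tree_match(child, wanted, seen) for child in children)

def stores():
    return [
        {"attrs": {"store": {"SJ"}, "price": {15}}},
        {"attrs": {"store": {"NY"}, "price": {20}}},
    ]

excalibur = {
    "attrs": {"model": {"Excalibur"}, "type": {"Running Shorts"}},
    "children": [
        {"attrs": {"color": {"red", "black"}, "size": {"medium"}}, "children": stores()},
        {"attrs": {"color": {"blue"}, "size": {"small"}}, "children": stores()},
    ],
}

wanted = {"color": "blue", "size": "medium"}
print(flat_match(flatten(excalibur), wanted))  # flat facets: false positive
print(tree_match(excalibur, wanted))           # tree: no such combination
```

The same tree still answers valid combinations, for example red shorts in the NY store.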

7. Contributions
“Rich” aggregation: qualitative
Engineering details
Correlation in facet values

8.1. Pros
Detailed description of engineering aspects & design decisions
Use of implemented technologies
Clearly defines the scope of the paper
Gives foundation/background information
Compatible with real-life data

8.2. Cons
Experiments and testing: no qualitative measurement
  – Effectiveness of “qualitative” facets
Does not explain the relevance of some of the previous work
Criteria for display/grouping?
  – Key use cases & known user access patterns not explained
Building the taxonomy: depth/breadth?

Thank You

References
Ben-Yitzhak, et al. “Beyond Basic Faceted Search”. Proceedings of the international conference on Web search and web data mining. Pp. 33-44,
“Faceted Search with Solr”. Lucid Imagination. July 1, from-the-Experts/Articles/Faceted-Search-Solr
“Faceted classification”. Wikipedia. July 7,
Lemieux, Earley, and Associates. “Designing for Faceted Search”. User Interface Engineering. July 6, (Originally in KM World, March 2009)
Mattman, Chris. “Query Models” (presentation slides for class)