Holger Bast, Benjamin Doerr, Stefan Funke, Joachim Giesen, Kurt Mehlhorn, Uli Meyer. AG1: Ways of Working & Project Ideas. MPI Retreat, Braunshausen, 12th May 2006.

AG1: Ways of Working & Project Ideas
Holger Bast, Benjamin Doerr, Stefan Funke, Joachim Giesen, Kurt Mehlhorn, Uli Meyer
MPI Retreat, Braunshausen, 12th May 2006

AG1: Snapshot from 2005
[Diagram; labels: Experimental, Applied, Theory]

AG1: Typical Results 1/3
[Diagram; labels: Experimental, Applied, Theory]

Matrix Rounding

Given an n x n matrix, round each entry x to one of ⌊x⌋ or ⌈x⌉ such that the maximal sum of differences in a sub-rectangle is minimized.

  Komlós, Gabor, Tusnády 1975: is there an O(1) upper bound?
  Beck 1981: lower bound of log n, upper bound of (log n)^4, later (log n)^(3.5+ε)
  Bohus 1990: upper bound of (log n)^3.5
  Matoušek 1995: upper bound of (log n)^2.5 (log log n)^0.5
  Srinivasan 1997: upper bound of (log n)^2.5

Relaxation: each entry may be rounded to one of ⌊x⌋ - 1, ⌊x⌋, ⌈x⌉, ⌈x⌉ + 1.

  Doerr 2003: upper bound of log n [SODA + European Journal of Combinatorics]
  Doerr, Güntürk, Yılmaz 2006: upper bound of 2 [to be submitted]

[Figure: example matrix with the difference in one sub-rectangle equal to 0.1]
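The quantity being minimized can be made concrete with a small sketch. It assumes that a "sub-rectangle" means a contiguous block of rows and columns and that the "sum of differences" is the absolute value of the summed rounding errors in that block; both readings are assumptions, and the function names are hypothetical. It only evaluates a given rounding by brute force over all sub-rectangles; it is not one of the rounding algorithms cited on the slide.

```python
import math


def prefix_sums(matrix):
    """2D prefix sums: p[i][j] is the sum of matrix[0..i-1][0..j-1]."""
    n, m = len(matrix), len(matrix[0])
    p = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            p[i + 1][j + 1] = matrix[i][j] + p[i][j + 1] + p[i + 1][j] - p[i][j]
    return p


def max_subrectangle_error(original, rounded):
    """Largest |sum of (rounded - original)| over all contiguous sub-rectangles."""
    diff = [[r - x for x, r in zip(row_x, row_r)]
            for row_x, row_r in zip(original, rounded)]
    p = prefix_sums(diff)
    n, m = len(diff), len(diff[0])
    worst = 0.0
    for top in range(n):
        for bottom in range(top + 1, n + 1):
            for left in range(m):
                for right in range(left + 1, m + 1):
                    total = (p[bottom][right] - p[top][right]
                             - p[bottom][left] + p[top][left])
                    worst = max(worst, abs(total))
    return worst


def round_nearest(matrix):
    """Naive rounding: every entry goes to its nearest integer (ties round up)."""
    return [[math.floor(x + 0.5) for x in row] for row in matrix]


if __name__ == "__main__":
    x = [[0.5, 0.5], [0.5, 0.5]]
    print(max_subrectangle_error(x, round_nearest(x)))      # all entries rounded up
    print(max_subrectangle_error(x, [[1, 0], [0, 1]]))      # checkerboard rounding
```

On this 2 x 2 example, rounding every entry up accumulates an error of 2.0 over the whole matrix, while the checkerboard rounding keeps every sub-rectangle within 0.5, which illustrates why the choice of rounding matters.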

AG1: Typical Results 2/3
[Diagram; labels: Experimental, Applied, Theory]

Robust Computational Geometry

Goal: evaluate critical predicates exactly but still efficiently,
e.g., decide whether a given point is on a given line, to its left, or to its right.

Sample results: Arrangement of Curved Objects
  Eigenwillig, Kettner, Mehlhorn, Schömer, …
  SCG'04 (cubic curves), SCG'05 (some quartic curves), ESA'05 (EXACUS library)
  Mathematics for predicate design
  Experiments to demonstrate efficiency

[Figure: compute convex hull]
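The point-versus-line predicate mentioned above can be sketched with exact rational arithmetic. This is a minimal illustration using Python's standard `fractions.Fraction`, not the approach of the EXACUS library; the function name `orientation` and the sign convention are assumptions made for this example.

```python
from fractions import Fraction


def orientation(p, q, r):
    """Exact side-of-line test for the directed line from p to q.

    Returns +1 if r lies to the left, -1 if it lies to the right,
    and 0 if r is exactly on the line. Coordinates may be ints,
    floats, or decimal strings; all are converted to exact rationals.
    """
    px, py = map(Fraction, p)
    qx, qy = map(Fraction, q)
    rx, ry = map(Fraction, r)
    # Sign of the 2x2 determinant | q - p, r - p |, computed without rounding.
    det = (qx - px) * (ry - py) - (qy - py) * (rx - px)
    return (det > 0) - (det < 0)


if __name__ == "__main__":
    print(orientation((0, 0), (1, 1), (0, 1)))   # point above the diagonal
    print(orientation((0, 0), (1, 1), (1, 0)))   # point below the diagonal
    print(orientation(("0.1", "0.1"), ("0.2", "0.2"), ("0.3", "0.3")))  # collinear
```

Exact rationals always give the correct sign, but they are slower than floating point, which is the tension behind the slide's "exactly but still efficiently".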

AG1: Typical Results 3/3
[Diagram; labels: Experimental, Applied, Theory]

Autocompletion Search Engine (SIGIR 2006)

Theory on the central index data structure
  entropy(our new index) ≤ (1+ε)∙entropy(standard index)
  formulas for expected query processing time (under simplifying probabilistic assumptions)

Experiments on Terabyte Benchmark (25 million documents)
  standard index: 4.6 GB of space, up to 30 seconds per query
  our new index: 4.8 GB of space, never more than 50 milliseconds per query

Usable Search Engine
  JavaScript, Ajax, Cookies, Debugging a Web Application, Apache, Socket Communication, Crawling, GUI, real data: äöüß, UTF, PS/PDF/DOC, ...
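The slide does not describe the new index itself, so the following sketch only illustrates the kind of query an autocompletion engine answers, using a plain inverted index as a stand-in for the "standard index". All names (`build_index`, `autocomplete`, the toy documents) are hypothetical. The sketch touches one posting list per matching word, which suggests why a prefix with many completions can be slow on such a baseline.

```python
from bisect import bisect_left


def build_index(docs):
    """Build a sorted vocabulary and a word -> sorted list of doc ids mapping."""
    postings = {}
    for doc_id, text in docs.items():
        for word in set(text.lower().split()):
            postings.setdefault(word, []).append(doc_id)
    for ids in postings.values():
        ids.sort()
    return sorted(postings), postings


def autocomplete(vocab, postings, prefix, candidates=None):
    """Return {completion: doc ids} for all words that start with prefix.

    If candidates is given (e.g. the hits for the earlier query words),
    only documents in that set are reported.
    """
    prefix = prefix.lower()
    result = {}
    i = bisect_left(vocab, prefix)          # first word >= prefix
    while i < len(vocab) and vocab[i].startswith(prefix):
        ids = postings[vocab[i]]
        if candidates is not None:
            ids = [d for d in ids if d in candidates]
        if ids:
            result[vocab[i]] = ids
        i += 1
    return result


if __name__ == "__main__":
    docs = {1: "algorithm engineering", 2: "algebra course", 3: "search engine"}
    vocab, postings = build_index(docs)
    print(autocomplete(vocab, postings, "alg"))
    print(autocomplete(vocab, postings, "eng", candidates={1}))
```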

Project Ideas

Goals
  cross-departmental collaboration
  strong outside visibility
  external funding (DFG, VW-Stiftung?)
  work packages / responsibilities
  proposal within six months
  duration ~ two years
  spin-off?