CMU 15-505: Internet Search Technologies

CMU 15-505: Internet Search Technologies

15-505 Internet Search Technologies. Instructors: Alona Fyshe, Scott Larsen, Chris Monson, Kamal Nigam. http://www.cs.cmu.edu/~knigam/15-505

What does it take to build a world-class search engine and related services? Lots of computer science: massively parallel computation, special-purpose data storage, information retrieval, machine learning, language analysis, user interface design.

Study each of these topics in a narrow but deep fashion. Format: small seminar, readings, interactive discussions, programming practicum. Grading: 55% programming homework, 30% reading response, 15% class participation.

What are reading responses? Practice for reading and thinking about computer science research papers. Meant to be open-ended and fairly short (1 page). Can be: a summary of the paper; a critique of the theory, experiments, or approach; suggestions for follow-on studies.

Collaboration and Cheating: Please collaborate on ideas, approaches, and diagnosing problems; use the mailing list. All words and code must be your own. Disclose all collaborations. Clarify any doubts.

What will make this class enjoyable? Interactivity. Flexibility to explore fun domains and data. Early feedback to us about what works and what doesn’t.

Problems in Internet Search Technology: Huge Problems, e.g. what changed on the web since this time yesterday? Classic Problems, e.g. sorting a gazillion numbers fast. New Problems, e.g. making sense of dynamic Cyrillic web pages. Practical Problems, e.g. how do we make both advertisers and consumers happier at the same time? Non-practical Problems, e.g. what do you see if you zoom all the way in on the moon? Beautiful Problems. And Fun Problems.

A Taste Sorting Matrix Operations Scaling size up Scale time requirements down Matrix Operations Thinking about the problem in a blend of old ways and new ways

Classic Sorting Algorithms: Quick, Merge, Selection, Shell, Heap, Radix, Bucket, … Ever heard of the Patience sort? Bozo sort?

Enlarge the Problem: 1,000x too many keys for a single machine; 1024 machines to use.

Sorting: Parallel How would you do it? Quick? Merge? Selection? Shell? Heap? Radix? Bucket? ….

Bitonic Sort: Batcher (1968). Bitonic sequence: <a_0, a_1, …, a_{n-1}> such that there exists an i for which <a_0 .. a_i> is monotonically increasing and <a_{i+1} .. a_{n-1}> is monotonically decreasing, or there exists a cyclic shift of indices such that the above is satisfied. E.g. <8, 9, 2, 1, 0, 4> is a bitonic sequence.
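The definition above can be checked mechanically. The following is a minimal sketch in Python that tries every cyclic shift and tests whether it rises and then falls; the function names `rises_then_falls` and `is_bitonic` are made up for this illustration and do not come from the slides.

```python
def rises_then_falls(seq):
    """True if seq is non-decreasing up to some index and non-increasing after it."""
    n = len(seq)
    i = 0
    # Climb while the sequence does not decrease.
    while i + 1 < n and seq[i] <= seq[i + 1]:
        i += 1
    # Descend while the sequence does not increase.
    while i + 1 < n and seq[i] >= seq[i + 1]:
        i += 1
    # Bitonic in this orientation only if we consumed the whole sequence.
    return i >= n - 1


def is_bitonic(seq):
    """True if some cyclic shift of seq rises and then falls (brute force, O(n^2))."""
    seq = list(seq)
    return any(
        rises_then_falls(seq[s:] + seq[:s])
        for s in range(max(1, len(seq)))
    )


if __name__ == "__main__":
    print(is_bitonic([8, 9, 2, 1, 0, 4]))   # the slide's example
    print(is_bitonic([1, 3, 2, 4, 0, 5]))   # several peaks
```

For the slide's example, the shift starting at 0 gives <0, 4, 8, 9, 2, 1>, which rises to 9 and then falls.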

Bitonic Merging Network Compliments of Dr. Quinn Snell, BYU

Bitonic Merge on a Hypercube

Bitonic Sort

Bitonic Sort
Procedure BitonicSort
    for i = 0 to d - 1
        for j = i downto 0
            if (i + 1)st bit of iproc <> jth bit of iproc
                comp_exchange_max(j, item)
            else
                comp_exchange_min(j, item)
            endif
        endfor
    endfor
comp_exchange_max and comp_exchange_min compare and exchange the item with the neighbor on the jth dimension.
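To see the procedure run, here is a minimal single-machine simulation in Python. It assumes one key per processor, 2^d processors, and bits numbered from 0 at the least significant end, so "the (i + 1)st bit" is read as bit index i + 1. The name `bitonic_sort_hypercube` is hypothetical, and the list `data` stands in for the items held by the processors; a real implementation would exchange messages between neighbors instead.

```python
def bitonic_sort_hypercube(items):
    """Simulate the BitonicSort procedure on a d-dimensional hypercube.

    Assumption: len(items) is a power of two, one item per processor.
    """
    n = len(items)
    d = n.bit_length() - 1
    if n == 0 or (1 << d) != n:
        raise ValueError("number of items must be a power of two")

    data = list(items)
    for i in range(d):
        for j in range(i, -1, -1):
            # All processors act in lockstep, so compute the next state
            # from a snapshot of the current one.
            nxt = list(data)
            for iproc in range(n):
                partner = iproc ^ (1 << j)          # neighbor on dimension j
                bit_i1 = (iproc >> (i + 1)) & 1      # (i + 1)st bit of iproc
                bit_j = (iproc >> j) & 1             # jth bit of iproc
                if bit_i1 != bit_j:
                    # comp_exchange_max: keep the larger of the pair
                    nxt[iproc] = max(data[iproc], data[partner])
                else:
                    # comp_exchange_min: keep the smaller of the pair
                    nxt[iproc] = min(data[iproc], data[partner])
            data = nxt
    return data


if __name__ == "__main__":
    print(bitonic_sort_hypercube([8, 9, 2, 1, 0, 4, 7, 3]))
```

Each pair of neighbors differs only in bit j, so exactly one of them keeps the maximum and the other keeps the minimum; no key is lost or duplicated.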

Bitonic Sort Demo http://www.inf.fh-flensburg.de/lang/algorithmen/sortieren/bitonic/bitonicen.htm

Parallel Sort: Beauty or a Beast? What does it take to implement this?

Bitonic Sort: Why? O(n log^2 n) comparisons. Data independent. Resource needs are perfectly defined. Very parallel friendly.
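The cost can be read off the double loop in the BitonicSort procedure. The short derivation below assumes n = 2^d keys with one key per processor, so that every compare-exchange step performs n/2 comparisons; the labels "steps" and "comparisons" are introduced here for illustration.

```latex
\begin{align*}
\text{steps} &= \sum_{i=0}^{d-1} (i+1) = \frac{d(d+1)}{2}, \qquad d = \log_2 n \\
\text{comparisons} &= \frac{n}{2} \cdot \frac{d(d+1)}{2}
  = \frac{n \log_2 n \,(\log_2 n + 1)}{4} = O(n \log^2 n)
\end{align*}
```

For n = 1024 this gives d = 10, 55 steps and 512 × 55 = 28,160 comparisons. Because the sequence of steps depends only on n and never on the key values, the schedule is known in advance, which is what makes the algorithm data independent.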

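Matrix Multiplication [Figure: worked example of a matrix product; values shown: 0.75, 0.25, 0.0, 0.75, 0.25, 0.0, 0.5625, 0.375, 0.0625, 0.0, 0.1875, 0.675]

Matrix Pipeline [Figure: pipeline diagram combining values with + and =; values shown: 0.5625, 0.75, 0.25, 0.0, 0.0625, 0.0, 0.75, 0.25, 0.0, 0.0, 0.625, 0.375, 0.0, 0.1875, 0.75, 0.0625, 0.5625]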

Visualization = *

Visualization * =

Visualization

Visualization Add a “top down” slide with 4 rectangles and the image plane

Matrix Multiplication: A cube of processors. Each does a chunk of the computation. Each needs different (and overlapping) portions of the input. Each passes intermediate results to certain neighbors. The result is stored across multiple machines. Seems kinda heavy for a simple algorithm! Look up Fox's algorithm and Cannon's algorithm. Very pretty at one level, gory at another level.
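As a rough illustration of the "pass intermediate results to neighbors" idea, here is a small single-machine simulation of Cannon's algorithm in Python. It is a sketch under simplifying assumptions: a two-dimensional n × n grid rather than the cube mentioned above, one matrix element per processor, and list rotations standing in for messages between neighbors. The name `cannon_matmul` is made up for this example.

```python
def cannon_matmul(A, B):
    """Multiply two n x n matrices by simulating Cannon's algorithm.

    Processor (i, j) holds one element of A, one of B, and one of C.
    """
    n = len(A)
    # Initial skew: row i of A rotates left by i, column j of B rotates up by j.
    a = [[A[i][(j + i) % n] for j in range(n)] for i in range(n)]
    b = [[B[(i + j) % n][j] for j in range(n)] for i in range(n)]
    c = [[0] * n for _ in range(n)]

    for _ in range(n):
        # Every processor multiplies what it currently holds and accumulates.
        for i in range(n):
            for j in range(n):
                c[i][j] += a[i][j] * b[i][j]
        # Pass A one step left and B one step up (with wraparound).
        a = [[a[i][(j + 1) % n] for j in range(n)] for i in range(n)]
        b = [[b[(i + 1) % n][j] for j in range(n)] for i in range(n)]
    return c


if __name__ == "__main__":
    print(cannon_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

After the skew, processor (i, j) sees A[i][k] and B[k][j] for the same k at every step, and the rotations walk k through all n values, so each processor only ever talks to its immediate neighbors.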

A Different View Courtesy http://www.unrealtournament3.com/

Multiplication Multi-texturing *

Addition Blending + =

Graphics Pipeline Multiply Multiply Add Image (Frame Buffer)
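One plausible reading of the multiply-then-add pipeline, offered as an assumption rather than something the slides spell out, is that the product is built up in passes: each pass multiplies two image-sized "textures" element by element, and blending adds the result into the frame buffer. The Python sketch below mimics that on the CPU; `matmul_by_passes`, `tex_a`, `tex_b`, and `frame` are hypothetical names, and no real graphics API is involved.

```python
def matmul_by_passes(A, B):
    """Compute A x B as a sum of element-wise products, one pass per inner index k.

    Assumption: each pass corresponds to one multi-textured draw whose result
    is additively blended into the frame buffer.
    """
    n, m, p = len(A), len(B), len(B[0])
    frame = [[0.0] * p for _ in range(n)]   # stands in for the frame buffer

    for k in range(m):
        # "Texture" 1: column k of A, repeated across every column of the image.
        tex_a = [[A[i][k]] * p for i in range(n)]
        # "Texture" 2: row k of B, repeated down every row of the image.
        tex_b = [list(B[k]) for _ in range(n)]
        for i in range(n):
            for j in range(p):
                # Multiply the two textures (multi-texturing), then add (blending).
                frame[i][j] += tex_a[i][j] * tex_b[i][j]
    return frame


if __name__ == "__main__":
    print(matmul_by_passes([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Every pixel (i, j) accumulates A[i][k] × B[k][j] over all k, which is exactly the definition of the matrix product; the per-pixel work is independent, which is why it maps well onto graphics hardware.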

How the Algorithm Works Add a “top down” slide with 4 rectangles and the image plane

How the Algorithm Works

How the Algorithm Works *

How the Algorithm Works * Color all four planes in upper right image

How the Algorithm Works * +

Performance

GPU Sorting

Problems in Internet Search Technology: Huge Problems Classic Problems New Problems Practical Problems Non-practical Problems Beautiful Problems Fun Problems

Questions? CMU 15-505: Internet Search Technologies. Kamal Nigam (knigam@google.com), Chris Monson (shiblon@google.com), Alona Fyshe (alonaf@google.com), Scott Larsen (esl@google.com)

Bitonic Rearranging (cycling)