CMU : Internet Search Technologies

1 CMU 15-505: Internet Search Technologies

2 15-505 Internet Search Technologies
Instructors: Alona Fyshe, Scott Larsen, Chris Monson, Kamal Nigam

3 What does it take to build a world-class search engine and related services?
Lots of computer science:
  Massively parallel computation
  Special-purpose data storage
  Information retrieval
  Machine learning
  Language analysis
  User interface design

4 Study each of these topics in narrow but deep fashion
Format: small seminar, readings, interactive discussions, programming practicum
Grading:
  55% programming homework
  30% reading response
  15% class participation

5 What are reading responses?
Practice for reading and thinking about computer science research papers
Meant to be open-ended, fairly short (1 page)
Can be:
  Summary of paper
  Critique of theory, experiments, approach
  Suggestions for follow-on studies

6 Collaboration and Cheating
Please collaborate on ideas, approaches, diagnosing problems – use the mailing list
All words and code must be your own
Disclose all collaborations
Clarify any doubts

7 What will make this class enjoyable?
Interactive
Flexibility to explore fun domains and data
Early feedback to us about what works and what doesn’t

8 Problems in Internet Search Technology:
Huge Problems
  E.g. what changed in the web since this time yesterday?
Classic Problems
  E.g. sorting a gazillion numbers fast
New Problems
  E.g. making sense of dynamic Cyrillic web pages
Practical Problems
  E.g. how do we make both advertisers and consumers happier at the same time?
Non-practical Problems
  E.g. what do you see if you zoom all the way in on the moon?
Beautiful Problems
And Fun Problems

9 A Taste
Sorting
  Scaling size up
  Scale time requirements down
Matrix Operations
  Thinking about the problem in a blend of old ways and new ways

10 Classic Sorting Algorithms
Quick, Merge, Selection, Shell, Heap, Radix, Bucket, …
Ever heard of the Patience sort? Bozo sort?

11 Enlarge the Problem: 1,000x too many keys for a single machine
1024 machines to use

12 Sorting: Parallel
How would you do it? Quick? Merge? Selection? Shell? Heap? Radix? Bucket? …

13 Bitonic Sort: Batcher (1968)
Bitonic Sequence: <a_0, a_1, …, a_{n-1}>
There exists an index i such that <a_0 .. a_i> is monotonically increasing and <a_{i+1} .. a_{n-1}> is monotonically decreasing
Or: there exists a cyclic shift of indices such that the above is satisfied
E.g. <8, 9, 2, 1, 0, 4> is a bitonic sequence (shifting it cyclically gives <0, 4, 8, 9, 2, 1>, which rises and then falls)
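The definition on this slide can be checked mechanically. The following is a minimal sketch, not taken from the deck: the function name `is_bitonic` is hypothetical, and it assumes "monotonically increasing/decreasing" allows equal neighbors.

```python
def is_bitonic(seq):
    """Return True if some cyclic shift of seq rises and then falls.

    Assumption: ties are allowed (non-strict monotonicity).
    """
    seq = list(seq)
    n = len(seq)
    if n <= 2:
        return True
    for shift in range(n):
        rotated = seq[shift:] + seq[:shift]
        i = 0
        # Walk up the non-decreasing prefix.
        while i + 1 < n and rotated[i] <= rotated[i + 1]:
            i += 1
        # Walk down the non-increasing suffix.
        while i + 1 < n and rotated[i] >= rotated[i + 1]:
            i += 1
        # If both walks together consumed the whole sequence, it is bitonic.
        if i == n - 1:
            return True
    return False


print(is_bitonic([8, 9, 2, 1, 0, 4]))  # the slide's example: True
print(is_bitonic([1, 3, 2, 4, 0, 5]))  # alternates up and down: False
```

Trying every shift costs O(n^2); that is fine for a classroom check, and a single pass counting direction changes would do the same job faster.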

14 Bitonic Merging Network
Compliments of Dr. Quinn Snell, BYU

15 Bitonic Merge on a Hypercube

16 Bitonic Sort

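17 Bitonic Sort
Procedure BitonicSort
  for i = 0 to d - 1
    for j = i downto 0
      if (i + 1)st bit of iproc <> jth bit of iproc
        comp_exchange_max(j, item)
      else
        comp_exchange_min(j, item)
      endif
    endfor
  endfor
Here d is the dimension of the hypercube (2^d processors) and iproc is this processor's label, with bits numbered from 0.
comp_exchange_max and comp_exchange_min compare and exchange the item with the neighbor on the jth dimension, keeping the larger or the smaller item respectively.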
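The procedure can be simulated on one machine by giving each of the 2^d "processors" a single item. This is a sketch under stated assumptions: the function name `hypercube_bitonic_sort` is hypothetical, bits are numbered from 0, the outer loop covers the d hypercube dimensions (i from 0 through d - 1), and the message exchange with the neighbor is modeled by reading the previous round's values.

```python
def hypercube_bitonic_sort(items):
    """Simulate the per-processor bitonic sort with one item per processor."""
    n = len(items)
    d = n.bit_length() - 1
    if n == 0 or (1 << d) != n:
        raise ValueError("the number of items must be a power of two")
    held = list(items)  # held[iproc] is the item on processor iproc
    for i in range(d):
        for j in range(i, -1, -1):
            nxt = list(held)
            for iproc in range(n):
                neighbor = iproc ^ (1 << j)      # neighbor along dimension j
                bit_hi = (iproc >> (i + 1)) & 1  # (i + 1)st bit of iproc
                bit_j = (iproc >> j) & 1         # jth bit of iproc
                if bit_hi != bit_j:
                    # comp_exchange_max: keep the larger of the two items
                    nxt[iproc] = max(held[iproc], held[neighbor])
                else:
                    # comp_exchange_min: keep the smaller of the two items
                    nxt[iproc] = min(held[iproc], held[neighbor])
            held = nxt  # all processors finish the round together
    return held


print(hypercube_bitonic_sort([8, 9, 2, 1, 0, 4, 7, 3]))
# [0, 1, 2, 3, 4, 7, 8, 9]
```

Each inner round touches every processor exactly once and only talks to one hypercube neighbor, which is what makes the schedule map so directly onto the hardware.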

18 Bitonic Sort Demo

19 Parallel Sort: Beauty or a Beast?
What does it take to implement this?

20 Bitonic Sort: Why?
O(n log^2(n)) comparisons
Data independent
Resource needs are perfectly defined
Very parallel friendly
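"Data independent" means the list of compare-exchange steps can be written down before any key is seen. The sketch below builds that list for n = 2^d wires and then applies it; the names `bitonic_schedule` and `apply_schedule` are hypothetical and the code is an illustration, not the deck's implementation. For n wires it produces d(d + 1)/2 rounds of n/2 comparators each, which is where the n log^2(n) growth comes from.

```python
def bitonic_schedule(n):
    """Return the rounds of a bitonic sorting network as lists of (lo, hi) pairs.

    After a comparator (lo, hi) fires, the smaller key sits at position lo.
    """
    d = n.bit_length() - 1
    if n < 1 or (1 << d) != n:
        raise ValueError("n must be a power of two")
    rounds = []
    for i in range(d):
        for j in range(i, -1, -1):
            stage = []
            for p in range(n):
                q = p ^ (1 << j)
                if p < q:  # list each pair once
                    ascending = ((p >> (i + 1)) & 1) == 0
                    stage.append((p, q) if ascending else (q, p))
            rounds.append(stage)
    return rounds


def apply_schedule(rounds, items):
    """Run a precomputed schedule over the data; the schedule never changes."""
    data = list(items)
    for stage in rounds:
        for lo, hi in stage:
            if data[lo] > data[hi]:
                data[lo], data[hi] = data[hi], data[lo]
    return data


rounds = bitonic_schedule(8)
print(len(rounds), sum(len(stage) for stage in rounds))  # 6 24
print(apply_schedule(rounds, [8, 9, 2, 1, 0, 4, 7, 3]))
# [0, 1, 2, 3, 4, 7, 8, 9]
```

Because the pairs within a round are disjoint, all n/2 comparators of a round can run at the same time, so the parallel running time is the number of rounds, d(d + 1)/2.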

21 Matrix Multiplication
[Figure: worked matrix product of the form A * B = C. Entries as extracted, layout lost: 0.75 0.25 0.0 0.75 0.25 0.0 0.5625 0.375 0.0625 0.0 0.1875 0.675]

22 Matrix Pipeline
[Figure: the product computed as a sum of partial terms. Entries as extracted, layout lost: 0.5625 0.75 0.25 0.0 + 0.0625 + 0.0 + 0.75 0.25 0.0 0.0 = 0.625 0.375 0.0 0.1875 0.75 0.0625 0.5625]

23 Visualization = *

24 Visualization * =

25 Visualization

26 Visualization Add a “top down” slide with 4 rectangles and the image plane

27 Matrix Multiplication
A cube of processors
Each does a chunk of the computation
Each needs different (and overlapping) portions of the input
Each passes intermediate results to certain neighbors
Result is stored across multiple machines
Seems kinda heavy for a simple algorithm!
Look up Fox’s algorithm and Cannon’s algorithm
Very pretty at one level
Gory at another level
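To make the "passes intermediate results to certain neighbors" idea concrete, here is a minimal single-machine simulation of Cannon's algorithm. Note the assumptions: Cannon's algorithm uses a 2-D grid of processors rather than the cube mentioned on the slide, each simulated processor holds a single matrix element instead of a block, and the function name `cannon_multiply` is hypothetical.

```python
def cannon_multiply(A, B):
    """Multiply two q x q matrices by simulating a q x q processor grid."""
    q = len(A)
    # Initial skew: row r of A moves left by r, column c of B moves up by c.
    a = [[A[r][(c + r) % q] for c in range(q)] for r in range(q)]
    b = [[B[(r + c) % q][c] for c in range(q)] for r in range(q)]
    C = [[0] * q for _ in range(q)]
    for _ in range(q):
        # Every processor multiplies the two values it currently holds.
        for r in range(q):
            for c in range(q):
                C[r][c] += a[r][c] * b[r][c]
        # Then A shifts one step left and B one step up (with wraparound).
        a = [[a[r][(c + 1) % q] for c in range(q)] for r in range(q)]
        b = [[b[(r + 1) % q][c] for c in range(q)] for r in range(q)]
    return C


print(cannon_multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
# [[19, 22], [43, 50]]
```

The pretty part is that each processor only ever talks to its left and upper neighbors; the gory part, skipped here, is doing the same with blocks, real message passing, and matrices that do not divide evenly across the grid.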

28 A Different View Courtesy

29 Multiplication Multi-texturing
*

30 Addition Blending + =

31 Graphics Pipeline Multiply Multiply Add Image (Frame Buffer)
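One way to read slides 29 to 31 is that a matrix product can be built from two image operations: an elementwise multiply of two textures, and an additive blend into the frame buffer. The sketch below imitates that on plain Python lists. It is an interpretation, not the deck's code: the names `multitexture`, `blend_add`, and `matmul_by_passes` are hypothetical, and the way the textures are filled (column k of A stretched across, row k of B stretched down) is an assumption.

```python
def multitexture(tex_a, tex_b):
    """Elementwise product of two equally sized images (the 'Multiply' step)."""
    return [[x * y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(tex_a, tex_b)]


def blend_add(frame, layer):
    """Add a layer into the frame buffer (the 'Add' step)."""
    return [[x + y for x, y in zip(row_f, row_l)]
            for row_f, row_l in zip(frame, layer)]


def matmul_by_passes(A, B):
    """Compute A * B as one multiply-and-blend pass per inner index k."""
    rows, inner, cols = len(A), len(B), len(B[0])
    frame = [[0.0] * cols for _ in range(rows)]
    for k in range(inner):
        # Texture 1: column k of A, repeated across every output column.
        tex_a = [[A[r][k]] * cols for r in range(rows)]
        # Texture 2: row k of B, repeated down every output row.
        tex_b = [list(B[k]) for _ in range(rows)]
        frame = blend_add(frame, multitexture(tex_a, tex_b))
    return frame


print(matmul_by_passes([[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]))
# [[19.0, 22.0], [43.0, 50.0]]
```

Each pass computes every output pixel independently, which is the kind of work a graphics pipeline is built to do in parallel.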

32 How the Algorithm Works
Add a “top down” slide with 4 rectangles and the image plane

33 How the Algorithm Works

34 How the Algorithm Works
*

35 How the Algorithm Works
* Color all four planes in upper right image

36 How the Algorithm Works
* +

37 Performance

38 GPU Sorting

39 Problems in Internet Search Technology:
Huge Problems
Classic Problems
New Problems
Practical Problems
Non-practical Problems
Beautiful Problems
Fun Problems

40 Questions? CMU 15-505: Internet Search Technologies
Kamal Nigam, Chris Monson, Alona Fyshe, Scott Larsen

41 Bitonic Rearranging (cycling)

