Lars Arge 1/12 Lars Arge
2/12 Massive Data
Pervasive use of computers and sensors
Increased ability to acquire/store/process data
→ Massive data collected everywhere
Society increasingly “data driven”
→ Access/process data anywhere, any time
Increased data availability: “nano-technology-like” opportunity
Nature issue 2/06: “2020 – Future of computing”
Trend continues as e.g. sensors become increasingly pervasive
→ Exponential growth of scientific data
Paradigm shift: Science will be about mining data
→ Computer science paramount in all sciences
3/12 Algorithm Inadequacy
Importance of scalability/efficiency
→ Algorithmics core computer science area
Traditional algorithmics: Transform input to output using simple machine model
Inadequate with e.g.
  Massive data
  Small/diverse devices
  Continually arriving data
→ Software inadequacies!
Communities addressing inadequacies have emerged
But much work still needs to be done
4/12 Center for Massive Data Algorithmics
Established 2007
Initially funded for 5 years by the National Research Foundation
High level objectives:
  Advance algorithmic knowledge in “massive data” processing area
  Train researchers in world-leading international environment
  Be catalyst for multidisciplinary/industry collaboration
Building on:
  Strong international team
  Establish vibrant international environment (focus on people)
Focus areas:
  I/O-efficient, streaming, cache-oblivious algorithms
  Algorithm engineering
5/12 I/O-Efficient Algorithms
Problems involving massive data on disk
Disk access is 10^6 times slower than main memory access
Large access time amortized by transferring large blocks of data
→ Important to store/access data to take advantage of blocks
I/O-efficient algorithms: Move as few disk blocks as possible to solve given problem
“The difference in speed between modern CPU and disk technologies is analogous to the difference in speed in sharpening a pencil using a sharpener on one’s desk or by taking an airplane to the other side of the world and using a sharpener on someone else’s desk.” (D. Comer)
6/12 I/O-Efficient Algorithms Matter
Example: Traversing linked list (List ranking)
  Array size N = 10 elements
  Disk block size B = 2 elements
  Main memory size M = 4 elements (2 blocks)
  Algorithm 1: N = 10 I/Os
  Algorithm 2: N/B = 5 I/Os
  [Figure: the 10-element linked list as laid out in disk blocks for each algorithm]
Difference between N and N/B large since block size is large
  Example: N = 256 × 10^6, B = 8000, 1 ms disk access time
  N I/Os take 256 × 10^3 sec ≈ 4267 min ≈ 71 hr
  N/B I/Os take 256/8 sec = 32 sec
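The two I/O counts and the timing estimate on this slide can be reproduced with a small simulation. This is a minimal sketch and is not taken from the slides: the function names `count_ios` and `seconds`, the LRU replacement policy, and the particular `scattered` layout are all assumptions chosen for illustration.

```python
from collections import OrderedDict


def count_ios(access_order, block_size, memory_blocks):
    """Count block transfers for a sequence of element positions.

    Assumes an LRU-managed memory holding `memory_blocks` blocks
    (an illustrative policy, not one stated on the slide).
    """
    cache = OrderedDict()  # resident block ids, least recently used first
    ios = 0
    for pos in access_order:
        block = pos // block_size
        if block in cache:
            cache.move_to_end(block)       # hit: refresh recency
        else:
            ios += 1                       # miss: one block transfer
            cache[block] = None
            if len(cache) > memory_blocks:
                cache.popitem(last=False)  # evict least recently used
    return ios


def seconds(num_ios, ms_per_io=1.0):
    """Total time in seconds at a fixed cost per I/O."""
    return num_ios * ms_per_io / 1000


N, B, M = 10, 2, 4                 # values from the slide
sequential = list(range(N))        # list stored in traversal order
# Hypothetical bad layout: each step lands in a block that is not in memory.
scattered = [0, 4, 8, 2, 6, 1, 5, 9, 3, 7]

print(count_ios(scattered, B, M // B))   # N I/Os
print(count_ios(sequential, B, M // B))  # N/B I/Os

big_n, big_b = 256 * 10**6, 8000
print(seconds(big_n))            # seconds for N I/Os at 1 ms each
print(seconds(big_n // big_b))   # seconds for N/B I/Os at 1 ms each
```

With the slide's parameters the scattered layout costs one I/O per element, while the sequential layout costs one I/O per block.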
7/12 Streaming Algorithms
Problems involving truly massive data
Sequential read of disk blocks much faster than random read
In many modern (sensor) applications data arrive continually
→ (Massive) problems often have to be solved in one sequential scan
Streaming algorithms: Use single scan, handle each element fast, using small space
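As one concrete instance of "single scan, small space", here is a sketch of the Misra-Gries frequent-items summary. The slide does not name any particular algorithm, so this choice, the function name `misra_gries`, and the sample stream are assumptions made for illustration.

```python
def misra_gries(stream, k):
    """One-pass frequent-items summary using at most k - 1 counters.

    Any item occurring more than n/k times in a stream of length n is
    guaranteed to be among the returned keys. The stored counts are
    lower bounds on the true frequencies, not exact values.
    """
    counters = {}
    for item in stream:                # each element is seen exactly once
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement all, dropping those that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


# Works on any iterable, including a generator that cannot be rewound.
readings = iter(["a", "b", "a", "c", "a", "d", "a"])
summary = misra_gries(readings, k=2)
print(summary)
```

The decrement step touches up to k - 1 counters, but since every decrement is paid for by an earlier increment, the amortized work per element is constant.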
8/12 Cache-Oblivious Algorithms
Problems to be solved on unknown and/or changing devices
Block access important on all levels of memory hierarchy
But memory hierarchies are very diverse
Cache-oblivious algorithms: Use blocks efficiently on all levels of any hierarchy
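A standard textbook illustration of the cache-oblivious idea is recursive matrix transposition: the code never mentions a block size or memory size, yet the recursion eventually produces subproblems that fit in every level of the hierarchy. The sketch below is an assumption chosen for illustration, not an example from the slides, and the name `co_transpose` is hypothetical. Python lists of lists do not model memory layout faithfully, so this shows only the recursion pattern, not a measurable speedup.

```python
def co_transpose(src):
    """Transpose a rectangular list-of-lists by recursive halving.

    No block size B or memory size M appears anywhere in the code;
    that is what makes the access pattern cache-oblivious.
    """
    if not src or not src[0]:
        return []
    n_rows, n_cols = len(src), len(src[0])
    dst = [[None] * n_rows for _ in range(n_cols)]

    def recurse(r0, r1, c0, c1):
        rows, cols = r1 - r0, c1 - c0
        if rows == 1 and cols == 1:
            dst[c0][r0] = src[r0][c0]       # base case: single element
        elif rows >= cols:
            mid = r0 + rows // 2            # split the longer dimension
            recurse(r0, mid, c0, c1)
            recurse(mid, r1, c0, c1)
        else:
            mid = c0 + cols // 2
            recurse(r0, r1, c0, mid)
            recurse(r0, r1, mid, c1)

    recurse(0, n_rows, 0, n_cols)
    return dst


matrix = [[1, 2, 3, 4],
          [5, 6, 7, 8],
          [9, 10, 11, 12]]
print(co_transpose(matrix))
```

A practical implementation would typically stop the recursion at a small tile and transpose it with plain loops to reduce call overhead; the single-element base case is kept here for clarity.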
9/12 Algorithm Engineering
Algorithm engineering:
  Design/implementation of practical algorithms
  Experimentation
Center motivated by theory inadequacy
Center wants to promote interdisciplinary/industry work
→ Natural to consider algorithm engineering
Algorithm engineering work can lead to practical breakthroughs
Example: Flow simulation on massive terrain models
  ~18 billion points at 1 meter (>>1 TB)
  Implementation of I/O-efficient alg. → two weeks to three hours!!
10/12 Center Team
International core team of algorithms researchers
Including top ranked US and European groups
Leading expertise in focus areas
  AU: I/O, cache and algorithm engineering
  MPI: I/O (graph) and algorithm engineering
  MIT: Cache and streaming
Core researchers
  AU: Arge, Brodal
  MPI: Mehlhorn, Meyer
  MIT: Demaine, Indyk
11/12 Center Activities
Visits of core researchers
Exchange of AU, MPI and MIT postdocs and PhD students
Visiting faculty, postdocs and students from other institutions
Frequent summer schools: Streaming algorithms this week!
Major international events: 25th Annual Symposium on Computational Geometry in 2009
New conference: Symposium on Algorithms for Massive Datasets
Theme workshops to foster multidisciplinary/industry collaboration
12/12
Initially funded for 5 years by the National Research Foundation
Collaboration between three leading international groups
Addresses inadequacies of traditional theory
  When processing massive data
  When using diverse devices
Initial research focus on I/O-efficient, streaming, cache-oblivious algorithms
Also focus on algorithm engineering and multidisciplinary work
Focus on people: Establish vibrant international environment
www.madalgo.au.dk