1 Scalable Distributed Data Structures, Part 2
Witold Litwin, Paris 9

2 High-availability LH* schemes
• In a large multicomputer, it is unlikely that all servers are up
• Suppose the probability that a bucket is up is 99 %
  – the bucket is then unavailable roughly 3 days per year
• Every key is stored in a single bucket
  – the case of typical SDDSs, LH* included
• The probability that an n-bucket file is entirely up is then 0.99^n
  » about 37 % for n = 100
  » about 0 % for n = 1000

3 High-availability LH* schemes
• Using 2 buckets to store a key, one may expect:
  – 99 % for n = 100
  – 91 % for n = 1000
  (see the arithmetic sketch below)
• High-availability SDDSs
  – make sense
  – are the only way to make large SDDS files reliable
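Below, a minimal Python sketch (not from the talk) of the availability arithmetic behind the figures on these two slides. It assumes independent bucket failures with per-bucket availability p = 0.99; the function name and interface are illustrative.

```python
# Assumption: buckets fail independently, each up with probability p.

def file_availability(n: int, p: float = 0.99, copies: int = 1) -> float:
    """P(an n-bucket file is entirely up) when every key is stored
    on `copies` independently failing buckets."""
    bucket_down = (1.0 - p) ** copies   # all copies of one bucket are down
    return (1.0 - bucket_down) ** n     # every one of the n buckets is up

for n in (100, 1000):
    print(n,
          round(file_availability(n), 2),             # ~0.37 and ~0.00
          round(file_availability(n, copies=2), 2))   # ~ the 99 % and 91 % above
```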

4 High-availability LH* schemes
• High-availability LH* schemes keep data available despite server failures
  – any single-server failure
  – most two-server failures
  – some catastrophic failures
• Three types of schemes are currently known
  – mirroring
  – striping
  – grouping

5 High-availability LH* schemes
• There are two files, called mirrors
• Every insert propagates to both (see the sketch below)
  – splits are nevertheless autonomous
• Every search is directed towards one of the mirrors
  – the primary mirror for the corresponding client
• If a bucket failure is detected, the spare is produced instantly at some site
  – the storage of the failed bucket is reclaimed
  – it is allocated to another bucket when available again
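A minimal sketch of the mirroring behaviour just described, under illustrative assumptions: a dict per mirror stands in for an LH* file, and a KeyError stands in for a detected bucket failure. It is not the LH* implementation itself.

```python
class MirroredFile:
    """Two files called mirrors; inserts go to both, searches to one."""

    def __init__(self):
        self.mirrors = [{}, {}]          # two mirror files (illustrative)

    def insert(self, key, record):
        for m in self.mirrors:           # every insert propagates to both
            m[key] = record              # (splits stay autonomous per file)

    def search(self, key, primary=0):
        try:
            return self.mirrors[primary][key]       # this client's primary mirror
        except KeyError:                            # stands in for a bucket failure
            return self.mirrors[1 - primary][key]   # fall back to the other mirror

f = MirroredFile()
f.insert(42, "record")
print(f.search(42, primary=1))           # served by either mirror
```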

6 High-availability LH* schemes
• Two types of LH* schemes with mirroring appear
• Structurally-alike (SA) mirrors
  – same file parameters
    » keys are presumably in the same buckets
• Structurally-dissimilar (SD) mirrors
  – keys are presumably in different buckets
  – loosely coupled = same LH-functions h_i
  – minimally coupled = different LH-functions h_i
  (see the addressing sketch below)
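A sketch of why the coupling matters, using the classical LH addressing computation h_i(c) = c mod (N · 2^i) with the usual client-image correction. N = 1 and the alternative hash used for the minimally coupled mirror are assumptions for illustration.

```python
def lh_address(key: int, i: int, n: int, N: int = 1) -> int:
    """Classical LH client-image addressing: h_i, corrected by h_{i+1}."""
    a = key % (N * 2 ** i)             # h_i(key)
    if a < n:                          # bucket a already split in this round
        a = key % (N * 2 ** (i + 1))   # h_{i+1}(key)
    return a

key = 12345
a1 = lh_address(key, i=4, n=3)         # mirror 1
a2 = lh_address(key, i=4, n=3)         # SA mirror: same parameters,
print(a1 == a2)                        # hence the same bucket -> True

# Minimally coupled SD mirror: a different h-function (an illustrative
# stand-in here), so the two copies of a key usually land in different
# buckets and a two-bucket failure rarely loses both.
a3 = lh_address(hash(("mirror2", key)), i=4, n=3)
print(a1, a3)
```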

7 LH* with mirroring (figure: SA-mirrors and SD-mirrors)

8 LH* with mirroring
• SA-mirrors
  – most efficient for access and spare production
  – but maximal data loss in the case of a two-bucket failure
• Loosely-coupled SD-mirrors
  – less efficient for access and spare production
  – lesser loss of data for a two-bucket failure
• Minimally-coupled SD-mirrors
  – least efficient for access and spare production
  – minimal loss for a two-bucket failure

9 Some design issues
• Spare production algorithm
• Propagating the spare address to the client
• Forwarding in the presence of failures
• Discussion in:
  – W. Litwin, M.-A. Neimat. High-Availability LH* Schemes with Mirroring. COOPIS-96, Brussels.

10 LH* with striping (LH* arrays)
• High availability through striping of a record among several LH* files (sketch below)
  – as in RAID (disk array) schemes
  – but scalable to as many sites as needed
• Less storage than for LH* with mirroring
• But less efficient for insert and search
  – more messaging
    » although using shorter messages
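A sketch of the striping itself, at block level: a record is cut into k segments, each of which would be inserted under the record's key into its own LH* segment file. The segment count k and the record value are illustrative assumptions.

```python
def stripe(record: bytes, k: int) -> list:
    """Cut a record into k segments, RAID-like, one per segment file."""
    size = -(-len(record) // k)          # ceil(len(record) / k)
    return [record[j * size:(j + 1) * size] for j in range(k)]

def unstripe(segments: list) -> bytes:
    """Reassemble the record from its k segments."""
    return b"".join(segments)

segs = stripe(b"a rather long record value", k=4)
# segs[j] would go, under the same key, into LH* segment file j
assert unstripe(segs) == b"a rather long record value"
```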

11 LH*S (figure)

12 LH*S
• Spare segments are produced when failures are detected
• Segment-file schemes are as for LH* with mirroring
  – SA-segment files
    » better performance for addressing, but worse for security
  – SD-segment files
• A detailed performance analysis remains to be done

13 Variants
• Segmentation level (attribute-level sketch below)
  – bit
    » best security
    » meaningless single-site data
    » meaningless content of a message
  – block
    » less CPU time for the segmentation
  – attribute
    » selective access to segments becomes possible
    » fastest non-key search
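A sketch of the attribute-level variant, which is what makes selective access possible: one segment file per attribute, so a non-key search touches a single segment. The key and field names are illustrative.

```python
key = 42
record = {"name": "Litwin", "city": "Paris", "dept": 9}

# one segment file (modelled here as a dict) per attribute
segment_files = {attr: {} for attr in record}
for attr, value in record.items():
    segment_files[attr][key] = value     # same key in every segment file

# a non-key search on "city" addresses only that one segment file
print(segment_files["city"])             # {42: 'Paris'}
```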

14 LH*g
• Avoids striping
  – to improve non-key search time
  – keeping about the same storage requirements for the file
• Uses grouping of k records instead
  – group members always remain in different buckets
    » despite the splits and file growth
• Allows for high availability on demand
  – without restructuring the original LH* file

15 LH*g (figure: primary LH* file and parity LH* file)

16 LH*g (figure)
Evolution of an LH*g file before the 1st split (a), and after a few more inserts (b), (c). Group size k = 3.

17 LH*g
• If a primary or parity bucket fails
  – the hot spare can always be produced from the group members that are still alive (sketch below)
• If more than one group member fails
  – then there is data loss
• Unless the parity file carries more extensive redundancy
  – e.g. Hamming codes
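A sketch of that recovery rule with an XOR parity record per group (group size k = 3, as in the figure). XOR restores any single missing member; surviving multi-member failures needs a richer code, e.g. Hamming, as the slide notes. The record values are illustrative and equal-length.

```python
from functools import reduce

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

group = [b"rec-A", b"rec-B", b"rec-C"]      # k = 3 group members
parity = reduce(xor, group)                 # record in the parity LH* file

lost = 1                                    # the bucket holding rec-B fails
survivors = [r for j, r in enumerate(group) if j != lost]
recovered = reduce(xor, survivors, parity)  # parity ^ rec-A ^ rec-C == rec-B
assert recovered == group[lost]
```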

18 Other hash SDDSs
• DDH (B. Devine, FODO-94)
  – uses Extendible Hashing as the kernel
  – clients have EH images
  – fewer overflows
  – more forwardings
• Breitbart & al. (ACM SIGMOD-94)
  – fewer overflows & better load
  – more forwardings

19 RP* schemes
• Produce 1-d ordered files
  – for range search (placement sketch below)
• Use m-ary trees
  – like a B-tree
• Efficiently support range queries
  – LH* also supports range queries
    » but less efficiently
• Consist of a family of three schemes
  – RP*N, RP*C and RP*S
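A sketch of the ordered, range-partitioned placement that RP* schemes maintain: buckets hold disjoint key ranges, so a range query goes to every bucket whose range intersects it. The split keys echo the figures below; the bucket-array model is an illustration, not the distributed structure itself.

```python
import bisect

bounds = ["for", "of", "the"]                 # split keys between buckets
buckets = [dict() for _ in range(len(bounds) + 1)]

def bucket_of(key: str) -> int:
    """B-tree-like choice of the bucket whose range contains key."""
    return bisect.bisect_right(bounds, key)

def buckets_for_range(lo: str, hi: str) -> range:
    """Every bucket whose range intersects [lo, hi] gets the query."""
    return range(bucket_of(lo), bucket_of(hi) + 1)

print(bucket_of("in"))                        # -> 1
print(list(buckets_for_range("a", "it")))     # -> [0, 1]
```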

20 RP* schemes (figure)

21 (figure)

22 RP* Range Query Termination
• Time-out
• Deterministic (sketch below)
  – each server addressed by Q sends back at least its current range
  – the client performs the union U of all results
  – it terminates when U covers Q
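A sketch of the deterministic termination test: the client unions the ranges sent back by the servers and stops once the union covers Q. Half-open (lo, hi) integer ranges are an assumption for illustration.

```python
def covers(ranges, q):
    """True once the union of returned server ranges covers query range q."""
    lo, hi = q
    for r_lo, r_hi in sorted(ranges):
        if r_lo > lo:              # gap: some addressed server not heard yet
            return False
        lo = max(lo, r_hi)         # extend the covered prefix of q
        if lo >= hi:
            return True
    return lo >= hi

replies = [(0, 50), (50, 120)]     # each server returns at least its range
print(covers(replies, (10, 100)))  # -> True: the query terminates
```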

23 RP*C client image (figure; the image evolves through IAMs)

24 RP*S (figure): an RP* file with (a) a 2-level kernel, and (b) a 3-level kernel; distributed index root and pages; IAM = traversed pages.

25 RP*N: drawback of multicasting (figure)

26 RP*N, RP*C, RP*S and LH* (comparison figure)

27 (figure)

28 (figure)

29 Research Frontier?

30 Kroll & Widmayer scheme (ACM SIGMOD-94)
• Provides for 1-d ordered files
  – a practical alternative to the RP* schemes
• Efficiently supports range queries
• Uses a paged distributed binary tree
  – can become unbalanced

31 k-RP*
• Provides for multi-attribute (k-d) search
  – key search
  – partial-match and range search
  – candidate-key search
    » orders of magnitude better search performance than traditional structures
• Uses
  – a paged distributed k-d tree index on the servers
  – partial k-d trees on the clients

32 Access performance (case study)
• Three queries to a 400 MB, a 4 GB and a 40 GB file
  – Q1: a range query selecting 1 % of the file
  – Q2: query Q1 with an additional predicate on non-key attributes, selecting 0.5 % of the records selected by Q1
  – Q3: a successful partial-match search x0 = c0 in a 3-d file, where x0 is a candidate key
• Response time is computed for:
  – a traditional disk file
  – a k-RP* file on a 10 Mb/s net
  – a k-RP* file on a 1 Gb/s net
• Factor S is the corresponding speed-up
  – it reaches 7 orders of magnitude

33 (figure)

34 dPi-tree
• Side pointers between the leaves
  – traversed when an addressing error occurs
    » a limit can be set up to guard against the worst case
• Base index at some server
• Client image trees are built path-by-path
  – from the base index or through IAMs
    » called correction messages
• Basic performance similar to k-RP*?
  – analysis pending

35 Conclusion
• Since their inception in 1993, SDDSs have been the subject of an important research effort
• In a few years, several schemes appeared
  – with the basic functions of traditional files
    » hash, primary-key ordered, multi-attribute k-d access
  – providing for much faster and larger files
  – confirming initial expectations

36 Future work
• Deeper analysis
  – formal methods, simulations & experiments
• Prototype implementation
  – SDDS protocol (ongoing at Paris 9)
• New schemes
  – high availability & security
  – R*-trees?
• Killer apps
  – large storage servers & object servers
  – object-relational databases
    » Schneider, D. & al. (COMAD-94)
  – video servers
  – real-time
  – high-performance scientific data processing

37 END
Thank you for your attention
Witold Litwin
