Parallel Databases Michael French, Spencer Steele, Jill Rochelle When Parallel Lines Meet by Ken Rudin (BYTE, May 98)



What are Parallel/Scalable Databases?
Parallel/Scalable Databases:
- Hardware Architecture
  - Multiple Processors
  - Multiple Disk Drives
  - Large Memory Banks
- Software Architecture
  - Capable of processing parallel queries
  - Data shipping capabilities

What makes Parallel Databases different from previous technologies?

Previous Technology
- Hardware
  - Single processor
  - Small Disk Capacity
  - Less Memory
- Software
  - Sequential Queries
  - No partitioning of queries

Parallel Query:
- A query that partitions information across multiple processors and can also pipeline information

Information Partitioning
- Divide the information into smaller tasks
- Can have multiple meanings:
  - Distribution of info to multiple CPUs
  - Division of hard drive space to contain certain parts of the data
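The first meaning of partitioning above (distributing rows to multiple CPUs) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the modulo partitioning scheme, and the sample rows are hypothetical and do not come from the slides or the BYTE article. A thread pool stands in for separate processors; in CPython it shows the structure of the work split rather than a real CPU speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_rows(rows, num_partitions, key):
    """Assign each row to a partition by key modulo num_partitions (assumed scheme)."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[key(row) % num_partitions].append(row)
    return partitions


def scan_partition(partition, predicate):
    """Scan a single partition and keep the rows that satisfy the predicate."""
    return [row for row in partition if predicate(row)]


def parallel_scan(rows, num_partitions, key, predicate):
    """Partition the rows, scan each partition in its own worker, merge the results."""
    partitions = partition_rows(rows, num_partitions, key)
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        results = pool.map(scan_partition, partitions, [predicate] * num_partitions)
        merged = []
        for part in results:
            merged.extend(part)
    return merged


# Hypothetical sample data: 20 rows split across 4 partitions.
rows = [{"id": i, "amount": i * 10} for i in range(20)]
matches = parallel_scan(
    rows,
    num_partitions=4,
    key=lambda r: r["id"],
    predicate=lambda r: r["amount"] >= 150,
)
```

Each worker only touches its own partition, so no coordination is needed until the final merge.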

Information Partitioning 2

Information Pipelining
- Allows separate processors to work on separate stages of a query
  - Scan
  - Join
  - Sort
- Concept is akin to assembly line idea
- Allows multiple queries to run at the same time
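The assembly-line idea can be sketched with one thread per stage, connected by queues, so that the join stage starts consuming rows while the scan stage is still producing them. Everything here is an assumption made for illustration: the stage functions, the `orders` and `customers_by_id` sample data, and the use of threads in place of separate processors. Note that the sort stage has to buffer all of its input before it can emit anything, so only scan and join truly overlap in this sketch.

```python
import queue
import threading

DONE = object()  # sentinel that marks the end of a stream


def scan_stage(orders, out_q):
    """Stage 1: read rows one at a time and hand them to the next stage."""
    for row in orders:
        out_q.put(row)
    out_q.put(DONE)


def join_stage(in_q, customers_by_id, out_q):
    """Stage 2: join each incoming row against a lookup table as it arrives."""
    while True:
        row = in_q.get()
        if row is DONE:
            break
        name = customers_by_id.get(row["customer_id"])
        if name is not None:  # rows without a matching customer are dropped
            out_q.put({**row, "name": name})
    out_q.put(DONE)


def sort_stage(in_q, result):
    """Stage 3: collect the joined rows, then sort them by total."""
    buffered = []
    while True:
        row = in_q.get()
        if row is DONE:
            break
        buffered.append(row)
    result.extend(sorted(buffered, key=lambda r: r["total"]))


def run_pipeline(orders, customers_by_id):
    """Run the three stages concurrently, one thread per stage."""
    scan_to_join = queue.Queue()
    join_to_sort = queue.Queue()
    result = []
    stages = [
        threading.Thread(target=scan_stage, args=(orders, scan_to_join)),
        threading.Thread(target=join_stage, args=(scan_to_join, customers_by_id, join_to_sort)),
        threading.Thread(target=sort_stage, args=(join_to_sort, result)),
    ]
    for stage in stages:
        stage.start()
    for stage in stages:
        stage.join()
    return result


# Hypothetical sample data.
orders = [
    {"order_id": 1, "customer_id": 10, "total": 30},
    {"order_id": 2, "customer_id": 11, "total": 5},
    {"order_id": 3, "customer_id": 99, "total": 50},  # no matching customer
    {"order_id": 4, "customer_id": 10, "total": 12},
]
customers_by_id = {10: "Ada", 11: "Grace"}
pipeline_result = run_pipeline(orders, customers_by_id)
```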

Information Pipelining 2

Sequential Query Example
- Two tables with 20 million rows each, run on a uniprocessor machine
  - To perform scan, join & sort, query takes 12 mins.
- Add partitioning
  - Query takes 3 mins.
- Add pipelining
  - 12 queries can be run in 12 mins.
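The figures in this example can be turned into a speedup and a throughput number. The short calculation below uses only the three timings given on the slide; the variable names are made up for illustration, and the slide does not say how many processors were used, so nothing here assumes a processor count.

```python
from fractions import Fraction

# Timings taken from the slide.
baseline_minutes = 12          # one query on a uniprocessor
partitioned_minutes = 3        # the same query with partitioning
pipelined_queries = 12         # queries completed with pipelining added
pipelined_window_minutes = 12  # time window for those queries

# Response-time speedup of a single query from partitioning.
speedup = Fraction(baseline_minutes, partitioned_minutes)

# Throughput in queries per minute, before and after.
baseline_throughput = Fraction(1, baseline_minutes)
pipelined_throughput = Fraction(pipelined_queries, pipelined_window_minutes)
throughput_gain = pipelined_throughput / baseline_throughput

print(f"Partitioning speedup: {speedup}x")
print(f"Pipelined throughput: {pipelined_throughput} query per minute")
print(f"Throughput gain over the sequential baseline: {throughput_gain}x")
```

Partitioning shortens the response time of one query, while pipelining raises the number of queries finished per unit of time; the two figures measure different things.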

Parallel Kinds
- Share-Everything
  - Hardware
  - Software
- Share-Disk
  - Hardware
  - Software
- Share-Nothing
  - Hardware
  - Software

Conclusion
- Pros
  - Allows you to process more information
  - Provides for faster processing of queries
- Cons
  - Expensive hardware & software
  - Much higher maintenance
- Is a parallel database right for your organization?