5 Creating the Physical Model. Designing the Physical Model Phase IV: Defining the physical model.

Presentation transcript:

5 Creating the Physical Model

Designing the Physical Model Phase IV: Defining the physical model

Database Object Naming Conventions Keep the logical and physical names similar and descriptive. Capitalize table and attribute names. Use underscores instead of spaces to delineate separate words in an object’s name. Use a suffix of _PK to indicate primary keys. Use a suffix of _ID to indicate production keys. Find a good balance between using very specific and very vague names.

Database Object Naming Conventions Develop a reasonable list of abbreviations. List all the objects’ names, and work with the user community to define them. Resolve name disputes. Document your naming standards in the metadata document. Plan for the naming standards to be a living document.
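The naming rules above (capitalized names, underscores between words, an agreed abbreviation list, key suffixes) can be sketched as a small helper. This is only an illustration: the function name `physical_name` and the abbreviation entries are hypothetical, and a real abbreviation list would be agreed with the user community and recorded in the metadata document.

```python
import re

# Hypothetical abbreviation list; the real one comes from your naming standards.
ABBREVIATIONS = {"DESCRIPTION": "DESC", "WAREHOUSE": "WHSE", "IDENTIFIER": "ID"}


def physical_name(logical_name: str, suffix: str = "") -> str:
    """Turn a logical name into a physical one.

    Uppercases the name, joins words with underscores, applies the
    abbreviation list, and appends an optional suffix such as PK or ID.
    """
    words = re.split(r"[\s_]+", logical_name.strip().upper())
    words = [ABBREVIATIONS.get(word, word) for word in words if word]
    name = "_".join(words)
    return f"{name}_{suffix}" if suffix else name


print(physical_name("product description"))  # PRODUCT_DESC
print(physical_name("warehouse location"))   # WHSE_LOCATION
print(physical_name("product", "PK"))        # PRODUCT_PK
```

Keeping the rules in one function makes it easier to apply the standard consistently and to update it as the naming document evolves.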

Translating the Dimensional Model into a Physical Model Apply the naming standards to the tables and attributes of the dimensional model. List table columns with primary keys listed first. Label primary keys consistently. Identify the format and length of columns. Label unique keys with a (#). Label column optionality with NULL (o) or NOT NULL (*) constraints. Label foreign keys with _FK. Use synonyms for user tables.

Physical Model: Product
*PRODUCT_ID v(11)
*PRODUCT_DESC v(125)
*PRODUCT_NAME v(35)
*CATEGORY_ID v(20)
*CATEGORY_DESC v(25)
*SUPPLIER_ID v(20)
*PRODUCT_STATUS v(10)
*LIST_PRICE n
*CATALOG_ID v(20)
*PRODUCT_TYPE v(20)
*PRODUCT_CODE v(10)
*PROMOTION_CODE v(10)
*WHSE_LOCATION v(10)
*VALID_FROM_DATE d
*VALID_TO_DATE d
# *Product_PK n
# *Channel_PK n
# *Promotion_PK n
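As a rough illustration of how the column notation in the Product model could become DDL, the sketch below builds a CREATE TABLE statement and runs it against an in-memory SQLite database. Several things here are assumptions rather than part of the slides: the mapping of v, n, and d to VARCHAR, NUMBER, and DATE, the choice of PRODUCT_PK as the table's primary key, the helper name `build_ddl`, and the use of only a subset of the columns. SQLite stands in for the target database only because it ships with Python.

```python
import sqlite3

# Assumed reading of the slide notation: v(n) = variable-length text,
# n = number, d = date. The target SQL type names are an assumption too.
TYPE_MAP = {"v": "VARCHAR", "n": "NUMBER", "d": "DATE"}

# (column name, type code, length or None); a subset of the Product columns,
# with the primary key listed first as the translation rules suggest.
PRODUCT_COLUMNS = [
    ("PRODUCT_PK", "n", None),
    ("PRODUCT_ID", "v", 11),
    ("PRODUCT_NAME", "v", 35),
    ("PRODUCT_DESC", "v", 125),
    ("LIST_PRICE", "n", None),
    ("VALID_FROM_DATE", "d", None),
    ("VALID_TO_DATE", "d", None),
]


def build_ddl(table, columns, primary_key):
    """Build a CREATE TABLE statement; every column is NOT NULL (the * marker)."""
    lines = []
    for name, code, length in columns:
        sql_type = TYPE_MAP[code] + (f"({length})" if length else "")
        lines.append(f"    {name} {sql_type} NOT NULL")
    lines.append(f"    PRIMARY KEY ({primary_key})")
    return f"CREATE TABLE {table} (\n" + ",\n".join(lines) + "\n)"


ddl = build_ddl("PRODUCT", PRODUCT_COLUMNS, "PRODUCT_PK")
print(ddl)

# Check that the generated statement is accepted and the columns exist.
conn = sqlite3.connect(":memory:")
conn.execute(ddl)
column_names = [row[1] for row in conn.execute("PRAGMA table_info(PRODUCT)")]
print(column_names)
```

Generating DDL from a column list keeps the physical model and the documented standards in step, since a change to the list is the only place an edit is needed.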

Defining the Hardware: Transforming the base dimensional data model into the physical model includes some of the following: defining naming and database standards; performing an initial sizing; designing tablespaces; defining an initial indexing strategy; using partitioning to split table and index data into smaller, more manageable chunks; determining where to place database objects on disk (RAID, striping, disk mapping); and using parallel processing.

Architectural Requirements: Scalability, Manageability, Availability, Extensibility, Integrated, Accessibility, Reliability, Flexibility. [Diagram labels: User, Budget, Business, Technology.]

Architecture Characteristics: robust, available, reliable, extensible, scalable, supportable, recoverable, parallel, VLM (very large memory), 64-bit, connective, open.

Hardware Requirements: SMP (symmetric multiprocessing); cluster and MPP (massively parallel processing); hybrids using SMP and MPP.

Evaluation Criteria: Determine the platform for your needs (SMP, clusters, or MPP). [Chart: SMP, clusters, and MPP rated from low to high on scalability and maturity.]

Parallel Processing: Parallel daily operations. Shared resources: memory, disk, or nothing. Loosely or tightly coupled. [Diagram layers: application, database, operating system, hardware.]

Making the Right Choice: Requirements differ from operational systems. Benchmark: available from vendors; develop your own; use realistic queries. Scalability is important.

Symmetric Multiprocessing (SMP): Communication by shared memory. Disk controllers accessible to all CPUs. Proven technology. [Diagram: CPUs on a common bus with shared memory and shared disks.]

SMP Benefits: high concurrency, workload balancing, moderate scalability, easy administration. Limitations: memory (cluster for improvements), bandwidth. [Diagram: CPUs with shared memory.]

Clusters [Diagram: three nodes (Node 1, Node 2, Node 3), each with CPUs and shared memory, linked by a common high-speed bus to shared disks.]

Clusters: shared disk, loosely coupled; dedicated memory; high-speed bus; shared resources; SMP node.

Massively Parallel Processing (MPP) [Diagram: several CPUs, each with its own memory and disk.]

MPP nCUBE Arrangements: a shared-nothing architecture; many nodes; fast access; exclusive memory on a node; low cost per node; scalable nCUBE configuration.

MPP Benefits: unlimited incremental growth; very scalable; fast access; low cost per node; good for DSS (decision support systems).

MPP Limitations: rigid partitioning; cache consistency; restricted disk access; high memory cost per node; high management burden; careful data placement.
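To see why a shared-nothing design calls for careful data placement, the sketch below assigns rows to nodes by hashing a distribution key. It is a simplified illustration, not how any particular MPP product places data: the CRC32 hash, the four-node count, the customer-style keys, and the function names are all assumptions.

```python
import zlib
from collections import Counter


def node_for(key: str, node_count: int) -> int:
    """Map a distribution key to a node using a stable hash (CRC32, by assumption)."""
    return zlib.crc32(key.encode("utf-8")) % node_count


def placement(keys, node_count):
    """Return how many rows each node would own for the given keys."""
    counts = Counter(node_for(key, node_count) for key in keys)
    return [counts.get(node, 0) for node in range(node_count)]


# Many distinct key values: rows are spread over the nodes.
distinct_keys = [f"CUST{n:05d}" for n in range(10_000)]
rows_per_node = placement(distinct_keys, 4)
print(rows_per_node)

# One repeated key value: every row lands on the same node, which then
# does all the work while the other nodes sit idle.
skewed = placement(["CUST00001"] * 10_000, 4)
print(skewed)
```

Because each node can reach only its own memory and disk, a poorly chosen distribution key concentrates both storage and query work on a few nodes, which is one reason the slides list rigid partitioning and data placement among the limitations.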

Architectural Tiers: Tiered structures: modular, logical separation. Distributed structures: two-tier, three-tier, four-tier (and more). [Diagram: DB server, apps server, workstations, web server, Internet.]

Sample System Architecture

Gateway Middleware Technologies for integration

Database Server Requirements: robust, available, reliable, extensible, scalable, supportable, recoverable, parallel.

Parallelism: database, query, load, index, sort, backup, recovery.

Further Considerations: optimization strategy; partitioning strategy; summarization strategy; indexing techniques; hardware and software scalability; availability; administration.

Parallel Processing: A large task is broken into smaller tasks that execute concurrently on one or more processors. [Diagram: elapsed time with one processor (not parallel) compared with four processors (parallel).]
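The idea of breaking a large task into smaller tasks that run concurrently can be sketched with Python's standard thread pool. This only illustrates the split, execute, and combine pattern; the function names and the worker count of four are assumptions. In CPython, threads do not speed up CPU-bound work such as this sum, so treat it as a picture of the structure rather than a benchmark.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, parts):
    """Split a list into at most `parts` chunks of nearly equal size."""
    size = max(1, -(-len(items) // parts))  # ceiling division, at least 1
    return [items[i:i + size] for i in range(0, len(items), size)]


def parallel_sum(values, workers=4):
    """Sum a list by summing chunks concurrently and combining the partial results."""
    chunks = split(values, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum, chunks))
    return sum(partials)


print(parallel_sum(list(range(1, 101))))  # 5050
```

The final combining step is what lets each worker operate independently on its own chunk.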

Parallel Database: Increased speed. Improved scalability. Performance gains: availability, flexibility, more users. [Diagram: four processors working in parallel.]

Parallel Query: SQL code is split among server processes. [Diagram: a query divided into subqueries.]
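One way to picture a query being split among server processes is to divide it into subqueries over key ranges and combine the partial results. The sketch below shows only that decomposition, using an in-memory SQLite database and running the subqueries one after another; a parallel database server would hand each range to a separate process. The ORDERS table, its columns, and the helper names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ORDERS (ORDER_ID INTEGER PRIMARY KEY, AMOUNT INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO ORDERS VALUES (?, ?)", [(i, i * 10) for i in range(1, 101)]
)


def key_ranges(low, high, parts):
    """Divide the inclusive key range [low, high] into at most `parts` subranges."""
    step = -(-(high - low + 1) // parts)  # ceiling division
    return [
        (start, min(start + step - 1, high))
        for start in range(low, high + 1, step)
    ]


def run_subquery(bounds):
    """Run the aggregate over one key range and return (sum, row count)."""
    return conn.execute(
        "SELECT COALESCE(SUM(AMOUNT), 0), COUNT(*) "
        "FROM ORDERS WHERE ORDER_ID BETWEEN ? AND ?",
        bounds,
    ).fetchone()


# Each range stands in for the work one server process would do.
partials = [run_subquery(bounds) for bounds in key_ranges(1, 100, 4)]

# The coordinator combines the partial results into the final answer.
total_amount = sum(partial[0] for partial in partials)
total_rows = sum(partial[1] for partial in partials)
print(total_amount, total_rows)  # 50500 100
```

Sums and counts combine cleanly because they are additive; an average would have to be rebuilt from the combined sum and count rather than by averaging the partial averages.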

Parallel Load: Bypass SQL processing to speed throughput. [Diagram: an Order table divided into Jan 98, Feb 98, and Mar 98.]
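The month-by-month picture of the Order table suggests loading each month's data as a separate stream. The sketch below groups incoming rows by month and hands each group to its own worker. It is a simplified illustration under stated assumptions: the row layout, the year-month partition key, and `load_partition` (a stand-in that only counts rows) are hypothetical, and it does not model a real bulk loader or the bypass of SQL processing.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from datetime import date


def partition_key(order_date):
    """Name the partition a row belongs to, for example 1998-01."""
    return f"{order_date.year}-{order_date.month:02d}"


def group_by_partition(rows):
    """Group (order_id, order_date) rows by their month partition."""
    groups = defaultdict(list)
    for row in rows:
        groups[partition_key(row[1])].append(row)
    return groups


def load_partition(item):
    """Stand-in for a bulk loader writing one partition; returns the row count."""
    name, rows = item
    return name, len(rows)


rows = [
    (1, date(1998, 1, 5)),
    (2, date(1998, 1, 20)),
    (3, date(1998, 2, 2)),
    (4, date(1998, 3, 15)),
]

groups = group_by_partition(rows)
with ThreadPoolExecutor(max_workers=3) as pool:
    loaded = dict(pool.map(load_partition, groups.items()))
print(loaded)  # {'1998-01': 2, '1998-02': 1, '1998-03': 1}
```

Because each worker touches only its own partition, the loads do not contend for the same segment of the table.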

Parallel Processing: Index: reduces the time needed to create an index. Sort: allocates memory in cache efficiently.

Parallel Processing: Backup: runs simultaneously from any node (online and offline). Recovery: runs simultaneously from redo logs. Summaries: use the CREATE TABLE AS SELECT statement.
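A summary table built with CREATE TABLE AS SELECT can be sketched as follows. The SALES table, its columns, and the sample rows are hypothetical, and SQLite is used only because it ships with Python; it runs the statement serially, so whether and how such a statement runs in parallel depends on the database server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SALES ("
    "PRODUCT_ID TEXT NOT NULL, SALE_MONTH TEXT NOT NULL, AMOUNT INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO SALES VALUES (?, ?, ?)",
    [("P1", "1998-01", 100), ("P2", "1998-01", 50), ("P1", "1998-02", 70)],
)

# Build the summary in one statement: the new table takes its columns and
# rows from the result of the SELECT.
conn.execute(
    """
    CREATE TABLE SALES_BY_MONTH AS
    SELECT SALE_MONTH,
           SUM(AMOUNT) AS TOTAL_AMOUNT,
           COUNT(*)    AS SALE_COUNT
    FROM SALES
    GROUP BY SALE_MONTH
    """
)

summary = conn.execute(
    "SELECT SALE_MONTH, TOTAL_AMOUNT, SALE_COUNT "
    "FROM SALES_BY_MONTH ORDER BY SALE_MONTH"
).fetchall()
print(summary)  # [('1998-01', 150, 2), ('1998-02', 70, 1)]
```

Queries that need monthly totals can then read the small summary table instead of scanning the detail rows.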