5 Creating the Physical Model. Designing the Physical Model Phase IV: Defining the physical model.

Database Object Naming Conventions: Keep the logical and physical names similar and descriptive. Capitalize table and attribute names. Use underscores instead of spaces to delineate separate words in an object's name. Use a suffix of _PK to indicate primary keys. Use a suffix of _ID to indicate production keys. Find a good balance between using very specific and very vague names.

Database Object Naming Conventions (continued): Develop a reasonable list of abbreviations. List all the objects' names, and work with the user community to define them. Resolve name disputes. Document your naming standards in the metadata document. Plan for the naming standards to be a living document.
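The naming rules above can be sketched as a small helper. This is a minimal illustration, not a standard tool: the function name `physical_name` and the two entries in `ABBREVIATIONS` are hypothetical, and a real abbreviation list would be agreed with the user community.

```python
import re

# Hypothetical abbreviation list, used here only for illustration.
ABBREVIATIONS = {"WAREHOUSE": "WHSE", "DESCRIPTION": "DESC"}


def physical_name(logical_name: str, suffix: str = "") -> str:
    """Build a physical name: capitalized, underscores between words, optional suffix."""
    words = re.split(r"[\s_]+", logical_name.strip().upper())
    words = [ABBREVIATIONS.get(word, word) for word in words if word]
    name = "_".join(words)
    # Append a suffix such as _PK or _ID only if it is not already present.
    if suffix and not name.endswith(suffix):
        name += suffix
    return name


print(physical_name("warehouse location"))     # WHSE_LOCATION
print(physical_name("product description"))    # PRODUCT_DESC
print(physical_name("product", suffix="_PK"))  # PRODUCT_PK
```

Keeping the abbreviations in one dictionary makes it easier to treat the naming standard as a living document: a change to the list changes every generated name consistently.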

Translating the Dimensional Model into a Physical Model: Apply the naming standards to the tables and attributes of the dimensional model. List table columns with primary keys listed first. Label primary keys consistently. Identify the format and length of columns. Label unique keys with a (#). Label column optionality with NULL (o) or NOT NULL (*) constraints. Label foreign keys with _FK. Use synonyms for user tables.

Physical Model: Product
*PRODUCT_ID        v(11)
*PRODUCT_DESC      v(125)
*PRODUCT_NAME      v(35)
*CATEGORY_ID       v(20)
*CATEGORY_DESC     v(25)
*SUPPLIER_ID       v(20)
*PRODUCT_STATUS    v(10)
*LIST_PRICE        n
*CATALOG_ID        v(20)
*PRODUCT_TYPE      v(20)
*PRODUCT_CODE      v(10)
*PROMOTION_CODE    v(10)
*WHSE_LOCATION     v(10)
*VALID_FROM_DATE   d
*VALID_TO_DATE     d
# *Product_PK      n
# *Channel_PK      n
# *Promotion_PK    n
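As a rough sketch, the column notation on this slide can be turned into a CREATE TABLE statement. The mapping of the type codes is an assumption, since the slides do not spell it out: `v` is read as variable-length character, `n` as number, and `d` as date. The helper names `column_ddl` and `create_table_ddl` are hypothetical, and SQLite is used only because it ships with Python.

```python
import re
import sqlite3

# Assumed meaning of the slide's type codes (not stated in the slides).
TYPE_MAP = {"v": "VARCHAR", "n": "NUMERIC", "d": "DATE"}

# Matches specs such as "*PRODUCT_ID v(11)" or "# *PRODUCT_PK n".
COLUMN_RE = re.compile(
    r"^(?P<unique>#\s*)?(?P<opt>[*o])(?P<name>\w+)\s+"
    r"(?P<type>[vnd])(?:\((?P<len>\d+)\))?$"
)


def column_ddl(spec: str) -> str:
    """Translate one column spec into a column definition."""
    m = COLUMN_RE.match(spec.strip())
    if m is None:
        raise ValueError(f"unrecognized column spec: {spec!r}")
    sql_type = TYPE_MAP[m["type"]]
    if m["len"]:
        sql_type += f"({m['len']})"
    parts = [m["name"], sql_type]
    if m["opt"] == "*":       # * marks NOT NULL; o marks an optional column
        parts.append("NOT NULL")
    if m["unique"]:           # # marks a unique key
        parts.append("UNIQUE")
    return " ".join(parts)


def create_table_ddl(table: str, specs: list) -> str:
    columns = ",\n  ".join(column_ddl(spec) for spec in specs)
    return f"CREATE TABLE {table} (\n  {columns}\n)"


specs = [
    "# *PRODUCT_PK n",
    "*PRODUCT_ID v(11)",
    "*PRODUCT_NAME v(35)",
    "*LIST_PRICE n",
    "*VALID_FROM_DATE d",
]
ddl = create_table_ddl("PRODUCT", specs)
print(ddl)

# Check that the generated statement is accepted by a database engine.
conn = sqlite3.connect(":memory:")
conn.execute(ddl)
```

Only a subset of the slide's columns is used, with the key column listed first as the translation guidelines recommend.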

Defining the Hardware: Transforming the base dimensional data model into the physical model includes some of the following: defining naming and database standards; performing an initial sizing; designing tablespaces; defining an initial indexing strategy; using partitioning to split table and index data into smaller, more manageable chunks; determining where to place database objects on disk (RAID, striping, disk mapping); using parallel processing.
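An initial sizing can start as simple arithmetic: expected row count times estimated row width, padded for storage overhead. The sketch below assumes that approach; the function name, the row count, and the overhead factor of 1.5 are illustrative assumptions, not figures from the slides. The column widths are the declared lengths of three Product columns.

```python
from typing import Sequence


def estimate_table_bytes(
    row_count: int,
    column_widths: Sequence[int],
    overhead_factor: float = 1.5,  # assumed padding for block and row overhead
) -> int:
    """Rough table size: rows x summed column widths x overhead factor."""
    row_bytes = sum(column_widths)
    return int(row_count * row_bytes * overhead_factor)


# PRODUCT_ID v(11), PRODUCT_DESC v(125), PRODUCT_NAME v(35)
widths = [11, 125, 35]
size = estimate_table_bytes(row_count=100_000, column_widths=widths)
print(size)  # 25650000
```

Declared maximum lengths overstate variable-length columns, so an estimate like this is an upper bound to refine once real data is profiled.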

Architectural Requirements: Scalability; Manageability; Availability; Extensibility; Integrated; Accessibility; Reliability; Flexibility; User; Budget; Business; Technology.

Architecture Characteristics: Robust; Available; Reliable; Extensible; Scalable; Supportable; Recoverable; Parallel; VLM (very large memory); 64-bit; Connective; Open.

Hardware Requirements: SMP (symmetric multiprocessing); Cluster and MPP (massively parallel processing); Hybrids using SMP and MPP.

Evaluation Criteria: Determine the platform for your needs. [Figure: SMP, clusters, and MPP compared on scalability and maturity, each on a scale from low to high.]

Parallel Processing: Parallel daily operations. Shared resources: memory, disk, or nothing. Loosely or tightly coupled. [Figure: layers labeled Application, Database, Operating system, Hardware.]

Making the Right Choice: Requirements differ from operational systems. Benchmark: available from vendors; develop your own; use realistic queries. Scalability is important.

Symmetric Multiprocessing (SMP): Communication by shared memory. Disk controllers accessible to all CPUs. Proven technology. [Figure: CPU, shared memory, and shared disks on a common bus.]

SMP Benefits: high concurrency; workload balancing; moderate scalability; easy administration. Limitations: memory (cluster for improvements); bandwidth. [Figure labels: CPU, shared memory.]

Clusters [Figure: Node 1, Node 2, and Node 3, each labeled with a CPU and shared memory, connected by a common high-speed bus to shared disks.]

Clusters: Shared disk, loosely coupled; dedicated memory; high-speed bus; shared resources; SMP node.

Massively Parallel Processing (MPP) [Figure: repeated CPU and Memory pairs, plus Disk.]

MPP: A shared nothing architecture; many nodes; fast access; exclusive memory on a node; low cost per node; scalable; n CUBE configuration. [Figure: n Cube arrangements.]

MPP Benefits: unlimited incremental growth; very scalable; fast access; low cost per node; good for DSS.

MPP Limitations: rigid partitioning; cache consistency; restricted disk access; high memory cost per node; high management burden; careful data placement.

Architectural Tiers: Tiered structures: modular; logical separation. Distributed structures: two-tier; three-tier; four-tier (and more). [Figure: DB server, apps server, workstations, web server, Internet.]

Sample System Architecture

Gateway Middleware Technologies for integration

Database Server Requirements: Robust; Available; Reliable; Extensible; Scalable; Supportable; Recoverable; Parallel.

Parallelism: Database; Query; Load; Index; Sort; Backup; Recovery.

Further Considerations: optimization strategy; partitioning strategy; summarization strategy; indexing techniques; hardware and software scalability; availability; administration.

Parallel Processing: A large task broken into smaller tasks: concurrent execution; one or more processors. [Figure: elapsed time for a task run not in parallel on Processor 1, compared with the task run in parallel on Processors 1 through 4.]
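The idea of breaking a large task into smaller tasks that run concurrently can be sketched as follows. This is an illustration under stated assumptions: the function names `split` and `parallel_sum` are hypothetical, and worker threads stand in for the processors in the figure. In standard CPython, threads do not speed up CPU-bound work, so the sketch shows the structure of the technique rather than a performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, parts):
    """Divide items into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def parallel_sum(items, workers=4):
    """Sum each chunk in its own worker, then combine the partial results."""
    chunks = split(items, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum, chunks))
    return sum(partials)


values = list(range(1, 101))
print(parallel_sum(values))  # 5050
```

The same split, process, and combine shape applies to the parallel query, load, index, and sort operations listed in this deck.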

Parallel Database: Increased speed. Improved scalability. Performance gains: availability; flexibility; more users. [Figure: Processors 1 through 4 working in parallel.]

Parallel Query: SQL code split among server processes. [Figure: Query, Subquery.]

Parallel Load: Bypass SQL processing to speed throughput. [Figure: Order table with Jan 98, Feb 98, and Mar 98 segments.]
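The figure shows an Order table divided by month, which is the kind of split that lets each segment be loaded independently. A minimal sketch of routing rows to monthly partitions follows; the row layout, the `order_date` field, and the `YYYY_MM` key format are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date


def partition_key(order_date: date) -> str:
    """Name the monthly partition a row belongs to, for example 1998_01."""
    return f"{order_date.year}_{order_date.month:02d}"


def partition_by_month(rows):
    """Group rows into per-month lists that could be loaded independently."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[partition_key(row["order_date"])].append(row)
    return dict(partitions)


# Hypothetical order rows covering the three months in the figure.
orders = [
    {"order_id": 1, "order_date": date(1998, 1, 15)},
    {"order_id": 2, "order_date": date(1998, 2, 3)},
    {"order_id": 3, "order_date": date(1998, 1, 28)},
    {"order_id": 4, "order_date": date(1998, 3, 9)},
]
parts = partition_by_month(orders)
for key in sorted(parts):
    print(key, [row["order_id"] for row in parts[key]])
# 1998_01 [1, 3]
# 1998_02 [2]
# 1998_03 [4]
```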

Parallel Processing: Index: reduces the time to create. Sort: allocates memory in cache efficiently.

Parallel Processing: Backup: runs simultaneously from any node (online and offline). Recovery: runs simultaneously from redo logs. Summaries: uses the CREATE TABLE AS SELECT statement.
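Building a summary table with CREATE TABLE AS SELECT can be sketched as below. The ORDERS table, its columns, and its rows are hypothetical. SQLite is used only because it ships with Python; it runs the statement serially, so this shows the shape of the statement and not parallel execution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical detail table standing in for a warehouse fact table.
conn.execute(
    "CREATE TABLE ORDERS (ORDER_ID INTEGER, PRODUCT_ID TEXT, AMOUNT NUMERIC)"
)
conn.executemany(
    "INSERT INTO ORDERS VALUES (?, ?, ?)",
    [(1, "P1", 10), (2, "P1", 15), (3, "P2", 7)],
)

# Create and populate the summary table in a single statement.
conn.execute(
    """
    CREATE TABLE ORDER_SUMMARY AS
    SELECT PRODUCT_ID,
           COUNT(*)    AS ORDER_COUNT,
           SUM(AMOUNT) AS TOTAL_AMOUNT
    FROM ORDERS
    GROUP BY PRODUCT_ID
    """
)

summary = conn.execute(
    "SELECT PRODUCT_ID, ORDER_COUNT, TOTAL_AMOUNT "
    "FROM ORDER_SUMMARY ORDER BY PRODUCT_ID"
).fetchall()
print(summary)  # [('P1', 2, 25), ('P2', 1, 7)]
```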
