Published by Jemimah Lawrence. Modified over 9 years ago.
2
Building a Terabyte Data Warehouse, Using Linux and RAC
George Lumpkin, Director, Product Management, Oracle Corporation
Session id: 40177
3
Do More with Less
– More performance
– More scalability
– More users
– Less capital cost
– Less administration cost
4
RAC for Scalability, Availability, and Flexibility
5
Linux and RAC for DW Scalability
Linux ‘Starter’ Cluster:
- Two nodes
- One shared database
Diagram label: Data Warehouse DB
6
Linux and RAC for DW Scalability As the Business Grows … Data Warehouse DB
7
Linux and RAC for DW Scalability
As the Business Grows … so does your Environment:
- Three Nodes
- One Database
Diagram label: Data Warehouse DB
8
Linux and RAC for DW Scalability
As the Business Grows … and again:
- Four Nodes
- One Database
Diagram label: Data Warehouse DB
9
Linux and RAC for DW Availability When one node fails … Data Warehouse DB
10
Linux and RAC for DW Availability
When one node fails … the load is rebalanced and three-quarters of the cluster continues the work
Diagram label: Data Warehouse DB
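The "three-quarters" figure on this slide is simple arithmetic, sketched below in Python. The function name `surviving_capacity` is hypothetical. The sketch assumes identically sized nodes and perfectly even rebalancing, neither of which the slides state.

```python
def surviving_capacity(nodes: int, failed: int = 1) -> float:
    """Fraction of cluster capacity still available after `failed` nodes go down.

    Assumption (not from the slides): all nodes are identical and the
    remaining nodes pick up the load evenly.
    """
    if nodes < 1 or not 0 <= failed <= nodes:
        raise ValueError("need nodes >= 1 and 0 <= failed <= nodes")
    return (nodes - failed) / nodes


if __name__ == "__main__":
    # Four-node cluster from the slides with one failed node.
    print(surviving_capacity(4))  # 0.75
    # The two-node 'starter' cluster keeps only half of its capacity.
    print(surviving_capacity(2))  # 0.5
```

Under the same assumptions, the larger the cluster, the smaller the share of capacity lost to a single node failure.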
11
Linux and RAC for DW Flexibility
The cluster can share the entire workload across all nodes …
Diagram labels: Query, ETL, Data Warehouse DB
12
Linux and RAC for DW Flexibility
… or do workload partitioning
Diagram labels: Query, ETL, Query, ETL, Data Warehouse DB
13
Linux and RAC for DW Flexibility
Workload Management and Provisioning made easy
Christmas – “Data Season” for Retail
Diagram labels: Query, ETL, Query, ETL, ETL, Data Warehouse DB
14
Linux and RAC for DW Flexibility
Workload Management and Provisioning made easy
January – “Analysis Season”
Diagram labels: Query, ETL, Query, ETL, ETL, Query, Data Warehouse DB
15
RAC and Parallel Execution
16
Very large queries utilize all resources on the cluster Large Query
17
RAC and Parallel Execution
Many large-scale DWs have many concurrent jobs
– Multiple “small-to-medium” size queries
– Degree of parallelism < CPUs-per-node
With Oracle, these queries will automatically run on a single node, eliminating traffic over the interconnect
Diagram labels: Q1 through Q12
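The rule on this slide can be illustrated with a toy Python model. This is not Oracle's placement algorithm. The functions `nodes_spanned` and `uses_interconnect` are hypothetical names. The model assumes one parallel server process per CPU and fills nodes one at a time, so a degree of parallelism up to the CPU count of a node counts as fitting on one node.

```python
import math


def nodes_spanned(dop: int, cpus_per_node: int, node_count: int) -> int:
    """Number of nodes a query occupies in this toy model.

    Simplifying assumptions (not Oracle's actual behavior): one parallel
    server process per CPU, and nodes are filled one at a time.
    """
    if min(dop, cpus_per_node, node_count) < 1:
        raise ValueError("all arguments must be >= 1")
    return min(math.ceil(dop / cpus_per_node), node_count)


def uses_interconnect(dop: int, cpus_per_node: int, node_count: int) -> bool:
    """A query generates inter-node traffic only if it spans more than one node."""
    return nodes_spanned(dop, cpus_per_node, node_count) > 1


if __name__ == "__main__":
    # Small-to-medium query on a cluster of four 4-CPU nodes: stays on one node.
    print(uses_interconnect(dop=2, cpus_per_node=4, node_count=4))   # False
    # Very large query: spreads over all four nodes.
    print(nodes_spanned(dop=16, cpus_per_node=4, node_count=4))      # 4
```

In this model, a workload made up of many low-parallelism queries never touches the interconnect, which matches the slide's point about multi-user data warehouse workloads.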
18
Recipe for a RAC Linux DW
– Processors
– I/O
– Interconnect
19
Recipe for a RAC Linux DW: Processors
Data warehouse workload determines the total number of CPUs
– Same sizing considerations as a non-clustered DW
How many processors per node?
– Enough CPUs so that a single node can handle most database operations
– Often, 4 CPUs is a good balance
20
Recipe for a RAC Linux DW: I/O
I/O is typically the primary determinant of data warehouse performance
– Storage configurations for a data warehouse should always be chosen based on I/O bandwidth, not storage capacity
Rule of thumb: at least 100 MB/sec of I/O bandwidth per gigahertz of processing power
Every component of the I/O system should provide enough bandwidth: disks, I/O channels, I/O adapters
21
Recipe for a RAC Linux DW: I/O
CPU power and I/O bandwidth should be balanced within a server
– Example: each node has 4 x 2 GHz processors, so each node can utilize at least 800 MB/sec
Each node should have enough slots to accommodate the necessary I/O throughput
– If one host bus adapter drives 150 MB/sec, then 6 HBAs should accommodate the needed I/O bandwidth
– Note that at least one slot is required for the interconnect
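The arithmetic behind this example can be sketched in Python. The helper names (`required_io_bandwidth`, `hbas_needed`, `slots_needed`) are hypothetical. The only inputs taken from the slides are the 100 MB/sec-per-GHz rule of thumb, the 4 x 2 GHz node, the 150 MB/sec HBA, and the one slot reserved for the interconnect.

```python
import math

MB_PER_SEC_PER_GHZ = 100  # rule of thumb from the slides


def required_io_bandwidth(cpus_per_node: int, ghz_per_cpu: float) -> float:
    """I/O bandwidth (MB/sec) one node should be able to consume."""
    return cpus_per_node * ghz_per_cpu * MB_PER_SEC_PER_GHZ


def hbas_needed(required_mb_per_sec: float, hba_mb_per_sec: float) -> int:
    """Smallest number of host bus adapters that covers the required bandwidth."""
    return math.ceil(required_mb_per_sec / hba_mb_per_sec)


def slots_needed(hbas: int, interconnect_slots: int = 1) -> int:
    """Slots for the HBAs plus the slot(s) reserved for the interconnect."""
    return hbas + interconnect_slots


if __name__ == "__main__":
    need = required_io_bandwidth(cpus_per_node=4, ghz_per_cpu=2.0)
    hbas = hbas_needed(need, hba_mb_per_sec=150)
    print(need)                # 800.0
    print(hbas)                # 6
    print(slots_needed(hbas))  # 7
```

Rounding up matters here: five 150 MB/sec adapters deliver only 750 MB/sec, so the sixth is needed to reach the 800 MB/sec target.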
22
Recipe for a RAC Linux DW: Interconnect
Gigabit Ethernet is generally sufficient for data warehouse workloads
– Oracle minimizes interconnect traffic for multi-user workloads
Workloads requiring inter-node parallel query will utilize more interconnect bandwidth
– Options: 10 Gb Ethernet, Fibre Channel, InfiniBand
23
‘Typical’ Cluster configuration
- 16-port switch
- 1 Gigabit Ethernet
- 16 storage arrays, each with 10-20 disks
- 4 nodes, each with 4 x 2 GHz CPUs
- 5 PCI slots
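As a rough cross-check, the 100 MB/sec-per-GHz rule of thumb from the I/O slides can be applied to this configuration. The slides do not perform this calculation themselves. The Python sketch below uses hypothetical function names and assumes the load is spread evenly over all arrays and disks.

```python
MB_PER_SEC_PER_GHZ = 100  # rule of thumb from the I/O slides


def cluster_io_target(nodes: int, cpus_per_node: int, ghz_per_cpu: float) -> float:
    """Aggregate I/O bandwidth (MB/sec) the whole cluster could consume."""
    return nodes * cpus_per_node * ghz_per_cpu * MB_PER_SEC_PER_GHZ


def per_array_target(cluster_mb_per_sec: float, arrays: int) -> float:
    """Bandwidth each storage array must deliver, assuming an even spread."""
    return cluster_mb_per_sec / arrays


def per_disk_target(array_mb_per_sec: float, disks_per_array: int) -> float:
    """Bandwidth each disk must deliver, assuming an even spread within an array."""
    return array_mb_per_sec / disks_per_array


if __name__ == "__main__":
    total = cluster_io_target(nodes=4, cpus_per_node=4, ghz_per_cpu=2.0)
    per_array = per_array_target(total, arrays=16)
    print(total)      # 3200.0
    print(per_array)  # 200.0
    for disks in (10, 20):
        # 10 disks per array -> 20.0 MB/sec per disk; 20 disks -> 10.0
        print(disks, per_disk_target(per_array, disks))
```

Under these assumptions, the configuration targets about 3,200 MB/sec in total, or roughly 200 MB/sec per array.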
24
Oracle Linux/RAC DW Customers
25
RAC/Linux DW Customers
Euronext
– Database size: 1.5 TB
– Hardware: 2 x HP DL580 (4 CPUs)
– Storage: HP MSA 1000
– Interconnect: 1 Gb Ethernet
– OS: Red Hat
AOK Berlin
– Database size: 780 GB
– Hardware: 2 x HP DL580 (4 CPUs)
– Storage: EMC Symmetrix
– Interconnect: 2 x 1 Gb Ethernet
– OS: SuSE
Vanderbilt University
– Database size: 50 GB
– Hardware: 3 x HP DL580 (4 CPUs)
– Storage: EMC Symmetrix
– Interconnect: 1 Gb Ethernet
– OS: Red Hat
National Bank AG
– Database size: 75 GB
– Hardware: 3 x IBM Express5800 (2 CPUs)
– Interconnect: 100 Mb Ethernet
– OS: SuSE
Ellis Island Foundation
– Database size: 60 GB
– Hardware: 2 x HP DL580 (4 CPUs)
– Storage: NetApp
– Interconnect: 1 Gb Ethernet
– OS: Red Hat
26
Euronext
– Data warehouse supporting the Euronext options exchange
– Oracle9i Release 2 (2-node RAC)
– HP DL580 G2 with 4 CPUs each
– HP MSA1000 storage arrays
27
AOK Berlin
780 GB production Linux DW – Technical Overview
– Uses RAC for scalability, consolidation
– Oracle9i Release 2 (2-node RAC)
  - New implementation
– HP DL580 G2 with 4 CPUs each
  - 2 x 1 Gb Ethernet interconnect
– Linux O/S (SuSE)
– Oracle Cluster Manager, Oracle Cluster File System
– EMC Symmetrix
– Plan to grow to 1 TB+
  - Additional plans for a 2-node, 2 TB test system
28
National Bank AG
75 GB production Linux DW – Technical Overview
– Customer portfolio management system
– Oracle9i Release 2 (3-node RAC)
  - New implementation
– IBM Express5800/120Me BULL with 2 CPUs, 6 GB each
  - 100 Mb Ethernet interconnect
– Linux O/S (SuSE)
– Oracle Cluster Manager + OCFS
– Fiber-connected storage
29
Dell Global IT
35 GB production Linux DW – Technical Overview
– 500 users
– Datamart consolidation
  - Single-instance upgrade
  - Starting by moving Unix datamarts into a single instance
– Oracle9i Release 1 (2-node RAC)
– Dell servers
– Linux O/S (Red Hat Advanced Server)
– Oracle Cluster Manager
– EMC CLARiiON storage
30
Vanderbilt University
50 GB production Linux DW – Technical Overview
– 1000 users
– Oracle9i Release 2 (3-node RAC)
  - Single-instance upgrade
– Query and Reporting for GL and Labor Data
  - Mixed workload, primarily query
  - BusinessObjects is the query tool
– HP ProLiant DL580 with 4 x 2 GHz CPUs + 8 GB RAM each
  - 1 Gb Ethernet interconnect
– Moved from HP-UX to Linux
– Linux O/S (Red Hat Advanced Server)
– Oracle Cluster Manager
– EMC Symmetrix
31
Ellis Island Foundation
60 GB production Linux DW – Technical Overview
– 1000 users
– Immigration records for the 22 million people who entered America through the port of New York and Ellis Island from 1892 to 1924
– Oracle9i Release 1 (2-node RAC)
  - Single-instance upgrade
– HP ProLiant DL580 with 4 CPUs each
  - 1 Gb interconnect
– Linux O/S (Red Hat Advanced Server)
– Oracle Cluster Manager
– NetApp Filer storage
32
eachnet.com Network Information Service
20 GB production Linux DW – Technical Overview
– The largest C2C website in China, with more than 1 million users
– Oracle9i Release 2 (2-node RAC)
  - Upgrade from single-instance Oracle8i
– Dell PowerEdge 6650 with 4 CPUs each
  - 1 Gb Ethernet interconnect
– Moved from HP-UX to Linux
– Linux O/S (Red Hat Advanced Server)
– Oracle Cluster Manager
– EDI Technology EDI 3500 high-availability disk array
33
Linux-RAC and the Grid
34
Evolution of Business Intelligence with Oracle
– An increasingly common customer theme is “provisioning”
– Customers want more value out of their hardware expenditures: they want to take advantage of unused capacity
– Oracle’s architecture is unique in being able to truly support flexible provisioning of processing power across multiple databases
– Oracle will be widely deployed in large commercial computing “grids” in the future
35
Real Application Clusters
– ETL processing, Query & Reporting, Data Mining and Scoring, Cube Creation and OLAP Analysis
– Order Entry, Shipments, Procurement, Inventory, …
36
Resource Provisioning
December: Order Processing Heavy – Analytics Light
– ETL processing, Query & Reporting, Data Mining, …
– Order Entry, Shipments, Procurement, Inventory, …
37
Resource Provisioning
January: Order Processing Light – Heavy Analytics
– Order Entry, Shipments, Procurement, Inventory, …
– ETL processing, Query & Reporting, Data Mining and Scoring, Cube Creation and OLAP Analysis
38
Oracle RAC Brings Flexible Processing Power to Databases on the Grid
39
Next Steps … Data Warehousing DB Sessions
Monday and Tuesday sessions:
– 11:00 AM, #40153, Room 304: Oracle Warehouse Builder: New Oracle Database 10g Release
– 3:30 PM, #40176, Room 303: Security and the Data Warehouse
– 4:00 PM, #40166, Room 130: Oracle Database 10g SQL Model Clause
– 8:30 AM, #40125, Room 130: Oracle Database 10g: A Spatial VLDB Case Study
– 3:30 PM, #40177, Room 303: Building a Terabyte Data Warehouse, Using Linux and RAC
– 5:00 PM, #40043, Room 104: Data Pump in Oracle Database 10g: Foundation for Ultrahigh-Speed Data Movement
For More Info On Oracle BI/DW Go To http://otn.oracle.com/products/bi/db/dbbi.html
40
Next Steps … Data Warehousing DB Sessions
Thursday sessions:
– 8:30 AM, #40179, Room 304: Oracle Database 10g Data Warehouse Backup and Recovery
– 11:00 AM, #36782, Room 304: Experiences with Real-Time Data Warehousing using Oracle 10g
– 1:00 PM, #40150, Room 102: Turbocharge your Database, Using the Oracle Database 10g SQLAccess Advisor
Business Intelligence and Data Warehousing Demos, All Four Days In The Oracle Demo Campground:
– Oracle Database 10g
– Oracle OLAP
– Oracle Data Mining
– Oracle Warehouse Builder
– Oracle Application Server 10g
For More Info On Oracle BI/DW Go To http://otn.oracle.com/products/bi/db/dbbi.html
41
Reminder – please complete the OracleWorld online session survey.
Thank you.