1 Grid Computing Meets the Database. Chris Smith, Platform Computing. Session #36686

2 © Platform Computing Inc. 2003 "The best thing about the Grid is that it is unstoppable." The Economist, June 21, 2001

3 What is Grid computing?
Grid: transparent, secure and coordinated computing resource sharing across geographically disparate sites

4 Benefits of Grid Computing
Grid technology is used to aggregate computing resources across the entire organization, regardless of location or business unit.
- Provides virtually unlimited computing capacity
- Delivers reliable, “always-on” computing infrastructure
- Virtualizes IT infrastructure for end-users
- Coordinates the usage of heterogeneous computing resources in order to accomplish business processing tasks

5 Example Use Cases
- Batch Process Automation
- Multi-Site Capacity Computing
- Service Virtualization

6 Batch Process Automation

7 What is Platform JobScheduler?
Intelligent batch process automation
- Grid-enabled enterprise batch process automation software
- Provides a graphical Design Studio and management console to design and control the scheduling of Oracle jobs and compute jobs with various dependencies (line-of-business processes) across a virtualized environment

8 Simplified Scheduling Environment for Oracle jobs and Compute jobs
- Single point of control to design and monitor: job events, file events, time events
- Central repository for storing/sharing jobs: business flows, sub flows, proxy dependencies
- Consistent, flexible and extensible
- Automated exception handling: re-running jobs, killing jobs, triggering other jobs

9 More Efficient Use of Computing Resources for Oracle jobs and Compute jobs
- Resource virtualization
- Ensures the reliability of mission-critical business flows and always-on availability of resources
- Provision additional databases for specific tasks across time
- Matching demand for resources with the supply of resources

10 JobScheduler Architecture (diagram)
Labels: Client (Load XML, Save XML); Jobflow Server (Process Designing/Control; Scheduling on time, job, file and other events; Log); Grid-Enabled Application Execution Infrastructure (Grid Master & Grid Agents); Oracle Database

11 JobScheduler and Oracle scheduler integration (diagram)
Labels: Platform JobScheduler client, Platform JobScheduler server, LSF Cluster, LSF Master host, LSF host, orajobstart, Oracle client, Oracle instance, B, C, elim.oracle.B, elim.oracle.C; numbered steps 1 to 4

12 ETL using Platform JobScheduler
A common use of the Platform JobScheduler and Oracle scheduler integration is for ETL into a data warehouse.
Example: a brokerage firm wants to load the day’s trading data into their data warehouse for analysis (e.g. risk positions, trending, etc.)
- ETL flow is triggered by:
  - Time of day event
  - Arrival of market data in flat-file format
  - Completion of a stored procedure which collects location brokerage data
- Data is cleansed and loaded with SQL*Loader into the database
- Stored procedures are invoked which do some analysis and initial reporting
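
The trigger logic on this slide can be sketched as a small event gate. This is an illustration only, not Platform JobScheduler's API: the event names, the step names and the `EtlFlow` class are hypothetical, and the sketch assumes all three triggers must fire before the flow starts (the slide does not say whether they combine with AND or OR).

```python
# Hypothetical event and step names; none of these come from Platform JobScheduler.
REQUIRED_EVENTS = frozenset({
    "time_of_day",           # scheduled time reached
    "market_file_arrived",   # flat file with market data has landed
    "collection_proc_done",  # stored procedure that collects brokerage data finished
})
STEPS = ["cleanse_data", "load_with_sqlldr", "run_analysis_procedures"]


class EtlFlow:
    """Starts the ETL steps once every required event has been seen (assumed AND)."""

    def __init__(self):
        self.seen = set()
        self.executed = []

    def notify(self, event):
        """Record an event and return True once the flow has run."""
        if event not in REQUIRED_EVENTS:
            raise ValueError(f"unknown event: {event}")
        self.seen.add(event)
        if self.seen == REQUIRED_EVENTS and not self.executed:
            # A real flow would launch jobs here; the sketch only records the order.
            self.executed = list(STEPS)
        return bool(self.executed)
```

Calling `notify` once per event leaves `executed` empty until the third distinct event arrives, which mirrors a flow that waits on time, file and job dependencies together.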

14 Multi-Site Capacity Computing

15 Increasing Computing Capacity with Platform MultiCluster
- A parameter space study is done on tens of thousands of individual sets of parameters, resulting in tens of thousands of analysis jobs
- Local cluster doesn’t have enough capacity, so Platform MultiCluster is used to allow the forwarding of analysis jobs to clusters located at other sites of the organization
- The DBMS_STREAMS_ADM.MAINTAIN_TABLESPACES procedure provided with Oracle Database 10g is used to replicate input data for the analysis at the remote site
- Database-aware scheduling is used to make intelligent decisions about which sites are suitable for receiving jobs

16 Platform MultiCluster Job Forwarding Model (diagram)
Labels: Site A, Site B, Compute Servers (at each site), Send queue, Receive queue
You submit. We do:
- Job transfer
- Data staging
- Account mapping
- Accounting

17 Enterprise Grid Architecture

18 Workload driven data management
Diagram labels: Application, Pre-exec script, Master molecular database (MOL), Tablespaces for MOL, Streams maintained version of MOL, Tablespaces for MOL, Streams DML updates
1. Job forwarded
2. Run pre-exec
3. Connect to MOL and run MAINTAIN_TABLESPACES
4. MOL metadata and tablespaces transferred
5. pre-exec finished
6. Job is run
7. Job uses copy
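
The numbered steps above can be sketched as a pre-execution wrapper. This is an illustration under stated assumptions: `local_copy_exists` and `replicate` are hypothetical callables standing in for the Oracle work (step 3 runs MAINTAIN_TABLESPACES against MOL; no Oracle call is made here). The sketch relies on the LSF convention that a job starts only when its pre-execution command exits with status 0.

```python
def pre_exec(dataset, local_copy_exists, replicate):
    """Return a process exit status: 0 lets the job run, non-zero holds it back.

    local_copy_exists and replicate are hypothetical callables supplied by the
    caller; they stand in for the database check and the tablespace transfer.
    """
    if local_copy_exists(dataset):
        return 0  # a copy is already present, nothing to transfer
    try:
        replicate(dataset)  # steps 3 and 4: connect to the master and copy the data
    except Exception:
        return 1  # transfer failed, so the job must not start
    # Step 5: report success only if the copy is actually visible afterwards.
    return 0 if local_copy_exists(dataset) else 1


# In-memory stand-in for the remote site's set of replicated datasets.
copies = set()
status = pre_exec("MOL", copies.__contains__, copies.add)
```

Passing the check and the transfer in as callables keeps the ordering logic testable without a database; a real script would replace them with calls into the site's own tooling.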

19 Database aware scheduling
Diagram labels: Site 1, Site 2, Site 3, MOL, MOL2, Data Management Service
Data Management Service cache: Site 1: MOL, MOL2; Site 2: (none); Site 3: MOL
1. Poll for datasets
2. Update cache info
3. bsub -extsched MOL
4. Local site is overloaded. Database aware scheduler plug-in decides to forward the job to site 3, since it has the MOL database
5. Job forwarded to site 3
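
The forwarding decision in step 4 can be sketched as a lookup against the cached dataset map. This is not the scheduler plug-in's actual logic: the function name, the site identifiers and the tie-breaking rule (first eligible site in sorted order) are assumptions, as is treating Site 1 as the local, overloaded site.

```python
def choose_site(dataset, cache, local_site, overloaded):
    """Pick a site that holds `dataset` and is not overloaded, preferring the local site.

    cache maps site name -> set of dataset names (the polled cache info).
    Returns None when no suitable site exists.
    """
    if local_site not in overloaded and dataset in cache.get(local_site, set()):
        return local_site
    for site in sorted(cache):
        if site == local_site or site in overloaded:
            continue
        if dataset in cache[site]:
            return site  # forward the job here
    return None


# Cache contents as listed on the slide.
cache = {"site1": {"MOL", "MOL2"}, "site2": set(), "site3": {"MOL"}}
target = choose_site("MOL", cache, local_site="site1", overloaded={"site1"})
```

With the slide's cache, a job requesting MOL from an overloaded Site 1 is sent to Site 3, while a job requesting MOL2 has no alternative site and the function returns `None`.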

20 Service Virtualization

21 Demo Lab Hardware: A Common Web Service/Application Environment (diagram)
Labels: Public network, CISCO Hardware Load Balancer, Web Server & App Server nodes (Linux), Web Interconnect network, Oracle RAC nodes (Linux AS 2.1), Storage network, NAS/SAN

22 Oracle RAC Provisioning Demo System (diagram)
Labels: Provisioner; Web Layer/Nodes (Linux); Application Layer/Nodes (Linux); RAC Layer/Nodes (Linux AS 2.1); managed nodes Node5, Node6, Node8; RAC managed cluster Node1 … Node4; Agent Manager; Service Agent; App Agent; RAC Agent; Web Server instances; App Server instances; Application instances; RAC instances

23 Proof of Concept Demos
- Dynamic provisioning within the Database Layer
- Dynamic provisioning across the Database and Application Layers

24 Provisioning Within DB Layer
Diagram labels: Web Layer, App Layer, RAC Layer, Node, Web Server, App Server, dbHR, dbFinance
- Show one RAC node running dbFinance, two RAC nodes running dbHR, and one idle RAC node
- Generate heavy data access to dbFinance and light data access to dbHR
- Without dynamic provisioning, the response time of dbFinance is very slow while other RAC nodes are idle
- With dynamic provisioning, the idle node is added to dbFinance, and one dbHR node is shut down and moved to dbFinance
- The response time of dbFinance improves

25 Provisioning Across DB & App Layers
Diagram labels: Web Layer, App Layer, RAC Layer, Node, Web Server, App Server, dbHR, dbFinance
- Show one RAC node running dbFinance, one RAC node running dbHR, and two idle RAC nodes
- Many applications need to run on the App Layer
- Without dynamic provisioning, the response time of the App Layer is very slow while some RAC nodes are idle
- With dynamic provisioning, some applications run on the two idle RAC nodes
- The response time of the App Layer improves
- When data accesses to dbFinance arrive, more database instances are needed
- Applications on the RAC nodes are gracefully preempted, and two more dbFinance instances are started

26 RAC Agent
Gathers metrics:
- numInstances: instances in a given database
- instanceState: operation state of an instance
- dbLoad: various load metrics from a database (User Calls, Recursive Calls, Physical Reads, Physical Writes, Consistent Gets, DB Block Gets)
Takes actions:
- startInstance: start an instance on a candidate
- stopInstance: stop an instance on a candidate
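
One way to turn such counters into a load figure is to difference two samples. The slide does not say how the RAC Agent derives dbLoad, so this is only a sketch: the `DbLoadSample` type and `calls_per_second` function are hypothetical, and the sketch assumes the counters are cumulative, as Oracle's system statistics (for example in V$SYSSTAT) are.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DbLoadSample:
    """One reading of cumulative counters; field names are illustrative."""
    timestamp: float       # seconds
    user_calls: int
    physical_reads: int
    physical_writes: int


def calls_per_second(previous, current):
    """Average user-call rate between two samples of a cumulative counter."""
    elapsed = current.timestamp - previous.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return (current.user_calls - previous.user_calls) / elapsed


earlier = DbLoadSample(timestamp=0.0, user_calls=1000, physical_reads=0, physical_writes=0)
later = DbLoadSample(timestamp=10.0, user_calls=1500, physical_reads=0, physical_writes=0)
rate = calls_per_second(earlier, later)  # 500 calls over 10 seconds
```

The same differencing applies to the other counters; a rate is easier to compare against a threshold than a raw total that grows for the life of the instance.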

27 Policy Functions
- Discover State of System: what is the current state of the candidates?
- Database High Load: if a candidate is free, start an instance of the loaded database.
- Database Low Load: if a candidate was added, shut down the database instance on the candidate.
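
The two load policies can be sketched as a single decision function. The thresholds, the function name and the choice to stop the most recently added candidate first are assumptions made for illustration; only the action names `startInstance` and `stopInstance` come from the RAC Agent slide.

```python
def decide(load, high, low, free_candidates, added_candidates):
    """Return an (action, host) pair, or None when nothing should change.

    load, high and low are in whatever unit the load metric uses (hypothetical).
    free_candidates and added_candidates are lists of host names.
    """
    if load > high and free_candidates:
        # High load: start an instance of the loaded database on a free candidate.
        return ("startInstance", free_candidates[0])
    if load < low and added_candidates:
        # Low load: shut down the instance on a candidate that was added earlier.
        return ("stopInstance", added_candidates[-1])
    return None
```

Keeping a gap between `high` and `low` avoids starting and stopping the same instance on every polling cycle when the load hovers near a single threshold.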

28 Scenario 1: Results
Discovery
- Discover that pe02 and pe03 are free.
High Load
- Detect high load on the HR database.
- Have a candidate free.
- Remove candidate from free host list.
- Start another instance of the HR database.
- Add the candidate to the list of HR instances.

29 Scenario 1: Results Continued
High Load
- Add the remaining candidate to the HR instances.
Low Load
- Detect low load on the HR database.
- Detect that candidate hosts are in use.
- Remove the last added candidate from the list of HR instances.
- Stop the HR instance on the candidate.
- Return the candidate to the list of free hosts.
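
The list bookkeeping in Scenario 1 can be traced with a small in-memory model. The `HostPool` class is hypothetical and starts or stops nothing; it only moves host names between the free list and the list of candidates added to HR, in the order the slides describe.

```python
class HostPool:
    """Tracks free candidate hosts and the candidates added to the HR database."""

    def __init__(self, free_hosts):
        self.free = list(free_hosts)
        self.hr_instances = []  # only candidates added by the policy, in order added

    def on_high_load(self):
        """Move one free candidate into the HR instance list; return it or None."""
        if not self.free:
            return None
        host = self.free.pop(0)
        self.hr_instances.append(host)
        return host

    def on_low_load(self):
        """Release the most recently added candidate back to the free list."""
        if not self.hr_instances:
            return None
        host = self.hr_instances.pop()
        self.free.append(host)
        return host


pool = HostPool(["pe02", "pe03"])   # discovery: pe02 and pe03 are free
first = pool.on_high_load()         # first high-load event
second = pool.on_high_load()        # second high-load event uses the remaining candidate
released = pool.on_low_load()       # low load releases the last added candidate
```

After this trace, one candidate remains in the HR instance list and the other is back on the free list.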

30 Questions?

