Published by Julia Hammond. Modified over 11 years ago.
1
Economics Paradigm for “Resource Management and Scheduling” for Service Oriented P2P/Grid Computing
Rajkumar Buyya, Melbourne, Australia
3
Need Honest Answers!
I want access to your Grid resources, and I want to know how many of you are willing to give it to me in each of the following cases:
- I am unable to give you access to our Australian machines, but I want access to yours! [social]
  - I want to solve academic problems
  - I want to solve business problems
- I am willing to gift you kangaroos! [bartering]
- I am willing to give you access to my machines, if you want (sharing, but no measure & no QoS). [bartering]
- I am willing to pay you dollars on a usage basis. [economic incentive, market-based, and QoS]
4
Overview
- A quick glance at today’s Grid computing
- Resource management challenges for next-generation Grid computing
- A glance at approaches to Grid computing
- Grid Architecture for Computational Economy
- Economy Grid = Globus + GRACE
- Nimrod-G: Grid Resource Broker
- Scheduling experiments
- Case study: drug design application on the Grid
- Conclusions
5
Scalable HPC: Breaking Administrative Barriers & new challenges
[Figure: performance plotted against administrative barriers. Administrative barriers: Individual, Group, Department, Campus, State, National, Globe, Inter Planet, Universe. Platforms: Desktop, SMPs or supercomputers, local cluster, enterprise cluster/Grid, global cluster/Grid, inter-planetary Grid!]
6
Why Grids? Large Scale Explorations need them—Killer Applications.
Solving grand challenge applications using modeling, simulation and analysis: Aerospace, Internet & Ecommerce, Life Sciences, CAD/CAM, Digital Biology, Military Applications.
8
What is Grid?
An infrastructure that logically couples distributed resources:
- Computers: PCs, workstations, clusters, supercomputers, laptops, notebooks, mobile devices, PDAs, etc.
- Software: e.g., ASPs renting expensive special-purpose applications on demand
- Catalogued data and databases: e.g., transparent access to the human genome database
- Special devices: e.g., a radio telescope searching for life in the galaxy
- People/collaborators
and presents them as an integrated global resource. It enables the creation of virtual enterprises (VEs) for resource sharing.
[Figure label: wide area data archives.]
9
P2P/Grid Applications-Drivers
- Distributed HPC (supercomputing): computational science.
- High-capacity/throughput computing: large-scale simulation/chip design & parameter studies.
- Content sharing (free or paid): sharing digital contents among peers (e.g., Napster).
- Remote software access/renting services: application service providers (ASPs) & Web services.
- Data-intensive computing: virtual drug design, particle physics, stock prediction...
- On-demand, realtime computing: medical instrumentation & mission critical.
- Collaborative computing: collaborative design, data exploration, education.
- Service Oriented Computing (SOC): computing as utility; new paradigm and new industries.
10
Building and Using Grids require
- Services that make our systems Grid Ready!
- Security mechanisms that permit resources to be accessed only by authorized users.
- (New) programming tools that make our applications Grid Ready!
- Tools that can translate the requirements of an application/user into the requirements of computers, networks, and storage.
- Tools that perform resource discovery, trading, selection/allocation, scheduling and distribution of jobs, and collect results.
Globus ?
11
Players in Grid Computing
12
What users want ? Users in Grid Economy & Strategy
Grid Consumers
- Execute jobs for solving problems of varying size and complexity
- Benefit by selecting and aggregating resources wisely
- Trade off timeframe and cost
- Strategy: minimise expenses
Grid Providers
- Contribute “idle” resources for executing consumer jobs
- Benefit by maximizing resource utilisation
- Trade off local requirements & market opportunity
- Strategy: maximise return on investment
13
Challenges for Next Generation Grid Technology Development
14
Challenges for Grid Computing
- Uniform Access
- Security
- System Management
- Computational Economy
- Resource Discovery
- Resource Allocation & Scheduling
- Data Locality
- Network Management
- Application Development Tools
15
Sources of Complexity in Resource Management for World Wide Grid Computing
- Size (large number of nodes, providers, consumers)
- Heterogeneity of resources (PCs, workstations, clusters, supercomputers, instruments, databases, software)
- Heterogeneity of fabric management systems (single system image OS, queuing systems, etc.)
- Heterogeneity of fabric management policies
- Heterogeneity of application requirements (CPU, I/O, memory, and/or network intensive)
- Heterogeneity in resource demand patterns (peak, off-peak, ...)
- Applications need different QoS at different times (time-critical results). The utility of experimental results varies from time to time.
- Geographical distribution of users, located in different time zones
- Differing goals (producers and consumers have different objectives and strategies)
- Insecure and unreliable environment
16
Traditional approaches to resource management & scheduling are NOT useful for Grid ?
- They use a centralised policy that needs complete state information and a common fabric management policy, or a decentralised consensus-based policy.
- Because the Grid has too many heterogeneous parameters, it is impossible to define or obtain:
  - a system-wide performance matrix, and
  - a common fabric management policy that is acceptable to all.
- The “economics” paradigm has proved to be an effective institution for managing the decentralization and heterogeneity present in human economies!
  - Fall of the USSR & emergence of the US as world superpower! (monopoly?)
- So, we propose/advocate the use of computational economics principles in the management of resources and the scheduling of computations on the world wide Grid.
- A “think locally and act globally” approach to Grid computing!
17
Benefits of Computational Economies
- It provides a nice paradigm for managing self-interested and self-regulating entities (resource owners and consumers).
- It helps in regulating supply-and-demand for resources. Services can be priced in such a way that equilibrium is maintained.
- User-centric / utility driven: value for money!
- Scalable: no need for a central coordinator (during negotiation). Resources (sellers) and users (buyers) can make their own decisions and try to maximize utility and profit.
- Adaptable.
- It helps in offering different QoS (quality of service) to different applications depending on the value users place on them.
- It improves the utilisation of resources.
- It offers an incentive for resource owners to be part of the Grid!
- It offers an incentive for resource consumers to be good citizens.
- There is a large body of proven economic principles and techniques available, and we can easily leverage it.
18
New challenges of Computational Economy
Resource Owners
- How do I decide prices? (economic models?)
- How do I specify them?
- How do I enforce them?
- How do I advertise & attract consumers?
- How do I do accounting and handle payments?
- …..
Resource Consumers
- How do I decide expenses?
- How do I express QoS requirements?
- How do I trade off timeframe & cost?
- ….
Are any tools, traders & brokers available to automate the process?
19
Grid Computing Approaches
[Figure: Grid computing approaches. Labels: mix-and-match, object-oriented, Internet/partial-P2P, network enabled solvers (NetSolve), market/computational economy (Nimrod-G).]
20
Many Grid Projects & Initiatives
- Australia: Economy Grid, Nimrod-G, Virtual Lab, Active Sheets, DISCWorld, ..new coming up
- Europe: UNICORE, MOL, Lecce GRB, Poland MC Broker, EU Data Grid, EuroGrid, MetaMPI, Dutch DAS, XW, JaWS and many more...
- Japan: Ninf, DataFarm
- USA: Globus, Legion, Javelin, AppLeS, NASA IPG, Condor, Harness, NetSolve, AccessGrid, GrADS and many more...
- Cycle stealing & .com initiatives: Distributed.net, …., Entropia, UD, Parabon, ….
- Public forums: Global Grid Forum, P2P Working Group, IEEE TFCC, Grid & CCGrid conferences
21
Many Testbeds ? & who pays ?, who regulates supply and demand ?
[Figure: testbeds shown: GUSTO (decommissioned), World Wide Grid, Legion Testbed, NASA IPG.]
22
Testbeds so far -- observations
Who contributed resources & why?
- Volunteers: for fun, challenge, fame, charismatic apps, public good like distributed.net & projects.
- Collaborators: sharing resources while developing new technologies of common interest: Globus, Legion, Ninf, MC Broker, Lecce GRB,... Unless you know the lab leaders, it is impossible to get access!
How long?
- Short term: excitement is lost, too much admin overhead (Globus inst+), no incentive, policy change,…
What do we need? A Grid Marketplace!
- Regulates supply-and-demand, offers an incentive for being a player, simple, scalable solution, quasi-deterministic: a proven model in the real world.
23
Building an Economy Grid (Next Generation Grid Computing!)
To enable the creation and promotion of:
- Grid Marketplace (competitive)
- ASP
- Service Oriented Computing
- . . .
And let users focus on their own work (science, engineering, or commerce)!
24
GRACE: A Reference Grid Architecture for Computational Economy
[Figure: GRACE architecture. Groups: Grid User (Application), Grid Resource Broker, Grid Middleware Services, Grid Service Providers (Grid Node 1 … Grid Node N, resources R1, R2 … Rm). Component labels: Grid Explorer, Schedule Advisor, Trade Manager, Job Control Agent, Deployment Agent, Trade Server, Pricing Algorithms, Resource Allocation, Resource Reservation, Accounting, Sign-on, Info, Secure, QoS, Trading, JobExec, Storage, Misc. services, Information Server(s), Health Monitor, Grid Market Services, Grid Bank.]
See PDPTA 2000 paper!
25
Economic Models
Price-based: supply, demand, value, wealth of economic system
- Commodity Market Model
- Posted Price Model
- Bargaining Model
- Tendering (Contract Net) Model
- Auction Model: English, first-price sealed-bid, second-price sealed-bid (Vickrey), and Dutch (consumer: low, high, rate; producer: high, low, rate)
- Proportional Resource Sharing Model
- Monopoly (one provider) and Oligopoly (few players): consumers may not have any influence on prices.
- Bartering
- Shareholder Model
- Partnership Model
See SPIE ITCom 2001 paper (with Heinz Stockinger, CERN)!
26
Grid Open Trading Protocols
[Figure: message exchange between the Trade Manager and the Trade Server. Other labels: Pricing Rules, API.]
Messages:
- Get Connected
- Call for Bid(DT)
- Reply to Bid(DT)
- Negotiate Deal(DT) ….
- Confirm Deal(DT, Y/N)
- Cancel Deal(DT)
- Change Deal(DT)
- Get Disconnected
DT - Deal Template:
- resource requirements (BM)
- resource profile (BS)
- price (any one can set)
- status
  - change the above values
  - negotiation can continue
  - accept/decline
- validity period
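The Deal Template can be pictured as a small record that both sides pass back and forth. The sketch below is only an illustration under stated assumptions: the class, field, and function names are hypothetical, and the slide does not define a concrete data format or expand the (BM) and (BS) tags.

```python
from dataclasses import dataclass, replace
from enum import Enum


class DealStatus(Enum):
    """Status values taken from the slide's 'status' sub-items."""
    NEGOTIATING = "negotiation can continue"
    ACCEPTED = "accept"
    DECLINED = "decline"


@dataclass(frozen=True)
class DealTemplate:
    """Hypothetical Deal Template (DT); field names are illustrative only."""
    resource_requirements: str  # tagged (BM) on the slide
    resource_profile: str       # tagged (BS) on the slide
    price: float                # either party may set it
    status: DealStatus
    validity_period_s: int      # how long the offer stands (unit is an assumption)


def counter_offer(dt: DealTemplate, new_price: float) -> DealTemplate:
    """Return a copy with a changed price that is still open to negotiation."""
    return replace(dt, price=new_price, status=DealStatus.NEGOTIATING)


def confirm(dt: DealTemplate, accept: bool) -> DealTemplate:
    """Model Confirm Deal(DT, Y/N) by fixing the final status."""
    status = DealStatus.ACCEPTED if accept else DealStatus.DECLINED
    return replace(dt, status=status)
```

Keeping the record immutable means every negotiation round produces a new template, which makes it easy to log the sequence of offers.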
27
Grid Components
[Figure: layered Grid components, top to bottom.]
- Applications and Portals (Grid Apps.): Scientific, Engineering, Collaboration, Prob. Solving Env., Web enabled Apps, …
- Development Environments and Tools (Grid Tools): Languages, Libraries, Debuggers, Monitoring, Resource Brokers, Web tools, …
- Distributed Resources Coupling Services (Grid Middleware): Security, Information, Process, QoS, Resource Trading, Market Info, …
- Local Resource Managers (Grid Fabric): Operating Systems, Queuing Systems, Libraries & App Kernels, TCP/IP & UDP, …
- Networked Resources across Organisations (Grid Fabric): Computers, Clusters, Storage Systems, Data Sources, Scientific Instruments, …
28
Economy Grid = Globus + GRACE
[Figure: Economy Grid layers, top to bottom.]
- Applications (Grid Apps.): Science, Engineering, Commerce, Portals, ActiveSheet, …
- High-level Services and Tools (Grid Tools): Grid Status, DUROC, MPI-G, CC++, Nimrod/G, globusrun, …
- Core Services (Grid Middleware): Heartbeat Monitor, Nexus, GRAM, GRACE-TS, Globus Security Interface, MDS, GASS, DUROC, GARA, GMD, GBank
- Local Services (Grid Fabric): Condor, GRD, QBank, JVM, TCP, UDP, LSF, PBS, eCash, Linux, Irix, Solaris
See IPDPS HWC 2001 paper!
29
GRACE components
- A resource broker (e.g., Nimrod/G)
- Various resource trading protocols for different economic models
- A mediator for negotiating between users and grid service providers (Grid Market Directory)
- A deal template for specifying resource requirements and service offers
- Grid Trading Server
- Pricing policy specification
- Accounting (e.g., QBank) and payment management (GridBank, not yet implemented)
30
Pricing, Accounting, Allocations and Job Scheduling at each site/Grid level
[Figure labels: Pricing Policy, Trade Server (TS), QBank (QB), Site 1, Resource Manager (IBM-LL/PBS/….), Compute Resources (clusters/SGI/SP/...), GRID Bank (digital transactions). Numbered arrows refer to the steps below.]
0. Make deposits, transfers, refunds, queries/reports.
1. The client negotiates the access cost.
2. Negotiation is performed per owner-defined policies.
3. If the client is happy, the TS informs QB about the access deal.
4. The job is submitted.
5. Check with QB for the “go ahead”.
6. The job starts.
7. The job completes.
8. Inform QB about resource utilization.
31
Service Items to be Charged
- CPU: user and system time
- Memory:
  - maximum resident set size, page size
  - amount of memory used
  - page faults: with/without physical I/O
- Storage: size, r/w/block I/O operations
- Network: msgs sent/received
- Signals received, context switches
- Software and libraries accessed
- Data sources (e.g., Protein Data Bank)
32
How to decide Price ?
- Fixed price model (like today’s Internet)
- Dynamic/demand and supply (like tomorrow’s Internet)
- Usage period
- Loyalty of customers (like airlines favoring frequent flyers!)
- Historical data
- Advance agreement (high discount for corporations)
- Usage timing (peak, off-peak, lunch time)
- Calendar based (holiday/vacation period)
- Bulk purchase (register 100 .com domains at once!)
- Voting: trade unions decide pricing structure
- Resource capability as benchmarked in the market!
- Academic R&D/public-good application users can be offered a cheaper rate than commercial users.
- Customer type: quality- or price-sensitive buyers.
- Can be prescribed by regulating (Govt.) authorities
33
Payments- Options & Automation
- Buy credits in advance, or GSPs bill the user later (“pay as you go”).
- Pay by electronic currency via Grid Bank: NetCash (anonymity), NetCheque, and Paypal.
- NetCheque: users register with NC accounting servers and can write electronic cheques and send them (e.g ). When deposited, the balance is transferred from the sender’s to the receiver’s account.
- NetCash: it supports anonymity and uses the NetCheque system to clear payments between currency servers.
- Paypal.com: account+ is linked to a credit card. Enter the recipient’s address and the amount you wish to request. The recipient gets a notification and pays you at
34
Nimrod-G: The Grid Resource Broker
Soft Deadline and Budget-based Economy Grid Resource Broker for Parameter Processing on P2P Grids
35
Parametric Computing (What Users think of Nimrod Power)
[Figure: parameters feed a “Magic Engine” that performs multiple runs of the same program on multiple data.]
Killer application for the Grid!
See IPDPS 2000 paper!
Courtesy: Anand Natrajan, University of Virginia
36
P-study Applications -- Characteristics
- Code (single program: sequential or threaded)
- High resource requirements
- Long-running instances
- Numerous instances (multiple data)
- High computation-to-communication ratio
- Embarrassingly/pleasantly parallel
37
Sample P-Sweep Applications
- Bioinformatics: drug design / protein modelling
- Combinatorial optimization: meta-heuristic parameter estimation
- Ecological modelling: control strategies for cattle tick
- Sensitivity experiments on smog formation
- Data mining
- High energy physics: searching for rare events
- Electronic CAD: field programmable gate arrays
- Computer graphics: ray tracing
- Finance: investment risk analysis
- VLSI design: SPICE simulations
- Civil engineering: building design
- Automobile: crash simulation
- Network simulation
- Aerospace: wing design
- Astrophysics
38
Thesis
Perform a parameter sweep (bag of tasks), utilising distributed resources, within “T” hours or earlier and at a cost not exceeding $M.
Three options/solutions:
- Use pure Globus commands
- Build your own distributed app & scheduler
- Use Nimrod-G (resource broker)
39
Remote Execution Steps
- Choose resource
- Transfer input files
- Set environment
- Start process
- Pass arguments
- Monitor progress (summary view, job view, event view)
- Read/write intermediate files
- Transfer output files
+ Resource discovery, trading, scheduling, predictions, rescheduling, ...
40
Using Pure Globus commands
Do all yourself! (manually) Total Cost:$???
41
Build Distributed Application & Scheduler
Build App case by case basis Complicated Construction E.g., AppLeS/MPI based Total Cost:$???
42
Use Nimrod-G Aggregate Job Submission Aggregate View Submit & Play!
43
Nimrod & Associated Family of Tools
- P-sweep app. composition: Nimrod/Enfusion
- Resource management and scheduling: Nimrod-G Broker
- Design optimisations: Nimrod-O
- App. composition and online visualization: Active Sheets
- Grid simulation in Java: GridSim
- Drug design on Grid: Virtual Lab
- Remote Execution Server (on-demand Nimrod Agent)
- File Transfer Server
- Upcoming?: HEPGrid (+U. Melbourne), GAVE (+Rutherford Appleton Lab): Grid (Un)Aware Virtual Engineering
44
Nimrod/G : A Grid Resource Broker
A resource broker for managing, steering, and executing task farming (parametric sweep/SPMD model) applications on the Grid, based on deadline and computational economy.
Based on users’ QoS requirements, our broker dynamically leases services at runtime depending on their quality, cost, and availability.
Key features:
- A single window to manage & control an experiment
- Persistent and programmable task farming engine
- Resource discovery
- Resource trading
- Scheduling & predictions
- Generic dispatcher & Grid agents
- Transportation of data & results
- Steering & data management
- Accounting
45
A Glance at Nimrod-G Broker
[Figure: Nimrod-G broker. Labels: Nimrod/G Clients, Nimrod/G Engine, Schedule Advisor, Trading Manager (TM), Grid Store, Grid Dispatcher, Grid Explorer (GE), Grid Middleware (Globus, Legion, Condor, etc.), Grid Information Server(s) (GIS), and an RM & TS at each resource. Node legend: G = Globus enabled node, L = Legion enabled node, C = Condor enabled node.]
RM: Local Resource Manager, TS: Trade Server
See HPCAsia 2000 paper!
46
Nimrod/G Grid Broker Architecture
[Figure: Nimrod/G Grid broker architecture, top to bottom.]
- Nimrod Clients: Legacy Applications, Customised Apps (Active Sheet), Monitoring and Steering Portals, P-Tools (GUI/Scripting) (parameter_modeling)
- Nimrod Broker: Farming Engine, Meta-Scheduler (Schedule Advisor with Algorithm1 . . . AlgorithmN), Programmable Entities Management (Resources, Jobs, Tasks, Channels), AgentScheduler, Agents, JobServer, Database (Postgres), Grid Explorer, Trading Manager, Dispatcher & Actuators (Globus-A, Legion-A, P2P-A, . . .). Interface labels: XML?, XML, IP hourglass ?
- Middleware: Globus, Legion, Condor, P2P, . . ., GTS, GMD, G-Bank
- Fabric: Computers, Local Schedulers, Storage, Networks, Instruments, . . . (PC/WS/Clusters, Condor/LL/Mosix/. . ., Database, Radio Telescope)
47
A Nimrod/G Monitor
[Screenshot labels: Cost, Deadline, Legion hosts, Globus hosts.]
Bezek is in both Globus and Legion Domains
48
User Requirements: Deadline/Budget
49
Active Sheet: Spreadsheet Processing on Grid
Nimrod Proxy Nimrod/G See HPC 2001 paper!
51
Nimrod/G Interactions
[Figure: Nimrod/G interactions across three kinds of node. Root node: Farming Engine, Scheduler, Dispatcher, Process server, I/O server. Gatekeeper node: Queuing System, Grid Trade Server, resource allocation (local). Computational node: Nimrod Agent, User process, File access. Other labels: Resource Discovery, Grid Info servers. Sample request: “Do this in 30min. for $10?”]
52
Adaptive Scheduling Algorithms
[Flowchart labels: Discover Resources; Establish Rates; Compose & Schedule; Distribute Jobs; Evaluate & Reschedule; Meet requirements? Remaining jobs, deadline, & budget?; Discover More Resources.]
See HPDC AMS 2001 paper!
53
Cost Model
Without cost, ANY shared system becomes unmanageable.
- Charge users more for remote facilities than their own
- Choose cheaper resources before more expensive ones
- Cost units (G$) may be:
  - Dollars
  - Shares in global facility
  - Stored in bank
54
Cost Matrix @ Grid site X
[Figure: cost matrix with one cost per (user, machine) pair, for User 1 to User 5 against Machine 1 to Machine 5. Cell values visible in the extract: 1, 3, 2.]
- Non-uniform costing
- Encourages use of local resources first
- A real accounting system can control machine usage
Resource Cost = Function(cpu, memory, disk, network, software, QoS, current demand, etc.)
Simple: price based on peak time, off-peak, discount when less demand, ..
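The “simple” pricing rule above can be sketched as a small function. Everything beyond the idea of peak, off-peak, and a low-demand discount is an assumption made for illustration: the function name, the peak-hour window, the demand threshold, and the discount size do not come from the slides.

```python
def price_per_cpu_second(peak_rate, offpeak_rate, hour,
                         peak_hours=range(9, 18),
                         demand=1.0, low_demand_threshold=0.5,
                         low_demand_discount=0.5):
    """Return the G$ price per CPU-second at a given local hour.

    peak_hours, low_demand_threshold and low_demand_discount are
    hypothetical knobs; a real site would set its own policy.
    """
    base = peak_rate if hour in peak_hours else offpeak_rate
    if demand < low_demand_threshold:
        # Discount when demand (0.0 to 1.0 utilisation) is low.
        base = base * (1 - low_demand_discount)
    return base
```

For example, with a peak rate of 20 and an off-peak rate of 5, a request at 10:00 is priced at 20, while one at 22:00 under low demand is priced at 2.5.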
55
Deadline and Budget-based Cost Minimization Scheduling
Sort resources by increasing cost. For each resource in order, assign as many jobs as possible to the resource, without exceeding the deadline. Repeat all steps until all jobs are processed.
56
Deadline-based Cost-minimization Scheduling
M - resources, N - jobs, D - deadline.
Note: the cost of any Ri is less than that of Ri+1, …, or Rm.
RL: resource list, which needs to be maintained in increasing order of cost.
Ct - time when accessed (time now).
Ti - average job runtime on resource i (Ri) [updated periodically]. Ti acts as a load-profiling parameter.
Ai - number of jobs assigned to Ri, where:
  Ai = Min(No. of unassigned jobs, No. of jobs Ri can complete by the remaining deadline)
  No. of unassigned jobs at Ri = N - (A1 + … + Ai-1)
  No. of jobs Ri can complete = (D - Ct) DIV Ti
ALG: invoke Job Assignment() periodically until all jobs are done.
Job Assignment()/Reassignment():
  Establish (RL, Ct, Ti, Ai) dynamically (resource discovery).
  For all resources (i = 1 to M) { assign Ai jobs to Ri, if required }
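One pass of the job assignment step can be sketched as follows. This is a simplified illustration, not the Nimrod-G implementation: the function name and the tuple layout are hypothetical, each resource is treated as running one job at a time, and the periodic re-invocation with refreshed Ti values is left to the caller.

```python
def assign_jobs(resources, n_jobs, deadline, now):
    """One pass of deadline-based cost-minimisation assignment.

    resources: iterable of (name, cost, avg_job_runtime) tuples,
               all times in the same unit as deadline and now.
    Returns (plan, unassigned), where plan is a list of (name, Ai).
    """
    remaining_time = deadline - now          # D - Ct
    unassigned = n_jobs                      # N - (A1 + ... + Ai-1)
    plan = []
    # RL: visit resources in increasing order of cost.
    for name, cost, t_i in sorted(resources, key=lambda r: r[1]):
        can_complete = int(remaining_time // t_i)   # (D - Ct) DIV Ti
        a_i = min(unassigned, can_complete)
        plan.append((name, a_i))
        unassigned -= a_i
    return plan, unassigned
```

Any jobs left unassigned after the pass signal that the deadline cannot be met with the resources currently known, which is the cue to discover more resources or renegotiate.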
57
Deadline and Budget Constraint (DBC) Time Minimization Scheduling
For each resource, calculate the next completion time for an assigned job, taking into account previously assigned jobs. Sort resources by next completion time. Assign one job to the first resource for which the cost per job is less than the remaining budget per job. Repeat all steps until all jobs are processed. (This is performed periodically or at each scheduling-event.)
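The time-minimisation steps above can be sketched as a single static pass. This is an illustration under assumptions, not the broker's code: the function name and data layout are hypothetical, each resource runs one job at a time, the deadline check is omitted, and `<=` is used for the budget test so that a job that exactly fits the per-job budget still runs (the slide says “less than”).

```python
def dbc_time_min(resources, n_jobs, budget):
    """Assign jobs one at a time to the earliest-finishing affordable resource.

    resources: dict mapping name -> (cost_per_job, job_runtime).
    Returns (assigned, spent), where assigned maps name -> job count.
    """
    assigned = {name: 0 for name in resources}
    spent = 0.0
    for done in range(n_jobs):
        budget_per_job = (budget - spent) / (n_jobs - done)
        # Next completion time if one more job is queued on each resource.
        order = sorted(resources,
                       key=lambda r: (assigned[r] + 1) * resources[r][1])
        for name in order:
            cost = resources[name][0]
            if cost <= budget_per_job:
                assigned[name] += 1
                spent += cost
                break
        else:
            break  # no resource fits the remaining budget per job
    return assigned, spent
```

In the example used in the test, early jobs go to the slow, cheap resource because the fast one exceeds the per-job budget; the money saved then lets later jobs use the fast resource.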
58
Deadline and Budget Constraint (DBC) Time+Cost Min. Scheduling
Split resources by whether cost per job is less than budget per job. For the cheaper resources, assign jobs in inverse proportion to the job completion time (e.g. a resource with completion time = 5 gets twice as many jobs as a resource with completion time = 10). For the dearer resources, repeat all steps (with a recalculated budget per job) until all jobs are assigned. [Schedule/Reschedule] Repeat all steps until all jobs are processed.
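The “inverse proportion” rule for the cheaper resources can be sketched as below. The function name is hypothetical, and handing leftover jobs to the fastest resources first is an assumption; the slide does not say how rounding is handled.

```python
from fractions import Fraction


def split_inverse_to_time(completion_times, n_jobs):
    """Share n_jobs among resources in inverse proportion to completion time.

    completion_times: dict mapping name -> job completion time (> 0).
    Exact fractions avoid floating-point rounding surprises.
    """
    weights = {r: 1 / Fraction(t) for r, t in completion_times.items()}
    total = sum(weights.values())
    shares = {r: (n_jobs * w) // total for r, w in weights.items()}
    # Assumption: jobs lost to rounding go to the fastest resources first.
    leftover = n_jobs - sum(shares.values())
    fastest_first = sorted(completion_times, key=completion_times.get)
    for r in fastest_first[:leftover]:
        shares[r] += 1
    return shares
```

With completion times of 5 and 10, the faster resource receives twice as many jobs as the slower one, matching the example on the slide.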
59
Evaluation of Scheduling Heuristics
A Hypothetical Application on World Wide Grid
60
World Wide Grid (WWG)
[Figure: WWG testbed sites connected over the Internet, grouped by region.]
- Australia: Monash Uni.
- North America: ANL: SGI/Sun/SP2; USC-ISI: SGI; UVa: Linux cluster; UD: Linux cluster; UTK: Linux cluster
- Asia/Japan: Tokyo I-Tech.; ETL, Tsukuba
- Europe: ZIB/FUB: T3E/Mosix; Cardiff: Sun E6500; Paderborn: HPCLine; Lecce: Compaq SC; CNR: Cluster; Calabria: Cluster; CERN: Cluster; Poznan: SGI/SP2
- South America: Chile: Cluster
Other labels in the figure: Nimrod/G; Linux cluster; Solaris WS; Globus+Legion with GRACE_TS; Globus/Legion with GRACE_TS; Globus + GRACE_TS.
61
Experiment-1 Setup
- Workload: 165 jobs, each needing 5 minutes of CPU time
- Deadline: 1 hr; budget: 800,000 units
- Strategy: minimise cost and meet the deadline
- Execution cost with cost optimisation:
  - AU peak time: (G$)
  - AU off-peak time: (G$)
62
Resources Selected & Price/CPU-sec.
Resource Type & Size Owner and Location Grid services Peaktime Cost (G$) Offpeak cost Linux cluster (60 nodes) Monash, Australia Globus/Condor 20 5 IBM SP2 (80 nodes) ANL, Chicago, US Globus/LL 10 Sun (8 nodes) Globus/Fork SGI (96 nodes) Globus/Condor-G 15 SGI (10 nodes) ISI, LA, US
63
Execution @ AU Peak Time
64
Execution @ AU Offpeak Time
65
AU peak: Resources/Cost in Use
After the calibration phase, note the difference in pattern of two graphs. This is when scheduler stopped using expensive resources.
66
AU offpeak: Resources/Cost in Use
67
Experiment-2 Setup
- Workload: 165 jobs, each needing 5 minutes of CPU time
- Deadline: 2 hrs; budget: 396000 units
- Strategy: minimise time / cost
- Execution cost with cost optimisation:
  - Optimise cost: (G$) (finished in 2 hrs)
  - Optimise time: (G$) (finished in 1.25 hrs)
In this experiment, the time-optimised scheduling run costs double that of the cost-optimised run.
Users can now trade off time against cost.
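To get a feel for the budget, the workload can be converted into CPU-seconds and divided into the budget. This is a back-of-the-envelope sketch: it assumes every job uses exactly its 5 CPU-minutes and that the whole budget is spent on CPU time priced per CPU-second. The function name is hypothetical.

```python
def affordable_rate(n_jobs, cpu_minutes_per_job, budget):
    """Return (total CPU-seconds, highest average G$ per CPU-second the budget allows)."""
    total_cpu_seconds = n_jobs * cpu_minutes_per_job * 60
    return total_cpu_seconds, budget / total_cpu_seconds


total, rate = affordable_rate(165, 5, 396000)
print(total, rate)  # 49500 8.0
```

So this budget supports an average price of at most 8 G$ per CPU-second across the whole run.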
68
Resources Selected & Price/CPU-sec.
Resource & Location Grid services & Fabric Cost/CPU sec.or unit No. of Jobs Executed Time_Opt Cost_Opt. Linux Cluster-Monash, Melbourne, Australia Globus, GTS, Condor 2 64 153 Linux-Prosecco-CNR, Pisa, Italy Globus, GTS, Fork 3 7 1 Linux-Barbera-CNR, Pisa, Italy 4 6 Solaris/Ultas2 TITech, Tokyo, Japan 9 SGI-ISI, LA, US 8 37 5 Sun-ANL, Chicago,US 42 Total Experiment Cost (G$) 237000 115200 Time to Complete Exp. (Min.) 70 119
69
DBC Scheduling for Time Optimization
70
DBC Scheduling for Cost Optimization
71
Application Case Study
The Virtual Laboratory Project: "Molecular Modelling for Drug Design" on Peer-to-Peer Grid
72
Virtual Drug Design: Data Intensive Computing on Grid
A Virtual Laboratory for “Molecular Modelling for Drug Design” on Peer-to-Peer Grid. It provides tools for examining millions of chemical compounds (molecules) in the Protein Data Bank (PDB) to identify those having potential use in drug design. In collaboration with: Kim Branson, Structural Biology, Walter and Eliza Hall Institute (WEHI)
73
Virtual Drug Design A Virtual Lab for “Molecular Modeling for Drug Design” on P2P Grid
[Figure: a Resource Broker (RB) maps suitable Grid nodes and Protein Data Bank sources. Labels: Data Replica Catalogue, Grid Market Directory, Grid Info. Service, PDB1, PDB2, and a GTS (Grid Trade Server) at each node.]
Messages shown:
- “Screen 2K molecules in 30min. for $10”
- “Give me list PDBs sources Of type aldrich_300?”
- “service providers?”
- “service cost?”
- “get mol.10 from pdb1 & screen it.”
- “mol.10 please?”
- “mol.5 please?”
74
DataGrid Brokering
[Figure: the Nimrod/G Computational Grid Broker (Algorithm1 . . . AlgorithmN) works with a PDB Broker, the Data Replica Catalogue and the Grid Info. Service. Other labels: PDB Service, PDB2, and Grid Service Providers GSP1, GSP2, GSP3, GSP4, GSPm, GSPn. Arrows are numbered 1 to 7; the pairing of numbers to messages is not recoverable from the extracted text.]
Messages shown:
- “Screen 2K molecules in 30min. for $10”
- “PDB replicas please?”
- “advise PDB source?”
- “selection & advise: use GSP4!”
- “Is GSP4 healthy?”
- “Screen mol.5 please?”
- “mol.5 please?”
- “process & send results”
75
Software Tools
- Molecular modelling tools (DOCK)
- Parameter modelling tools (Nimrod/enFusion)
- Grid resource broker (Nimrod-G)
- Data Grid broker
- Protein Data Bank (PDB) management and intelligent access tools:
  - PDB database lookup/index table generation.
  - PDB and associated index-table replication.
  - PDB replica catalogue (that helps in resource discovery).
  - PDB servers (that serve PDB client requests).
  - PDB brokering (replica selection).
  - PDB clients for fetching molecule records (data movement).
- Grid middleware (Globus and GrACE)
- Grid fabric management (Fork/LSF/Condor/Codine/…)
76
DOCK code* (Enhanced by WEHI, U of Melbourne)
A program to evaluate the chemical and geometric complementarities between a small molecule and a macromolecular binding site. It explores ways in which two molecules, such as a drug and an enzyme or protein receptor, might fit together. Compounds which dock to each other well, like pieces of a three-dimensional jigsaw puzzle, have the potential to bind.
So, why is it important to be able to identify small molecules which may bind to a target macromolecule? A compound which binds to a biological macromolecule may inhibit its function, and thus act as a drug, for example by disabling the ability of a virus (HIV) to attach itself to a molecule/protein!
With system-specific code changed, we have been able to compile it for Sun-Solaris, PC Linux, SGI IRIX, Compaq Alpha/OSF1.
* Original Code: University of California, San Francisco:
77
Dock input file
(Slide callout: molecule to be screened.)
score_ligand yes
minimize_ligand yes
multiple_ligands no
random_seed
anchor_search no
torsion_drive yes
clash_overlap
conformation_cutoff_factor 3
torsion_minimize yes
match_receptor_sites no
random_search yes
maximum_cycles
ligand_atom_file S_1.mol2
receptor_site_file ece.sph
score_grid_prefix ece
vdw_definition_file parameter/vdw.defn
chemical_definition_file parameter/chem.defn
chemical_score_file parameter/chem_score.tbl
flex_definition_file parameter/flex.defn
flex_drive_file parameter/flex_drive.tbl
ligand_contact_file dock_cnt.mol2
ligand_chemical_file dock_chm.mol2
ligand_energy_file dock_nrg.mol2
78
Parameterized Dock input file
(Slide callout: molecule to be screened.)
score_ligand $score_ligand
minimize_ligand $minimize_ligand
multiple_ligands $multiple_ligands
random_seed $random_seed
anchor_search $anchor_search
torsion_drive $torsion_drive
clash_overlap $clash_overlap
conformation_cutoff_factor $conformation_cutoff_factor
torsion_minimize $torsion_minimize
match_receptor_sites $match_receptor_sites
random_search $random_search
maximum_cycles $maximum_cycles
ligand_atom_file ${ligand_number}.mol2
receptor_site_file $HOME/dock_inputs/${receptor_site_file}
score_grid_prefix $HOME/dock_inputs/${score_grid_prefix}
vdw_definition_file vdw.defn
chemical_definition_file chem.defn
chemical_score_file chem_score.tbl
flex_definition_file flex.defn
flex_drive_file flex_drive.tbl
ligand_contact_file dock_cnt.mol2
ligand_chemical_file dock_chm.mol2
ligand_energy_file dock_nrg.mol2
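The placeholder substitution that turns this parameterized file into a concrete input for one job can be imitated with Python's standard `string.Template`. This is a stand-in for illustration, not how Nimrod's `substitute` step is implemented; the function name and the shortened template are hypothetical.

```python
from string import Template

# Shortened copy of the parameterized file, for illustration only.
DOCK_BASE = """\
score_ligand $score_ligand
random_seed $random_seed
ligand_atom_file ${ligand_number}.mol2
receptor_site_file $HOME/dock_inputs/${receptor_site_file}
"""


def render_dock_input(template_text, params):
    """Fill in job parameters; unknown names such as $HOME are left as-is."""
    return Template(template_text).safe_substitute(params)


job_params = {
    "score_ligand": "yes",
    "random_seed": 7,
    "ligand_number": 42,
    "receptor_site_file": "ece.sph",
}
print(render_dock_input(DOCK_BASE, job_params))
```

`safe_substitute` is used so that environment-style names like `$HOME`, which are resolved on the remote node, pass through untouched.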
79
Dock PlanFile (contd.)
(Slide callout: molecules to be screened.)
parameter database_name label "database_name" text select oneof "aldrich" "maybridge" "maybridge_300" "asinex_egc" "asinex_epc" "asinex_pre" "available_chemicals_directory" "inter_bioscreen_s" "inter_bioscreen_n" "inter_bioscreen_n_300" "inter_bioscreen_n_500" "biomolecular_research_institute" "molecular_science" "molecular_diversity_preservation" "national_cancer_institute" "IGF_HITS" "aldrich_300" "molecular_science_500" "APP" "ECE" default "aldrich_300";
parameter score_ligand text default "yes";
parameter minimize_ligand text default "yes";
parameter multiple_ligands text default "no";
parameter random_seed integer default 7;
parameter anchor_search text default "no";
parameter torsion_drive text default "yes";
parameter clash_overlap float default 0.5;
parameter conformation_cutoff_factor integer default 5;
parameter torsion_minimize text default "yes";
parameter match_receptor_sites text default "no";
parameter random_search text default "yes";
parameter maximum_cycles integer default 1;
parameter receptor_site_file text default "ece.sph";
parameter score_grid_prefix text default "ece";
parameter ligand_number integer range from 1 to 200 step 1;
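A plan like this describes a parameter sweep: every combination of parameter values becomes one job. The sketch below shows that expansion in Python for a shortened, hypothetical subset of the parameters; it illustrates the idea only and is not the Nimrod plan-file parser.

```python
from itertools import product


def expand_sweep(parameters):
    """Return one dict per job: the cross product of all parameter values."""
    names = list(parameters)
    value_lists = [parameters[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


# Fixed defaults become single-value lists; ligand_number is the swept range.
plan = {
    "database_name": ["aldrich_300"],
    "random_seed": [7],
    "ligand_number": range(1, 201),  # 1 to 200, step 1
}
jobs = expand_sweep(plan)
print(len(jobs))  # 200
```

Because only `ligand_number` varies, the sweep yields 200 jobs, one per molecule to be screened.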
80
Dock PlanFile
task nodestart
    copy ./parameter/vdw.defn node:.
    copy ./parameter/chem.defn node:.
    copy ./parameter/chem_score.tbl node:.
    copy ./parameter/flex.defn node:.
    copy ./parameter/flex_drive.tbl node:.
    copy ./dock_inputs/get_molecule node:.
    copy ./dock_inputs/dock_base node:.
endtask
task main
    node:substitute dock_base dock_run
    node:substitute get_molecule get_molecule_fetch
    node:execute sh ./get_molecule_fetch
    node:execute $HOME/bin/dock.$OS -i dock_run -o dock_out
    copy node:dock_out ./results/dock_out.$jobname
    copy node:dock_cnt.mol2 ./results/dock_cnt.mol2.$jobname
    copy node:dock_chm.mol2 ./results/dock_chm.mol2.$jobname
    copy node:dock_nrg.mol2 ./results/dock_nrg.mol2.$jobname
endtask
81
Nimrod/TurboLinux enFuzion GUI tools for Parameter Modeling
82
Docking Experiment Preparation
Set up the PDB DataGrid:
- Index PDB databases
- Pre-stage (all) Protein Data Bank (PDB) on replica sites
- Start the PDB server
Create the Docking GridScore (receptor surface details) for a given receptor on the home node.
Pre-staging large files required for docking:
- Pre-stage Dock executables and the PDB access client on Grid nodes, if required (e.g., dock.Linux, dock.SunOS, dock.IRIX64, and dock.OSF1 on Linux, Sun, SGI, and Compaq machines respectively). Use globus-rcp.
- Pre-stage/cache all data files (~3-13MB each) representing receptor details on Grid nodes.
- This can be done on demand by Nimrod/G for each job, but a few input files are too large and they are required for all jobs. So pre-staging/caching at the http-cache or broker level is necessary to avoid the overhead of copying the same input files again and again!
83
Protein Data Bank Databases consist of small molecules from commercially available organic synthesis libraries, and natural product databases. There is also the ability to screen virtual combinatorial databases, in their entirety. This methodology allows only the required compounds to be subjected to physical screening and/or synthesis reducing both time and expense.
84
Target Testcase
The target for the test case: endothelin converting enzyme (ECE). This is involved in “heart stroke” and other transient ischemia.
Is·che·mi·a: A decrease in the blood supply to a bodily organ, tissue, or part caused by constriction or obstruction of the blood vessels.
85
Nimrod/G in Action: Screening on World-Wide Grid
86
Any Scientific Discovery?
Did your collaborator invent a new drug for xxxx? Anyway, check out the announcement of Nobel Prize winners for next year.
Not yet?
87
Conclude with a comparison with the Electrical Grid………..
Where we are ???? Courtesy: Domenico Laforenza
88
Alessandro Volta in Paris in 1801, inside the French National Institute, shows the battery in the presence of Napoleon I.
Fresco by N. Cianfanelli (1841) (Zoological Section "La Specola" of the Natural History Museum of Florence University)
89
….and in the future, I imagine a worldwide Power (Electrical) Grid …...
What ?!?! Oh, mon Dieu ! This is a mad man…
90
2001 - 1801 = 200 Years
91
Grid Computing: A New Wave ?
Can we Predict its Future ? “I think there is a world market for about five computers.” Thomas J. Watson Sr., IBM Founder, 1943
92
Summary and Conclusions
- P2P and Grid computing is emerging as a next-generation computing platform for solving large-scale problems through the sharing of geographically distributed resources.
- Resource management is a complex undertaking, as systems need to be adaptive, scalable, competitive,…, and driven by QoS.
- We proposed a framework based on “computational economies” and discussed several economic models for resource allocation and for regulating supply-and-demand for resources.
- Scheduling experiments on the World Wide Grid demonstrate the Nimrod-G broker’s ability to dynamically lease or rent services at runtime based on their quality, cost, and availability, depending on consumers’ QoS requirements.
- An economics paradigm for QoS-driven resource management is essential to push P2P/Grids into mainstream computing!
93
Download Software & Information
- Nimrod & Parametric Computing:
- Economy Grid & Nimrod/G:
- Virtual Laboratory/Virtual Drug Design:
- Grid Simulation (GridSim) Toolkit (Java based):
- World Wide Grid (WWG) testbed:
Looking for new volunteers to grow. Please contact me to barter your & our machines!
Want to build on our work/collaborate: Talk to me now or
94
Thank You… Any ??
95
Further Information
Books:
- High Performance Cluster Computing, V1, V2, R. Buyya (Ed), Prentice Hall, 1999.
- The GRID, I. Foster and C. Kesselman (Eds), Morgan-Kaufmann, 1999.
IEEE Task Force on Cluster Computing
Global Grid Forum
IEEE/ACM CCGrid’xy:
CCGrid 2002, Berlin: ccgrid2002.zib.de
Grid workshop -
96
Further Information Cluster Computing Info Centre:
Grid Computing Info Centre: IEEE DS Online - Grid Computing area: Compute Power Market Project