Published by Aldous Briggs. Modified over 9 years ago.
1
Millions of Jobs or a few good solutions ….
David Abramson
Monash University
MeSsAGE Lab
2
No shortage of applications
How many jobs do they want/need?
Physics Chemistry Environmental Science Biological Systems Engineering Astronomy
3
The Nimrod Tool Family
Nimrod workflows for robust design and search
–Vary parameters
–Execute programs
–Copy data in and out
Sequential and parallel dependencies
Computational economy drives scheduling
Computation scheduled near data when appropriate
Use distributed high performance platforms
Upper middleware broker for resource discovery
Wide Community adoption
[Figure: Generate scenarios, Execution, Results gathered, Analysis]
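The slide says a computational economy drives scheduling but gives no detail. As a rough illustration only, and not Nimrod/G's actual scheduler, the sketch below assumes each resource has a price per job and a run time per job, and greedily fills the cheapest resources first while still finishing by a deadline. The `Resource` class, the `allocate` function, the resource names and all prices are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Resource:
    name: str               # hypothetical resource label
    cost_per_job: float     # hypothetical price units per job
    minutes_per_job: float  # assumed fixed run time per job

def allocate(n_jobs, resources, deadline_minutes):
    """Greedy sketch: use the cheapest resources first, each only for as many
    sequential jobs as it can finish before the deadline."""
    plan, total_cost, remaining = {}, 0.0, n_jobs
    for r in sorted(resources, key=lambda res: res.cost_per_job):
        capacity = int(deadline_minutes // r.minutes_per_job)
        take = min(capacity, remaining)
        if take:
            plan[r.name] = take
            total_cost += take * r.cost_per_job
            remaining -= take
        if remaining == 0:
            return plan, total_cost
    raise ValueError("deadline cannot be met with these resources")

resources = [
    Resource("cheap", cost_per_job=1.0, minutes_per_job=30.0),
    Resource("mid", cost_per_job=3.0, minutes_per_job=10.0),
    Resource("fast", cost_per_job=10.0, minutes_per_job=5.0),
]

tight_plan, tight_cost = allocate(20, resources, deadline_minutes=60)
loose_plan, loose_cost = allocate(20, resources, deadline_minutes=120)
print(tight_plan, tight_cost)
print(loose_plan, loose_cost)
```

In this toy setup a looser deadline lets more work land on the cheaper machines, so the same 20 jobs cost less. That is the kind of deadline/budget trade-off an economy-driven scheduler has to make.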
4
[Figure: Nimrod Portal, Nimrod/G, Nimrod/O, Nimrod/E, Actuators, Grid Middleware]
Plan File
    parameter pressure float range from 5000 to 6000 points 4
    parameter concent float range from 0.002 to 0.005 points 2
    parameter material text select anyof "Fe" "Al"
    task main
        copy compModel node:compModel
        copy inputFile.skel node:inputFile.skel
        node:substitute inputFile.skel inputFile
        node:execute ./compModel results
        copy node:results results.$jobname
    endtask
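To see how many jobs the plan file above would describe, the parameter lines can be expanded by hand. The following is a minimal sketch that assumes `points N` means N evenly spaced values including both endpoints; that reading is an assumption here, not something the slide states. The helper `linspace` is illustrative and not part of Nimrod.

```python
import itertools

def linspace(lo, hi, points):
    """Evenly spaced values from lo to hi inclusive (assumed meaning of 'points')."""
    if points == 1:
        return [lo]
    step = (hi - lo) / (points - 1)
    return [lo + i * step for i in range(points)]

# The three parameters from the plan file on the slide.
parameters = {
    "pressure": linspace(5000.0, 6000.0, 4),
    "concent": linspace(0.002, 0.005, 2),
    "material": ["Fe", "Al"],
}

# One job per element of the cross product: 4 * 2 * 2 combinations.
jobs = [
    dict(zip(parameters, combo))
    for combo in itertools.product(*parameters.values())
]

print(len(jobs))
for job in jobs[:3]:
    print(job)
```

Under that assumption the plan yields 4 × 2 × 2 = 16 jobs, each of which would run the `main` task once with its own parameter values substituted into `inputFile.skel`.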
5
Prepare Jobs using Portal
Jobs Scheduled Executed Dynamically
Sent to available machines
Results displayed & interpreted
6
Parameter Sweeps and searches
A full parameter sweep is the cross product of all the parameters
–Too easy to generate millions!
An optimization run minimizes some output metric and returns parameter combinations that do this
–Limited concurrency (except GAs)
Design of Experiments limits the number of combinations further.
–An old idea ….
[Figure: Results, Nimrod/O, Results]
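The contrast between a sweep and a search can be made concrete with a toy problem. The sketch below is illustrative only: the objective function, the grid bounds and the simple hill-climbing search are all assumptions chosen for the example, and they are not what Nimrod/O implements. A full sweep over two parameters with 1000 values each needs one million evaluations, while the search evaluates only a few neighbours per step, although each step has to wait for the previous one.

```python
def objective(x, y):
    """Toy output metric to minimise (hypothetical); minimum at (3, -1)."""
    return (x - 3) ** 2 + (y + 1) ** 2

LOW, HIGH = -500, 500                 # 1000 integer values per axis
full_sweep_size = (HIGH - LOW) ** 2   # cross product of both axes

def hill_climb(start):
    """Greedy neighbour search; returns (best point, number of evaluations)."""
    evaluations = 1
    best, best_val = start, objective(*start)
    while True:
        x, y = best
        neighbours = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
        neighbours = [(a, b) for a, b in neighbours
                      if LOW <= a < HIGH and LOW <= b < HIGH]
        scored = [(objective(a, b), (a, b)) for a, b in neighbours]
        evaluations += len(scored)
        val, point = min(scored)
        if val >= best_val:           # no neighbour improves: stop
            return best, evaluations
        best, best_val = point, val

best_point, search_evaluations = hill_climb((-400, 400))
print(full_sweep_size, search_evaluations, best_point)
```

The search uses far fewer evaluations than the sweep, but at most four of them can run concurrently at any step. A genetic algorithm, by contrast, evaluates a whole population per generation, which is why the slide lists GAs as the exception.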
7
Issues for millions of jobs
Generation issues
–Don’t necessarily need 1,000,000 jobs!
–Smarter ways of specifying problems
    Don’t want to see 1,000,000 jobs!
    Don’t necessarily generate all at once
Performance issues
–Nimrod/G: Server load
    Hierarchical resource management
–Nimrod/K: Handling token load in matching store
    Need k-bounded loop ideas from the 1980s
Fault tolerance
–Engaging the user
    Don’t want to see 1,000,000 jobs
–Distributed experiment management (p2p)?
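Two of the points above, not generating every job at once and bounding how much work is in flight, can be sketched together. This is a loose analogy in plain Python, not Nimrod/G or Nimrod/K code: `lazy_jobs` and `run_bounded` are hypothetical helpers, and the bound `k` only mimics the spirit of k-bounded loops by capping the number of outstanding jobs.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

def lazy_jobs(axes):
    """Yield one parameter dict at a time; the full sweep is never materialised."""
    names = list(axes)
    for combo in itertools.product(*(axes[n] for n in names)):
        yield dict(zip(names, combo))

def run_bounded(jobs, run_one, k):
    """Run jobs with at most k in flight; returns (results, peak in-flight count)."""
    results, in_flight, peak = [], set(), 0
    with ThreadPoolExecutor(max_workers=k) as pool:
        for job in jobs:
            if len(in_flight) >= k:
                # Block until at least one outstanding job finishes.
                done, in_flight = wait(in_flight, return_when=FIRST_COMPLETED)
                results.extend(f.result() for f in done)
            in_flight.add(pool.submit(run_one, job))
            peak = max(peak, len(in_flight))
        results.extend(f.result() for f in in_flight)
    return results, peak

# A sweep of 100 * 100 * 100 = 1,000,000 combinations, described lazily.
axes = {"a": range(100), "b": range(100), "c": range(100)}

# For the demo, consume only the first 50 jobs of the million.
first_jobs = itertools.islice(lazy_jobs(axes), 50)
results, peak = run_bounded(first_jobs, lambda j: j["a"] + j["b"] + j["c"], k=4)
print(len(results), peak)
```

Because the generator produces jobs on demand and the runner never holds more than `k` futures, memory and bookkeeping stay proportional to `k` rather than to the size of the sweep.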
8
Issues for millions of jobs
Analysis issues
–Need smarter ways of interacting with results
    Scientific visualisation, data mining, mega-pixel displays
Commercial realities
–License management
    Need parametric licenses like parallel ones.
Appropriate infrastructure
–TeraGrid class of machine not most appropriate
–Parametric Clouds