Published by Sharyl McDowell. Modified over 9 years ago.
1
VLab: A Collaborative Cyberinfrastructure for Computations of Materials Properties at High Pressures and Temperatures
Cesar R. S. da Silva (1), Pedro R. C. da Silveira (1), Renata M. Wentzcovitch (1,2)
(1) Minnesota Supercomputing Institute, University of Minnesota
(2) Department of Chemical Engineering and Materials Science, University of Minnesota
Work sponsored by NSF grant ITR-0426757
2
Outline
- “VLab is a cyberinfrastructure aimed to facilitate execution of complex calculations - mostly parameter sampling workflows - of materials at high pressures and temperatures.”
- Parameter sampling workflows
  - High P,T C_ij as example
- Basic problem:
  - Job deluge
- Proposed solution:
  - Features
  - Performance
- Overall requirements
- Workflow support specific requirements
- Service Oriented Architecture
3
Thermodynamic Method
- Vibrational density of states (VDoS) and free energy F(T,V) within the quasiharmonic approximation (QHA)
- Fitted at several temperatures by either:
  - Vinet EoS, or
  - N-th order (N = 3, 4, 5, …) isothermal (Eulerian) finite strain EoS
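The two ingredients named on this slide can be sketched as follows. This is only an illustration using the textbook quasiharmonic free-energy expression and the standard Vinet pressure-volume relation; the function names, the units, and the argument layout are assumptions made for this example and are not taken from VLab.

```python
import math

K_B = 8.617333262e-5      # Boltzmann constant in eV/K
HBAR = 6.582119569e-16    # reduced Planck constant in eV*s


def qha_free_energy(e_static, omegas, weights, temperature):
    """Helmholtz free energy F(T,V) at one volume within the QHA.

    e_static    : static energy at this volume, in eV
    omegas      : phonon angular frequencies in rad/s, one per (q, mode)
    weights     : q-point weight attached to each frequency
    temperature : temperature in K
    """
    f = e_static
    for omega, weight in zip(omegas, weights):
        quantum = HBAR * omega
        f += weight * 0.5 * quantum                      # zero-point term
        if temperature > 0.0:
            # Thermal term: k_B T ln(1 - exp(-hbar*omega / k_B T))
            occupation = math.exp(-quantum / (K_B * temperature))
            f += weight * K_B * temperature * math.log1p(-occupation)
    return f


def vinet_pressure(volume, v0, k0, k0_prime):
    """Vinet EoS pressure; v0, k0, k0_prime are the zero-pressure
    volume, bulk modulus, and its pressure derivative."""
    x = (volume / v0) ** (1.0 / 3.0)
    return 3.0 * k0 * (1.0 - x) / x**2 * math.exp(1.5 * (k0_prime - 1.0) * (1.0 - x))
```

In a fit, `vinet_pressure` (or the corresponding energy form) would be adjusted to the F(T,V) isotherm at each temperature; the fitting step itself is left out of this sketch.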
4
[Figure: workflow for the thermoelastic constant tensor C_ij^S(T,P). Labels: equilibrium structure (P_n); strain (kl); re-optimize.]
5
Basic Problem - How can High (P,T) Materials Computations be improved?
- Demand for extensive parameter sampling
  - Typical high (P,T) study (e.g., thermal properties): {P_n} x {q_i} => ~10^2 jobs
  - Huge high (P,T) study (C_ij(P,T)): {P_n} x {ε_i} x {q_j} => ~10^3-10^4 jobs
- 10^2-10^4 jobs to prepare, submit, and monitor
  - Manual work is prone to human error
- First principles => sheer number of operations: 10^15-10^20 today, well over 10^22 in 3-5 years
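The job count grows as the product of the sampled sets, which a few lines of code make concrete. The sketch below assumes the three sets are pressures, strains, and phonon wave vectors; the grid sizes and the `enumerate_jobs` helper are hypothetical and chosen only to illustrate the scaling.

```python
from itertools import product


def enumerate_jobs(pressures, strains, q_points):
    """Return one job description per (pressure, strain, q-point) combination."""
    return [
        {"pressure": p, "strain": e, "q_point": q}
        for p, e, q in product(pressures, strains, q_points)
    ]


# Hypothetical grid sizes, for illustration only.
pressures = [10.0 * n for n in range(10)]   # 10 pressures (GPa)
strains = list(range(6))                    # 6 strain labels
q_points = list(range(20))                  # 20 phonon wave vectors

jobs = enumerate_jobs(pressures, strains, q_points)
print(len(jobs))  # 10 * 6 * 20 = 1200 jobs
```

Even these modest grids already give more than a thousand independent jobs, which is the "job deluge" that motivates automated preparation, submission, and monitoring.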
6
The VLab
- Consolidated Web interface (portal) to a set of tools:
  - Quantum ESPRESSO package tools
  - Input preparation for pwscf, phonon, workflows, etc.
  - Data analysis tools
  - Visualization tools (VTK/OpenGL)
  - etc.
  - Workflow management
- Leverages computing capabilities of distributed resources (TeraGrid, compute farms, scattered resources, other grids)
- Collaboration through shared access to resources
  - Task distribution and data recollection
7
The Big Challenge of Performance
- Proposed solution: leveraging concurrent computing for features and performance
  - High performance parallel computing
  - High throughput distributed processing
- Scale-up approach is difficult
  - Limited number of processors in a single system
  - Even using the fastest vector processors is not enough
  - Trend is towards denser processing, not faster single-thread execution
- MPP systems are not cost effective for this class of problems
  - FFT and matrix transposition: limited scalability or low performance per processor
8
VLab - Not Just a Client/Server
The Client/Server approach:
- The portal and the supporting modules have access to a large central multi-processor system.
- It can work as a facilitator but lacks other important features found in VLab:
  - No flexibility of scheduling
  - No redundancy => poor availability
  - No choice of cost (usually high)
9
VLab - Not Just a Client/Server
The VLab distributed system approach:
- Distributed resources are replicated for:
  - Redundancy
  - Performance
  - Flexibility
- No central system to fail and bring everything down!
- More flexible scheduling for:
  - Cost
  - Turnaround time
  - Job throughput
  - Workload balance
  - System throughput
10
VLab requirements
- Workflow management => facilitator
- Support for distributed computations
- Ease of use
- Support for collaboration
- Flexibility (update/add tools, new features)
- Fault tolerance
- Diversity of tools: analysis, visualization, data reduction, storage, etc.
11
VLab Workflows
Typical VLab workflows, like the high-T C_ij calculation, involve iterations through the following steps:
1) Prepare inputs for tasks, and generate execution packages containing the required files.
2) Dispatch the execution packages to compute nodes for execution.
3) Gather results for analysis and, if needed, iterate steps 1-3.
- Results always return to the input sources => tree-like service architecture
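The three steps above can be sketched as a small loop. This is a minimal illustration, not VLab code: the "compute node" is a local thread that squares a number, and every function name here (`prepare_packages`, `dispatch`, `gather`, `run_workflow`) is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def prepare_packages(parameters):
    """Step 1: build one execution package (here a plain dict) per task."""
    return [{"task_id": i, "params": p} for i, p in enumerate(parameters)]


def run_package(package):
    """Stand-in for a compute node: returns a fake result for the package."""
    return {"task_id": package["task_id"], "value": package["params"] ** 2}


def dispatch(packages, max_workers=4):
    """Step 2: send packages out for execution and wait for all of them."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_package, packages))


def gather(results):
    """Step 3: collect results at the source that issued the packages."""
    return {r["task_id"]: r["value"] for r in results}


def run_workflow(parameters, next_parameters, max_iterations=5):
    """Iterate steps 1-3 until the analysis callback returns no new parameters."""
    collected = {}
    for _ in range(max_iterations):
        collected = gather(dispatch(prepare_packages(parameters)))
        parameters = next_parameters(collected)
        if not parameters:
            break
    return collected
```

For a single pass, `run_workflow([1, 2, 3], lambda results: [])` returns the gathered results keyed by task id. The gather step runs in the same place that prepared the packages, mirroring the statement that results always return to the input sources.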
12
VLab Service Oriented Architecture
On the Web: http://dasilveira.msi.umn.edu:8080/vlab/
Usage-oriented view of the VLab SOA => tree-like structure in 4 layers:
1) User interface (portal)
2) Workflow control and monitoring (Project Executor / Interaction)
3) Task dispatching / interaction, task data retrieval, auxiliary services
4) Heavy computations and visualization resources
13
Fault Tolerance
Only Project Executor sessions and a few user and project interaction sessions are required to be persistent. Therefore, a simple approach to fault tolerance (FT) is possible:
- Reactive: we have not identified any need for proactive FT.
- Registry based: persistent sessions are registered and must periodically inform the registry that they are alive.
- Redundant registry and metadata DB for data persistence.
- Full journaling (data and metadata) of critical transactions for data and metadata integrity. This guarantees that the state of any persistent session can be restored in case of failure.
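The registry-based part of this scheme amounts to a heartbeat check, which the sketch below illustrates. It is an assumption-laden toy: the `SessionRegistry` class, its method names, and the timeout policy are hypothetical, and the redundancy and journaling described on the slide are not modeled.

```python
import time


class SessionRegistry:
    """Tracks when each persistent session last reported that it was alive."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock          # injectable so the logic can be tested
        self._last_seen = {}

    def register(self, session_id):
        """Record a new persistent session as alive right now."""
        self._last_seen[session_id] = self._clock()

    def heartbeat(self, session_id):
        """Called periodically by a session to report that it is alive."""
        if session_id not in self._last_seen:
            raise KeyError(f"unknown session: {session_id}")
        self._last_seen[session_id] = self._clock()

    def expired_sessions(self):
        """Sessions whose last report is older than the timeout."""
        now = self._clock()
        return sorted(
            s for s, seen in self._last_seen.items() if now - seen > self._timeout
        )
```

A reactive recovery step would poll `expired_sessions()` and restore each listed session from its journaled state; nothing is done ahead of a failure.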
14
Scheduling
The usual approach:
- Use agents that interact with the broker.
- Problem: agents are not stateless!
  - More complicated to develop
  - Persistence must be guaranteed
The VLab approach:
- Use an independent Web service (WS) to monitor workload.
- Persistence of data is provided by a local DB.
- Compute WS and Workload Monitor are stateless!
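The idea of keeping the services stateless and pushing all state into a local database can be illustrated as follows. This sketch uses SQLite from the Python standard library as a stand-in for the local DB; the table layout, the function names, and the "fewest running jobs" policy are assumptions for this example, not a description of the VLab implementation.

```python
import sqlite3


def open_workload_db(path=":memory:"):
    """Open (and if needed create) the local workload database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS workload ("
        "node TEXT PRIMARY KEY, running_jobs INTEGER NOT NULL)"
    )
    return conn


def report_load(conn, node, running_jobs):
    """Workload monitor side: store the latest load observed for a node."""
    conn.execute(
        "INSERT OR REPLACE INTO workload (node, running_jobs) VALUES (?, ?)",
        (node, running_jobs),
    )
    conn.commit()


def least_loaded_node(conn):
    """Scheduler side: pick the node with the fewest running jobs."""
    row = conn.execute(
        "SELECT node FROM workload ORDER BY running_jobs ASC, node ASC LIMIT 1"
    ).fetchone()
    return row[0] if row else None
```

None of the functions keeps anything in memory between calls, so a monitor or scheduler process can be restarted at any time and pick up from the database.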
15
VLab in Action
Watch a demonstration movie at vlab.msi.umn.edu -> follow the links “portal” -> “movie”
- Calculation of high P,T thermodynamic properties
  - Cubic MgO, 2-atom cell
  - Static + lattice dynamics calculation
  - {P n }x{ i } sampling
- Show distributed computing capabilities
- Ability to integrate visualization and data analysis tools
16
VLab Workflows
[Figure. Left: extensive high-T C_ij. Right: detailed view of C_ij and phonon.]