INTRODUCTION TO XSEDE
INTRODUCTION
Extreme Science and Engineering Discovery Environment (XSEDE): the "most advanced, powerful, and robust collection of integrated advanced digital resources and services in the world." A five-year, $121 million project supported by the National Science Foundation.
XSEDE is an extension and expansion of the original TeraGrid program, which it replaced in July 2011.
- 16 supercomputers
- Supports short-term and long-term projects
- No monetary cost to scientists
- Goal: give scientists and researchers the tools to make scientific discoveries via high-performance computing technologies
SUPPORT NETWORK
- Researchers who support development of scientific workflows and project design
- IT support
- "Campus Champions": on-campus representatives granted access to XSEDE resources, providing outreach and publicity for high-performance computing
- Technology Insertion Service: allows researchers to recommend new technologies for inclusion in the XSEDE infrastructure
Campus Champion Institutions (revised September 15, 2014)
- Standard: 97
- EPSCoR states: 56
- Minority Serving Institutions: 12
- EPSCoR states and Minority Serving Institutions: 9
- Total Campus Champion institutions: 174
The map illustrates how widespread XSEDE is across the US: 174 Campus Champion institutions nationwide, underscoring XSEDE's strong academic focus and broad geographic range.
ALLOCATION
Three types of allocation:
- Startup: targets new users and projects with modest computational requirements
- Education: targets faculty and teachers for in-class instruction and training in cyberinfrastructure
- Research: for projects beyond the startup phase with intense computational requirements; the most stringent application process
Allocation requests must be submitted early.
RESOURCES
Four types:
- Computing resources
- Visualization resources
- Storage resources
- High-throughput resources
Computing resources
- Designed for high-performance computing
- Support massively parallel processing, with large memory distributed across many compute nodes
- Simulation time ranges from a few hours to days
Visualization resources
- Data is translated into images or animations
- Crucial for understanding and communicating results of complex scientific problems
Storage resources
- Support both short- and long-term storage, as well as storage tied to compute and visualization allocations
- Three types of storage: stand-alone storage, archival storage, and resource file system storage
EXAMPLES OF RESOURCE SPECS
SECURE FILE TRANSFER
- Uses Globus Online for secure, fast file transfer
- Geared toward big data and the research community
- Moves terabytes of data
- Provides fault recovery
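As a rough sketch of how such a transfer looks from the command line, assuming the Globus CLI is installed (the endpoint UUIDs and paths below are placeholders, not real XSEDE endpoints):

```shell
#!/bin/bash
# Placeholders: substitute the endpoint UUIDs shown in the Globus web app.
SRC="SRC_ENDPOINT_UUID"   # e.g. your campus endpoint
DST="DST_ENDPOINT_UUID"   # e.g. an XSEDE resource endpoint

# Compose a recursive directory transfer; Globus retries failed chunks
# for you, which is where the fault recovery comes from.
CMD="globus transfer --recursive $SRC:/project/results $DST:/scratch/results"
echo "$CMD"   # inspect, then run once both endpoints are activated
```

The script only prints the command so it can be reviewed before anything moves; in practice you would run it directly after activating both endpoints.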
RUNNING JOBS OVERVIEW
- Login nodes: when you connect to a resource, you are on a login node shared by many users. Use it for tasks such as file editing, code compilation, data backup, and job submission.
- Home directory: store project files such as source code, scripts, and input data sets.
- Scratch file system: compute nodes read and write job data to the scratch directory.
- Batch script: specifies commands for code execution (copy input files to scratch, …) and the resources required (number/type of nodes, length of run, output directory, …).
- Submission: run jobs by submitting your batch script to the compute nodes using the "qsub" command. Your job is submitted to a queue and waits in line until nodes are available; queues are managed by a job scheduler that allows jobs to run efficiently.
Credit: New User Tutorial, Jay Alameda
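A minimal sketch of that flow, assuming a PBS-style scheduler behind `qsub` (the node counts, paths, and the `simulate` executable are all hypothetical):

```shell
#!/bin/bash
# Write a small PBS-style batch script; every value here is illustrative.
cat > myjob.pbs <<'EOF'
#!/bin/bash
#PBS -l nodes=2:ppn=16       # number/type of nodes
#PBS -l walltime=02:00:00    # length of run
#PBS -N my_simulation        # job name

cd "$PBS_O_WORKDIR"
cp input.dat /scratch/$USER/            # stage input to scratch
./simulate /scratch/$USER/input.dat     # hypothetical executable
EOF

# Submit from a login node:
#   qsub myjob.pbs
```

The directives at the top are what the slide calls the resource specification; everything after them runs on the allocated compute nodes.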
SCIENTIFIC WORKFLOW
First step in the XSEDE methodology. Considerations for scientists:
- Integrate diverse components and data
- Automate data processing steps
- Repeat processing steps on new data
- Reproduce previous results
- Share analysis steps with other researchers
- Track the provenance of data products
- Reliably execute analyses on unreliable infrastructure
- Execute analyses in parallel on distributed resources
HOW TO CREATE A WORKFLOW
Two approaches: with a visual editor, or programmatically.
SUBMITTING JOBS
Resources rely on a batch scheduler. You must create a job script specifying:
- The number/type of nodes you need
- How long the job needs to run
- Where output files will be written
- Troubleshoot potential errors: running out of CPU time, specifying the correct data directories, software permissions, etc.
- Pack many executions into a single job for efficiency in the scheduling process
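The packing idea can be sketched in shell as follows; `wc` stands in for a real analysis executable, and the input files are generated on the spot so the example is self-contained:

```shell
#!/bin/bash
# Generate four sample inputs (stand-ins for real data sets).
for i in 1 2 3 4; do
    echo "sample data $i" > "input_$i.dat"
done

# Pack the four independent runs into one job: launch them in the
# background and wait, so the scheduler sees a single submission
# instead of four separate jobs.
for i in 1 2 3 4; do
    wc -w "input_$i.dat" > "result_$i.out" &
done
wait   # block until every packed run has finished
```

Inside a real batch script, this loop would replace the single executable line, letting one allocation amortize the queue wait across many short runs.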
MONITOR JOBS
TRAINING RESOURCES
- XSEDE training classes: https://www.xsede.org/group/xup/training
- Community mailing list: workflows@xsede.org
- NCSA online courses in High Performance Computing
"CYBERSHAKE" WORKFLOW
Southern California Earthquake Center. Research question for seismologists: what will the peak ground motion be at a new building site over the next 50 years?
- Uses Probabilistic Seismic Hazard Analysis (PSHA)
- The workflow performs calculations on site data and generates a hazard curve
- Uses ground motion prediction equations (GMPEs)
- Each site has a sub-workflow of 820,000 tasks
- Final outputs: 500 million files, 5.8 TB
Seismic hazard maps showing the differences among four official GMPEs.