1
Research Computing, University of South Florida: Providing Advanced Computing Resources for Research and Instruction through Collaboration
2
Mission
Provide advanced computing resources required by a major research university:
o Software
o Hardware
o Training
o Support
3
User Base: 40 research groups, 6 colleges, 100 faculty, 300 students
4
Hardware
The system was built on the condominium model and consists of 300 nodes (2,400 processors)
o University provides infrastructure and some computational resources
o Faculty funding provides the bulk of computational resources
5
Software
Over 50 scientific codes:
o Installation
o Integration
o Upgrades
o Licensing
6
Support Personnel
Provide all systems administration
Software support
One-on-one consulting
System efficiency improvements
Users are no longer just the traditional “number crunchers”
7
Current Projects
Consolidating the last standalone cluster (of appreciable size)
Advanced Visualization Center
o Group of 19 faculty applied for funding
o Personnel
o Training
o Large, high-resolution 3D display
8
Current Projects
New computational resources
o Approximately 100 nodes
o GPU resources
o Upgrade of the parallel file system
Virtual clusters
o HPC for the other 90%
FACC
9
Florida State University's Shared HPC: Building and Maintaining Sustainable Research Computing at FSU
10
Shared-FSU HPC Mission
Support multidisciplinary research
Provide a general-access computing platform
Encourage cost sharing by departments with dedicated computing needs
Provide a broad base of support and training opportunities
11
Turn-key Research Solution
Participation is voluntary
University provides staffing
University provides general infrastructure
o Network fabrics
o Racks
o Power/Cooling
Additional buy-in incentives
o Leverage better pricing as a group
o Matching funds
Offer highly flexible buy-in options
o Hardware purchase only
o Short-term Service Level Agreements
o Long-term Service Level Agreements
Shoot for 50% of hardware costs covered by buy-in
13
Research Support @ FSU: 500+ users, 33 academic units, 5 colleges
14
HPC Owner Groups
2007
o Department of Scientific Computing
o Center for Ocean-Atmosphere Prediction Studies
o Department of Meteorology
2008
o Gunzburger Group (Applied Mathematics)
o Taylor Group (Structural Biology)
o Department of Scientific Computing
o Kostov Group (Chemical & Biomedical Engineering)
2009
o Department of Physics (HEP, Nuclear, etc.)
o Institute of Molecular Biophysics
o Bruschweiler Group (National High Magnetic Field Laboratory)
o Center for Ocean-Atmosphere Prediction Studies (with the Department of Oceanography)
o Torrey Pines Institute of Molecular Studies
2010
o Chella Group (Chemical Engineering)
o Torrey Pines Institute of Molecular Studies
o Yang Group (Institute of Molecular Biophysics)
o Meteorology Department
o Bruschweiler Group
o Fajer Group (Institute of Molecular Biophysics)
o Bass Group (Biology)
15
Research Support @ FSU
Publications:
o Macromolecules
o Bioinformatics
o Systematic Biology
o Journal of Biogeography
o Journal of Applied Remote Sensing
o Journal of Chemical Theory and Computation
o Physical Review Letters
o Journal of Physical Chemistry
o Proceedings of the National Academy of Sciences
o Biophysical Journal
o PLoS Pathogens
o Journal of Virology
o Journal of the American Chemical Society
o The Journal of Chemical Physics
o PLoS Biology
o Ocean Modelling
o Journal of Computer-Aided Molecular Design
16
FSU's Shared-HPC Stage 1: InfiniBand-Connected Cluster [diagram labels: Sliger Data Center, Shared-HPC, pfs]
17
Single and Multiprocessor Usage Year 1
18
FSU's Shared-HPC Stage 2: Alternative Backfilling [diagram labels: DSL Building, Sliger Data Center, Shared-HPC, pfs, Condor]
19
Backfilling Single-Processor Jobs on Non-HPC Resources Using Condor
20
Condor Usage
~1,000 processor cores available for single-processor computations
2,573,490 processor hours used since Condor was made available to all HPC users in September
Seven HPC users have been using Condor
Dominant users are in Evolutionary Biology, Molecular Dynamics, and Statistics (the same users that were submitting numerous single-processor jobs)
Two workshops introduced it to HPC users (a sample submit description is sketched below)
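For illustration only, here is a minimal HTCondor submit description of the kind such a single-processor backfill job might use; the executable, file names, and resource requests are hypothetical and not taken from FSU's actual configuration.

  # single_proc.sub -- hypothetical submit description for one single-processor job
  universe       = vanilla
  executable     = run_analysis
  arguments      = input_042.dat
  request_cpus   = 1
  request_memory = 2GB
  output         = job_$(Cluster).$(Process).out
  error          = job_$(Cluster).$(Process).err
  log            = job_$(Cluster).log
  queue

Submitting many such descriptions with condor_submit lets Condor backfill the jobs onto idle non-HPC cores without occupying the batch-scheduled HPC nodes.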
21
Single vs. Multi-processor Jobs Year 2
22
Single vs. Multi-processor Jobs Year 3
23
FSU's Shared-HPC Stage 3: Scalable SMP [diagram labels: DSL Building, Sliger Data Center, Shared-HPC, pfs, Condor, SMP]
24
One Moab queue for SMP or very large memory jobs (a sample batch script is sketched below)
Three “nodes”:
o M905 blade with 16 cores and 64 GB memory
o M905 blade with 24 cores and 64 GB memory
o 3Leaf system with up to 132 cores and 528 GB memory
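As a sketch of how a user might target such a queue, the Torque/Moab-style batch script below requests one 16-core, large-memory "node"; the queue name smp and the application name are assumptions for illustration, not FSU's actual configuration.

  #!/bin/bash
  # Hypothetical batch script for the SMP/large-memory queue (sketch only)
  #PBS -N smp_job
  #PBS -q smp
  #PBS -l nodes=1:ppn=16
  #PBS -l mem=60gb
  #PBS -l walltime=24:00:00

  cd $PBS_O_WORKDIR
  export OMP_NUM_THREADS=16
  # Hypothetical shared-memory (OpenMP) application using all 16 requested cores
  ./threaded_solver input.dat

Moab then places the job on whichever of the three SMP "nodes" satisfies the core and memory request.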
25
[diagram labels: DSL Building, DSL Data Center, Sliger Data Center, Shared-HPC, pfs, Condor, SMP, 2° fs, Vis]
26
Interactive Cluster Functions
Facilitates data exploration
Provides a venue for software not well suited to a batch-scheduled environment (e.g., some MATLAB, VMD, R, Python, etc.)
Provides access to hardware not typically found on standard desktops/laptops/mobile devices (e.g., lots of memory, high-end GPUs)
Provides licensing and configuration support for software applications and libraries
27
Interactive Cluster Hardware Layout
8 high-end CPU-based host nodes
o Multi-core Intel or AMD processors
o 4 to 8 GB of memory per core
o 16x PCIe connectivity
o QDR InfiniBand connectivity to Lustre storage
o IP (read-only) connectivity to Panasas
o 10 Gbps connectivity to the campus network backbone
One C410x external PCIe chassis
o Compact
o IPMI management
o Supports up to 16 NVIDIA Tesla M2050 GPUs
Up to 16.48 teraflops (see the arithmetic below)
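The 16.48-teraflop figure is consistent with the commonly quoted single-precision peak of 1.03 TFLOPS per Tesla M2050:

  \[ 16\ \text{GPUs} \times 1.03\ \text{TFLOPS per GPU} = 16.48\ \text{TFLOPS (single precision)} \]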
28
[diagram labels: DSL Building, DSL Data Center, Sliger Data Center, Shared-HPC, pfs, Condor, SMP, 2° fs, Vis, Db.Web]
29
Web/Database Hardware Function
Facilitates creation of data analysis pipelines/workflows
Favored by external funding agencies
o Demonstrates a cohesive cyberinfrastructure
o Fits well into required Data Management Plans (NSF)
Intended to facilitate access to data on secondary storage or cycles on an owner's share of HPC
Basic software install, no development support
Bare metal or VM
30
Web/Database Hardware Examples
32
FSU Research CI [diagram labels: HPC, HTC, SMP, 1° storage, 2° storage, Vis and interactive, DB and Web]
33
Florida State University's Shared HPC
Universities are by design multifaceted and lack a singular focus of support
Local HPC resources should also be multifaceted and have a broad basis of support
34
HPC Summit: University of Florida HPC Center
35
HPC Summit: Short history
Started in 2003
2004 Phase I: CLAS – Avery – OIT
2005 Phase IIa:
o COE – 9 investors
2007 Phase IIb:
o COE – 3 investors
2009 Phase III:
o DSR – 17 investors – ICBR – IFAS
2011 Phase IV:
o 22 investors
36
HPC Summit: Budget
Total budget
o 2003-2004 $0.7 M
o 2004-2005 $1.8 M
o 2005-2006 $0.3 M
o 2006-2007 $1.2 M
o 2007-2008 $1.6 M
o 2008-2009 $0.4 M
o 2009-2010 $0.9 M
37
HPC Summit: Hardware
4,500 cores
500 TB storage
InfiniBand connected
In three machine rooms
o Connected by the 20 Gbit/sec Campus Research Network
38
HPC Summit: System software
RedHat Enterprise Linux
o through the free CentOS distribution
o upgraded once per year
Lustre file system
o mounted on all nodes
o scratch only
o backup provided through a CNS service (requires a separate agreement between the researcher and CNS)
39
HPC Summit: Other software
Moab scheduler (commercial license)
Intel compilers (commercial license)
Numerous applications
o Open and commercial
40
HPC Summit: Operation
Shared cluster plus some hosted systems
300 users
90-95% utilization
41
HPC Summit: Investor model
Normalized Computing Unit (NCU)
o $400 per NCU
o Is one core
o In a fully functional system (RAM, disk, shared file system)
o For 5 years
42
HPC Summit: Investor model
Optional Storage Unit (OSU)
o $140 per OSU
o 1 TB of file storage (RAID) on one of a few global parallel file systems (Lustre)
o For 1 year
43
HPC Summit: Other options
Hosted system
o Buy all the hardware; we operate it
o No sharing
Pay as you go
o Agree to pay a monthly bill
o Almost equivalent to a $400 NCU prorated on a monthly basis, i.e., about $0.009 (0.9 cents) per core-hour (see the arithmetic below)
o Cheaper than Amazon's Elastic Compute Cloud (EC2)
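The quoted hourly rate follows from prorating one $400, 5-year NCU over continuous use (assuming 8,760 hours per year):

  \[ \frac{\$400}{5 \times 8760\ \text{h}} = \frac{\$400}{43\,800\ \text{h}} \approx \$0.0091\ \text{per core-hour} \]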
44
www.ccs.miami.edu
45
Mission Statement
UM CCS is establishing nationally and internationally recognized research programs, focusing on those of an interdisciplinary nature, and actively engaging in computational research to solve the complex technological problems of modern society. We provide a framework for promoting collaborative and multidisciplinary activities across the University and beyond.
46
CCS Overview
Started in June 2007
Faculty Senate approval in 2008
Four founding schools: A&S, CoE, RSMAS, Medical
Offices on all campuses
~30 FTEs
Data center at the NAP of the Americas
47
UM CCS Research Programs and Cores
Physical Science & Engineering
Computational Biology & Bioinformatics
Data Mining
Visualization
Computational Chemistry
Software Engineering
High Performance Computing
Social Systems Informatics
48
Quick Facts
Over 1,000 UM users
5,200 cores of Linux-based cluster
1,500 cores of Power-based cluster
~2.0 PB of storage
4.0 PB of backup
More at:
o http://www.youtube.com/watch?v=JgUNBRJHrC4
o www.ccs.miami.edu
49
High Performance Computing: UM-Wide Resource
Provides the academic community & research partners with comprehensive HPC resources:
o Hardware & scientific software infrastructure
o Expertise in designing & implementing HPC solutions
o Designing & porting algorithms & programs to parallel computing models
Open access to compute processing (first come, first served)
o Peer review for large projects – Allocation Committee
o Cost center for priority access
HPC services
o Storage cloud
o Visualization and data analysis cloud
o Processing cloud