iCER User Meeting 3/26/10

Agenda
– What's new in iCER (Wolfgang)
– What's new in HPCC (Bill)
– Results of the recent cluster bid
– Discussion of buy-in (costs, scheduling)
– Other

What’s New in iCER

New iCER Website
Part of VPRGS
– News
– Showcased Projects
– Supported Funding
– Recent Publications

User Dashboard
Common portal to user resources:
– FAQ
– Documentation
– Forums
– Research Opportunities
– Known Issues

Current Research Opportunities
NSF Postdoc Fellowships for Transformative Computational Science using CyberInfrastructure
Website:
– Proposals
– Classes
– Seminars
– Papers
– Jobs

Postdoc Matching
50/50 match from iCER for a postdoc on large grant proposals (multi-investigator, interdisciplinary)
Currently only three matches picked up:
– Titus Brown
– Scott Pratt
– Eric Goodman
Several other matches promised, but those grants are not decided yet
More opportunities!

Personnel
New hire: Eric McDonald
– System Programmer
– Partnership with NSCL (Alex Brown et al.)

IGERT Grant Proposal
Interdisciplinary graduate education in high-performance computing & Big Data science
Leads:
– Dirk Colbry
– Bill Punch

BEACON
NSF STC (Science and Technology Center)
– Funded, starting in June
– $5M/year for 5 years
New joint space with iCER & HPCC
– First floor of BPS
– Former BPS library space

What’s New in HPCC

Graphics Cluster
– 32-node cluster
– 2 x quad-core 2.4 GHz, 18 GB RAM
– Two NVIDIA Tesla M1060 GPUs
– No InfiniBand (Ethernet only)

Result of a Buy-in
– 21 of the nodes were purchased with funds from users
– They can be used by any HPCC user

Each NVIDIA Tesla M1060
– Streaming processor cores: 240
– Processor core clock: 1.3 GHz
– Single-precision peak floating-point performance: 933 gigaflops
– Double-precision peak floating-point performance: 78 gigaflops
– Dedicated memory: 4 GB GDDR3
– Memory speed: 800 MHz
– Memory interface: 512-bit
– Memory bandwidth: 102 GB/sec
– System interface: PCIe
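
As a quick sanity check, the cards on one of these nodes can be listed from an interactive session. This is only a sketch: the interactive-job request and the assumption that nvidia-smi is on the default path are mine, not from the slides.

qsub -I -l nodes=1:ppn=1:gfx10,walltime=00:10:00   # open an interactive session on a gfx10 node
nvidia-smi                                          # lists the node's Tesla M1060 cards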

Example Script

#!/bin/bash -login
# Request 1 core on a gfx10 (GPU) node for one hour
#PBS -l nodes=1:ppn=1:gfx10,walltime=01:00:00
# Use the gpgpu reservation and request one GPU
#PBS -l advres=gpgpu.6364,gres=gpu:1

cd ${PBS_O_WORKDIR}      # start in the directory the job was submitted from
module load cuda         # make the CUDA toolkit available
myprogram myarguments    # replace with your executable and its arguments
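
Saved to a file, the script is submitted and monitored with the usual TORQUE commands (the file name gpu_job.sh below is just an example):

qsub gpu_job.sh      # submit the job; prints the job ID
qstat -u $USER       # check the status of your queued and running jobs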

CELL Processor
– Two PlayStation 3s running Linux for experimenting with the Cell processor
– dev-cell08 and test-cell08 (see the web for more details)

Green Restrictions
– The machine Green is still up and running, especially after some problematic memory was removed
– It has mostly been replaced by the AMD fat nodes
– On April 1st it will be reserved for jobs requesting 32 or more cores and/or 250 GB or more of memory (example resource requests below)
– The hope is to help people running larger jobs
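
For illustration only (the exact resource strings and walltime are my assumptions, not taken from the slides), a job that would still be eligible for Green after April 1st would ask for at least 32 cores or at least 250 GB in its resource request, e.g.:

#PBS -l nodes=4:ppn=8,walltime=04:00:00    # 4 x 8 = 32 cores
#PBS -l mem=250gb                          # or: a large-memory request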

HPCC Stats
– Ganglia (off the main web page, under Status) is back and working; it gives you a snapshot of the current state of the system
– We are nearly done with a database of all completed jobs that can be queried for all kinds of information; it should be up in the next couple of weeks

Cluster Bid Results

How It Was Done
HPCC submitted a Request for Quotes for a new cluster system. Targets:
– Performance vs. power as the main concern
– InfiniBand
– 3 GB of memory per core
– Approximately $500K of cluster

Results
– Received 13 bids from 8 vendors
– Found 3 options that were suitable for the power, space, cooling, and performance we were looking for
– Looking for some guidance from you on a number of issues

Choice 1: InfiniBand Configuration
Two ways to configure InfiniBand:
– A series of smaller switches configured in a hierarchy (leaf switches)
– One big switch (a director)
Leaf switches are cheaper, but harder to expand (expansion requires reconfiguration), with more wires and more points of failure.
A director is more expandable and convenient, but more expensive. (A rough sizing sketch follows.)
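
To make the trade-off concrete, here is a purely illustrative sizing for a 1024-core system, assuming 8 cores per node and 36-port leaf switches in a non-blocking two-level fat tree; these numbers are my assumptions, not figures from the actual bids:

1024 cores / 8 cores per node          = 128 nodes to connect
128 nodes / 18 node ports per leaf     = about 8 leaf switches (half of each switch's 36 ports used as uplinks)
8 leaves x 18 uplinks = 144 uplinks    = about 4 spine switches of 36 ports each
versus a single director-class switch with 128 or more ports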

Choice 2: Buy-in Cost
– The buy-in cost could reflect just the cost of the compute nodes themselves, with HPCC providing the infrastructure (switches, wires, racks, etc.)
– Or the buy-in cost could reflect the total hardware cost
– Obviously, subsidizing the cost means a cheaper buy-in, but fewer general-access nodes

Remember
– HPCC is still subsidizing costs, even if the hardware is not subsidized
– We must still buy air-conditioning equipment, OS licenses, MOAB (scheduling) licenses, and other software licenses (not to mention salaries and power)
– Combined, this "other" hardware will run to about $75K; the scheduler is about $100K for 3 years

Some Issues
– 1 node = 8 cores; 1 chassis = 4 nodes
– Buy-in will be at the chassis level (32 cores)

For 1024 Cores

Vendor/config    Total    Per node (subsidized)    Per node (full)
Dell/leaf        $418K    $2,278 ($9,112)          $3,260 ($13,040)
HP/leaf          $460K    $2,482 ($9,928)          $3,594 ($14,376)
Dell/director    $523K    $2,278 ($9,112)          $4,086 ($16,344)
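
The per-node figures line up with the chassis-level buy-in above: assuming the parenthesized values are the corresponding per-chassis (4-node, 32-core) costs, the arithmetic works out exactly, e.g. for Dell/leaf:

$2,278 per node x 4 nodes = $9,112 per chassis (subsidized)
$3,260 per node x 4 nodes = $13,040 per chassis (full hardware cost)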

Scheduling
– We are working on some better scheduling methods; we think they have promise and would be very useful to the user base
– For the moment, it will be the Purdue model: we guarantee buy-in users access to their nodes within 8 hours of a request
– There is still a one-week maximum run time (though this can be changed)