Presentation transcript:

1 11 March 2013 John Hover Brookhaven Laboratory Cloud Activities Update John Hover, Jose Caballero US ATLAS T2/3 Workshop Indianapolis, Indiana

2 11 March 2013 John Hover Outline
Addendum to November Santa Cruz Status Report
Current BNL Status
– Condor Scaling on EC2
– EC2 Spot Pricing and Condor
– VM Lifecycle with APF
– Cascading multi-target clusters
Next Steps and Plans
Discussion

3 11 March 2013 John Hover Condor Scaling 1
RACF received a $50K grant from Amazon: a great opportunity to test:
– Condor scaling to thousands of nodes over the WAN
– Empirically determining costs
Naive approach (sketched below):
– Single Condor host (schedd, collector, etc.)
– Single process for each daemon
– Password authentication
– Condor Connection Broker (CCB)
Result: maxed out at ~3000 nodes
– Collector load caused timeouts of the schedd daemon
– CCB overload?
– Network connections exceeded open-file limits
– Collector duty cycle -> 0.99
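For concreteness, a minimal sketch of what such a naive single-host setup looks like in Condor configuration terms; the host name and file paths are placeholders, not the actual BNL configuration:

    # Central host runs every daemon, one process each
    CONDOR_HOST = head.example.org
    DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR, SCHEDD

    # Pool-password authentication for all daemon-to-daemon traffic
    SEC_DEFAULT_AUTHENTICATION = REQUIRED
    SEC_DEFAULT_AUTHENTICATION_METHODS = PASSWORD
    SEC_PASSWORD_FILE = /etc/condor/pool_password

    # Worker nodes behind NAT reach the pool through the CCB on the same host
    CCB_ADDRESS = $(CONDOR_HOST)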

4 11 March 2013 John Hover Condor Scaling 2
Refined approach (a config sketch follows below):
– Split the schedd from the collector, negotiator, and CCB
– Run 20 collector processes. Configure startds to randomly choose one. Enable collector reporting, so that all sub-collectors report to one top-level collector (which is not public).
– Tune OS limits: 1M open files
– Enable the shared port daemon on all nodes: multiplexes TCP connections, resulting in dozens of connections rather than thousands
– Enable session auth, so that each connection after the first bypasses the password auth check
Result:
– Smooth operation up to 5000 startds, even with large bursts
– No disruption of schedd operation on the other host
– Collector duty cycle ~0.35, leaving substantial headroom. Switching to 7-slot startds would multiply the slot count with only marginal additional load.
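A hedged sketch of the kind of configuration this refinement implies; the host name, ports, and security knob shown are illustrative choices, not the actual BNL settings, and only two of the 20 sub-collectors are spelled out:

    # --- Central manager host (collector/negotiator/CCB, separate from the schedd) ---
    CONDOR_HOST = cm.example.org
    USE_SHARED_PORT = TRUE
    DAEMON_LIST = MASTER, SHARED_PORT, COLLECTOR, NEGOTIATOR

    # Sub-collectors on their own ports; COLLECTOR1 .. COLLECTOR20 defined the same way
    COLLECTOR1 = $(COLLECTOR)
    COLLECTOR1_ARGS = -f -p 9620
    COLLECTOR1_ENVIRONMENT = "_CONDOR_COLLECTOR_LOG=$(LOG)/Collector1Log"
    COLLECTOR2 = $(COLLECTOR)
    COLLECTOR2_ARGS = -f -p 9621
    COLLECTOR2_ENVIRONMENT = "_CONDOR_COLLECTOR_LOG=$(LOG)/Collector2Log"
    DAEMON_LIST = $(DAEMON_LIST), COLLECTOR1, COLLECTOR2
    # (each sub-collector also forwards its ads to the top-level collector via
    #  CONDOR_VIEW_HOST, so one non-public collector sees the whole pool)

    # --- Worker node (startd) side ---
    # Pick one of the 20 sub-collectors at random; multiplex TCP through shared port
    COLLECTOR_HOST = cm.example.org:$RANDOM_INTEGER(9620, 9639)
    USE_SHARED_PORT = TRUE
    # One common way to let later connections reuse a session rather than
    # re-doing the password authentication
    SEC_ENABLE_MATCH_PASSWORD_AUTHENTICATION = TRUE

The OS-limit tuning mentioned above is done outside Condor, e.g. by raising the condor user's open-file limit in /etc/security/limits.conf.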

5 11 March 2013 John Hover Condor Scaling 3
Overall results:
– Ran ~5000 nodes for several weeks
– Production simulation jobs, with stageout to BNL
– Spent approximately $13K; only $750 was for data transfer
– Moderate failure rate due to spot terminations
– Actual spot price paid was very close to the baseline, e.g. still less than $0.01/hr for m1.small
– No solid statistics on efficiency/cost yet, beyond a rough appearance of being “competitive”

6 11 March 2013 John Hover EC2 Spot Pricing and Condor
On-demand vs. Spot:
– On-Demand: you pay the standard price; the instance never terminates.
– Spot: you declare a maximum price and pay the current, variable spot price. If/when the spot price exceeds your maximum, the instance is terminated without warning. Note: NOT like priceline.com, where you pay what you bid.
Problems:
– Memory is provided in units of 1.7GB, less than the ATLAS standard.
– More memory than needed per “virtual core”.
– NOTE: on our private Openstack, we created a 1-core, 2GB RAM instance type, avoiding this problem.
Condor now supports submission of spot-price instance jobs (an example submit file follows below).
– It handles this by making a one-time spot request, then cancelling it when fulfilled.
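As a rough illustration, an HTCondor EC2 grid-universe submit file with a spot bid might look like the following; the AMI id, credential paths, region, user-data file, and bid value are placeholders rather than the actual BNL settings:

    universe = grid
    grid_resource = ec2 https://ec2.us-east-1.amazonaws.com/
    executable = atlas_wn_spot              # label only; nothing runs on the submit host
    ec2_access_key_id = /home/apf/ec2_access_key
    ec2_secret_access_key = /home/apf/ec2_secret_key
    ec2_ami_id = ami-00000000
    ec2_instance_type = m1.xlarge
    # Maximum bid: if the spot market rises above this, the instance is terminated
    ec2_spot_price = 0.15
    ec2_user_data_file = /etc/apf/contextualization.data
    log = ec2_spot_$(Cluster).$(Process).log
    queue

Condor makes the one-time spot request at this price and cancels the request once it is fulfilled, as described above.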

7 11 March 2013 John Hover EC2 Types

    Type         Memory     VCores   "CUs"   CU/Core   Typical Spot $/hr   On-Demand $/hr   Slots?
    m1.small     1.7 GB     1        1       1         ...                 ...              ...
    m1.medium    3.75 GB    1        2       2         ...                 ...              ...
    m1.large     7.5 GB     2        4       2         ...                 ...              ...
    m1.xlarge    15 GB      4        8       2         ...                 ...              ...

Issues/Observations:
– We currently bid 3x. Is this optimal?
– Spot is ~1/10th the cost of on-demand. Nodes are ~1/2 as powerful as our dedicated hardware. Based on estimates of Tier 1 costs, this is competitive.
– Amazon provides 1.7G memory per CU, not per “CPU”. Insufficient for ATLAS work (tested).
– Do 7 slots on m1.xlarge perform economically?

8 11 March 2013 John Hover EC2 Spot Considerations
Service and pricing:
– Nodes are terminated without warning (no signal).
– Partial hours are not charged.
Therefore, ATLAS needs to consider:
– Shorter jobs. The simplest approach. Originally ATLAS worked to ensure jobs were at least a couple of hours long, to avoid pilot-flow congestion; now we have the opposite need.
– Checkpointing. There is some work in the Condor world on checkpointing without linking to special libraries (but it is not promising).
– Per-event stageout (event server). If ATLAS provides sub-hour units of work, we could get significant free time!

9 13 Nov 2012 John Hover VM Lifecycle with APF
The Condor scaling test used manually started EC2/Openstack VMs. Now we want APF to manage this, with two AutoPyFactory (APF) queues (the second queue's logic is sketched below):
– The first (standard) queue observes a Panda queue and submits pilots to the local Condor pool.
– The second observes the local Condor pool and, when jobs are idle, submits worker-node VMs to IaaS (up to some limit).
Worker-node VMs:
– Condor startds join back to the local Condor cluster. The VMs are identical, don't need public IPs, and don't need to know about each other.
Panda site (BNL_CLOUD):
– Associated with the BNL SE, LFC, and CVMFS-based releases.
– But no site-internal configuration (NFS, file transfer, etc.).
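Reduced to a hedged shell sketch, the second queue does something like the following; the cap, constraint details, and submit-file name are illustrative, and APF implements this with status and submit plugins rather than a script:

    #!/bin/bash
    # If the local pool has idle jobs and we are under the VM cap,
    # submit another EC2 worker-node VM (grid-universe job, JobUniverse == 9).
    MAX_VMS=100
    IDLE_JOBS=$(condor_q -constraint 'JobStatus == 1' -af ClusterId | wc -l)
    VM_JOBS=$(condor_q -constraint 'JobUniverse == 9' -af ClusterId | wc -l)
    if [ "$IDLE_JOBS" -gt 0 ] && [ "$VM_JOBS" -lt "$MAX_VMS" ]; then
        condor_submit wn_vm_spot.sub    # e.g. the spot submit file sketched earlier
    fi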

10 13 Nov 2012 John Hover

11 13 Nov 2012 John Hover VM Lifecycle 2
Current status:
– Automatic ramp-up working properly.
– Submits properly to EC2 and Openstack via separate APF queues.
– Passive draining when the Panda queue's work completes.
– Out-of-band shutdown and termination via a command-line tool: required configuration to allow the APF user to retire nodes.
Next step:
– Active ramp-down via retirement from within APF. This adds the tricky issue of “un-retirement” during alternation between ramp-up and ramp-down.
– APF issues condor_off -peaceful -daemon startd -name <host>
– APF uses condor_q and condor_status to associate startds with VM jobs, adding startd status to the VM job info and aggregate statistics.
Next step 2:
– Automatic termination of retired startd VMs, accomplished by comparing condor_status and condor_status -master output (see the sketch below).
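A hedged sketch of that retire-and-terminate sequence using plain HTCondor commands; the host name is a placeholder, and APF does the equivalent through its plugins and then maps retired machines back to their VM jobs:

    # 1) Peaceful retirement: let the current job finish, accept no new work
    condor_off -peaceful -daemon startd -name wn-1234.ec2.internal

    # 2) Retired VMs are those whose master still reports but whose startd is gone
    condor_status -af Machine | sort -u > startds.txt
    condor_status -master -af Machine | sort -u > masters.txt
    comm -13 startds.txt masters.txt    # machines safe to terminate at the cloud level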

12 13 Nov 2012 John Hover Ultimate Capabilities
APF's intrinsic queue/plugin architecture, and code in development, will allow:
– Multiple targets: e.g., EC2 us-east-1, us-west-1, and us-west-2 all submitted to in a load-balanced fashion.
– Cascading targets, e.g.: we can preferentially utilize free site clouds (a local Openstack or other academic clouds); once those are full, we submit to EC2 spot-priced nodes; during particularly high demand, we submit EC2 on-demand nodes. Retire and terminate in the reverse order.
The various pieces exist and have been tested, but final integration in APF is in progress.

13 13 Nov 2012 John Hover Next Steps/Plans
APF development:
– Complete the APF VM Lifecycle Management feature.
– Simplify/refactor the Condor-related plugins to reduce repeated code. Fault tolerance.
– Run multi-target workflows. Determine whether more between-queue coordination is necessary in practice.
Controlled performance/efficiency/cost measurements:
– Test m1.xlarge (4 “cores”, 15GB RAM) with 4, 5, 6, and 7 slots.
– Measure “goodput” under various spot pricing schemes. Is 3x sensible?
– Google Compute Engine?
Other concerns:
– Return to refinement of VM images. Contextualization.

14 13 Nov 2012 John Hover Questions/Discussion