Jason Stowe, Condor Week 2009, April 22nd, 2009
Coming to Condor Week since Started as a User
Users hunger for features
AccountingGroups (2004/2005); Configuration w/Pipes (2005/2006); GroupResourcesUsed (2006/2007); Condor in Cloud (2007/2008); Resource Weights (2008/2009). All based upon customer requests.
Focus on software development for managing Condor at any scale, and provide services that complement the technology
Universities, Fortune 500s, government labs, and small/medium businesses that use Condor
Users like Condor because... it’s open, it works, it’s flexible, (for corporations) there is no API/operating system lock-in, and...
The Community
Today, let’s talk about a few challenges and solutions
War Story #1: Compute & Data
Whenever you find or solve a computation problem, you discover a data problem.
“Dark” or Latent, Unused Storage on any OS/Device
Empty space dispersed across machines in unusable sizes
“We need more filer space, but we have empty space on all our machines.”
So we looked at Hadoop
New type of storage: Aggregated or “Cloud” Storage
Block Store Architecture
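A minimal sketch of the block-store idea, assuming a fixed block size and a "most free space first" placement rule. Both are illustrative assumptions, not the actual Hadoop or CloudFS placement policy, and the names `Node` and `place_file` are hypothetical. It shows how a file too large for any single machine can still fit in the scattered empty space once it is split into blocks:

```python
from dataclasses import dataclass, field

MIB = 1024 * 1024
GIB = 1024 * MIB
BLOCK_SIZE = 64 * MIB  # assumed block size, for illustration only


@dataclass
class Node:
    """A machine contributing otherwise unused disk space (hypothetical model)."""
    name: str
    free_bytes: int
    blocks: list = field(default_factory=list)


def place_file(file_name, file_size, nodes, block_size=BLOCK_SIZE):
    """Split a file into blocks and put each block on the node with the most free space."""
    n_blocks = -(-file_size // block_size)  # ceiling division
    placement = []
    for i in range(n_blocks):
        size = min(block_size, file_size - i * block_size)
        target = max(nodes, key=lambda n: n.free_bytes)
        if target.free_bytes < size:
            raise RuntimeError("no node has room for the next block")
        target.free_bytes -= size
        target.blocks.append((file_name, i))
        placement.append((i, target.name))
    return placement


# Three machines, none of which could hold a 50 GiB file on its own.
nodes = [Node("a", 30 * GIB), Node("b", 20 * GIB), Node("c", 12 * GIB)]
placement = place_file("results.dat", 50 * GIB, nodes)
print(len(placement), "blocks placed across", len({name for _, name in placement}), "nodes")
```

A real block store would also replicate blocks and track them in a metadata service; the sketch leaves both out.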
But how do we use it?
1.5 years ago: It works well to access it in Java, but what about mounting?
So we tried WebDAV
Next up, open source FUSE driver
Need: Windows/Linux, Reliable, Large Files, scalable, and Read/Write
Mountable drivers Linux(FUSE) / Windows (IFS)
CloudFS Architecture
When we rolled it out...
Customers asked for surprising features: HTTP/REST protocols similar to Amazon S3. Reasons: installing a mountable driver across servers/workstations is prohibitive; they want a similar interface to various cloud storage providers => internal cloud. FTP interface, because it is simple!
Status Today
Mountable multi-platform drivers. Linux: SUSE 10, RHEL/CentOS 4 & 5; Windows 2k3+; OSX 10.3+
Encryption to avoid snooping sensitive data
Data Nodes built on Java: Linux, Windows, OSX, Solaris
RESTful Storage Service & FTP interface
Management interface for controlling storage features (Integrating with CycleServer)
Looking forward to condor_hadoop!
War Story #2: Cloud Calculations
Condor users’ peak vs. median usage problem
Need for compute power comes up suddenly
Condor Users hunger for resources
Condor users balance “We need more servers for big runs” and “Our servers are 40% utilized”
Many ways to solve this problem using EC2
Use cases do exist for adding nodes to a local condor pool using Amazon EC2
We favored entire pools in cloud
Data Scheduling, Performance issues
Run workflows faster using resources you could never buy...
We can test CycleServer at a scale our users have and we don’t
Need a 1000-node Condor pool? Wait 15 minutes.
Dynamic resources => pool can be sized to the jobs
1 core x 1000 hrs = 1000 cores x 1 hr = ~$200
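The arithmetic behind that figure, as a small sketch. The rate of 20 cents per core-hour is an assumption implied by "~$200 for 1000 core-hours", not a quoted price, and the function name `run_cost_cents` is hypothetical. Startup time, data transfer, and billing granularity are ignored.

```python
RATE_CENTS_PER_CORE_HOUR = 20  # assumed: ~$200 / 1000 core-hours


def run_cost_cents(cores, hours, rate_cents=RATE_CENTS_PER_CORE_HOUR):
    """Cost of a run billed purely per core-hour, in integer cents."""
    return cores * hours * rate_cents


serial_cost = run_cost_cents(cores=1, hours=1000)    # wall-clock: 1000 hours
parallel_cost = run_cost_cents(cores=1000, hours=1)  # wall-clock: 1 hour

print(f"serial:   ${serial_cost / 100:.2f} after 1000 hours")
print(f"parallel: ${parallel_cost / 100:.2f} after 1 hour")
```

Under per-core-hour pricing the bill depends only on the total core-hours, so spreading a perfectly parallel job across more cores shortens the wait without raising the cost.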
Sounds good, but how do we do this for a Workflow like BLAST?
From e-science 2008: for 64x the processors, speed-up was 57x for Hadoop running BLAST and 52.4x for mpiBLAST
High-CPU Amazon EC2 nodes have best price/performance
Scalability: 2x CPUs = x 64 CPUS = 60.7x Speed-up
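One way to compare these speed-ups is parallel efficiency, the speed-up divided by the factor by which processors were increased. The sketch below uses only the figures quoted on the slides above; the function name `parallel_efficiency` and the dictionary labels are illustrative.

```python
def parallel_efficiency(speedup, processor_factor):
    """Fraction of ideal linear scaling achieved (1.0 means perfect scaling)."""
    return speedup / processor_factor


# Speed-ups quoted on the slides, all at 64x the processors.
quoted = {
    "Hadoop running BLAST": 57.0,
    "mpiBLAST": 52.4,
    "64 CPUs (scalability slide)": 60.7,
}

efficiencies = {name: parallel_efficiency(s, 64) for name, s in quoted.items()}

for name, eff in efficiencies.items():
    print(f"{name}: {eff:.1%} of linear scaling")
```

An efficiency close to 1.0 means added processors are almost fully used, which is what makes paying per core-hour attractive for this kind of workload.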
Why High Throughput leads to Efficient Computing
Another user: worked with Varian (mass spectrometers, other high-tech lab equipment)
Problem: Coming up on a conference, needed to run a large simulation
Six weeks on an internal Condor pool
Deployed a Condor pool in CycleCloud
Same 6-week Job
Ran < 1 Day
War Story #3: Management
Condor Tutorial mentions “Why use a personal Condor?” i.e. Condor on few nodes...
Condor on 1 computer gets you policies, fault tolerance, etc.
Similarly, management issues come up even on small pools
Collaborating with the University of Wisconsin-Madison
Managing Configuration Files (our Config with Pipes CW2006)
Exploring ClassAds/LogFiles becomes problematic
Visualization, Reporting, etc.
Man-decades on development of tools to assist running Condor
We have a demo against the Madison pool. Come see me. We’d love more use cases.
Questions? Thank you. For more information go to: We constantly see opportunities for talented Condor folks, so please feel free to contact us! Jason Stowe, jstowe - cyclecomputing.com