Bringing your favorite analysis applications to iPlant using Docker containers
Nirav Merchant
Topic Coverage:
- Which apps can you bring to CyVerse?
- Where can you run your app?
- Choosing the right platform to run your app at CyVerse
- What is container technology?
- Benefits of running your container at CyVerse
- Taking your container from laptop to CyVerse
- Sharing your app with the world (using DE + Docker)
- Hands-on walkthrough
Simple Formula for Success
The Reality
- Tools and languages: Excel, R, Perl, Python, ArcGIS, Java, Ruby, Fortran, C, C#, C++, MATLAB, etc.
- Compute platforms: Amazon, Azure, Rackspace, campus HPC, XSEDE, etc.
- And lots of glue...
Simple Formula
Where can you run your apps?
- Look at how much CPU and RAM your application can use and how much run time it needs
- What happens when you run a job in:
  - DE (regular and HPC)
  - Agave
  - Atmosphere
  - Bisque
- Pains of bringing your app to CyVerse
Container technology: What is it about?
- Allows you to create a self-contained package that contains:
  - The specific operating system version (say, Ubuntu)
  - Your application
  - All of the parts your application needs (such as libraries and other dependencies)
- Ability to share this package with other users
- This single package can then be run on any computing system that supports container technology (regardless of that system's own operating system version)
Container technology: Docker
nding-docker/
- Has many interesting features and capabilities
- Parts of Docker you need to know about:
  - Docker client / command line (CLI)
  - Dockerfile
  - Docker image
  - Docker registry
  - Docker container
- Most important concept for working with large amounts of data in Docker: the union file system
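One way to see how the parts listed above relate is to write down the CLI command that moves you from one part to the next: the Dockerfile is built into an image, the image is pushed to a registry, and running the image gives you a container. The sketch below only assembles those command lines as Python lists; it does not execute them. The image name `example/myapp:1.0` is a hypothetical placeholder, not a name from the talk.

```python
from typing import Dict, List


def lifecycle_commands(image_name: str, build_dir: str = ".") -> Dict[str, List[str]]:
    """Map each lifecycle step to the Docker CLI command that performs it."""
    return {
        # Dockerfile (in build_dir) -> image
        "build": ["docker", "build", "-t", image_name, build_dir],
        # image -> registry
        "push": ["docker", "push", image_name],
        # image -> container (removed again when it exits)
        "run": ["docker", "run", "--rm", image_name],
    }


if __name__ == "__main__":
    # "example/myapp:1.0" is a made-up image name for illustration only.
    for step, cmd in lifecycle_commands("example/myapp:1.0").items():
        print(step + ":", " ".join(cmd))
```

Each list can be passed to `subprocess.run` on a machine where Docker is installed.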
How does it work together?
What happens when you run a job in DE:
- Condor looks for a machine that matches your criteria (RAM, CPU, disk space)
- Once it finds a suitable match:
  - A data placement container runs and brings the data you want to operate on from the data store to that node
  - Your app (Docker container) runs, with the data visible to it through the union file system
  - A data placement container returns the resulting data to the data store
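The flow above can be sketched as a toy model: pick the first machine that meets the job's RAM, CPU, and disk requirements, then list the three stages that run on it. This is only an illustration of the idea, not how Condor or the DE is implemented; the `Machine`, `JobRequirements`, `find_match`, and `run_job` names are hypothetical, and the first-match policy is an assumption made to keep the example short.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Machine:
    name: str
    ram_gb: int
    cpus: int
    disk_gb: int


@dataclass
class JobRequirements:
    ram_gb: int
    cpus: int
    disk_gb: int


def find_match(machines: List[Machine], req: JobRequirements) -> Optional[Machine]:
    """Return the first machine that satisfies every requirement, or None."""
    for m in machines:
        if m.ram_gb >= req.ram_gb and m.cpus >= req.cpus and m.disk_gb >= req.disk_gb:
            return m
    return None


def run_job(machines: List[Machine], req: JobRequirements) -> List[str]:
    """Describe, in order, the stages a job goes through on the matched node."""
    node = find_match(machines, req)
    if node is None:
        return ["no matching machine found"]
    return [
        f"stage in: copy input data from the data store to {node.name}",
        f"run: start the app container on {node.name} with the data visible to it",
        f"stage out: copy results from {node.name} back to the data store",
    ]


if __name__ == "__main__":
    pool = [Machine("small", 4, 2, 50), Machine("big", 64, 16, 500)]
    for stage in run_job(pool, JobRequirements(ram_gb=32, cpus=8, disk_gb=100)):
        print(stage)
```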
How do you get started?
- Check the step-by-step instructions in the wiki at:
- Get Docker set up on your local machine (Windows, Mac, Linux) or use Atmosphere
- Plan your steps, i.e. what you want to do
- Carry out those steps and verify that things work
- Create a Dockerfile from those steps
- Submit the request for a “new tool”
- Once you hear back, design your interface (and profit)
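The "create a Dockerfile from those steps" item can be sketched as follows: once the installation commands have been verified by hand, each one becomes a `RUN` line on top of a `FROM` base image, and the program to launch becomes the `ENTRYPOINT`. The helper `render_dockerfile` and the example steps are hypothetical; the actual requirements for a submitted tool are the ones in the wiki instructions, not this sketch.

```python
import json
from typing import List


def render_dockerfile(base_image: str, run_steps: List[str], entrypoint: List[str]) -> str:
    """Turn a verified list of shell steps into Dockerfile text."""
    lines = [f"FROM {base_image}"]
    # One RUN instruction per step that was carried out and verified by hand.
    lines += [f"RUN {step}" for step in run_steps]
    # Exec-form ENTRYPOINT is written as a JSON array of strings.
    lines.append("ENTRYPOINT " + json.dumps(entrypoint))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The steps and entrypoint below are illustrative assumptions only.
    text = render_dockerfile(
        "ubuntu:14.04",
        ["apt-get update && apt-get install -y python"],
        ["python", "--version"],
    )
    print(text, end="")
```

Writing the returned text to a file named `Dockerfile` gives something `docker build` can consume.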
Future Directions
- Ability to bring any Docker image from a private repository, Docker Hub, files, etc.
- Share your app/container within your group
- Ability to bring your own compute for containers, attach it to the CyVerse pool, and manage who can send jobs to it
Word of Caution
- Containers are very powerful and have many bells and whistles (choose only the parts that you really need!)
- Avoid storing data inside containers
- Keep containers light and nimble; build on base images provided by a trusted source (iPlant prefers Ubuntu 14.X and CentOS 7.X from Docker Hub)
- Do not trust an app without a Dockerfile (it is a black box that is not easy to recreate, which is bad for reproducibility)
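One common way to follow the "avoid storing data inside containers" advice on your own machine is to keep the data in a host directory and mount it into the container at run time with `docker run -v`. The sketch below only assembles such a command; the function name, the `/data` mount point, and the example image and paths are assumptions for illustration, and this is not necessarily how the DE mounts data.

```python
from typing import List


def docker_run_command(
    image: str,
    host_data_dir: str,
    args: List[str],
    container_data_dir: str = "/data",
) -> List[str]:
    """Build a `docker run` command that keeps data on the host, not in the image."""
    return [
        "docker", "run",
        "--rm",                                       # discard the container when it exits
        "-v", f"{host_data_dir}:{container_data_dir}",  # bind-mount the host data directory
        "-w", container_data_dir,                     # start in the mounted directory
        image,
        *args,
    ]


if __name__ == "__main__":
    # Image name, host path, and arguments are made-up examples.
    cmd = docker_run_command("example/myapp:1.0", "/home/user/project", ["input.txt"])
    print(" ".join(cmd))
```

Because the container is removed on exit and the data lives in the mounted host directory, the image stays small and results survive the container.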
Thanks
- Linux community for containers
- Docker for making things more usable
- CyVerse/iPlant Core SW team for integrating Docker
- Eric Lyons for writing the first tutorial
- Many users for building Dockerfiles and submitting them!
- Whole iPlant and CyVerse team and community