An Introduction to the Open Science Data Cloud Heidi Alvarez Florida International University Robert L. Grossman University of Chicago Open Cloud Consortium.

An Introduction to the Open Science Data Cloud
Heidi Alvarez, Florida International University
Robert L. Grossman, University of Chicago and Open Cloud Consortium
October 10, 2013

1. Open Science Data Cloud (OSDC)

– OSDC is a Science Cloud Service Provider (CSP)
– Operated by the not-for-profit Open Cloud Consortium
– OSDC is a 6 PB / 12,000 core science cloud
– 1 PB of science data for the research community
– 1 PB of biomedical data for medical research
– We have been doubling in size each year
– We run production services for NASA and NIH researchers
– Interoperates with Amazon Web Services (AWS), though still in rudimentary form
– Hundreds of users (not thousands)
– Typical job uses 1000s of core hours over ’s TB

Science Cloud
– Earth sciences, biological sciences, social sciences, digital humanities
– ACL, groups, etc.
Biomedical Cloud
– Designed to hold Protected Health Information (PHI), e.g. genomic data, electronic medical records, etc. (HIPAA, FISMA)

What You Get with the OSDC
– Log in with your university credentials via InCommon
– Launch virtual machines and virtual clusters, and access large Hadoop clusters
– Access PB+ of open and protected data
– Manage files, collections of files, collections of collections
– Manage users, groups of users
– Manage accounts, sub-accounts
– Efficient transfer of large data (UDT, UDR)
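To make "files, collections of files, collections of collections" concrete, here is a minimal sketch of a nested collection model. The class names, fields, and sample data are hypothetical illustrations; they are not taken from the OSDC software.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class DataFile:
    """A single file with a size in bytes (hypothetical model, not the OSDC schema)."""
    name: str
    size_bytes: int


@dataclass
class Collection:
    """A named group whose members are files or other collections."""
    name: str
    members: List[Union["DataFile", "Collection"]] = field(default_factory=list)

    def total_bytes(self) -> int:
        """Sum file sizes recursively through any nested collections."""
        total = 0
        for member in self.members:
            if isinstance(member, Collection):
                total += member.total_bytes()
            else:
                total += member.size_bytes
        return total


# Illustrative data only: names and sizes are made up.
scenes = Collection("scenes", [DataFile("a.tif", 2_000), DataFile("b.tif", 3_000)])
project = Collection("project", [scenes, DataFile("README.txt", 500)])
print(project.total_bytes())  # 5500
```

Treating a collection as just another member keeps operations such as sizing, listing, or sharing uniform at every level of nesting.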

Open Cloud Consortium
– U.S.-based not-for-profit corporation
– Companies: Cisco, Yahoo!, Infoblox, …
– Universities: University of Chicago, Northwestern Univ., Johns Hopkins, Calit2, etc.
– Federal agencies and labs: NASA, LLNL, ORNL
– International university and government partners
– Manages cloud computing infrastructure to support scientific research: Open Science Data Cloud
– Manages cloud computing testbeds: Open Cloud Testbed

Our Point of View
– We want to develop as little technology and software as possible; we want others to develop software and technology.
– We focus on providing researchers the ability to compute over large and very large datasets.
– We need open source solutions.
– Today it is difficult to interoperate with AWS for our protected data cloud, but we expect this to change (someday).
– Run lights-out across multiple data centers connected with 10G (soon 100G) networks.

2. Challenges

OSDC Data Centers and Networks
– We have three data centers
  – Chicago with 100G to StarLight
  – FIU with 10G to StarLight
  – Livermore Valley Open Campus with 10G to StarLight
– We’re planning one more data center with a 100G connection to StarLight
– We are looking to interoperate the OSDC with international partners over 10G and 100G networks
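A quick back-of-the-envelope calculation shows why the jump from 10G to 100G matters at petabyte scale. The sketch below assumes an ideal link running at full line rate unless an `efficiency` factor is supplied; the function name and that parameter are illustrative assumptions, and real transfers will be slower because of protocol overhead and disk limits.

```python
def transfer_time_days(data_bytes: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Days needed to move data_bytes over a link of link_gbps gigabits per second.

    efficiency is the assumed fraction of line rate actually achieved (0 < efficiency <= 1).
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be positive and efficiency must be in (0, 1]")
    bits = data_bytes * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 86_400  # seconds per day


PB = 1e15  # one petabyte in bytes (decimal)
for gbps in (10, 100):
    print(f"1 PB over {gbps}G: {transfer_time_days(PB, gbps):.2f} days")
```

Under these idealized assumptions, 1 PB takes a little over nine days at 10G and under one day at 100G.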

Challenges
– We are focusing on the following:
  – How do we authenticate and authorize researchers at our international partners, and control their access to data and to cloud-based services (storage and compute)?
  – We need open source implementations of these services
  – We need trust relationships with our peers
– We are running a series of interoperability workshops to try to get this right.
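The three concerns above (authentication, authorization, and trust between peers) can be illustrated with a small sketch. Everything here is hypothetical: the class names, the provider domains, and the group names are invented for illustration and do not describe how the OSDC implements access control.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Set


@dataclass(frozen=True)
class User:
    username: str
    identity_provider: str  # the home institution that vouches for this identity
    groups: FrozenSet[str]


@dataclass
class Resource:
    name: str
    acl: Dict[str, Set[str]]  # group name -> actions that group may perform


def is_allowed(user: User, resource: Resource, action: str, trusted_providers: Set[str]) -> bool:
    """Allow an action only for trusted identities whose groups grant it."""
    # Trust relationship: reject identities asserted by providers we do not trust.
    if user.identity_provider not in trusted_providers:
        return False
    # Authorization: any one of the user's groups may grant the action.
    return any(action in resource.acl.get(group, set()) for group in user.groups)


# Illustrative data only: providers, groups, and users are made up.
trusted = {"uchicago.example", "uva.example"}
genomes = Resource("genomes", {"biomed-researchers": {"read"}, "curators": {"read", "write"}})
alice = User("alice", "uva.example", frozenset({"biomed-researchers"}))
mallory = User("mallory", "unknown.example", frozenset({"curators"}))

print(is_allowed(alice, genomes, "read", trusted))     # True
print(is_allowed(alice, genomes, "write", trusted))    # False
print(is_allowed(mallory, genomes, "write", trusted))  # False
```

Checking the identity provider before consulting the access list means that a group membership claimed through an untrusted peer never grants access.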

PARTNERSHIP FOR INTERNATIONAL RESEARCH AND EDUCATION NSF Award #

National Science Foundation Partnership for International Research and Education
– 5-year program, 2010 – 2014, at $3.5M
– Prepares students to compete in the global cyberinfrastructure community
– Provides international research and education experiences around the world
– The student/faculty/scientist research teams help develop large-scale distributed computing capabilities, data, and state-of-the-art services for integrating, analyzing, sharing and archiving scientific data

Malcolm Atkinson – School of Informatics, Edinburgh University, Scotland, UK
Paola Grosso & Cees de Laat – Faculty of Science, Informatics Institute, University of Amsterdam, The Netherlands
Karen Langona and Tereza Cristina Carvalho – LARC, Laboratory of Computer Networks and Architecture at the University of Sao Paulo, Brazil
Satoshi Sekiguchi – National Institute of Advanced Industrial Science and Technology (AIST), Japan
Chung-I Wu – Beijing Institute of Genomics (BIG), Chinese Academy of Sciences

What? Funded internships for US citizens and residents, which provide the chance to participate in sophisticated international research collaborations.
When? Summer of 2014
How long? 6 weeks
Where? At any of our international partners.

Websites: news.opensciencedatacloud.org, opensciencedatacloud.org
Mailing list, Summer Workshops

Questions?

Thank You!