Computer Science Research Ian Foster University of Chicago & Argonne National Laboratory GriPhyN NSF Project Review 29-30 January 2003.

Presentation transcript:

Computer Science Research Ian Foster University of Chicago & Argonne National Laboratory GriPhyN NSF Project Review January 2003 Chicago

Slide 2 (29 Jan 2003, Ian Foster, U.Chicago): Computer Science Research
• Introduction & Context (Ian Foster: 30 mins)
  – Vision: Virtual data as e-science enabler
  – Organization: Structure & interactions
  – Dissemination: Targets and mechanisms
  – The nature of future challenges
• Computer science research
  – Virtual data (Mike Wilde: 15)
  – Scheduling, planning (Ewa Deelman: 15)
  – Execution (Mike Franklin: 15)
  – Performance (Valerie Taylor: 15)
• Technology delivery (Miron Livny: 15)
  – Virtual Data Toolkit
• Student presentations (60)

Slide 4: PetaScale Virtual Data Grids (1)
[Figure: architecture diagram. Labels recovered from the slide:]
• Users: Production Team; Individual Investigator; Research group
• Tools: Interactive User Tools; Virtual Data Tools; Request Planning & Scheduling Tools; Request Execution & Management Tools
• Services: Resource Management Services; Security and Policy Services; Other Grid Services
• Resources: Transforms; Raw data source; Distributed resources (code, storage, computers, and network)
• Targets: PetaOps; Petabytes; Performance

Slide 5: Petascale Virtual Data Grids (2)

Slide 6: Computer Science and GriPhyN
[Figure: relationship diagram. Labels recovered from the slide:]
• Boxes: Computer Science Research; Virtual Data Toolkit; Partner Physics Projects; Larger Science Community; Globus, Condor, NMI, EU DataGrid, PPDG Communities
• Link labels: Techniques & software; Requirements; Prototyping & experiments; Production Deployment; Tech Transfer
• Other linkages: Work force; CS researchers; Industry

Slide 7: Computer Science Challenges (1)
• Virtual data
  – Representation, discovery, & manipulation of workflows and associated data & programs
• Planning
  – Mapping workflows in an efficient, policy-aware manner to distributed resources
• Execution
  – Executing workflows, including data movements, reliably and efficiently
• Performance
  – Monitoring aspects of system performance for scheduling & troubleshooting
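The planning and execution challenges listed above can be illustrated with a minimal sketch. It is not GriPhyN, Pegasus, or DAGman code: the workflow, the site names, the policy table, and the helpers `plan` and `execute` are all hypothetical, and the "least-loaded allowed site" rule is just one simple stand-in for policy-aware mapping. It assumes Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each job maps to the set of jobs it depends on.
workflow = {
    "simulate": set(),
    "reconstruct": {"simulate"},
    "analyze": {"reconstruct"},
    "plot": {"analyze"},
}

# Hypothetical policy table: the sites each job is permitted to use.
allowed_sites = {
    "simulate": ["site_a", "site_b"],
    "reconstruct": ["site_a"],
    "analyze": ["site_b", "site_a"],
    "plot": ["site_b"],
}


def plan(workflow, allowed_sites):
    """Planning: order jobs by dependency and map each to an allowed site.

    Among the sites the policy allows, pick the one with the fewest jobs
    assigned so far (ties go to the first site listed).
    """
    load = {}
    schedule = []
    for job in TopologicalSorter(workflow).static_order():
        site = min(allowed_sites[job], key=lambda s: load.get(s, 0))
        load[site] = load.get(site, 0) + 1
        schedule.append((job, site))
    return schedule


def execute(schedule, run_job, max_retries=2):
    """Execution: run jobs in planned order, retrying failed jobs.

    run_job(job, site) is a caller-supplied function returning True on success.
    Returns a log of (job, site, attempts_used).
    """
    log = []
    for job, site in schedule:
        for attempt in range(1, max_retries + 2):
            if run_job(job, site):
                log.append((job, site, attempt))
                break
        else:
            raise RuntimeError(f"{job} failed on {site} after {max_retries + 1} attempts")
    return log


if __name__ == "__main__":
    schedule = plan(workflow, allowed_sites)
    for job, site, attempts in execute(schedule, lambda job, site: True):
        print(f"{job} ran on {site} (attempts: {attempts})")
```

Keeping `plan` separate from `execute` mirrors the slide's split between mapping a workflow onto resources and running it reliably; a real planner would also weigh data location and resource state, which this sketch ignores.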

Slide 8: Computer Science Challenges (2)
• Engage meaningfully with physics groups
• Provide educational opportunities
• Develop, package, deliver, and support quality software
• Achieve outreach to groups outside partner physics experiments

Slide 10: GriPhyN Computer Science Team
• U.Chicago: Dumitrescu, Foster, Iamnitchi, Milligan, Ranganathan, Ripeanu, Voeckler, Wilde
• USC/ISI: Deelman, Kesselman, Mehta, Patil, Singh, Vahi
• NWU -> TAMU: Taylor, Yin
• UCB: Franklin, Liu
• UCSD: Marzullo, Moore, Zhang, Jagatheesan
• UW-Madison: Alderman, Arpaci-Dusseau, Arpaci-Dusseau, Bailey, Bent, Kosar, Livny, Roy, Stanley, Thain
• UF: Arbee, George, Jiang, Katageri, Ranka, Rodriguez
• UT Brownsville: Campanelli, Morris, Zamora
• LBNL: Shoshani
Legend: Faculty/Staff, Student/Postdoc (underlined = present)

Slide 11: Computer Science Research: How do We Work?
• System architecture & virtual data toolkit as two overarching organizational mechanisms
• Project activities all defined in relationship to these organizing principles:
  – Research: Explore new techniques to guide evolution of the system architecture and VDT
  – Development: Construct VDT software
  – Evaluation: Apply and evaluate VDT software and/or new techniques in context of application challenges

Slide 12: Computer Science Research: How Are We Coordinated?
• The activities of this large, multidisciplinary group are coordinated by frequent and multivalent communications
  – Face-to-face meetings in large & small groups
  – Formal and informal documents defining requirements, challenge problems, testbeds
  – Email, phone calls, videoconferences
  – Cooperation on challenge problems and technology and application demonstrations
  – Cooperation on software releases

Slide 13: GriPhyN Architecture/VDT and CS Research Projects
[Figure: VDT components alongside related research projects. Labels recovered from the slide:]
• Layers: Virtual Data; Planning; Execution
• VDT components: Chimera Virtual Data System + Pegasus Planner; DAGman Workflow; Globus Toolkit, Condor, Ganglia, etc.
• Research projects:
  – Partial queries (Liu, Franklin)
  – Decentralized scheduling (Ranganathan)
  – Fault-tolerant master-worker (Marzullo)
  – Scalable replica location service (UC, ISI team)
  – Policy-aware scheduling (Dumitrescu)
  – Ontologies (Zhao)
  – NeST storage management (UW team)
  – Virtual data language design (Voeckler, Wilde)
  – AI planning (Deelman, Narang)
  – Virtual data language applications (Milligan, Zhao)
  – DAGman enhancements (UW team)
  – Prophesy (Taylor, Yin)
  – HP monitoring (George)

Slide 14: GriPhyN Arch/VDT and CS Research: Degree of Coupling
[Figure: same layers, VDT components, and research projects as on slide 13, with a legend distinguishing "Already Underway" from "Pending".]

Slide 15: Examples of Technology Injection: Chimera R&D Timeline
[Figure: timeline pairing technology releases (TECH) with applications (APPS). Labels recovered from the slide:]
TECH
• Chimera-0: Derivations only; Grid exec environment (prototype); Perl & PostgreSQL
• Chimera-1: Java code & class model; XML VDL; TR/DV model; Compound TRs; General Grid exec env; Optimized DB schema
• Chimera-2: Type model; Dataset catalog; Metadata; Hyperlinks; Instance tracking; Performance data
• Chimera-3: Knowledge repr.; Policy-driven planners; VD browsers, composers; …
APPS (in the order captured in the transcript)
• Sloan cluster finding
• CMS analysis prototype w/ROOT
• CMS official event simulation
• Sloan cluster-finding science
• CMS & ATLAS analysis w/ROOT, CLARENS, JAS
• LIGO pulsar search
• ATLAS events-on-demand
• CMS event simulation prototyping
• Sloan near-earth object
• Bio Grid facility
• …
Time axis marker: 2004
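The "TR/DV model" entry on the timeline can be illustrated with a small sketch, reading TR as transformation (a program template) and DV as derivation (one invocation of a transformation on named datasets). This is not Chimera code or VDL syntax: the classes, field names, dataset names, and the `lineage` helper are all hypothetical, chosen only to show how recorded derivations let a system answer "how was this dataset produced?".

```python
from dataclasses import dataclass


@dataclass
class Transformation:
    """A program template with formal input and output parameter names."""
    name: str
    inputs: tuple
    outputs: tuple


@dataclass
class Derivation:
    """One invocation of a transformation, bound to actual dataset names."""
    name: str
    transformation: Transformation
    inputs: tuple
    outputs: tuple


def lineage(dataset, derivations):
    """Return the derivation names needed to produce `dataset`, in run order.

    Datasets that no derivation produces are treated as raw inputs.
    """
    producers = {out: dv for dv in derivations for out in dv.outputs}
    steps, seen = [], set()

    def visit(name):
        dv = producers.get(name)
        if dv is None or dv.name in seen:
            return
        seen.add(dv.name)
        for source in dv.inputs:
            visit(source)          # upstream derivations come first
        steps.append(dv.name)

    visit(dataset)
    return steps


# Hypothetical catalog: simulate events from a config, then histogram them.
simulate = Transformation("simulate", inputs=("config",), outputs=("events",))
histogram = Transformation("histogram", inputs=("events",), outputs=("hist",))
catalog = [
    Derivation("dv_hist", histogram, inputs=("run7.events",), outputs=("run7.hist",)),
    Derivation("dv_sim", simulate, inputs=("run7.cfg",), outputs=("run7.events",)),
]

if __name__ == "__main__":
    print(lineage("run7.hist", catalog))
```

Because each derivation records its inputs and outputs, the same catalog supports both provenance queries and re-derivation of a missing dataset by replaying the returned steps.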

Slide 17: Dissemination: Targets
• Researchers and educators
  – Facilitate creation of new knowledge
• Computer science research community
  – Contribute to knowledge
  – Engage community in solving our problems
• Open source community
  – Contribute to open Grid technology base
• Industry
  – Contribute to vibrant commercial technology

Slide 18: Dissemination: Mechanisms
• Software
  – VDT: adoption by LHC Computing Grid
  – Globus Toolkit and Condor systems
• Publications and talks
  – XX papers, YY tech reports, ZZ talks
• Workshops and meetings
  – E.g., “Data Derivation & Provenance”, Oct 02
• Community activities
  – E.g., advisory committees, GGF standards

Slide 19: Representative Publications
• Annis, J., Zhao, Y., Voeckler, J., Wilde, M., Kent, S., Foster, I., Applying Chimera Virtual Data Concepts to Cluster Finding in the Sloan Sky Survey. SC'2002.
• Bent, J., Venkataramani, V., LeRoy, N., Roy, A., Stanley, J., Arpaci-Dusseau, A.C., Arpaci-Dusseau, R.H., Livny, M., Flexibility, Manageability, and Performance in a Grid Storage Appliance, HPDC’11.
• Deelman, E., Blackburn, K., Ehrens, P., Kesselman, C., Koranda, S., Lazzarini, A., Mehta, G., Meshkat, L., Pearlman, L., and Williams, R., GriPhyN and LIGO: Building a Virtual Data Grid for Gravitational Wave Scientists, HPDC’11.
• Foster, I., Voeckler, J., Wilde, M., Zhao, Y., Chimera: A Virtual Data System for Representing, Querying, and Automating Data Derivation, SSDBM.
• Iamnitchi, A., Ripeanu, M., Foster, I., Locating Data in (Small-World?) Peer-to-Peer Scientific Collaborations. 1st Intl. Workshop on Peer-to-Peer Systems.
• Raman, P., George, A., Radlinski, M., Subramaniyan, R., GEMS: Gossip-Enabled Monitoring Service for Heterogeneous Distributed Systems, Technical Report, UF.
• Ranganathan, K. and Foster, I., Decoupling Computation and Data Scheduling in Distributed Data Intensive Applications, HPDC’11.
• Ripeanu, M., Foster, I., Iamnitchi, A., Mapping the Gnutella Network: Properties of Large-Scale Peer-to-Peer Systems and Implications for System Design. Internet Computing, 6 (1).

Slide 21: The Nature of Future Challenges
• GriPhyN R&D is proving very successful
  – In terms of “new ideas”
  – In terms of interest & adoption
• Our major challenges as we move forward are to scale and sustain the effort
  – Research scope: virtual data => KR; planning, execution => x1000 larger; …; …
  – Software support: we need NMIx10!
  – Infrastructure & application support
• See Atkins cyberinfrastructure report!

Slide 22: Summary
• CS has made significant contributions both to experiments and to knowledge, e.g.
  – Virtual data concepts and technologies
  – Scheduling in large-scale distributed systems
  – DAGman workflow management & execution
  – Scalable replica location services
• VDT (& underlying Globus Toolkit & Condor systems) a good technology transfer vehicle
  – Adoption by major science projects
  – Adoption of Grid concepts within industry
• Major challenge: exploiting opportunities