Computer Science Research Ian Foster University of Chicago & Argonne National Laboratory GriPhyN NSF Project Review 29-30 January 2003.


1 Computer Science Research
Ian Foster, University of Chicago & Argonne National Laboratory
foster@mcs.anl.gov
GriPhyN NSF Project Review, 29-30 January 2003, Chicago

2 Computer Science Research
29 Jan 2003, Ian Foster, U.Chicago, foster@mcs.anl.gov
• Introduction & Context (Ian Foster: 30 mins)
  – Vision: Virtual data as e-science enabler
  – Organization: Structure & interactions
  – Dissemination: Targets and mechanisms
  – The nature of future challenges
• Computer science research
  – Virtual data (Mike Wilde: 15)
  – Scheduling, planning (Ewa Deelman: 15)
  – Execution (Mike Franklin: 15)
  – Performance (Valerie Taylor: 15)
• Technology delivery (Miron Livny: 15)
  – Virtual Data Toolkit
• Student presentations (60)


4 PetaScale Virtual Data Grids (1)
[Architecture diagram. Labels:]
• Production Team; Individual Investigator; Research group
• Interactive User Tools
• Virtual Data Tools; Request Planning & Scheduling Tools; Request Execution & Management Tools
• Resource Management Services; Security and Policy Services; Other Grid Services
• Transforms; Raw data source; Distributed resources (code, storage, computers, and network)
• PetaOps; Petabytes; Performance

5 Petascale Virtual Data Grids (2)

6 Computer Science and GriPhyN
[Diagram. Labels:]
• Computer Science Research; Virtual Data Toolkit; Partner Physics Projects; Larger Science Community
• Globus, Condor, NMI, EU DataGrid, PPDG
• Techniques & software; Requirements; Prototyping & experiments; Production Deployment; Tech Transfer; Communities
• Other linkages: Work force; CS researchers; Industry

7 Computer Science Challenges (1)
• Virtual data
  – Representation, discovery, & manipulation of workflows and associated data & programs
• Planning
  – Mapping workflows in an efficient, policy-aware manner to distributed resources
• Execution
  – Executing workflows, including data movements, reliably and efficiently
• Performance
  – Monitoring aspects of system performance for scheduling & troubleshooting
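As a minimal sketch of the planning and execution challenges on slide 7, the Python below represents a workflow as a DAG of steps, orders it so producers run before consumers, and maps each step to a site permitted by a toy policy. Every name here (`Step`, `execution_order`, `plan`, the site names, the policy table) is hypothetical and chosen for illustration; none of it is a GriPhyN, Chimera, or Pegasus API.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass(frozen=True)
class Step:
    name: str
    inputs: tuple   # names of datasets this step consumes
    output: str     # name of the dataset this step produces


def execution_order(steps):
    """Return step names so that every producer runs before its consumers."""
    producer = {s.output: s.name for s in steps}
    graph = {
        s.name: {producer[d] for d in s.inputs if d in producer}
        for s in steps
    }
    return list(TopologicalSorter(graph).static_order())


def plan(steps, sites, allowed):
    """Assign each step to the least-loaded site that the policy permits."""
    load = {site: 0 for site in sites}
    assignment = {}
    for name in execution_order(steps):
        # Steps without a policy entry may run at any site.
        candidates = [s for s in sites if s in allowed.get(name, sites)]
        if not candidates:
            raise ValueError(f"no site permitted for step {name!r}")
        site = min(candidates, key=lambda s: load[s])
        load[site] += 1
        assignment[name] = site
    return assignment


workflow = [
    Step("simulate", (), "raw_events"),
    Step("reconstruct", ("raw_events",), "reco_events"),
    Step("analyze", ("reco_events",), "histograms"),
]
sites = ["site_a", "site_b"]
policy = {"reconstruct": {"site_b"}}  # hypothetical rule: only site_b may run this step

order = execution_order(workflow)
assignment = plan(workflow, sites, policy)
print(order)
print(assignment)
```

The least-loaded rule is only a placeholder for "efficient"; a real planner would also weigh data location and transfer cost, which this sketch ignores.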

8 Computer Science Challenges (2)
• Engage meaningfully with physics groups
• Provide educational opportunities
• Develop, package, deliver, and support quality software
• Achieve outreach to groups outside partner physics experiments


10 GriPhyN Computer Science Team
• U.Chicago: Dumitrescu, Foster, Iamnitchi, Milligan, Ranganathan, Ripeanu, Voeckler, Wilde
• USC/ISI: Deelman, Kesselman, Mehta, Patil, Singh, Vahi
• NWU -> TAMU: Taylor, Yin
• UCB: Franklin, Liu
• UCSD: Marzullo, Moore, Zhang, Jagatheesan
• UW-Madison: Alderman, Arpaci-Dusseau, Arpaci-Dusseau, Bailey, Bent, Kosar, Livny, Roy, Stanley, Thain
• UF: Arbee, George, Jiang, Katageri, Ranka, Rodriguez
• UT Brownsville: Campanelli, Morris, Zamora
• LBNL: Shoshani
Legend: Faculty/Staff, Student/Postdoc (underlined = present)

11 Computer Science Research: How Do We Work?
• System architecture & virtual data toolkit as two overarching organizational mechanisms
• Project activities all defined in relationship to these organizing principles:
  – Research: Explore new techniques to guide evolution of the system architecture and VDT
  – Development: Construct VDT software
  – Evaluation: Apply and evaluate VDT software and/or new techniques in context of application challenges

12 Computer Science Research: How Are We Coordinated?
• The activities of this large, multidisciplinary group are coordinated by frequent and multivalent communications
  – Face-to-face meetings in large & small groups
  – Formal and informal documents defining requirements, challenge problems, testbeds
  – Email, phone calls, videoconferences
  – Cooperation on challenge problems and technology and application demonstrations
  – Cooperation on software releases

13 GriPhyN Architecture/VDT and CS Research Projects
[Diagram. Labels:]
• Architecture areas: Virtual Data; Planning; Execution
• VDT: Chimera Virtual Data System + Pegasus Planner; DAGman Workflow; Globus Toolkit, Condor, Ganglia, etc.
• Research:
  – Partial Queries (Liu, Franklin)
  – Decentralized scheduling (Ranganathan)
  – Fault-tolerant master-worker (Marzullo)
  – Scalable replica location service (UC, ISI team)
  – Policy-aware scheduling (Dumitrescu)
  – Ontologies (Zhao)
  – NeST Storage mgmt (UW team)
  – Virtual data language design (Voeckler, Wilde)
  – AI Planning (Deelman, Narang)
  – Virtual data language applns (Milligan, Zhao)
  – DAGman enhancements (UW team)
  – Prophesy (Taylor, Yin)
  – HP monitoring (George)

14 GriPhyN Arch/VDT and CS Research: Degree of Coupling
[Diagram: the same architecture areas, VDT components, and research projects as slide 13, with a legend distinguishing couplings that are Already Underway from those that are Pending.]

15 Examples of Technology Injection: Chimera R&D Timeline
[Timeline diagram. Time axis: 2002, 2003, 2004.]
• TECH
  – Chimera-0: Derivations only; Grid exec environment (prototype); Perl & PostgreSQL
  – Chimera-1: Java code & class model; XML VDL; TR/DV model; Compound TRs; General Grid exec env; Optimized DB schema
  – Chimera-2: Type model; Dataset catalog; Metadata; Hyperlinks; Instance tracking; Performance data
  – Chimera-3: Knowledge repr.; Policy-driven planners; VD browsers, composers; …
• APPS
  – Sloan cluster finding
  – CMS event simulation prototyping
  – CMS analysis prototype w/ROOT
  – CMS official event simulation
  – Sloan cluster-finding science
  – CMS & ATLAS analysis w/ROOT, CLARENS, JAS
  – LIGO pulsar search
  – ATLAS events-on-demand
  – Sloan near-earth object
  – Bio Grid facility
  – …
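The TR/DV model listed under Chimera-1 separates transformations (TR, reusable descriptions of a computation) from derivations (DV, concrete invocations that bind a transformation to named input and output datasets). The Python below is a toy illustration of how such records can satisfy a data request on demand; it is not Chimera's VDL or catalog interface, and the class names, the `materialize` function, and the dataset names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Transformation:
    """TR: a named, reusable computation (stand-in for an executable program)."""
    name: str
    func: Callable[..., object]


@dataclass(frozen=True)
class Derivation:
    """DV: one invocation of a TR, bound to concrete input and output datasets."""
    transformation: Transformation
    inputs: Tuple[str, ...]
    output: str


def materialize(dataset, derivations, store, log):
    """Return `dataset`, deriving it (and its inputs) only if it is not stored."""
    if dataset in store:
        return store[dataset]
    if dataset not in derivations:
        raise KeyError(f"no data and no derivation recorded for {dataset!r}")
    dv = derivations[dataset]
    args = [materialize(name, derivations, store, log) for name in dv.inputs]
    store[dataset] = dv.transformation.func(*args)
    log.append(dv.transformation.name)  # record which TRs actually ran
    return store[dataset]


square = Transformation("square", lambda xs: [x * x for x in xs])
total = Transformation("total", lambda xs: sum(xs))

# Derivations are indexed by the dataset they produce.
derivations: Dict[str, Derivation] = {
    "squares": Derivation(square, ("raw",), "squares"),
    "sum_of_squares": Derivation(total, ("squares",), "sum_of_squares"),
}
store = {"raw": [1, 2, 3]}  # only the raw data exists up front
log = []

result = materialize("sum_of_squares", derivations, store, log)
first_log = list(log)
materialize("sum_of_squares", derivations, store, log)  # already stored: nothing reruns
print(result, first_log, log)
```

Keeping the derivation records separate from the stored data is what lets a request be answered either from existing data or by recomputation; the sketch omits cycle detection and any notion of where data lives.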


17 Dissemination: Targets
• Researchers and educators
  – Facilitate creation of new knowledge
• Computer science research community
  – Contribute to knowledge
  – Engage community in solving our problems
• Open source community
  – Contribute to open Grid technology base
• Industry
  – Contribute to vibrant commercial technology

18 Dissemination: Mechanisms
• Software
  – VDT: adoption by LHC Computing Grid
  – Globus Toolkit and Condor systems
• Publications and talks
  – XX papers, YY tech reports, ZZ talks
• Workshops and meetings
  – E.g., “Data Derivation & Provenance”, Oct 02
• Community activities
  – E.g., advisory committees, GGF standards

19 Representative Publications
• Annis, J., Zhao, Y., Voeckler, J., Wilde, M., Kent, S., Foster, I., Applying Chimera Virtual Data Concepts to Cluster Finding in the Sloan Sky Survey. SC'2002, 2002.
• Bent, J., Venkataramani, V., LeRoy, N., Roy, A., Stanley, J., Arpaci-Dusseau, A.C., Arpaci-Dusseau, R.H., Livny, M., Flexibility, Manageability, and Performance in a Grid Storage Appliance. HPDC'11, 2002.
• Deelman, E., Blackburn, K., Ehrens, P., Kesselman, C., Koranda, S., Lazzarini, A., Mehta, G., Meshkat, L., Pearlman, L., Williams, R., GriPhyN and LIGO: Building a Virtual Data Grid for Gravitational Wave Scientists. HPDC'11, 2002.
• Foster, I., Voeckler, J., Wilde, M., Zhao, Y., Chimera: A Virtual Data System for Representing, Querying, and Automating Data Derivation. SSDBM, 2002.
• Iamnitchi, A., Ripeanu, M., Foster, I., Locating Data in (Small-World?) Peer-to-Peer Scientific Collaborations. 1st Intl. Workshop on Peer-to-Peer Systems, 2002.
• Raman, P., George, A., Radlinski, M., Subramaniyan, R., GEMS: Gossip-Enabled Monitoring Service for Heterogeneous Distributed Systems. Technical Report, UF, 2002.
• Ranganathan, K., Foster, I., Decoupling Computation and Data Scheduling in Distributed Data Intensive Applications. HPDC'11, 2002.
• Ripeanu, M., Foster, I., Iamnitchi, A., Mapping the Gnutella Network: Properties of Large-Scale Peer-to-Peer Systems and Implications for System Design. Internet Computing, 6 (1), 50-57, 2002.


21 The Nature of Future Challenges
• GriPhyN R&D is proving very successful
  – In terms of “new ideas”
  – In terms of interest & adoption
• Our major challenges as we move forward are to scale and sustain the effort
  – Research scope: virtual data => KR; planning, execution => x1000 larger; …; …
  – Software support: we need NMIx10!
  – Infrastructure & application support
• See Atkins cyberinfrastructure report!

22 Summary
• CS has made significant contributions both to experiments and to knowledge, e.g.
  – Virtual data concepts and technologies
  – Scheduling in large-scale distributed systems
  – DAGman workflow management & execution
  – Scalable replica location services
• VDT (& underlying Globus Toolkit & Condor systems) is a good technology transfer vehicle
  – Adoption by major science projects
  – Adoption of Grid concepts within industry
• Major challenge: exploiting opportunities

