A View from the Top Al Geist February 22-23 Houston TX.



Participating Organizations
Coordinator: Al Geist
Participating organizations: ORNL, ANL, LBNL, PNNL, SNL, LANL, Ames, NCSA, PSC, SDSC, IBM, Compaq, SGI, Scyld, Intel, Unlimited Scale
Main web site: www.scidac.org/ScalableSystems
The web server recently experienced problems

Review of Last Meeting
Scalable Systems Software Center, November 28-29, Dallas TX
Details on page 12 of the main project notebook

Progress Reports at Nov. mtg
Al Geist: working groups, notebooks, telecoms
Fred Johnson:
  Program management is a big focus
  Projects are under the microscope
  Interactions chart: encouraged to show utility
  Wants to see software prototypes, not just APIs
Working Group Leaders:
  What areas their working group is addressing
  Progress report on what their group has done
  Present problems being addressed
  Next steps for the group
  Discussion items for the larger group to consider

Consensus and Voting
Wire Protocol Proposal: passed straw vote, 10 for / 1 against / 1 abstain
ClusterBIOS Proposal: passed straw vote, 10 for / 0 against / 3 abstain
Three Data Delivery Models: discussion; refinement suggestions made to have two modes; vote deferred until a revised proposal is written up
XML Schema Formalization: discussion of issues that must be addressed; no vote requested
SC2001 Cluster BOFs: suggestion that Scalable Systems have a BOF at SC2002

Progress Since Last Meeting
Scalable Systems Software Center, November - February

January 15-16 SciDAC PI Meeting
Al Geist gave a 15-minute overview presentation on the Scalable Systems Software Project
Paul Hargrove and Al Geist manned a poster session on the morning of January 16
  Opportunity to meet new groups who need our stuff
  NERSC, CCS, and OSC among others expressed interest
Meeting with Fred the afternoon of January 16
  Open source issues still being worked
  CCA is the first test case

Five Notebooks in place and filling up
A main notebook for general information, and individual notebooks for each working group
Allows groups to keep track of other groups' progress and comment on the items of overlap
Allows Center members and interested parties to see what is being defined and implemented
Over 130 total pages; 56 added since last meeting
Get to all notebooks through the main web site www.scidac.org/ScalableSystems
  Click on the side bar or on “project notebooks” at the bottom of the page

Four Weekly Working Group Telecoms
Resource management, scheduling, and accounting: Tuesday 3:00 pm (Eastern), 1-800-664-0771, keyword “SSS mtg”
Validation and Testing: Wednesday 1:00 pm (Eastern), 1-877-540-9892, mtg code 999157
Process management, system monitoring, and checkpointing: Thursday 1:00 pm (Eastern), 1-877-252-5250, mtg code 160910
Node build, configuration, and information service: Friday 3:00 pm (Eastern), 1-888-469-1934, mtg code 58145

Working Group Mailing Lists
Resource management working group: sss-rm@lyris.pnl.gov
Process management working group: sss-pmwg@lbl.gov
Node build, configuration working group: sss-infrastructure@mcs.anl.gov
Validation and testing working group: scidac-tiux@sandia.gov
Mailing lists are used for notification of new notebook entries and of meeting setup or changes
Not for ideas and proposals

This Meeting
Scalable Systems Software Center, November 28-29, 2001

Agenda: February 22
8:00 Breakfast
8:30 Al Geist: “View from the Top”
9:00 Fred Johnson: MICS report
Working Group Reports
9:30 Narayan Desai: Node Build, Configure
10:30 Break
11:00 Scott Jackson: Resource Management
12:00 Lunch (on own but go somewhere as group)
1:00 Paul Hargrove: Process Management
2:00 Erik DeBenedictis: Validation and Testing
3:00 Break
3:30 Prototype Demos
  Rusty: process manager, job manager, service directory
  Ron: Scalable Linux monitor
  Eric: CPlant XML interface
  JP: information service
  Scott Jackson: QBank
5:30 Adjourn
Working groups may wish to get together in the evening

Agenda: February 23
8:00 Breakfast
8:30 Discussion, proposals, straw votes
  Scott: Demo allocation manager
  Rusty: Security wire protocol 8/3/3
  Mike: monitoring method
  Eric: XML issues
  Narayan: Service directory
10:30 Break
11:00 Al Geist: Summary; next steps overall and for working groups; next meeting June 13-14 Houston
12:00 Meeting ends

Meeting notes
Narayan: Service directory status
  ssslib: simple helper functions send_message, receive_message, sd_register, sd_unregister
  Future: multi-protocol support
Node manager functions: power up/down, boot/halt/reboot, getImage, setImage, rebuild node, configUpdate node
  ORNL is working on a prototype due by next meeting
Information service: scalable data repository intended to store well-formed info from other components
  Data management and memory leaks are still being thought about
  Key-value pairs vs. Db approach (this is preferred)
  Data has to be registered, and schemas need to be defined
Discussion: why not use SQL? Are we reinventing the wheel? PSC uses a database and it is completely integrated into their system.

Meeting notes
Scott Jackson:
  Proposed component architecture diagrams
  Creation of XML marshaller/unmarshaller
  Establishment of CVS repository at Ames
Scheduler progress: internal Resource manager API, Allocation manager API
  Initial support for checkpoint/restart; info it needs to know: what resources are tied up, when the last checkpoint was done, locality requirements
Meta scheduler progress: support for data staging, proximity optimization
  Uses Globus RSL queries, thus has ties to Grid
Job manager progress: initial study on PBS to determine viability of dissection possibilities and functionality enhancements
  Answer was NO: use all of PBS or use something else, don’t break up PBS
Allocation Manager progress: draft requirements document underway
  Prototype working, backend is SQL database. Page 37-38 RMnotebook
Current issues:
Next work: all components under CVS, design XML interface to RM scheduler demos by next mtg
Discussion of “metadata” and validation of XML schema

Meeting notes
Bigger questions: do we need SSS-wide CVS? Documentation? Problem tracking? BitKeeper? Faster than SourceForge.
Paul:
  XML is now a secondary issue, prototypes will sort it out
  Refined boundaries with RM working group
  Influencing the monitor discussion
  Resurrect the job manager. Collect PM steps
Process manager: prototype by ANL and stub scheduler to feed it
  Complete set of interfaces defined
Checkpoint Manager: a separate component w/ five entry points
  Migration interface is now 2 phase
  Requirements document published as tech report; hard copy handed out for comments
Monitors: two basic query types
  NCSA studying implementation issues
  Initial capabilities will be PBS mon functionality
Data Migration: no interface work yet
  Would be invoked by Job Manager, could invoke PM

Meeting notes
Next steps: continue work with RMwg
  Interface work for process manager and checkpoint mgr; begin for job manager and monitors
  Prototyping and refinement
  Implementation survey report
Discussion of two delivery models compressed to one
  Single Method(metric, rate, threshold, extended data)
  Jose says “one more step and we are done”
Eric: XML proposed schema structure and style (proposal tomorrow)
Source repository: one or multiple? Copyright of source. Is it accessible to public? Nightly regression tests. Supported machines?
  Discussion on options for source repository
  Teleconference being set up; time is tentatively Wed 1:00 Eastern

Meeting notes: Demos
Narayan: service directory started, process manager registers with sd
  Start miniScheduler, which finds PM and starts submitting jobs
  Uses basic wire protocol, ssslib, and the XML interface
  Next step: get RM group to supply the scheduler XML
Eric: Secure Wire Protocol through XML and Browser
  Dual mode: accepts XML and returns XML; accepts https request and returns HTML-wrapped XML
  OpenSSL with 128-bit encryption, certificate server; Security Plan can be written
  https Web page form has boxes (name) (password) and a text box to fill in XML (a default XML form is supplied)
  Submitted a job on 2 nodes of Cplant
  Next step: browser interface good (demos, GUI to Cplant, control console)

Meeting notes: Demos
Matt Sottile: Supermon, scalable cluster monitoring
  Reactive and periodic monitoring modes
  Low perturbation was a large focus
  Hierarchical architecture of supermon
  Describes the “S” (lisp-like) API
  Shows performance graphs to 20 processors, 750 Hz sample rate
  Gives demo
  Would like to get users
Al asks how this could be used in the scalable systems center
Ron says could make a supermonXML component
  And for scaling to 5000 processors, build a k-way tree with k=50
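A rough check of the k-way tree arithmetic, assuming the monitored processors are the leaves and each aggregating node has at most k children (the notes do not say how the tree would actually be laid out). The function name is illustrative only.

```python
def tree_levels(num_leaves: int, fanout: int) -> int:
    """Smallest number of levels L such that fanout**L >= num_leaves.

    Assumption: every monitored processor is a leaf and every
    aggregator handles at most `fanout` children.
    """
    if num_leaves < 1 or fanout < 2:
        raise ValueError("need num_leaves >= 1 and fanout >= 2")
    levels = 0
    capacity = 1  # leaves reachable with `levels` levels of fan-out
    while capacity < num_leaves:
        capacity *= fanout
        levels += 1
    return levels


if __name__ == "__main__":
    # 50**2 = 2500 is too small for 5000 leaves, 50**3 = 125000 is enough.
    print(tree_levels(5000, 50))
```

Under these assumptions, two levels of fan-out 50 reach only 2500 processors, so 5000 processors need a third level.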

Meeting notes: Demos
Scott Jackson: QBank demo didn’t work
JP Navarro: information service prototype
  Info service registers itself with service directory through XML
  Runs client, which finds info service through the directory
  Does query to show metadata contents of info service
  Client puts in data with DeclareData(), then InsertData()
  Wants to hear from WG how to make this more useful
Stephen describes CCS needs for a user management Db
Build/configure group stores node information
Paul asks about access control to prevent others from deleting
Long discussion on security and access
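The declare-then-insert flow from the information service demo could look roughly like the sketch below. Only the two operation names, DeclareData() and InsertData(), come from the notes; the class, the field-list schema, and the metadata and query methods are hypothetical stand-ins for whatever the prototype really does.

```python
from typing import Dict, List, Tuple


class InfoService:
    """In-memory stand-in for the information service (hypothetical)."""

    def __init__(self) -> None:
        self._schemas: Dict[str, Tuple[str, ...]] = {}
        self._records: Dict[str, List[dict]] = {}

    def declare_data(self, name: str, fields: List[str]) -> None:
        # Analogue of DeclareData(): register a data set and its fields.
        self._schemas[name] = tuple(fields)
        self._records.setdefault(name, [])

    def insert_data(self, name: str, record: dict) -> None:
        # Analogue of InsertData(): only declared data sets are accepted,
        # and the record must match the declared fields exactly.
        if name not in self._schemas:
            raise KeyError(f"data set {name!r} was never declared")
        if set(record) != set(self._schemas[name]):
            raise ValueError("record does not match declared fields")
        self._records[name].append(dict(record))

    def metadata(self) -> Dict[str, Tuple[str, ...]]:
        # What a "show metadata contents" query might return.
        return dict(self._schemas)

    def query(self, name: str) -> List[dict]:
        return list(self._records[name])


if __name__ == "__main__":
    svc = InfoService()
    svc.declare_data("node", ["hostname", "image"])
    svc.insert_data("node", {"hostname": "n001", "image": "base"})
    print(svc.metadata())
    print(svc.query("node"))
```

Rejecting undeclared data mirrors the earlier note that data has to be registered and schemas defined; access control, which the discussion raised, is left out entirely.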

Meeting notes: Day 2
Demo
Scott Jackson: Allocation manager basic operations
  Next step: building framework with allocations, people, machines, etc.
Proposals
Rusty: Basic Authentication to go with basic wire protocol
  Classic shared secret key challenge/response algorithm
  Requires no machinery except Unix crypt or MD5
  Question about password management; answer: this proposal is just for the wire protocol, with no assumptions on password management
  Discussion of whether this is required for all components
  Some thought the proposal should be stronger, others thought it should be weaker (just basic wire protocol); some who abstained wanted to consider it a while before deciding
  Proposal that basic wire protocol is augmented by a challenge/response
  Straw vote: 8 for, 3 against, 3 abstain
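A minimal sketch of a shared-secret challenge/response exchange of the kind proposed, using MD5 because the notes name it. The message layout (digest of challenge followed by secret), the 16-byte challenge, and all function names are assumptions, not the format that was voted on.

```python
import hashlib
import hmac
import os


def make_challenge() -> bytes:
    """Server side: produce a fresh random challenge (16 bytes assumed)."""
    return os.urandom(16)


def compute_response(secret: bytes, challenge: bytes) -> str:
    """Client side: prove knowledge of the secret without sending it.

    Assumed layout: MD5 over challenge followed by secret.
    """
    return hashlib.md5(challenge + secret).hexdigest()


def verify(secret: bytes, challenge: bytes, response: str) -> bool:
    """Server side: recompute the digest and compare in constant time."""
    expected = compute_response(secret, challenge)
    return hmac.compare_digest(expected, response)


if __name__ == "__main__":
    shared_secret = b"example-secret"  # how secrets are managed is out of scope
    challenge = make_challenge()
    response = compute_response(shared_secret, challenge)
    print(verify(shared_secret, challenge, response))
    print(verify(b"wrong-secret", challenge, response))
```

As in the proposal, nothing here says how the shared secret is stored or distributed; a fresh challenge per connection is what keeps a captured response from being replayed.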

Meeting notes: Day 2 Proposals
Mike: Monitoring
  Gives his history of monitoring at NCSA
  Gives live demo of NCSA Platinum Cluster Monitor
  Looked at scalable protocol design (tested on 550 machines)
  XML increases data volume by 4X; differential ASCII has the lowest volume
  Current design is in the pm notebook (diagrams)
  Single request method to cover event, streaming, polling, query
  Straw vote to take this approach for developing XML interfaces and components: 12 for, 0 against, 1 abstain
Narayan: Service Directory
  Problem: components need to locate each other and learn how to connect
  SD allows components to register, deregister, and look up
  Name conflict: returns info about all matches
  Straw vote to have service directory as part of architecture: 13/0/1
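The service directory behavior described above (register, deregister, lookup, and returning every match on a name conflict) can be sketched in memory as follows. The class, method names, and the host/port entry fields are hypothetical; the real component exchanges XML over the wire protocol, which this sketch leaves out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Entry:
    name: str   # component name, e.g. "process-manager" (illustrative)
    host: str
    port: int


class ServiceDirectory:
    """In-memory stand-in for the service directory (hypothetical)."""

    def __init__(self) -> None:
        self._entries: List[Entry] = []

    def register(self, name: str, host: str, port: int) -> Entry:
        entry = Entry(name, host, port)
        if entry not in self._entries:
            self._entries.append(entry)
        return entry

    def deregister(self, name: str, host: str, port: int) -> bool:
        entry = Entry(name, host, port)
        if entry in self._entries:
            self._entries.remove(entry)
            return True
        return False

    def lookup(self, name: str) -> List[Entry]:
        # On a name conflict, return info about all matches.
        return [e for e in self._entries if e.name == name]


if __name__ == "__main__":
    sd = ServiceDirectory()
    sd.register("process-manager", "node1", 5000)
    sd.register("process-manager", "node2", 5000)
    print(sd.lookup("process-manager"))
```

Returning a list from lookup leaves the choice among conflicting registrations to the caller, which matches the note that a name conflict returns info about all matches.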

Meeting notes: Day 2
Jose asks about RAS considerations: what happens if a component doesn’t respond? Discussion follows, including high availability considerations.
Eric: XML issues
  Multiple roles for XML schema
  Global definitions in a Global schema: host, ports, authentication, etc.
  XML namespaces: propose http://www.scidac.org/sss/[global, pm, shed, acct]
  Proposed type names, e.g. host-type, env-type, stdio-type, args-type
  Style consistency: decide later
  Discussion: good idea to have a global namespace; need to consider version numbers
  Straw vote: 12 for, 0 against, 2 abstain
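To illustrate how a component message might mix the proposed global namespace with a working-group namespace, here is a small sketch using Python's standard ElementTree. The two namespace URIs follow the proposal; the element names (start-job, host, port), the version attribute, and the prefixes are invented for illustration and are not part of any agreed schema.

```python
import xml.etree.ElementTree as ET

SSS_BASE = "http://www.scidac.org/sss/"
GLOBAL_NS = SSS_BASE + "global"   # shared definitions: host, ports, ...
PM_NS = SSS_BASE + "pm"           # process management working group


def build_request(host: str, port: int, version: str = "0.1") -> str:
    """Build a hypothetical PM request that reuses global element names."""
    # Prefixes only affect serialization; they are arbitrary choices here.
    ET.register_namespace("sss", GLOBAL_NS)
    ET.register_namespace("pm", PM_NS)

    root = ET.Element(f"{{{PM_NS}}}start-job", {"version": version})
    host_el = ET.SubElement(root, f"{{{GLOBAL_NS}}}host")
    host_el.text = host
    port_el = ET.SubElement(root, f"{{{GLOBAL_NS}}}port")
    port_el.text = str(port)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_request("node1", 5000))
```

The version attribute is one possible answer to the point raised in discussion about version numbers; the notes do not record what was decided.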