András Kövi (OptXware / BUTE, mit.bme.hu), October 15, 2008: OpenSAF from a user’s perspective

Presentation transcript:


Outline
How did the story begin? (10)
Getting your first cluster to work (15)
  – Typical faults
Trying the sample applications (20)
Troubleshooting guide (15)
  – Debugging
How to ask when problems arise? (10)
Programming tips (10)


How did the story begin?
First working system: 4 weeks
Reasons
  – Not reading the INSTALL file carefully enough
  – Ambiguity in the guide docs: too many assumptions about the user’s knowledge
  – Problems with the OS (RHEL 4)
  – No experience

The community listens… OpenSAF improves
Most significant updates
  – Speed-up of the start process (1.5 min → sec)
  – More specific instructions in INSTALL
  – Cleanup of init scripts
      “hand shake” at shutdown … in progress
  – Reorganization of directories
  – Simplification of rde.conf
  – Management
      Stable SNMP interface
      CLI improvement initiative has been developed – LOC today

Lessons learned
Never ignore the INSTALL file
  – This is not just configure, make, make install…
If something is ambiguous, feel free to ask
Read the details
  – Saves you a lot of hassle
One Controller is enough for development

Live demo
Creating your first cluster
Running the example applications

Create your first cluster I.
  – Acquire OpenSAF
  – Read INSTALL
  – Acquire and install prerequisite packages
  – Compile OpenSAF
      configure build_type=controller/payload
      make
      make install
      make rpm
  – Install the RPMs

Create your first cluster II.
Install the RPMs
  – Controller
      set nodeinit.conf (ethX)
      set slot_id
      set rde.conf
  – Payload
      set nodeinit.conf (ethX)
      set slot_id
To be safe the first few times
      cd /etc/opt/opensaf/
      mv reboot reboot.old
      touch reboot

Create your first cluster III.
Configure/Start controller
  – Change persistent store load settings
      with the CLI
          en
          conf t
          pssv
          set playback-option-from-xml-config AVD
          set playback-option-from-xml-config AVM
      the practical way
          – edit the /var/opt/opensaf/pssv_spcn_list file
          – change PSS to XML
  – Set up AppConfig.xml

First start…
Controller
    ~]# /etc/init.d/nis_scxb start
    Starting Node Initialization Daemon: /opt/opensaf/controller/bin/ncs_nid
    Moving /var/opt/opensaf/nidlog to /var/opt/opensaf/old_nidlog...Done.
    Moving /var/opt/opensaf/stdouts to /var/opt/opensaf/old_stdouts...Done.
    Wed Aug 13 02:02:22 CEST 2008
    Starting OpenSAF Services...
    Starting TIPC service... Done.
    Starting RDF service... Done.
    RDF-ROLE for this System Controller is: 0, ACTIVE
    Starting DTSV service... Done.
    Starting MASV service... Done.
    Starting PSSV service... Done.
    Starting EDSV service... Done.
    Starting SUBAGT service... Done.
    Starting IFSVDD service... Done.
    Starting SCAP service... Done.
    Node Initialization Successful.
    SUCCESSFULLY SPAWNED ALL SERVICES!!!
    Status: SUCCESS
    Wed Aug 13 02:02:56 CEST 2008
    SERVICE Initialization Success.
Slide callouts: Day 4, Week 2, Week 3

First start…
Payload
    ~]# /etc/init.d/nis_pld start
    Starting Node Initialization Daemon: /opt/opensaf/payload/bin/ncs_nid
    Moving /var/opt/opensaf/nidlog to /var/opt/opensaf/old_nidlog...Done.
    Moving /var/opt/opensaf/stdouts to /var/opt/opensaf/old_stdouts...Done.
    Tue Oct 14 06:15:23 PDT 2008
    Starting OpenSAF Services...
    Starting TIPC service... Done.
    Starting PCAP service... Done.
    Node Initialization Successful.
    SUCCESSFULLY SPAWNED ALL SERVICES!!!
    Status: SUCCESS
    Tue Oct 14 06:15:25 PDT 2008
    SERVICE Initialization Success.
    ~]# /etc/init.d/nis_pld stop
    Stopping OpenSAF Services...
    Status: Hand Shake DONE
    OpenSAF Services Termination Success.

Sample applications
Message Queue Service - MSG
Checkpointing Service - CKPT
User Mode Linux cluster simulation environment
Availability Service - AMF

Troubleshooting
    ~]# /etc/init.d/nis_scxb start
    Starting Node Initialization Daemon: /opt/opensaf/controller/bin/ncs_nid
    Moving /var/opt/opensaf/nidlog to /var/opt/opensaf/old_nidlog...Done.
    Moving /var/opt/opensaf/stdouts to /var/opt/opensaf/old_stdouts...Done.
    Wed Aug 13 02:02:22 CEST 2008
    Starting OpenSAF Services...
    Starting TIPC service... Done.
    Starting RDF service... Done.
    RDF-ROLE for this System Controller is: 0, ACTIVE
    Starting DTSV service... Done.
    Starting MASV service... Done.
    Starting PSSV service... Done.
    Starting EDSV service... Done.
    Starting SUBAGT service... Done.
    Starting IFSVDD service... Done.
    Starting SCAP service... Done.
    Node Initialization Successful.
    SUCCESSFULLY SPAWNED ALL SERVICES!!!
    Status: SUCCESS
    Wed Aug 13 02:02:56 CEST 2008
    SERVICE Initialization Success.
Slide callouts (typical faults):
  – /opt/TIPC directory
  – slot_id clash
  – ethX in nodeinit.conf
  – rde.conf error
  – TCP/IP connectivity between controllers
  – CONTROLLER2’s IP is not in the same subnet
  – net-snmp libs are inappropriate
  – --enable-shared option on RHEL4
  – slot_id incorrect
  – pssv_spcn_list file

Troubleshooting AMF applications
Init/terminate scripts
  – not executable
  – don’t use relative paths
  – sudo if they need to run as a different user
  – check scap/pcap stdouts (/var/opt/opensaf/stdouts/…)
printf based logging
  – always flush
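A minimal C sketch of the “always flush” advice above. The helper name log_line is hypothetical and not part of OpenSAF. It assumes the component's stdout is redirected to a file, where stdio normally buffers output in blocks, so unflushed messages can be lost if the process is killed.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper (not an OpenSAF API): printf-style logging that
 * appends a newline and flushes after every message, so the text reaches
 * the redirected output file immediately.
 * Returns the number of characters written (including the newline),
 * or a negative value on error. */
static int log_line(FILE *out, const char *fmt, ...)
{
    assert(out != NULL && fmt != NULL);

    va_list args;
    va_start(args, fmt);
    int written = vfprintf(out, fmt, args);
    va_end(args);

    if (written < 0)
        return written;
    if (fputc('\n', out) == EOF || fflush(out) == EOF)
        return -1;
    return written + 1;
}
```

It would be called as log_line(stdout, "CSI %s assigned", name). An alternative with a similar effect is calling setvbuf(stdout, NULL, _IOLBF, 0) once at start-up, which switches stdout to line buffering.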

Troubleshooting AMF applications
Timeouts
  – CSI assignment
  – Synchronous API calls
Configuration errors
  – analyze with SNMP
  – check BAM log
Virtualization
  – clock drift, inaccuracy
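One possible way to spot handlers that run close to a timeout is to time them. The sketch below is an assumption-laden illustration, not an OpenSAF facility: ran_within_budget, work_fn, demo_sleep and budget_ms are hypothetical names, and budget_ms stands for whatever timeout is configured for the component. It uses CLOCK_MONOTONIC, which is not affected by wall-clock jumps; it does not remove drift inside a virtual machine.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Milliseconds between two CLOCK_MONOTONIC readings (rounded down). */
static long elapsed_ms(const struct timespec *start, const struct timespec *end)
{
    long long ns = (long long)(end->tv_sec - start->tv_sec) * 1000000000LL
                 + (end->tv_nsec - start->tv_nsec);
    return (long)(ns / 1000000LL);
}

/* Hypothetical callback type: the work to be timed, e.g. the body of a
 * CSI assignment handler. */
typedef void (*work_fn)(void *ctx);

/* Hypothetical helper: run work(ctx), store its duration in *took_ms and
 * return 1 if it finished within budget_ms, 0 otherwise. */
static int ran_within_budget(work_fn work, void *ctx, long budget_ms, long *took_ms)
{
    assert(work != NULL && took_ms != NULL);

    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);
    work(ctx);
    clock_gettime(CLOCK_MONOTONIC, &end);

    *took_ms = elapsed_ms(&start, &end);
    return *took_ms <= budget_ms;
}

/* Demo stub: sleeps for the number of milliseconds stored in ctx. */
static void demo_sleep(void *ctx)
{
    long ms = *(long *)ctx;
    struct timespec pause = { .tv_sec = ms / 1000,
                              .tv_nsec = (ms % 1000) * 1000000L };
    nanosleep(&pause, NULL);
}
```

Logging took_ms next to the configured timeout makes it easier to tell a handler that is genuinely too slow from a timeout caused by the environment.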

Programmers’ references
Documents per service
  – Overview, API, running the sample application
  – Important info on capabilities
Wiki
  – Development guidelines
  – Design docs
  – White papers
SA Forum
  – Specifications
  – Education material

Programming tips
Read the specs
The example apps give a good starting point
Avoid platform-specific code for portability
SAF APIs can return TRY_AGAIN
    do {
        result = saAmfComponentRegister(*amfHandle, compName, NULL);
        /* SLEEP and REPEAT are placeholders: wait between attempts
           and limit the number of retries */
    } while (result == SA_AIS_ERR_TRY_AGAIN && SLEEP && REPEAT);
    if (SA_AIS_OK != result) {...}
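The TRY_AGAIN loop on the slide can be sketched as a reusable helper. This is one possible shape, not the deck's or OpenSAF's implementation: retry_try_again, saf_call_fn and flaky_call are hypothetical names, and the small SaAisErrorT enum is a stand-in so the sketch compiles without the SA Forum headers (a real application would include saAis.h instead).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Stand-in for the definitions in saAis.h, only so that this sketch
 * compiles on its own. */
typedef enum {
    SA_AIS_OK = 1,
    SA_AIS_ERR_TRY_AGAIN = 6
} SaAisErrorT;

/* Hypothetical callback type: wraps one SAF API call. */
typedef SaAisErrorT (*saf_call_fn)(void *ctx);

/* Hypothetical helper: repeat the call while it returns TRY_AGAIN,
 * sleeping delay_ms between attempts, for at most max_attempts calls.
 * Returns the result of the last attempt. */
static SaAisErrorT retry_try_again(saf_call_fn call, void *ctx,
                                   int max_attempts, long delay_ms)
{
    assert(call != NULL && max_attempts > 0 && delay_ms >= 0);

    SaAisErrorT result = SA_AIS_ERR_TRY_AGAIN;
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        result = call(ctx);
        if (result != SA_AIS_ERR_TRY_AGAIN)
            break;
        if (attempt + 1 < max_attempts) {
            struct timespec pause = { .tv_sec = delay_ms / 1000,
                                      .tv_nsec = (delay_ms % 1000) * 1000000L };
            nanosleep(&pause, NULL);
        }
    }
    return result;
}

/* Demo stub standing in for a call such as saAmfComponentRegister():
 * returns TRY_AGAIN until the counter in ctx reaches zero. */
static SaAisErrorT flaky_call(void *ctx)
{
    int *remaining = ctx;
    if (*remaining > 0) {
        --*remaining;
        return SA_AIS_ERR_TRY_AGAIN;
    }
    return SA_AIS_OK;
}
```

Bounding the number of attempts keeps a component from spinning forever when the middleware stays busy; the caller still has to handle a final TRY_AGAIN like any other error.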

Turning to the community
Checklist
  – look through all the items from the previous slides
  – search through the mail archive
      newcomers are not familiar with terminology, expressions
      lots of threads, hard to find a topic
  – problem not identified
      collect logs & configuration
          – tools/utilities/collect_logs_.sh
      try to formalize the problem
          – say “SCAP is not starting” → check out INSTALL
          – describe what you did
      be short but descriptive
          – include versions, actions
      avoid ambiguity
  – send mail to devel/user list

Turning to the community
You have the solution, now please
  – write a summary mail about the problem and the solution
  – contribute to the wiki
  – …

If you get into trouble
In case you are scuba diving…
Stop!
Focus on your breathing!
Think!
Take action!

If you get into trouble
In case your system misbehaves…
INSTALL is the #1 holy grail
Look through the logs
Check configuration
Read PR docs
Feel free to ask

Ideas for contribution
Management
  – Improve CLI
      an initiative was developed in the summer
  – Web interface
Development
  – Configuration editor, validator
  – Code templates
  – Best practice descriptions
  – …
Eclipse-based whatever

My questions to experts
What editors and IDEs do you use for app development?
How do you debug AMF applications?
  – Core dumps, gdb?
How do you update the installed MW?
What is a good way to package my application?
  – scripts, sources/binaries, configuration

Thank You!
András Kövi, OptXware LLC