OpenSAF from a user’s perspective. András Kövi, OptXware / BUTE, October 15, 2008.


1 OpenSAF from a user’s perspective
András Kövi
OptXware / BUTE
kovi@{optxware.com | mit.bme.hu}
October 15, 2008

2 Outline
How did the story begin? (10)
Getting your first cluster to work (15)
  – Typical faults
Trying the sample applications (20)
Troubleshooting guide (15)
  – Debugging
How to ask when problems arise? (10)
Programming tips (10)

3 Outline
How did the story begin?
Getting your first cluster to work
Trying the sample applications
Troubleshooting guide
How to ask when problems arise?
Programming tips

4 How did the story begin?
First working system: 4 weeks
Reasons
  – Not reading the INSTALL file carefully enough
  – Ambiguity in the guide docs: too many assumptions about the user’s knowledge
  – Problems with the OS (RHEL 4)
  – No experience

5 The community listens… OpenSAF improves
Most significant updates
  – Faster start process (1.5 min → 20-30 sec)
  – More specific instructions in INSTALL
  – Cleanup of init scripts
      “hand shake” at shutdown
      … in progress
  – Reorganization of directories
  – Simplification of rde.conf
  – Management
      Stable SNMP interface
      A CLI improvement initiative has been developed
  – 620,000 LOC today

6 Lessons learned
Never ignore the INSTALL file
  – This is not just configure, make, make install…
If something is ambiguous, feel free to ask
Read the details
  – Saves you a lot of hassle
One Controller is enough for development

7 Live demo
Creating your first cluster
Running the example applications

8 Create your first cluster I.
  – Acquire OpenSAF
  – Read INSTALL
  – Acquire and install prerequisite packages
  – Compile OpenSAF
      configure build_type=controller/payload
      make
      make install
      make rpm
  – Install the RPMs

9 Create your first cluster II.
Install the RPMs
  – Controller
      set nodeinit.conf (ethX)
      set slot_id
      set rde.conf
  – Payload
      set nodeinit.conf (ethX)
      set slot_id
To be safe at first
      cd /etc/opt/opensaf/
      mv reboot reboot.old
      touch reboot

10 Create your first cluster III.
Configure/Start controller
  – Change persistent store load settings with the CLI
      en
      conf t
      pssv
      set playback-option-from-xml-config AVD
      set playback-option-from-xml-config AVM
    the practical way
      – edit the /var/opt/opensaf/pssv_spcn_list file
      – change PSS to XML
  – Set up AppConfig.xml

11 First start…
Controller
    [root@localhost ~]# /etc/init.d/nis_scxb start
    Starting Node Initialization Daemon: /opt/opensaf/controller/bin/ncs_nid
    Moving /var/opt/opensaf/nidlog to /var/opt/opensaf/old_nidlog...Done.
    Moving /var/opt/opensaf/stdouts to /var/opt/opensaf/old_stdouts...Done.
    Wed Aug 13 02:02:22 CEST 2008 Starting OpenSAF Services...
    Starting TIPC service... Done.
    Starting RDF service... Done.
    RDF-ROLE for this System Controller is: 0, ACTIVE
    Starting DTSV service... Done.
    Starting MASV service... Done.
    Starting PSSV service... Done.
    Starting EDSV service... Done.
    Starting SUBAGT service... Done.
    Starting IFSVDD service... Done.
    Starting SCAP service... Done.
    Node Initialization Successful.
    SUCCESSFULLY SPAWNED ALL SERVICES!!!
    Status: SUCCESS
    Wed Aug 13 02:02:56 CEST 2008 SERVICE Initialization Success.
Markers on the slide: Day 4, Week 2, Week 3

12 First start…
Payload
    [root@localhost ~]# /etc/init.d/nis_pld start
    Starting Node Initialization Daemon: /opt/opensaf/payload/bin/ncs_nid
    Moving /var/opt/opensaf/nidlog to /var/opt/opensaf/old_nidlog...Done.
    Moving /var/opt/opensaf/stdouts to /var/opt/opensaf/old_stdouts...Done.
    Tue Oct 14 06:15:23 PDT 2008 Starting OpenSAF Services...
    Starting TIPC service... Done.
    Starting PCAP service... Done.
    Node Initialization Successful.
    SUCCESSFULLY SPAWNED ALL SERVICES!!!
    Status: SUCCESS
    Tue Oct 14 06:15:25 PDT 2008 SERVICE Initialization Success.
    [root@localhost ~]# /etc/init.d/nis_pld stop
    Stopping OpenSAF Services...
    Status: Hand Shake DONE
    OpenSAF Services Termination Success.

13 Sample applications
Message Queue Service - MSG
Checkpointing Service - CKPT
User Mode Linux cluster simulation environment
Availability Service - AMF

14 Troubleshooting
    [root@localhost ~]# /etc/init.d/nis_scxb start
    Starting Node Initialization Daemon: /opt/opensaf/controller/bin/ncs_nid
    Moving /var/opt/opensaf/nidlog to /var/opt/opensaf/old_nidlog...Done.
    Moving /var/opt/opensaf/stdouts to /var/opt/opensaf/old_stdouts...Done.
    Wed Aug 13 02:02:22 CEST 2008 Starting OpenSAF Services...
    Starting TIPC service... Done.
    Starting RDF service... Done.
    RDF-ROLE for this System Controller is: 0, ACTIVE
    Starting DTSV service... Done.
    Starting MASV service... Done.
    Starting PSSV service... Done.
    Starting EDSV service... Done.
    Starting SUBAGT service... Done.
    Starting IFSVDD service... Done.
    Starting SCAP service... Done.
    Node Initialization Successful.
    SUCCESSFULLY SPAWNED ALL SERVICES!!!
    Status: SUCCESS
    Wed Aug 13 02:02:56 CEST 2008 SERVICE Initialization Success.
Typical faults (callouts on the slide)
  – /opt/TIPC directory
  – slot_id clash
  – ethX in nodeinit.conf
  – rde.conf error
  – TCP/IP connectivity between controllers
  – CONTROLLER2’s IP is not in the same subnet
  – net-snmp libs are inappropriate
  – --enable-shared option on RHEL4
  – slot_id incorrect
  – pssv_spcn_list file
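One fault named on this slide, a second controller whose IP is not in the same subnet as the first, can be checked mechanically before starting the services. The following is a minimal sketch, not part of OpenSAF: `same_subnet` is a hypothetical helper, and the addresses and netmask in the usage note are made-up examples. It relies only on the standard `inet_pton` call.

```c
#include <sys/socket.h>
#include <arpa/inet.h>

/* Hypothetical helper (not an OpenSAF function).
 * Returns 1 if both IPv4 addresses fall in the same subnet under the given
 * netmask, 0 if they do not, and -1 if any argument fails to parse. */
int same_subnet(const char *ip_a, const char *ip_b, const char *netmask)
{
    struct in_addr a, b, m;

    if (inet_pton(AF_INET, ip_a, &a) != 1 ||
        inet_pton(AF_INET, ip_b, &b) != 1 ||
        inet_pton(AF_INET, netmask, &m) != 1)
        return -1;

    /* All three values are in network byte order, so they can be
     * masked and compared without any byte swapping. */
    return (a.s_addr & m.s_addr) == (b.s_addr & m.s_addr);
}
```

For example, `same_subnet("192.168.1.10", "192.168.2.20", "255.255.255.0")` reports a mismatch, which would point at the controller addressing rather than at OpenSAF itself.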

15 Troubleshooting AMF applications
Init/terminate scripts
  – not executable
  – don’t use relative paths
  – use sudo if the script must run as a different user
  – check the scap/pcap stdouts (/var/opt/opensaf/stdouts/…)
printf-based logging
  – always flush
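The "always flush" advice matters because the C library fully buffers `stdout` when it is not attached to a terminal, so output redirected to a file can be lost if the component is killed before the buffer is written. A minimal sketch of a flushing log helper follows; `log_line` is a hypothetical name, not an OpenSAF API.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: writes one formatted line to the given stream and
 * flushes it immediately, so the text reaches the file even if the
 * process is terminated right afterwards.
 * Returns the number of characters written (including the newline),
 * or a negative value on error. */
int log_line(FILE *out, const char *fmt, ...)
{
    va_list args;
    int written;

    va_start(args, fmt);
    written = vfprintf(out, fmt, args);
    va_end(args);

    if (written < 0)
        return written;
    if (fputc('\n', out) == EOF || fflush(out) == EOF)
        return -1;
    return written + 1;
}
```

A call such as `log_line(stdout, "CSI set callback, state=%d", state)` then behaves like `printf` followed by `fflush(stdout)`. An alternative with the same effect for line-oriented logs is a single `setvbuf(stdout, NULL, _IOLBF, 0)` at start-up.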

16 Troubleshooting AMF applications
Timeouts
  – CSI assignment
  – Synchronous API calls
Configuration errors
  – analyze with SNMP
  – check BAM log
Virtualization
  – clock drift, inaccuracy

17 Programmers’ references
Documents per service
  – Overview, API, running the sample application
  – Important info on capabilities
Wiki
  – Development guidelines
  – Design docs
  – White papers
SA Forum
  – Specifications
  – Education material

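18 Programming tips
Read the specs
The example apps give a good starting point
Avoid platform-specific code for portability
SAF APIs can return TRY_AGAIN

    /* The retry count and the 1 s delay are placeholder values standing in
       for the slide's SLEEP && REPEAT; sleep() needs <unistd.h>. */
    int retries = 10;
    do {
        result = saAmfComponentRegister(*amfHandle, compName, NULL);
        if (result != SA_AIS_ERR_TRY_AGAIN)
            break;
        sleep(1);
    } while (--retries > 0);
    if (SA_AIS_OK != result) {...}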
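Since the slide notes that SAF APIs in general can return TRY_AGAIN, the retry pattern can be factored into one helper instead of being repeated around every call. The sketch below is an illustration under stated assumptions: `demo_err_t`, `call_with_retry` and `flaky_register` are hypothetical names, and the enum is a local stand-in so the block compiles without the OpenSAF headers. Real code would use `SaAisErrorT`, `SA_AIS_OK` and `SA_AIS_ERR_TRY_AGAIN` from the SA Forum headers.

```c
#include <unistd.h>

/* Local stand-ins for the SA Forum result codes (assumption: real code
 * would use SaAisErrorT and the SA_AIS_* constants instead). */
typedef enum { DEMO_OK, DEMO_ERR_TRY_AGAIN, DEMO_ERR_OTHER } demo_err_t;
typedef demo_err_t (*demo_call_t)(void *ctx);

/* Calls fn(ctx) until it stops returning TRY_AGAIN or max_attempts calls
 * have been made, sleeping delay_sec seconds between attempts.
 * Returns the result of the last call. */
demo_err_t call_with_retry(demo_call_t fn, void *ctx,
                           unsigned max_attempts, unsigned delay_sec)
{
    demo_err_t rc = DEMO_ERR_TRY_AGAIN;
    unsigned attempt;

    for (attempt = 0; attempt < max_attempts; attempt++) {
        rc = fn(ctx);
        if (rc != DEMO_ERR_TRY_AGAIN)
            break;
        if (attempt + 1 < max_attempts)
            sleep(delay_sec);   /* back off before the next attempt */
    }
    return rc;
}

/* Hypothetical callee for illustration only: reports TRY_AGAIN for the
 * first *ctx calls, then succeeds. */
demo_err_t flaky_register(void *ctx)
{
    int *remaining_busy = ctx;

    if (*remaining_busy > 0) {
        (*remaining_busy)--;
        return DEMO_ERR_TRY_AGAIN;
    }
    return DEMO_OK;
}
```

Bounding the number of attempts keeps a component from spinning forever when the middleware stays busy; the caller still has to handle a final TRY_AGAIN like any other error.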

19 Turning to the community
Check list
  – look through all the items from the previous slides
  – search through the mail archive
      newcomers are not familiar with the terminology and expressions
      lots of threads
      hard to find a topic
  – problem not identified
      collect logs & configuration
        – tools/utilities/collect_logs_.sh
      try to formalize the problem
        – say “SCAP is not starting” → check out INSTALL
      describe what you did
      be short but descriptive
        – include versions, actions
      avoid ambiguity
  – send mail to the devel/user list

20 Turning to the community
You have the solution, now please
  – write a summary mail about the problem and the solution
  – contribute to the wiki
  – …

21 If you get into trouble
In case you are scuba diving…
Stop!
Focus on your breath!
Think!
Take actions!

22 If you get into trouble
In case your system misbehaves…
INSTALL is the #1 holy grail
Look through the logs
Check configuration
Read PR docs
Feel free to ask

23 Ideas for contribution
Management
  – Improve CLI
      an initiative was developed in the summer
  – Web interface
Development
  – Configuration editor, validator
  – Code templates
  – Best practice descriptions
  – …
Eclipse-based whatever

24 My questions to experts
What editors and IDEs do you use for app development?
How do you debug AMF applications?
  – Core dumps, gdb?
How do you update the installed MW?
What is a good way to package my application?
  – scripts, sources/binaries, configuration

25 Thank You! András Kövi OptXware LLC kovi@optxware.com www.optxware.com

