www.quick-software-line.com High Availability for IBM Power i

1 www.quick-software-line.com High Availability for IBM Power i

2 Troubleshooting
- Communications
- Replication jobs
- Audit & journaling
- Database
- IFS
- Other objects
- Performance
- How to contact support

3 Communications: How does the communication daemon work?

Diagram labels: EDH_xx_SND (sender), TCP/IP, PMSYSDEM (daemon), EDH_xx_RCV (receiver)

- The TCP/IP daemon «PMSYSDEM» is active on the target system.
- The source job EDH_xx_SND is started on the source system; xx is the name of the environment.
- It sends a connection request to the target system.
- The daemon receives the request and submits the target job EDH_xx_RCV.
- The daemon then does a «GiveDescriptor» to transfer control of the communication to the «RCV» job.
- The job EDH_xx_RCV starts. It does a «TakeDescriptor» to take control of the communication with the sender job.

Note: once communication is established between SND and RCV, the daemon no longer takes part in the communication process.
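The hand-off described above can be imitated on any machine with ordinary sockets. The following is only a sketch under assumptions: Python threads stand in for the IBM i jobs, passing the accepted connection object to another thread stands in for «GiveDescriptor» / «TakeDescriptor», and the function names are hypothetical. It is not how Quick-EDD/HA is implemented.

```python
import socket
import threading


def receiver(conn, received):
    # Stands in for EDH_xx_RCV: it now owns the connection handed over by the daemon.
    with conn:
        received.append(conn.recv(1024))
        conn.sendall(b"ACK")


def daemon(listener, received, workers):
    # Stands in for PMSYSDEM: accept one request, hand the connection over, step aside.
    conn, _addr = listener.accept()
    worker = threading.Thread(target=receiver, args=(conn, received))
    worker.start()
    workers.append(worker)
    # The daemon returns here and takes no further part in the exchange.


def run_demo():
    received, workers = [], []
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    daemon_thread = threading.Thread(target=daemon, args=(listener, received, workers))
    daemon_thread.start()

    # Stands in for EDH_xx_SND: it sends the connection request to the target.
    with socket.create_connection(("127.0.0.1", port), timeout=5) as sender:
        daemon_thread.join()   # the daemon has finished its part
        listener.close()       # stopping the daemon does not hurt the running pair
        sender.sendall(b"journal entry")
        reply = sender.recv(1024)

    for worker in workers:
        worker.join()
    return received[0], reply
```

In the sketch, the listener is closed before any data is exchanged, which mirrors the note that the daemon is needed only to establish the connection.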

4 Communications: Analysis of communication errors

Step | Function | In case of issue
1 | The sender job is submitted, through the Quick-EDD/HA menu (S=Start), by the command PMEDHCTL, or by the command PMEDHSTR | Check that the job is submitted correctly and active: WRKACTJOB SBS(PMEDH)
2 | Communication with the target system: the call is received by the TCP/IP daemon | Do your systems communicate? Perform a PING to check. Is the daemon active? Use the command PMSYSDEM to check. Also check with NETSTAT that the TCP/IP port is in «Listen» status
3 | The daemon submits the receiver job | Check the PMSYSDEMON joblog (the job is active in subsystem QSYSWRK) to make sure that the request has been received and that a job has been submitted
4 | The «communication descriptor» is transmitted | Check the PMSYSDEMON joblog for the «Give descriptor» operation
5 | The «communication descriptor» is received | Check the receiver joblog for the «Take descriptor» operation at the beginning of its execution
6 | Communications are active between the source and the target | The communication path is functional, so the problem lies within Quick-EDD/HA or in job execution: check the joblogs of the sender and receiver jobs
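The table is meant to be read top to bottom: the first step that fails tells you where to look. A minimal sketch of that reading order follows, assuming the result of each manual check is recorded by hand as True or False. The step keys and the function name are hypothetical; the advice strings are taken from the table.

```python
# Ordered checklist: (hypothetical key, what to check if this step fails).
STEPS = [
    ("sender_submitted", "Check that the sender job is active: WRKACTJOB SBS(PMEDH)"),
    ("target_reachable", "PING the target system"),
    ("daemon_listening", "Check the daemon with PMSYSDEM and the port status with NETSTAT"),
    ("receiver_submitted", "Check the PMSYSDEMON joblog in subsystem QSYSWRK"),
    ("descriptor_given", "Check the PMSYSDEMON joblog for the Give descriptor operation"),
    ("descriptor_taken", "Check the receiver joblog for the Take descriptor operation"),
]


def first_failure(results):
    """Return (step key, advice) for the first check that failed or was not done."""
    for key, advice in STEPS:
        if not results.get(key, False):
            return key, advice
    # Every communication step passed: the problem is elsewhere.
    return None, "Communications are functional: check the sender and receiver joblogs"
```

For example, if the first three checks pass and nothing else has been verified yet, `first_failure` points at the daemon joblog as the next place to look.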

5 Communications: Communication daemon PMSYSDEM, additional elements

- As long as communications are not properly established, the «sender» and «receiver» jobs of Quick-EDD/HA can be ended in *IMMED mode without any specific precaution.
- The TCP/IP daemon is needed only to establish communications. Afterwards it can be stopped and restarted at any time, without any risk for the running jobs.

Problem?
The PMSYSDEMON job uses two objects named PMSYSDEM, one *DTAQ and one *USRSPC. These objects can become damaged. The symptoms are a daemon job that uses a lot of CPU and new communications that cannot be started.

Procedure to follow:
- End the PMSYSDEMON job manually (ENDJOB)
- Delete the two objects: WRKOBJ QUSRSYS/PMSYSDEM, then option 4=Delete
- Restart the daemon: PMSYSDEM OPTION(*STR)

6 Replication jobs: Diagram of replication jobs

Jobs shown in the diagram:
- EDH_xx_Jnn: journal reading
- EDH_xx_SND: sender
- EDH_xx_RCV: receiver
- EDH_xx_Xnn: I/O processing
- EDH_xx_S01 and EDH_xx_R01: synchro 1

Flow:
- The journal server(s) feed the job EDH_xx_SND.
- The «SND» job transmits the events to RCV.
- The RCV job transmits the events to the data servers. Once processed, the events are acknowledged.
- In case of negative acknowledgement, or for new objects, a synchronization is performed.
- The synchro is acknowledged by means of a journal entry.

7 Jobs Analysis

If the replication has suddenly stopped:
- When the replication stops abnormally, the jobs concerned are EDH_xx_SND on the source system and EDH_xx_RCV on the target system.
- Check the JOBLOG of each of those jobs.
- You can also check the JOBLOG of the data servers EDH_xx_Xnn on the target system.

If a synchronization job stops suddenly:
- A severe error on an object can make the synchronization job end abnormally. REPLICATION KEEPS RUNNING. A new server will be launched.
- Depending on the kind of error, the object concerned will be synchronized again, or a manual action will be needed if the error is «fatal».
- The error message shows the number of the server. Check the JOBLOG of the job EDH_xx_Snn on the source system and of the corresponding job EDH_xx_Rnn on the target system, «nn» being the number of the server.
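Because every job follows the EDH_xx_ naming pattern, the names to look up can be derived from the environment name and the server number. The helper below is a hypothetical convenience, not part of Quick-EDD/HA. It assumes that «nn» is always a two-digit, zero-padded number, as in the examples EDH_xx_S01 and EDH_xx_J01, and the environment name "PR" in the usage note is made up.

```python
# Which system each kind of job runs on, as described in the slides.
SIDE = {
    "SND": "source",  # sender
    "J": "source",    # journal reading
    "S": "source",    # synchronization, source side
    "RCV": "target",  # receiver
    "X": "target",    # data servers (I/O processing)
    "R": "target",    # synchronization, target side
}


def job_name(env, role, server=None):
    """Build a job name such as EDH_xx_SND or EDH_xx_S01."""
    if role not in SIDE:
        raise ValueError(f"unknown role: {role}")
    if role in ("SND", "RCV"):
        return f"EDH_{env}_{role}"
    if server is None:
        raise ValueError(f"role {role} needs a server number")
    # Assumption: server numbers are written with two digits.
    return f"EDH_{env}_{role}{server:02d}"


def sync_pair(env, server):
    """Return the (source job, target job) to inspect for synchronization server nn."""
    return job_name(env, "S", server), job_name(env, "R", server)
```

With an environment named "PR" and an error message that mentions server 1, `sync_pair("PR", 1)` gives the two joblogs to read: EDH_PR_S01 on the source and EDH_PR_R01 on the target.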

8 Jobs Analysis: Replication

- On the source machine: WRKJOB EDH_xx_SND, then enter 4 (spooled files management). The QPJOBLOG file contains the joblog of EDH_xx_SND.
- On the target system: WRKJOB EDH_xx_RCV, then enter 4 (spooled files management). The QPJOBLOG file contains the joblog of EDH_xx_RCV.

NB: WRKJOB JOB(NUMBER/USER/JOB) OUTPUT(*PRINT) OPTION(*JOBLOG) prints the log of an active job.

9 Jobs Analysis: Synchronization, abnormal stop of a job

On the target system: WRKJOB EDH_xx_Rnn, then enter 4 (spooled files management). The QPJOBLOG file contains the joblog of EDH_xx_Rnn.

10 Jobs Analysis: Synchronization, error without abnormal end of a job (1)

On the source system, enter + in front of the line that shows a «nok» object. Press F8, then put 1 in the «Synchro» part. The object in error appears in blue reverse video.

11 Jobs Analysis: Synchronization, error without abnormal end of a job (2)

Enter M in front of the object in error. The error message shows the source synchronization job, here EDH_xx_S01. Enter W to access this job, then its JOBLOG. If the joblog of EDH_xx_Snn is not explicit enough, check the joblog of EDH_xx_Rnn on the target system.

12 Audit

During the installation of Quick-EDD/HA, system auditing is activated automatically:
- Creation of the audit journal and its associated receiver
- Activation of the system values QAUDCTL and QAUDLVL
- All the objects included in the perimeter of Quick-EDD/HA are automatically audited at the *CHANGE level (during a start in 9 or 0, and when a new object appears)

Problem?
- General trouble: check that auditing is still active on the system and that the audit journal is present. Quick-EDD/HA cannot start if the audit journal is missing. The system value QAUDCTL must not be equal to *NONE.
- Issues on some objects: check that the object is audited (command DSPOBJD; the audit level must be *CHANGE).
- The audit journal grows by large volumes every day: check the contents of the journal with the command DSPJRN to determine which kinds of journal entries it holds. The most probable cause is an audit level that is too high (*ALL) on some objects.
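The three audit checks above can be written down as a small decision routine. This is a sketch only: it assumes the values have already been read by hand (for example with DSPSYSVAL, DSPOBJD and a look at the audit journal) and typed in, and the function and parameter names are hypothetical.

```python
def audit_findings(qaudctl, journal_present, object_audit):
    """List audit problems from values collected manually.

    qaudctl         -- list of values of the system value QAUDCTL
    journal_present -- True if the audit journal exists
    object_audit    -- dict mapping object name to its audit level
    """
    findings = []
    if not journal_present:
        findings.append("audit journal missing: Quick-EDD/HA cannot start")
    if "*NONE" in qaudctl:
        findings.append("QAUDCTL is *NONE: auditing is not active")
    for name, level in sorted(object_audit.items()):
        if level == "*ALL":
            # *ALL is the probable cause of a fast-growing audit journal.
            findings.append(f"{name}: audit level *ALL is too high, expected *CHANGE")
        elif level != "*CHANGE":
            findings.append(f"{name}: audit level {level}, expected *CHANGE")
    return findings
```

An empty list means none of the three problems from the slide was detected; it does not prove that replication is healthy.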

13 Journaling

Journaling is mandatory to process the database properly:
- It is required to be able to replicate in real time
- Quick-EDD/HA supports all options (*AFTER or *BOTH, with or without Open/Close, MINENTDTA, journal cache, ...)

Journaling is optional for the IFS:
- Only the applications that UPDATE files inside the IFS need journaling: txt files, Java files, SAP, Movex, JDEdwards, Adobe
- Same rules as for the database

Problem?
A file is not replicated:
- Check that it is journaled (command DSPFD)
- Check that the journal is taken into account by Quick-EDD/HA and read in sequence
- Check that the replication has no delay, both in general and for that journal

No object is replicated:
- Check the list of journals processed by Quick-EDD/HA
- Check the journal server jobs EDH_xx_J01, 02, ...
- Check the target jobs that apply the entries, EDH_xx_X01, 02, ...
- Check the communication jobs EDH_xx_SND and EDH_xx_RCV

14 Journal receivers management

Quick-EDD/HA manages the journal receivers automatically:
- Entirely: management rules for the detachment and deletion of the receivers
- Partially: for each journal, you can choose
  - to use the standard management rules
  - to keep the receivers (detachment is managed, but there is no deletion)
  - no action: the journal is entirely managed externally

Problem?
The journal receivers are not deleted on the source system:
- Check the standard management rules
- Do you have the issue for only one journal? Check whether that journal has specific options
- This operation is managed by the «SND» job: check its JOBLOG
- Check that there is no receiver in partial status for that journal

The journal receivers are not deleted on the target system:
- Check the setting «Receivers management» in the target system description to verify that option 1 or 2 is activated
- Check that the receivers library is replicated
- This operation is managed by the «RCV» job: check its JOBLOG

15 Database

Database replication represents the biggest part of the replication, often more than 90% of the activity. Several issues can appear on those objects:

Management of the database object:
- Complex object structure, e.g. BLOB or CLOB fields
- Object dependencies (LF, joined file, referential constraints, ...)
- Trigger management
- Number of access paths, which has an impact on performance

Data management:
- Replication in real time: all the journal entries are taken into account
- SQL management has rules that differ from classical DB2

Use of the target data:
- To provide ROI, Quick-EDD/HA allows you to access the data on the target system in read mode
- The different needs on the target system can create constraints for the real-time replication

16 IFS

IFS replication is often very simple because, most of the time, it deals with files that are created and then stored (EDI, archiving, ...).

The main difficulty with the IFS is content control. The IFS files themselves are often simple and small; however, consistency checks become tough because of the tree structure and the number of objects (you can have millions of objects on hundreds of levels).

Audit and journaling:
- Like any other object, the IFS files are audited
- Journaling is rarely mandatory (txt files, Java files, SAP, Movex, JDEdwards, Adobe)

Problem?
The replication is not done:
- As for any other object, check the audit level
- The replication is managed by the synchronization jobs. Display the messages of the objects to find the synchronization job concerned, then check the JOBLOGs of the source and target jobs

Journaled IFS?
- For a journaled IFS, as for the database, check that the object is journaled properly, then check that the journal is included in the Quick-EDD/HA list and that it is properly processed

17 IFS: QDLS

QDLS comes from older releases of the OS and corresponds to a «DOS» structure. It is integrated into the IFS, with some specific considerations.

Audit and journaling:
- The files of QDLS are audited like any other object
- Journaling of QDLS is IMPOSSIBLE
- To use QDLS, the user profile must be registered in the system directory (WRKDIRE)

Problem?
Replication is not done:
- As for any other object, check the audit level of the object
- Check that the user profile used for the replication is properly registered in the system directory (WRKDIRE)
- As for the IFS, check the messages at the object level, and display the JOBLOG of the corresponding source and target synchronization jobs

18 Other objects

There are many object types in the system. However, they all work the same way. For Quick-EDD/HA, all the objects are managed with the same rules:
- Definition inside a group
- Real-time replication of the journal events by the data servers
- The synchronization servers process all types of objects: system objects, IFS, system values or spooled files

Problem?
The replication is not done:
- For any type of object, check the audit level
- Check the messages at the object level, and display the JOBLOG of the corresponding source and target jobs
- For spooled files, check that the system value QAUDLVL includes the special value *SPLFDTA

19 Performance

Performance relies on three distinct points:
- The ability to read the journal entries on the SOURCE system
- The communications, with the bandwidth of the line between the source and the target
- The ability of the target system to process the I/Os

Problem?
- Does the SOURCE system have enough resources for the EDH_xx_Jnn servers? Check the memory pool usage. By default, jobs run in the *BASE pool. It can be beneficial to create a dedicated pool.
- Is the bandwidth of the communication line adapted to the replication needs? Check that the line capacity matches the replication volume.
- Is the communication line dedicated to the replication? Check the line usage.
- Does the target system have enough resources to process the I/Os at the same rhythm as the SOURCE system? The disks and the number of disk arms on the target system are very important and must be equivalent to those on the source system. This point is often neglected: either the target system uses old-generation disks, or it has fewer arms because of large-capacity disks.
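The bandwidth question comes down to simple arithmetic: the volume of journal entries produced per hour has to fit through the line. The sketch below assumes that the journal volume is known in megabytes per hour (1 MB = 10^6 bytes) and ignores protocol overhead and compression. The function names and the figures in the usage note are illustrative, not measurements.

```python
def required_mbps(journal_mb_per_hour):
    """Sustained throughput, in Mbit/s, needed to ship the journal volume."""
    bits_per_hour = journal_mb_per_hour * 8  # megabytes -> megabits
    return bits_per_hour / 3600              # per hour -> per second


def line_utilization(journal_mb_per_hour, line_mbps):
    """Fraction of the line consumed by replication (above 1.0 means it cannot keep up)."""
    return required_mbps(journal_mb_per_hour) / line_mbps
```

For instance, a hypothetical 9,000 MB of journal entries per hour needs 9000 × 8 / 3600 = 20 Mbit/s sustained, which is 20% of a 100 Mbit/s line. If the line is shared with other traffic, the peak hour rather than the daily average is the figure to check.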

20 Contact the SUPPORT

Several ways to contact support:
- By phone: +33 153 102 767
- By email: support@traders.fr
- Via Skype: support.traders.fr

21 Contact the SUPPORT

To register your issue properly, the Support will probably ask for the following elements:
- the release of Quick-EDD/HA used on your systems
- the release of OS/400 on the source and target machines

If your issue concerns a product abnormality, you will have to provide:
- the JOBLOG of the source jobs
- the JOBLOG of the target jobs

Note: if you have a specific issue, the Support may need the object concerned, so that the development teams can analyze the issue on our test systems.

