Transparent Development Demo Day March 2017
Disclaimer HCL’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at HCL’s sole discretion. Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion. Do not use any Content provided on developerWorks to develop software or services that provide the same or similar functionality. Please refer to the developerWorks terms of use for more information.
Agenda
Beta Program Introduction
Demo from the development organization
  What-if enhancement: Show the impact on critical jobs excluding non-critical successors from the view
  Improving the monitoring experience with revamped graphical views:
    Endless navigation and plan exploration of multiple job streams
    Live mode for tracking job stream updates without manual refresh
  Enforce workload continuity with new recovery options for failed jobs
    Rerun a job with all its successors
    Rerun a job on the same server it previously ran on
  Job Management Plug-in:
    Run different recovery actions based on output conditions
    Automation of iterative workflows
Agenda Beta Program Introduction
Beta Program overview What collaboration do we offer? The Beta Program for the Workload Scheduler product is launching in March. The program will give you the chance to: access new Workload Scheduler code in advance to test our new features; provide your feedback to help us improve the functionality. The program will be available for Workload Scheduler distributed and z/OS.
Beta Program Workload Scheduler distributed Beta Program for Workload Scheduler distributed will be managed via the Workload Automation on Cloud (Software as a Service) platform. To participate, send an email to one of the following contacts. You will need to specify your IBM ID in the request: antonella.godino@hcl.com umberto.caselli@hcl.com claudio.falcone@hcl.com Within 24h, you’ll receive an email with a link to access the SaaS environment. Download a Workload Automation on Cloud agent to connect to a Workload Scheduler environment with the most recent build. Access the Dashboard page: all the UIs are available here to manage your workload. Any questions or feedback can be sent to the contacts. We’ll provide support and answer any questions you may have. The program will be launched on March 15
Beta Program Workload Scheduler for z/OS To participate in the Beta Program for Workload Scheduler for z/OS, send an email to one of the following contacts: antonella.godino@hcl.com umberto.caselli@hcl.com claudio.falcone@hcl.com Within 24h, you’ll receive an email with a link to download the libraries. This link will be active for 24h from the receipt of the email. The downloaded libraries can be installed in your environment. Any questions or feedback can be sent to the contacts. We’ll provide support and answer any questions you may have. The program will be launched in mid-April
Agenda
Demo from the development organization
  What-if enhancement: Show the impact on critical jobs excluding non-critical successors from the view
  Improving the monitoring experience with revamped graphical views
  Enforce workload continuity with new recovery options for failed jobs
    Rerun a job with all its successors
    Rerun a job on the same server it previously ran on
  Job Management Plug-in:
    Run different recovery actions based on output conditions
    Automation of iterative workflows
Show the impact on critical jobs excluding non-critical successors from the view What-if enhancement
What-if enabled from monitor jobs (z/OS)
Job predecessors on the critical network
Show impact on critical jobs – action result
Show all jobs
Show all jobs – action result
Show impact on critical jobs action The What-if feature is now enabled from Monitor jobs for z/OS engines showing all of the job’s predecessors on the critical network. The new action «Show impact on critical jobs» will show only the selected job and the critical jobs for which the job is a predecessor in order to evaluate the impact. The new action «Show all jobs» displays all of the jobs between the selected job and the critical jobs. The new actions are available for z/OS and distributed engines.
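The difference between the two actions can be sketched as a filter over the successor graph. This is only an illustration of the behaviour described above, not the product's implementation: the job names, the `successors` map, and the function names are all hypothetical.

```python
from collections import deque

# Hypothetical plan fragment: job -> direct successors (not taken from the slides).
successors = {
    "JOB_A": ["JOB_B", "JOB_C"],
    "JOB_B": ["CRIT_1"],
    "JOB_C": ["JOB_D"],
    "JOB_D": [],
    "CRIT_1": ["CRIT_2"],
    "CRIT_2": [],
}
# Hypothetical set of jobs flagged as critical.
critical = {"CRIT_1", "CRIT_2"}


def reachable(start, graph):
    """Return every job reachable from start by following successor links."""
    seen = set()
    queue = deque(graph.get(start, []))
    while queue:
        job = queue.popleft()
        if job not in seen:
            seen.add(job)
            queue.extend(graph.get(job, []))
    return seen


def show_impact_on_critical_jobs(selected, graph, critical_jobs):
    """Selected job plus only the critical jobs it is a predecessor of."""
    return {selected} | (reachable(selected, graph) & critical_jobs)


def show_all_jobs(selected, graph, critical_jobs):
    """Selected job plus every job on a path from it to a critical job."""
    downstream = reachable(selected, graph)
    on_path = {
        job for job in downstream
        if job in critical_jobs or reachable(job, graph) & critical_jobs
    }
    return {selected} | on_path


print(sorted(show_impact_on_critical_jobs("JOB_A", successors, critical)))
print(sorted(show_all_jobs("JOB_A", successors, critical)))
```

In this made-up plan, JOB_C and JOB_D never lead to a critical job, so both views leave them out; only the second view keeps the intermediate JOB_B.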
Improving the monitoring experience with revamped graphical views
Agenda Personas Hill Research As-is Scenarios Solution scenario
Target user - Jason “As an operator, I need to ensure WA operation continuity 24x7, avoiding involvement of the support team” ~Jason, IWS Operator (quote from interview) What does the user do with the product? Ensures that for each job in error, there is a ticket to manage the failure. Provides support during troubleshooting (can restart/reload jobs, etc.), because he is the only one who can change the plan. Needs to quickly find issues in the workload, understand impacts and alert the appropriate people. Decision factors: Needs to keep things working and resolve tickets. Needs to ensure problem resolution is ongoing. Who is affected by the user? The application team is alerted by Jason when something goes wrong.
Hill Jason can resolve an abend or can understand the root cause of a workload execution delay from a single Graphical View
Research Critical issues identified: Too much data, not enough precision: The graphical views show a lot of information, but not the information I need to perform my tasks. There is no way to hide information that is not important to me in the context of the task I am performing. The organization of the view is not the same as the modelling one. Navigation is too complex: Navigating between different job streams to understand the impact or root cause is complex and cannot be done in a single view.
As-is Scenario A job abends, and a ticket is automatically opened to Jason. Jason needs to understand the impact of the failures on successor jobs. Jason needs to open a lot of windows to understand root causes and impact, because he needs to track the flow across several job streams. Jason often needs to hand-write information to form correlations.
To-be Scenario A job abends, a ticket is automatically opened to Jason. Jason needs to understand the impact of failures on successor jobs. Jason can simply launch one graphical view, to navigate between different job streams quickly and with suggestions regarding potential issues.
To-be Scenario (as story) Now let’s see how this happens through an example scenario: ”Jason receives a call during his daily shift because the Payroll Job Stream has not yet started. He needs to understand why and try to solve the issue.”
Demo Scenario Jason searches for the Payroll Job Stream and opens the Job Stream View to find the predecessor that is causing the delay
Improving the monitoring experience with revamped graphical views Other revamped Graphical Views: Plan view, Pre-production plan view. The plan view and pre-production plan view will be reworked with the new technology.
Key features and key strengths Endless navigation and plan exploration. Live mode: track job stream updates without a manual refresh. Layout inheritance from the modelling definition (with rearrangement allowed for the current working session). Quick filtering and advanced tracking through colors and the «Status bar».
Enforce workload continuity with new and improved recovery options for failed jobs Rerun a job with all its successors Rerun a job on the same server 9 March 2017
Enforce workload continuity with new and improved recovery options for failed jobs Recovery job issues identified: Workload Scheduler implements some advanced scenarios when a job fails, but the recovery procedure is not easy and involves some manual intervention (manually rerun the failed jobs, identify all the successors and manually rerun the successors that already completed…). What is needed is a procedure where job failures are handled without manual intervention.
Enforce workload continuity with new and improved recovery options for failed jobs Developed in sprint 2: Rerun a job with all successors (RFE 57221, 65671) Rerun job on the same server it previously ran on (RFE 130343) Run different recovery actions based on output conditions Already done in sprint 1: Advanced rerun options (RFE 103418)
Enforce workload continuity with new and improved recovery options for failed jobs Rerun a job with all successors (RFE 57221, 65671) Would you like Workload Scheduler to rerun the failed job automatically for you? And help you deal with successor jobs too? [Figure: job streams JSEXT4 (JOBEXT4), JSEXT1 (JOBEXT1, JOBEXT1B), JSINTERNAL (JOBINT1, JOBINT2, JOBINT4, JOBINT3) and JSEXT3 (JOBEXT3); job statuses shown: ABEND, SUCC, HOLD]
Enforce workload continuity with new and improved recovery options for failed jobs A new action has been added in the DWC Monitor Jobs panel to rerun a job with its successors…
Enforce workload continuity with new and improved recovery options for failed jobs You can easily: Check the number of the INTERNAL and EXTERNAL successors on which you can use the rerun command Check the statuses of the jobs Apply the rerun action only for the INTERNAL successors or for ALL the successors
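The INTERNAL versus ALL distinction can be sketched as a graph walk that either stays inside the job stream of the rerun job or follows every successor link. This is a minimal illustration only: the plan data and function names are hypothetical, and the rule that an internal walk does not follow links leaving the job stream is an assumption, not something stated in the slides.

```python
# Hypothetical plan fragment: "JOBSTREAM.JOB" -> direct successors.
plan = {
    "JS_MAIN.JOB1": ["JS_MAIN.JOB2", "JS_OTHER.JOBX"],
    "JS_MAIN.JOB2": ["JS_MAIN.JOB3"],
    "JS_MAIN.JOB3": [],
    "JS_OTHER.JOBX": ["JS_OTHER.JOBY"],
    "JS_OTHER.JOBY": [],
}


def job_stream(job_id):
    """Return the job stream part of a 'JOBSTREAM.JOB' identifier."""
    return job_id.split(".", 1)[0]


def list_successors(job_id, graph, scope="all"):
    """Return successors in breadth-first order; scope is 'internal' or 'all'."""
    if scope not in ("internal", "all"):
        raise ValueError("scope must be 'internal' or 'all'")
    home = job_stream(job_id)
    found = []
    queue = list(graph.get(job_id, []))
    while queue:
        job = queue.pop(0)
        if job in found:
            continue
        if scope == "internal" and job_stream(job) != home:
            continue  # assumption: do not follow links that leave the job stream
        found.append(job)
        queue.extend(graph.get(job, []))
    return found


def count_successors(job_id, graph):
    """Return (internal_count, external_count) for the given job."""
    home = job_stream(job_id)
    everything = list_successors(job_id, graph, "all")
    internal = [j for j in everything if job_stream(j) == home]
    return len(internal), len(everything) - len(internal)


print(list_successors("JS_MAIN.JOB1", plan, "internal"))
print(list_successors("JS_MAIN.JOB1", plan, "all"))
print(count_successors("JS_MAIN.JOB1", plan))
```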
Enforce workload continuity with new and improved recovery options for failed jobs The rerun action is not supported on jobs with certain job statuses. In this case, the action is disabled and the reason is specified in the Messages column
Enforce workload continuity with new and improved recovery options for failed jobs The same action can be performed by conman: conman "rerunsucc S_MDM#JS_RERSUCC_INT.JOB_RERSUCC_INT1" You are prompted to indicate whether you want to rerun the original job with all its successors (internal and external), or only with its internal successors.
Enforce workload continuity with new and improved recovery options for failed jobs From the command line, you can easily monitor the status of all the successor jobs (internal and external): conman "listsucc S_MDM#JS_RERSUCC_INT.JOB_RERSUCC_INT1"
Enforce workload continuity with new and improved recovery options for failed jobs
Conman commands:
  listsucc/rerunsucc WORKSTATION#JOBSTREAM.JOB (interactive; prompts: Do you want… all? Do you want... Internal?)
  listsucc/rerunsucc WORKSTATION#JOBSTREAM.JOB;internal/all
  listsucc/rerunsucc WORKSTATION#JOBNUMBER
  listsucc/rerunsucc WKS#JOBSTREAM_SchedID.JOB;schedid
  listsucc/rerunsucc WKS#JOBSTREAM(scheduledTime).JOB
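For scripted use, the command strings can be assembled from their parts. The sketch below covers only the WORKSTATION#JOBSTREAM.JOB form with the optional ;internal or ;all suffix, exactly as listed above for the beta. The helper name `build_succ_command` is hypothetical, it only builds the string and does not invoke conman, and the syntax in a released product may differ.

```python
def build_succ_command(action, workstation, job_stream, job, scope=None):
    """Build a listsucc/rerunsucc command string in the form shown on the slide.

    scope=None leaves the suffix off, which the slide describes as the
    interactive form; 'internal' or 'all' appends ';internal' or ';all'.
    """
    if action not in ("listsucc", "rerunsucc"):
        raise ValueError("action must be 'listsucc' or 'rerunsucc'")
    if scope not in (None, "internal", "all"):
        raise ValueError("scope must be None, 'internal' or 'all'")
    target = f"{workstation}#{job_stream}.{job}"
    if scope is None:
        return f"{action} {target}"
    return f"{action} {target};{scope}"


# Names taken from the example on the earlier slide.
print(build_succ_command("rerunsucc", "S_MDM", "JS_RERSUCC_INT", "JOB_RERSUCC_INT1"))
print(build_succ_command("listsucc", "S_MDM", "JS_RERSUCC_INT", "JOB_RERSUCC_INT1", "internal"))
```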
Enforce workload continuity with new and improved recovery options for failed jobs Rerun job on the same server it previously ran on (RFE 130343) A job defined on a POOL or a DYNAMIC POOL can now rerun on the same Dynamic Agent it previously ran on. The new option can be specified: in the Recovery Option section; during the manual rerun.
Enforce workload continuity with new and improved recovery options for failed jobs Same workstation – Recovery Options You can specify the option in the job definition, meaning that each recovery job runs on the same dynamic agent it previously ran on. COMPOSER:
Enforce workload continuity with new and improved recovery options for failed jobs Same workstation – Manual Rerun You can always specify the option during the manual rerun. When specified during the manual rerun, the option overrides the one defined in the job definition. If the «same workstation» is down, you can remove the option during the manual rerun so the job can run on an available workstation in the POOL or DPOOL.
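The precedence between the job definition and the manual rerun can be sketched as follows. This is a simplified model, not the scheduler's logic: the function and agent names are hypothetical, and returning `None` when the required agent is down (meaning the rerun cannot be placed) is an assumption made for the example.

```python
def choose_agent(pool, available, previous_agent, same_workstation_default,
                 rerun_override=None):
    """Pick the agent for a rerun.

    same_workstation_default: option value from the job definition.
    rerun_override: True/False if set during the manual rerun, None if untouched.
    """
    # The manual rerun setting, when given, wins over the job definition.
    if rerun_override is None:
        same_workstation = same_workstation_default
    else:
        same_workstation = rerun_override

    if same_workstation:
        # Assumption: with the option on, only the previous agent is acceptable.
        return previous_agent if previous_agent in available else None

    # Option off: any available member of the pool will do (first in pool order).
    candidates = [agent for agent in pool if agent in available]
    return candidates[0] if candidates else None


pool = ["AGENT1", "AGENT2", "AGENT3"]  # hypothetical pool members

# Previous agent is up: the rerun stays on it.
print(choose_agent(pool, {"AGENT1", "AGENT2"}, "AGENT2", True))
# Previous agent is down and the option is still on: nothing can be chosen.
print(choose_agent(pool, {"AGENT1", "AGENT3"}, "AGENT2", True))
# Operator removes the option during the manual rerun: another agent is used.
print(choose_agent(pool, {"AGENT1", "AGENT3"}, "AGENT2", True, rerun_override=False))
```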
Enforce workload continuity with new and improved recovery options for failed jobs Same workstation – Manual Rerun Job Stdlist and conman;props
Enforce workload continuity with new recovery options for failed jobs: Job Management Plug-in Recovery actions based on output conditions Automation of iterative workflows
Job Management Plug-in - Overview The new plug-in can be used to schedule all actions that you can execute manually for a job instance in the plan, such as kill, release, cancel, etc. When you create a Job Management plug-in job, some fields are prefilled with variables that are resolved at runtime; you can also replace them with other values. As with the other plug-ins, there are exportable variables that you can use in your workflow to pass values between jobs in the same or different job streams.
Job Management Plug-in – Example without new plug-in In this example, the job that starts an application on a server hangs or doesn’t start at all. So, the job stream either remains in execution or ends up stuck, and manual intervention is required to recover it. Moreover, the email notification is not sent because the predecessor is still running or has ended in error.
Job Management Plug-in Example: Run different recovery actions based on output conditions Using conditional dependencies and the new Job Management plug-in, we can transform the manual intervention into an “automatic recovery action”. Without the new plug-in, the operator must select the job in the right job stream and execute the required action (e.g. kill the hung job). With the new plug-in, you can insert the right recovery actions in the flow, so that they run if a condition is satisfied. In the end, an email notification is sent with the result.
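The idea of attaching recovery actions to output conditions can be sketched as a small rule table. This is an illustration only: the condition names, the 30-minute threshold, the status values, and the action names are hypothetical and are not taken from the product.

```python
from dataclasses import dataclass


@dataclass
class JobResult:
    name: str
    status: str           # hypothetical values: "EXEC", "SUCC", "ABEND"
    elapsed_minutes: int


# Hypothetical rules: (condition name, predicate on the result, recovery action).
RULES = [
    ("HANG", lambda r: r.status == "EXEC" and r.elapsed_minutes > 30, "kill"),
    ("FAILED", lambda r: r.status == "ABEND", "rerun"),
]


def plan_recovery(result):
    """Return the (condition, action) pairs that apply, ending with the email step."""
    actions = [(name, action) for name, predicate, action in RULES if predicate(result)]
    # The notification runs in every case, mirroring the final email in the flow.
    return actions + [("ALWAYS", "send_email")]


print(plan_recovery(JobResult("START_APP", "EXEC", 45)))
print(plan_recovery(JobResult("START_APP", "ABEND", 2)))
print(plan_recovery(JobResult("START_APP", "SUCC", 2)))
```

Each rule plays the role of a conditional dependency: the recovery job it names runs only when its condition on the predecessor is satisfied, and the notification step no longer depends on the predecessor ending successfully.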
Job Management Plug-in Example – Automation of iterative workflows Among the actions available with the new plug-in are Rerun Internal Successors and Rerun Successors. These two actions enable the user to implement a loop like the DO-WHILE statement used in programming languages. By combining the Job Management plug-in with conditional dependencies, you can rerun a branch of your workflow while a condition is met. In this example, the RESTART_LOOP job reruns the CHECK_LOGS job and its internal successors. This repeats as long as the condition is true (e.g. while a file to read is present).
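The DO-WHILE analogy can be made concrete with a short sketch. The job names CHECK_LOGS and RESTART_LOOP come from the example above; the PROCESS_FILE successor, the file list, and the `max_iterations` safety cap are hypothetical additions for illustration.

```python
def run_iterative_workflow(pending_files, max_iterations=100):
    """Simulate the loop: run the branch once, then repeat while files remain."""
    queue = list(pending_files)  # copy so the caller's list is not modified
    log = []
    iterations = 0
    while True:
        # Loop body: CHECK_LOGS and its internal successors run at least once.
        current = queue.pop(0) if queue else None
        log.append(("CHECK_LOGS", current))
        log.append(("PROCESS_FILE", current))  # hypothetical internal successor
        iterations += 1
        # RESTART_LOOP: rerun the branch only while a file to read is present.
        if not queue or iterations >= max_iterations:
            break
    return log


print(run_iterative_workflow(["a.log", "b.log"]))
print(run_iterative_workflow([]))
```

As in a DO-WHILE statement, the condition is checked after the body, so the branch executes once even when no file is waiting.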