Copyright © 2002 Legato Systems, Inc. Legato Confidential.

Acknowledgments Byron Bush, Scott S. Hilpert and Lee, JeongKyu
Presentation transcript:

Legato Systems, Inc - Confidential and Proprietary

Introduction
- Prerequisites for attending this TOI session
- Overview and benefits of the new feature
- Installation considerations
- How to configure/enable the feature
- Using the feature
- Licensing considerations
- Architecture and internal design
- Debugging techniques and tips
- Questions and answers

Prerequisites
- List any prerequisites to attending this presentation
- Internal documentation (proprietary): d1783, d1682
- Non-proprietary: c_pr/spec/SMIS_v101.pdf c_pr/spec/SMIS_v101.pdf _v28 DSP0107.pdf

Overview and Benefits
- NetWorker's backup control and event management were limited, and monitoring and reporting were sparse.
- Detecting and classifying scheduled backup failures was difficult for the NetWorker administrator.
- The history of failures and events was not stored in a structured database.
- Runtime monitoring of backups was limited.
- Parallelism control was not centralized.

Overview and Benefits (cont.)
- Savegroup error reporting was inadequate.
- Savegroup failures were not easy to detect.
- Failure to back up some files was not treated as an error or a warning.
- Failures were reported only after the saveset completed.
- Runtime monitoring of savesets was limited.
- Control over individual savesets was limited.

Overview and Benefits (cont.)
- To solve these problems, a new jobs framework is used.
- The framework uses a new daemon called nsrjobd (the jobs daemon).
- The jobs daemon maintains a repository that stores information about jobs, such as status, indications, and job session information.
- This information is gathered at run time to allow monitoring of active jobs.
- Jobs are managed and controlled from a central point, which makes it possible, for example, to stop an individual backup from the GUI.
- Jobs are queued in the jobs daemon for central parallelism control.
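The central queueing idea can be illustrated with a minimal Python sketch. It is not nsrjobd's implementation: the `JobQueue` class, its method names, and the job names are hypothetical, and the only behavior taken from the slide is that jobs wait in one central queue until a parallelism slot is free.

```python
from collections import deque


class JobQueue:
    """Hypothetical central queue: at most `parallelism` jobs run at once."""

    def __init__(self, parallelism):
        if parallelism < 1:
            raise ValueError("parallelism must be at least 1")
        self.parallelism = parallelism
        self.pending = deque()  # jobs waiting for a slot, in arrival order
        self.running = set()    # jobs currently holding a slot

    def submit(self, job_id):
        """Queue a job, then start as many queued jobs as free slots allow."""
        self.pending.append(job_id)
        return self._dispatch()

    def finish(self, job_id):
        """Release the slot held by a job, then start waiting jobs."""
        self.running.discard(job_id)
        return self._dispatch()

    def _dispatch(self):
        """Move jobs from pending to running; return the ones just started."""
        started = []
        while self.pending and len(self.running) < self.parallelism:
            job_id = self.pending.popleft()
            self.running.add(job_id)
            started.append(job_id)
        return started


# Three jobs compete for two slots; the third starts only when one finishes.
queue = JobQueue(parallelism=2)
started_first = [queue.submit(name) for name in ("save-a", "save-b", "save-c")]
started_after_finish = queue.finish("save-a")
```

Keeping the limit in one place, rather than in each caller, is what lets a single component decide when the next job may start.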

Overview and Benefits (cont.)
- Savegroup starts jobs using the new central jobs daemon (nsrjobd).
- Savegroup receives information while processes run, rather than after the fact. This allows for better inactivity timeout monitoring.
- Jobs report indications (events) continuously during the run.
- Savegroup monitors indications and generates error reporting based on them.

Overview and Benefits (cont.)
Old framework (diagram not preserved in this transcript)

Overview and Benefits (cont.)
New framework (diagram; labels shown: Server, Client, savegrp, nsrd, nsrjobd, nsrexecd, save or savefs)

Overview and Benefits (cont.)
System requirements to use the feature
- Standard requirements
- Needs more space under /nsr/res

Overview and Benefits
Where to learn more
- D1783
- D1786
- D
- NMC TOI

Installation Considerations
Changes to installation
- /nsr/res/jobsdb is created at installation.
- New binary on the server: nsrjobd
- The RAP database used by nsrjobd does not export an RPC interface, but is viewable on disk.

Configuring the Feature
How to enable and/or configure this feature
- Always enabled (cannot be disabled)
- New attributes in the NSR resource
  - Maximum Jobs DB size
  - Minimum Retention time
- New attributes in savegroup
  - Restart window: time limit for a valid restart (default: 12:00 hr)
  - Success threshold: threshold to determine success/failure based on indication severity (default: Warning)

Using the Feature
- The daemon is started by nsrd and runs only on the server, not on storage nodes or clients.
- The daemon does all the remote execution and gathers information on the client-side processes.
- Information is kept in permanent storage so that NMC can use it for reporting.

Using the Feature
- New commands: none
- GUI: changes in the GUI are described by the NMC TOI

Using the Feature
Attributes
- Minimum retention time: use this to configure the minimum amount of time that records stay in the jobs database.
- Maximum Jobsdb size: use this to configure the maximum amount of space that the records use. (As reported by save -nq)
- Restart window: use this to set the time limit within which the last run is still considered a valid backup.
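As a rough illustration of the restart window, the Python sketch below checks whether a previously completed saveset is still recent enough to be kept on restart. The function name and the choice to measure the window backwards from the restart time are assumptions; the slides give only the 12-hour default.

```python
from datetime import datetime, timedelta

# Default restart window quoted on the slides (12:00 hr).
DEFAULT_RESTART_WINDOW = timedelta(hours=12)


def previous_run_is_valid(completed_at, restart_at, window=DEFAULT_RESTART_WINDOW):
    """Return True if a completed saveset is recent enough to keep on restart.

    Assumption: the window is measured backwards from the restart time.
    """
    age = restart_at - completed_at
    return timedelta(0) <= age <= window


restart_time = datetime(2005, 7, 19, 16, 0)
# Completed 7 hours before the restart: inside the 12-hour window.
recent_ok = previous_run_is_valid(datetime(2005, 7, 19, 9, 0), restart_time)
# Completed 18 hours before the restart: outside the window, so it is rerun.
stale_ok = previous_run_is_valid(datetime(2005, 7, 18, 22, 0), restart_time)
```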

Using the Feature (cont.)
Success threshold
- Use the success threshold to control when savesets are reported as failures.
- If the success threshold is set to Warning (default), the savegroup is reported as successful even if warning indications are generated.
- Setting the success threshold to “Success” means warnings are treated and reported as failures.
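The threshold rule can be sketched in a few lines of Python. The `Severity` enum, its numeric ordering, and the `job_succeeded` function are hypothetical names for illustration; only the behavior of the Warning and Success settings comes from the slide.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical severity ordering, from least to most severe."""
    SUCCESS = 0
    WARNING = 1
    ERROR = 2


def job_succeeded(indications, threshold=Severity.WARNING):
    """A job succeeds if its worst indication does not exceed the threshold."""
    worst = max(indications, default=Severity.SUCCESS)
    return worst <= threshold


# Default threshold (Warning): a warning still counts as success.
with_default = job_succeeded([Severity.WARNING])
# Threshold "Success": the same warning is reported as a failure.
with_strict = job_succeeded([Severity.WARNING], threshold=Severity.SUCCESS)
```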

Using the Feature (cont.)
Screenshot: Group Properties - Advanced

Using the Feature (cont.)
Report changes: summary section

NetWorker savegroup: (notice) Default completed, Total 3 client(s), 1 Succeeded with warnings(s), 2 Succeeded.
Please see group completion details for more information.
Succeeded with warnings: scoop.legato.com
Succeeded: greenland.devlab.legato.com, soft
Start time: Tue Jul 19 16:00:
End time: Tue Jul 19 16:01:

Using the Feature (cont.)
Indications

--- Unsuccessful Save Sets ---
* pa1pberde:c:\SFU\var\adm save: Saving files modified since Thu Feb 24 16:01:
* pa1pberde:c:\SFU\var\adm C:\SFU\var\adm\.security
* pa1pberde:c:\SFU\var\adm C:\SFU\var\adm\utmpx
* pa1pberde:c:\SFU\var\adm C:\SFU\var\adm\wtmpx
* pa1pberde:c:\SFU\var\adm C:\SFU\var\adm\
* pa1pberde:c:\SFU\var\adm C:\SFU\var\
* pa1pberde:c:\SFU\var\adm C:\SFU\
* pa1pberde:c:\SFU\var\adm C:\
* pa1pberde:c:\SFU\var\adm /
* pa1pberde:c:\SFU\var\adm pa1pberde: c:\SFU\var\adm level=incr, 8 KB 00:00:05 5 files
* : File C:\SFU\var\adm\.security could not be opened and was not backed up. (The process cannot access the file because it is being used by another process.)
* : File C:\SFU\var\adm\utmpx could not be opened and was not backed up. (The process cannot access the file because it is being used by another process.)
* : File C:\SFU\var\adm\wtmpx could not be opened and was not backed up. (The process cannot access the file because it is being used by another process.)

Using the Feature (cont.)
Previously completed savesets in a restart

--- Previously Completed Save Sets ---
aragorn: / level=full, 3831 MB 02:08: files
aragorn: /space level=full, 6907 MB 02:48: files
dev-nwserv: index:aragorn level=full, 63 MB 00:00:09 9 files

Licensing Considerations
- This feature is not licensed.

Questions and Answers
- Any questions that have not been answered yet?

Architecture and Internal Design
Architectural diagram (components shown: savegrp, nsrjobd, Jobs database, nsrd, nsrexecd, save, Console/GUI, nsrmmd)

Architecture and Internal Design (cont.)
More notes on internal design
- The jobs daemon uses session channels wherever possible for remote execution and for communication with nsrd and savegrp.
- Every job gets a record in the jobs database. The record remains for a period of time and is then purged based on the attributes set in the NSR resource.
- The daemon is multi-threaded, and not all threads are persistent. Depending on the OS, it may therefore appear that more than one nsrjobd is running.
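The slides do not say how the minimum retention time and the maximum database size interact when records are purged. The Python sketch below shows one plausible policy, stated purely as an assumption: records younger than the minimum retention time are always kept, and older records are dropped oldest-first while the total size exceeds the maximum. The `JobRecord` type and `purge` function are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class JobRecord:
    """Hypothetical jobs-database record; the real record layout is not shown."""
    job_id: int
    completed_at: datetime
    size_bytes: int


def purge(records, now, min_retention, max_db_bytes):
    """Return the records kept after purging.

    Assumed policy (not confirmed by the slides): records younger than
    min_retention are always kept; older records are dropped oldest-first
    while the total size exceeds max_db_bytes.
    """
    kept = sorted(records, key=lambda record: record.completed_at)
    total = sum(record.size_bytes for record in kept)
    while (
        kept
        and total > max_db_bytes
        and now - kept[0].completed_at >= min_retention
    ):
        total -= kept[0].size_bytes
        kept.pop(0)
    return kept


now = datetime(2005, 7, 20, 0, 0)
records = [
    JobRecord(1, datetime(2005, 7, 10, 0, 0), 400),
    JobRecord(2, datetime(2005, 7, 15, 0, 0), 400),
    JobRecord(3, datetime(2005, 7, 19, 20, 0), 400),
]
kept_ids = [r.job_id for r in purge(records, now, timedelta(days=1), 700)]
```

Under this assumed policy, a long retention period or a large maximum size both let the database grow, which matches the pitfall the deck lists for these two attributes.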

Architecture and Internal Design (cont.)
- Savegroup opens a bidirectional session channel with nsrjobd at start.
- Savegroup requests nsrjobd to start a remote job.
- Nsrjobd opens a bidirectional session channel with the client’s nsrexecd.
- Nsrexecd forks the child job and has a bidirectional session channel to the job.
- The job reports state changes to nsrjobd.

Architecture and Internal Design (cont.)
- Nsrjobd relays the state changes to savegroup.
- Once the job gets a media session, the session info is relayed to nsrjobd by nsrd.
- Savegroup monitors for inactivity based on the media session info and the activity timestamp in the nsrjobd database.
- All stdout is redirected by nsrjobd to savegroup for backward compatibility.
- Savegroup uses the stdout messages for completion reporting.
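The inactivity check described above can be sketched in Python as a comparison of each job's last activity timestamp against a timeout. The function name, the dictionary layout, and the sample job ids are assumptions for illustration; the real check also uses media session info, which is not modeled here.

```python
from datetime import datetime, timedelta


def find_inactive_jobs(last_activity, now, inactivity_timeout):
    """Return ids of jobs whose last recorded activity is older than the timeout.

    `last_activity` maps a job id to its most recent activity timestamp.
    """
    return sorted(
        job_id
        for job_id, stamp in last_activity.items()
        if now - stamp > inactivity_timeout
    )


now = datetime(2005, 7, 19, 16, 30)
last_activity = {
    101: datetime(2005, 7, 19, 16, 29),  # active one minute ago
    102: datetime(2005, 7, 19, 15, 45),  # silent for 45 minutes
}
inactive = find_inactive_jobs(last_activity, now, timedelta(minutes=30))
```

Because timestamps arrive while the job runs, a monitor like this can act during the backup instead of waiting for it to complete.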

Architecture and Internal Design (cont.)
- The instrumented client binaries (save) can generate indication events and completion events for the job when errors or warnings occur.
- These indications are relayed to savegroup by nsrjobd and are also stored in the jobs database.
- Savegroup determines the success or failure of the backup based on indication severity.

Debugging Techniques and Tips
How to obtain debugging or tracking information
- Uses the standard debugging option -D with levels 1-9.
- All debugging and error output is logged to daemon.log.
- All output is prepended with a date/time stamp and the daemon name.
- The database at /nsr/res/jobsdb can be viewed using standard RAP tools. It contains a record of the jobs that have run, which makes it a useful source of information for debugging.
- Core file location follows the same convention as all other daemons.
- Use -vvv to get verbose output from the remote client.
- -D is not relayed to spawned jobs.
- The verbose output is copied to daemon.log, and the temp file is retained as in 7.2.
- The Debug indication level is not used in 7.3; sleeker, internationalized tracing of remote backup job failures is expected in a future release.

Debugging Techniques and Tips
Common pitfalls you or the customer may encounter
- Server machines need more memory, disk space, and CPU power than in the past. NetWorker still works on a reasonable low-end server configuration.
- Reported data (file sizes and completion times) will not exactly match what the GUI reports, because the two come from different sources. The numbers can be slightly skewed.
- If an instrumented client binary does not report the right indication level, a warning can look like an error, or vice versa.
- All messages are in the client’s locale, so messages from clients in a different locale are not translated to the server’s locale. (This will be addressed in the next release.)
- A retention period that is too long, or a maximum size that is too large, on the jobs database.
- Client state transitions are lost, causing savegroup to appear hung. (nsrjobd periodically cleans up jobs in an incorrect state, which lets savegroup recover from the hang.)
- A very small restart window causes the previous backups to be considered invalid, so restarts take longer.
- A very large restart window can cause a restart to overlap with the next scheduled run. (Ideally the restart window should be half of the interval.)
- Grouping may need to change: place important clients and savesets, for which a warning should be considered a failure, into groups with a Success threshold of “Success”.
- Reporting information is lost if the NMC daemon (gstd) is not run for a period greater than the minimum retention period.
- Savegroup is unable to spawn processes. Check the new authorization settings and the servers files.
- Customers may wonder why there are no nsrexec processes running on the server. This is as designed.

Debugging Techniques and Tips (cont.)
Error messages customers might see (daemon.log messages)
- All jobs did not end gracefully….
  - Some jobs were not aborted at exit, and savegroup was forced to exit before waiting for all jobs to exit.
  - The completion report will not be valid for all jobs.
- Lost channel with server
  - Communication with nsrjobd was broken, which caused savegroup to abort.
  - If this message is seen repeatedly, nsrjobd is too busy to handle requests or is hung. If a restart does not solve the problem, a daemon diagnosis of nsrjobd (truss/pstack etc.) might be needed.
- Aborting inactive job (%d)
  - The job has not saved data for longer than the inactivity timeout.
  - The network bandwidth to the client needs to be checked.
  - If the save process is hung in a disk read, a retry might resolve the issue.

Known Issues and Limitations
Known issues and/or bugs
- A restarted savegroup does not clone savesets from previous runs. (Existing issue in all past releases.)
  - Workaround: none (planned to be resolved in the next maintenance release)
Limitations
- Older clients will not have indications.
- Not all binaries are fully instrumented to generate the new indications (gradual approach).
- CPE will be trained to extend existing error messages into indications for 7.3 clients.
  - Workaround: clients should be upgraded to 7.3

Questions and Answers
- Any questions that have not been answered yet?

Demonstration
If time permits:
- show db layout on disk and browsing of the db using nsradmin
- show savegrp -D9 and -vvv output and explain how to read the new debug messages
- show the temp files created (and how to clean up the debug temp files)

Questions and Answers
- Any questions that have not been answered yet?
- Thanks for attending