
Presentation on theme: "AUTOMATED RESTORE TESTING FOR TIVOLI STORAGE MANAGER."— Presentation transcript:



3 TSMworks, Inc.
  - Based in Research Triangle area, NC, USA
  - IBM Advanced Business Partner
  - Big fans of Tivoli Storage Manager
  - Broad experience with Fortune 500 clients

4 The Recovery Gap
What’s the Problem?
  - The Gartner Group: “30% of all backups are not recoverable.”
  - The Yankee Group: “40% were unable to recover data.”
  - Symantec: “…one in four tests fail.”

5 A few reasons why recoveries fail
Wisdom from TSM experts and User Groups:
Node wasn’t backed up, because:
  - It fell off the schedule
  - It was never even registered to TSM (“rogue” server)
Node was backed up “successfully”, but:
  - Some critical files were excluded
  - Files (often databases) were in-use or locked
  - Nothing was mounted on the mount point
  - Windows Journaling Service failed
  - Retention was too short
... and many more at

6 Do Reporting Tools Help?
Node wasn’t backed up, because:
  - YES: It wasn’t scheduled by TSM (or anything else)
  - NO: It was never registered to TSM (“rogue” server)
Node was backed up “successfully”, but:
  - NO: Some critical files were excluded
  - YES: Files (often databases) were in-use or locked
  - NO: Nothing was mounted on the mount point
  - NO: Windows Journaling Service failed
  - NO: Retention was too short
  - MOSTLY NO: … many more at
So, test the backups. (If you’re serious about recovery).

7 Bare-metal restore testing is ideal... …but unfeasible for 500 machines daily. (Diagram labels: TSM, OS, Filesystem, Database, Reboot, Test)

8 Sampling restores is very workable This technique always uncovers problems. And it’s quite feasible. (Diagram labels: TSM, dsmc restore)

9 ART: Automated Restore Testing ART restores a few files, chosen at random, from every computer TSM backs up. Restored files come to the ART VM. Production nodes are untouched. ART doesn’t restore huge files, hammer TSM, or disrupt migration, reclamation, etc. And … ART usually uncovers huge amounts of wasted storage.
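
The "chosen at random, but not huge" rule can be sketched in a few lines of Python. This is only an illustration, not ART's actual code: the `BackedUpFile` record, the `pick_test_file` helper, and the size cap are all hypothetical names and values chosen for the example.

```python
import random
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class BackedUpFile:
    """Hypothetical record for one file known to the backup server."""
    path: str
    size_bytes: int


def pick_test_file(files: Sequence[BackedUpFile],
                   max_bytes: int,
                   rng: random.Random) -> Optional[BackedUpFile]:
    """Pick one file at random, skipping anything larger than max_bytes."""
    candidates = [f for f in files if f.size_bytes <= max_bytes]
    if not candidates:
        return None  # nothing small enough to restore cheaply
    return rng.choice(candidates)


if __name__ == "__main__":
    inventory = [
        BackedUpFile("/etc/hosts", 400),
        BackedUpFile("/var/db/full_dump.bak", 500 * 10**9),
    ]
    # The 100 MB cap is an assumed value, not a documented ART setting.
    print(pick_test_file(inventory, max_bytes=100 * 10**6, rng=random.Random()))
```

Filtering before sampling keeps the test restore cheap, which is one plausible way to avoid hammering the backup server.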

10 ART architecture ART is a self-contained Virtual Appliance. (Diagram labels: VMware Virtual Machine; Linux OS; MySQL; Web Server; Rails Engine; dsmc; dsmadmc; Network; LDAP; TSM Servers; LDAP Server; SMTP Server; TSMworks site; ssh, curl, etc. Functions shown: setup, TSM config. files, run tests, annotate, upgrade, send logs, login.)

11 How ART installs Download it, start it up, point it at TSM. Your ESX team usually does the install. Easy. (Diagram labels: Our website, Your ESX farm, Your TSM Server)

12 How ART works
  1. Discover all clients
  2. For each client, restore one file, selected at random
  3. Record results for each test restore
  4. Show results on dashboard
It’s all we do.
(Diagram labels: Your TSM site; TSM servers; Client 1, Client 2, Client 3; Network/SAN; dsmadmc; dsmc; Database; Web server)
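
The four steps can be sketched as a small loop. This is a minimal illustration under stated assumptions, not ART's implementation: the inventory dictionary stands in for client discovery, and the `restore` callable stands in for whatever actually drives `dsmc restore`. The names `RestoreResult`, `run_sweep`, and `summarize` are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass
class RestoreResult:
    node: str
    path: str
    passed: bool
    message: str


def run_sweep(inventory: Dict[str, Sequence[str]],
              restore: Callable[[str, str], Tuple[bool, str]],
              rng: random.Random) -> List[RestoreResult]:
    """One testing pass: a single random test restore per client."""
    results = []
    for node in sorted(inventory):                 # step 1: every known client
        files = inventory[node]
        if not files:
            results.append(RestoreResult(node, "", False, "no backed-up files found"))
            continue
        path = rng.choice(files)                   # step 2: one file, at random
        try:
            ok, message = restore(node, path)
        except Exception as exc:                   # a crash counts as a failed test
            ok, message = False, f"restore raised: {exc}"
        results.append(RestoreResult(node, path, ok, message))  # step 3: record
    return results


def summarize(results: Sequence[RestoreResult]) -> Dict[str, int]:
    """Step 4: the counts a dashboard bar would show."""
    passed = sum(1 for r in results if r.passed)
    return {"passed": passed, "failed": len(results) - passed}
```

Passing the restore step in as a function keeps the loop testable with a fake, so the bookkeeping can be checked without touching a real backup server.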

13 Dashboard Each bar shows the results of one “Sweep”, one testing pass through all the nodes. One sweep may take hours or days. You can break it up to run, say, 2 PM to 6 PM daily. (Not usually necessary.)
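
Restricting a sweep to a daily window such as 2 PM to 6 PM comes down to a time-of-day check before each test restore. The helper below is a hypothetical sketch of that check, not a documented ART setting.

```python
from datetime import time


def in_window(now: time, start: time, end: time) -> bool:
    """True if `now` falls inside [start, end), including windows that wrap midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


if __name__ == "__main__":
    # A sweep runner could pause whenever this returns False and resume later.
    print(in_window(time(15, 30), time(14, 0), time(18, 0)))
```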

14 ART for Auditors Storage auditors can use the “Passed” section to see proof that each node was really tested.

15 ART for Auditors From the list of successful nodes, drill down …

16 ART for Auditors … to see that each filespace was tested…

17 ART for Auditors … and that the test succeeded.

18 ART for TSM Admins Unlike Auditors, TSM administrators will want to look at the errors, not the successes.

19 ART for TSM Admins ART shows a short list of root causes, rather than the long list of nodes that failed. Click the message text…
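
Collapsing a long list of failed nodes into a short list of root causes is essentially a group-by on the failure message. A minimal sketch, assuming failures arrive as (node, message) pairs; the messages in the usage example are made-up placeholders, not real TSM message texts.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def group_by_root_cause(failures: Iterable[Tuple[str, str]]) -> List[Tuple[str, List[str]]]:
    """Group failed nodes by message, most widespread cause first."""
    groups: Dict[str, Set[str]] = defaultdict(set)
    for node, message in failures:
        groups[message].add(node)
    rows = [(message, sorted(nodes)) for message, nodes in groups.items()]
    # Sort by number of affected nodes (descending), then by message text.
    rows.sort(key=lambda row: (-len(row[1]), row[0]))
    return rows


if __name__ == "__main__":
    sample = [("node1", "missing tape"), ("node2", "missing tape"), ("node3", "node locked")]
    for message, nodes in group_by_root_cause(sample):
        print(f"{message}: {len(nodes)} node(s)")
```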

20 ART for TSM Admins … to see activity log detail about the failure. Ex: restore fails due to a missing tape. The sensible Admin will fix all missing tapes, preventing failures on many other nodes.

21 ART for TSM Admins Click the Error Code to get plain-language help and advice on what to do, if you’re not familiar with TSM.

22 ART for TSM Admins If you have Rogue-server detection licensed,

23 ART for TSM Admins ART will find servers that are on your network, but are not registered to TSM. No reporting tool that talks only to TSM will ever find these rogue servers.

24 ART for TSM Admins These IP Addresses are on the network, but TSM doesn’t know about them. Click the IP address to see network analysis of these potential problems…

25 ART for TSM Admins … including what kind of OS they use, and which ports they listen on. This helps distinguish important IPs (servers) from irrelevant ones (printers, laptops, etc.).

26 ART for TSM Admins Document what the irrelevant IPs are, and ignore them on future sweeps.
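
At its core, rogue-server detection is a set difference: addresses seen on the network, minus addresses registered to TSM, minus addresses already documented as irrelevant. The sketch below assumes those three lists are already available; how ART actually gathers them is not described in the slides, and `find_rogues` is a hypothetical name.

```python
import ipaddress
from typing import Dict, Iterable, List


def find_rogues(seen_on_network: Iterable[str],
                registered_with_tsm: Iterable[str],
                ignored: Dict[str, str]) -> List[str]:
    """Return IPs that are live on the network but unknown to TSM and not ignored.

    `ignored` maps an IP to a note explaining why it is irrelevant
    (for example "printer"), so the decision stays documented.
    """
    rogues = set(seen_on_network) - set(registered_with_tsm) - set(ignored)
    return sorted(rogues, key=ipaddress.ip_address)  # numeric, not string, order


if __name__ == "__main__":
    # 192.0.2.0/24 is a documentation-only address range.
    seen = ["192.0.2.10", "192.0.2.2", "192.0.2.30", "192.0.2.9"]
    registered = ["192.0.2.2"]
    ignored = {"192.0.2.30": "printer"}
    print(find_rogues(seen, registered, ignored))
```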

27 ART for Storage Trimming A side effect of testing: ART finds junk storage. Reduce your storage footprint by 20-50%.

28 Where does junk come from?
  - Ancient policies are way too conservative
  - Full database dumps retained for a year
  - TDP agents don’t delete old backups
  - Clusters are set up wrong
  - Filesystems get renamed
  - Users back up remote disks
  - Decommissioned machines aren’t deleted

29 ART for Storage Trimming This filesystem uses 127 TB, yet has not been backed up in three weeks. Why not? Call Chris Chiang. Either back it up, or delete it, and reclaim 127 TB.

30 ART for Storage Trimming This 33 GB drive uses 1.3 TB of backups. Why? Call J Williamson and decide.
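
The two checks on the last two slides, "not backed up recently" and "backups far larger than the drive", can be expressed as simple predicates. This is a sketch only: the `Filespace` fields, the 14-day staleness threshold, and the sample dates are assumptions made for the example, not ART's rules.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Filespace:
    """Hypothetical summary of one filespace as a report might show it."""
    node: str
    name: str
    capacity_gb: float     # size of the source drive
    stored_gb: float       # backup storage consumed on the server
    last_backup: date


def is_stale(fs: Filespace, today: date, max_age_days: int = 14) -> bool:
    """True if the filespace has not been backed up within max_age_days."""
    return (today - fs.last_backup).days > max_age_days


def bloat_ratio(fs: Filespace) -> float:
    """Backup storage used per GB of source drive capacity."""
    if fs.capacity_gb <= 0:
        return float("inf")
    return fs.stored_gb / fs.capacity_gb


if __name__ == "__main__":
    # The 33 GB drive holding 1.3 TB of backups, taking 1 TB as 1000 GB.
    drive = Filespace("nodeX", "C:", 33, 1300, date(2024, 1, 1))
    print(f"{bloat_ratio(drive):.1f}x")
```

For the 33 GB drive above, the ratio works out to roughly 39 times the drive's own capacity, which is the kind of outlier worth a phone call.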

31 ART for Storage Trimming The Almost Full view shows where OS or Applications may soon crash due to low disk space.

32 ART for Storage Trimming If you change the sorting or filtering, a “save as..” link appears …

33 ART for Storage Trimming So you can give your custom report a new name …

34 ART for Storage Trimming And have it on a new tab.

35 Benefits and Pricing
  - Easy install
  - Supports storage audits
  - Prevents restore problems
  - Finds rogue servers
  - Trims waste storage
Pricing: $35/node (less for larger sites).

36 Free trial = Free Health Check ART’s free trial tests 20% of your site. Use it as a free Health Check, with our compliments.

