
1 Data Storage CPTE 433 John Beckett

2 The Paradox
“If I can go to a computer store and buy 1000 gigabytes for $50, why does it cost more in your server farm?”
It isn’t about having storage
It’s about managing data through its life-cycle
The new measurement is price per gigabyte-month
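The price per gigabyte-month measurement can be sketched as a small calculation. This is only an illustration: the function name, the 36-month service life, and every dollar figure other than the $50 retail drive are assumptions, not numbers from the slides.

```python
def price_per_gb_month(purchase_cost, monthly_operating_cost, usable_gb, service_months):
    """Spread total cost of ownership over usable capacity and service life."""
    total_cost = purchase_cost + monthly_operating_cost * service_months
    return total_cost / (usable_gb * service_months)

# Bare retail drive from the slide: $50 for 1000 GB, with no operating cost counted.
retail = price_per_gb_month(50.0, 0.0, 1000, 36)

# Hypothetical managed storage: redundancy, backups, power, and staff time
# are folded into assumed purchase and monthly figures.
managed = price_per_gb_month(400.0, 60.0, 1000, 36)

print(f"retail drive:    ${retail:.4f} per GB-month")
print(f"managed storage: ${managed:.4f} per GB-month")
```

The point of the sketch is that the purchase price is only one term; the recurring cost of managing the data dominates once it is counted.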

3 Definitions
Spindle, platters, heads
  –Physical arrangement of disk
  –Of little interest to us, except to help us understand how new technologies will impact us
Drive controller
  –On the hard drive itself
  –Connected to…
Host Bus Adapter

4 RAID
RAID level | Methods | Characteristics
0 | Stripe data across multiple drives | Faster reads and writes; poor reliability
1 | Mirrors copy of data across two drives | Faster reads; good reliability; failures tend to be catastrophic (JB & SA)
5 | Distributed parity; any single disk may fail without loss | Faster reads; slower writes; more economical
10 | Mirrored stripes; RAID 0 group mirrored onto another group | Faster reads; best reliability; most expensive
Table 25.1
Dilemma: Can you add hardware without subtracting from reliability? (Only by using very high-quality hardware)
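The claim that RAID 5 survives any single disk failure rests on XOR parity. The sketch below shows the idea on one stripe of three data blocks. It is a simplification: real RAID 5 rotates the parity block across the drives, and the block contents and names here are made up for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe: three data blocks plus a parity block computed from them.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the drive that held data[1]: XOR the survivors with parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])
```

Losing two blocks of the same stripe leaves nothing to XOR against, which is why a second failure before the rebuild finishes loses data.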

5 Where Is the Data?
DAS – Directly Attached Storage (IBM: DASD), connected directly to the server
  –May be a RAID array
NAS – Network-Attached Storage
  –Uses a protocol to transfer data
SAN – Storage-Area Network
  –Separate network segment for storage, connecting servers and drives
A SAN is usually made out of NAS devices

6 Structure of a SAN
[Diagram: LAN; SAN controllers (servers); SAN backbone; NAS]

7 Managing Storage
Think of storage as a community resource
  –If it’s personal, does it have any business on company equipment?
Determine storage needs of the group
Identify an architecture that will satisfy that need
Plan an upgrade path for growth in the future
Implement inventory and spares policy

8 Standardization
Disk drives are as important to standardize as any other component
  –Spares issue
  –Warranty service procedure
  –Ability to use obsoleted drives
Drive lifetime issue:
  –A drive motor may become unreliable after so many revolutions

9 The Storage SLA
Availability
Response time
Reliability is increased by RAID > 0
  –…only if monitored and maintained
  –…only if RAID method is preserved
Network is a part of the reliability picture

10 Backup and RAID
RAID is not a backup strategy
If >n drives fail, you lose data
Controller failure can cause data loss
One possibility: RAID mirror as a backup
  –Requires disconnecting other drive on failure
How about: Spare drive, auto backup each night
  –Maybe including incremental backups

11 Using a RAID mirror to speed up backup
Break the RAID pair
Back up
Re-connect the RAID pair

12 Monitoring
How full
  –Rate of change
Broken drives
How busy (especially network on NAS)
Unused
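The "how full" and "rate of change" checks can be sketched with the standard library. `shutil.disk_usage` is a real Python call; the helper names and the idea of projecting days until full from a daily growth figure are assumptions added for illustration.

```python
import shutil

def percent_full(path="."):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total

def days_until_full(free_bytes, growth_bytes_per_day):
    """Project days of headroom from the observed rate of change.

    Returns None when usage is flat or shrinking, since no fill date exists.
    """
    if growth_bytes_per_day <= 0:
        return None
    return free_bytes / growth_bytes_per_day

print(f"{percent_full('.'):.1f}% full")
# Hypothetical figures: 200 GB free, growing 4 GB per day.
print(days_until_full(200 * 10**9, 4 * 10**9), "days until full")
```

In practice the growth figure would come from comparing usage samples taken on different days, so the monitor has to keep history rather than only the current reading.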

13 SAN Caveats
Benchmarks are problematic
Useful versus physical storage size
Product life-cycle issues

14 Pipeline Optimization
Read – buffered and available immediately
Write – buffered and done at leisure
  –Dangerous if drive fails before update is posted
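The risk of buffered writes can be narrowed from the application side by forcing buffers out before treating a write as done. The sketch below uses the real `flush` and `os.fsync` calls; the `durable_write` helper and the file name are hypothetical. Note the limit: `os.fsync` asks the operating system to commit the data, but a drive or controller with its own volatile write cache may still hold it.

```python
import os
import tempfile

def durable_write(path, payload):
    """Write bytes and push them past the process and OS buffers."""
    with open(path, "wb") as f:
        f.write(payload)      # lands in the process-level buffer
        f.flush()             # hands the data to the operating system cache
        os.fsync(f.fileno())  # asks the OS to commit the data to the device

# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "update.bin")
    durable_write(target, b"posted update")
    with open(target, "rb") as f:
        print(f.read())
```

The trade-off is the one the slide names: waiting for the commit gives up the speed that buffering was meant to provide.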

15 Sync
Early versions of an OS usually don’t sync properly if shut down during “quiet” time
  –Novell – unscheduled shutdown could be catastrophic
  –Windows – learned some lessons from others
Is it safe to turn off power during operation?
  –A mainframe will be able to handle this

16 Performance
Locate simultaneously-used data on different spindles to minimize head thrashing
  –The more complex your data, the harder this is to do
  –Restrict this technique to very heavily-used data
Beware of compression
  –Assumes your data is organized a certain way
  –Assumes your CPU has spare time to spend

17 Disk Access Density
I/O operations per second per gigabyte of capacity
How fast can you move the entire drive of data?
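Both questions on this slide reduce to simple arithmetic. The sketch below uses assumed round numbers for IOPS, capacity, and throughput rather than measurements of any particular drive, and it takes 1 GB as 1000 MB.

```python
def access_density(iops, capacity_gb):
    """I/O operations per second available per gigabyte of capacity."""
    return iops / capacity_gb

def hours_to_move_entire_drive(capacity_gb, throughput_mb_per_s):
    """Time to read or copy the whole drive at a sustained transfer rate."""
    seconds = capacity_gb * 1000 / throughput_mb_per_s
    return seconds / 3600

# Hypothetical drives with the same IOPS but very different capacity.
small = access_density(iops=150, capacity_gb=250)
large = access_density(iops=150, capacity_gb=4000)
print(f"small drive: {small:.4f} IOPS/GB, large drive: {large:.4f} IOPS/GB")

# Hypothetical sustained rate of 150 MB/s on the larger drive.
print(f"{hours_to_move_entire_drive(4000, 150):.1f} hours to move 4000 GB")
```

When capacity grows faster than IOPS, each gigabyte gets a smaller share of the drive's I/O, and a full copy or rebuild takes longer.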

18 Fragmentation
Don’t fill up your drives!
That makes defragging slow
Also slows online attempts at limiting fragments

19 Continuous Data Protection
Send a log of all changes somewhere other than your disk drive
  –Tape
  –Over the network to another location
  –Another disk drive
Back-out and forward recovery
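Back-out and forward recovery can both be driven from the same change log if each entry records the old and new value. The sketch below models storage as a dictionary; the log format, the function names, and the use of `None` to mean "key absent" are all assumptions for illustration, not how any particular product works.

```python
def forward_recover(state, log):
    """Replay logged changes, oldest first, onto a copy of an older state."""
    recovered = dict(state)
    for key, _old_value, new_value in log:
        if new_value is None:
            recovered.pop(key, None)   # the change was a deletion
        else:
            recovered[key] = new_value
    return recovered

def back_out(state, log):
    """Undo logged changes, newest first, on a copy of the current state."""
    restored = dict(state)
    for key, old_value, _new_value in reversed(log):
        if old_value is None:
            restored.pop(key, None)    # the key did not exist before the change
        else:
            restored[key] = old_value
    return restored

# Each entry is (key, old value, new value), stored away from the primary disk.
last_backup = {"a": 1}
change_log = [("a", 1, 2), ("b", None, 5), ("a", 2, 3)]

current = forward_recover(last_backup, change_log)
print(current)                             # state after replaying every change
print(back_out(current, change_log[1:]))   # state just after the first change
```

Forward recovery starts from the last backup and replays the log; back-out starts from the current state and reverses it, which is useful when the most recent changes are the damaged ones.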

