Mass Storage & Information Retrieval


Mass Storage & Information Retrieval Paul J Mazzotte Union University April 02, 2004

Agenda
Background
- RAID and JBOD
- SCSI and FC
Storage Paradigms
- DAS (Direct Attached Storage)
- NAS (Network Attached Storage)
- SAN (Storage Area Networks)
- Performance and Cost: NAS vs SAN
Storage and Backup
- Backup Software
- Tape Technologies
- DAS and Backup
- SAN and Backup
What’s Next

Background

RAID and JBOD

RAID and JBOD
JBOD: “Just a Bunch Of Disks”
- Drives independently attached to the I/O channel
- Scalable, but requires the server to manage multiple volumes
- Does not provide protection in case of failure
RAID: “Redundant Array of Inexpensive Disks”
- Fault-tolerant grouping of disks that the server sees as a single volume
- Combination of parity-checking, mirroring, and striping
- Self-contained manageable unit of storage
Inexpensive? A 72 GB FC 10K RPM drive costs:
- $1,350 from Compaq (2/03)
- $1,200 from SGI (12/02)

RAID
Multiple RAID levels to choose from: 0, 1, 2, 3, 4, 5, 6, 10. Each level has certain inherent advantages and disadvantages.
- RAID 0: Disk striping (performance)
- RAID 1: Disk mirroring (redundancy)
- RAID 2: Disk striping with ECC
- RAID 3: Disk striping with ECC stored as parity on one drive (better performance for large data block transfers)
- RAID 4: Disk striping in large blocks; parity stored on one drive (better performance for large data block transfers)
- RAID 5: Disk striping with parity distributed across multiple disks (better performance for small data block transfers)
- RAID 6: Similar to RAID 5 but with additional parity information to recover from a two-drive failure
- RAID 10 (RAID 1 + 0): Combination of RAID 1 (mirroring) and RAID 0 (striping)
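The parity idea behind RAID 3, 4, and 5 can be illustrated with a byte-wise XOR. The following is only a minimal sketch: the block contents and the `xor_blocks` helper are hypothetical and not taken from the slides, and real controllers work on whole stripes in hardware or firmware.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data blocks that belong to one stripe.
data = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block is the XOR of all data blocks in the stripe.
parity = xor_blocks(data)

# Simulate losing the second block, then rebuild it from the
# surviving data blocks plus the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == data[1]
```

Because XOR is its own inverse, any single missing block in a stripe can be recomputed from the others, which is why one parity block protects against one drive failure.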

RAID Levels
RAID 0: Data is subdivided and each division is written to a different disk drive.
- Advantages: Performance when multiple controllers are used
- Disadvantages: Not a true RAID
- Minimum 2 drives
RAID 1: Data is written to two different drives.
- Advantages: 100% redundant; 1 write, 2 reads possible
- Disadvantages: Highest disk overhead
- Minimum 2 drives
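The two layouts on this slide, striping (RAID 0) and mirroring (RAID 1), differ in where each block lands and in how much capacity remains usable. The sketch below assumes simple round-robin striping and mirrored pairs; the function names are illustrative, and the 72 GB drive size is borrowed from the earlier pricing example.

```python
def stripe_location(logical_block, num_disks):
    """Map a logical block to (disk index, block offset) under round-robin striping."""
    return logical_block % num_disks, logical_block // num_disks

def usable_capacity_gb(level, num_disks, disk_gb):
    """Usable capacity for RAID 0 (no redundancy) or RAID 1 (mirrored pairs)."""
    if level == 0:
        return num_disks * disk_gb
    if level == 1:
        return (num_disks // 2) * disk_gb
    raise ValueError("only RAID 0 and RAID 1 are modeled in this sketch")

# Consecutive logical blocks alternate between the two disks in RAID 0.
layout = [stripe_location(block, 2) for block in range(4)]

striped = usable_capacity_gb(0, 2, 72)   # all raw capacity is usable
mirrored = usable_capacity_gb(1, 2, 72)  # half the raw capacity is usable
```

With two 72 GB drives, striping exposes 144 GB and mirroring exposes 72 GB, which is the "highest disk overhead" the slide refers to.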

RAID Levels
RAID 3 (dedicated parity disk): The data block is subdivided ("striped") and written on the data disks. The stripe parity is generated on writes, recorded on the parity disk, and checked on reads.
- Advantages: Medium read, high write performance
- Disadvantages: Rebuild time (compared to RAID 1)
- Minimum 3 drives
RAID 5 (distributed parity): Each entire data block is written on a data disk; parity for blocks in the same rank is generated on writes, recorded in a distributed location, and checked on reads.
- Advantages: High read, medium write performance
- Disadvantages: Rebuild time (compared to RAID 1)
- Minimum 3 drives

SCSI and FC

SCSI (Small Computer System Interface)

Version             Databus          Speed                   Cable
1 (1986)            8 bit            5 MB/s (slow)           6 meters
2 (1994)            8 bit (narrow)   10 MB/s (fast)          25 meters
                    16 bit (wide)    20 MB/s                 25 meters
3 [Ultra] (1995)    8 bit            20 MB/s (fast-20)       25 meters
                    16 bit           40 MB/s                 25 meters
[Ultra-2] (1998)    8 bit            40 MB/s (fast-40)       25 meters
                    16 bit           80 MB/s                 25 meters
[Ultra-3] (1999)    8 bit            80 MB/s (fast-80)       25 meters
                    16 bit           160 MB/s (ultra-160)    25 meters
[Ultra-4] (2003)    8 bit            160 MB/s (fast-160)     25 meters
                    16 bit           320 MB/s (ultra-320)    25 meters

Version 1
- Single Ended (one wire driven against ground)
- 50 pin Centronics type connector (Alternative 2, A-connector)
- Passive termination
Version 2
- Differential (voltage difference between two wires): HVD (5 Volts)
- 50 pin high density connector (Alternative 1, A-connector)
- Active termination for SE
Version 3
- No longer 1 document but a collection of documents
- SPI1: 68 pin high density connector (Alternative 3, P-connector); no longer need for two cables for wide SCSI
- SPI2: LVD (3 Volts); most LVD devices are LVD/SE; Single Ended cannot go faster than Ultra speeds; Very High Density Cable (VHDCI) (Alternative 4, P-connector)
- SPI3: Removed HVD
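The speed column follows from the bus width and the transfer rate: a parallel SCSI bus moves one bus-width word per transfer, so "fast-20" on an 8-bit bus gives 20 MB/s and on a 16-bit bus gives 40 MB/s. A small sketch of that arithmetic, with an illustrative function name and the transfer rates read off the table's "fast-N" labels:

```python
def scsi_bandwidth_mb_s(bus_width_bits, mega_transfers_per_s):
    """Peak bus bandwidth in MB/s: bytes per transfer times millions of transfers per second."""
    return (bus_width_bits // 8) * mega_transfers_per_s

# (label, bus width in bits, mega-transfers per second)
rows = [
    ("SCSI-1", 8, 5),
    ("Fast narrow", 8, 10),
    ("Fast wide", 16, 10),
    ("Ultra wide", 16, 20),
    ("Ultra-2 wide", 16, 40),
    ("Ultra-160", 16, 80),
    ("Ultra-320", 16, 160),
]

bandwidths = {label: scsi_bandwidth_mb_s(width, rate) for label, width, rate in rows}
```

These are peak figures for the shared bus; every device on the cable competes for the same bandwidth.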

Fibre Channel
Diagram labels: Point-to-Point (200 MB/s); Arbitrated Loop (200 MB/s); Switched Fabric (Switch)
- Topology is a word borrowed from mathematics and used here to describe the way the nodes on a network are connected.
- SAN technology supports three basic topologies: point-to-point, arbitrated loop, and switched fabric.
1.) Point-to-point is a simple topology that allows bi-directional communication between two nodes, in this case a storage system and a server. Point-to-point, like all SAN topologies, benefits from the long reach possible with fiber optic connections.
2.) The arbitrated loop is a ring topology where each node passes data to adjacent nodes. Like a Token Ring LAN, the SAN hub arbitrates requests for data to make optimum use of the available bandwidth.
3.) Switched fabric is a SAN term used to describe extensive storage networks where large numbers of servers and storage systems are connected using fiber optic switches. Switches can be cascaded and combined with loops to create highly interwoven networks known as fabrics.

SCSI and FC

                     Fibre Channel     Fibre Channel AL   Parallel SCSI
Connections          16 Million        126                15
Distance             10 km             10 km              25 m
Bandwidth            200 MB/s          200 MB/s           320 MB/s
                     (per connection)  (shared)           (shared)
Hot Plug             Yes               Yes                No
Multiple Protocols   Yes               Yes                No

NOT SCSI vs FC
Fibre Channel protocol layers, top to bottom:
- ULP (Upper Level Protocol): SCSI-3, IP, ATM
- FC-4: SCSI-3 Command Set Mapping; IPI-3 Command Set Mapping (IPI-3 STD); FC Link Encapsulation (FC-LE)
- FC-3: Common Services
- FC-2: Framing Protocol
- FC-1: Encode / Decode (8B/10B Encoding)
- FC-0: Physical Variant (Copper, Optical)
FC-0, FC-1, and FC-2 make up the Fibre Channel Physical & Signaling Interface (FC-PH, FC-PH2, FC-PH3). The diagram also shows FC-AL and FC-AL-2.
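The 8B/10B encoding at FC-1 explains how the nominal Fibre Channel bandwidths relate to the line rate: every 8 data bits travel as a 10-bit code, so only 80% of the raw bit rate carries data. The sketch below is illustrative only; the 1.0625 and 2.125 Gbaud line rates are standard values for 1 Gb and 2 Gb Fibre Channel and do not appear on the slides.

```python
def fc_data_rate_mb_s(line_rate_gbaud):
    """Data rate in MB/s after 8B/10B encoding (8 data bits per 10 line bits)."""
    line_bits_per_s = line_rate_gbaud * 1e9
    data_bits_per_s = line_bits_per_s * 8 / 10
    return data_bits_per_s / 8 / 1e6

one_gig = fc_data_rate_mb_s(1.0625)  # commonly quoted as 100 MB/s
two_gig = fc_data_rate_mb_s(2.125)   # commonly quoted as 200 MB/s
```

The results come out slightly above the nominal 100 and 200 MB/s figures because framing overhead, which this sketch ignores, consumes part of the remainder.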

Storage

DAS, NAS, and SAN

DAS
Definition: DAS is composed of multiple storage disks or disk array units that are directly attached to a general-purpose server.
Diagram labels: LAN; Client Workstations; File I/O (NFS/CIFS); Application Server(s); File Server(s); Block I/O (SCSI/FC-AL)
- Today, DAS is still the most widely used form of storage architecture.
- While DAS is traditionally easy to implement and even easier to understand, it has a couple of big disadvantages:
1.) DAS scatters information widely across the network, which leads to two issues: management and utilization.
- Average utilization of open systems storage is less than 50%. (Gartner)
- “Analyst said that for every $1 spent on tape or disk storage, it costs $4 to $7 more to manage it.” (Lucas Mearian, Computerworld, 14 May 2001)
2.) The DAS structure makes file sharing difficult, especially between platforms. With DAS, file sharing applications (such as NFS, CIFS, or Samba) need to be used on the server where the data resides.

DAS Issues
- Proliferation of “server and storage islands”, which causes a large management burden
- File sharing issues

NAS
Definition: NAS is a special-purpose storage system that directly attaches to the LAN and responds to file I/O requests coming across the LAN from a device.
Diagram labels: LAN; Client Workstations; NAS Servers (filers); File I/O (NFS/CIFS)
Contains:
- Disk
- Server (filer) with an optimized network operating system, usually a UNIX/Linux kernel, fine-tuned especially for this one function
Problem 1.) Management / Inefficient Storage Use
- The large management burden and inefficient storage utilization inherent in the DAS architecture are solved to some extent in a NAS configuration because all storage is centralized in large NAS units.
Problem 2.) File Sharing
- NAS solves the file sharing issues, since most “good” NAS boxes can share files with both UNIX and PC clients. However, because the data is shared across the LAN, the file sharing protocols (NFS/CIFS) are still used.
- Performance, in terms of network traffic, is the biggest concern, because the file-level access protocols (NFS/CIFS) used with NAS subsystems are inherently slow.
- DAFS (Direct Access File System)

Same as DAS? Not Exactly
- Tuned Network Operating System (NOS)
- Supports multiple protocols (NFS, CIFS, NCP)

Does NAS Solve DAS Issues?
- Simplify Management: Yes (for the most part). Allows storage to be consolidated, but only up to the size of the NAS box (~5 to 15 TB).
- File Sharing: Yes. “True NAS” servers will have support for multiple protocols.

NAS Issue
Performance
- Network bandwidth / network traffic
- Protocol inefficiencies

SAN
Definition: SAN is a high-speed network dedicated to interfacing storage subsystems to servers.
Diagram labels: Client Management Station; LAN; Application Servers; Management Server(s); Block I/O (FC); FC Network; Disk
- SAN is the newest paradigm for attaching and managing storage.
Made up of:
- Disk
- Fibre Channel switches (or hubs)
Problem 1.) Management / Inefficient Storage Use
- The large management burden and inefficient storage utilization problem is solved somewhat better in a SAN setup than in a NAS setup. SAN storage is seen as one large island instead of a couple of large islands.
Problem 2.) File Sharing
- SAN does not fully address the file sharing issue as NAS does. Basically, SAN is like DAS: to share files with a client it must use NFS/CIFS (but, unlike NAS, there is no tuned kernel). However, with some SAN implementations it is possible to “mount” volumes on multiple machines attached to the SAN.

Zoning (1 of 2)
Zoning arranges FC connected devices into logical groups.
Diagram labels: FC Switch Network; Node; Zone X; Zone Y

Zoning (2 of 2)
Operation
- Zone members “see” only other members of the zone
- Zones are configured dynamically
- Devices can be members of more than one zone
- Switched fabric zoning can take place at the port or device level
Benefits
- Secured device access
- Allows operating system co-existence
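The visibility rule above can be modeled with plain sets: two devices see each other only if at least one zone contains both. This is a conceptual sketch, not switch configuration syntax; the zone names, device names, and helper functions are all hypothetical.

```python
# Hypothetical zone membership; "array_1" belongs to both zones.
zones = {
    "zone_x": {"server_a", "array_1"},
    "zone_y": {"server_b", "array_1", "tape_1"},
}

def can_see(device_a, device_b, zones):
    """Two devices see each other if they share at least one zone."""
    return any(device_a in members and device_b in members
               for members in zones.values())

def visible_devices(device, zones):
    """All devices that share a zone with the given device."""
    visible = set()
    for members in zones.values():
        if device in members:
            visible |= members
    visible.discard(device)
    return visible

shared_array_view = visible_devices("array_1", zones)
```

Here `server_a` and `server_b` both reach `array_1` but never see each other, which is how zoning lets different operating systems share one fabric.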

Does SAN Solve DAS Issues?
- Simplify Management: Yes. Allows storage to be consolidated (seen as one big island instead of a couple of large islands like NAS).
- File Sharing: Not yet. Still waiting for the development of a CFS.

SAN and NAS Recap
SAN
- Local storage access
- Private net for storage
- Storage protocols
- Centralized management
- Good for hosting large databases
NAS
- Remote file access
- Shares user net
- Network protocols
- “Centralized” management
- Good for file sharing (“home directories”)

SAN/NAS Performance
[Chart not reproduced.] Source: System File Server (SFS) committee, Standard Performance Evaluation Corporation (SPEC)

SAN/NAS Cost
[Chart: cost per MB, 3-year TCO (cents per MB) for 2 TB.]
1. Customer estimates of the number of TB of data that can be managed by a full-time administrator run from 1.5 to 5.0 for DAS, and from 6.0 to 13.3 for NAS/SAN.
2. Additionally, while customers report up to 50 percent disk utilization on DAS, utilization increases to up to 90 percent for SAN and NAS.
Source: “The Storage Report - Customer Perspectives & Industry Evolution - 19 June 2001” by Merrill Lynch & Co. and McKinsey & Company, Page 48, Chart 51
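The two findings above translate into simple sizing arithmetic. The sketch below assumes a hypothetical 10 TB of data and treats the reported utilization and per-administrator figures as given; the function names are illustrative.

```python
def raw_capacity_tb(data_tb, utilization):
    """Raw disk capacity needed to hold data_tb at a given utilization (0 to 1)."""
    return data_tb / utilization

def admins_needed(data_tb, tb_per_admin):
    """Full-time administrators needed if each manages tb_per_admin TB."""
    return data_tb / tb_per_admin

data_tb = 10  # hypothetical amount of data to store

das_raw = raw_capacity_tb(data_tb, 0.50)      # DAS at 50% utilization
san_nas_raw = raw_capacity_tb(data_tb, 0.90)  # SAN/NAS at 90% utilization

# Ranges based on the reported TB-per-administrator figures.
das_admins = (admins_needed(data_tb, 5.0), admins_needed(data_tb, 1.5))
san_nas_admins = (admins_needed(data_tb, 13.3), admins_needed(data_tb, 6.0))
```

Under these assumptions, DAS needs about 20 TB of raw disk and two or more administrators for the same data that SAN/NAS holds on roughly 11 TB with fewer than two.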

SAN/NAS Cost

Type   Platform        Cents per MB (2.5 TB)   Cents per MB (5 TB)   Cents per MB
NAS    Netapp FAS960   7.2 ($176,722)          4.1 ($206,836)        N/A
SAN    Compaq EVA      9.1 ($228,261)          5.5 ($275,266)        3.4 ($406,880)

Note: SAN costs include two 16-port switches but no cabling.
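The cents-per-MB figures can be approximated by dividing each system price by its capacity. This sketch assumes decimal units (1 TB = 1,000,000 MB), which the slide does not state, so the results are close to the slide's figures but not identical in every case.

```python
MB_PER_TB = 1_000_000  # assumption: decimal units

def cents_per_mb(price_dollars, capacity_tb):
    """System price expressed in cents per MB of capacity."""
    return price_dollars * 100 / (capacity_tb * MB_PER_TB)

nas_2_5 = cents_per_mb(176_722, 2.5)  # about 7.07
nas_5 = cents_per_mb(206_836, 5)      # about 4.14
san_2_5 = cents_per_mb(228_261, 2.5)  # about 9.13
san_5 = cents_per_mb(275_266, 5)      # about 5.51
```

The price barely grows when capacity doubles, so the cost per MB falls sharply with size for both platforms.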

SAN/NAS Business Trend
- Data storage will account for 75% of all IT spending for the next five (5) years. (IDC, 2001)
- Most external storage will be networked by 2005. (Nick Allen, Gartner Group)
- Most enterprises will gain more savings through consolidating storage than through servers until 2002 (0.9 probability).
Source: “SNIA Presentation - 19 May 1999” by Nick Allen of Gartner Group

SAN/NAS Business Trend
[Chart: annual vendor revenue ($B) for DAS and SAN, with SAN share (%), 2000 to 2006.]
Source: “Worldwide external raid controller-based storage forecast, 2000-2006”, Gartner, August 2002

Backup

Backup Software (Mid-Range)
- Legato (Networker)
- Veritas (Netbackup)
- IBM (Tivoli)

Mid-Range Tape Technologies

                                  AIT-3      SuperDLT    LTO-1       Mammoth-2
Manufacturer                      Sony       Quantum     IBM/S/HP    Exabyte
Release                           Q4 2001    Q1 2001     Q3 2000     Q1 2000
Technology                        Helical    Linear      Linear      Helical
Native Capacity (GB)              100        110         100         60
Compressed Capacity (GB)          260        220         200         150
Native Transfer Rate (MB/s)       12         11          15          12
Compressed Transfer Rate (MB/s)   31         22          30          30
12 Hr Window Trans Rate (GB)      518.4      475.2       648.0       518.4
MTBF (Hours)                      400,000    250,000     250,000     300,000
Head Life (Hours)                 50,000     30,000      30,000      50,000
Media Life (Avg Passes)           30,000     1,000,000   1,000,000   20,000
Media Price per Cartridge         $135       $134        $110        $89
Price per GB (Native)             $1.35      $1.22       $1.10       $1.48
Drive Price                       $?,?00     $4,400      $4,300      $4,000
SCSI                              LVD        LVD/HVD     LVD/HVD     LVD/HVD
Fibre Channel                     NO         NO          YES         YES

The announced road maps are as follows. Format: Year (Native Capacity, Compressed Capacity, Native Transfer Rate, Compressed Transfer Rate)
- Mammoth (M3, M4, M5): 2003 (120, 300, 20, 50); 2004 (200, 500, 30, 75); 2005 (400, 1000, 60, 150)
- LTO (LTO-2, LTO-3, LTO-4): 2003 (200, 400, 30, 60); 2004 (400, 800, 60, 120); 2006 (800, 1600, 120, 240)
- AIT (AIT-4, AIT-5, AIT-6): 2003 (200, 520, 24, 62); 2005 (400, 1040, 48, 124); 2007 (800, 2080, 96, 248)
- DLT (SDLT-2, SDLT-3): 2003 (220, 440, 22, 44); 2005 (500, 1000, 44, 88); 200? (???, ????, ??, ???)

Theoretical versus actual figures:
- Exabyte Mammoth1: 20/40 GB (theory), 35 GB (actual); 10.8 GB/hr (theory), 5 - 8 GB/hr (actual)
- Exabyte Mammoth2: 60/150 GB (theory), 100 GB (actual); 43.2 GB/hr (theory), 12 - 18 GB/hr (actual)
- IBM LTO: 100/200 GB (theory), ?? GB (actual); 54 GB/hr (theory), ?? GB/hr (actual)
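Two of the derived rows can be recomputed from the others: the 12-hour window figure is the native transfer rate sustained for 12 hours, and the price per GB is the cartridge price divided by native capacity. The sketch below uses the slide's values with decimal units (1 GB = 1000 MB); the dictionary layout and function names are illustrative.

```python
# Native capacity (GB), native transfer rate (MB/s), cartridge price ($), from the slide.
drives = {
    "AIT-3":     {"native_gb": 100, "native_mb_s": 12, "media_price": 135},
    "SuperDLT":  {"native_gb": 110, "native_mb_s": 11, "media_price": 134},
    "LTO-1":     {"native_gb": 100, "native_mb_s": 15, "media_price": 110},
    "Mammoth-2": {"native_gb": 60,  "native_mb_s": 12, "media_price": 89},
}

def window_capacity_gb(native_mb_s, hours=12):
    """GB moved in a backup window at the native rate (1 GB = 1000 MB)."""
    return native_mb_s * 3600 * hours / 1000

def price_per_gb(media_price, native_gb):
    """Cartridge price per native GB, rounded to cents."""
    return round(media_price / native_gb, 2)

window = {name: window_capacity_gb(d["native_mb_s"]) for name, d in drives.items()}
cost = {name: price_per_gb(d["media_price"], d["native_gb"]) for name, d in drives.items()}
```

The same formula with `hours=1` gives the theoretical GB/hr figures quoted for Mammoth2 and LTO, which the slide contrasts with much lower rates observed in practice.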

DAS and Backup
Diagram labels: LAN; Backup Servers; Jukebox; More Servers; Backup Client Nodes; Small Servers / Desktops

SAN and Backup
Diagram labels: Gigabit LAN; Backup Server; NAS Nodes (Netapp Filers); Server Nodes (Oracle, Mail, etc); FC Network; SAN Disk Array(s); Tape Library
Data flows shown: Files to Backup; Backup File Index; Disk Blocks

What’s Next

In The Near Future
- Storage: iSCSI
- Backup: Disk to Disk Backup

Review
- RAID and JBOD
- SCSI and FC
- NAS and SAN
- Backup

The End