Oracle 10g Database Storage Demystified
Jeff Browning, O.C.P, R.H.C.A.
Senior Manager, Network Appliance, Inc.
OracleWorld 2003, San Francisco
Agenda
A little history
The notion of storage networking
SAN and NAS
– Current-technology SAN: FCP
– Current-technology NAS: IP over GbE
RAID: The “packaging” of hard disks
– RAID0
– RAID1
– RAID4
– RAID5
– Combinations of RAID levels
Emerging storage technologies
– ATA RAID
– Serial ATA (SATA)
– iSCSI
– NFS v. 4 (NFS RDMA)
Conclusion and wrap up
A Little History IDE/ATA: The beginning SCSI: A proliferation of standards – SCSI-1 – SCSI-2: The proliferation begins – SCSI-3: A new approach
In the Beginning There Was IDE/ATA Introduced by IBM with the AT/PC in 1984 Supported a master/slave concept Enhanced and adopted by Compaq in 1986 with the Deskpro 386 as the IDE interface – ATA and IDE are now interchangeable terms
What You Could Do with an IDE/ATA Device: Not Much IDE/ATA was slow (4 MB/s to start) It didn’t support many devices (usually 2 hard drives) It wasn’t reliable But it was, and remains, very, very cheap It was never used widely for databases
SCSI: A Proliferation of Standards Invented by Alan Shugart (founder of Seagate) in 1979 Adopted as an ANSI standard in 1986 First version was referred to as SCSI-1
What You Could Do with a SCSI-1 Device: A Bit More SCSI-1 was still pretty slow (5 MB/s) It supported 7 peripheral devices It was more reliable than IDE/ATA It was also more expensive This was the first choice for Sun, HP and other open systems vendors and, notably, the Macintosh
SCSI-2: The Proliferation Begins
Fast SCSI: Higher transfer speed (10 MB/s or higher)
Wide SCSI: Width of the bus was increased from 8 to 16 bits
More devices per bus (from 7 to 15)
Other improvements
– Improved cables and connectors
– Improved signaling
– Active termination
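The device counts and transfer rates on the SCSI slides follow from simple arithmetic on a parallel bus: each device ID corresponds to one data line, one ID is taken by the host adapter, and peak bandwidth is the clock rate times the bus width in bytes. The following is a minimal Python sketch of that arithmetic; the function names are illustrative and do not come from any SCSI library.

```python
def max_peripherals(bus_width_bits: int) -> int:
    """One SCSI ID per data line; one ID is used by the host adapter."""
    return bus_width_bits - 1


def peak_mb_per_s(clock_mhz: float, bus_width_bits: int) -> float:
    """Peak synchronous rate: one bus-width transfer per clock cycle."""
    return clock_mhz * bus_width_bits / 8


# Narrow (8-bit) and wide (16-bit) buses
print(max_peripherals(8), max_peripherals(16))

# SCSI-1 (5 MHz, 8-bit), Fast SCSI (10 MHz, 8-bit), Fast Wide SCSI (10 MHz, 16-bit)
print(peak_mb_per_s(5, 8), peak_mb_per_s(10, 8), peak_mb_per_s(10, 16))
```

These are peak bus figures; sustained throughput on a real bus is lower because of arbitration and command overhead.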
SCSI-3: A New Approach
With SCSI-3 the approach changed
– Cabling and connection layer no longer defined in the basic spec
– These are now covered by so-called “interconnect” or “physical layer” standards
SCSI-3 basic spec only defines a command set and a communication protocol
SCSI-3: The Physical Layer Standards
Serial Bus SCSI: This is the form of SCSI-3 found in many hosts today
Serial Storage Architecture (SSA): Used by IBM on its larger systems; not common
Fibre Channel Protocol (FCP): Defines a standard for SCSI-3 traffic over Fibre Channel networks; by far the most popular form of SCSI-3 today for databases
iSCSI: Emerging standard for SCSI-3 traffic over IP networks
The Notion of Storage Networking SCSI provided a way to attach disks to a host The need for sharing of disk and tape backup resources led to the idea of “shared SCSI”
Storage Networking for Applications Certain applications required shared disk Shared SCSI evolved as a way to solve this problem
Storage Networking Evolves Storage networking evolved along two paths – SAN: With FCP being the dominant protocol – NAS: With Gigabit Ethernet (GbE) NAS became a viable alternative to FCP for many applications The next section discusses the tradeoffs between these approaches
SAN and NAS Storage Area Networks (SAN) take the approach of making SCSI sharable Network Attached Storage (NAS) uses existing file sharing protocols to connect databases to storage Both approaches have their place: They are different
Fibre Channel Emerges as Dominant SAN Fibre Channel was designed as a SAN protocol It was adopted as an ANSI standard in 1994 It has emerged as the de facto standard for creating a SAN
Typical Fibre Channel SAN
Fibre Channel SAN Tradeoffs
Advantages
– Bandwidth is good: 2 Gb FC is now common
– Host CPU cost per I/O is comparable to SCSI
– Latency is low and performance is good
– Scalability is good
Disadvantages
– More expensive than a comparable IP network
– Interoperability is poor but improving
– Highly complex to set up and administer
– Difficult to share disk capacity
NAS Emerges as Alternative to SAN
NFS was created by Sun in the early 1980s
Version 1 of NFS was widely regarded as inappropriate as a file sharing protocol for databases
Version 2 improved enough that Oracle certified NFS for Oracle datafiles in 1997
Version 3 builds upon those improvements
Version 4 is emerging (more on this later)
Typical IP/GbE NAS
IP/GbE NAS Tradeoffs
Advantages
– Bandwidth is pretty good using GbE
– Switches/NICs are very inexpensive compared to FC switches/HBAs
– Simple and easy to set up and administer
– Interoperability is excellent
– Disk capacity can be easily shared, even across platforms
Disadvantages
– Host CPU cost may be higher than FC, depending on load, but not if the load is spindle-bound (NFS v. 4 fixes this in spades)
– CPU scalability (in the sense of CPU count) can be lower than FC (again, NFS v. 4 addresses this)
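To put the "2 Gb FC" and "GbE" bandwidth bullets on a common footing, the sketch below converts a link's line rate into approximate payload bandwidth. It assumes 8b/10b line coding (used by both 2 Gb Fibre Channel and fiber Gigabit Ethernet) and ignores frame headers and TCP/IP or NFS protocol overhead, so real throughput will be lower. The function name is illustrative.

```python
def payload_mb_per_s(line_rate_gbaud: float,
                     data_bits: int = 8,
                     coded_bits: int = 10) -> float:
    """Approximate payload bandwidth in MB/s after line-coding overhead."""
    data_gbit_per_s = line_rate_gbaud * data_bits / coded_bits
    return data_gbit_per_s * 1000 / 8


fc_2g = payload_mb_per_s(2.125)  # 2 Gb Fibre Channel signals at 2.125 Gbaud
gbe = payload_mb_per_s(1.25)     # Gigabit Ethernet over fiber signals at 1.25 Gbaud

print(f"2 Gb FC: ~{fc_2g:.1f} MB/s, GbE: ~{gbe:.1f} MB/s")
```

On raw link speed alone a single 2 Gb FC port has well under twice the payload bandwidth of a single GbE port, which is part of why the cost comparison on these slides matters.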
SAN vs. NAS Suitability SAN – Suitable for high-end environments where latency, performance, or CPU cost per I/O are critical – Required by some applications where NAS is not supported NAS – Suitable for low- or mid-end environments where performance or CPU cost is less important than $$ cost – Also suitable for some high-end environments where CPU is compute intensive, not I/O intensive SAN and NAS are converging
RAID: Redundant Array of Inexpensive Disks The problem: – Disks are fragile; they fail – Data is precious and must be protected – Tape or disk backup is too slow or too expensive RAID provides a way to combine disks together with redundancy so that a single disk failure will not lose data Hot spares and auto-promotion make this a viable long-term solution Software RAID vs. hardware RAID
RAID and Its Variants
RAID0: Simple striping; not truly RAID
RAID1: Disk-to-disk mirroring
RAID4: Striping with a parity disk
RAID5: Striping with striped parity
RAID1+0, RAID0+1, RAID5+1, etc.: Combinations of RAID protection; can get complex
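The disk-cost differences between these levels can be made concrete with a small capacity calculation. The sketch below assumes equal-size disks, treats RAID1 as a set of mirrored pairs, and covers only the levels named on this slide; the function name and the 72 GB disk size are illustrative assumptions.

```python
def usable_capacity_gb(level: str, disks: int, disk_gb: float) -> float:
    """Usable capacity for a RAID group of equal-size disks."""
    if level == "RAID0":
        data_disks = disks                # striping only, no redundancy
    elif level in ("RAID1", "RAID0+1", "RAID1+0"):
        if disks % 2:
            raise ValueError("mirrored layouts need an even number of disks")
        data_disks = disks // 2           # every block is stored twice
    elif level in ("RAID4", "RAID5"):
        if disks < 3:
            raise ValueError("parity RAID needs at least 3 disks")
        data_disks = disks - 1            # one disk's worth of parity
    else:
        raise ValueError(f"unknown level: {level}")
    return data_disks * disk_gb


for level in ("RAID0", "RAID1+0", "RAID4", "RAID5"):
    print(level, usable_capacity_gb(level, disks=8, disk_gb=72))
```

With eight disks, mirroring gives up half the raw capacity while parity RAID gives up one disk, which is the cost argument made on the RAID4 and RAID5 tradeoff slides.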
RAID0: Striping
RAID0 Tradeoffs Advantages: – Fastest type of RAID; leverages disks well – No disk overhead Disadvantage: – A single disk loss is critical Suitability – Any environment where performance is important, and you do not care about the data, e.g. Datamarts
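Striping "leverages disks well" because consecutive chunks of the logical volume land on different disks, so large or concurrent I/O is spread across all spindles. A minimal sketch of the address mapping, assuming a fixed stripe unit measured in blocks (the function and parameter names are illustrative, and real arrays may use different layouts):

```python
from typing import Tuple


def locate_block(logical_block: int, disks: int,
                 stripe_unit_blocks: int) -> Tuple[int, int]:
    """Map a logical block to (disk index, block offset on that disk)."""
    unit = logical_block // stripe_unit_blocks    # which stripe unit overall
    within = logical_block % stripe_unit_blocks   # position inside that unit
    disk = unit % disks                           # units rotate across disks
    row = unit // disks                           # full stripes before this one
    return disk, row * stripe_unit_blocks + within


# Four disks, 16-block stripe unit
for lb in (0, 16, 32, 48, 64, 70):
    print(lb, locate_block(lb, disks=4, stripe_unit_blocks=16))
```

The same mapping also shows the disadvantage: every disk holds a slice of every large file, so losing any one disk damages all of them.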
RAID1: Simple Mirroring
RAID1 Tradeoffs
Advantages:
– Read capacity is higher than a single disk (but lower than striping)
– Very fault tolerant; all data is mirrored
Disadvantages:
– Write capacity is that of a single disk
– Two physical writes per logical write
– Doubles disk cost
Suitability:
– Very commonly used for online redo logs
RAID0+1: Striping with Mirroring
RAID1+0: Mirroring with Striping
RAID0+1/RAID1+0 Tradeoffs
Advantages:
– Read capacity is high; multiple disks are leveraged
– Very fault tolerant; all data is mirrored
Disadvantages:
– Two physical writes per logical write
– Doubles disk cost
Suitability:
– Very common for storing Oracle datafiles where redundancy is highly valued
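Although the two layouts share the same cost and write penalty, they differ in how many double-disk failures they survive. The sketch below enumerates every two-disk failure, assuming the common convention that RAID1+0 stripes across mirrored pairs and RAID0+1 mirrors two whole stripe sets; vendors do not all use these names the same way, and the function names here are illustrative.

```python
from itertools import combinations


def survives_raid10(failed: set, disks: int) -> bool:
    """Disks (0,1), (2,3), ... are mirrored pairs; data is striped across pairs."""
    return all(not ({p, p + 1} <= failed) for p in range(0, disks, 2))


def survives_raid01(failed: set, disks: int) -> bool:
    """Disks 0..n/2-1 form one stripe set, the rest form its mirror."""
    half = disks // 2
    set_a, set_b = set(range(half)), set(range(half, disks))
    # Data survives if at least one stripe set is completely intact
    return not (failed & set_a) or not (failed & set_b)


def surviving_double_failures(survives, disks: int) -> int:
    """Count the two-disk failure combinations the layout survives."""
    return sum(survives(set(pair), disks)
               for pair in combinations(range(disks), 2))


total = len(list(combinations(range(8), 2)))
print("RAID1+0:", surviving_double_failures(survives_raid10, 8), "of", total)
print("RAID0+1:", surviving_double_failures(survives_raid01, 8), "of", total)
```

Under these assumptions the stripe-of-mirrors layout tolerates more double failures, because only the loss of both disks in the same pair is fatal.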
RAID4: Striping with Parity Disk
RAID4 Tradeoffs
Advantages:
– Read capacity is high; multiple disks are leveraged
– Low RAID overhead; almost as good as RAID 0
– RAID protection exists
Disadvantages:
– Cannot survive the loss of two disks
– Parity disk can become a bottleneck (some vendors avoid this issue with buffering, in which case performance is similar to RAID 1)
Suitability:
– Very common for storing Oracle datafiles where redundancy is needed, and the cost of RAID0+1/RAID1+0 is too high
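The protection that parity RAID provides rests on XOR: the parity block is the XOR of the data blocks in a stripe, so any one missing block equals the XOR of everything that remains. A minimal sketch with three tiny data blocks (the block contents and function name are made up for illustration):

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus the parity block on the parity disk
data = [b"ORCL", b"DATA", b"FILE"]
parity = xor_blocks(data)

# Simulate losing the second data disk and rebuild its block from the survivors
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt)
```

This also shows why two lost disks are fatal: with two unknown blocks in a stripe, a single parity equation cannot recover both.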
RAID5: Striping with Striped Parity
RAID5 Tradeoffs
Advantages:
– Read capacity is high; multiple disks are leveraged
– Low RAID overhead; almost as good as RAID 0
– RAID protection exists
Disadvantages:
– Cannot survive the loss of two disks
– Slowest RAID; CPU cost of parity striping is high
Suitability:
– Very common for storing Oracle datafiles where redundancy is needed, performance is not critical, and the cost of RAID0+1/RAID1+0 is too high
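Two details explain the RAID5 behavior described here: parity moves to a different disk on each stripe row, and a small write has to update parity as well as data. The sketch below uses one simple rotation scheme as an assumption (real arrays use several different layouts) and the standard XOR update rule; both function names are illustrative.

```python
def parity_disk(row: int, disks: int) -> int:
    """Disk holding parity for a stripe row, rotating one disk per row."""
    return (disks - 1 - row) % disks


def updated_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """New parity after overwriting one data block (read-modify-write)."""
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))


# Parity placement for the first five rows of a four-disk group
print([parity_disk(row, 4) for row in range(5)])

# Overwrite block a with c in a two-data-block stripe and check the parity
a, b, c = b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"
old_parity = bytes(x ^ y for x, y in zip(a, b))
new_parity = updated_parity(old_parity, a, c)
assert new_parity == bytes(x ^ y for x, y in zip(c, b))
```

A small write therefore costs two reads (old data, old parity) and two writes (new data, new parity), compared with two writes for a mirrored layout.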
Emerging Storage Technologies ATA RAID Serial ATA (SATA) iSCSI NFS v. 4 (NFS RDMA)
ATA RAID A repackaging of cheap ATA/IDE disks Used as a tape backup substitute Archive storage is on-line and accessible Faster than tape Almost as cheap as tape, or even cheaper if compression is used
Serial ATA An updating of the ATA/IDE spec to current technology Intel and Dell Targeted for desktops and next generation storage appliances Could become a serious competitor to FCP and serial bus SCSI
iSCSI
Implements SCSI-3 protocol over IP networks
Intel is a leader
Software initiators exist for Windows and Linux
HP-UX and AIX initiators are in public beta
Targets are available from a variety of vendors
Presently immature, but will become a viable competitor to FCP
– Key is TOE (TCP offload engine) HBAs on both target and initiator, which effectively offload IP traffic processing from the host/target CPU
– Cost per port for switches and HBAs is vastly cheaper than FCP
– If performance becomes comparable, FCP could be toast
Typical iSCSI SAN
NFS v. 4 (NFS RDMA)
Basically, a rewrite of NFS
Focused on “local sharing”, i.e., database customers and the like, who need to share data across a small, focused network with very good performance
Supports Remote Direct Memory Access (RDMA), a very high performance, low latency I/O protocol
Supports InfiniBand as an I/O interface
Leaders are Network Appliance and Sun
Will provide a transparent performance upgrade path for NFS database customers
Wrap Up