Design and Performance Evaluation of Networked Storage Architectures Xubin He July 25,2002 Dept. of Electrical and Computer Engineering.

1 Design and Performance Evaluation of Networked Storage Architectures Xubin He (Hexb@ele.uri.edu) July 25, 2002 Dept. of Electrical and Computer Engineering, University of Rhode Island

2 July 25, 2002 High Performance Computing Lab (HPCL), URI
Outline
- Introduction
- STICS: SCSI-To-IP Cache for Storage Area Networks
- DRALIC: Distributed RAID & Location Independence Cache
- vcRAID: Large Virtual NVRAM Cache for Software RAID
- Performance Eval. on Distributed Web Server Architectures
- Conclusions

3 Background
- Data storage plays an essential role in today's fast-growing data-intensive network services.
- Online data storage doubles every 9 months.
- Storage is approaching 50% or more of IT spending. Storage cost is projected to reach up to 75% of the total IT cost in 2003.

4 A Server-to-Storage Bottleneck Source: Brocade

5 Motivations
How to deploy data over the network efficiently and reliably?
- Disparities between SCSI & IP
- SCSI remote handshaking over IP
- Processor-disk gap growing
- High speed network
- Large client memories
- Cheap disk & RAM, expensive NVRAM
- RAID5 is reliable, but low performance
- E-commerce over the Internet, distributed web servers
Projects: STICS, DRALIC, vcRAID

6 Introduction STICS: SCSI-To-IP Cache for Storage Area Networks DRALIC: Distributed RAID & Location Independence Cache vcRAID: Large Virtual NVRAM Cache for Software RAID Performance Eval. on Distributed Web Server Architectures Conclusions

7 Introducing a New Device: STICS
Whenever there is a disparity, cache helps.
Features of STICS:
- Smooths out disparities between SCSI and IP
- Localizes the SCSI protocol and filters out unnecessary traffic, reducing the bandwidth requirement
- Nonvolatile data caching
- Improves performance, reliability, manageability, and scalability over current iSCSI systems

8 System Overview
A STICS connects to the host via a SCSI interface and connects to other STICS units or NAS via the Internet.
[Figure labels: Host 1, Host 2 or Storage, Host M or Storage, Disks or SAN; STICS 1, STICS 2, STICS 3, STICS N; NAS; Internet; links marked SCSI and TCP/IP]

9 STICS Architecture
[Figure components: SCSI interface, processor, RAM, log disk, storage device, network interface]

10 Internal Cache Structure
[Figure labels as extracted: log Disk Meta Data Memory Cache Data Cache]

11 Basic Operations
- Write
  - Write requests from the host via SCSI
  - Write requests from another STICS via NIC
- Read
  - Read requests from the host via SCSI
  - Read requests from another STICS via NIC
- Destage
  - RAM -> log disk
  - Log disk -> storage device
- Prefetch
  - Storage device -> RAM
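The four basic operations can be sketched as a toy three-level hierarchy (RAM, log disk, storage device). This is only an illustration under stated assumptions, not the STICS implementation: the class name, the block-count capacity, and the oldest-first eviction policy are all hypothetical and do not come from the slides.

```python
from collections import OrderedDict


class SticsCacheSketch:
    """Toy model of a RAM -> log disk -> storage device hierarchy (hypothetical)."""

    def __init__(self, ram_blocks=2):
        self.ram_blocks = ram_blocks    # assumed RAM capacity, in blocks
        self.ram = OrderedDict()        # block number -> data, oldest first
        self.dirty = set()              # RAM blocks not yet written to the log disk
        self.log_disk = OrderedDict()   # blocks waiting to reach the storage device
        self.storage = {}               # final storage device

    def write(self, block, data):
        """Accept a write from the host (SCSI) or from a peer STICS (NIC)."""
        self.ram[block] = data
        self.ram.move_to_end(block)
        self.dirty.add(block)
        self._make_room()

    def read(self, block):
        """Serve a read from the fastest level that holds the block."""
        if block in self.ram:
            return self.ram[block]
        if block in self.log_disk:
            return self.log_disk[block]
        return self.prefetch(block)

    def prefetch(self, block):
        """Prefetch: storage device -> RAM."""
        data = self.storage.get(block)
        if data is not None:
            self.ram[block] = data
            self._make_room()
        return data

    def _make_room(self):
        """Destage, step 1: RAM -> log disk once RAM is over capacity."""
        while len(self.ram) > self.ram_blocks:
            block, data = self.ram.popitem(last=False)
            if block in self.dirty:     # clean blocks are simply dropped
                self.log_disk[block] = data
                self.dirty.discard(block)

    def destage_log(self):
        """Destage, step 2: log disk -> storage device."""
        while self.log_disk:
            block, data = self.log_disk.popitem(last=False)
            self.storage[block] = data
```

Writing three blocks into a two-block RAM pushes the oldest block to the log disk; calling `destage_log()` later moves it on to the storage device, from where a subsequent read prefetches it back into RAM.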

12 Web-based Network Management
[Figure labels: web browser-based manager, HTTP, servlet, management app., TCP/IP, local management app.]

13 Implementation Platform
- A STICS block is a PC running Linux
- OS: Linux with kernel 2.4.2
- Compiler: gcc
- Interfaces: SCSI and IP

14 Performance Evaluations
Methodology
- iSCSI implementation on Linux by Intel (iSCSI)
- Initial STICS implementation on Linux, in two modes:
  - Immediate report (STICS-Imm)
  - Report after complete (STICS)
Workloads
- PostMark from Network Appliance: throughput, in two configurations:
  - Small: 1000/50k/436MB
  - Large: 20k/100k/740MB
- EMC trace: response time
  - More than 230,000 I/O requests
  - Data set size: >900MB

15 Experimental Settings
iSCSI configuration: The host Trout establishes a connection to the target, and the target Squid responds and connects. Squid then exports its hard drive, and Trout sees the disks as local.
[Figure labels: Host (Trout), NIC, Switch, Target (Squid), SCSI, Disks, Cod; traffic: iSCSI commands and data]
STICS configuration: The STICS units cache data from both SCSI and the network.
[Figure labels: Host (Trout), STICS 1, Switch, STICS 2, Target (Squid), SCSI, Disks, Cod; traffic: block data]

16 PostMark Results: Throughput
Ave. improvement   STICS-Imm   STICS
Small set          226%        64%
Large set          318%        97%

17 Where does the benefit come from?
Network traffic analysis
Number of packets with different sizes (bytes):
         <64   65-127      128-255   255-511   511-1023   >1024
iSCSI    7     1,937,724   91        60        27         1,415,912
STICS    4     431,216     16        30        7          607,827
Totals:
         Total packets   Small packets (%)   Bytes transferred   Bytes per packet
iSCSI    3,353,821       57.8%               1,914,566,504       571
STICS    1,039,100       41.5%               980,963,821         944
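The summary figures in the network traffic analysis can be re-derived from the per-size packet counts. The short check below assumes that "small packets" means packets under 128 bytes (the first two size buckets); the slide does not state the threshold, so that is an assumption, as are the function and variable names.

```python
# Packet counts per size bucket, in the order
# <64, 65-127, 128-255, 255-511, 511-1023, >1024 (as reported on the slide).
packets = {
    "iSCSI": [7, 1_937_724, 91, 60, 27, 1_415_912],
    "STICS": [4, 431_216, 16, 30, 7, 607_827],
}
bytes_transferred = {"iSCSI": 1_914_566_504, "STICS": 980_963_821}


def summarize(name):
    """Recompute total packets, small-packet share, and mean packet size."""
    counts = packets[name]
    total = sum(counts)
    small = counts[0] + counts[1]  # assumption: "small" = under 128 bytes
    return {
        "total_packets": total,
        "small_pct": round(100 * small / total, 1),
        "bytes_per_packet": round(bytes_transferred[name] / total),
    }


for system in packets:
    print(system, summarize(system))
```

Under that assumption the recomputed totals, small-packet percentages, and bytes per packet agree with the reported values for both systems. STICS sends roughly a third as many packets as iSCSI and about half the bytes.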

18 EMC Trace Results: Response Time
Histograms of I/O response times for trace EMC-tel:
a) STICS with immediate report (2.7 ms)
b) STICS with report after complete (5.71 ms)
c) iSCSI (16.73 ms)

19 Summary
- A novel cache storage device that adds a new dimension to networked storage
- Significantly improves the performance of iSCSI
- A cost-effective solution for building an efficient SAN over IP
- Allows easy manageability, maintainability, and scalability

20 Introduction STICS: SCSI-To-IP Cache for Storage Area Networks DRALIC: Distributed RAID and Location Independence Cache vcRAID: Large Virtual NVRAM Cache for Software RAID Performance Eval. on Distributed Web Server Architectures Conclusions

21 Web Servers
- Overhead caused by the file system (FS) is high
- Enterprise web servers are expensive
  - A Fujitsu server: more than $5 million
- PCs are cheap: $1000
  - Disks: $160/120GB (IBM Deskstar @ CompUSA)
  - DRAM: $100/256MB (@ Crucial.com)

22 My Solution
- Combine or bridge the disk controller and network controller of existing PCs interconnected by a high-speed switch.
- Share memory and storage among peers.

23

24 Performance Analysis
- B: data block size (8KB)
- N: number of nodes
- H_lm: local memory hit ratio
- H_rm: remote memory hit ratio
- T_lm: local memory access time
- T_rm: remote memory access time
- T_raid: access time from the distributed RAID
- T_dralic: average response time of the DRALIC system
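The transcript does not preserve the formula that ties these symbols together. One common way to combine hit ratios and access times is a weighted average over the three places a block can be found. The sketch below assumes that model: T_dralic = H_lm * T_lm + H_rm * T_rm + (1 - H_lm - H_rm) * T_raid. This is an assumption, not the equation from the talk, and it leaves out B and N because the slides do not show how they enter. The function name and the sample values are hypothetical.

```python
def dralic_avg_response_time(h_lm, h_rm, t_lm, t_rm, t_raid):
    """Hit-ratio-weighted average response time (assumed model, see lead-in).

    h_lm, h_rm : local and remote memory hit ratios, as fractions in [0, 1]
    t_lm, t_rm : local and remote memory access times
    t_raid     : access time from the distributed RAID
    All times must use the same unit; the result is in that unit.
    """
    if not (0.0 <= h_lm <= 1.0 and 0.0 <= h_rm <= 1.0 and h_lm + h_rm <= 1.0):
        raise ValueError("hit ratios must be fractions whose sum is at most 1")
    miss_ratio = 1.0 - h_lm - h_rm  # requests served by the distributed RAID
    return h_lm * t_lm + h_rm * t_rm + miss_ratio * t_raid


# Hypothetical numbers, chosen only to show the arithmetic:
# 0.5 * 1.0 + 0.25 * 4.0 + 0.25 * 40.0
print(dralic_avg_response_time(0.5, 0.25, 1.0, 4.0, 40.0))
```

In this model, raising either hit ratio shifts weight away from the slow RAID term, which is the effect that sharing memory among peers is meant to achieve.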

25 Preliminary Performance Analysis

26 Simulation Results
- DRALICSim: a simulator based on socket communication
- Benchmark: PostMark, provided by Network Appliance Inc., which measures performance in terms of transaction rates
- Configurations (initial files/transactions): 1000/50000 (small), 20000/50000 (medium), and 20000/100000 (large)
- 4 nodes running Windows NT

27 Simulation Results

28 Summary
- Combining HBAs and NICs reduces the overhead
- Share memory and storage among peers
- Make use of existing resources
- The simulator shows a performance gain of up to 4.2 with 4 nodes

29 Introduction STICS: SCSI-To-IP Cache for Storage Area Networks DRALIC: Distributed RAID & Location Independence Cache vcRAID: Large Virtual NVRAM Cache for Software RAID Performance Eval. on Distributed Web Server Architectures Conclusions

30 VC-RAID
- Hides the small write penalty of RAID5 by buffering small writes and destaging data back to the RAID, with parity computation, when disk activity is low.
- Combines a small portion of the system RAM and a log disk to form a hierarchical cache.
- This hierarchical cache appears to the host as a large nonvolatile RAM.
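The small write penalty comes from RAID5 parity maintenance: updating a single data block means reading the old data and old parity, then writing the new data and new parity. The sketch below shows the standard XOR parity update for such a read-modify-write and checks it against a full-stripe recomputation. It illustrates general RAID5 behavior, not VC-RAID code; the function names are made up for the example.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """Parity after overwriting one data block.

    new_parity = old_parity XOR old_data XOR new_data. This costs two reads
    (old data, old parity) and two writes (new data, new parity).
    """
    return xor_blocks(xor_blocks(old_parity, old_data), new_data)


def full_stripe_parity(blocks) -> bytes:
    """Parity computed from every data block in the stripe."""
    parity = bytes(len(blocks[0]))  # all-zero block of the same length
    for block in blocks:
        parity = xor_blocks(parity, block)
    return parity


stripe = [b"\x01\x02", b"\x10\x20", b"\x0f\x0f"]
parity = full_stripe_parity(stripe)
new_block = b"\xaa\xbb"

# Updating block 1 in place gives the same parity as recomputing the stripe.
updated = small_write_parity(stripe[1], new_block, parity)
assert updated == full_stripe_parity([stripe[0], new_block, stripe[2]])
```

Buffering small writes in a cache and destaging them when the disks are idle, as the slide describes, moves this extra I/O out of the request's critical path.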

31 Architecture
[Figure labels: buffer cache, main memory, cache disk, OS kernel, RAID5]

32 Approaches

33 Performance Results
- Test environment: Gateway G6-400, 64 MB RAM, 4 MB RAM buffer, 200 MB cache disk, 4 SCSI disks forming a disk array
- Benchmarks
  - PostMark by Network Appliance
  - Untar/copy/remove
- Compared to built-in RAID0 and RAID5

34 Throughput
Series             RAID 0   VC-RAID   RAID 5
Small (1k+50k)     1111     941       561
Medium (20k+50k)   68       63        30
Large (20k+100k)   31       28        16

35 Response time (second)

36 Summary
- Reliable: based on RAID5
- A hard drive is more reliable than RAM
- Cost-effective: hard drives are much cheaper than RAM
- Software only; no extra hardware needed
- Fast: increasing the cache size

37 Introduction STICS: SCSI-To-IP Cache for Storage Area Networks DRALIC: Distributed RAID & Location Independence Cache vcRAID: Large Virtual NVRAM Cache for Software RAID Performance Eval. on Distributed Web Server Architectures Conclusions

38 Observations
- E-commerce has grown explosively.
- Static web pages stored as files no longer dominate web accesses. About 70% of accesses start CGI, ASP, or servlet calls to generate dynamic pages.
- Focus: web server behaviors and the interaction between web servers and database servers.

39

40 Benchmark and Workloads
Workloads
- Static pages
- Light CGI: 20% / 80%
- Heavy CGI: 90% / 10%
- Heavy servlet: 90% / 10%
- Heavy database access: 90% / 10%
- Mixed workload: 7% / 8% / 30% / 55%
Benchmark
- WebBench 3.5 (6010 static pages, 300 CGI, 300 simple servlets, 400 DB servlets using JDBC, 2 databases with 15 and 18 tables)

41

42

43 Introduction STICS: SCSI-To-IP Cache for Storage Area Networks DRALIC: Distributed RAID & Location Independence Cache vcRAID: Large Virtual NVRAM Cache for Software RAID Performance Eval. on Distributed Web Server Architectures Conclusions

44 Summary
- STICS couples reliable, high-speed data caching with low-overhead conversion between SCSI and IP.
- DRALIC boosts web server performance by combining the disk controller and NIC to reduce FS overhead.
- vcRAID presents a reliable and inexpensive solution for data storage.
- We carried out an extensive performance study on distributed web server architectures under realistic workloads.

45 Patents (with Dr. Yang)
- STICS: SCSI-To-IP Cache Storage. Filed, pending. Serial Number 60/312,471, August 2001
- DRALIC: Distributed RAID and Location Independence Cache. Filed, pending. May 2001

46 Publications (Journal)
1. Xubin He, Qing Yang, and Ming Zhang, “STICS: SCSI-To-IP Cache for Storage Area Networks,” Submitted to IEEE Transactions on Parallel and Distributed Systems.
2. Xubin He, Qing Yang, “Performance Evaluation of Distributed Web Server Architectures under E-Commerce Workloads,” Submitted to Journal of Parallel and Distributed Computing.
3. Xubin He, Qing Yang, “On Design and Implementation of a Large Virtual NVRAM Cache for Software RAID,” Special Issue of Journal on Parallel I/O for Cluster Computing, 2002.

47 Publications (Conference)
1. Xubin He, Qing Yang, and Ming Zhang, “A Caching Strategy to Improve iSCSI Performance,” To appear in IEEE Annual Conference on Local Computer Networks, Nov. 6-8, 2002.
2. Xubin He, Qing Yang, and Ming Zhang, “Introducing SCSI-To-IP Cache for Storage Area Networks,” ICPP’2002, Vancouver, Canada, August 2002.
3. Xubin He, Ming Zhang, Qing Yang, “DRALIC: A Peer-to-Peer Storage Architecture,” Proc. of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA'2001), 2001.
4. Xubin He, Qing Yang, “Characterizing the Home Pages,” Proc. of the 2nd International Conference on Internet Computing (IC’2001), 2001.
5. Xubin He, Qing Yang, “VC-RAID: A Large Virtual NVRAM Cache for Software Do-it-yourself RAID,” Proc. of the International Symposium on Information Systems and Engineering (ISE'2001), 2001.
6. Xubin He, Qing Yang, “Performance Evaluation of Distributed Web Server Architectures under E-Commerce Workloads,” Proc. of the 1st International Conference on Internet Computing (IC’2000), 2000.

48 Thank You! Dr. Qing Yang @ELE Dr. Jien-Chung Lo @ELE Dr. Joan Peckham @CS Dr. Peter Swaszek @ELE Dr. Lisa DiPippo @CS And more…

49 Special thanks to my daughter, Rachel!
