CHEP04 Performance Analysis of Cluster File System on Linux Yaodong CHENG IHEP, CAS


1 CHEP04 Performance Analysis of Cluster File System on Linux Yaodong CHENG IHEP, CAS chyd@ihep.ac.cn

2 CHEP'04 Sep 27 - Oct 1, 2004 Congress Zentrum Interlaken, Switzerland
  Outline
  - Introduction
  - Review of cluster file systems
  - Data access model
  - Performance analysis formula
  - Performance test
  - Some useful methods

3 Introduction
  - Cluster systems built from PCs are increasingly popular
  - Commodity hardware and software keep improving
      - CPU, memory, hard disk, network
      - Linux software technology
  - Question: how to use our existing hardware and software more efficiently

4 Architecture of a cluster system
  Diagram labels: job; compute node 1 ... compute node N; high-speed network; I/O node 1 ... I/O node N; disk; tape

5 Cluster file system review
  - One of the most important ways to share information in a cluster system
  - General characteristics
      - Single-system image
      - Transparency
      - Good scalability
      - High performance
  - Structure: client/server (C/S), shared-disk, virtual shared-disk

6 Data access model
  Diagram labels: metadata server; manager node; I/O servers (I/O node 1, I/O node 2, ..., I/O node N, each with a disk); network; clients (client 1, client 2, ..., client N)

7 Some assumptions
  - Data is processed only on the clients
  - Storage nodes only provide storage capacity and handle file operations
  - Traffic between clients and management nodes is very small
  - The time spent handling client requests is far smaller than the time spent transferring data

8 Performance analysis formula
  Symbols
    c: CPU time to process each byte
    D: total amount of data
    I: network speed
    M: number of I/O nodes
    N: number of clients
    P: number of disks operating in parallel
    R: disk speed
    T: minimum time to access all the data
    S: maximum aggregate bandwidth
  Constraint: P/M >= 1
  T = max( D*c/N, D/(N*I), D/(M*I), D/(P*R) )
  S = D/T = min( N/c, N*I, M*I, P*R )
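The formula on this slide can be evaluated directly. The following is a minimal sketch, not the author's code: the function names are made up for illustration, and the units are whatever the caller uses consistently (for example bytes and bytes/sec).

```python
def min_access_time(D, c, N, I, M, P, R):
    """Lower bound T on the time to access D units of data (slide formula)."""
    if P < M:
        raise ValueError("the model requires P/M >= 1")
    return max(D * c / N, D / (N * I), D / (M * I), D / (P * R))


def max_aggregate_bandwidth(c, N, I, M, P, R):
    """Upper bound S = D/T on the aggregate bandwidth (slide formula)."""
    if P < M:
        raise ValueError("the model requires P/M >= 1")
    limits = [N * I, M * I, P * R]
    if c > 0:
        # The CPU term N/c only applies when computing on the data costs time.
        limits.append(N / c)
    return min(limits)


# Illustrative numbers only (not measurements from the talk).
D, N, I, M, P, R = 1000, 2, 10, 4, 4, 20
for c in (0, 0.25):
    T = min_access_time(D, c, N, I, M, P, R)
    S = max_aggregate_bandwidth(c, N, I, M, P, R)
    print(f"c={c}: T={T}, S={S}, D/T={D / T}")
```

With c = 0 the CPU term drops out and the network and disk terms decide the result.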

9 If c is very small, the formula above becomes:
  T = max( D/(N*I), D/(M*I), D/(P*R) )
  S = D/T = min( N*I, M*I, P*R )
  This simplified formula is the basis of the performance analysis in this work.

10 Some cases
  - N = 1, M >= 1 (or N >= 1 and M = 1), R > I  =>  S depends on I
  - N = 1, M >= 1 (or N >= 1 and M = 1), R < I  =>  S depends on I and P*R
  - N > 1, M > 1, R > I  =>  S depends on the number of clients and I/O nodes
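One way to read these cases is to ask which term of S = min(N*I, M*I, P*R) is the smallest. The sketch below does that for one illustrative configuration per case; the helper name `bottleneck` and all numbers are hypothetical and chosen only to satisfy each case's conditions.

```python
def bottleneck(N, M, P, I, R):
    """Return (limiting resource, S) for S = min(N*I, M*I, P*R)."""
    if P < M:
        raise ValueError("the model requires P/M >= 1")
    limits = {
        "client network (N*I)": N * I,
        "I/O node network (M*I)": M * I,
        "disks (P*R)": P * R,
    }
    # On ties, the first entry in the dictionary order above is reported.
    name = min(limits, key=limits.get)
    return name, limits[name]


# Case 1: one client, disks faster than the network.
print(bottleneck(N=1, M=4, P=4, I=10, R=30))
# Case 2: one client and one I/O node, disk slower than the network.
print(bottleneck(N=1, M=1, P=1, I=10, R=6))
# Case 3: several clients and I/O nodes, disks faster than the network.
print(bottleneck(N=3, M=5, P=5, I=10, R=30))
```

In the third case the result is min(N, M) * I, which is why adding clients or I/O nodes raises the bound only on the side that is currently smaller.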

11 Test environment
  - Twelve PCs acting as I/O nodes, manager nodes and clients
      - P4 2.8 GHz, 512 MB memory, WD 80 GB disk (8 MB cache, 7200 RPM)
  - OS: CERN Linux 7.3.3
      - Kernel: 2.4.20-18.7.cernsmp
      - Local file system: ext3
  - Network: 100 Mbit/s Ethernet
  - Cluster file systems: OpenAFS 1.2.9, NFS v3, PVFS, CASTOR 1.6.1.2

12 Pre-test
  - Test tools: Netperf 2.2pl3, IOzone 3.217
  - Local area network bandwidth (I): about 94.11 Mbit/s on 100 Mbit/s Ethernet
  - Local file system measurement (R): ./iozone -Rab local.xls -g 2048M
  - IOzone was recompiled and linked with the CASTOR RFIO library
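The network figure is in Mbit/s while the later results are in KB/sec, so a conversion helps when comparing them. This is a small sketch under two assumptions that the slides do not state: that the Netperf figure counts 10^6 bits per Mbit, and that one KB in the IOzone output is 1024 bytes.

```python
def mbits_to_kbytes_per_sec(mbits_per_sec):
    """Convert Mbit/s (10^6 bits, assumed) to KB/sec (1024-byte KB, assumed)."""
    bytes_per_sec = mbits_per_sec * 1_000_000 / 8
    return bytes_per_sec / 1024


network_limit = mbits_to_kbytes_per_sec(94.11)
print(f"I is roughly {network_limit:.0f} KB/sec")
```

Under those assumptions the measured link speed corresponds to roughly 11,488 KB/sec, which gives a ceiling for any single-client result over this network.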


14 One client, one server
  - Only one client accesses files
  - Only one I/O node in the server configuration
  - Write performance measurement
      - File size: 512 MB
      - Record size: 64 KB to 16 MB
      - Output unit: KB/sec

15 Results (KB/sec, by record size in KB)
  FS           64    128    256    512   1024   2048   4096   8192  16384
  NFS       11101  10803  11054  11125  11083  11042  11045  11109  11047
  AFS        5173   5342   5239   5137   5148   5335   5212   5175   5353
  PVFS       9953  10158  10103  10239  10759  10603  10662  10948  10976
  CASTOR    10209  10335  10530  10622  10697  10722  10723  10705  10678
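To relate these numbers to the single-client bound S = I, the sketch below averages each row and expresses it as a fraction of the measured link speed. The throughput values are copied from the table; the unit conversion assumes 10^6 bits per Mbit and 1024 bytes per KB, which the slides do not state.

```python
# Throughput in KB/sec for record sizes 64 KB ... 16384 KB, from the table.
RESULTS_KB_PER_SEC = {
    "NFS":    [11101, 10803, 11054, 11125, 11083, 11042, 11045, 11109, 11047],
    "AFS":    [5173, 5342, 5239, 5137, 5148, 5335, 5212, 5175, 5353],
    "PVFS":   [9953, 10158, 10103, 10239, 10759, 10603, 10662, 10948, 10976],
    "CASTOR": [10209, 10335, 10530, 10622, 10697, 10722, 10723, 10705, 10678],
}

# Assumed conversion of the measured 94.11 Mbit/s link speed to KB/sec.
NETWORK_CEILING_KB_PER_SEC = 94.11 * 1_000_000 / 8 / 1024

mean_kb = {fs: sum(v) / len(v) for fs, v in RESULTS_KB_PER_SEC.items()}
efficiency = {fs: m / NETWORK_CEILING_KB_PER_SEC for fs, m in mean_kb.items()}

for fs in RESULTS_KB_PER_SEC:
    print(f"{fs:7s} mean {mean_kb[fs]:8.0f} KB/sec  "
          f"{efficiency[fs]:6.1%} of the network ceiling")
```

The comparison only shows how close each file system comes to the network limit in this one test; it says nothing about why a given system falls short.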

16 Multi-process test
  - Only one client and one I/O node
  - Many processes access the I/O node simultaneously
  - Write performance measurement
      - File size: 100 MB
      - Record size: 512 KB
      - Number of processes: 1 to 10
      - Output unit: KB/sec

17 Results (KB/sec, by number of processes)
  Processes     NFS    AFS   PVFS  CASTOR
   1          10372   7878  10806   10680
   2          10362   7889  10752   11255
   3          10323  10841  10751   11221
   4          10311   1020  10686   11450
   5          10257   9358  10707   11430
   6          10258   9142  10690   11441
   7          10255   8120  10696   11390
   8          10173   8545  10697   11440
   9          10240   8652  10696   11442
  10          10250   7305  10698   11430

18 Multi-client to multi-server
  - Multiple clients read/write files
  - Multiple I/O nodes provide file storage
  - The output is the aggregate bandwidth
  - Only CASTOR and PVFS are measured
  - Write performance
      - Size of each file: 200 MB
      - Record size: 2 MB
      - Output unit: MB/sec
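For this test the model predicts an aggregate bound of min(N, M) * I as long as the disks are not the limit. The sketch below tabulates that bound for a few client and I/O node counts. It is an illustration of the formula, not the measured result: it assumes the disks are faster than the network (the measured value of R is not given in the transcript), takes 1 MB as 10^6 bytes, and uses a hypothetical function name.

```python
LINK_MBIT_PER_SEC = 94.11  # link speed measured in the pre-test


def predicted_mb_per_sec(num_clients, num_io_nodes,
                         link_mbit_per_sec=LINK_MBIT_PER_SEC):
    """Upper bound min(N, M) * I in MB/sec, assuming disks are not limiting."""
    if num_clients < 1 or num_io_nodes < 1:
        raise ValueError("need at least one client and one I/O node")
    link_mb_per_sec = link_mbit_per_sec / 8  # 1 MB taken as 10^6 bytes
    return min(num_clients, num_io_nodes) * link_mb_per_sec


client_counts = (1, 2, 4, 8)
print("      " + " ".join(f"N={n:<4d}" for n in client_counts))
for m in (1, 2, 4):
    row = " ".join(f"{predicted_mb_per_sec(n, m):6.1f}" for n in client_counts)
    print(f"M={m}:  {row}")
```

Measured aggregate bandwidth should stay at or below these values if the model's assumptions hold.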

19 Results

20 Some useful methods
  - In theory, in a good cluster file system:
      - the data is physically balanced among the I/O devices
      - the data requirements are balanced among the application's tasks
      - the network has enough aggregate bandwidth to pass the data between the two without saturating
  - In practice, the following methods are useful

21 Useful methods in practice
  - Use a high-speed network, for example Gigabit Ethernet or Myrinet
  - Use or develop a high-performance network file transfer protocol
  - Use multiple servers to improve the aggregate bandwidth
  - Improve the read/write speed of the disks
  - Use file striping and parallel I/O
  - Good file system design
  - Improve the processing capacity of the manager nodes
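File striping spreads consecutive chunks of one file across several I/O nodes so that a large read or write can use all of them in parallel. The sketch below shows one common layout, round-robin striping with a fixed stripe size. It is a generic illustration with hypothetical names, not the layout code of PVFS, CASTOR, or any other system named in the talk.

```python
def locate(offset, stripe_size, num_servers):
    """Map a file byte offset to (server index, byte offset on that server)."""
    if offset < 0 or stripe_size <= 0 or num_servers <= 0:
        raise ValueError("offset must be >= 0; sizes and counts must be > 0")
    stripe_index = offset // stripe_size        # which stripe of the file
    server = stripe_index % num_servers         # round-robin placement
    local_stripe = stripe_index // num_servers  # stripes already on that server
    local_offset = local_stripe * stripe_size + offset % stripe_size
    return server, local_offset


STRIPE = 64 * 1024  # illustrative 64 KB stripe size
for off in (0, STRIPE, 4 * STRIPE, 5 * STRIPE + 100):
    print(off, locate(off, STRIPE, num_servers=4))
```

With four servers, a sequential access of four stripes touches each server once, which is what lets the aggregate bandwidth approach M*I in the model.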

22 Summary
  - Cluster file system review
  - Performance analysis formula
  - Performance test
  - Some methods to improve performance

23 Thank you!!

