1
Google File System Simulator Pratima Kolan Vinod Ramachandran
2
Google File System
Master manages metadata
Data transfer happens directly between the client and the chunk server
Files are broken into 64 MB chunks
Chunks are replicated across three machines for safety
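As a rough illustration of the 64 MB chunking described above, a client can map a byte offset in a file to a chunk index before asking the master for that chunk. This is only a sketch; the function and variable names are hypothetical, not part of the simulator.

    CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB chunk size, as stated on the slide

    def chunk_for_offset(byte_offset):
        """Return (chunk_index, offset_within_chunk) for a byte offset in a file."""
        return byte_offset // CHUNK_SIZE, byte_offset % CHUNK_SIZE

    # Example: byte 100,000,000 falls in chunk 1, about 31 MiB into that chunk.
    idx, off = chunk_for_offset(100_000_000)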
3
Event-Based Simulation
[Diagram: components (Component 1-3) place events (Event 1-3) in a priority queue; the simulator repeatedly takes the next highest-priority event from the queue and produces the output of the simulated event.]
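A minimal sketch of this event loop, using Python's heapq module as the priority queue. The event fields, class name, and handler convention are illustrative assumptions, not the simulator's actual interfaces.

    import heapq
    import itertools

    class Simulator:
        def __init__(self):
            self.queue = []                    # priority queue ordered by event time
            self.counter = itertools.count()   # tie-breaker for events at the same time
            self.now = 0.0

        def schedule(self, time, handler, *args):
            # A component places an event in the priority queue.
            heapq.heappush(self.queue, (time, next(self.counter), handler, args))

        def run(self):
            # Get the next highest-priority (earliest) event and simulate it.
            while self.queue:
                self.now, _, handler, args = heapq.heappop(self.queue)
                handler(self, *args)           # a handler may schedule further events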
4
Simplified GFS Architecture
[Diagram: the client and the master server connect through a switch with infinite bandwidth to Network Disks 1-5; queues on each link represent network queues.]
5
Data Flow
The client queries the master server for the chunk ID it wants to read.
The master server returns the set of disk IDs that contain the chunk.
The client requests the chunk from one of those disks.
The disk transfers the data to the client.
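The same read path, written out as a small self-contained sketch. The lookup table, class names, and method names here are assumptions made for illustration; only the flow (master lookup, replica choice, disk transfer) comes from the slide.

    import random

    class Master:
        """Holds only metadata: which disks hold replicas of each chunk."""
        def __init__(self, chunk_locations):
            self.chunk_locations = chunk_locations   # chunk ID -> list of disk IDs

        def lookup(self, chunk_id):
            return self.chunk_locations[chunk_id]    # the replica disk IDs

    class Disk:
        def __init__(self, blocks):
            self.blocks = blocks                     # chunk ID -> chunk data

        def read(self, chunk_id):
            return self.blocks[chunk_id]

    def read_chunk(master, disks, chunk_id):
        replica_disks = master.lookup(chunk_id)      # client queries the master
        chosen = random.choice(replica_disks)        # client picks one replica disk
        return disks[chosen].read(chunk_id)          # the disk transfers the data

    # Example: chunk 7 is replicated on disks 0, 2 and 4.
    master = Master({7: [0, 2, 4]})
    disks = {d: Disk({7: b"chunk-7-data"}) for d in range(5)}
    data = read_chunk(master, disks, 7)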
6
Experiment Setup
We have a client whose bandwidth can be varied from 0 to 1000 Mbps.
We have 5 disks, each with a per-disk bandwidth of 40 Mbps.
We have 3 chunk replicas per chunk of data as a baseline.
Each client request is for 1 chunk of data from a disk.
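These baseline parameters could be captured in a small configuration record along the following lines; the field names are illustrative, not taken from the simulator.

    from dataclasses import dataclass

    @dataclass
    class ExperimentConfig:
        client_bandwidth_mbps: float       # swept from 0 to 1000 Mbps
        disk_count: int = 5
        per_disk_bandwidth_mbps: float = 40.0
        replicas_per_chunk: int = 3        # baseline replication
        chunks_per_request: int = 1        # each client request reads one chunk

    # One point on the sweep: a client capped at 200 Mbps.
    cfg = ExperimentConfig(client_bandwidth_mbps=200.0)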
7
Simplified GFS Architecture (experiment configuration)
[Diagram: client bandwidth varied from 0 to 1000 Mbps; per-disk bandwidth of 40 Mbps; the switch has infinite bandwidth; the five network disks hold chunk ID ranges 0-1000, 0-1000, 0-2000, 1001-2000, and 1001-2000; queues on each link represent network queues.]
8
Experiment 1
Disk requests served without load balancing
– In this case we pick the first chunk server from the list of available chunk servers that hold the requested chunk.
Disk requests served with load balancing
– In this case we apply a greedy algorithm and balance the load of incoming requests across the 5 disks (a sketch follows below).
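The two disk-selection policies could look roughly like this, assuming each disk tracks how many requests are queued on it. The load dictionary and function names are assumptions for the sketch, not the simulator's code.

    def pick_first(replica_disks, load):
        # No load balancing: always take the first chunk server in the list.
        return replica_disks[0]

    def pick_least_loaded(replica_disks, load):
        # Greedy load balancing: send the request to the replica disk
        # currently holding the fewest outstanding requests.
        return min(replica_disks, key=lambda d: load[d])

    # Example: the chunk's replicas live on disks 0, 1 and 2; disk 0 is busiest.
    load = {0: 5, 1: 2, 2: 0, 3: 0, 4: 1}
    assert pick_first([0, 1, 2], load) == 0
    assert pick_least_loaded([0, 1, 2], load) == 2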
9
Expectation
In the non-load-balancing case we expect the effective request/data rate to peak at the bandwidth of 2 disks (80 Mbps).
In the load-balancing case we expect the effective request/data rate to peak at the bandwidth of 5 disks (200 Mbps).
10
Load Balancing Graph
This graph plots the data rate at the client vs. the client bandwidth.
11
Experiment 2
Disk requests served with no dynamic replication
– In this case we have a fixed number of replicas (3 in our case) and the server does not create more replicas based on read-request statistics.
Disk requests served with dynamic replication
– In this case the server replicates certain chunks based on how frequently they are requested.
– We define a replication factor, which is a fraction < 1.
– Number of replicas for a chunk = (replication factor) × number of requests for the chunk (see the sketch below).
– We cap the maximum number of replicas at the number of disks.
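The replica-count rule can be written directly. The rounding and the floor of 3 baseline replicas are assumptions made for this sketch; the slide only specifies the product and the cap at the number of disks.

    def target_replicas(request_count, replication_factor, disk_count, base_replicas=3):
        # replication_factor is a fraction < 1; the result is capped at the
        # number of disks. Rounding down and keeping at least the 3 baseline
        # replicas are assumptions, not specified on the slide.
        wanted = int(replication_factor * request_count)
        return max(base_replicas, min(wanted, disk_count))

    # Example: with factor 0.1, a chunk read 80 times would want 8 replicas,
    # but only 5 disks exist, so the count is capped at 5.
    assert target_replicas(80, 0.1, disk_count=5) == 5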
12
Expectation
Our requests are all aimed at the chunks placed on disk 0, disk 1, and disk 2.
In the non-replication case we expect the effective data rate at the client to be limited by the bandwidth of 3 disks (120 Mbps).
In the replication case we expect the effective data rate at the client to be limited by the bandwidth of 5 disks (200 Mbps).
13
Replication Graph
This graph plots the data rate at the client vs. the client bandwidth.
14
Experiment 3
Disk requests served with no rebalancing
– In this case we do not rebalance read requests based on the frequency of chunk requests.
Disk requests served with rebalancing
– In this case we rebalance read requests by picking the request with the highest frequency and transferring it to a disk with a lower load (a sketch follows below).
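One simple way the rebalancing step could work, written as a sketch: move the most frequently requested chunk queued on the busiest disk to a less loaded disk that also holds a replica. The data structures here are assumptions made for illustration.

    def rebalance_once(pending, load, replica_disks, frequency):
        # pending: disk ID -> set of chunk IDs queued on that disk
        # load: disk ID -> number of queued requests
        # replica_disks: chunk ID -> disks that hold a replica of that chunk
        # frequency: chunk ID -> how many times that chunk has been requested
        src = max(load, key=load.get)                           # most loaded disk
        if not pending[src]:
            return False
        chunk = max(pending[src], key=lambda c: frequency[c])   # hottest queued request
        candidates = [d for d in replica_disks[chunk] if load[d] < load[src]]
        if not candidates:
            return False
        dst = min(candidates, key=load.get)                     # less loaded replica disk
        pending[src].remove(chunk)
        pending[dst].add(chunk)
        load[src] -= 1
        load[dst] += 1
        return True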
15
Graph 3
16
Request Distribution Graph
17
Conclusion and Future Work
GFS is a simple file system for large, data-intensive applications.
We studied the behavior of certain read workloads on this file system.
In the future we would like to develop optimizations that fine-tune GFS.