Published by Camron Shelton. Modified over 8 years ago.
1
File Systems for Cloud Computing
Chittaranjan Hota, PhD
Faculty In-charge, Information Processing Division
Birla Institute of Technology & Science-Pilani, Hyderabad Campus
Jawahar Nagar, Shameerpet, Ranga Reddy District, Hyderabad, AP, India
hota@hyderabad.bits-pilani.ac.in
16th March 2013
Computer Sc Dept., Utkal University, Vani Vihar, Bhubaneswar
2
Growth of the Internet
Sources: Cisco VNI Global Forecast, 2011-2016; Internet World Stats
3
Golden era in Computing
o Powerful multi-core processors
o General-purpose graphics processors
o Superior software methodologies
o Virtualization leveraging the powerful hardware
o Wider bandwidth for communication
o Proliferation of devices
o Explosion of domain applications
Source: Cloud Futures 2011, Redmond
4
Cloud computing: Is it hype?
Projected market growth: from $41 billion in 2011 to $241 billion in 2020
5
Scaling up… SETI
6
What is Cloud Computing?
7
Files
o Permanent storage
o Information sharing
o Files have data and attributes
8
What a Distributed File System Provides
Access to data stored at servers through file system interfaces.
What are the file system interfaces?
o Open a file, check the status of a file, close a file
o Read data from a file
o Write data to a file
o Lock a file or part of a file
o List files in a directory, delete a directory
o Delete a file, rename a file, add a symbolic link to a file, etc.
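The interface operations listed above can be tried against a local file system, since a distributed file system aims to expose the same calls for remote data. The following is a minimal sketch using Python's standard library on a temporary directory; the file names are made up for illustration, and locking and symbolic links are left out because they are platform dependent.

```python
import os
import tempfile


def demo_file_interface():
    """Exercise basic file system operations on a local temporary directory."""
    with tempfile.TemporaryDirectory() as root:
        path = os.path.join(root, "notes.txt")  # hypothetical file name

        # Open a file, write data to it, and close it (the with-block closes).
        with open(path, "w") as f:
            f.write("hello dfs")

        # Check the status of a file.
        size = os.stat(path).st_size

        # Read data from a file.
        with open(path) as f:
            data = f.read()

        # Rename a file, then list the files in the directory.
        renamed = os.path.join(root, "renamed.txt")
        os.rename(path, renamed)
        listing = os.listdir(root)

        # Delete a file and list the directory again.
        os.remove(renamed)
        remaining = os.listdir(root)

    return size, data, listing, remaining


result = demo_file_interface()
print(result)
```

A distributed file system such as NFS keeps this call surface unchanged and moves the work behind it to a server.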
9
DFS Design Issues
o Mounting
o Caching
o Hints
o Bulk data transfer
o Replica management
o Writing policies
10
NFS architecture
[Figure: a client computer and a server computer connected by a network. Labels on the client: application programs, UNIX system calls, UNIX kernel, virtual file system, operations on local files (UNIX file system, PC DOS), operations on remote files (NFS client). Labels on the server: UNIX kernel, NFS server, virtual file system, UNIX file system. The NFS client and NFS server communicate by RPC for remote operations.]
11
Google File System
Metadata: namespace, access control, mapping of files to chunks, and current location of chunks
[Figure: GFS architecture with interaction steps numbered 1 to 4]
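The metadata items on this slide can be pictured as two lookup tables kept by a master. The sketch below is an illustration only, not the GFS implementation: the class name, chunk handles, and chunkserver names are hypothetical, access control is omitted, and the 64 MB chunk size is taken from the published GFS design rather than from the slide.

```python
from dataclasses import dataclass, field

CHUNK_SIZE = 64 * 1024 * 1024  # assumed fixed chunk size of 64 MB


@dataclass
class MasterMetadata:
    # Mapping of files to chunks: path -> ordered list of chunk handles.
    file_to_chunks: dict = field(default_factory=dict)
    # Current location of chunks: chunk handle -> chunkserver names.
    chunk_locations: dict = field(default_factory=dict)

    def lookup(self, path, byte_offset):
        """Translate (path, offset) into (chunk handle, replica locations)."""
        index = byte_offset // CHUNK_SIZE
        handle = self.file_to_chunks[path][index]
        return handle, self.chunk_locations[handle]


meta = MasterMetadata()
meta.file_to_chunks["/logs/web.log"] = ["c-001", "c-002"]
meta.chunk_locations["c-001"] = ["cs1", "cs2", "cs3"]
meta.chunk_locations["c-002"] = ["cs2", "cs4", "cs5"]

# An offset of 70 MB falls in the second chunk of the file.
handle, servers = meta.lookup("/logs/web.log", 70 * 1024 * 1024)
print(handle, servers)
```

The point of the design is that the master answers only this small lookup, and the client then fetches the data from a chunkserver directly.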
12
HDFS Design
Files stored as blocks
o Default block size: 64 MB
Reliability through replication
o Each block replicated across 3+ DataNodes
Single NameNode coordinates access and metadata
o Centralized management
No data caching
o Little benefit due to large data sets and streaming reads
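The block size and replication figures above translate directly into storage arithmetic. The sketch below assumes the slide's defaults (64 MB blocks, 3 replicas) and assumes that the last block of a file occupies only as much space as its data; the function name is made up for illustration.

```python
BLOCK_SIZE = 64 * 1024 * 1024  # default block size from the slide
REPLICATION = 3                # replicas per block from the slide


def block_layout(file_size, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Return (blocks, block replicas, raw bytes stored) for one file."""
    blocks = -(-file_size // block_size)  # ceiling division on integers
    replicas = blocks * replication
    raw_bytes = file_size * replication   # assumes the last block is not padded
    return blocks, replicas, raw_bytes


MB = 1024 * 1024
one_gib = block_layout(1024 * MB)
small = block_layout(200 * MB)
print(one_gib, small)
```

A 1 GiB file therefore needs 16 blocks and 48 block replicas, and the NameNode has to track every one of them, which is one reason HDFS favors a small number of large files.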
13
Commodity Hardware
14
HDFS Architecture
[Figure: an HDFS-aware application with two access paths. POSIX API: regular VFS with local and NFS-supported files, specific drivers. HDFS API: separate HDFS view. Both paths use the network stack to reach the HDFS NameNode and HDFS DataNode.]
15
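HDFS Architecture
[Figure: clients send metadata ops to the Namenode, which holds metadata (name, replicas, …). Clients read from and write to Datanodes in Rack1 and Rack2. The Namenode sends block ops to Datanodes, and blocks are replicated between Datanodes.]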
16
HDFS File Read
Participants: HDFS client (on the client node), DistributedFileSystem, FSDataInputStream, NameNode (namenode), three DataNodes (datanode)
o 1: open
o 2: get block location (NameNode)
o 3: read
o 4: read (DataNode)
o 5: read (DataNode)
o 6: close
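The numbered read steps can be mimicked with a small in-memory model. This is a toy sketch, not the Hadoop client: the classes, method names, block IDs, and node names are all hypothetical, and the client simply takes the first listed replica where a real client would prefer the closest one.

```python
class NameNode:
    """Holds metadata only: the blocks of each file and where replicas live."""

    def __init__(self):
        self.files = {}  # path -> list of (block_id, [datanode names])

    def get_block_locations(self, path):
        return self.files[path]


class DataNode:
    """Holds block contents keyed by block ID."""

    def __init__(self):
        self.blocks = {}

    def read_block(self, block_id):
        return self.blocks[block_id]


def read_file(path, namenode, datanodes):
    # Steps 1-2: opening the file asks the NameNode for block locations.
    locations = namenode.get_block_locations(path)
    pieces = []
    # Steps 3-5: read each block in order, directly from a DataNode.
    for block_id, holders in locations:
        chosen = holders[0]  # simplification: first replica in the list
        pieces.append(datanodes[chosen].read_block(block_id))
    # Step 6: close (nothing to release in this in-memory toy).
    return b"".join(pieces)


nn = NameNode()
dns = {"dn1": DataNode(), "dn2": DataNode()}
dns["dn1"].blocks["b1"] = b"hello "
dns["dn2"].blocks["b1"] = b"hello "
dns["dn2"].blocks["b2"] = b"world"
nn.files["/demo.txt"] = [("b1", ["dn1", "dn2"]), ("b2", ["dn2"])]

data = read_file("/demo.txt", nn, dns)
print(data)
```

Note that file contents never pass through the NameNode in this model; it only hands out locations.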
17
Hadoop Clusters
18
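Rack Awareness
[Figure: network topology tree with data centers (d1, d2), racks (r1, r2) and nodes (n1, n2). Distances measured from node n1: d=0 to the same node, d=2 to another node on the same rack, d=4 to a node on a different rack in the same data center, d=6 to a node in a different data center.]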
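The distances d=0, 2, 4, and 6 on this slide match a simple rule: count the hops from each node up to their closest common ancestor in the topology tree. The sketch below assumes node locations written as paths such as /d1/r1/n1 (data center, rack, node); the function name is made up for illustration.

```python
def network_distance(a, b):
    """Sum of hops from each node up to the closest common ancestor."""
    path_a = a.strip("/").split("/")
    path_b = b.strip("/").split("/")
    common = 0
    for x, y in zip(path_a, path_b):
        if x != y:
            break
        common += 1
    return (len(path_a) - common) + (len(path_b) - common)


same_node = network_distance("/d1/r1/n1", "/d1/r1/n1")
same_rack = network_distance("/d1/r1/n1", "/d1/r1/n2")
same_dc = network_distance("/d1/r1/n1", "/d1/r2/n3")
other_dc = network_distance("/d1/r1/n1", "/d2/r3/n4")
print(same_node, same_rack, same_dc, other_dc)
```

A reader can use such a measure to pick the nearest replica, and a scheduler can use it to place computation close to the data.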
19
HDFS Write
Participants: HDFS client (on the client node), DistributedFileSystem, FSDataOutputStream, NameNode (namenode), pipeline of three DataNodes (datanode)
o 1: create
o 2: create (NameNode)
o 3: write
o 4: write packet (forwarded along the DataNode pipeline)
o 5: ack packet (returned back along the pipeline)
o 6: close
o 7: complete
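Steps 4 and 5 of the write path, where packets travel down a pipeline of DataNodes and acknowledgements travel back, can be sketched as follows. This is a simplified, sequential illustration with hypothetical names: each DataNode is modeled as a plain list, and the client holds every packet in an acknowledgement queue until all replicas have stored it.

```python
from collections import deque


def write_through_pipeline(packets, pipeline):
    """Send packets through the pipeline and return the acked sequence numbers."""
    ack_queue = deque()  # packets sent but not yet acknowledged
    acked = []
    for seq, payload in enumerate(packets):
        ack_queue.append(seq)
        # Step 4: the packet is forwarded from one DataNode to the next.
        for store in pipeline:
            store.append(payload)
        # Step 5: the ack returns once every replica has stored the packet,
        # so the client can drop the packet from its ack queue.
        acked.append(ack_queue.popleft())
    return acked


dn_a, dn_b, dn_c = [], [], []
acks = write_through_pipeline([b"p0", b"p1", b"p2"], [dn_a, dn_b, dn_c])
print(acks, dn_c)
```

In a real cluster the packets are streamed concurrently rather than one at a time, but the invariant is the same: a packet counts as written only after the whole pipeline has acknowledged it.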
20
Replica Placement
[Figure: a data center containing racks, each rack containing nodes]
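The slide shows only the topology, so the policy in this sketch is an assumption: it follows the commonly described HDFS default of placing the first replica on the writer's node, the second on a node in a different rack, and the third on another node in that same remote rack. The function and node names are hypothetical, and nodes are chosen in sorted order where a real cluster would choose randomly.

```python
def place_replicas(writer, nodes_by_rack):
    """Pick three nodes: the writer, then two nodes on one remote rack."""
    writer_rack = next(
        rack for rack, nodes in nodes_by_rack.items() if writer in nodes
    )
    # Assumption: choose the first other rack by name (deterministic for the demo).
    remote_rack = next(
        rack for rack in sorted(nodes_by_rack) if rack != writer_rack
    )
    remote_nodes = sorted(nodes_by_rack[remote_rack])
    # Requires at least two nodes on the remote rack.
    return [writer, remote_nodes[0], remote_nodes[1]]


topology = {
    "rack1": ["n1", "n2", "n3"],
    "rack2": ["n4", "n5", "n6"],
}
placement = place_replicas("n2", topology)
print(placement)
```

Under this policy a block survives the loss of a whole rack, while only one of the two extra copies has to cross the rack boundary during the write.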
21
Computational Grids [Source: IBM TJ Watson Research Center]
22
Load Distribution
23
Map/Reduce
24
SLURM
28
Crowd Sourcing
29
Foxtrot: Associating audio with locations
30
Allen Telescope Array: Search for Extraterrestrial Intelligence
31
Thank You!