Download presentation
Presentation is loading. Please wait.
Published byMarianna Barefoot Modified over 9 years ago
1
HDFS and S3 plugins Andrea Manzi Martin Hellmich 13/12/2013
2
Plugins functionalities 13/12/2013 DPM Workshop 2 NFS HTTP/DAV XROOT GridFTP RFIO Namespace Management Pool Management Pool Driver I/O Legacy DPM MySQL HDFS Oracle S3 HDFS Memcache
3
HDFS plugin dmlite plugin implementing I/O, pool driver and namespace functionalities through Apache Hadoop HDFS ensuring: Automatic data replication Fault tolerance to client’s read Dead of Datanode and Namenode Scalability 13/12/2013 DPM Workshop 3
4
Deployment with Lcgdm-dav 13/12/2013 DPM Workshop 4 DPM Head Node Lcgdm-dav + dmlite HDFS-plugin HDFS Namenode HDFS Datanode(s) Lcgdm-dav + dmlite HDFS-plugin
5
Some details 13/12/2013 DPM Workshop 5 HDFS C APIs (libhdfs) do not implement functions to retrieve the available datanodes ( LIVE nodes) Patch implemented and submitted to Hadoop hadoop-libhdfs rpm from our repo First version for Puppet installation is available. To be adapted to recent dav/dmlite module changes
6
On-going issues 13/12/2013 DPM Workshop 6 Tested with new dmlite-based GridFTP plugin Same deployment model as http/dav frontend or single node writing to HDFS But…HDFS does not support multiple write streams / random writes: OSG developed in-memory stream reordering in GridFTP in order to avoid this limitation ( gridftp-hdfs DSI available also in Globus toolkit) To test and understand integration
7
On-going issues 13/12/2013 DPM Workshop 7 SRM frontend does not speak dmlite SRM calls through old dpm daemons do not handle properly new pools (as HDFS) Patch to dpm daemon to be implemented
8
Future steps 13/12/2013 DPM Workshop 8 Distribution: Need to understand how to distribute the plugin HDFS client only in Fedora 20 and Rawhide https://apps.fedoraproject.org/packages/libhdfs https://apps.fedoraproject.org/packages/libhdfs Support for security enabled HDFS clusters ( Kerberos)
9
Performances 13/12/2013 DPM Workshop 9 Tests through LCDM-DAV: HDFS Namespace stat/s half performances compared to Mysql plugin namespace To be optimized with Memcached in front ROOT analysis with massive Vector I/O and TTreeCache Comparable performance with standard disk pools
10
S3 plugin 13/12/2013 DPM Workshop 10
11
Key Facts Data directly to the cloud HTTP/HTTPS only DPM provides the namespace 13/12/2013 DPM Workshop 11 3 2 1
12
Data in the Cloud 12 REDIRECT GET No data through DPM Inherits all capabilities from S3 provider: Amazon: range-header, no multi- range, multi-stream download only, no 3 rd party copy, http access only DATA DPM Workshop 13/12/2013
13
How to install an S3 pool yum install dmlite-plugins-s3 dmlite-shell > pooladd poolaws s3 > poolmodify poolaws bucketsalt xFVlsrg > poolmodify poolaws s3accesskeyid > poolmodify poolaws s3secretaccesskey 13/12/2013 DPM Workshop 13
14
More info HDFS plugin https://svnweb.cern.ch/trac/lcgdm/wiki/Dpm/D ev/Dmlite/Plugins/HDFS https://svnweb.cern.ch/trac/lcgdm/wiki/Dpm/D ev/Dmlite/Plugins/HDFS S3 plugin https://svnweb.cern.ch/trac/lcgdm/wiki/Dpm/D ev/Dmlite/Plugins/S3 https://svnweb.cern.ch/trac/lcgdm/wiki/Dpm/D ev/Dmlite/Plugins/S3 13/12/2013 DPM Workshop 14
15
15 Thanks! Questions? DPM Workshop 13/12/2013
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.