Published by Dominic Glenn. Modified over 9 years ago.

Software Verification and Validation

Security over Hadoop - Map-Reduce as a Service
Team: Sofia Neata, Sorin Dascalu, Tiberiu Popa, Tudor Scurtu
November 19, 2012

MapReduce & Hadoop

MapReduce
- "a programming model and an associated implementation for processing and generating large data sets"
- Runs on large clusters of commodity machines
- Highly scalable

Hadoop
- An open-source implementation of the MapReduce framework
- Top-level project of the Apache Software Foundation
- Used at companies such as Yahoo and Cloudera
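
The programming model quoted on this slide can be illustrated with the classic word-count example. The sketch below is plain single-process Python, not Hadoop code; the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are chosen here for illustration. It only shows the shape of the computation that a framework such as Hadoop distributes across machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, the step a framework runs between phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence over all documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Because each `map_phase` call looks at one document and each `reduce_phase` call looks at one key, both phases can run in parallel on different machines, which is where the scalability claimed above comes from.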

Hadoop Problem & Proposed Solution

Problem
- No security mechanism when using Hadoop

Solution
- A web server that serves both as a security layer for accessing Hadoop and as an interface for submitting Map-Reduce problems

Customers
- Medium to large companies that require regular distributed computation and distributed permanent storage capabilities

Solution
[Diagram; labels: Hadoop, Request, Response]

Solution
[Diagram; labels: Hadoop, Authentication Server, Request, Response]

Functionality

- Create a new user and environment
  - The customer contacts the system administrator
- Delete users and environment
  - The customer contacts the system administrator
- Login
- Logout
- Create other users inside an existing environment
- Delete other users inside an existing environment

Functionality (2)

- Browse data from an environment
- Download data from an environment
- Upload data to an environment
- Schedule map-reduce jobs in an environment
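
One way to picture the per-environment operations listed on the two Functionality slides is as an authorization check that the web server runs before touching Hadoop. The sketch below is a guess at such a check, not the team's design: the `Role` and `Action` names, and the rule that only an environment admin may create or delete users, are assumptions made here for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Dict


class Role(Enum):
    MEMBER = auto()  # assumed: may browse, download, upload, schedule jobs
    ADMIN = auto()   # assumed: may additionally manage users in the environment


class Action(Enum):
    BROWSE = auto()
    DOWNLOAD = auto()
    UPLOAD = auto()
    SCHEDULE_JOB = auto()
    CREATE_USER = auto()
    DELETE_USER = auto()


# Actions that change who belongs to an environment.
USER_MANAGEMENT = {Action.CREATE_USER, Action.DELETE_USER}


@dataclass
class Environment:
    name: str
    members: Dict[str, Role] = field(default_factory=dict)


def is_allowed(env: Environment, username: str, action: Action) -> bool:
    """Allow an action only for members of this environment."""
    role = env.members.get(username)
    if role is None:
        return False  # users of other environments are rejected outright
    if action in USER_MANAGEMENT:
        return role is Role.ADMIN
    return True
```

Keeping membership inside each `Environment` object means a user of one customer can never pass the check for another customer's data, whatever action is requested.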

User Interface Requirements

- Underlying storage access
  - Simple GET and POST HTTP methods
- Schedule Map-Reduce jobs
  - HTTP methods
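
To make the GET/POST interface concrete, here is a minimal sketch that builds such requests with Python's standard `urllib.request.Request`. The slides give no URL layout, so the server address, the `/environments/.../data/...` and `/jobs` paths, and the JSON field names are all hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Hypothetical server address and URL layout; the slides specify neither.
BASE_URL = "https://gateway.example.com"


def data_url(environment: str, path: str) -> str:
    """Build the URL of a file or directory inside one environment."""
    return f"{BASE_URL}/environments/{quote(environment)}/data/{quote(path)}"


def browse_or_download(environment: str, path: str) -> Request:
    """GET reads from the underlying storage (browse or download)."""
    return Request(data_url(environment, path), method="GET")


def upload(environment: str, path: str, content: bytes) -> Request:
    """POST writes a file into the underlying storage."""
    return Request(
        data_url(environment, path),
        data=content,
        method="POST",
        headers={"Content-Type": "application/octet-stream"},
    )


def schedule_job(environment: str, jar: str, input_path: str, output_path: str) -> Request:
    """POST a job description; the JSON field names are illustrative only."""
    body = json.dumps({"jar": jar, "input": input_path, "output": output_path})
    return Request(
        f"{BASE_URL}/environments/{quote(environment)}/jobs",
        data=body.encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```

The functions only construct the requests; passing one to `urllib.request.urlopen` would send it to a real server.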

Security Requirements

- Security layer using the web server
- Different access levels for customers
- Forms of security/authentication via HTTP
  - HTTPS: using TLS/SSL certificates
    - A signed certificate from a CA
    - Passwords can be kept in clear text
  - SHA hashes for passwords and other sensitive data
    - Vulnerable to man-in-the-middle attacks
  - NTLM
    - Secure challenge/response mechanism
    - Prevents password capture or replay attacks over HTTP
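
The challenge/response idea mentioned for NTLM can be sketched with the Python standard library. This is not the NTLM protocol; it is a generic HMAC-SHA256 scheme, with the `AuthServer` class and helper names invented here, that shows why the password never crosses the wire and why a captured response cannot be replayed.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Tuple


def derive_key(password: str, salt: bytes) -> bytes:
    """Derive a secret from the password; the server stores this, not the password."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def answer_challenge(key: bytes, challenge: bytes) -> bytes:
    """Client side: prove knowledge of the key without sending it."""
    return hmac.new(key, challenge, hashlib.sha256).digest()


class AuthServer:
    """Illustrative in-memory server, not a production design."""

    def __init__(self) -> None:
        self._users: Dict[str, Tuple[bytes, bytes]] = {}  # name -> (salt, key)
        self._pending: Dict[str, bytes] = {}              # name -> open challenge

    def register(self, username: str, password: str) -> None:
        salt = secrets.token_bytes(16)
        self._users[username] = (salt, derive_key(password, salt))

    def issue_challenge(self, username: str) -> Tuple[bytes, bytes]:
        """Return the user's salt and a fresh random challenge."""
        salt, _ = self._users[username]
        challenge = secrets.token_bytes(16)
        self._pending[username] = challenge
        return salt, challenge

    def verify(self, username: str, response: bytes) -> bool:
        # Each challenge is consumed on first use, so a replayed response fails.
        challenge = self._pending.pop(username, None)
        if challenge is None:
            return False
        _, key = self._users[username]
        return hmac.compare_digest(answer_challenge(key, challenge), response)
```

A client would call `issue_challenge`, compute `answer_challenge(derive_key(password, salt), challenge)` locally, and send only that digest back. On its own this does not stop an active man-in-the-middle, which is why it is usually combined with HTTPS.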

System Requirements

- Hadoop cluster
  - Minimum five machines
  - Linux operating system
  - Hadoop and an HTTP web server with security capabilities
- Cluster configuration based on customer requirements
  - Memory
  - Computing performance
  - Storage capabilities

User Requirements

- Large storage space
  - Cheap: amortized over time
  - Reliable: no data loss in case of software or hardware incidents
- Large processing capabilities
  - Cheap: amortized over time
  - Fast: distributed
  - Failsafe: guarantee that processing will complete in case of software or hardware failure
- Secure access to the items listed above
  - No unauthorized access to data or meta-information
  - Isolated from other clients' actions

Specification

Conclusion

- Authentication web server over Hadoop
  - Different access levels
  - HTTP methods
  - Security/authentication over Hadoop
- Disadvantages
  - Tedious management and administration of a private cluster