Software Verification and Validation 1 Security over Hadoop - Map-Reduce as a Service Team Sofia Neata Sorin Dascalu Tiberiu Popa Tudor Scurtu November 19, 2012

Presentation transcript:

Software Verification and Validation 1 Security over Hadoop - Map-Reduce as a Service Team Sofia Neata Sorin Dascalu Tiberiu Popa Tudor Scurtu November 19, 2012

Slide 2: Map Reduce & Hadoop
- Map Reduce
  - “a programming model and an associated implementation for processing and generating large data sets”
  - Large clusters of commodity machines
  - Highly scalable
- Hadoop
  - An open-source implementation of the MapReduce framework
  - Top-level project of the Apache Software Foundation
  - Used in companies like Yahoo, Cloudera, etc.
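To make the programming model concrete, here is a minimal single-process sketch of a word count expressed as map, shuffle, and reduce steps. It does not use Hadoop; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are illustrative assumptions, and a real framework would run the phases in parallel across the cluster.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework does between the phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values seen for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce sequentially in one process."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Because each `map_phase` call only looks at one record and each `reduce_phase` call only looks at one key, both phases can be distributed over many machines, which is where the scalability claim comes from.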

Slide 3: Hadoop Problem & Proposed Solution
- Problem
  - No security mechanism when using Hadoop
- Solution
  - A web server that serves both as a security layer for accessing Hadoop and as an interface for submitting Map-Reduce problems
- Customers
  - Medium to large companies that require regular distributed computation and distributed permanent storage capabilities

Slide 4: Solution (diagram; labels: Hadoop, Request, Response)

Slide 5: Solution (diagram; labels: Hadoop, Authentication Server, Request, Response)

Slide 6: Functionality
- Create new user and environment
  - The customer contacts the system administrator
- Delete users and environment
  - The customer contacts the system administrator
- Login
- Logout
- Create other users inside an existing environment
- Delete other users inside an existing environment

Slide 7: Functionality (2)
- Browse data from an environment
- Download data from an environment
- Upload data to an environment
- Schedule map-reduce jobs in an environment
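The functionality listed on the two slides above can be sketched as a small per-environment permission model. This is only an illustration under stated assumptions: the role names (`admin`, `member`), the action names, and the `Environment` class are hypothetical and do not come from the presentation, which only says that users can create and delete other users inside an existing environment.

```python
from dataclasses import dataclass, field

# Hypothetical action names derived from the functionality slides.
ADMIN_ACTIONS = {"create_user", "delete_user"}
DATA_ACTIONS = {"browse", "download", "upload", "schedule_job"}


@dataclass
class Environment:
    """One customer environment; users outside it have no access at all."""
    name: str
    roles: dict = field(default_factory=dict)  # username -> "admin" | "member"

    def allowed(self, username, action):
        role = self.roles.get(username)
        if role is None:
            return False  # not a member of this environment: isolated
        if action in DATA_ACTIONS:
            return True   # assumption: every member may use data and jobs
        return action in ADMIN_ACTIONS and role == "admin"

    def create_user(self, actor, new_user, role="member"):
        if not self.allowed(actor, "create_user"):
            raise PermissionError(f"{actor} may not create users in {self.name}")
        self.roles[new_user] = role

    def delete_user(self, actor, target):
        if not self.allowed(actor, "delete_user"):
            raise PermissionError(f"{actor} may not delete users in {self.name}")
        self.roles.pop(target, None)
```

In this sketch the first `admin` entry would be created by the system administrator when the environment is set up, matching the "customer contacts the system administrator" step.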

Slide 8: User Interface Requirements
- Underlying storage access
  - Simple GET and POST HTTP methods
- Schedule Map-Reduce jobs
  - HTTP methods
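One possible way to map GET and POST requests onto the storage and job-scheduling operations is sketched below. The URL layout (`/env/<name>/files/...` and `/env/<name>/jobs`) and the action names are assumptions made for illustration; the slides do not specify any paths.

```python
def route(method, path):
    """Translate an HTTP method and path into an (action, environment, target) tuple.

    Returns None when the request does not match the assumed URL layout.
    """
    parts = [p for p in path.split("/") if p]
    if len(parts) < 3 or parts[0] != "env":
        return None
    env, resource, rest = parts[1], parts[2], "/".join(parts[3:])
    if resource == "files" and method == "GET":
        return ("read", env, rest)        # browse a directory or download a file
    if resource == "files" and method == "POST":
        return ("upload", env, rest)      # store the request body at this path
    if resource == "jobs" and method == "POST" and not rest:
        return ("schedule_job", env, "")  # job description sent in the body
    return None
```

Keeping the routing as a pure function makes it easy to check each request against the caller's access level before anything is forwarded to Hadoop.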

Slide 9: Security Requirements
- Security layer using the web server
- Different access levels for customers
- Forms of security/authentication via HTTP
  - HTTPS – using TLS/SSL certificates
    - A signed certificate from a CA
    - Passwords can be kept in clear text
  - SHA hashes for password/other sensitive data
    - Vulnerable to man-in-the-middle attacks
  - NTLM
    - Secure challenge/response mechanism
    - Prevents password capture or replay attacks over HTTP
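The hashing and challenge/response ideas on this slide can be sketched with the Python standard library. Two substitutions are made and should be read as assumptions: PBKDF2-HMAC-SHA256 with a random salt stands in for a bare SHA hash of the password, and the challenge/response functions show a generic HMAC-based scheme, not the NTLM protocol itself. All function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def hash_password(password, salt=None, iterations=100_000):
    """Derive a salted hash so the clear-text password is never stored."""
    if salt is None:
        salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(password, salt, expected, iterations=100_000):
    """Recompute the hash and compare in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)


def new_challenge():
    """Server side: a fresh random value for every login attempt."""
    return secrets.token_bytes(16)


def respond(challenge, key):
    """Client side: prove knowledge of the key without sending it."""
    return hmac.new(key, challenge, hashlib.sha256).digest()


def check_response(challenge, key, response):
    """Server side: accept only a response computed for this challenge."""
    return hmac.compare_digest(respond(challenge, key), response)
```

Because every login uses a new challenge, a response captured from an earlier exchange does not verify against a later one, which is the replay protection the slide refers to. Neither technique protects against an active man-in-the-middle on plain HTTP, so in practice they would be combined with HTTPS.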

Slide 10: System Requirements
- Hadoop cluster
  - Minimum five machines
  - Linux operating system
  - Hadoop and HTTP web server with security capabilities
- Cluster configuration based on customer requirements
  - Memory
  - Computing performance
  - Storage capabilities

Slide 11: User Requirements
- Large storage space
  - Cheap – amortized over time
  - Reliable – no data loss in case of software and hardware incidents
- Large processing capabilities
  - Cheap – amortized over time
  - Fast – distributed
  - Failsafe – guarantee that the processing will complete in case of software or hardware failure
- Secure access to previously enumerated items
  - No unauthorized access to data or to meta-information
  - Isolated from other clients' actions

Slide 12: Specification

Slide 13: Conclusion
- Authentication web server over Hadoop
  - Different access levels
  - HTTP methods
  - Security/authentication over Hadoop
- Disadvantages
  - Tedious management and administration of a private cluster