© 2012 IBM Corporation 1 ENSURE: Enabling kNowledge Sustainability, Usability and Recovery for Economic value Presenter: Michael Factor

Presentation transcript:

© 2012 IBM Corporation
Slide 1: ENSURE: Enabling kNowledge Sustainability, Usability and Recovery for Economic value
Presenter: Michael Factor
The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/ ) under grant agreement n°

Slide 2: Enabling kNowledge Sustainability, Usability and Recovery for Economic value
3 use cases:
- Healthcare
- Clinical Studies
- Financial Services
4 innovations:
- EVALUATE: Cost and Value
- AUTOMATE: Preservation Lifecycle
- SCALE: using ICT innovations
- PROTECT: Content-aware data protection
A 3-year IP project started Feb

Slide 3: ENSURE: Key Technical Innovations (Evaluate, Automate, Scale, Protect)
[Diagram labels: Requirements, Evaluate, Automate, Scale, Protect, Access, Deploy, External Events, Flow Events, Ontology, Cost, Value, Quality, Cloud, Virtual appliance, Anonymization]

Slide 5: Evaluate Cost and Value: Input; Evaluate Cost and Value: Output

Slide 6: Evaluate Cost and Value: Process
[Diagram labels: Configurator, Economic Performance Engine, Preservation Plan Optimizer, Translation Rules, Quality Engine, Cost/risk Engine, Data Repositories, Configuration Selection, Administrator, Requirements, (Re)Deploy Solution, ENSURE, Automate]

Slide 7: Evaluate Cost and Value: Preservation Plan Optimizer
- A genetic algorithm generates results based upon the engines (COE, QOE)
- The chart shows two axes, Quality and Cost, but the problem is really n-dimensional
- The user chooses a solution from the Pareto frontier
- On the frontier, no dimension can be improved without degrading at least one other dimension
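
The Pareto frontier idea on this slide can be illustrated with a minimal sketch. It does not reproduce the genetic algorithm or the ENSURE engines; it only filters a hand-made list of candidate plans in two dimensions. The `Plan` type, the plan names, and the cost and quality figures are all assumptions made up for the illustration.

```python
from typing import List, NamedTuple


class Plan(NamedTuple):
    """Hypothetical candidate preservation plan (not an ENSURE type)."""
    name: str
    cost: float     # lower is better
    quality: float  # higher is better


def dominates(a: Plan, b: Plan) -> bool:
    """True if a is no worse than b in both dimensions and strictly better in one."""
    no_worse = a.cost <= b.cost and a.quality >= b.quality
    strictly_better = a.cost < b.cost or a.quality > b.quality
    return no_worse and strictly_better


def pareto_frontier(plans: List[Plan]) -> List[Plan]:
    """Keep only the plans that no other plan dominates."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]


# Made-up candidates for illustration only.
candidates = [
    Plan("A", cost=10.0, quality=0.60),
    Plan("B", cost=20.0, quality=0.90),
    Plan("C", cost=25.0, quality=0.85),  # dominated by B (cheaper and better)
    Plan("D", cost=15.0, quality=0.60),  # dominated by A (cheaper, same quality)
]
frontier = pareto_frontier(candidates)
print([p.name for p in frontier])
```

With more than two dimensions, `dominates` would compare every dimension in the same way; the user would still pick one plan from the resulting frontier.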

Slide 9: Automate Preservation Lifecycle: Preservation Data Aware Lifecycle Management (PDALM) Workflow Engine
- PDALM controls system activities:
  - Manage the workflow of the information being preserved
  - Execute the preservation plan (built by the Configurator)
  - Handle notifications and interaction with the administrator
Example: Workflow for ingest

Slide 10: Automate Preservation Lifecycle: Event Engine
The Event Engine:
- Listens for preservation-related events
- Manages concurrency, priority and impact/severity of events
- Notifies relevant ENSURE components
[Diagram labels: Configurator, Event Engine, PDALM, Monitored system behavior, Economic, Data/format, Regulatory, Standards, Feeds, Scale]
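
The listen, prioritize, notify cycle described for the Event Engine might look roughly like the sketch below. It is a single-threaded illustration and does not address concurrency. The class names, the event kinds, and the convention that a lower number means higher priority are assumptions, not part of ENSURE.

```python
import heapq
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable, DefaultDict, List


@dataclass(order=True)
class Event:
    priority: int                       # assumption: lower number is handled first
    seq: int                            # tie-breaker that preserves arrival order
    kind: str = field(compare=False)    # e.g. "regulatory", "format" (hypothetical)
    payload: str = field(compare=False)


class EventEngine:
    """Hypothetical event engine: queue events, then notify subscribers by priority."""

    def __init__(self) -> None:
        self._queue: List[Event] = []
        self._listeners: DefaultDict[str, List[Callable[[Event], None]]] = defaultdict(list)
        self._seq = 0

    def subscribe(self, kind: str, listener: Callable[[Event], None]) -> None:
        self._listeners[kind].append(listener)

    def publish(self, kind: str, payload: str, priority: int) -> None:
        heapq.heappush(self._queue, Event(priority, self._seq, kind, payload))
        self._seq += 1

    def dispatch(self) -> None:
        """Deliver all queued events, most urgent first."""
        while self._queue:
            event = heapq.heappop(self._queue)
            for listener in self._listeners[event.kind]:
                listener(event)


received: List[str] = []
engine = EventEngine()
engine.subscribe("format", lambda e: received.append(e.payload))
engine.subscribe("regulatory", lambda e: received.append(e.payload))
engine.publish("format", "format X deprecated", priority=5)
engine.publish("regulatory", "retention rule changed", priority=1)
engine.dispatch()
print(received)
```

The regulatory event is published second but delivered first because of its lower priority number.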

Slide 11: Automate Preservation Lifecycle: Ontology Update
1. Select the ontology to update
2. Upload a new version and display potential system impacts
3. Apply the new ontology and update the system

Slide 13: Scale: What is a cloud, why is it interesting, and what are the issues?
"Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources … that can be rapidly provisioned and released with minimal management effort or service provider interaction." (US National Institute of Standards and Technology, Information Technology Laboratory)
Benefits:
- Cost savings
  - Economies of scale, utilization improvement and standardization
- Speed and agility
- Pay-as-you-go for usage
Issues for preservation:
- Rich metadata support (e.g., no search)
- Differences in security models
- Encryption may limit preservation actions
- Compute near the storage (storlets)
- Logical connections among objects in the same and different clouds
- Standards
Cloud delivery models:
- Private Cloud (Enterprise Data Center)
- Community Cloud Services (Enterprise A, Enterprise B, Enterprise C)
- Public Cloud Services (User A, User B, User C, User D, User E)

Slide 14: Scale: Mapping Data to Multiple Clouds
- Map OAIS AIPs and the links among AIPs to the cloud data model
- Manage objects' inter-relationships and referential integrity
- Map objects to one or more clouds
[Diagram labels: Cloud A, Cloud B, Protect]
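
One way to picture the referential-integrity concern above is a check that every link between AIPs points at an object that is actually placed in at least one cloud. The sketch below is an illustration under assumed names (`AIP`, `placement`, `dangling_links`, the cloud identifiers); it says nothing about how ENSURE stores or maps objects.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class AIP:
    """Hypothetical Archival Information Package record with links to other AIPs."""
    aip_id: str
    links: List[str] = field(default_factory=list)


def dangling_links(
    aips: Dict[str, AIP], placement: Dict[str, Set[str]]
) -> List[Tuple[str, str]]:
    """Return (source, target) pairs whose target is unknown or stored in no cloud."""
    broken = []
    for aip in aips.values():
        for target in aip.links:
            if target not in aips or not placement.get(target):
                broken.append((aip.aip_id, target))
    return broken


aips = {
    "aip-1": AIP("aip-1", links=["aip-2"]),
    "aip-2": AIP("aip-2", links=["aip-3"]),  # aip-3 was never ingested
}
# Each AIP may be mapped to one or more clouds.
placement = {"aip-1": {"cloud-a"}, "aip-2": {"cloud-a", "cloud-b"}}
broken = dangling_links(aips, placement)
print(broken)
```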

Slide 15: Scale: Accessing Content with a Virtual Appliance (VA)
1. Request to access content with a VA
2. Instantiate the VA
3. Extract content into the VA
4. Give the user access to the VA with the content
[Diagram labels: ENSURE, Compute Cloud, Private Application Library, Storage Cloud]

Slide 17: Content-aware data protection: Masked/Anonymized Data
- Data owner requirement:
  - Data should be anonymized so that it cannot be associated with a specific individual
- Example:
  - Living people from London who fought in WWII are becoming more and more identifiable
[Diagram labels: Data Owners, hospital, bank, factory, Telco, Full data, Masking Services, Masked data, Data Receivers, Medical Research, Software testing, Statistical Analysis, Pharma Research]
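
The WWII example can be made concrete with a group-size check in the style of k-anonymity, a technique the slides do not name and which is used here only as an illustration. The records, the field names, and the choice of quasi-identifiers are invented for the sketch.

```python
from collections import Counter
from typing import Dict, Iterable, List


def smallest_group(records: Iterable[Dict[str, str]], quasi_identifiers: List[str]) -> int:
    """Size of the smallest group of records sharing the same quasi-identifier values."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(counts.values())


# Invented records: two veterans and three non-veterans, all from London.
records = [
    {"city": "London", "veteran": "yes", "diagnosis": "a"},
    {"city": "London", "veteran": "yes", "diagnosis": "b"},
    {"city": "London", "veteran": "no", "diagnosis": "c"},
    {"city": "London", "veteran": "no", "diagnosis": "d"},
    {"city": "London", "veteran": "no", "diagnosis": "e"},
]

k_now = smallest_group(records, ["city", "veteran"])
# Years later one veteran is no longer in the population: the group shrinks.
k_later = smallest_group(records[1:], ["city", "veteran"])
print(k_now, k_later)
```

When the smallest group reaches size 1, the remaining record can be tied to a single person even though no name is stored, which is why masking that was adequate at ingest may need to be revisited over time.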

Slide 18: Summary
- Architect and build the next generation preservation system, ensuring knowledge is sustained and can be recovered for future value
- Key innovations:
  - Evaluate Cost and Value supporting business decisions
  - Automate Preservation Lifecycle
  - Scale using ICT innovations
  - Content-aware data protection
- Three use cases to demonstrate future preservation:
  - Healthcare, clinical trials, and finance
- Status:
  - Initial end-to-end demo of two use cases in the first year
  - Emphasis on evolution over time for the second year

Slide 19: Thank You

Slide 20: Backup

Slide 21: Open Archival Information System (OAIS), ISO 14721:2002
- Functional Model
- Information Model
  - SIP = Submission Information Package
  - AIP = Archival Information Package
  - DIP = Dissemination Information Package
[Diagram label: Archival Information Package]

Slide 22: What is a cloud and why is it interesting?
"Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction." (US National Institute of Standards and Technology, Information Technology Laboratory)
Key features:
- On-demand
- Shared
- Automated
- Network access
Benefits:
- Cost savings
  - Economies of scale, utilization improvement and standardization
- Speed and agility
- Pay-as-you-go for usage
[Chart: Investment per GB vs. Quantity of Information]

Slide 23: How Much of a Concern are the Following Business Risks Posed by Cloud Business Services to your Business Function, Compared to Your Existing Risks for Non-Cloud Business Services?
Security, privacy, lack of control in data placement, lock-in and compliance are key concerns with cloud.
Source: "Cloud will Transform Business as We Know It: The Secret's in the Source", HfS Research and the London School of Economics, December 2010

Slide 24: IBM's five cloud delivery models
Models:
- Private Cloud (Enterprise Data Center)
- Managed Private Cloud (Enterprise Data Center, IBM operated)
- Hosted Private Cloud (IBM owned and operated)
- Shared Cloud Services (Enterprise A, Enterprise B, Enterprise C)
- Public Cloud Services (User A, User B, User C, User D, User E)
Attribute groups, in the order they appear on the slide:
- Enterprise owned; either enterprise operation or 3rd party; fixed price or time and materials services; internal network; dedicated assets
- 3rd party owned and operated; centralized, secure delivery center; fixed price, time and materials, or pay as you go; internal network; dedicated assets
- Mix of shared and dedicated resources; shared facility and staff; pay as you go; VPN access or public internet
- Shared resources; elastic scaling; pay as you go; public internet
Community Clouds should be considered by memory institutions.

Slide 25: Scale: Cloud Gap Analysis
- Clouds considered:
  - Amazon S3 and EC2 (enterprise)
  - OpenStack Swift and Nova (open source)
  - VISION Cloud (EC research)
- Some common shortcomings for long-term preservation:
  - Limited support for user metadata
  - Lack of support for searches on metadata
  - Differences in supported security models
  - Encryption models limit preservation actions
  - Lack of support for compute near the storage
  - Lack of support for logical connections among objects in the same and different clouds

Slide 26: Scale: Computational Storage
- Cloud storage generally:
  - Uses server-based storage with powerful CPUs
  - Serves big data accessed from anywhere over the WAN
  - ==> add computational modules (storlets) to the cloud storage
- What is a storlet?
  - A restricted module executed in the storage, close to the data
- Why and when to use storlets?
  - Reduce bandwidth
  - Security: reduce exposure of private data
  - Preservation: data in storage may change and be more up-to-date
  - Expose generic functions that can be used by many applications
- Example storlets:
  - Transformation
  - Anonymization
  - Data mining
  - Fixity check
  - Encryption/secure delete
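
As an illustration of the fixity-check example, the sketch below shows the kind of logic such a module might run next to the data: it hashes the object in chunks and returns only a pass/fail result, so the object itself never crosses the WAN. The function name, its signature, and the choice of SHA-256 are assumptions; no actual storlet framework API is shown.

```python
import hashlib
import io
from typing import BinaryIO


def fixity_storlet(stream: BinaryIO, expected_sha256: str, chunk_size: int = 65536) -> bool:
    """Hypothetical storlet body: hash the stored object and compare with the recorded digest."""
    digest = hashlib.sha256()
    # Read in chunks so large objects are never held in memory at once.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest() == expected_sha256


# Stand-in for an object held in cloud storage.
data = b"archival object payload"
expected = hashlib.sha256(data).hexdigest()

ok = fixity_storlet(io.BytesIO(data), expected)
corrupted = fixity_storlet(io.BytesIO(data + b"!"), expected)
print(ok, corrupted)
```

Only the boolean result leaves the storage node, which matches the bandwidth and exposure motivations listed on the slide.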

Slide 27: Scale: Use of Open Standards and Open Source
- jClouds (open source) to access multiple clouds
- Cloud Data Management Interface (CDMI) (standard interface) for cloud access and management
  - Contribute CDMI support to jClouds
- OpenStack Swift (open source) as private cloud infrastructure

Slide 28: Content-aware data protection: Vocabulary of an Access Policy
- Who are the actors (doctor, nurse, gynecologist, ...)
- What are the actions they can take (create, read, append, update, ...)
- What are the data objects that are subject to access policies (PHR, GI,
- What are the purposes for which access is given (treatment, research, billing, ...)
- What are the types of conditions mentioned in the access rules (time, place, consent, ...)
- What types of obligations must be fulfilled before access is granted (external: notify, consent, ...; data-related: anonymize, ...)
Rule form: Actor has permission to take action on data object for the purpose under the conditions with obligations.
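
The rule form above (actor, action, data object, purpose, conditions, obligations) maps naturally onto a small data structure. The sketch below is one possible encoding, not the ENSURE policy language: the `Rule` class, the `evaluate` function, the exact-match semantics for conditions, and the sample rules are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Rule:
    """Hypothetical access rule following the slide's vocabulary."""
    actor: str
    action: str
    data_object: str
    purpose: str
    conditions: Dict[str, str] = field(default_factory=dict)
    obligations: List[str] = field(default_factory=list)


def evaluate(
    rules: List[Rule],
    actor: str,
    action: str,
    data_object: str,
    purpose: str,
    context: Dict[str, str],
) -> Optional[List[str]]:
    """Return the obligations to fulfil if some rule permits the request, else None."""
    for rule in rules:
        if (rule.actor, rule.action, rule.data_object, rule.purpose) != (
            actor, action, data_object, purpose
        ):
            continue
        # Assumption: every condition must match the request context exactly.
        if all(context.get(key) == value for key, value in rule.conditions.items()):
            return list(rule.obligations)
    return None


rules = [
    Rule("doctor", "read", "PHR", "treatment", {"consent": "given"}, ["notify"]),
    Rule("doctor", "read", "PHR", "research", {}, ["anonymize"]),
]

granted = evaluate(rules, "doctor", "read", "PHR", "research", {})
no_consent = evaluate(rules, "doctor", "read", "PHR", "treatment", {"consent": "withheld"})
denied = evaluate(rules, "nurse", "update", "PHR", "billing", {})
print(granted, no_consent, denied)
```

Returning the obligations rather than a bare yes/no keeps data-related obligations such as anonymization visible to whatever component serves the data.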

Slide 29: Content-aware data protection: Compromise
- Share data with changes:
- Data owner requirement:
  - Data should be anonymized so that it cannot be associated with a specific individual
- Example:
  - Living people from London who fought in WWII are becoming identifiable as the years pass
[Diagram labels: Data Owner, hospital, bank, factory, De-Identification, Data Receiver]