Autonomic aspects in cloud data management
Alexandra Carpen-Amarie, KerData
Contents
- Monitoring framework for BlobSeer
- Self-Adaptive data replication
- Dynamic Provider deployment
- BlobSeer as a Fair Data Storage Service
- BlobSeer as a storage backend for Cumulus
Monitoring framework for BlobSeer
[Architecture diagram. Labels: monitored nodes, MonALISA services, proxy services, repository, monitored data]
Types of monitored nodes: Providers
Monitoring framework for BlobSeer
[Architecture diagram. Labels: monitored nodes, MonALISA services, monitoring database, monitored data]
Types of monitored nodes: Providers
Malicious client detection: Mihaela Vlad (Master's internship in the KerData team)
Monitoring framework for BlobSeer
[Architecture diagram. Labels: monitored nodes, MonALISA services, monitoring database, monitored data]
Monitoring for all BlobSeer components
Types of monitored nodes:
- Providers
- Metadata Providers
- Version Manager
- Provider Manager
Self-Adaptive data replication
Lucian Cancescu (PUB student); PUB advisor: Alexandru Costan
Goals:
- Maintain the replication factor for each BLOB
- Automatically adapt the replication factor
Self-Adaptive data replication
BlobSeer current status:
- A replication degree is specified at BLOB creation
- A write operation attempts to create all the needed replicas for each page
Failures:
- Do not affect read operations (at least 2 replicas)
- The initial replication degree is not restored
Advantage:
- Data is never updated, so replication is easy
Self-Adaptive data replication Maintaining the replication degree (1)
Self-Adaptive data replication Maintaining the replication degree (2)
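The repair step implied by these slides can be sketched as follows. This is only an illustration under stated assumptions, not BlobSeer's actual mechanism: the function name `plan_repairs`, the dictionary that maps each page to the providers holding it, and the "least-loaded provider first" placement rule are all hypothetical.

```python
from collections import Counter

def plan_repairs(page_replicas, live_providers, degree):
    """Return {page: [providers]} listing where new replicas should be created
    so that every page has `degree` replicas on live providers again.

    page_replicas: hypothetical map page id -> set of providers holding a replica.
    """
    live = set(live_providers)
    # Load = number of pages each live provider already stores.
    load = Counter({p: 0 for p in live})
    for holders in page_replicas.values():
        for p in holders:
            if p in live:
                load[p] += 1

    plan = {}
    for page in sorted(page_replicas):
        alive = set(page_replicas[page]) & live
        missing = degree - len(alive)
        if missing <= 0 or not alive:
            # Either the page is healthy, or no replica survived to copy from.
            continue
        # Prefer the least-loaded live providers that do not hold the page yet.
        candidates = sorted(live - alive, key=lambda p: (load[p], p))
        chosen = candidates[:missing]
        for p in chosen:
            load[p] += 1
        if chosen:
            plan[page] = chosen
    return plan
```

For example, with pages `p1` on providers A and B, `p2` on B and C, and provider B failed, calling `plan_repairs({"p1": {"A", "B"}, "p2": {"B", "C"}}, {"A", "C", "D"}, 2)` asks for one new replica of each page. Because data is never updated, copying a page from any surviving replica is sufficient.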
Self-adaptive data replication
Adapting the replication degree
- Use monitoring information:
  - Number of accesses per BLOB
  - Disk space
  - Memory
  - Network
  - User-defined metrics
- Increase/decrease the replication degree automatically
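One possible shape for such an adaptation rule is sketched below. It uses only two of the listed metrics (accesses per BLOB and disk space); the function name, the thresholds `hot`, `cold` and `min_free`, and the degree bounds are assumptions chosen for illustration, not values from the slides.

```python
def adapt_degree(current, accesses_per_min, free_disk_ratio,
                 hot=100.0, cold=5.0, min_free=0.10,
                 min_degree=2, max_degree=5):
    """Return the new replication degree for one BLOB.

    All thresholds are hypothetical defaults:
    - hot / cold: access rates above / below which the BLOB gains / loses a replica
    - min_free: minimum fraction of free disk space required before adding a replica
    """
    if accesses_per_min >= hot and free_disk_ratio > min_free:
        target = current + 1      # popular BLOB and room to spare: add a replica
    elif accesses_per_min <= cold:
        target = current - 1      # rarely accessed BLOB: release a replica
    else:
        target = current          # leave the degree unchanged
    # Never leave the configured bounds.
    return max(min_degree, min(max_degree, target))
```

Changing the degree by one step per evaluation keeps the rule from oscillating wildly on a short burst of accesses; memory, network, or user-defined metrics could be added as further conditions in the same way.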
Dynamic Provider deployment
Alexandru Palade (PUB student); PUB advisor: Alexandru Costan
Goal: Enable BlobSeer to scale up and down automatically
Dynamic Provider deployment
Motivation:
- Cloud computing: pay-per-use model
- Optimize resource consumption
Challenges:
- Finding the optimal number of resources
- Maintaining data integrity when scaling down
Dynamic Provider deployment
Dynamic Deployment Module:
- Compute a score for each provider
- Enable or disable providers
Dynamic Provider deployment
Heuristics for computing the providers’ score
- Factors:
  - Physical factors (storage space, bandwidth usage)
  - BlobSeer-specific factors (number of accesses)
- Weights associated with factors
- Decision based on thresholds
Framework for specifying the scenarios that define the scoring algorithm
- Flexible:
  - Select factors
  - Define conditions for the factors’ values
  - Time interval
- Extensible:
  - Define new scenarios
Dynamic Provider deployment
Example of scenario:
- Free disk space is above the 70% threshold
- Read access rate per time unit is small
- Write access rate per time unit is small
=> The provider can be shut down
Factor weights:
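The weight values for this scenario are not reproduced in the text. The sketch below shows how a scenario of this kind could be encoded as weighted conditions and compared against a threshold. Every name (`Condition`, `score`, `can_shut_down`), the weights 4/3/3, the "fewer than 10 accesses per minute" reading of "small", and the shutdown threshold are placeholders chosen for illustration, not the module's real values.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Condition:
    factor: str                      # name of a monitored factor
    holds: Callable[[float], bool]   # condition on the factor's value
    weight: int                      # contribution to the score when it holds

def score(metrics: Dict[str, float], scenario: List[Condition]) -> int:
    """Sum the weights of all conditions satisfied by a provider's metrics."""
    return sum(c.weight for c in scenario if c.holds(metrics[c.factor]))

# Hypothetical encoding of the example scenario; weights are placeholders.
IDLE_PROVIDER = [
    Condition("free_disk_ratio", lambda v: v > 0.70, 4),
    Condition("reads_per_min", lambda v: v < 10, 3),
    Condition("writes_per_min", lambda v: v < 10, 3),
]

def can_shut_down(metrics: Dict[str, float],
                  scenario: List[Condition] = IDLE_PROVIDER,
                  threshold: int = 10) -> bool:
    """Decision based on a threshold: here all three conditions must hold."""
    return score(metrics, scenario) >= threshold
```

A new scenario is just another list of `Condition` objects, which is one way to obtain the flexibility and extensibility the framework aims for. Integer weights are used so the threshold comparison is exact.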
BlobSeer as a Fair Data Storage Service
Mihai Mircea (PUB student); PUB advisor: Alexandru Costan
Goal: Enhance BlobSeer with fairness policies
- Web service on top of BlobSeer
BlobSeer as a Fair Data Storage Service
Rapidshare-like functionality
- Reward users that add data to the system
- Penalize users that just collect data
Flexible policies:
- Rewards:
  - Priorities for upload/download
  - Increased storage space
- Penalties:
  - Download delays
  - Access restrictions
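As an illustration of one such policy, the sketch below derives a download delay from a user's upload/download ratio. The function name, the linear formula, and the 60-second maximum are assumptions made for this example; the slides do not specify how penalties are computed.

```python
def download_delay_seconds(uploaded_bytes, downloaded_bytes, max_delay=60.0):
    """Hypothetical penalty policy: the less a user has contributed relative
    to what they have downloaded, the longer they wait before a download."""
    if downloaded_bytes <= 0:
        return 0.0                      # nothing collected yet: no penalty
    ratio = uploaded_bytes / downloaded_bytes
    if ratio >= 1.0:
        return 0.0                      # contributes at least as much as taken
    return max_delay * (1.0 - ratio)    # delay grows as the ratio drops
```

A user who has uploaded half as much as they downloaded would wait 30 seconds under these placeholder settings, while a pure collector would wait the full 60. Rewards such as priorities or extra storage space could be derived from the same ratio.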
BlobSeer as a storage backend for Cumulus
Cumulus
- Nimbus storage cloud implementation
- Compatible with the Amazon S3 interface
- Replaces the GridFTP-based VM repository
- Upload/download VMs using S3 tools
- Currently supports POSIX filesystems
BlobSeer as a storage backend for Cumulus
Integrating BlobSeer with Cumulus
- File namespace manager for BlobSeer
- Python bindings for BlobSeer
- Enable Cumulus to store data into BlobSeer
Enable BlobSeer as a VM repository
- Extend the BlobSeer backend to support VM uploads/downloads
- Work with EC2 AMI-tools
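To make the role of a file namespace manager concrete, here is a minimal in-memory sketch. It assumes that the manager's job is to map the file names Cumulus works with onto BLOB identifiers; the class, its methods, and the integer ids are hypothetical and do not reflect the real BlobSeer bindings.

```python
class NamespaceManager:
    """Hypothetical in-memory mapping from file paths to BLOB ids."""

    def __init__(self):
        self._blob_ids = {}

    def register(self, path, blob_id):
        """Bind a file path to the BLOB that stores its contents."""
        if path in self._blob_ids:
            raise FileExistsError(path)
        self._blob_ids[path] = blob_id

    def lookup(self, path):
        """Return the BLOB id for a path; raises KeyError if unknown."""
        return self._blob_ids[path]

    def remove(self, path):
        """Unbind a path and return the BLOB id it pointed to."""
        return self._blob_ids.pop(path)

    def list(self, prefix=""):
        """List registered paths under a prefix, e.g. one bucket."""
        return sorted(p for p in self._blob_ids if p.startswith(prefix))
```

A storage backend built on this idea would call `register` after creating a BLOB for an uploaded object and `lookup` before serving a download. A real implementation would also need persistence and concurrent access, which this sketch leaves out.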
Q&A