Collaborative Offloading for Distributed Mobile-Cloud Apps

Presentation transcript:

Collaborative Offloading for Distributed Mobile-Cloud Apps
Hillol Debnath, Giacomo Gezzi*, Antonio Corradi*, Narain Gehani, Xiaoning Ding, Reza Curtmola, Cristian Borcea
Department of Computer Science, New Jersey Institute of Technology
*Department of Computer Science and Engineering, University of Bologna

Computation Offloading
- Mobile devices have limited resources
- Existing systems offload resource-demanding tasks to the cloud
  - They use a cost function to balance the cost and benefit of offloading a task, based on the workload of the task, the state of resources, and software and hardware dependencies
- Existing work applies only to a single mobile-cloud pair
  - No existing work is designed for distributed apps
- Local offloading decisions do not always result in the best performance
  - Offloading needs to be coordinated among the participating entities for best performance

Motivating Example
- Offloading decisions can be made collaboratively to reduce overall computation latency
- App: J1 -> J2; J2 must execute on D2 for privacy reasons
- Scenario 1: J1 offloaded to the cloud
  - App completion time: 20 + 4 + 200 + 10 = 234
- Scenario 2: no offloading, with WiFi communication
  - App completion time: 60 + 10 + 10 = 80
- Complexity increases when jobs depend on devices and on other computations
[Figure: devices D1 and D2 and the cloud services, with jobs J1 and J2 and the computation and communication costs used in the two scenarios]
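The arithmetic of the two scenarios can be replayed in a few lines. This is only a sketch: the slide gives the sums, and the label attached to each term below is an assumed reading of the figure, not something the slide states.

```python
# Assumed reading of the figure's weights (time units as on the slide).
# The step labels are hypothetical; only the numbers come from the slide.
scenario_offload = {
    "send J1 input from D1 to the cloud": 20,
    "run J1 on the cloud": 4,
    "send J1 result from the cloud to D2": 200,
    "run J2 on D2": 10,
}
scenario_local = {
    "run J1 on D1": 60,
    "send J1 result from D1 to D2 over WiFi": 10,
    "run J2 on D2": 10,
}


def completion_time(steps):
    """Sum the sequential steps of the two-job chain J1 -> J2."""
    return sum(steps.values())


offload_total = completion_time(scenario_offload)
local_total = completion_time(scenario_local)
print(offload_total, local_total)  # 234 80
```

Under this reading, offloading J1 looks attractive locally (4 versus 60 for the computation) but the slow path from the cloud to D2 dominates the total.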

Distributed Mobile-Cloud Apps and Jobs
- Distributed apps run over heterogeneous devices
- A distributed app comprises a set of jobs
  - A job represents a code component (e.g., a class or a function)
- Jobs can depend on devices
  - For example, on sensors or private data located on a device
- Jobs can depend on other jobs (e.g., precedence constraints)
  - Dependencies can be modeled as a directed acyclic graph
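A minimal sketch of this job model, assuming a hypothetical four-job app. The job names, the pinned devices, and the dictionary layout are illustrative choices, not CASINO's data structures; the standard-library `graphlib` module (Python 3.9+) provides the topological ordering.

```python
from graphlib import TopologicalSorter

# Hypothetical app: each job maps to the set of jobs it depends on.
job_deps = {
    "J0": set(),
    "J1": {"J0"},
    "J2": {"J0"},
    "J3": {"J1", "J2"},
}

# Hypothetical device dependencies: jobs tied to a device, for example
# because they read a sensor or private data located there.
device_pins = {"J0": "M0", "J3": "M1"}

# A valid execution order lists every job after all of its predecessors.
# TopologicalSorter raises graphlib.CycleError if the graph is not acyclic.
order = list(TopologicalSorter(job_deps).static_order())
print(order)
```

Any order the sorter returns places J0 first and J3 last here; J1 and J2 may appear in either order because neither depends on the other.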

Challenges
- Scheduling jobs: how to determine what (which job) to offload and where?
- Static partitioning
  - Offloading decision is static and predefined
  - Some jobs are pre-configured to run on mobile or cloud
- Dynamic partitioning
  - Offloading decision is dynamic and made during runtime
  - Jobs and resources are analyzed during runtime
- Difficult to generate a schedule of jobs. For n jobs and m devices:
  - Exponential search space: $m^n$ possible <device, job> assignments
  - NP-hard problem
- Difficult to efficiently execute jobs according to the schedule
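To make the size of the search space concrete, the sketch below enumerates every assignment for a tiny hypothetical instance and counts them for 8 jobs on 6 machines, the sizes that appear in the validation slides. It ignores dependencies, so it is an upper bound on what an exhaustive scheduler would have to examine.

```python
from itertools import product


def count_assignments(n_jobs, m_devices):
    """Number of ways to map n jobs onto m devices with no constraints."""
    return m_devices ** n_jobs


def enumerate_assignments(jobs, devices):
    """List every job -> device mapping (only feasible for tiny inputs)."""
    return [dict(zip(jobs, choice))
            for choice in product(devices, repeat=len(jobs))]


# Hypothetical tiny instance: 3 jobs, 2 devices -> 2**3 = 8 assignments.
tiny = enumerate_assignments(["J1", "J2", "J3"], ["D1", "D2"])
print(len(tiny))                # 8

# 8 jobs on 6 machines.
print(count_assignments(8, 6))  # 1679616
```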

CASINO Overview
- Dynamic and collaborative offloading framework for distributed mobile-cloud apps
- CASINO's job scheduler
  - Works in real time
  - Considers global resource conditions and job/device dependencies
  - Generates a job schedule for a distributed app that minimizes the app completion time
- Deployment framework profiles resource information from participating devices to help the job scheduler
- Programming framework provides a simple API that programmers can use to partition apps statically and dynamically
- Prototype implemented on Android mobiles and Android x86 virtual machines

Outline
- Introduction
- Framework Design
- Job Scheduling
- Scheduling Validation
- Prototype Implementation
- Experimental Evaluation
- Conclusion

Leveraging Avatar Platform for CASINO
- Each mobile device has an associated avatar (e.g., a VM) in the cloud
- Avatars execute app components on behalf of users' mobile devices
- Apps running on the Avatar platform have
  - Low latency/response time
  - Energy efficiency
  - High availability
- Mobiles and avatars of users participating in a distributed app form the set of machines for scheduling
[Figure: cloud hosting avatar1, avatar2, avatar3, and avatar4]

Framework Design (1)
- Works on top of Moitree, the Avatar platform's middleware
- Has components at mobiles and avatars
- Simple annotation-based API
  - API library linked to apps
  - Code components annotated with
    - @Local or @Remote for static partitioning
    - @Offloadable for dynamic partitioning
- Code interceptors analyze annotations
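CASINO's API is built on Java annotations, whose exact signatures the slides do not show. As a rough analogy only, the same idea of tagging code components with a placement policy can be sketched with Python decorators. Every name below (`REGISTRY`, `local`, `remote`, `offloadable`, `placement_policy`, and the three example functions) is hypothetical and is not part of CASINO.

```python
# Hypothetical analogy to @Local / @Remote / @Offloadable, not CASINO's API.
REGISTRY = {}


def _mark(policy):
    """Build a decorator that records the placement policy of a function."""
    def decorator(func):
        REGISTRY[func.__name__] = policy
        return func
    return decorator


local = _mark("static:local")       # always runs on the mobile device
remote = _mark("static:remote")     # always runs in the cloud
offloadable = _mark("dynamic")      # placement decided at run time


@local
def read_sensor():
    return 42


@remote
def average(samples):
    return sum(samples) / len(samples)


@offloadable
def darken(pixels):
    return [p // 2 for p in pixels]


def placement_policy(func):
    """What an interceptor would look up before dispatching a call."""
    return REGISTRY[func.__name__]


print(placement_policy(darken))  # dynamic
```

The decorators leave the functions unchanged and only record metadata, which mirrors the role the slide assigns to annotations: the interceptors, not the annotated code, decide where a call runs.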

Framework Design (2)
- Job scheduler is a cloud service
  - Maintains a job queue
  - Uses global resource conditions
  - Resolves job/device dependencies
  - Schedules jobs
- Device profilers
  - Profile CPU, memory, and network
  - Send profiling data to the job scheduler
- Execution manager
  - Receives jobs assigned by the scheduler
  - Executes scheduled jobs
  - Synchronizes state needed for offloading

Job Scheduling
- NP-hard scheduling problem, according to Lawler et al.: $Q_m \mid prec \mid \sum_{j=1}^{n} C_j$
  - $Q_m$ indicates m uniform parallel machines
  - $prec$ indicates job dependencies
  - $\sum_{j=1}^{n} C_j$ indicates minimizing the total completion time as the optimization goal
- CASINO reduces the exponential search space to a polynomial range
  - Considers job and device dependencies
  - Uses topological sorting and a greedy technique
- CASINO schedules jobs in batches. In each batch, it:
  - Considers currently submitted jobs
  - Resolves dependencies
  - Collects profiling data
  - Runs the scheduling algorithm

Scheduling Algorithm
Time complexity: $O(mn^2 \log m)$

    procedure schedule(J, M, E, R)
        L ← topologicalSort(E)
        COMP[][] ← estimateComputationCost(L, M)
        totalCost ← 0
        for each job j in L
            if j is dependent on device d then
                schedule j on device d
                calculate (comm + comp) cost for j on d
                update totalCost
                continue
            for each device x in M
                estimate cost of executing j on x
            schedule j on the min cost device
        return schedule s
    end procedure
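The pseudocode can be turned into a small runnable sketch. The cost model here is an assumption: the cost of a job on a machine is its computation cost plus the communication cost from each machine that ran one of its predecessors. The slide does not spell out how CASINO estimates costs, so the function signature, the cost tables, and the three-job example are all hypothetical, and the sketch makes no attempt to match the stated time complexity.

```python
from graphlib import TopologicalSorter


def schedule(jobs, machines, deps, pins, comp, comm):
    """Greedy scheduling sketch in the spirit of the pseudocode.

    deps: job -> set of predecessor jobs
    pins: job -> machine the job must run on (device dependency)
    comp: comp[job][machine] = estimated computation cost
    comm: comm[src][dst] = estimated communication cost between machines
    """
    graph = {j: deps.get(j, set()) for j in jobs}
    order = list(TopologicalSorter(graph).static_order())
    placement, total_cost = {}, 0

    def cost(job, machine):
        # Assumed model: computation plus transfers from predecessors.
        transfer = sum(comm[placement[p]][machine] for p in graph[job])
        return comp[job][machine] + transfer

    for job in order:
        if job in pins:                      # device-dependent job
            target = pins[job]
        else:                                # pick the cheapest machine
            target = min(machines, key=lambda m: cost(job, m))
        total_cost += cost(job, target)
        placement[job] = target
    return placement, total_cost


# Hypothetical instance: chain A -> B -> C, with A pinned to mobile M0.
machines = ["M0", "Av0"]
deps = {"B": {"A"}, "C": {"B"}}
pins = {"A": "M0"}
comp = {"A": {"M0": 5, "Av0": 1},
        "B": {"M0": 50, "Av0": 5},
        "C": {"M0": 4, "Av0": 3}}
comm = {"M0": {"M0": 0, "Av0": 10},
        "Av0": {"M0": 10, "Av0": 0}}

placement, total = schedule(["A", "B", "C"], machines, deps, pins, comp, comm)
print(placement, total)
```

In this instance A stays on M0 (cost 5), B moves to Av0 (5 + 10 = 15 instead of 50), and C follows B to Av0 (3 instead of 4 + 10), for a total of 23. Because each job is placed once and never revisited, the result is a heuristic, not a guaranteed optimum.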

Validation of the Job Scheduler (1)
- Validated using simulated but realistic data
- Device dependency matrix: jobs J0 to J7 against machines M0, M1, M2, Av0, Av1, Av2
- Job dependency matrix: jobs J0 to J7 against jobs J0 to J7

Validation of the Job Scheduler (2)
- Communication cost matrix (machines M0, M1, M2, Av0, Av1, Av2 against the same machines), based on typical WiFi/cellular latency
- Computation cost matrix (jobs J0 to J7 against machines M0, M1, M2, Av0, Av1, Av2), estimated from instruction counts

Results
- Ordered list of jobs: [J0, J1, J2, J3, J4, J5, J6, J7]
- [Figure: generated schedule, jobs J0 to J7 against machines M0, M1, M2, Av0, Av1, Av2]
- Total completion time
  - Everything on mobile: 3416 ms
  - Everything on avatars: 1837 ms
  - Scheduled by CASINO: 1614 ms
- Validation shows the capability of dynamic partitioning
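The three completion times can be compared directly. The snippet below only derives ratios from the numbers reported on the slide; the variable names are illustrative.

```python
# Completion times reported on the slide, in milliseconds.
times_ms = {"all_mobile": 3416, "all_avatars": 1837, "casino": 1614}

speedup_vs_mobile = times_ms["all_mobile"] / times_ms["casino"]
speedup_vs_avatars = times_ms["all_avatars"] / times_ms["casino"]
reduction_vs_mobile = 1 - times_ms["casino"] / times_ms["all_mobile"]
reduction_vs_avatars = 1 - times_ms["casino"] / times_ms["all_avatars"]

print(f"{speedup_vs_mobile:.2f}x faster than all-mobile")
print(f"{speedup_vs_avatars:.2f}x faster than all-avatars")
print(f"{reduction_vs_mobile:.1%} less time than all-mobile")
print(f"{reduction_vs_avatars:.1%} less time than all-avatars")
```

The CASINO schedule is roughly 2.12 times faster than running everything on the mobiles and about 1.14 times faster than running everything on the avatars.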

CASINO Implementation
- Prototype built for Android devices and Android x86 virtual machines
- CASINO software and apps run as different processes
  - Android's Binder interface is used for inter-process communication
- CASINO uses AspectJ and Java's annotation processing
  - AspectJ injects code for processing annotations at compile time
- Alternatively, Java's dynamic proxy mechanism was tested
  - Works at run time, uses Java reflection
  - Slower, with more overhead

Profiler Implementation
- CASINO profiles system resources on each device
- Network profiler uses Android's NetworkInfo and TrafficStats APIs
- CPU and memory profiler reads native system files to get capability and usage
  - "/proc/stat" and "/sys/devices/system/cpu/…"
- Battery profiler uses a combination of the Android API and native system files
  - Android's BatteryManager API
  - Reads "/sys/class/power_supply"
- Other tools such as Battery Historian, Network Monitor, and HPROF Analyzer were used to analyze resource usage
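As an illustration of what reading "/proc/stat" involves, the sketch below derives CPU utilization from two samples of the aggregate `cpu` line. It is not CASINO's profiler, which runs on Android in Java. The two sample lines are made up, and the function names are hypothetical. The field order (user, nice, system, idle, iowait, irq, softirq, steal, guest, guest_nice) follows the Linux proc documentation.

```python
def parse_cpu_line(line):
    """Return (total, idle) jiffies from the aggregate 'cpu' line."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in fields[1:]]
    # Count idle and iowait as idle time.
    idle = values[3] + (values[4] if len(values) > 4 else 0)
    # Guest time is already included in user/nice, so sum only the first 8.
    total = sum(values[:8])
    return total, idle


def cpu_utilization(before, after):
    """Fraction of non-idle time between two samples of the 'cpu' line."""
    total0, idle0 = parse_cpu_line(before)
    total1, idle1 = parse_cpu_line(after)
    delta_total = total1 - total0
    if delta_total == 0:
        return 0.0
    return (delta_total - (idle1 - idle0)) / delta_total


# Made-up samples; on a device these would be read from /proc/stat
# at two points in time.
sample_before = "cpu  100 0 50 800 50 0 0 0 0 0"
sample_after = "cpu  160 0 70 1120 50 0 0 0 0 0"

utilization = cpu_utilization(sample_before, sample_after)
print(f"{utilization:.0%}")  # 20%
```

Between the two samples the counters advance by 400 jiffies, of which 320 are idle, so the CPU was busy for 80 of 400, or 20%.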

Evaluation of Photo Filtering App
- Applies a Gaussian blur filter to a photo
- The filtering job is annotated with @Offloadable
- Tested for one mobile-cloud pair
- Tested with both Java and native C++ implementations of the filter
- Dynamic offloading improves execution time substantially
- Benefits of offloading increase with image size
[Figure panels: Java-based implementation of the filter; C++ implementation of the filter]

Evaluation of Execution Manager

State Size (Kb) | Execution Time Including Offloading (ms) | Overhead: Interception and State Initialization (ms) | Overhead: State Sync (ms) | Overhead Percentage
237.31 | 3212 | 2.11 | 0.06 | 0.06%
61.36 | 664 | 2.75 | 0.07 | 0.40%
36.44 | 490 | 2.93 | 0.08 | 0.61%
6.70 | 145 | 2.79 | | 1.97%

- Overhead is low and remains below 2%
- Overhead percentage decreases as the state size increases

Conclusion
- The first computation offloading framework for distributed apps running on mobile-cloud platforms
- CASINO provides an API for both static and dynamic partitioning of code
- We designed a job scheduler for minimizing the total completion time
  - Polynomial-time approximation algorithm for an NP-hard problem
  - All jobs are scheduled according to their constraints
- We designed and implemented in Android a development and deployment framework that performs efficiently

Thanks!
http://cs.njit.edu/~borcea/avatar
Acknowledgment: NSF Grants No. CNS 1409523, CNS 1054754, DGE 1565478, and SHF 1617749; DARPA/AFRL Contract No. A8650-15-C-7521