Cross-Layer Scheduling in Cloud Computing Systems
Authors: Hilfi Alkaff, Indranil Gupta
Motivation
Many cloud computing frameworks exist
– Batch processing framework: Hadoop
– Stream processing framework: Storm
Current applications are not aware of the underlying network topology
– They might schedule tasks on machines with low bandwidth
Challenges
Need to expose the underlying network topology to applications efficiently
Huge state space to search
– Thousands of machines in a cluster
– Users demand more interactive jobs
Multiple possible data-path representations
– Want to have generic schedulers
Data-Path: Map-Reduce
Data-Path: Stream
Proposed Solution
Cross-Layer Scheduling Framework
– First-level scheduler at the application level
– Second-level scheduler at the routing level
Use Simulated Annealing at each level
– Probabilistic framework
– Idea: if a neighboring state is better, always move there; if it is not, move there with probability P(T), which decreases with time
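The acceptance rule on this slide can be sketched as a generic simulated-annealing loop. The slide does not give the form of P(T); this sketch assumes the common Metropolis rule exp(-delta / T) with a geometric cooling schedule. The names `anneal`, `neighbor`, `cost`, `t0`, and `cooling` are illustrative, not taken from the authors' implementation.

```python
import math
import random

def anneal(initial, neighbor, cost, t0=1.0, cooling=0.95, steps=1000, seed=0):
    """Minimize cost(state) with simulated annealing (illustrative sketch)."""
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    temperature = t0
    for _ in range(steps):
        candidate = neighbor(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Better (or equal) neighbor: always move there.
        # Worse neighbor: move with probability exp(-delta / T) (assumed form of P(T)).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        # Geometric cooling (assumption): the acceptance probability shrinks over time.
        temperature = max(temperature * cooling, 1e-12)
    return best, best_cost
```

In the cross-layer setting, the same loop would presumably run once per level, with `neighbor` proposing a task placement at the application level and a path assignment at the routing level.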
Proposed Architecture
[Architecture diagram; labeled components: Application Master, SDN Controller, Cross-Layer Scheduling]
Algorithm: Pre-computation
Algorithm: Main
Algorithm: genState() Heuristic
Too many neighboring states
– Not possible to traverse all of them
Application Level
– Prefer nodes that have a higher number of sink vertices
– Prefer nodes that have a higher number of source vertices
Routing Level
– Prefer paths that have a lower number of hops
– Prefer paths that have a higher amount of available bandwidth
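The routing-level preference above can be sketched as a biased random choice among candidate paths instead of an exhaustive traversal. The slide does not say how the two preferences are combined; the weight used here (bottleneck bandwidth divided by hop count) is an assumption. `path_weight`, `gen_routing_state`, and the `bandwidth` dictionary keyed by directed `(u, v)` links are hypothetical names, and the sketch assumes every path has at least two nodes and at least one candidate has positive bandwidth.

```python
import random

def path_weight(path, bandwidth):
    """Score a path: fewer hops and more bottleneck bandwidth give a larger weight."""
    links = list(zip(path, path[1:]))
    hops = len(links)
    bottleneck = min(bandwidth[link] for link in links)
    return bottleneck / hops

def gen_routing_state(candidate_paths, bandwidth, rng):
    """Pick one candidate path at random, biased toward high-weight paths."""
    weights = [path_weight(p, bandwidth) for p in candidate_paths]
    return rng.choices(candidate_paths, weights=weights, k=1)[0]

# Usage: a one-hop path with 10 units of bandwidth versus a two-hop path
# whose bottleneck link has 4 units.
bandwidth = {("a", "b"): 10, ("a", "c"): 10, ("c", "b"): 4}
paths = [["a", "b"], ["a", "c", "b"]]
chosen = gen_routing_state(paths, bandwidth, random.Random(0))
```

Sampling rather than always taking the top-scoring path keeps lower-ranked neighbors reachable, which is what the annealing step needs.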
Emulab Result: Throughput
Simulation Result: Computation Time
Simulation Results: CDF
Questions?
Algorithm: Failures
Link failures
– Need to re-allocate the flows using that link
– Keep a separate hash table where key = edge, value = flows
– Get another path from the Topology-Map
Machine failures
– Re-run main algorithm on
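The link-failure bookkeeping can be sketched as a small flow table. The slide names the hash table (key = edge, value = flows) and the Topology-Map but not their structure. This sketch assumes the Topology-Map maps a `(source, destination)` pair to a list of precomputed candidate paths and treats edges as directed `(u, v)` tuples. `FlowTable`, `assign`, `release`, and `on_link_failure` are hypothetical names.

```python
from collections import defaultdict

def path_edges(path):
    """Return the directed edges along a path given as a list of nodes."""
    return list(zip(path, path[1:]))

class FlowTable:
    def __init__(self, topology_map):
        self.topology_map = topology_map       # (src, dst) -> candidate paths (assumed layout)
        self.flows_by_edge = defaultdict(set)  # edge -> ids of flows using that edge
        self.path_of = {}                      # flow id -> current path

    def release(self, flow_id):
        """Remove a flow from every edge it currently uses."""
        path = self.path_of.pop(flow_id, None)
        if path is not None:
            for edge in path_edges(path):
                self.flows_by_edge[edge].discard(flow_id)

    def assign(self, flow_id, path):
        """Place a flow on a path, replacing any previous placement."""
        self.release(flow_id)
        self.path_of[flow_id] = path
        for edge in path_edges(path):
            self.flows_by_edge[edge].add(flow_id)

    def on_link_failure(self, failed_edge):
        """Re-allocate every flow that used the failed edge; return flow id -> new path."""
        rerouted = {}
        # Copy the set first because assign/release mutate it during the loop.
        for flow_id in list(self.flows_by_edge[failed_edge]):
            old = self.path_of[flow_id]
            candidates = self.topology_map.get((old[0], old[-1]), [])
            new_path = next(
                (p for p in candidates if failed_edge not in path_edges(p)), None
            )
            if new_path is None:
                self.release(flow_id)  # no alternative path known
            else:
                self.assign(flow_id, new_path)
            rerouted[flow_id] = new_path
        return rerouted
```

Indexing flows by edge means a failure only touches the flows on that link, rather than scanning every flow in the cluster.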