1
Meta-Simulation Design and Analysis for Large Scale Networks David W Bauer Jr Department of Computer Science Rensselaer Polytechnic Institute
2
OUTLINE Motivation Contributions Meta-simulation ROSS.Net BGP4-OSPFv2 Investigation Simulation Kernel Processes Seven O’clock Algorithm Conclusion
3
“…objective as a quest for general invariant relationships between network parameters and protocol dynamics…”
High-Level Motivation: to gain varying degrees of qualitative and quantitative understanding of the behavior of the system-under-test
–Parameter Sensitivity
–Protocol Stability and Dynamics
–Feature Interactions
4
Meta-Simulation: capabilities to extract and interpret meaningful performance data from the results of multiple simulations
–Individual experiment cost is high
–Developing useful interpretations
–Protocol performance modeling
–Experiment design
Goal: identify a minimum-cardinality set of meta-metrics that maximally models the system
5
OUTLINE Motivation Contributions Meta-simulation ROSS.Net BGP4-OSPFv2 Investigation Simulation Kernel Processes Seven O’clock Algorithm Conclusion
6
Contributions: Meta-Simulation: OSPF
Problem: which meta-metrics are most important in determining OSPF convergence?
–Search the complete model space; negligible metrics identified and isolated
–Re-parameterize and re-scale
–Optimization-based ED: 750 experiments vs. Full-Factorial ED (FFED): 16,384 experiments
–Our approach is within 7% of Full Factorial using two orders of magnitude fewer experiments
7
Contributions: Meta-Simulation: OSPF/BGP
Ability: model the BGP and OSPF control plane
Problem: which meta-metrics are most important in minimizing control-plane dynamics (i.e., updates)?
All updates belong to one of four categories:
–OO: OSPF-caused OSPF update
–BB: BGP-caused BGP update
–OB: OSPF-caused BGP update
–BO: BGP-caused OSPF update
Optimized with respect to various metrics, each equivalent to a particular management approach; the importance of parameters differs for each metric.
–For minimal total updates: local perspectives are 20-25% worse than the global (meta-simulation) perspective, which has a complete view of all domains.
–For minimal total interactions: minimizing total OB+BO is 15-25% better than other metrics.
–OB updates are more important than BO updates (OB: ~50% of total updates; BO: ~0.1%).
8
Contributions: Simulation: Kernel Process
Parallel Discrete Event Simulation
–Conservative Simulation: wait until it is safe to process the next event, so that events are processed in time-stamp order
–Optimistic Simulation: allow violations of time-stamp order to occur, but detect them and recover
Benefits of Optimistic Simulation:
i. Not dependent on the network topology simulated
ii. As-fast-as-possible forward execution of events
9
Contributions: Simulation: Kernel Process
Problem: parallelizing a simulation requires 1.5 to 2 times more memory than sequential execution, and the additional memory requirement hurts performance and scalability
–Scalability decreases as model size increases, due to the memory required to support the model (chart: increasing model size, 4 processors used)
Solution: Kernel Processes (KPs), a new data structure that supports parallelism and increases scalability
10
Contributions: Simulation: Seven O’clock
Problem: distributing a simulation requires efficient global synchronization
–Inefficient solution: barrier synchronization between all nodes while performing computation
–Efficient solution: pass messages between nodes and synchronize in the background of the main simulation
Seven O’clock Algorithm:
–eliminate message passing
–reduce cost from O(n) or O(log n) to O(1)
11
OUTLINE Motivation Contributions Meta-simulation ROSS.Net BGP4-OSPFv2 Investigation Simulation Kernel Processes Seven O’clock Algorithm Conclusion
12
ROSS.Net: Big Picture
Goal: an integrated simulation and experiment design environment
–ROSS.Net (simulation & meta-simulation)
–Protocol Design: protocol metrics, protocol parameters
–Measurement: data sets (Rocketfuel); measured topology data, traffic and router statistics, etc.
–Modeling: protocol models (OSPFv2, BGP4, TCP Reno, IPv4, etc.)
13
ROSS.Net: Big Picture
The Design of Experiments Tool (DOT) drives the parallel discrete event network simulation: input parameters go in, output metric(s) come back.
Meta-Simulation:
–Experiment design and statistical analysis
–Optimization heuristic search: Recursive Random Search
–Sparse empirical modeling
Simulation:
–Optimistic parallel simulation: ROSS
–Memory-efficient network protocol models
14
ROSS.Net: Meta-Simulation Components
Traditional Experiment Design (Full/Fractional Factorial) with statistical or regression analysis (R, STRESS), iterating between a parameter vector and the resulting metric(s):
–Small-scale systems
–Linear parameter interactions
–Small number of parameters
–Empirical model
Design of Experiments Tool (DOT): optimization search with statistical or regression analysis (R, STRESS), iterating between a parameter vector and the resulting metric(s):
–Large-scale systems
–Non-linear parameter interactions
–Large number of parameters (curse of dimensionality)
–Sparse empirical model
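A minimal sketch of the Recursive Random Search loop behind DOT, in C: sample the parameter space, recenter and shrink the sampling box whenever a better point is found, and restart globally when the box collapses. The run_experiment() objective is a hypothetical stand-in for launching one simulation and reading back a metric; it is not ROSS.Net's actual interface, and the dimension count and trial budget are only borrowed from the study below.

/* Sketch of a Recursive Random Search loop, the optimization heuristic
 * DOT uses to drive simulation experiments.  run_experiment() is a
 * hypothetical stand-in for one simulation run, not ROSS.Net's API. */
#include <stdio.h>
#include <stdlib.h>

#define DIMS   15            /* e.g. OSPF/BGP timer parameters     */
#define TRIALS 200           /* budget used in the OSPF/BGP study  */

/* hypothetical: one simulation run -> metric to minimize */
static double run_experiment(const double x[DIMS])
{
    double v = 0.0;
    for (int i = 0; i < DIMS; i++)
        v += (x[i] - 0.3) * (x[i] - 0.3);   /* toy response surface */
    return v;
}

static double urand(double lo, double hi)
{
    return lo + (hi - lo) * ((double)rand() / RAND_MAX);
}

int main(void)
{
    double best[DIMS], cand[DIMS];
    double best_val, radius = 1.0;          /* start with global sampling */

    for (int i = 0; i < DIMS; i++) best[i] = urand(0.0, 1.0);
    best_val = run_experiment(best);

    for (int t = 1; t < TRIALS; t++) {
        /* sample inside the current box centered on the best point */
        for (int i = 0; i < DIMS; i++) {
            double lo = best[i] - radius, hi = best[i] + radius;
            if (lo < 0.0) lo = 0.0;
            if (hi > 1.0) hi = 1.0;
            cand[i] = urand(lo, hi);
        }
        double v = run_experiment(cand);
        if (v < best_val) {                  /* improvement: recenter   */
            best_val = v;
            for (int i = 0; i < DIMS; i++) best[i] = cand[i];
        } else {
            radius *= 0.9;                   /* no luck: shrink the box */
            if (radius < 1e-3) radius = 1.0; /* box collapsed: restart  */
        }
    }
    printf("best metric after %d trials: %f\n", TRIALS, best_val);
    return 0;
}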
15
Meta-Simulation: OSPF/BGP Interactions
Router topology from Rocketfuel trace data:
–took each ISP map as a single OSPF area
–created a BGP domain between ISP maps
–hierarchical mapping of routers
AT&T’s US router network topology, 8 levels of routers:
–Levels 0 and 1: 155 Mb/s, 4 ms delay
–Levels 2 and 3: 45 Mb/s, 4 ms delay
–Levels 4 and 5: 1.5 Mb/s, 10 ms delay
–Levels 6 and 7: 0.5 Mb/s, 10 ms delay
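The level-to-link-parameter mapping above fits naturally in a small lookup table; a sketch in C with the figures from the slide (the struct and names are illustrative, not the thesis' model code):

/* Link parameters per router level, as listed above; illustrative only. */
struct link_class { int level_lo, level_hi; double mbps; double delay_ms; };

static const struct link_class att_levels[] = {
    { 0, 1, 155.0,  4.0 },
    { 2, 3,  45.0,  4.0 },
    { 4, 5,   1.5, 10.0 },
    { 6, 7,   0.5, 10.0 },
};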
16
Meta-Simulation: OSPF/BGP Interactions
OSPF:
–Intra-domain, link-state routing
–Path costs matter
Border Gateway Protocol (BGP):
–Inter-domain, distance-vector, policy routing
–Reachability matters
BGP decision-making steps (see the comparator sketch below):
–Highest LOCAL_PREF
–Lowest AS path length
–Lowest origin type (0 – IGP, 1 – EGP, 2 – Incomplete)
–Lowest MED
–Lowest IGP cost
–Lowest router ID
The topology combines OSPF domains with iBGP and eBGP connectivity.
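The decision list above can be read as an ordered route comparator, where each rule only breaks ties left by the previous one; a sketch in C, with illustrative field names rather than the thesis' model code:

/* Sketch of the BGP decision steps listed above as a route comparator.
 * The bgp_route fields are illustrative, not the thesis' model code. */
struct bgp_route {
    int      local_pref;
    int      as_path_len;
    int      origin;        /* 0 = IGP, 1 = EGP, 2 = Incomplete */
    int      med;
    int      igp_cost;
    unsigned router_id;
};

/* returns 1 if route a is preferred over route b */
static int bgp_prefer(const struct bgp_route *a, const struct bgp_route *b)
{
    if (a->local_pref  != b->local_pref)  return a->local_pref  > b->local_pref;
    if (a->as_path_len != b->as_path_len) return a->as_path_len < b->as_path_len;
    if (a->origin      != b->origin)      return a->origin      < b->origin;
    if (a->med         != b->med)         return a->med         < b->med;
    if (a->igp_cost    != b->igp_cost)    return a->igp_cost    < b->igp_cost;
    return a->router_id < b->router_id;   /* final tie-breaker */
}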
17
Meta-Simulation: OSPF/BGP Interactions
Intra-domain routing decisions can affect inter-domain behavior, and vice versa.
All updates belong to one of four categories:
–OSPF-caused OSPF (OO) update
–OSPF-caused BGP (OB) update – interaction
–BGP-caused OSPF (BO) update – interaction
–BGP-caused BGP (BB) update
(Figure: a link failure or cost increase, e.g. maintenance, inside one domain triggers an OB update toward a destination in another domain.)
18
Meta-Simulation: OSPF/BGP Interactions
Intra-domain routing decisions can affect inter-domain behavior, and vice versa.
Identified four categories of updates:
–OO: OSPF-caused OSPF update
–BB: BGP-caused BGP update
–OB: OSPF-caused BGP update – interaction
–BO: BGP-caused OSPF update – interaction
(Figure: new eBGP connectivity becomes available, triggering a BO update toward the destination.)
These interactions cause route changes to thousands of IP prefixes, i.e. huge traffic shifts!
19
Meta-Simulation: OSPF/BGP Interactions
Three classes of protocol parameters: OSPF timers, BGP timers, BGP decision parameters.
Maximum search space size: 14,348,907 points.
RRS was allowed 200 trials to optimize (minimize) each response surface:
–OO, OB, BO, BB, OB+BO, ALL updates
Applied multiple linear regression analysis to the results.
20
Meta-Simulation: OSPF/BGP Interactions
Optimized with respect to the OB+BO response surface.
BGP timers play the major role: ~15% improvement in the optimal response when BGP timers are included in the search space.
–The BGP KeepAlive timer appears to be the dominant parameter, in contrast to the expectation that MRAI would be.
OSPF timers have little effect, at most 5%:
–low time-scale OSPF updates do not affect BGP.
21
Meta-Simulation: OSPF/BGP Interactions
Optimized with respect to varied response surfaces, each equivalent to a particular management approach; the importance of parameters differs for each metric.
For minimal total updates:
–Local perspectives are 20-25% worse than the global perspective.
For minimal total interactions:
–Minimizing total OB+BO is 15-25% better than other metrics.
OB updates are more important than BO updates (~50% vs. ~0.1% of total updates), so it is important to optimize OSPF.
22
Meta-Simulation Conclusions:
–The number of experiments was reduced by an order of magnitude in comparison to Full Factorial.
–Experiment design and statistical analysis enabled rapid elimination of insignificant parameters.
–Several qualitative statements and system characterizations could be obtained with few experiments.
23
OUTLINE Problem Statement Contributions Meta-simulation ROSS.Net BGP4-OSPFv2 Investigation Simulation Kernel Processes Seven O’clock Algorithm Conclusion
24
Simulation: Overview
Parallel Discrete Event Simulation
–Logical Processes (LPs) for each relatively parallelizable simulation model, e.g. a router or a TCP host
–Local Causality Constraint (LCC): events within each LP must be processed in time-stamp order
–Observation: adherence to the LCC is sufficient to ensure that a parallel simulation produces the same result as a sequential simulation
Conservative Simulation: avoid violating the local causality constraint (wait until it’s safe)
i. Null messages for deadlock avoidance (Chandy/Misra/Bryant)
ii. Time stamp of the next event
Optimistic Simulation: allow violations of local causality to occur, but detect them and recover using a rollback mechanism (see the sketch below)
i. Time Warp protocol (Jefferson, 1985)
ii. Limiting the amount of optimistic execution
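A minimal sketch of the rollback rule this implies, assuming hypothetical types and a stubbed rollback routine (not ROSS source): events are applied greedily, and an event arriving in an LP's past first rolls the LP back to the straggler's timestamp.

/* Sketch of the local causality rule under optimistic execution. */
#include <stdio.h>

struct lp { double last_ts; };          /* time of last processed event */

static void lp_rollback(struct lp *lp, double to_ts)
{
    /* undo state and events with timestamp >= to_ts,
     * send anti-messages for premature sends (omitted in this sketch) */
    printf("rollback from %.1f to %.1f\n", lp->last_ts, to_ts);
    lp->last_ts = to_ts;
}

static void lp_process(struct lp *lp, double ev_ts)
{
    if (ev_ts < lp->last_ts)            /* straggler: causality violated */
        lp_rollback(lp, ev_ts);
    lp->last_ts = ev_ts;                /* optimistic forward execution  */
}

int main(void)
{
    struct lp a = { 0.0 };
    lp_process(&a, 5.0);
    lp_process(&a, 9.0);
    lp_process(&a, 7.0);                /* out of order: triggers rollback */
    return 0;
}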
25
ROSS: Rensselaer’s Optimistic Simulation System
(Diagram: side-by-side ROSS and GTW data structures. A ROSS tw_event carries its receive timestamp, source/destination LP pointers, and user data; a tw_lp carries its LP number, type, processed-event queue, and a pointer to its PE; a tw_pe carries the free event list, event queue, cancel queue, and LP list. GTW keeps a top-down hierarchy of global state: GState[NPE], per-PE lplist[MAX_LP], and LPState with init/reverse/final processing pointers.)
Example accesses:
–GTW, top-down hierarchy: lp_ptr = GState[LP[i].Map].lplist[LPNum[i]]
–ROSS, bottom-up hierarchy: lp_ptr = event->src_lp; or pe_ptr = event->src_lp->pe;
Key advantages of the bottom-up approach:
–reduces access overheads
–improves locality and processor cache performance
Memory usage is only 1% more than sequential and independent of LP count.
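The bottom-up layout described above can be sketched as plain C structs (field names follow the slide; this is a sketch, not ROSS source): every event points at its source LP, and every LP at its owning PE, so any structure is reachable from an event in one or two dereferences.

/* Sketch of the bottom-up pointer hierarchy described above. */
struct tw_pe;                          /* processing element (one per CPU) */

struct tw_lp {
    long          id;                  /* lp number                        */
    struct tw_pe *pe;                  /* owning processor                 */
};

struct tw_event {
    double        recv_ts;             /* receive time stamp               */
    struct tw_lp *src_lp;
    struct tw_lp *dest_lp;
    /* user data follows */
};

/* example accesses from the slide */
static struct tw_lp *event_lp(struct tw_event *e) { return e->src_lp; }
static struct tw_pe *event_pe(struct tw_event *e) { return e->src_lp->pe; }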
26
“On the Fly” Fossil Collection
(Diagram: snapshots of PE 0’s internal state at time 15.0 — free lists and LPs A, B, C with events at 5.0, 10.0, 15.0 — before and after a rollback of LP A and re-execution.)
Key observation: rollbacks cause the free list to become UNSORTED in virtual time.
Result: event buffers that could be allocated are not, so the user must over-allocate the free list.
OTFFC works by only allocating events from the free list whose timestamps are less than GVT. As events are processed they are immediately placed at the end of the free list (see the sketch below).
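A sketch of the allocation rule, under assumed names (not ROSS source): processed events go straight to the tail of the free list, and the allocator reuses a buffer only once its timestamp has fallen below GVT.

/* Sketch of the "on the fly" fossil-collection allocation rule. */
struct buf { double recv_ts; struct buf *next; };

static struct buf *free_head;          /* processed events appended at tail */

struct buf *otffc_alloc(double gvt)
{
    struct buf *b = free_head;
    if (b == NULL || b->recv_ts >= gvt)
        return NULL;                   /* nothing fossilized yet: caller must wait */
    free_head = b->next;               /* buffer is older than GVT: safe to reuse  */
    return b;
}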
27
Contributions: Simulation: Kernel Process
(Diagram: a PE (processing element, one per CPU utilized) owns a set of Kernel Processes (KPs); each KP aggregates many Logical Processes (LPs). Fossil collection and rollback are performed at the KP level.)
28
ROSS: Kernel Processes
Advantages:
i. significantly lowers fossil collection overheads
ii. lowers memory usage by aggregating LP statistics into KP statistics
iii. retains the ability to process events on an LP-by-LP basis in the forward computation
Disadvantages:
i. potential for “false rollbacks”
ii. care must be taken when deciding how to map LPs to KPs
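A minimal structural sketch of the PE → KP → LP aggregation, with illustrative field names (not ROSS source): rollback and fossil-collection state lives once per KP, while LPs stay individually addressable for forward event processing.

/* Sketch of the PE -> KP -> LP aggregation described above. */
struct tw_lp;

struct tw_kp {                         /* kernel process                     */
    double         last_ts;            /* rollback / fossil-collection state
                                          shared by all LPs mapped here      */
    struct tw_lp **lps;                /* LPs aggregated into this KP        */
    int            nlps;
};

struct tw_pe {                         /* processing element (one per CPU)   */
    struct tw_kp  *kps;
    int            nkps;
};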
29
ROSS: KP Efficiency (chart)
–Not enough work in system…
–Small trade-off: longer rollbacks vs. faster fossil collection (FC)
30
ROSS: KP Performance Impact # KPs does not negatively impact performance
31
ROSS: Performance vs GTW ROSS outperforms GTW 2:1 in sequential ROSS outperforms GTW 2:1 at best parallel
32
Simulation: Seven O’clock GVT
Optimistic approach:
–Relies on a global virtual time (GVT) algorithm to perform fossil collection at regular intervals
–Events with timestamp less than GVT will not be rolled back and can be freed
GVT calculation:
–Synchronous algorithms: LPs stop event processing during the GVT calculation; the cost of synchronization may be higher than the positive work done per interval, and processes waste time waiting
–Asynchronous algorithms: LPs continue processing events while the GVT calculation proceeds in the background
Goal: create a consistent cut among LPs, in wall-clock time, that divides the events into past and future
Two problems: (i) Transient Message Problem, (ii) Simultaneous Reporting Problem
33
Simulation: Mattern’s GVT
–Construct the cut via message passing
–Cost: O(log n) if a tree is used, O(n) if a ring
–Drawback: with a large number of processors, the free event pool can be exhausted while waiting for the GVT computation to complete
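For contrast with the O(1) approaches, a sketch of the message-passing reduction such a cut relies on, using MPI's tree-based Allreduce; this shows only the min-of-LVT step, not Mattern's full algorithm (which also accounts for transient messages).

/* Sketch of the O(log n) tree reduction behind a message-passing cut. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double lvt = 100.0 + rank;          /* stand-in for this node's LVT */
    double gvt;
    MPI_Allreduce(&lvt, &gvt, 1, MPI_DOUBLE, MPI_MIN, MPI_COMM_WORLD);

    if (rank == 0)
        printf("GVT estimate (min LVT): %f\n", gvt);

    MPI_Finalize();
    return 0;
}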
34
Simulation: Fujimoto’s GVT
–Construct the cut using a shared memory flag
–Cost: O(1)
–Drawback: limited to shared-memory architectures
–A sequentially consistent memory model ensures proper causal order
35
Simulation: Memory Model
Sequentially consistent does not mean instantaneous; memory events are only guaranteed to be causally ordered.
Is there a method to achieve sequentially consistent shared memory in a loosely coordinated, distributed environment?
36
Simulation: Seven O’clock GVT
Key observations:
–An operation can occur atomically within a network of processors if all processors observe that the event occurred at the same time.
–The CPU clock time scale (ns) is significantly smaller than the network time scale (ms).
Network Atomic Operations (NAOs):
–an agreed-upon frequency in wall-clock time at which some event is logically observed to have happened across a distributed system (see the sketch below)
–a subset of the possible operations provided by a complete sequentially consistent memory model
(Timeline: at each NAO on the wall clock, compute GVT and update tables.)
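A minimal sketch of how an NAO-based cut can be read off locally, assuming gettimeofday() as the wall-clock source and an arbitrary period value: every processor reading the clock at the same instant sees the same cut index, with no messages exchanged, which is where the O(1) "zero-cost" cut comes from.

/* Sketch of a network atomic operation: the cut is just the index of the
 * current agreed-upon wall-clock period.  Names and the period value are
 * illustrative assumptions. */
#include <sys/time.h>
#include <stdint.h>

#define NAO_PERIOD_US 250000u          /* agreed-upon frequency (assumed value) */

static uint64_t wallclock_us(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (uint64_t)tv.tv_sec * 1000000u + tv.tv_usec;
}

/* every processor computing this at the same wall-clock instant
 * observes the same cut number */
static uint64_t current_cut(void)
{
    return wallclock_us() / NAO_PERIOD_US;
}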
37
(Diagram: GVT computation example across processors A–E, with local virtual times 7 and 5 and in-flight message timestamps 9 and 10; a processor reports the minimum of its LVT and sent-message timestamps, e.g. LVT: min(5,9), and GVT: min(5,7).)
38
Simulation: Seven O’clock GVT
t_max is not necessary when a message-passing system with acknowledgements is available.
Transient Message Problem:
–Since t_max is known, senders account for messages sent in the time interval [NAO - t_max, NAO].
–Since no message can take longer than t_max to transfer over the network, there cannot be any transient message.
Simultaneous Reporting Problem:
–Prevented, since all processors see the cut at the exact same instant in wall-clock time.
–If there is a clock synchronization error, any message sent in the error window is still accounted for, since the clock synchronization error is far smaller than t_max.
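A sketch of the sender-side accounting this describes, with assumed variable names (not the thesis' implementation): messages sent inside [NAO - t_max, NAO] fold their timestamps into the sender's own GVT report, so nothing can be in flight and unaccounted for at the cut.

/* Sketch of transient-message accounting for the Seven O'clock cut. */
#include <float.h>

static double t_max;                    /* max network transfer time (known)       */
static double next_nao;                 /* wall-clock time of the next cut         */
static double lvt;                      /* local virtual time                      */
static double min_sent_ts = DBL_MAX;    /* min timestamp sent in [NAO-t_max, NAO]  */

void on_send(double now_wallclock, double msg_ts)
{
    if (now_wallclock >= next_nao - t_max && msg_ts < min_sent_ts)
        min_sent_ts = msg_ts;           /* sender accounts for its own transient   */
}

double local_gvt_report(void)
{
    double r = lvt < min_sent_ts ? lvt : min_sent_ts;
    min_sent_ts = DBL_MAX;              /* reset for the next NAO interval         */
    return r;                           /* global GVT is the min over all reports  */
}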
39
Simulation: Seven O’clock GVT
Itanium-2 cluster, r-PHOLD benchmark:
–1,000,000 LPs, 10% remote events, 16 start events
–4 machines, 1-4 CPUs each, 1.3 GHz
–Round-robin LP-to-PE mapping
(Chart: linear performance.)
40
Simulation: Seven O’clock GVT
Netfinity cluster, r-PHOLD benchmark:
–1,000,000 LPs, 10% and 25% remote events, 16 start events
–4 machines (of a 36-node cluster), 2 CPUs each, 800 MHz
41
Simulation: Seven O’clock GVT: TCP
Itanium-2 cluster:
–1,000,000 LPs, each modeling a TCP host (i.e. one end of a TCP connection)
–2 or 4 machines, 1-4 CPUs on each, 1.3 GHz
–Poorly mapped LP/KP/PE
(Chart: linear performance.)
42
Simulation: Seven O’clock GVT: TCP
Netfinity cluster:
–1,000,000 LPs, each modeling a TCP host (i.e. one end of a TCP connection)
–4-36 machines, 1-2 CPUs on each, Pentium III, 800 MHz
43
Simulation: Seven O’clock GVT: TCP
Sith Itanium-2 cluster:
–1,000,000 LPs, each modeling a TCP host (i.e. one end of a TCP connection)
–4-36 machines, 1-2 CPUs on each, 900 MHz
44
Simulation: Seven O’clock GVT
Summary – Seven O’Clock Algorithm:
–Clock-based algorithm for distributed processors
–Creates a sequentially consistent view of distributed memory
–Zero-cost consistent cut
–Highly scalable and independent of event memory limits
Comparison of GVT algorithms (Fujimoto’s / Seven O’Clock / Mattern’s / Samadi’s):
–Cut calculation complexity: O(1) for Fujimoto’s and Seven O’Clock; O(n) or O(log n) for the message-passing algorithms
–Parallel / Distributed: Fujimoto’s is parallel (shared memory) only; Seven O’Clock is parallel and distributed
–Global invariant: shared memory flag (Fujimoto’s), clock synchronization (Seven O’Clock), message passing (Mattern’s, Samadi’s)
–Independent of event memory: N / Y / N / N
45
Summary: Contributions
Meta-simulation:
–ROSS.Net: platform for large-scale network simulation, experiment design and analysis
–OSPFv2 protocol performance analysis
–BGP4/OSPFv2 protocol interactions
Simulation:
–Kernel processes: memory-efficient, large-scale simulation
–Seven O’clock GVT algorithm: zero-cost consistent cut, high-performance distributed execution
46
Summary: Future Work
Meta-simulation:
–ROSS.Net: incorporate more realistic measurement data and protocol models (CAIDA, multicast, UDP, other TCP variants)
–More complex experiment designs
–Better qualitative analysis
Simulation:
–Seven O’clock GVT algorithm: compute FFT and analyze the “power” of different models
–Attempt to eliminate the GVT algorithm by determining the maximum rollback length