Published by Norma Ray. Modified over 9 years ago.
Master’s Thesis (30 credits)
By: Morten Lindeberg
Supervisors: Vera Goebel and Jarle Søberg
Design, Implementation, and Evaluation of Network Monitoring Tasks for the Borealis Stream Processing Engine
Slide no. 2 Outline
Problem description
Application domains
Data stream management system (DSMS)
Borealis
Design
Experiment Setup
Implementation
Evaluation
Conclusion
Future Work
Label: Network monitoring tasks
Slide no. 3 Problem Description
Design, Implementation, and Evaluation of Network Monitoring Tasks for the Borealis Stream Processing Engine
Network Monitoring Tasks:
–Task-1: Verify Borealis load shedding mechanisms.
–Task-2: Measure the average packet count and the average network load per second over a one-minute interval.
–Task-3: How many packets have been sent to certain ports during the last five minutes?
–Task-4: How many bytes have been exchanged on each connection during the last ten seconds?
–Task-5: Identify possible SYN flood attacks.
Slide no. 4 Application Domains
Network monitoring (controlling and measuring the Internet or parts of it)
–Challenges
  Traffic volumes
  Getting relevant data
  Privacy
–On-line network measurements
  Passive: Our network tasks
  Active: E.g. Traceroute and Ping
–Off-line network measurements
  Passive: E.g. InTraBase (Siekkinen, 2006)
  Active: Pandora FMS (Pandora, 2007)
Figure labels: N.M, Private network, DB, Looks at all passing packets, Push-based
Slide no. 5 Cont. Application Domains
Sensor networks
–TinyDB
Financial tickers
–Traderbot
Figure labels: Pull-based, Push-based
Slide no. 6 DSMS
Stream Data Model
–Definition: A data stream is a real-time, continuous, ordered sequence of items (Golab, 2003)
Slide no. 7 Cont. DSMS
Requirements
–Continuous query language
–Data reduction techniques
  Sampling
  Load shedding
  Aggregations with window techniques
  Without sliding windows, aggregations would be blocking operators, since one will never see the whole stream at once
–Adaptive
–Integration with a traditional database
–Low latency and high throughput
Window techniques:
  Hopping windows
  Tumbling windows
  Overlapping windows
  Windows are either time-based or tuple-based
Streaming tuples should only be kept in main memory, never written to disk (too slow)
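The window techniques listed above can be illustrated with a small sketch. It is not Borealis code: the function `window_counts` and its parameters `size` and `advance` are hypothetical names chosen for this example, and the sketch assumes time-based windows that start at time 0. When a window advances by its own size, windows do not overlap (tumbling); when it advances by less than its size, consecutive windows overlap. Naming conventions for "hopping" differ between systems, so the sketch does not take a position on that term.

```python
from collections import defaultdict

def window_counts(timestamps, size, advance):
    """Count items per time-based window [start, start + size).

    Hypothetical helper for illustration only.
    advance == size gives tumbling (non-overlapping) windows;
    advance < size gives overlapping windows.
    """
    counts = defaultdict(int)
    for t in timestamps:
        # Latest window start (a multiple of `advance`) that is <= t.
        start = (t // advance) * advance
        # Walk back over every earlier window that still covers t.
        while start >= 0 and start + size > t:
            counts[start] += 1
            start -= advance
    return dict(sorted(counts.items()))

arrivals = [0, 1, 4, 5, 9]  # made-up arrival times in seconds
tumbling = window_counts(arrivals, size=5, advance=5)
overlapping = window_counts(arrivals, size=5, advance=2)
print(tumbling)
print(overlapping)
```

Each window only needs a running count, which is why a windowed aggregate can emit results without ever seeing the whole stream.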
Slide no. 8 Cont. DSMS
Existing systems:
Name                          Language
TelegraphCQ (Berkeley Uni.)   SQL-like
STREAM (Stanford Uni.)        SQL-like
Aurora (Brown, M.I.T++)       Boxes and arrows
Medusa (Brown, M.I.T++)       Boxes and arrows
Borealis (Brown, M.I.T++)     Boxes and arrows
Gigascope ($ AT&T)            SQL-like
Slide no. 9 Borealis
Stream processing engine (SPE)
–Academic research / Public domain
–Distributed queries
–General purpose
  Multi-player first person shooter game
  Network monitoring
Continuous query language
–Operator boxes and stream arrows
–XML + GUI
–E.g., operators: Map, Aggregate, Join, Filter, Random Drop and operators for integration with statically stored tables
Figure labels: n1 to n6, Distributed query, Data stream, Result tuples, High Availability
Slide no. 10 Design
Task 2 - Version 1
–Average load and packet count
Task 1 - Version 1
–Mapping
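As a rough illustration of what Task 2 computes, the following sketch averages packet count and network load per second over one window. It is plain Python, not the Borealis query from the thesis; the function `average_load`, the `(timestamp, length)` tuple layout, and the sample packets are assumptions made for this example.

```python
def average_load(packets, window_start, window_seconds=60):
    """Return (packets per second, bits per second) averaged over one window.

    `packets` is an iterable of (timestamp_seconds, length_bytes) pairs
    (an assumed layout). Only packets with
    window_start <= timestamp < window_start + window_seconds are counted.
    """
    window_end = window_start + window_seconds
    count = 0
    total_bytes = 0
    for timestamp, length in packets:
        if window_start <= timestamp < window_end:
            count += 1
            total_bytes += length
    return count / window_seconds, total_bytes * 8 / window_seconds

# Made-up sample: the last packet falls into the next one-minute window.
sample = [(0.5, 1500), (10.0, 500), (59.9, 1000), (60.0, 1500)]
pkts_per_s, bits_per_s = average_load(sample, window_start=0)
print(pkts_per_s, bits_per_s)
```

Three packets totalling 3000 bytes land in the first minute, so the sketch reports 3/60 packets per second and 400 bit/s for that window.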
Slide no. 11 Cont. Design
Task 3 - Version 2
–Destination port count
Task 4 - Version 2
–Exchanged bytes
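Tasks 3 and 4 are both grouped aggregates over a recent time window. The sketch below shows the idea in plain Python rather than Borealis boxes. The field names (`ts`, `src_ip`, `dst_port`, and so on), both function names, and the choice to treat the two directions of a connection as one key are assumptions for illustration; the slides do not say how the thesis identifies a connection.

```python
def packets_per_port(packets, now, ports, window_seconds=300):
    """Count packets sent to each watched destination port in (now - window, now]."""
    counts = {port: 0 for port in ports}
    for p in packets:
        if now - window_seconds < p["ts"] <= now and p["dst_port"] in counts:
            counts[p["dst_port"]] += 1
    return counts

def bytes_per_connection(packets, now, window_seconds=10):
    """Sum packet lengths per connection in (now - window, now]."""
    totals = {}
    for p in packets:
        if now - window_seconds < p["ts"] <= now:
            # Assumption: order the endpoints so both directions share one key.
            a = (p["src_ip"], p["src_port"])
            b = (p["dst_ip"], p["dst_port"])
            key = (min(a, b), max(a, b))
            totals[key] = totals.get(key, 0) + p["length"]
    return totals

# Made-up packet headers.
packets = [
    {"ts": 100, "src_ip": "10.0.0.1", "src_port": 4000,
     "dst_ip": "10.0.0.2", "dst_port": 80, "length": 100},
    {"ts": 395, "src_ip": "10.0.0.2", "src_port": 80,
     "dst_ip": "10.0.0.1", "dst_port": 4000, "length": 1400},
    {"ts": 398, "src_ip": "10.0.0.1", "src_port": 4000,
     "dst_ip": "10.0.0.2", "dst_port": 80, "length": 60},
    {"ts": 399, "src_ip": "10.0.0.3", "src_port": 5000,
     "dst_ip": "10.0.0.2", "dst_port": 22, "length": 200},
]
port_counts = packets_per_port(packets, now=400, ports=[22, 80, 443])
conn_bytes = bytes_per_connection(packets, now=400)
print(port_counts)
print(conn_bytes)
```

Watching a fixed set of ports corresponds loosely to comparing input packets against statically stored tuples; the larger that set, the more work each packet costs.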
Slide no. 12 Cont. Design
Task 5 - Version 1
–SYN flood attack (several hosts initiate half-open connections to a server, so that it has to deny service to others)
–Identifies the relation between the count of SYN packets and the count of normal (non-SYN) packets. Joins the aggregated tuples if the SYN count is twice or more the normal packet count.
Slide no. 13 Cont. Design <parameter name="predicate" value = "left.count * 2 < right.count and left.count > 0" />
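The join predicate can be read as a plain boolean test on two aggregated counts. The sketch below mirrors it in Python. It assumes that `left` is the non-SYN aggregate and `right` the SYN aggregate, which matches the Task 5 description but is not stated on this slide; joining on a shared window id is likewise an assumption, and both function names are hypothetical.

```python
def syn_flood_suspected(normal_count, syn_count):
    """Mirror of: left.count * 2 < right.count and left.count > 0.

    Assumption: left = non-SYN packet count, right = SYN packet count.
    """
    return normal_count * 2 < syn_count and normal_count > 0

def flagged_windows(normal_by_window, syn_by_window):
    """Join both aggregates on window id and keep the windows that match."""
    shared = sorted(normal_by_window.keys() & syn_by_window.keys())
    return [w for w in shared
            if syn_flood_suspected(normal_by_window[w], syn_by_window[w])]

# Made-up per-window counts, keyed by window start time.
normal = {0: 100, 10: 40, 20: 0}
syn = {0: 20, 10: 81, 20: 500, 30: 5}
suspicious = flagged_windows(normal, syn)
print(suspicious)
```

Read this way, the comparison is strict, so a SYN count of exactly twice the normal count does not match, and a window with no normal packets at all is never flagged because of the `left.count > 0` condition.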
Slide no. 14 Experiment Setup
Scripts execute the different stages of each experiment
TG: Generates traffic
fyaf: Filters packet headers from the NIC. Counts the number of packets retrieved by the C.A
C.A (client application): Transforms the packet headers into tuples. I/O to the Q.P
Q.P (query processor): Performs the query on the tuples retrieved from the C.A
System resource consumption is logged by the execution scripts.
fyaf calculates the number of lost packets.
TG controls the amount of generated traffic per second.
Slide no. 15 Borealis Implementation
Client application main-method:
int main( int argc, const char *argv[] )
{
    ...
    sock = get_connection();
    NOTICE << "Socket opened: " << sock;
    status = marshal.open();
    if ( status )
    {
        WARN << "Could not deploy the network.";
    }
    else
    {
        // Start the timer.
        timer = Time::now();
        // Send the first batch of tuples. Queue up the next round with a delay.
        marshal.sentPacket();
        // Run the client event loop. Return only on an exception.
        marshal.runClient();
    }
    ...
}
Figure labels: fyaf, Client application, Query processor, Data stream, Results
Slide no. 16 Evaluation
Results for Task 1 (the map task)
CPU Maximums
Drop box can lead to increased CPU utilization
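For readers unfamiliar with random-drop load shedding, a minimal sketch of the idea follows. It is not the Borealis Random Drop operator; the function `random_drop` and its `drop_rate` and `seed` parameters are hypothetical, and the sketch simply discards each tuple independently with a fixed probability.

```python
import random

def random_drop(tuples, drop_rate, seed=None):
    """Keep each tuple with probability 1 - drop_rate (illustrative only)."""
    rng = random.Random(seed)
    # rng.random() is in [0, 1): drop_rate 0 keeps everything, 1 drops everything.
    return [t for t in tuples if rng.random() >= drop_rate]

stream = list(range(1000))  # stand-in for 1000 tuples
kept_none_dropped = random_drop(stream, drop_rate=0.0)
kept_all_dropped = random_drop(stream, drop_rate=1.0)
kept_half = random_drop(stream, drop_rate=0.5, seed=1)
print(len(kept_none_dropped), len(kept_all_dropped), len(kept_half))
```

Note that the shedder in this sketch still touches every tuple and draws a random number for it, so dropping is not free in CPU terms.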
Slide no. 17 Cont. Evaluation
Results for Task 2 (the simple task)
(Lost packets at different network loads)
Figure label: 40 Mbit/s
Slide no. 18 Cont. Evaluation
Results for Task 2 (the simple task)
(Task result - Measured Load)
Figure labels: A c 98%, A c 93%, A c 96%
Slide no. 19 Cont. Evaluation
Results for Task 3 - Memory Consumption
Low memory consumption (31 Mbyte).
No changes when increasing load.
Static tables cause increased memory consumption, but not much.
Slide no. 20 Cont. Evaluation
Task     Network Load     Memory Consumption
Task 1   30, 40 Mbit/s    31 Mbyte
Task 2   40 Mbit/s        31 Mbyte
Task 3   10, 30 Mbit/s    31, 33 Mbyte
Task 4   20 Mbit/s        31 Mbyte
Task 5   20 Mbit/s        30, 50+ Mbyte
Slide no. 21 Conclusion
Supports complex network monitoring queries
Borealis can handle network loads:
–40 Mbit/s for simple tasks
–20 - 30 Mbit/s for complex tasks
–10 Mbit/s when comparing input packets with several thousand statically stored tuples
Load Shedding
–Not fully working, does not identify overload situations
–random_drop box does not significantly increase supported network load
Low memory consumption
–System code parameters might affect performance
Slide no. 22 Future Work
Distribution of queries
Expand client application (fyaf and load shedding)
Optimization of source code system parameters
New version of Borealis (Winter 2007)
Comparison with results from TelegraphCQ (Søberg, 2006) and STREAM (Hernes, 2006)
Slide no. 23 Bibliography
(Søberg, 2006) - Design, implementation, and evaluation of network monitoring tasks with the TelegraphCQ data stream management system, Master’s Thesis, 2006.
(Hernes, 2006) - Design, implementation, and evaluation of network monitoring tasks with the STREAM data stream management system, Master’s Thesis, 2006.
(Siekkinen, 2006) - Root Cause Analysis of TCP Throughput: Methodology, Techniques, and Applications, Dr. Scient. Thesis, 2006.
(Golab, 2003) - Issues in Data Stream Management, Lukasz Golab and M. Tamer Özsu, 2003.
(Pandora, 2007) - http://pandora.sourceforge.net