Clustering DNS Problems
Applying Control Theory to Data Stream Processing Systems
Wei Xu, Bill Kramer, Peter Bodik

[Overview diagram; labels: online service, raw log data, Data Collection, preprocessing, Sanitized Data Repository, Automatic analysis, Failure Detection, feedback loop]

Preprocessing Data
  Logs are in different formats
  Information we need may be implicit
  Merge information from various sources
  Sampling
  Sanitize the data

Data stream processing
  Continuous queries
  Using TelegraphCQ (TCQ)
  Preprocessing expressed as SQL queries
  Queries over a sliding time window
  Run multiple instances for scalability

Feature Selection, Clustering, Visualization
  See poster "Clustering DNS Problems"

Scalable Software Architecture for Data Stream Processing
[Architecture diagram; labels: load splitter, combiner, TCQ query Q, TCQ query R, SLT 1, SLT 2, Buffer, Source, TCQ, Result Q; numbered tuples 1 to 6 are split across the instances (4 1, 5 2, 6 3) and recombined (6+5+4, 3+2+1)]

Queue length control
  Problem: TCQ drops tuples when the result queue is full
  Goal of control: by controlling the data rate to the TCQ node,
    regulate the queue length on the TCQ node
    prevent dropping tuples
    maximize throughput (and adapt when a disturbance happens)
[Control loop diagram; labels: Queue Length Controller, Queue Length Monitor, Desired Queue Length, Actual Queue Length, Data Rate to TCQ, PI Controller, Controller with Pre-compensation, P]

Output rate control
  Problem: the actual output rate differs from the desired rate for various reasons
  Goal: provide an accurate data source using feedback control, by controlling the "desired data rate" setting on the output thread
[Control loop diagram; labels: Output Rate Controller, Controlled Data Source, Controlled Output Thread (Code Reuse), ?]

If not careful with feedback control …
  The system can become unstable under normal load
  Control theory analysis helps make the correct design
[Plot: Output Y from simulation]

Decision Trees

Client write duration is an outlier

bytes-served <= 67958
|   R_error-code = yes
|   |   R_content-type = yes: true (253/6)
|   |   R_content-type != yes: false (17)
|   R_error-code != yes
|   |   gmt = :01:07: true (54)
|   |   gmt != :01:07
|   |   |   user-id = : true (99/6)
|   |   |   user-id !=
|   |   |   |   gmt = :23:28: true (45)
|   |   |   |   gmt != :23:28
|   |   |   |   |   visit-url = : true (43)
|   |   |   |   |   visit-url != : false (18005)
bytes-served > 67958: true (17733/55)

Error Code

bytes-served <= 195: 145 (135/9)
bytes-served > 195
|   R_content-len = yes: 32 (98)
|   R_content-len != yes
|   |   R_not-cached-reason = yes: 32 (45/19)
|   |   R_not-cached-reason != yes
|   |   |   duration <= 15.2
|   |   |   |   bytes-received <= 2680: -13 (39)
|   |   |   |   bytes-received > 2680
|   |   |   |   |   bytes-received <= 2805: 131 (30/7)
|   |   |   |   |   bytes-received > 2805: -13 (85/13)
|   |   |   duration > 15.2: 131 (69/6)
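
The point about feedback control becoming unstable can be illustrated with a small simulation. The sketch below is not the poster's controller: it is a minimal discrete-time PI loop that sets the data rate into a bounded queue, without the pre-compensation the poster mentions. The gains, the setpoint of 50 tuples, the capacity of 100 tuples, and the service rate of 20 tuples per step are all assumed values chosen only for illustration, and the names simulate, RunResult, and tail_spread are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    queue_lengths: list  # queue length after each step
    dropped: float       # total tuples dropped because the queue was full


def simulate(kp, ki, setpoint=50.0, capacity=100.0, service_rate=20.0, steps=200):
    """Simulate a PI controller that sets the input data rate of a bounded queue.

    All parameter values are illustrative assumptions, not taken from the poster.
    """
    queue = 0.0
    integral = 0.0
    dropped = 0.0
    history = []
    for _ in range(steps):
        # PI control law: the rate depends on the current and accumulated error.
        error = setpoint - queue
        integral += error
        rate = max(0.0, kp * error + ki * integral)  # a data rate cannot be negative

        # Arrivals enter the queue; anything beyond capacity is dropped.
        queue += rate
        if queue > capacity:
            dropped += queue - capacity
            queue = capacity

        # The consumer removes up to service_rate tuples per step.
        queue -= min(queue, service_rate)
        history.append(queue)
    return RunResult(history, dropped)


def tail_spread(result, n=50):
    """Peak-to-peak queue length over the last n steps (0 means fully settled)."""
    tail = result.queue_lengths[-n:]
    return max(tail) - min(tail)


# Moderate gains: the loop settles near the desired queue length.
stable = simulate(kp=0.5, ki=0.1)
# Overly aggressive proportional gain: the same loop keeps oscillating.
aggressive = simulate(kp=2.2, ki=0.1)

for name, run in (("moderate gains", stable), ("aggressive gains", aggressive)):
    print(f"{name}: final queue = {run.queue_lengths[-1]:.2f}, "
          f"tail spread = {tail_spread(run):.2f}, dropped = {run.dropped:.2f}")
```

Under these assumed parameters the load is identical in both runs and only the controller gains differ, which is the situation the poster warns about: a poorly tuned loop can oscillate and drop tuples under normal load, while an analysis of the closed loop indicates which gains are safe.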