University of Iowa | Mobile Sensing Laboratory CSense: A Stream-Processing Toolkit for Robust and High-Rate Mobile Sensing Applications IPSN 2014 Farley.

1 University of Iowa | Mobile Sensing Laboratory
CSense: A Stream-Processing Toolkit for Robust and High-Rate Mobile Sensing Applications
IPSN 2014
Farley Lai, Syed Shabih Hasan, Austin Laugesen, Octav Chipara
Department of Computer Science

2 Mobile Sensing Applications (MSAs)
Speaker Identification (diagram labels): Speaker Models, Speech Recording, VAD, Feature Extraction, HTTP Upload
Activity Recognition (diagram labels): Bluetooth Data Collection, Feature Extraction, Activity Classification (Sitting, Standing, Walking, Running, Climbing Stairs, …)

3 Challenges
Mobile sensing applications are difficult to implement on Android devices
– concurrency
– high frame rates
– robustness
Resource limitations and the Java VM worsen these problems
– additional cost of virtualization
– significant overhead of garbage collection

4 Related Work
Support for MSAs
– SeeMon, Coordinator: constrained queries
– JigSaw: customized pipelines
CSense provides a high-level stream programming abstraction that is general and suitable for a broad range of MSAs
CSense builds on prior data flow models
– Synchronous data flows: static scheduling and optimizations, e.g., StreamIt, Lustre
– Asynchronous data flows: more flexible but with lower performance, e.g., Click, XStream/Wavescript

5 Programming model, Compiler, Run-time environment, Evaluation

6 Programming Model
Applications modeled as Stream Flow Graphs (SFG)
– builds on prior work on asynchronous data flow graphs
– incorporates novel features to support MSAs
Create components:
    addComponent("audio", new AudioComponentC(rateInHz, 16));
    addComponent("rmsClassifier", new RMSClassifierC(rms));
    addComponent("mfcc", new MFCCFeaturesG(speechT, featureT));
    ...
Wire components:
    link("audio", "rmsClassifier");
    toTap("rmsClassifier::below");
    link("rmsClassifier::above", "mfcc::sin");
    fromMemory("mfcc::fin");
    ...
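
The create-then-wire pattern on this slide can be sketched in plain Java. This is not the CSense API: `SketchGraph` and its methods are hypothetical stand-ins, and the only behavior assumed is that endpoints are named either `component` or `component::port`. The sketch shows how a wiring mistake (a link to a component that was never added) can be caught when the graph is built rather than at run time.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a stream flow graph; not the CSense API.
final class SketchGraph {
    private final Map<String, Object> components = new LinkedHashMap<>();
    private final List<String[]> links = new ArrayList<>();

    // Register a component under a unique name.
    void addComponent(String name, Object component) {
        if (components.putIfAbsent(name, component) != null) {
            throw new IllegalArgumentException("duplicate component: " + name);
        }
    }

    // Connect two endpoints written as "component" or "component::port".
    void link(String from, String to) {
        requireKnown(from);
        requireKnown(to);
        links.add(new String[] {from, to});
    }

    // Reject endpoints whose component part was never registered.
    private void requireKnown(String endpoint) {
        String name = endpoint.split("::")[0];
        if (!components.containsKey(name)) {
            throw new IllegalArgumentException("unknown component: " + name);
        }
    }

    int linkCount() {
        return links.size();
    }
}
```

Linking `"rmsClassifier::above"` to `"mfcc::sin"` before `"mfcc"` has been added throws immediately, which is the kind of composition error the compiler slides later attribute to static analysis.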

7 Memory Management
Goal: reduce memory overhead introduced by garbage collection and copy operations
Pass-by-reference semantics
– allows for sharing data between components
Explicit inclusion of memory management in SFGs
– focuses programmer's attention on memory operations
– enables static analysis by tracking data exchanges globally
– allows for efficient implementation

8 Memory Management
Data flows from sources, through links, to taps
Implementation:
– sources implement memory pools that hold several frames
– reference counters are used to track sharing of frames
– taps decrement reference counters
Diagram labels: Audio data, MFCCs, Filenames
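
The pool-plus-reference-counter scheme described on this slide can be sketched as follows. The class names `SketchFrame` and `SketchFramePool` and the `acquire`/`retain`/`release` methods are assumptions made for illustration; the slides do not show how CSense implements its pools.

```java
import java.util.ArrayDeque;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical frame type: a reusable buffer plus a reference counter.
final class SketchFrame {
    final float[] data;
    final AtomicInteger refs = new AtomicInteger(0);

    SketchFrame(int size) {
        data = new float[size];
    }
}

// Hypothetical source-side pool: frames are allocated once and recycled,
// so steady-state processing creates no new garbage for frame buffers.
final class SketchFramePool {
    private final ArrayDeque<SketchFrame> free = new ArrayDeque<>();

    SketchFramePool(int frames, int frameSize) {
        for (int i = 0; i < frames; i++) {
            free.push(new SketchFrame(frameSize));
        }
    }

    // Source side: take a frame, or null when the pool is exhausted.
    synchronized SketchFrame acquire() {
        SketchFrame f = free.poll();
        if (f != null) {
            f.refs.set(1);
        }
        return f;
    }

    // Called when a frame is shared with one more downstream component.
    void retain(SketchFrame f) {
        f.refs.incrementAndGet();
    }

    // Tap side: drop one reference; the last release recycles the frame.
    void release(SketchFrame f) {
        if (f.refs.decrementAndGet() == 0) {
            synchronized (this) {
                free.push(f);
            }
        }
    }

    synchronized int available() {
        return free.size();
    }
}
```

A frame shared by two paths is retained once and released twice; only the second release returns it to the pool.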

9 Concurrency Model
Goal: expressive concurrency model that may be analyzed statically
Components are partitioned into execution domains
– components in the same domain are executed on a single thread
– frame exchanges between domains are mediated using shared queues
Other data sharing between components uses a tuple space
Concurrency is specified as constraints
– NEW_DOMAIN / SAME_DOMAIN
– heuristic assignment of components to domains to minimize data exchanges between domains
Static analysis may identify some data races
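
To illustrate frame exchanges mediated by a shared queue, here is a minimal sketch with two threads standing in for two execution domains. It uses the standard `java.util.concurrent.ArrayBlockingQueue`, not CSense's own queues; the class name `DomainQueueSketch`, the integer "frames", and the `END` marker are assumptions of the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class DomainQueueSketch {
    private static final int END = -1; // end-of-stream marker (assumption of this sketch)

    // Runs a producer domain and a consumer domain, then returns the sum
    // of all frame values the consumer received.
    static long run(int frames) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(4);
        long[] sum = new long[1];

        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= frames; i++) {
                    queue.put(i); // blocks while the queue is full
                }
                queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread consumer = new Thread(() -> {
            try {
                for (int v = queue.take(); v != END; v = queue.take()) {
                    sum[0] += v;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join(); // join makes the consumer's writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for domains", e);
        }
        return sum[0];
    }
}
```

Only the queue is touched by both threads; everything else stays confined to one domain, which is what makes this style of partitioning amenable to static checks.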

10 Concurrency Model
    getComponent("audio").setThreading(Threading.NEW_DOMAIN);
    getComponent("httpPost").setThreading(Threading.NEW_DOMAIN);
    getComponent("mfcc").setThreading(Threading.SAME_DOMAIN);
Compiler transformation
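
One way to picture how such constraints could turn into domains is the deliberately simplified rule below: walk a linear pipeline in order, open a new domain at every `NEW_DOMAIN` component, and let every `SAME_DOMAIN` component join the domain of its upstream neighbor. This rule and the class `DomainAssignmentSketch` are assumptions for illustration only; the slides describe the real assignment as a heuristic that minimizes data exchanges between domains and do not spell it out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class DomainAssignmentSketch {
    enum Threading { NEW_DOMAIN, SAME_DOMAIN }

    // Walks a linear pipeline in insertion order and gives each component a domain id.
    static Map<String, Integer> assign(Map<String, Threading> pipeline) {
        Map<String, Integer> domains = new LinkedHashMap<>();
        int current = -1;
        for (Map.Entry<String, Threading> e : pipeline.entrySet()) {
            // The first component always opens a domain, whatever its constraint.
            if (e.getValue() == Threading.NEW_DOMAIN || current < 0) {
                current++;
            }
            domains.put(e.getKey(), current);
        }
        return domains;
    }
}
```

For the pipeline audio (NEW_DOMAIN), mfcc (SAME_DOMAIN), httpPost (NEW_DOMAIN), this rule places audio and mfcc in one domain and httpPost in a second, so only the mfcc-to-httpPost link would need a shared queue.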

11 Type System
Goal: promote component reuse across MSAs
A rich type system that extends Java's type system
– most components use generic type systems
– insight: frame sizes are essential in configuring components
  detect configuration errors / optimization opportunities
    VectorC energyT = TypeC.newFloatVector();
    energyT.addConstraint(Constraint.GT(8000));
    energyT.addConstraint(Constraint.LT(24000));
    VectorC speechT = TypeC.newFloatVector(128);
    VectorC featureT = TypeC.newFloatVector(11);
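
The idea of a vector type whose frame size is either fixed or bounded by constraints can be sketched with plain predicates. `SketchVectorType` and its methods are hypothetical and only mirror the shape of the slide's `VectorC` and `Constraint` calls; they are not the CSense classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical vector type whose frame size is fixed or constrained.
final class SketchVectorType {
    private final List<IntPredicate> constraints = new ArrayList<>();

    // A type with exactly one admissible frame size.
    static SketchVectorType ofSize(int n) {
        SketchVectorType t = new SketchVectorType();
        t.constraints.add(size -> size == n);
        return t;
    }

    // Require the frame size to be strictly greater than the bound.
    SketchVectorType greaterThan(int bound) {
        constraints.add(size -> size > bound);
        return this;
    }

    // Require the frame size to be strictly less than the bound.
    SketchVectorType lessThan(int bound) {
        constraints.add(size -> size < bound);
        return this;
    }

    // True when the candidate frame size satisfies every constraint.
    boolean accepts(int size) {
        for (IntPredicate c : constraints) {
            if (!c.test(size)) {
                return false;
            }
        }
        return true;
    }
}
```

With `new SketchVectorType().greaterThan(8000).lessThan(24000)` standing in for `energyT`, a candidate size of 10,000 is accepted and 8,000 is rejected, which is the kind of configuration check the slide refers to.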

12 Flow Analysis
Not all configurations may be implemented efficiently
Constraints: energyT > 8000, energyT < 24000, speechT = 128, featureT = 11
             energyT            speechT
Inefficient  10,000             128
Efficient    10,240 (128 * 80)  128

13 Flow Analysis
Not all configurations may be implemented efficiently
Constraints: energyT > 8000, energyT < 24000, speechT = 128, featureT = 11
             energyT            speechT
Inefficient  10,000             128
Efficient    10,240 (128 * 80)  128
M_rms = 1, M_mfcc = 80
An efficient implementation exists when M_rms * energyT = M_mfcc * speechT
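
The condition M_rms * energyT = M_mfcc * speechT can be checked with simple arithmetic when M_rms is fixed at 1, as in the slide's example: the conversion is efficient exactly when speechT divides energyT. The helper below is a sketch under that assumption; `FrameConversionSketch.multiplier` is a hypothetical name.

```java
final class FrameConversionSketch {
    // Returns the multiplier M such that energySize == M * speechSize,
    // or -1 when no whole-number multiplier exists (M_rms fixed at 1).
    static int multiplier(int energySize, int speechSize) {
        if (energySize <= 0 || speechSize <= 0) {
            throw new IllegalArgumentException("frame sizes must be positive");
        }
        return energySize % speechSize == 0 ? energySize / speechSize : -1;
    }
}
```

For the slide's two rows, `multiplier(10240, 128)` returns 80, while `multiplier(10000, 128)` returns -1 because 10,000 = 78 * 128 + 16 leaves a remainder.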

14 Flow Analysis
Goal: determine which configurations have efficient frame conversions
Problem may be formulated as an integer linear program
– constraints: generated from type constraints
– optimization: minimize total memory usage
– solution: specifies frame sizes and multipliers for the application
An efficient frame conversion may not exist
– the compiler relaxes conversion rules
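
For a single pair of types, the optimization can be imitated by exhaustive search instead of an integer linear program. The sketch below looks for the smallest frame size strictly between two bounds that is a whole multiple of a fixed downstream frame size. It is a toy substitute for the ILP, not the compiler's method, and `FrameSizeSearchSketch` is a hypothetical name.

```java
final class FrameSizeSearchSketch {
    // Smallest size with lower < size < upper that is a whole multiple of unit,
    // or -1 if no such size exists.
    static int smallestFeasible(int lower, int upper, int unit) {
        if (unit <= 0) {
            throw new IllegalArgumentException("unit must be positive");
        }
        for (int size = lower + 1; size < upper; size++) {
            if (size % unit == 0) {
                return size;
            }
        }
        return -1;
    }
}
```

Taking only the bounds 8000 and 24000 and a unit of 128, the search returns 8,064 (128 * 63). The 10,240 used on the earlier slide is another feasible point; a whole-application formulation may carry constraints from other components that this one-pair toy ignores.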

15 CSense Compiler
Static analysis:
– composition errors, memory usage errors, race conditions
Flow analysis:
– whole-application configuration and optimization
Stream Flow Graph transformations:
– domain partitioning, type conversions, MATLAB component coalescing
Code generation:
– Android application/service, MATLAB (C code + JNI stubs)

16 CSense Runtime
Components exchange data using push/pull semantics
Runtime includes a scheduler for each domain
– task queue + event queue
– wake lock for power management
Diagram labels: Scheduler1 (Task Queue, Event Queue), Scheduler2 (Task Queue, Event Queue), Memory Pool

17 Evaluation
Micro benchmarks evaluate the runtime performance
– synchronization primitives + memory management
Implemented MSAs using CSense
– Speaker identification
– Activity recognition
– Audiology application
Setup
– Galaxy Nexus, TI OMAP 4460 ARM A9 @ 1.2 GHz, 1 GB RAM
– Android 4.2
– MATLAB 2012b and MATLAB Coder 2.3

18 Producer-Consumer Benchmark
Scheduler: memory management + synchronization primitives
Memory management options
– GC: garbage collection
– MP: memory pool
Concurrent access to queues and memory pools
– L: Java reentrant lock
– C: CSense atomic variable based synchronization primitives
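
The slides do not show CSense's atomic-variable primitives. As a generic illustration of the technique, the sketch below is a single-producer, single-consumer ring buffer coordinated only through two `AtomicLong` indices, with no lock object. The class name `SpscRingSketch` is hypothetical, and the code is safe only when exactly one thread calls `offer` and exactly one thread calls `poll`.

```java
import java.util.concurrent.atomic.AtomicLong;

// Single-producer, single-consumer ring buffer using atomic indices instead of a lock.
final class SpscRingSketch<T> {
    private final Object[] slots;
    private final AtomicLong head = new AtomicLong(); // next slot to read
    private final AtomicLong tail = new AtomicLong(); // next slot to write

    SpscRingSketch(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        slots = new Object[capacity];
    }

    // Producer thread only: returns false when the ring is full.
    boolean offer(T item) {
        long t = tail.get();
        if (t - head.get() == slots.length) {
            return false;
        }
        slots[(int) (t % slots.length)] = item;
        tail.set(t + 1); // volatile write publishes the slot to the consumer
        return true;
    }

    // Consumer thread only: returns null when the ring is empty.
    @SuppressWarnings("unchecked")
    T poll() {
        long h = head.get();
        if (h == tail.get()) {
            return null;
        }
        int i = (int) (h % slots.length);
        T item = (T) slots[i];
        slots[i] = null;
        head.set(h + 1); // volatile write frees the slot for the producer
        return item;
    }
}
```

Neither operation allocates or blocks, which is the general reason atomic-variable designs can avoid the allocation and contention costs the benchmark attributes to reentrant locks.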

19 Producer-Consumer Throughput
Garbage collection overhead limits scalability
Concurrency primitives have a significant impact on performance
Chart annotations: 30%, 13.8x, 19x

20 Producer-Consumer GC Overhead
Reentrant locks incur GC due to implicit allocations
CSense runtime has low garbage collection overhead
Chart annotation: no garbage collection (in this benchmark)

21 MFCC Benchmark
Benefits of flow analysis
Runtime overhead

22 MFCC Benchmark CPU Usage
Flow analysis eliminates unnecessary memory copies
Benefits of larger but efficient frame allocations
– reduced number of component invocations and disk I/O overhead
– increased cache locality
Chart annotation: 45% decrease

23 MFCC Runtime Overhead
Runtime overhead is low for a wide range of data rates
Chart annotations: 1.83%, 2.39%

24 Conclusions
Programming model
– efficient memory management
– flexible concurrency model
– rich type system
Compiler
– whole-application configuration & optimization
– static and flow analyses
Efficient runtime environment
Evaluation
– implemented three typical MSAs
– benchmarks indicate significant performance improvements
  19x throughput boost compared with a naïve Java baseline
  45% reduction in CPU time with flow analysis
  low garbage collection overhead

25 Acknowledgements
National Science Foundation (NeTs grant #1144664)
Carver Foundation (grant #14-43555)

26 ActiSense Benchmark
Runtime scheduler overhead of a complex 6-domain application that accesses both phone sensors and remote Shimmer motes over Bluetooth

28 ActiSense CPU Usage
Overall domain scheduler overhead is small despite a longer pipeline
Chart annotations: 50 Hz, 60 Hz

29 AudioSense

30 AudioSense

