IBM InfoSphere Streams
Enabling a smarter planet
Roger Rea, InfoSphere Streams Product Manager
Sept 15, 2010
© 2010 IBM Corporation

Moore’s Law drives new waves of technology …
[Figure: Technology Waves, "Welcome to the Decade of Smart". Vertical axis: Billions of Units Shipped. Labeled waves: S/360, IBM PC, World Wide Web, The “Internet of Things”, Multicore Chips, Embedded Chips. Source: IDC, SSR and IBM Market Insights]

Time is ripe for a new era of computing in support of Big Data
Emerging trends create need for new languages:
  Scientific programming: Fortran
  Business programming: COBOL
  Systems programming at higher level: C
  Increased productivity: C++
  Web programming: Java
  Streaming data sources and multicore architectures: Streams Processing Language

IBM InfoSphere Streams
Streaming analytic applications
  Multiple input streams
  Advanced streaming analytics
Eclipse-based IDE
  Define sources, apply operators, define intermediary and final output sinks
  User-defined operators in Java or C++
Optimizing compiler automates deployment and connections
Extremely low latency
Cluster of up to 125 nodes
[Diagram labels: InfoSphere Streams Studio (IDE for Streams Processing Language); Source Adapters; Sink Adapters; Operator Repository; Automated, Optimized Deploy and Management (Scheduler)]
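
The "define sources, apply operators, define sinks" flow on this slide can be sketched with plain Python generators. This is only an illustration of the dataflow idea: it is not Streams Processing Language and uses no InfoSphere Streams API. The names `source`, `filter_op`, `running_sum_op`, `sink`, and `run_pipeline` are hypothetical and do not come from the product.

```python
from typing import Callable, Iterable, Iterator, List


def source(readings: Iterable[float]) -> Iterator[float]:
    """Hypothetical source adapter: emits one value (tuple) at a time."""
    for value in readings:
        yield value


def filter_op(stream: Iterable[float],
              predicate: Callable[[float], bool]) -> Iterator[float]:
    """Hypothetical operator: forwards only values that satisfy the predicate."""
    for value in stream:
        if predicate(value):
            yield value


def running_sum_op(stream: Iterable[float]) -> Iterator[float]:
    """Hypothetical operator: emits the running total after each input value."""
    total = 0.0
    for value in stream:
        total += value
        yield total


def sink(stream: Iterable[float]) -> List[float]:
    """Hypothetical sink adapter: collects the final output stream."""
    return list(stream)


def run_pipeline(readings: Iterable[float]) -> List[float]:
    """Wire source -> filter -> running sum -> sink, one value at a time."""
    non_negative = filter_op(source(readings), lambda v: v >= 0)
    return sink(running_sum_op(non_negative))
```

Because each stage is a generator, a value flows through the whole chain before the next one is read, which loosely mirrors continuous stream processing rather than batch processing.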

Scalable stream processing
InfoSphere Streams provides:
  A programming model and IDE for defining data sources and software analytic modules called operators, which are fused into process execution units (PEs)
  Infrastructure to support the composition of scalable stream processing applications from these components
  Deployment and operation of these applications across distributed x86 processing nodes, when scaled processing is required
  Stream connectivity between data sources and PEs of a stream processing application

Streams offers tremendous deployment flexibility
With only a simple re-compile of the application:
  All on one machine, fused into one multi-threaded process
  All on one machine, each operator in its own process
  Each operator in its own process, each process on its own machine
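
The three deployment options above can be pictured as three placements of the same operator list. The sketch below is a toy model, not the Streams compiler or scheduler: the function names, the mode strings, and the `hostN`/`peN` labels are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Maps an operator name to the (host, process) pair it runs in.
Placement = Dict[str, Tuple[str, str]]


def plan_deployment(operators: List[str], mode: str) -> Placement:
    """Return a hypothetical placement for one of the three slide options."""
    if mode == "fused":
        # All operators on one machine, fused into one process.
        return {op: ("host0", "pe0") for op in operators}
    if mode == "process_per_operator":
        # All operators on one machine, each in its own process.
        return {op: ("host0", f"pe{i}") for i, op in enumerate(operators)}
    if mode == "host_per_operator":
        # Each operator in its own process, each process on its own machine.
        return {op: (f"host{i}", f"pe{i}") for i, op in enumerate(operators)}
    raise ValueError(f"unknown deployment mode: {mode}")


def summarize(placement: Placement) -> Tuple[int, int]:
    """Count the distinct hosts and distinct processes used by a placement."""
    hosts = {host for host, _ in placement.values()}
    processes = set(placement.values())
    return len(hosts), len(processes)
```

The operator list is identical in all three cases; only the mode argument changes, which is the point the slide makes about recompiling rather than rewriting.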

ANISE: Active Network for Information from Synchrotron Experiments
High speed network to process data from synchrotrons in Canada and US using the CANARIE network
[Diagram labels: Canadian Light Source, Canada; Argonne Lab., US; Stream Computing]

TerraEchos Adelos™ – Covert Intrusion Detection
State-of-the-art covert surveillance based on Streams platform
Acoustic signals from buried fiber optic cables are monitored, analyzed and reported in real time to locate intruders
Currently designed to scale up to 1600 streams of raw binary data

Forecasting Space Weather at LOFAR Outrigger in Scandinavia (LOIS)
Triaxial Antenna
InfoSphere Streams
  Radio signal input and data preparation
  Signal detection and noise filtering
  Strength and 3D directional analysis
Swedish Institute of Space Physics
Solar Flares
Space Weather prediction regarding impact on satellites and electric grids
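
As a rough picture of what "noise filtering" followed by "strength and 3D directional analysis" could mean for a triaxial sample, the toy sketch below computes the magnitude and unit direction of one (x, y, z) reading and drops readings at or below a noise floor. The actual LOIS analytics are not described in this deck; the function name `analyze_sample` and the simple thresholding rule are assumptions for illustration only.

```python
import math
from typing import Optional, Tuple

Vector = Tuple[float, float, float]


def analyze_sample(sample: Vector,
                   noise_floor: float) -> Optional[Tuple[float, Vector]]:
    """Return (strength, unit direction), or None if the sample is noise.

    Assumption: a sample counts as noise when its magnitude is at or
    below noise_floor. Real signal detection would be far more involved.
    """
    x, y, z = sample
    strength = math.sqrt(x * x + y * y + z * z)
    if strength <= noise_floor:
        return None
    direction = (x / strength, y / strength, z / strength)
    return strength, direction
```

For example, `analyze_sample((3.0, 4.0, 0.0), noise_floor=1.0)` yields a strength of 5.0 and a direction lying in the x-y plane.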

Real Time Marine Mammal Position and Behavior Modeling
Analytics & Sensors
Advanced Acoustical Analytics
InfoSphere Streams
  Filter wind & wave noise
  Model Marine Mammal environment
  Correlate to Galway Bay ecosystem

What are key advantages of Streams?
Compiling groups of operators into single processes enables:
  Efficient use of cores
  Distributed execution
  Very fast data exchange
  Can be automatic or tuned
  Can be scaled with the push of a button
Language built for Streaming applications:
  Reusable operators
  Rapid application development
  Continuous “pipeline” processing
Extremely flexible and high performance transport:
  Very low latency
  High data rates
Easy to extend:
  Built in adaptors
  Extend with C++ and Java
  Extend running applications
Use the data that gives you a competitive advantage:
  Can handle virtually any data type
  Use data that is too expensive and time sensitive for other approaches

QUESTIONS?