Preparing for the Internet of Things 50 Trillion Gigabyte Challenge

Presentation transcript:

1 Preparing for the Internet of Things 50 Trillion Gigabyte Challenge
Pat McGarry, Ryft Systems, Inc.

We are in the midst of the biggest challenge to IT since the emergence of the Internet, as businesses scramble to mine actionable insights from everything around us – phones, cameras, cars, appliances, accessories, and industrial sensors. As the world marches toward the 50-trillion-gigabyte data mark, the gap between the data that can be collected and the data that can be analyzed in time to be useful grows wider by the day. When it comes to carving out your competitive advantage with the IoT, you can never be too fast or too efficient. In today's keynote presentation, I will provide practical insights into the unique challenges associated with IoT data and why it presents so many hurdles to conventional enterprise and cloud infrastructures. I will also cover the new heterogeneous computing architectures that are enabling companies to modernize their information infrastructure and overcome roadblocks to IoT success.

2 The IoT 50 Trillion GB Challenge: The Largest Opportunity & Threat Since the Internet
Internet of Things (IoT) sensors can collect huge amounts of data, and one of the key challenges is deciding how to manage it all. Even though it is possible to collect immense amounts of data, doing so is not always ideal because of the unnecessary data management costs incurred. A more economical approach to IoT sensor data management is to employ local, embedded data analysis to filter very large data samples and send alerts only on a significant change in the data. SOURCE: WIKIBON BIG DATA VENDOR REVENUE & MARKET FORECAST
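That local-filtering pattern is straightforward to sketch. The Python toy below is an illustrative assumption, not Ryft's implementation: read_sensor(), send_alert(), and the 5% threshold are invented names and values. The device keeps the last value it reported and transmits only when a new reading deviates significantly:

# Edge-filtering sketch: report a reading upstream only when it changes
# significantly, instead of streaming every raw sample off the device.
# read_sensor(), send_alert(), and the 5% threshold are illustrative.

import random
import time

SIGNIFICANT_CHANGE = 0.05  # report when a reading drifts more than 5%

def read_sensor() -> float:
    """Stand-in for a real sensor driver; returns a temperature in Celsius."""
    return 20.0 + random.uniform(-1.5, 1.5)

def send_alert(value: float) -> None:
    """Stand-in for the uplink (MQTT publish, HTTP POST, etc.)."""
    print(f"ALERT: reading changed to {value:.2f}")

def run(poll_seconds: float = 0.5, samples: int = 20) -> None:
    last_reported = read_sensor()
    send_alert(last_reported)  # establish a baseline upstream
    for _ in range(samples):
        time.sleep(poll_seconds)
        reading = read_sensor()
        # Filter locally: only a significant relative change leaves the device.
        if abs(reading - last_reported) / abs(last_reported) > SIGNIFICANT_CHANGE:
            send_alert(reading)
            last_reported = reading

if __name__ == "__main__":
    run()

The point of the pattern is that raw samples never leave the device; only state changes consume bandwidth and downstream storage.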

3 Data Dynamics: Critical Differences in IoT Data
What You Need to Know About IoT Data and Its Impact on Information Infrastructure

Variety: an explosion of types and formats
Structure: unstructured and messy
Volume: too much for most platforms to analyze
Velocity: fast and furious
Value: expires quickly
Location: widely distributed

1. Variety: IoT Creates an Explosion of Data Types and Formats
Traditionally, IoT data has been associated with monitoring and collecting data from devices. However, organizations are looking to do more than simply monitor: they see significant value in getting a clearer view of markets and buyers to boost profits, enhance customer service, and improve employee productivity. IoT devices such as mobile phones, sensors, RFID tags, and video cameras produce a wide variety of data, from structured to unstructured, small to large file sizes, text and images. Today's analytics platforms were not designed to analyze these different types and formats of data together to produce relevant insights in real time, especially at the network edge where the data is gathered and used. To fit the strict parameters of conventional analytics tools, the data must be transformed prior to analysis, yet the ETL and data preparation process slows down the data pipeline, creating bottlenecks and delaying the insights needed to act in time.

2. Velocity: IoT Data Comes at You Fast and Furious
Unlike structured data, typically stored by enterprises in a centralized data warehouse, IoT data is generated by a sprawl of different devices in any number of locations, stationary and mobile alike. Finding value in IoT data can be like panning for gold in the ocean. Connected devices stream massive amounts of data very quickly, and extracting insights from IoT feeds usually requires some type of correlation between that streaming data and historical data stored in a centralized data warehouse or lake (a toy version of this correlation is sketched after this slide). The challenge with dynamic, poorly structured data is that legacy relational database systems have a difficult time keeping up with it, and a variety of NoSQL approaches have similar problems with semi-structured and structured data, especially at the network's edge. The transformation or indexing needed to even begin analyzing the data can take hours, days, or even weeks. This is a serious problem for contemporary analytics products: aside from the obvious network and transport inefficiencies, the correlations required to gain critical insights from streaming and historical data often arrive too late, or not at all.

3. Location: Data Sources Are Widely Distributed
Before we ever dreamed of calling it the Internet of Things or the Internet of Everything, businesses were deploying remote sensors in the wild, from retail outlets to remote oil rigs to planes, trains, and automobiles. Billions of devices, sensors, and networks are connected to the Internet, creating and receiving data around the clock. ABI Research estimated that the volume of data captured by IoT-connected devices exceeded 200 exabytes in 2014 and will grow to 1.6 zettabytes by 2020. ABI also estimates that more than 90% of IoT-generated data is stored or processed locally, rendering it inaccessible for real-time analytics. This translates to tremendous amounts of data that must be transferred to another location for storage and analysis. As the number of devices expands and the volume of data increases, the costs of data transport and data storage can quickly become prohibitive.
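To make that streaming-versus-historical correlation concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption: the in-memory dict stands in for a data warehouse or lake, and the 3-sigma test is one simple way to flag a live reading that is out of line with history, not anything specific to Ryft.

# Correlate live IoT events against per-device historical baselines.
# The dict plays the role of the warehouse/lake; 3 sigma is illustrative.

from statistics import mean, stdev

historical = {  # device_id -> past readings, assumed already collected
    "sensor-1": [20.1, 19.8, 20.3, 20.0, 19.9],
    "sensor-2": [75.2, 74.9, 75.5, 75.1, 75.0],
}

def is_anomalous(event: dict) -> bool:
    """True if the live value deviates more than 3 sigma from history."""
    past = historical.get(event["device_id"])
    if not past or len(past) < 2:
        return False  # no baseline yet; nothing to correlate against
    mu, sigma = mean(past), stdev(past)
    return sigma > 0 and abs(event["value"] - mu) > 3 * sigma

stream = [
    {"device_id": "sensor-1", "value": 20.2},   # consistent with history
    {"device_id": "sensor-2", "value": 82.7},   # out of line with history
]

for event in stream:
    if is_anomalous(event):
        print(f"anomaly on {event['device_id']}: {event['value']}")

The slide's complaint is precisely that, at IoT scale, populating and indexing the historical side of this comparison is what takes hours to weeks.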

4 Common Barriers to IoT’s Popular Use Cases
WHAT ENTERPRISES NEED TO THRIVE:
- Real-time insights as events occur, close to the source of data
- Advanced-scale performance & storage to analyze data from a variety of IoT devices
- Compact & efficient infrastructure
- Easy to deploy, use & maintain ecosystems
- Minimal disruption to existing ecosystems
- Low operational costs
- No security or performance trade-offs

WHAT ENTERPRISES HAVE TODAY:
- Analysis slowed by data ETL & movement
- Persistent compute, I/O & storage bottlenecks
- Data types that must be analyzed in silos
- Sprawling, inefficient analytics infrastructures
- Frequent software ecosystem updates
- Persistent data privacy & security issues

5 The Heart of Popular IoT Use Cases
Real-time use cases at the heart of IoT: Image Recognition, Fraud Detection, Biometric Recognition, Voice Recognition, Behavior Monitoring, Optical Character Recognition, Similarity Search, Financial Compliance, Malicious Pattern Matching, Cyber Security

Persistent Barriers to IoT Success: Complex, Often Impossible, Analytic Functions
- Analytics functions, like fuzzy search, often require more or different computing power than is available in today's analytics infrastructures
- Traditional analytics ecosystems require massive indexes and data preparation functions
- The combination of data preparation time and analysis limitations does not allow for real-time analytics that capture all relevant insights

Indexing Isn't the Solution. It's the Problem. Why Cloud Is Necessary but Not Sufficient.
Any search engine can match a query string to data stored in a database, but only if the match is exact. What happens when there are variations or differences in either one? Even more challenging: what if there are different errors in both? (A toy fuzzy-search sketch follows this slide.)

Efforts under the "Big Data" theme have solved many IoT analytics challenges, especially the systems challenges of large-scale data management, learning, and data visualization. Big Data, however, came mostly from computer-based systems (e.g., transaction logs, system logs, social networks, and mobile phones). IoT data, in contrast, will come from the natural world and will be more detailed, fuzzier, and larger. The nature of the data, the assumptions, and the use cases all differ between old Big Data and new IoT data. IoT analytics designers can build on top of Big Data, yet the work is far from done.
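To see why exact matching breaks down, here is a toy fuzzy search built on Levenshtein edit distance. The records, the query, and the 2-edit budget are invented for the example; a production fuzzy-search engine is far more involved, but the core idea is the same:

# Exact lookup finds nothing when either side contains errors;
# an edit-distance test within a small budget still matches.

def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

records = ["Jon Smith", "John Smyth", "Joan Smithe", "Jane Doe"]
query = "John Smith"  # note: no stored record matches this exactly

print([r for r in records if r == query])                  # [] -- exact match fails
print([r for r in records if levenshtein(r, query) <= 2])  # first three records

This is the workload behind the slide's indexing complaint: a conventional exact-match index cannot anticipate which variations will arrive, so engines fall back on heavy pre-processing or brute-force scans.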

6 Thriving in the IoT Era: Fast Data Analysis Powered by New Hybrid FPGA/x86 Compute Architectures
"Systems built on GPUs and FPGAs will function more like human brains that are particularly suited to be applied to deep learning and other pattern-matching algorithms that smart machines use. FPGA-based architecture will allow further distribution of algorithms into smaller form factors, with considerably less electrical power in the device mesh, thus allowing advanced machine learning capabilities to be proliferated into the tiniest IoT endpoints, such as homes, cars, wristwatches and even human beings." — David Cearley, Gartner

Related headlines: "Intel's $16.7 Billion Altera Deal Is Fueled by Data Centers." "Microsoft Supercharges Bing Search with Programmable Chips."

7 Hybrid Compute: The Right Engine for the Job
CPU:
- General-purpose computing
- Sequential in nature
- Nondeterministic performance (interrupts, memory allocation)
- Problems are broken into sequential operations and processed serially

GPU:
- Some general-purpose computing
- Excels at mathematically complex algorithms: image rendering, some image analysis
- Generally more parallel than CPUs, since GPUs have more cores
- Generally more power-efficient than CPUs

FPGA:
- Not general purpose: runs purpose-built algorithms, reprogrammable via firmware
- Data analysis: search, fuzzy search, image and video analysis, deep learning
- Inherently parallel: can execute many hardware-parallel operations in one clock cycle
- More output with less power; can complete the same problem at 100X the performance of an x86 CPU
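The difference in execution model can be sketched in software, with the caveat that this is only a loose analogy: NumPy's vectorized compare below stands in for the FPGA's hardware-parallel compare, and nothing in this toy demonstrates the 100X figure, which is the slide's claim.

# Serial (CPU-style) scan vs. a data-parallel compare over all window
# positions at once (a software analogy for FPGA parallelism).
# Requires NumPy.

import numpy as np

data = np.frombuffer(b"sensor log: ERROR at 10:42, ERROR at 10:43", dtype=np.uint8)
pattern = np.frombuffer(b"ERROR", dtype=np.uint8)

def serial_find(data, pattern):
    """One position after another, one byte after another."""
    hits = []
    for i in range(len(data) - len(pattern) + 1):
        if all(data[i + j] == pattern[j] for j in range(len(pattern))):
            hits.append(i)
    return hits

def parallel_find(data, pattern):
    """Compare every window position in a single vectorized step."""
    windows = np.lib.stride_tricks.sliding_window_view(data, len(pattern))
    return np.flatnonzero((windows == pattern).all(axis=1)).tolist()

print(serial_find(data, pattern))    # [12, 28]
print(parallel_find(data, pattern))  # [12, 28]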

8 Requirements for Success: Compute-agnostic API
[Diagram: closed stacks and open-source frameworks on commodity hardware vs. an open, compute-agnostic API spanning CPU, FPGA & GPU engines]

Closed analytics systems are expensive, hard to use, and require huge teams to implement. Open-source frameworks are easier to use, but their performance is limited by the commodity x86 servers they run on. Organizations have been forced to sacrifice performance or simplicity.
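One way to picture a compute-agnostic API is as a thin dispatch layer: the caller names an operation once, and the library binds it to whichever engine is present. The Python sketch below is entirely hypothetical; OpenAPI, register(), and run() are invented names, not Ryft's actual interface.

# Hypothetical compute-agnostic dispatch: the same "search" call runs on
# an FPGA, GPU, or CPU backend, whichever is registered and preferred.

from typing import Callable, Dict, Iterable

class OpenAPI:
    def __init__(self) -> None:
        # engine name -> {operation name -> implementation}
        self._backends: Dict[str, Dict[str, Callable]] = {}

    def register(self, engine: str, op: str, fn: Callable) -> None:
        self._backends.setdefault(engine, {})[op] = fn

    def run(self, op: str, *args, prefer: Iterable[str] = ("fpga", "gpu", "cpu")):
        # Try the fastest available engine first, fall back gracefully.
        for engine in prefer:
            fn = self._backends.get(engine, {}).get(op)
            if fn is not None:
                return fn(*args)
        raise RuntimeError(f"no registered engine provides {op!r}")

api = OpenAPI()
# Only a CPU implementation exists in this sketch; an FPGA-equipped node
# would register its accelerated version under the same operation name.
api.register("cpu", "search", lambda data, term: [d for d in data if term in d])

print(api.run("search", ["alpha", "beta", "alphabet"], "alpha"))
# -> ['alpha', 'alphabet']

The design point matches the slide: applications keep one open interface, and performance scales with whatever engines sit underneath, instead of being welded to a closed system or to commodity x86.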

9 The Future Is Intelligence at the Network Edge
Find the right data – even when it's incomplete – whenever & wherever you need it.

[Diagram: analytics pushed down to the EDGE NODE] Processing at the edge allows for real-time, localized feedback in an efficient manner and reserves wireless bandwidth for sensor fusion applications.

- Accelerate Time-to-Results: Ensure complete, accurate information can be found instantly – no matter the data type or language – without special tuning
- Reduce Time & Effort to Make Data Usable: Free domain experts from manually fixing problematic data and allow them to focus on efforts that add business value
- Quick Start: Search functionality is seamlessly integrated – without impacting the performance of existing applications – and can be up and running in a matter of days
- Instant Feedback: Real-time indexing and updates ensure match requests surface the most up-to-date information (a toy incremental-index sketch follows)
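"Instant feedback" implies an index that absorbs each record the moment it arrives rather than in periodic batch rebuilds. Below is a minimal sketch of that idea using an in-memory inverted index; EdgeIndex and its methods are invented names for illustration, not a product API.

# Real-time indexing sketch: each record is folded into the index on
# arrival, so a query issued immediately afterwards already sees it.

from collections import defaultdict
from typing import Dict, List, Set

class EdgeIndex:
    def __init__(self) -> None:
        self._postings: Dict[str, Set[int]] = defaultdict(set)
        self._records: List[str] = []

    def add(self, record: str) -> None:
        """Index a record the moment it arrives -- no batch rebuild."""
        doc_id = len(self._records)
        self._records.append(record)
        for token in record.lower().split():
            self._postings[token].add(doc_id)

    def query(self, token: str) -> List[str]:
        return [self._records[i] for i in sorted(self._postings[token.lower()])]

index = EdgeIndex()
index.add("pump 7 vibration high")
print(index.query("vibration"))  # the record is visible immediately
index.add("pump 7 vibration normal")
print(index.query("vibration"))  # and so is the update, with no re-index step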

10 Questions? Visit the Ryft booth at IoT SLAM. Pat McGarry, Ryft Systems, Inc.

