SAN DIEGO SUPERCOMPUTER CENTER, UCSD DataTurbine for science Paul Hubbard June 5 2008 Cyberinfrastructure Lab for Environmental Observing.

Presentation transcript:

SAN DIEGO SUPERCOMPUTER CENTER, UCSD
DataTurbine for science
Paul Hubbard, June 5, 2008
Cyberinfrastructure Lab for Environmental Observing Systems
Science R&D, SDSC/UCSD

Outline
- A bit about streaming data
- Introduction to DataTurbine
- Science motivations
- Where to use DataTurbine
- Where not to use DataTurbine
- Example applications and related topics
- Can I use it?
Please ask questions at any time!

A bit about streaming data
TCP/IP makes sending data from point A to point B easy. What’s the big deal?
- Add more listeners
- Add the ability to have listeners subscribe to different streams
- What happens to the data if and when the network fails?
- What happens if one listener is slower than the others?
The canonical answer to the general problem is a ‘ring buffer’.
Multicast solves some but not all of this.
Mainly useful for distributed systems...

Science motivations for DataTurbine
You have...
- Collaborators spread out, or remote sites
You need...
- To see data as soon as possible
- To integrate equipment from different vendors
You want...
- To integrate video or photographs with your numeric data
- The ability to collaboratively annotate the experiment as it happens
- To help us debug our audio support
- To play around with live data + Google Earth

Why do you want a ring buffer?
- Ring buffers (a.k.a. circular buffers) keep a finite history for each stream
- Accommodate slow clients, post-processing, replay
- The buffers can be stored on disk and therefore be arbitrarily large
- Programs can subscribe to streams; very efficient
There’s no infinite buffer, so circular is the best you can do. Make each buffer as long as you want; disk is cheap!
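The ring-buffer idea can be sketched in a few lines. This is a minimal in-memory illustration in Python, not DataTurbine's implementation or API; the class names `RingBuffer` and `Subscriber` and the sequence-number cursor are assumptions made for the example. It shows a finite history per stream, and how a slow client simply loses the oldest frames instead of stalling everyone else.

```python
from collections import deque


class RingBuffer:
    """Fixed-capacity history for one stream; the oldest frames are overwritten."""

    def __init__(self, capacity):
        self.frames = deque(maxlen=capacity)  # deque drops the oldest item when full
        self.next_seq = 0                     # sequence number of the next frame written

    def put(self, timestamp, value):
        self.frames.append((self.next_seq, timestamp, value))
        self.next_seq += 1

    def read_from(self, seq):
        """Return (frames with sequence >= seq, count of frames already overwritten)."""
        oldest = self.frames[0][0] if self.frames else self.next_seq
        lost = max(0, oldest - seq)
        return [f for f in self.frames if f[0] >= seq], lost


class Subscriber:
    """Hypothetical listener: each one keeps its own cursor into the ring."""

    def __init__(self, ring):
        self.ring = ring
        self.cursor = 0

    def poll(self):
        frames, lost = self.ring.read_from(self.cursor)
        self.cursor = self.ring.next_seq
        return frames, lost


ring = RingBuffer(capacity=4)
fast, slow = Subscriber(ring), Subscriber(ring)

for t in range(3):
    ring.put(timestamp=float(t), value=t * 10)
fast_frames, fast_lost = fast.poll()   # fast client keeps up: 3 frames, none lost

for t in range(3, 9):
    ring.put(timestamp=float(t), value=t * 10)
slow_frames, slow_lost = slow.poll()   # slow client gets the newest 4 frames; 5 were overwritten
```

Making `capacity` larger (or backing the ring with disk, as the slide suggests) widens the window a slow or replaying client can still read.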

When to use something else
- Low-availability network connections
  E.g. MBARI ocean buoys that dial up over satellite networks. We have no existing mechanism for this sort of batch transfer; JMS would probably be better.
- Simple topology, single vendor
  Point A -> Point B -> files -> MATLAB is often a complete solution
- Signal-limited data
  If you absolutely need every data point, a transactional system (database, JMS) provides stronger guarantees.

What is DataTurbine?
- DataTurbine is an open source, Java-based network ring buffer for all sorts of data.
- You can use memory + disk for the ring, and it runs on almost any JVM.
- Started life as a NASA telemetry project
The basic division of work looks like this:

‘A series of tubes’
The web is very similar: browser, webserver and back-end data from a variety of sources.

A complex example

Sounds complex. Is it fast?
Yes! This is a MacBook Pro to T2000 over gigabit: ~30MB/sec from MATLAB.
* We burst 65MB/sec on a single gigabit link
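To put those throughput figures in context, here is a rough back-of-the-envelope calculation. It assumes "MB" means decimal megabytes and ignores Ethernet and TCP/IP overhead, so the percentages are approximate lower bounds on link utilization, not measurements from the talk.

```python
LINK_BITS_PER_SEC = 1_000_000_000                      # gigabit Ethernet line rate
link_mb_per_sec = LINK_BITS_PER_SEC / 8 / 1_000_000    # 125 MB/sec before protocol overhead


def link_fraction(mb_per_sec):
    """Share of the raw gigabit line rate used by a given throughput."""
    return mb_per_sec / link_mb_per_sec


sustained = link_fraction(30)   # the ~30MB/sec MATLAB figure
burst = link_fraction(65)       # the 65MB/sec burst figure
print(f"sustained: {sustained:.0%} of line rate, burst: {burst:.0%} of line rate")
```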

More about DataTurbine
- Sources can have multiple channels with varied types: numeric (e.g. sensors), video, audio, text, binary blobs.
- We have a variety of sources and sinks: in-house, from the original author Creare, and also community contributed
- Can also use plugins for tightly-coupled computations such as image processing.
- Runs on J2ME, J2EE and 64-bit JVM as well. Extremely scalable.

Deployments of DataTurbine
I’ve included some screenshots of DataTurbine in use to give a flavor of current utilization. I’m hoping to give you some ideas of how you might use it yourself, so please ask or interrupt.

CLEOS/HPWREN deployment at Santa Margarita Ecological Reserve

Santa Margarita Ecological Reserve

NCHC Kenting (Taiwan)
Kenting National Park and Yuan-Yang Lake, pictures from Fang-Pang Lin and Ebbe Strandell

More Kenting

SUNY Buffalo - Earthquake engineering

Insight Racing - video
- DARPA autonomous vehicle competition
- Insight is using DataTurbine for their vehicle video in their Lotus
- North Carolina State University, using multiple Axis 206 cameras, 30fps each

Lehigh - 3D viewer

NASA Dryden Flight Center
- Intelligent Network Data Server (INDS)
- Fusion of DataTurbine, Google Earth and live telemetry
- Instruments flown on ER-2 (U2) and DC-8

One more NASA

You can also view data via the web

GWT-based web interface

System monitoring
“A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable.” --Lamport
- Baseline: Web-accessible system for monitoring hardware, software and network
- Better: Keep a history of up/down/metrics
- Best: Infer state from events and automatically act. E.g., “All services and ICMPs are down from X. I infer from this that the network has failed.”
We have ‘better,’ but not ‘best’: Inca, Monit, M/Monit
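The 'best' tier, inferring state from events, could start as a handful of rules. The sketch below is purely illustrative and is not part of Inca or Monit; the function name `infer_state`, the check names, and the returned labels are all assumptions. It encodes the slide's example: if every check for a host fails, including ICMP, suspect the network rather than each individual service.

```python
def infer_state(checks):
    """Infer a host's state from its latest checks.

    `checks` maps a check name (e.g. "icmp", "http") to True (up) or False (down).
    """
    if not checks:
        return "unknown"
    if all(checks.values()):
        return "healthy"
    if not any(checks.values()):
        # Nothing answers, not even ping: blame the path, not each service.
        return "network failure"
    if checks.get("icmp", False):
        # The host is reachable, so the failures are specific services.
        down = sorted(name for name, up in checks.items() if not up)
        return "service failure: " + ", ".join(down)
    # Some services answer but ping does not, e.g. ICMP is filtered.
    return "degraded"


print(infer_state({"icmp": False, "http": False, "ssh": False}))
print(infer_state({"icmp": True, "http": False, "ssh": True}))
```

An automated response (the "automatically act" part) would hang off the returned label, for example suppressing per-service alerts while the state is "network failure".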

Monit in action - basic state

M/Monit: time-series monitoring

Inca monitoring GLEON

Snippet of detailed Inca view

Can I use it? Current device support
- Data acquisition (DAQ)
  - National Instruments (NI-DAQ, DAQmx, CompactRIO) via Java proxy
  - Campbell Scientific: file-based, via LoggerNet, up to 1Hz tested
  - Dataq Instruments (serial connect via C & DaqToRbnb)
  - PUCK (Programmable Underwater Connector with Knowledge)
  - Seabird/Seacat
  - Vaisala weather station
- Video and still cameras
  - Anything with motion JPEG via URL: Axis, Panasonic, etc.
  - Still images via WebDAV, HttpMonitor
- Accelerometers
  - ADXL202 and Apple laptop

Can I use it? Software support
- Primary interface is via the Java-based API
- Other avenues
  - If you’re on Windows, there’s ActiveX
  - TCP/UDP proxy (some code required)
  - WebDAV/HTTP (not as fast, but cross-platform and very flexible)
  - Java-based proxy on TCP/IP - we use this a lot
- MATLAB API provided, including performance metrics and test suite.
- Geotagging and Google Earth...works now!

Is my data captive?
Definitely not:
- For permanent storage, we have tools to stream data to files on disk or to an SQL database.
- You can also load data from disk (numeric or video) back into DataTurbine
- You can also do real-time analysis from within MATLAB, reading and writing data from DataTurbine
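As a rough picture of what "stream data to an SQL database" and loading it back can look like, here is a self-contained sketch using Python's built-in `sqlite3`. It is not one of the DataTurbine tools mentioned above; the table layout, the function names `archive` and `load_range`, and the channel name `weather/temp_c` are assumptions for illustration, and the sample values are made up.

```python
import sqlite3


def archive(conn, channel, samples):
    """Store (timestamp, value) pairs for one channel; re-sent samples overwrite themselves."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        "channel TEXT, ts REAL, value REAL, PRIMARY KEY (channel, ts))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO samples VALUES (?, ?, ?)",
        [(channel, ts, value) for ts, value in samples],
    )
    conn.commit()


def load_range(conn, channel, start, end):
    """Read a time window back out, ordered by timestamp, ready to replay."""
    rows = conn.execute(
        "SELECT ts, value FROM samples "
        "WHERE channel = ? AND ts BETWEEN ? AND ? ORDER BY ts",
        (channel, start, end),
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")   # in-memory database, for the example only
archive(conn, "weather/temp_c", [(0.0, 21.5), (1.0, 21.7), (2.0, 21.6)])
archive(conn, "weather/temp_c", [(2.0, 21.6), (3.0, 21.9)])   # overlapping re-send is harmless
recent = load_range(conn, "weather/temp_c", 1.0, 3.0)
```

Keying on (channel, timestamp) is a design choice for the sketch: it makes archiving idempotent, which matters when a sink reconnects and replays part of the ring.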

Caveats, gotchas and limitations
- All data is timestamped, therefore clocks have to be synchronized, usually via NTP
- As previously explained, a modular distributed system is not appropriate for all research
- Basic host-based security; no encryption or per-user access control
  - No need so far; can be tunneled over VPN, VLAN or SSH
- Java is easiest, but not always possible.
- Flexible, modular => learning curve.

Upcoming developments
- Working on a VMware virtual machine based on Debian Linux that has DataTurbine, RDV, source programs, Inca, Tomcat, plugins, etc.
  - Download and run on Linux, Mac, Windows
- Evaluation using Tomcat/HTML as a GUI for command-line source programs
- Google Earth integration
  - Add lat/long/elevation metadata to sources
- Write server-side pages for graphs on demand
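For the Google Earth item, attaching lat/long/elevation to a source ultimately means emitting KML that Google Earth can open. The sketch below is an assumption about how that might look, not the project's code; the function names, site names and coordinates are placeholders invented for the example. The only fixed fact it relies on is the KML convention that coordinates are written as longitude,latitude,altitude.

```python
from xml.sax.saxutils import escape


def source_placemark(name, lat, lon, elevation_m):
    """Build one KML Placemark for a data source."""
    # KML orders coordinates as longitude,latitude,altitude.
    return (
        "<Placemark>"
        f"<name>{escape(name)}</name>"
        f"<Point><coordinates>{lon},{lat},{elevation_m}</coordinates></Point>"
        "</Placemark>"
    )


def kml_document(placemarks):
    """Wrap placemarks in a minimal KML 2.2 document."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>'
        + "".join(placemarks)
        + "</Document></kml>"
    )


# Placeholder sources with made-up coordinates.
kml = kml_document([
    source_placemark("Weather station A & B", lat=33.0, lon=-117.0, elevation_m=300.0),
    source_placemark("Camera 1", lat=33.1, lon=-117.2, elevation_m=310.0),
])
```

Writing `kml` to a `.kml` file gives something Google Earth can load directly; a live view would regenerate it as source metadata changes.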

Tomcat as a GUI

Motivations, or ‘Who pays the bills?’
- In summer 2007, the CLEOS group at SDSC won a two-year NSF award under the SDCI (Software Development for Cyberinfrastructure Improvement) program to work on DataTurbine.
- Also funded by other grants (Moore Foundation, GLEON, MoveBank)
- Our agenda is ‘software for science’... and we’re always interested in new uses of streaming data that move science forward.

Where to learn more?
- Main site is
- Mailing list, docs, FAQ, links to Subversion, code, etc.
- mailing list is the best place to ask
- My is

Quick demo if time (and interest)
- On laptop, run DataTurbine
    java -jar $RBNB_HOME/bin/rbnb.jar
- SMS-RBNB to measure and stream 3-axis accelerometer
    cd $SMS; ant run
- ISightToRbnb to stream onboard video camera
    java ISightToRbnb
- RDV to display and explore the data
    java -jar rdv.jar
SMS-RBNB is Mac-only, rest are for any platform

In case the demo barfs...slides on a plane!

Brainstorming uses