Metadata Management of Terabyte Datasets from an IP Backbone Network: Experience and Challenges Sue B. Moon and Timothy Roscoe.

5/25/2001 NRDM

Overview
  Sprint IP Monitoring Project
  Types of Data
  Types of Analysis
  Experience and Challenges
  Metadata Abstractions and Model
  Design and Implementation

Sprint IP Monitoring Project
  Design goal: acquire data without sampling and with sufficient accuracy.
  System components:
    – Linux PC with 3 PCI buses and 100 GB
    – DAG card with OC3 to OC48 support and GPS
    – SAN-based analysis platform
    – Data repository

Configuration at Monitored PoP
  [Figure: monitoring configuration at a PoP; surviving label: customer]

Analysis Platform and Data Repository at Sprint ATL

Types of Collected Data
  Packet trace of 50 to 100 GB
    – 44-byte packet header + 12-byte framing info per packet
  BGP routing tables
  IS-IS tables
  PoP configuration (topology)
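
The per-packet figures above imply a fixed 56-byte trace record, which gives a rough sense of how many packets a single trace holds. The sketch below works through that arithmetic. It assumes decimal gigabytes (10^9 bytes) and no per-file overhead, neither of which the slides state, and the names are illustrative only.

```python
# Per-packet record size, taken from the slide: 44-byte header + 12-byte framing.
HEADER_BYTES = 44
FRAMING_BYTES = 12
RECORD_BYTES = HEADER_BYTES + FRAMING_BYTES  # 56 bytes per captured packet

# Assumption: "GB" means 10**9 bytes; the slides do not say which convention is used.
GB = 10**9


def records_in_trace(trace_bytes: int) -> int:
    """Number of whole packet records that fit in a trace of the given size."""
    return trace_bytes // RECORD_BYTES


if __name__ == "__main__":
    for size_gb in (50, 100):
        count = records_in_trace(size_gb * GB)
        print(f"{size_gb} GB trace: about {count / 1e9:.2f} billion packet records")
```

Under these assumptions a single trace holds on the order of one to two billion packet records, which is why even simple statistics gathering becomes a storage and bookkeeping problem.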

Types of Analysis
  Simple statistics gathering
  Isolation of TCP flows
  Trace correlation
  Generation of traffic matrices

Challenges
  Total amount of data > 10 TB
    – What to keep on-line and what to move off-line
  Sharing data and results
    – What has been computed/generated
  Correlating different types of data
    – E.g. packet traces with routing tables
  Determining software dependencies
  Reproducibility of results

Task Abstraction
  Storage of data
    – Ad-hoc solution: disk arrays, SAN, tape library
  Source code maintenance
    – CVS
  Metadata management
    – Our focus in this work

Metadata Abstraction
  Raw input data sets
  Result data sets
  Analysis programs
    – Versions of software
  Analysis operations
    – Between data sets and programs
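
One way to picture the four abstractions above is as a handful of record types. This is a minimal sketch, not the authors' data model: every class and field name is hypothetical, and the example values are placeholders rather than real Sprint data sets.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class RawDataSet:
    """A raw input data set, e.g. a packet trace or a routing table dump."""
    name: str


@dataclass(frozen=True)
class ResultDataSet:
    """A data set produced by running an analysis program."""
    name: str


@dataclass(frozen=True)
class AnalysisProgram:
    """An analysis program pinned to a specific software version."""
    name: str
    version: str


DataSet = Union[RawDataSet, ResultDataSet]


@dataclass(frozen=True)
class AnalysisOperation:
    """One run of a program: the link between data sets and programs."""
    program: AnalysisProgram
    inputs: Tuple[DataSet, ...]
    outputs: Tuple[ResultDataSet, ...]


# Placeholder example: a hypothetical flow-isolation run over one trace.
trace = RawDataSet("trace_a")
flows = ResultDataSet("flows_a")
run = AnalysisOperation(
    program=AnalysisProgram("flow_isolator", "release-1"),
    inputs=(trace,),
    outputs=(flows,),
)
```

Because a result data set can itself appear as an input to a later operation, chaining operations in this way yields a dependency graph over data sets and program versions.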

Design and Implementation
  Dependency graph in relational database schema => RDBMS
  Interaction with version control
    – Software major release
  Linkage to data storage system
    – Make raw data set self-describing
    – Metadata independent of data location
  User interface
    – Browsing the DB through a GUI
    – Capturing analysis operations with simple command scripts
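
A dependency graph stored in a relational schema could look roughly like the sketch below. It uses SQLite through Python's standard sqlite3 module purely for illustration; the slides do not name the RDBMS, and the table names, column names, and sample rows are all hypothetical. The recursive query walks backwards from a result data set to everything it was derived from, which is the kind of question reproducibility depends on.

```python
import sqlite3

# Hypothetical schema: data sets, versioned programs, and the operations linking them.
SCHEMA = """
CREATE TABLE dataset (
    id   INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL,
    kind TEXT NOT NULL CHECK (kind IN ('raw', 'result'))
);
CREATE TABLE program (
    id      INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    cvs_tag TEXT NOT NULL          -- version-control tag of the software release
);
CREATE TABLE operation (
    id         INTEGER PRIMARY KEY,
    program_id INTEGER NOT NULL REFERENCES program(id)
);
CREATE TABLE operation_io (
    operation_id INTEGER NOT NULL REFERENCES operation(id),
    dataset_id   INTEGER NOT NULL REFERENCES dataset(id),
    role         TEXT NOT NULL CHECK (role IN ('input', 'output'))
);
"""

# All data sets that the named data set depends on, directly or transitively.
ANCESTORS = """
WITH RECURSIVE ancestor(id) AS (
    SELECT i.dataset_id
    FROM operation_io AS o
    JOIN operation_io AS i ON i.operation_id = o.operation_id
    WHERE o.role = 'output' AND i.role = 'input'
      AND o.dataset_id = (SELECT id FROM dataset WHERE name = ?)
    UNION
    SELECT i.dataset_id
    FROM ancestor AS a
    JOIN operation_io AS o ON o.dataset_id = a.id AND o.role = 'output'
    JOIN operation_io AS i ON i.operation_id = o.operation_id AND i.role = 'input'
)
SELECT d.name FROM ancestor JOIN dataset AS d ON d.id = ancestor.id ORDER BY d.name
"""


def build_demo() -> sqlite3.Connection:
    """Create an in-memory database with placeholder rows (not real data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO dataset (id, name, kind) VALUES (?, ?, ?)",
        [(1, "trace_a", "raw"), (2, "bgp_a", "raw"),
         (3, "flows_a", "result"), (4, "matrix_a", "result")],
    )
    conn.executemany(
        "INSERT INTO program (id, name, cvs_tag) VALUES (?, ?, ?)",
        [(1, "flow_isolator", "release-1"), (2, "matrix_builder", "release-1")],
    )
    conn.executemany(
        "INSERT INTO operation (id, program_id) VALUES (?, ?)", [(1, 1), (2, 2)]
    )
    conn.executemany(
        "INSERT INTO operation_io (operation_id, dataset_id, role) VALUES (?, ?, ?)",
        [(1, 1, "input"), (1, 3, "output"),
         (2, 3, "input"), (2, 2, "input"), (2, 4, "output")],
    )
    return conn


def ancestors(conn: sqlite3.Connection, name: str) -> list:
    """Names of every data set the given data set was derived from."""
    return [row[0] for row in conn.execute(ANCESTORS, (name,))]


if __name__ == "__main__":
    print(ancestors(build_demo(), "matrix_a"))
```

Note that the rows reference data sets only by name, never by file path, in keeping with the slide's point that metadata should be independent of data location.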

Conclusion and Future Work
  Flexible and minimally intrusive
  Extensions:
    – Automatic storage management
    – Result caching
    – Job scheduling
    – Automation of analysis
  Will results be easily reproducible?
  Will users adapt to the new discipline?