
Metadata Management of Terabyte Datasets from an IP Backbone Network: Experience and Challenges Sue B. Moon and Timothy Roscoe.


1 Metadata Management of Terabyte Datasets from an IP Backbone Network: Experience and Challenges Sue B. Moon and Timothy Roscoe

2 Overview (5/25/2001, NRDM 2001)
Sprint IP Monitoring Project
Types of Data
Types of Analysis
Experience and Challenges
Metadata Abstractions and Model
Design and Implementation

3 Sprint IP Monitoring Project
Design Goal: to acquire data without sampling and without loss of accuracy.
System Components:
–Linux PC with 3 PCI buses and 100GB
–DAG card with OC3 to OC48 support and GPS
–SAN-based analysis platform
–Data repository

4 Configuration at Monitored PoP (figure not reproduced; surviving label: customer)

5 Analysis Platform and Data Repository at Sprint ATL

6 Types of Collected Data
Packet trace of 50 to 100GB
–44 byte packet header + 12 byte framing info per packet
BGP routing tables
IS-IS tables
PoP configuration (topology)
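
The per-packet sizes on this slide give a rough sense of how many packets one trace holds. The sketch below is only back-of-the-envelope arithmetic: it assumes a trace is nothing but fixed-size 56-byte records (no file header or padding) and that "GB" means 10^9 bytes, neither of which the slides state. The name `records_in_trace` is made up for illustration.

```python
HEADER_BYTES = 44    # packet header bytes kept per packet (from the slide)
FRAMING_BYTES = 12   # framing info bytes per packet (from the slide)
RECORD_BYTES = HEADER_BYTES + FRAMING_BYTES  # 56 bytes stored per packet

GB = 10**9  # assumption: decimal gigabytes


def records_in_trace(trace_bytes: int) -> int:
    """Number of whole packet records in a trace of the given size.

    Assumes the trace is a flat sequence of fixed-size records.
    """
    return trace_bytes // RECORD_BYTES


if __name__ == "__main__":
    for size_gb in (50, 100):
        count = records_in_trace(size_gb * GB)
        print(f"{size_gb} GB trace: about {count:,} packet records")
```

Under these assumptions a single trace holds on the order of one to two billion packet records.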

7 Types of Analysis
Simple statistics gathering
Isolation of TCP flows
Trace correlation
Generation of traffic matrices

8 Challenges
Total amount of data > 10 TB
–What to keep on-line and off-line
Sharing data and results
–What has been computed/generated
Correlating different types of data
–E.g. packet traces with routing tables
Determining s/w dependency
Reproducibility of results

9 Task Abstraction
Storage of data
–Ad-hoc solution: disk arrays, SAN, tape library
Source code maintenance
–CVS
Metadata management
–Our focus in this work

10 Metadata Abstraction
Raw input data sets
Result data sets
Analysis programs
–Versions of s/w
Analysis operations
–between data sets and programs
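
One way to picture the four abstractions on this slide is as small record types. This is a minimal sketch, not the authors' model: the class names, fields, and the example names (`trace-A`, `flow_isolator`) are all hypothetical, chosen only to show an analysis operation tying a versioned program to its input and output data sets.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DataSet:
    name: str
    kind: str  # "raw" for raw input data sets, "result" for result data sets


@dataclass(frozen=True)
class Program:
    name: str
    version: str  # version of the analysis software


@dataclass(frozen=True)
class Operation:
    """One run of a program: the link between data sets and programs."""
    program: Program
    inputs: Tuple[DataSet, ...]
    outputs: Tuple[DataSet, ...]


def describe(op: Operation) -> str:
    """Render an operation as 'program version: inputs -> outputs'."""
    ins = ", ".join(d.name for d in op.inputs)
    outs = ", ".join(d.name for d in op.outputs)
    return f"{op.program.name} {op.program.version}: {ins} -> {outs}"


# Hypothetical example: isolating TCP flows from one packet trace.
trace = DataSet("trace-A", "raw")
flows = DataSet("tcp-flows-A", "result")
op = Operation(Program("flow_isolator", "1.0"), (trace,), (flows,))

if __name__ == "__main__":
    print(describe(op))
```

Recording the program version inside each operation is what lets a result be traced back to the exact software that produced it.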

11 Design and Implementation
Dependency graph in relational database schema => RDBMS
Interaction with version control
–S/W major release
Linkage to data storage system
–Make raw data set self-describing
–Metadata independent of data location
User interface
–Browsing DB thru GUI and capturing analysis operations by simple command scripts.
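
The idea of keeping the dependency graph in a relational schema can be sketched with SQLite. The slides do not give the actual schema or name the RDBMS used, so every table, column, and data set name below is an assumption made for illustration. The query walks the graph backwards from a result data set to everything it was derived from.

```python
import sqlite3

# Hypothetical schema: data sets and programs are nodes, operations are edges.
SCHEMA = """
CREATE TABLE dataset (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL,
    kind TEXT NOT NULL CHECK (kind IN ('raw', 'result'))
);
CREATE TABLE program (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    version TEXT NOT NULL,
    UNIQUE (name, version)
);
CREATE TABLE operation (
    id INTEGER PRIMARY KEY,
    program_id INTEGER NOT NULL REFERENCES program(id)
);
CREATE TABLE operation_input (
    operation_id INTEGER NOT NULL REFERENCES operation(id),
    dataset_id INTEGER NOT NULL REFERENCES dataset(id)
);
CREATE TABLE operation_output (
    operation_id INTEGER NOT NULL REFERENCES operation(id),
    dataset_id INTEGER NOT NULL REFERENCES dataset(id)
);
"""

ANCESTORS_SQL = """
WITH RECURSIVE upstream(id) AS (
    SELECT id FROM dataset WHERE name = ?
    UNION
    SELECT oi.dataset_id
    FROM upstream u
    JOIN operation_output oo ON oo.dataset_id = u.id
    JOIN operation_input oi ON oi.operation_id = oo.operation_id
)
SELECT d.name FROM dataset d JOIN upstream u ON d.id = u.id
WHERE d.name != ? ORDER BY d.name
"""


def ancestors(conn: sqlite3.Connection, dataset_name: str) -> list:
    """Names of all data sets the given data set was derived from."""
    rows = conn.execute(ANCESTORS_SQL, (dataset_name, dataset_name))
    return [row[0] for row in rows]


# Hypothetical contents: a trace feeds flow isolation, whose result is
# combined with a BGP table to build a traffic matrix.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany("INSERT INTO dataset (id, name, kind) VALUES (?, ?, ?)", [
    (1, "trace-A", "raw"), (2, "bgp-A", "raw"),
    (3, "flows-A", "result"), (4, "matrix-A", "result"),
])
conn.executemany("INSERT INTO program (id, name, version) VALUES (?, ?, ?)", [
    (1, "flow_isolator", "1.0"), (2, "matrix_builder", "2.0"),
])
conn.executemany("INSERT INTO operation (id, program_id) VALUES (?, ?)",
                 [(1, 1), (2, 2)])
conn.executemany("INSERT INTO operation_input VALUES (?, ?)",
                 [(1, 1), (2, 3), (2, 2)])
conn.executemany("INSERT INTO operation_output VALUES (?, ?)",
                 [(1, 3), (2, 4)])

if __name__ == "__main__":
    print(ancestors(conn, "matrix-A"))
```

Because the tables refer to data sets by name and id rather than by file path, the metadata stays independent of where the data is stored, in the spirit of the "Linkage to data storage system" bullet.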

12 Conclusion and Future Work
Flexible and minimally intrusive
Extensions:
–Automatic storage management
–Result caching
–Job scheduling
–Automation of analysis
Will results be easily reproducible?
Will users adapt to the new discipline?

