A Data Management Life-Cycle
  By David Ferderer, Project Chief
  Chris Skinner, Contractor
  Greg Gunther, Contractor
Presentation Outline
  USGS Landscape
  Life-Cycle Model and Strategy
  Component Descriptions (Skinner)
  Demonstration (Gunther)
  Conclusions and Future Directions
USGS Landscape - Energy Program
  What We Do
    – Provides Science-Based Energy Assessments
  Organization Issues
    – Regional Centers and Competitive Funding Process
    – Multiple Project Areas, Applications, Data Types, and Platforms
  Information Issues
    – Technology and Data Explosion
    – Access, Delivery, and Archive Requirements
    – Diverse Client and Product Needs
  Policy and Mandates
USGS Landscape - Central Energy Team
  125 Full and Part-Time Employees
    – Independent Thinkers and Researchers
  Multiple Application Platforms
    – UNIX (ArcInfo 8, ArcView 3.x, SDE 3, ORACLE 8, EarthVision, Seismic, PETROMOD)
    – PC/NT (ArcInfo 8, ArcView 3.x, Geographix)
  Centralized and Distributed Data Storage
  100 Mbps Fast Ethernet Network
Central Energy Team “Information” Shift Data Management Information Services GIS Project Life-Cycle Integration
Life-Cycle Model and Strategy
  Life-Cycle Model (Conceptual)
    – A Series of Processes and Utilities that Manage the Flow of Data to Information, Products, and Knowledge
  Life-Cycle Implementation Strategy (Actual)
    – Processes are Translated into the Find, Get, Use, Deliver, and Maintain Strategy
    – Strategy Defines Tasks, Components, and Deliverables
Implementation Strategy
  DM Finds Internal and External Data Resources
  DM Gets the Data Organized, Documented, and Accessible to Team Projects
  Projects Use the Data and Other Resources in Research
  DM Assists Projects in Delivering Products to the Public
  DM Maintains the System and Upgrades Components
Strategy Components and Utilities (Internal USGS)
  [Diagram: Data Life-Cycle]
  Strategy steps: Find External Data and Information; Get Data Organized; Use Data and Other Resources in Research Projects; Find Internal Data and Information (Archive and Reuse); Deliver Data and Knowledge to Projects and the Public; Maintain (Upgrades and Documentation)
  Components and utilities: Team Data Library; Archive Library; Inventory Database; Metadata Utilities; Data Processing Utilities; Project Design; Intranet Resources; Hypermedia Publications; CD-ROM Templates
Team Data Library
  Centralized Storage
    – Team Data Resources (primarily spatial)
    – Theme and Sub-Theme Organization
  Standardized
    – Naming Conventions
    – Directory Structure
    – Storage Formats (e00, shape, SDE)
    – Common Data Projection (geographic)
    – Metadata
    – Browse Graphics
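A minimal sketch of how a theme and sub-theme directory layout with standardized names might be built. The root path, the lowercase-with-underscores naming rule, the file extensions, and the function names `normalize` and `library_path` are all assumptions for illustration; the slide does not spell out the team's actual conventions.

```python
import re
from pathlib import PurePosixPath

# Hypothetical extensions for the file-based storage formats named on the slide.
# SDE layers live in a database, so they get no file path in this sketch.
FORMAT_EXTENSIONS = {"e00": ".e00", "shape": ".shp"}


def normalize(name):
    """Lowercase a name and collapse runs of non-alphanumerics into underscores."""
    return re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")


def library_path(root, theme, sub_theme, dataset, fmt):
    """Build root/theme/sub_theme/dataset.ext using the assumed naming rule."""
    if fmt not in FORMAT_EXTENSIONS:
        raise ValueError(f"unsupported file format: {fmt}")
    file_name = normalize(dataset) + FORMAT_EXTENSIONS[fmt]
    return PurePosixPath(root) / normalize(theme) / normalize(sub_theme) / file_name


if __name__ == "__main__":
    # Illustrative values only.
    print(library_path("/data/library", "Geology", "Surface Faults", "CO Faults 2000", "shape"))
```

Deriving every path from one function is one way to keep a shared library consistent: a dataset's location follows from its theme, sub-theme, and name rather than from each project's habits.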
Team Archive Library
  Offline Storage of Team Data Resources
  Contains
    – Publications
    – USGS Digital Data Products (DLG, DEM, DOQ)
    – Team Archives
  Standardized File Names and Directory Structure
Inventory Database
  MS Access Database Tracking Team’s Data Holdings
  Contains
    – 60 Information Fields (10 Required) in 21 Tables
    – 28 Fields Corresponding to FGDC Metadata Elements
    – Inventoried 4600 Datasets and 680 Archives (> 500 GB)
Inventory Database
  Features
    – Tracks Multiple Types of Data (Spatial, Text, Graphic, and Tabular)
    – Separately Tracks Archives, Publications, and Individual Datasets
    – Automatic Loading and Editing Scripts
    – Serves as the Engine to DART …
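The idea of required fields and tracked data types can be sketched as a small record check. This is not the team's MS Access schema: the four field names in `REQUIRED_FIELDS` are hypothetical stand-ins (the slides say 10 of the 60 fields are required but do not list them), and `validate_record` is an illustrative helper.

```python
# Data types and record kinds named on the slide.
DATA_TYPES = {"spatial", "text", "graphic", "tabular"}
RECORD_KINDS = {"archive", "publication", "dataset"}

# Hypothetical required fields; the real list is not given in the slides.
REQUIRED_FIELDS = ("title", "record_kind", "data_type", "location")


def validate_record(record):
    """Return a list of problems found in one inventory record (empty if none)."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            problems.append(f"missing required field: {field}")
    kind = record.get("record_kind")
    if kind and kind not in RECORD_KINDS:
        problems.append(f"unknown record kind: {kind}")
    data_type = record.get("data_type")
    if data_type and data_type not in DATA_TYPES:
        problems.append(f"unknown data type: {data_type}")
    return problems


if __name__ == "__main__":
    # Illustrative record only; it is missing its location.
    print(validate_record({"title": "Example", "record_kind": "dataset", "data_type": "spatial"}))
```

A check like this could run inside automatic loading scripts so that incomplete records are flagged before they reach the inventory.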
DART
  Data Access, Retrieval, and Tracking System
    – Easy Access to Team Data Resources via Web Browsers
    – Customized Search and Browse of Archives, Publications, and Datasets
    – Direct Data and Metadata Download to User’s Desktop
    – Object-Oriented Application
    – Java Server Pages on ServletExec 3.1
    – Stay Tuned for the Demonstration!
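A rough sketch of a search over archives, publications, and datasets. DART itself is described as a Java Server Pages application backed by the inventory database; this Python function only illustrates filtering by keyword and record kind. The record layout, the `search` function, and the sample records are all made up for the example.

```python
# Made-up sample records for illustration; not actual team holdings.
SAMPLE_RECORDS = [
    {"kind": "dataset", "title": "Coal Basin Boundaries", "keywords": ["coal", "boundaries"]},
    {"kind": "publication", "title": "Coal Assessment Report", "keywords": ["coal"]},
    {"kind": "archive", "title": "Seismic Lines", "keywords": ["seismic"]},
]


def search(records, keyword="", kind=None):
    """Return records whose title or keywords contain the keyword (case-insensitive).

    If kind is given, only records of that kind are considered.
    An empty keyword matches every record of the requested kind.
    """
    needle = keyword.lower()
    hits = []
    for record in records:
        if kind is not None and record["kind"] != kind:
            continue
        haystack = " ".join([record["title"], *record.get("keywords", [])]).lower()
        if needle in haystack:
            hits.append(record)
    return hits


if __name__ == "__main__":
    for hit in search(SAMPLE_RECORDS, "coal", kind="publication"):
        print(hit["title"])
```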
Metadata Utilities
  Web-Based Metadata Entry and Creation System
    – Users Generate, Modify, and Save Compliant Metadata Output to the Desktop
    – Provides a Simplified and Comprehensive Online Help System
  Contains
    – Links to Other Metadata Tools and Resources
    – Library of Metadata
Other Data Management Products
  Data Processing and Automation Utilities
    – Portal to ‘How-To’, AMLs, and FAQ Documents Residing in the Team and on the WWW
  Project and Workspace Design Recommendations
    – Templates Promote Efficient Work-Flow, Data Organization, Archives, and Rapid Publication
  CD-ROM Templates and Hypermedia Distribution
Maintenance
  DM Provides Continual Maintenance and Upgrades of System Components
  Develop Publications and Documentation
    – User Manuals
    – Formal Component Documentation
    – Templates, Guidelines, and Policies
    – Fact Sheets and Bulletins
Demonstration Greg Gunther
System Summary
  Provides Easy Access to Datasets
  Generates Metadata Quickly and Easily
  Finds External Data with Over 1000 WWW Links
  Simplifies Data Processing Tasks
  Organizes Projects with Workspace Templates
  Streamlines CD-ROM Publications
  Provides One-Stop Shopping for Shared Internal Resources
Future Directions
  Increase Inventory Effort
  Integrate GeoDatabase Model (ArcGIS) for Proprietary Datasets
  Formalize Metadata Extension to FGDC Standard
  Streamline Product Delivery - Implement IMS
  Publish Documented Tools and Utilities
  Implement Enterprise Architecture and Planning
Future Architecture
  Enterprise Planning*
    – Getting Started: Planning & Initiatives
    – Where We Are Today: Business Processes, Current Systems
    – Where We Want To Be: Data Architecture, GIS & Application Architecture, IS/IT Architecture
    – Plan To The Future: Implementation and Migration Plans
  *Modified from Spewak Model
Conclusions – What We Have Learned
  Data Management: It’s ESSENTIAL for Survival But Needs to be Promoted
  Distributed Projects REQUIRE Data Centralization
  Projects RARELY Account for Data Management Planning and Costs
  Data Stewardship MUST Begin at the Onset of Projects
  The Terms EASY and USEFUL Lead to Implementation
  Component Model Must be FLEXIBLE to Adapt to Technology Trends
The End And The Beginning Of a New Cycle…