Published by Quentin Horton. Modified over 8 years ago.
1
Data Models for Ecological Databases
John Porter, Department of Environmental Sciences, University of Virginia
2
DBMS Types
File system-based
Hierarchical
Network
Relational
Object-oriented
You've seen these before; now let's go into more detail.
3
File-System Based
[Diagram: a directory containing files]
very simple and easy to set up
inefficient
few capabilities
4
Hierarchical
[Diagram labels: Project; Datasets, Investigators; Variables, Locations; Codes, Methods]
efficient
not very general
e.g. phylogenetic structures, geographical images
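The trade-off on this slide can be sketched with a small tree. This is only an illustration, assuming a taxonomy stored as family, then genus, then species; the `TAXONOMY` fragment and both function names are made up for the example and do not come from the presentation.

```python
# Hypothetical taxonomy fragment stored as a hierarchy:
# family -> genus -> list of species epithets.
TAXONOMY = {
    "Fagaceae": {"Quercus": ["alba", "rubra"]},
    "Salicaceae": {"Salix": ["alba"], "Populus": ["alba"]},
    "Sapindaceae": {"Acer": ["rubrum", "saccharum"]},
}


def species_in_genus(tree, family, genus):
    """Walking the hierarchy top-down is a direct lookup at each level."""
    return tree.get(family, {}).get(genus, [])


def genera_with_epithet(tree, epithet):
    """A question that cuts across branches must visit every branch."""
    hits = []
    for genera in tree.values():
        for genus, species_list in genera.items():
            if epithet in species_list:
                hits.append(genus)
    return sorted(hits)


if __name__ == "__main__":
    print(species_in_genus(TAXONOMY, "Fagaceae", "Quercus"))
    print(genera_with_epithet(TAXONOMY, "alba"))
```

Questions that follow the hierarchy are cheap, which is the "efficient" point; questions that ignore it need a full traversal, which is one way to read "not very general".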
5
Network Database
[Diagram labels: Projects, Locations, Datasets]
very flexible
unwieldy to modify
not widely used
Links are hard-coded into the database. They are not a property of the data.
6
Relational Database
[Diagram labels: Projects, Locations, Datasets, Data_id, Location_id]
widely used, mature
table-oriented
restricted range of structures
Linkages are through the properties of the data itself, not hard-coded.
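A minimal sketch of what "linkages through the properties of the data" can look like, using Python's built-in `sqlite3` module. The column names `data_id` and `location_id` follow the slide's labels; the table layout, the site names, and the dataset titles are assumptions made for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two tables linked by location_id."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE locations (
            location_id INTEGER PRIMARY KEY,
            name        TEXT
        );
        CREATE TABLE datasets (
            data_id     INTEGER PRIMARY KEY,
            title       TEXT,
            location_id INTEGER REFERENCES locations(location_id)
        );
        -- Made-up example rows.
        INSERT INTO locations VALUES (1, 'Marsh site'), (2, 'Dune site');
        INSERT INTO datasets VALUES
            (10, 'Water levels', 1),
            (11, 'Bird counts', 2),
            (12, 'Salinity', 1);
        """
    )
    return conn


def datasets_at(conn, location_name):
    """Find datasets for a location by matching location_id values."""
    rows = conn.execute(
        """
        SELECT d.title
        FROM datasets AS d
        JOIN locations AS l ON d.location_id = l.location_id
        WHERE l.name = ?
        ORDER BY d.title
        """,
        (location_name,),
    ).fetchall()
    return [title for (title,) in rows]


if __name__ == "__main__":
    print(datasets_at(build_demo_db(), "Marsh site"))
```

No pointer from a dataset to its location is stored anywhere: the link exists only because the two rows share a `location_id` value, so a new relationship needs new data rather than a change to the database's structure.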
7
Object Oriented
[Diagram labels: Object, Methods, Data Structure]
developing - few commercial implementations
diverse structures
extensible
Complex data structures, along with the methods to use the data, are in the database.
8
Data Modeling
DBMS systems are highly flexible
Good: they can do a lot!
Bad: they have to be told how to do it!
A Database Management System is the CANVAS, the DATA MODEL is the painting...
9
Data Modeling
Data modeling is used to develop the structures used in a database
Your data model affects
–reliability of the data
–efficiency and speed of queries
–the complexity of the database
Data modeling is an art, not a science!
10
Some Terminology: Tables contain attributes or fields (columns) and multiple observations or tuples (rows)
11
Flat-file
[Diagram: one table, Species Observation, with attributes Genus, Species, Common Name, Observer, Date]
Attributes in ovals
Tables in boxes
12
Normalization
One widely used approach for reducing errors within a database is to normalize your data structures
Normalization is the process of eliminating duplicate or redundant information
13
Two-table Relational Database
[Diagram: table Species with attributes Spec_code, Genus, Species, Common Name; table Observation with attributes Spec_code, Observer, Date]
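The two-table layout can be sketched with Python's built-in `sqlite3` module. Table and column names follow the slide's labels; the `obs_id` key, the sample rows, the observer names, and the helper functions are assumptions added for illustration.

```python
import sqlite3


def build_species_db():
    """Species details are stored once; observations refer to them by spec_code."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE species (
            spec_code   TEXT PRIMARY KEY,
            genus       TEXT,
            species     TEXT,
            common_name TEXT
        );
        CREATE TABLE observation (
            obs_id    INTEGER PRIMARY KEY,  -- hypothetical surrogate key
            spec_code TEXT REFERENCES species(spec_code),
            observer  TEXT,
            date      TEXT
        );
        -- Made-up example rows; 'whit oak' is a deliberate typo to correct.
        INSERT INTO species VALUES
            ('QUAL', 'Quercus', 'alba', 'whit oak'),
            ('ACRU', 'Acer', 'rubrum', 'red maple');
        INSERT INTO observation (spec_code, observer, date) VALUES
            ('QUAL', 'Observer A', '2001-05-01'),
            ('QUAL', 'Observer B', '2001-05-02'),
            ('ACRU', 'Observer A', '2001-05-02');
        """
    )
    return conn


def fix_common_name(conn, spec_code, new_name):
    """Correct a name in one place; returns the number of rows changed."""
    cur = conn.execute(
        "UPDATE species SET common_name = ? WHERE spec_code = ?",
        (new_name, spec_code),
    )
    return cur.rowcount


def flat_view(conn):
    """Rebuild the flat 'Species Observation' layout with a join."""
    return conn.execute(
        """
        SELECT s.genus, s.species, s.common_name, o.observer, o.date
        FROM observation AS o
        JOIN species AS s ON s.spec_code = o.spec_code
        ORDER BY o.obs_id
        """
    ).fetchall()


if __name__ == "__main__":
    db = build_species_db()
    fix_common_name(db, "QUAL", "white oak")
    for row in flat_view(db):
        print(row)
```

In a flat file the misspelled common name would have to be corrected in every observation row; here a single-row update fixes it for all of them, which is the error-reduction argument for normalization.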
14
Complex Data Model
[Diagram: tables Specimens, Species, Observations, Observers, Locations, Images, Internet Links]
Notation: one-to-one or one-to-many links
15
Data Model for Metadata at the VCR/LTER
[Diagram labels: Personnel, Projects, Dataset, Mailing Lists, Variable Codes, Variable, Dataset Locations]
Notation: optional linkage, mandatory linkage
16
"Beanstalk" & "String of Pearls"
[Diagram labels: Location Table, Lat/Lon, Metadata, methods, units]
17
Beanstalk / String of Pearls
Highly normalized
Extremely flexible - capable of handling many different kinds of data
Inefficient
–Queries can be very slow
–Can require large amounts of space
18
Why is there no perfect data model for ecological data?
One of the reasons data modeling is an ART not a SCIENCE is that ecologists use data in many different ways
–Data that is perfectly formed for one kind of analysis may be unusable for another
–Different analytical software may be used
19
Why No Perfect Model?
Generally ecologists want to use data in "flat file" formats that combine all the tables containing data into a single, denormalized "spreadsheet"-type format, but even that format can vary between researchers
–ClimDB needed to support single parameter and multiple parameter formats to meet researcher needs
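The difference between the two flat layouts can be sketched as follows. This assumes "single parameter" means one measured value per row and "multiple parameter" means one column per variable; that reading, the variable names, and the sample values are illustrative assumptions, not ClimDB's actual file formats.

```python
from collections import defaultdict

# Made-up rows in a one-value-per-row layout: (date, parameter, value).
LONG_ROWS = [
    ("2001-07-01", "air_temp_c", 24.5),
    ("2001-07-01", "precip_mm", 0.0),
    ("2001-07-02", "air_temp_c", 26.1),
    ("2001-07-02", "precip_mm", 12.3),
]


def to_wide(long_rows):
    """Pivot one-value-per-row data into one row per date, one column per parameter."""
    by_date = defaultdict(dict)
    for date, parameter, value in long_rows:
        by_date[date][parameter] = value

    parameters = sorted({parameter for _, parameter, _ in long_rows})
    table = [["date"] + parameters]
    for date in sorted(by_date):
        # A parameter missing for a date becomes None rather than being dropped.
        table.append([date] + [by_date[date].get(p) for p in parameters])
    return table


if __name__ == "__main__":
    for row in to_wide(LONG_ROWS):
        print(row)
```

Both layouts carry the same information, but an analysis tool that expects one of them cannot read the other without a reshaping step like this one.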