Access Primer Africamuseum 5 June 2013
MS Access Relational Database Management System Data/information resides in series of related tables Principle of normalisation Often ‘client-server’ End-user interface independent from ‘engine’
Access elements Tables: to store data Queries: to ask questions from database Forms: to create a user interface Macros: to automate repetitive tasks Modules: create non-standard functionality, using Visual Basic
Tables Must be defined before they can be used Vaguely resemble Excel sheets but: Unit of information is a ‘record’ (line) Saved automatically as soon as one moves to another record Consists of ‘fields’ Type must be chosen All fields of a record remain together No restriction on the number of lines Excel: 64,000 lines Order of records is not fixed!
Types of fields Text (length; max = 255) Memo Integer (normal or long) Autonumber Real (single or double) Date Boolean (yes/no)
Indices Facilitate sorting and selective extraction of records Special index: primary key unique identifier for a record often ‘autonumber’ Other indices: Unique or not Slow data entry down
Hands-on Create table to store names and addresses of students Experiment with field types ‘Populate’ table – create records and provide input Play with facilities to display table Filter, sort
Import text (or other) files Download data from for example data.gbif.org Don’t select too many data! Check data through text editor Then we’ll import the data in our database
Hands-on Download data from for example data.gbif.org Don’t select too many data! Check data through text editor Play with filter, sort…
Normalisation All data should be in the database once and only once Repeated data go to separate table Relationships between the tables are part of the ‘model’ of the database Link between tables defined on pair of fields, one in each table, that are of same type Records of the two tables are ‘related’ if value in the link fields is the same Later more!!
Non-normalised Table Species# legs# eyesplaceCountrydate Asterias rubens50OostendeBelgium12/3/2004 Asterias rubens50ZeebruggeBelgium13/3/2005 Asterias rubens50ZeebruggeBelgium14/3/2005 Cancer pagurus102De PanneBelgium12/3/2004 Cancer pagurus102OostendeBelgium12/3/2004 Cancer pagurus102ZeebruggeBelgium14/3/2004 Asterias rubens50WimereuxFrance13/3/2005 Asterias rubens50WimereuxFrance14/3/2005 Cancer pagurus102WimereuxFrance12/3/2004
Normalisation Table with morphological characters fields: species, number of legs, number of eyes Table with distribution records fields: species, place, date The field ‘Species’ is common to the two tables, and makes it possible to link records between two tables
Creating relations ‘one to many’ relations Defined on fields of the same type In case of text fields: same length On the many side: primary key Helps in maintaining the integrity of the data Enforce presence of a ‘related record’ in the ‘related table’ Avoid spelling variations (Oostende, Ostend, Ostende…)
Relations Best on fields that do not carry any real-world information Autonumber In the example: add field autonumber in the species table, and use this as a code in the distributions table; make autonumber field the primary key Also best to add Autonumber as primary key in distributions table
And more Relations… Also the locality of the species distribution record should be in a separate table With fields for position, country…
Hands-on Refine the data model by creating an extra table for the countries Create extra table with id and countryname Make id the primary key in countries table Change places table, change country to country_id, and update field type Create relationship between places and countries
Queries Select and sort records of a table Pull together information from related tables Simple statistics In principle: SQL But access interface hides SQL from the user For now: query grid Define which tables will be used, fields that have to be shown, criteria for selection and sorting…
SQL statement SELECT species.speciesname, distrecords.observationdate, places.placename, places.country FROM species INNER JOIN (places INNER JOIN distrecords ON places.id = distrecords.place_id) ON species.id = distrecords.species_id WHERE (((places.country)="Belgium")) ORDER by speciesname, observationdate;
Statistics query Count the number of records in the database given criteria in a ‘Group By’ clause
Cross-tab query Not standard SQL, specific for Access Create table, with one column header and one or more row headers, and a value for the cells
Hands-on How many different species do you have in your download from GBIF? How many with latitude & longitude? Check the ‘country’ field Why do we need normalisation & standard vocabularies?
Access Wizard
Forms Create an interface for the users Hide codes/autonumbers (combobox) Elements of the form are linked to fields in the tables Can show single records at a time, or ‘repeated forms’ Entries/changes to the form are written automatically to the table
ComboBox Limit possible entries into a field Hide codes/autonumbers from users Two different representations of the same information are combined Code, to be entered into the table Full text equivalent, for the benefit of the user
Subforms To display the content of a many-to-many relationship We don’t know how many related records there will be, so impossible to solve with static, predefined structure A form can be repeated, showing multiple lines Number of lines is not limited First create the subform
Subforms And then the main form
And now… How to decide on the structure of a database? Which information do we want to store? Which tables do we need? Relationships between them? Which indices (primary and others) do we need? Start from an analysis of user needs In principle also, decide on a platform
Task Create ‘model’ for biodiversity/biogeography database Which standard vocabularies to use? What are important elements of information? Sources Biology Classification, synonymy Geography