A Short Introduction to Analyzing Biological Data Using Relational Databases. Part II: Creating a Relational Database to Model Biological Data. Alex Ropelewski.


1 A Short Introduction to Analyzing Biological Data Using Relational Databases
Part II: Creating a Relational Database to Model Biological Data
Alex Ropelewski, ropelews@psc.edu
Pittsburgh Supercomputing Center, National Resource for Biomedical Supercomputing
Bienvenido Vélez, Bienvenido.Velez@upr.edu
University of Puerto Rico at Mayagüez, Department of Electrical and Computer Engineering

2 The following material is the result of a curriculum development effort to provide a set of courses to support bioinformatics efforts involving students from the biological sciences, computer science, and mathematics departments. The courses were developed as part of the NIH-funded project “Assisting Bioinformatics Efforts at Minority Schools” (2T36 GM008789). The people involved with the curriculum development effort include:
– Dr. Hugh B. Nicholas, Dr. Troy Wymore, Mr. Alexander Ropelewski and Dr. David Deerfield II, National Resource for Biomedical Supercomputing, Pittsburgh Supercomputing Center, Carnegie Mellon University.
– Dr. Ricardo González Méndez, University of Puerto Rico Medical Sciences Campus.
– Dr. Alade Tokuta, North Carolina Central University.
– Dr. Jaime Seguel and Dr. Bienvenido Vélez, University of Puerto Rico at Mayagüez.
– Dr. Satish Bhalla, Johnson C. Smith University.
Unless otherwise specified, all the information contained within is copyrighted © by Carnegie Mellon University. Permission is granted to use, modify, and reproduce these materials for teaching purposes. The most recent versions of these presentations can be found at http://marc.psc.edu/

3 Learning Objectives
– The SQL relational database language
– Creating relational DB tables using SQL
– Inserting tuples into DB tables using SQL

4 SQL
The language of relational databases
– Data definition/schema creation
– Implements relational algebra operations
– Data manipulation: insertion, manipulation, updates, removals
– A standard (ISO) since 1987

5 An Improved Relational Database Design

Runs
    RunNum  Date     Matrix
    1       7/21/07  Pam70
    2       7/20/07  Blosum80

Sequences
    Accession  Description                  Species
    P14555     Group IIA Phospholipase A2   Human
    P81479     Phospholipase A2 isozyme IV  Indian Green Tree Viper
    P00623     Phospholipase A2             Eastern Diamondback Rattlesnake

Matches
    Accession  RunNum  eValue
    P14555     1       4.18E-32
    P81479     2       2.68E-52
    P14555     2       3.47E-33
    P81479     1       1.20E-54
    P00623     2       1.21E-08

6 SQL: Create Statement
Describes data to be placed in a table

    CREATE TABLE Sequences(
        Accession   varchar(32),
        Description varchar(256),
        Species     varchar(256)
    )

Resulting empty table, Sequences:

    Accession  Description  Species

7 SQL: Create Statement
Describes data to be placed in a table

    CREATE TABLE Matches(
        Accession varchar(32),
        RunNum    int,
        eValue    float
    )

Resulting empty table, Matches:

    Accession  RunNum  eValue

8 SQL: CREATE TABLE Statement
Describes data to be placed in a table

    CREATE TABLE Runs(
        RunNum  int,
        Matrix  varchar(32),
        DateRun date
    )

Resulting empty table, Runs:

    RunNum  Matrix  DateRun
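
As a minimal sketch, the three CREATE TABLE statements above can be tried out without installing a database server. The example below assumes SQLite, accessed through Python's standard sqlite3 module, as a stand-in for whichever DBMS is actually used in the course; the helper column_names is a hypothetical name introduced only for this illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the course's DBMS.
conn = sqlite3.connect(":memory:")

# The three CREATE TABLE statements from the slides, unchanged.
conn.executescript("""
CREATE TABLE Sequences(
    Accession   varchar(32),
    Description varchar(256),
    Species     varchar(256)
);
CREATE TABLE Matches(
    Accession varchar(32),
    RunNum    int,
    eValue    float
);
CREATE TABLE Runs(
    RunNum  int,
    Matrix  varchar(32),
    DateRun date
);
""")


def column_names(table):
    """Return the column names SQLite recorded for a table (hypothetical helper)."""
    # PRAGMA table_info yields one row per column; index 1 holds the column name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


schema = {t: column_names(t) for t in ("Sequences", "Matches", "Runs")}
for table, columns in schema.items():
    print(table, columns)
```

Each table exists but is empty at this point; CREATE TABLE only describes the columns, it does not add rows.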

9 Frequently Used Relational Data Types

    TYPE NAME        DESCRIPTION
    BOOLEAN          True or False Value
    INT              Integer Number
    FLOAT            Floating Point Number
    DATE             Date (Year, Month and Day)
    TIMESTAMP        Date and Time of an Event
    CHAR(n)          Fixed Size Character String
    VARCHAR(n)       Variable Size Character String
    BLOB             Large Segment of Text or Data
    ENUM(v1,…,vn)    One of the values v1 … vn

BEWARE: Available Datatypes May Vary Significantly Across DBMS Implementations
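
The warning about data types varying across implementations can be made concrete with a small experiment. The sketch below assumes SQLite (through Python's standard sqlite3 module), which accepts the type names above but treats them more loosely than many server DBMSs; other systems will behave differently. TypeDemo and its columns are hypothetical names used only for this illustration.

```python
import sqlite3

# Assumption: SQLite is used here only to show one implementation's behavior.
conn = sqlite3.connect(":memory:")

# TypeDemo is a hypothetical table, not part of the slides' design.
conn.execute("""
CREATE TABLE TypeDemo(
    Flag    BOOLEAN,
    Hits    INT,
    Score   FLOAT,
    DateRun DATE,
    Label   VARCHAR(4)
)
""")

conn.execute(
    "INSERT INTO TypeDemo VALUES (?, ?, ?, ?, ?)",
    (True, 1, 4.18e-32, "2007-07-21", "longer than four"),
)

# typeof() reports how SQLite actually stored each value.
row = conn.execute(
    "SELECT typeof(Flag), typeof(Hits), typeof(Score), typeof(DateRun), Label "
    "FROM TypeDemo"
).fetchone()
storage_classes = row[:4]
stored_label = row[4]

print(storage_classes)
print(stored_label)
```

In SQLite the BOOLEAN value is stored as an integer, the DATE value is stored as text, and the VARCHAR(4) length limit is not enforced. A different DBMS may reject or truncate the same insert, which is why the schema should be checked against the documentation of the system in use.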

10 Frequently Used Data Type Modifiers

    MODIFIER         DESCRIPTION
    DEFAULT v        Default value is v
    NOT NULL         Field cannot be empty
    AUTO_INCREMENT   On every insert, the column is automatically assigned n+1,
                     where n is the value assigned to the last row inserted
    PRIMARY KEY      This column is the primary key for the table.
                     Value must be unique among all rows.

BEWARE: Behavior of Data Type Modifiers May Vary Significantly Across DBMS Implementations

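11 SQL: CREATE TABLE Statement Using Data Type Modifiers

    CREATE TABLE Runs(
        -- MySQL requires an AUTO_INCREMENT column to be defined as a key
        RunNum  int AUTO_INCREMENT PRIMARY KEY,
        Matrix  varchar(32) DEFAULT 'BLOSUM62',
        DateRun date NOT NULL
    )

Resulting empty table, Runs:

    RunNum  Matrix  DateRun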
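
The effect of the modifiers is easiest to see by inserting rows that leave some columns out. The following is a sketch under the assumption that SQLite is used (through Python's sqlite3 module); SQLite spells the auto-increment modifier INTEGER PRIMARY KEY AUTOINCREMENT rather than AUTO_INCREMENT, which is itself an instance of the variation the slides warn about. The ISO-style date strings are also an assumption made for this example.

```python
import sqlite3

# Assumption: SQLite stands in for the course's DBMS.
conn = sqlite3.connect(":memory:")

# SQLite spelling of the Runs table with modifiers.
conn.execute("""
CREATE TABLE Runs(
    RunNum  INTEGER PRIMARY KEY AUTOINCREMENT,
    Matrix  varchar(32) DEFAULT 'BLOSUM62',
    DateRun date NOT NULL
)
""")

# Only DateRun is supplied: RunNum and Matrix are filled in by the modifiers.
conn.execute("INSERT INTO Runs (DateRun) VALUES ('2007-07-21')")
# Matrix is supplied explicitly, so the default is not used.
conn.execute("INSERT INTO Runs (Matrix, DateRun) VALUES ('Pam70', '2007-07-20')")

rows = conn.execute(
    "SELECT RunNum, Matrix, DateRun FROM Runs ORDER BY RunNum"
).fetchall()
print(rows)

# Leaving out the NOT NULL column is rejected by the database.
try:
    conn.execute("INSERT INTO Runs (Matrix) VALUES ('Blosum80')")
    not_null_enforced = False
except sqlite3.IntegrityError:
    not_null_enforced = True
print(not_null_enforced)
```

The first row receives RunNum 1 and the default matrix, the second receives RunNum 2, and the third insert fails because DateRun has no value.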

12 SQL: INSERT INTO Statement
Places data into the table for the first time

    INSERT INTO Sequences (Accession, Description, Species)
    VALUES ('P14555', 'Group IIA Phospholipase A2', 'Human')

Resulting table, Sequences:

    Accession  Description                 Species
    P14555     Group IIA Phospholipase A2  Human
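
A minimal sketch of how the INSERT INTO statement might be issued from a program, assuming SQLite and Python's standard sqlite3 module. The first insert is the statement from the slide; the remaining rows come from the Sequences table in the database design slide and use ? placeholders, a feature of the sqlite3 module rather than of the slides.

```python
import sqlite3

# Assumption: SQLite stands in for the course's DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Sequences(
    Accession   varchar(32),
    Description varchar(256),
    Species     varchar(256)
)
""")

# The statement from the slide, with literal values.
conn.execute("""
INSERT INTO Sequences (Accession, Description, Species)
VALUES ('P14555', 'Group IIA Phospholipase A2', 'Human')
""")

# The same statement with ? placeholders, executed once per remaining row.
remaining = [
    ("P81479", "Phospholipase A2 isozyme IV", "Indian Green Tree Viper"),
    ("P00623", "Phospholipase A2", "Eastern Diamondback Rattlesnake"),
]
conn.executemany(
    "INSERT INTO Sequences (Accession, Description, Species) VALUES (?, ?, ?)",
    remaining,
)
conn.commit()

accessions = [
    row[0]
    for row in conn.execute("SELECT Accession FROM Sequences ORDER BY Accession")
]
print(accessions)
```

Placeholders keep the data separate from the SQL text, so values containing quotes (common in sequence descriptions) do not need to be escaped by hand.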

13 Key Concepts
– SQL (usually pronounced “sequel”) is the most common language for manipulating relational data
– SQL’s CREATE statement can be used to specify and create relational tables
– CREATE can be used to add attributes of several data types
– SQL’s INSERT statement can be used to insert rows into a table. In its basic form, each INSERT statement inserts a single row into a relational table
– Data can also be imported from flat files and spreadsheets using importing tools available in most relational database servers
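
Importing from a flat file can also be done by hand when no import tool is available. The sketch below assumes SQLite with Python's standard csv and sqlite3 modules, and uses an in-memory string as a stand-in for a comma-separated file holding the Matches data from the database design slide; the file layout is an assumption, not something the slides specify.

```python
import csv
import io
import sqlite3

# Stand-in for a flat file on disk (assumed comma-separated with a header row).
flat_file = io.StringIO(
    "Accession,RunNum,eValue\n"
    "P14555,1,4.18E-32\n"
    "P81479,2,2.68E-52\n"
    "P14555,2,3.47E-33\n"
    "P81479,1,1.20E-54\n"
    "P00623,2,1.21E-08\n"
)

# Assumption: SQLite stands in for the course's DBMS.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Matches(Accession varchar(32), RunNum int, eValue float)"
)

# Convert each text field to the type of its column before inserting.
records = [
    (rec["Accession"], int(rec["RunNum"]), float(rec["eValue"]))
    for rec in csv.DictReader(flat_file)
]
conn.executemany(
    "INSERT INTO Matches (Accession, RunNum, eValue) VALUES (?, ?, ?)",
    records,
)
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM Matches").fetchone()[0]
best_match = conn.execute(
    "SELECT Accession, RunNum FROM Matches ORDER BY eValue LIMIT 1"
).fetchone()
print(row_count, best_match)
```

Each line of the file becomes one row, which is the same work a server's import tool performs on the user's behalf.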

