A Short Introduction to Analyzing Biological Data Using Relational Databases Part II: Creating a Relational Database to Model Biological Data. Alex Ropelewski.


A Short Introduction to Analyzing Biological Data Using Relational Databases
Part II: Creating a Relational Database to Model Biological Data

Alex Ropelewski
Pittsburgh Supercomputing Center, National Resource for Biomedical Supercomputing

Bienvenido Vélez
University of Puerto Rico at Mayagüez, Department of Electrical and Computer Engineering

The following material is the result of a curriculum development effort to provide a set of courses to support bioinformatics efforts involving students from the biological sciences, computer science, and mathematics departments. These materials have been developed as part of the NIH-funded project “Assisting Bioinformatics Efforts at Minority Schools” (2T36 GM008789). The people involved with the curriculum development effort include:
Dr. Hugh B. Nicholas, Dr. Troy Wymore, Mr. Alexander Ropelewski and Dr. David Deerfield II, National Resource for Biomedical Supercomputing, Pittsburgh Supercomputing Center, Carnegie Mellon University.
Dr. Ricardo González Méndez, University of Puerto Rico Medical Sciences Campus.
Dr. Alade Tokuta, North Carolina Central University.
Dr. Jaime Seguel and Dr. Bienvenido Vélez, University of Puerto Rico at Mayagüez.
Dr. Satish Bhalla, Johnson C. Smith University.
Unless otherwise specified, all the information contained within is Copyrighted © by Carnegie Mellon University. Permission is granted to use, modify, and reproduce these materials for teaching purposes. Most recent versions of these presentations can be found at

Learning Objectives
  The SQL relational database language
  Creating relational DB tables using SQL
  Inserting tuples into DB tables using SQL

SQL
The language of relational databases
– Data definition/schema creation
– Implements relational algebra operations
– Data manipulation
    Insertion
    Manipulation
    Updates
    Removals
– A standard (ISO) since
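
The roles listed above can be sketched with one statement per category. This is a minimal illustration, not part of the original slides: it assumes Python's standard-library sqlite3 module as the database engine, an in-memory database, and ISO-formatted date strings; the function name `demo_sql_categories` is hypothetical.

```python
import sqlite3


def demo_sql_categories():
    """Run one statement from each SQL category on an in-memory database."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for the course DBMS
    cur = conn.cursor()

    # Data definition / schema creation
    cur.execute("CREATE TABLE Runs (RunNum int, Matrix varchar(32), DateRun date)")

    # Data manipulation: insertion
    cur.execute("INSERT INTO Runs (RunNum, Matrix, DateRun) VALUES (1, 'Pam70', '2007-07-21')")
    cur.execute("INSERT INTO Runs (RunNum, Matrix, DateRun) VALUES (2, 'Blosum80', '2007-07-20')")

    # Data manipulation: update
    cur.execute("UPDATE Runs SET Matrix = 'BLOSUM62' WHERE RunNum = 2")

    # Data manipulation: removal
    cur.execute("DELETE FROM Runs WHERE RunNum = 1")

    # Relational algebra: projection of two columns over the remaining rows
    rows = cur.execute("SELECT RunNum, Matrix FROM Runs").fetchall()
    conn.close()
    return rows


print(demo_sql_categories())
```

Only the row for run 2 remains after the removal, and its Matrix column carries the updated value.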

An Improved Relational Database Design

Sequences
Accession   Description                   Species
P14555      Group IIA Phospholipase A2    Human
P81479      Phospholipase A2 isozyme IV   Indian Green Tree Viper
P00623      Phospholipase A2              Eastern Diamondback Rattlesnake

Runs
RunNum   Date      Matrix
1        7/21/07   Pam70
2        7/20/07   Blosum80

Matches
Accession   RunNum   eValue
P E-32
P E-52
P E-33
P E-54
P E-08

SQL: Create Statement
Describes data to be placed in a table

    CREATE TABLE Sequences(
        Accession varchar(32),
        Description varchar(256),
        Species varchar(256)
    )

Sequences
Accession   Description   Species

SQL: Create Statement
Describes data to be placed in a table

    CREATE TABLE Matches(
        Accession varchar(32),
        RunNum int,
        eValue float
    )

Matches
Accession   RunNum   eValue

SQL: CREATE TABLE Statement
Describes data to be placed in a table

    CREATE TABLE Runs(
        RunNum int,
        Matrix varchar(32),
        DateRun date
    )

Runs
RunNum   Matrix   DateRun
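
The three CREATE TABLE statements above can be tried out together. The sketch below is an assumption-laden illustration rather than the course's own setup: it uses Python's standard-library sqlite3 module with an in-memory database, and the helper names `create_schema` and `describe` are hypothetical. SQLite accepts the declared type names as written.

```python
import sqlite3

# The three statements from the slides, separated by semicolons.
SCHEMA = """
CREATE TABLE Sequences (Accession varchar(32), Description varchar(256), Species varchar(256));
CREATE TABLE Matches (Accession varchar(32), RunNum int, eValue float);
CREATE TABLE Runs (RunNum int, Matrix varchar(32), DateRun date);
"""


def create_schema(conn):
    """Create the Sequences, Matches and Runs tables."""
    conn.executescript(SCHEMA)


def describe(conn):
    """Return {table name: [(column name, declared type), ...]}."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    return {
        table: [(col[1], col[2]) for col in conn.execute(f"PRAGMA table_info({table})")]
        for table in tables
    }


conn = sqlite3.connect(":memory:")
create_schema(conn)
for table, columns in describe(conn).items():
    print(table, columns)
```

Listing the columns after creation is a quick way to confirm that each empty table has the attributes the schema describes.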

Frequently Used Relational Data Types

TYPE NAME        DESCRIPTION
BOOLEAN          True or False Value
INT              Integer Number
FLOAT            Floating Point Number
DATE             Date (Year, Month and Day)
TIMESTAMP        Date and Time of an Event
CHAR(n)          Fixed Size Character String
VARCHAR(n)       Variable Size Character String
BLOB             Large Segment of Text or Data
ENUM(v1,…,vn)    One of the values v1 … vn

BEWARE: Available Datatypes May Vary Significantly Across DBMS Implementations
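
The warning above can be made concrete with one engine. The following sketch assumes SQLite (through Python's standard-library sqlite3 module), which is only one DBMS among many and is not named in the slides; the table `Demo` and the function `stored_types` are hypothetical. SQLite maps declared types to a small set of storage classes and does not enforce the length in CHAR(n) or VARCHAR(n), so other systems may behave differently.

```python
import sqlite3


def stored_types():
    """Show how SQLite actually stores values in columns declared with common types."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Demo (i INT, f FLOAT, d DATE, c CHAR(4), v VARCHAR(4), b BOOLEAN)"
    )
    conn.execute(
        "INSERT INTO Demo VALUES (?, ?, ?, ?, ?, ?)",
        (1, 4.5, "2007-07-21", "ab", "abcdefgh", 1),
    )
    row = conn.execute(
        "SELECT typeof(i), typeof(f), typeof(d), typeof(c), "
        "length(c), length(v), typeof(b) FROM Demo"
    ).fetchone()
    conn.close()
    return row


# In SQLite: the date stays text, CHAR(4) is not padded,
# VARCHAR(4) is not truncated, and the boolean is stored as an integer.
print(stored_types())
```

A DBMS with strict typing would be expected to pad, truncate, or reject some of these values, which is why the data type section of a system's reference manual is worth checking before designing a schema.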

Frequently Used Data Type Modifiers

MODIFIER         DESCRIPTION
DEFAULT v        Default value is v
NOT NULL         Field cannot be empty
AUTO_INCREMENT   On every insert, the column is automatically assigned n+1, where n is the value assigned to the last row inserted
PRIMARY KEY      This column is the primary key for the table. Value must be unique among all rows.

BEWARE: Behavior of Data Type Modifiers May Vary Significantly Across DBMS Implementations

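SQL: CREATE TABLE Statement Using Data Type Modifiers

    CREATE TABLE Runs(
        RunNum int AUTO_INCREMENT PRIMARY KEY,
        Matrix varchar(32) DEFAULT 'BLOSUM62',
        DateRun date NOT NULL
    )

MySQL requires an AUTO_INCREMENT column to be defined as a key, so RunNum is also declared PRIMARY KEY here. String literals use straight single quotes.

Runs
RunNum   Matrix   DateRun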
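
The effect of each modifier can be observed by inserting incomplete rows. This sketch assumes SQLite through Python's standard-library sqlite3 module, where the auto-numbering column is spelled `INTEGER PRIMARY KEY AUTOINCREMENT` rather than `AUTO_INCREMENT`; that substitution, the ISO date strings, and the function name `demo_modifiers` are assumptions made for illustration.

```python
import sqlite3


def demo_modifiers():
    """Insert partial rows and report what the modifiers fill in or reject."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE Runs (
            RunNum  INTEGER PRIMARY KEY AUTOINCREMENT,  -- SQLite spelling of auto-numbering
            Matrix  varchar(32) DEFAULT 'BLOSUM62',
            DateRun date NOT NULL
        )
        """
    )
    # RunNum and Matrix omitted: both are filled in automatically.
    conn.execute("INSERT INTO Runs (DateRun) VALUES ('2007-07-21')")
    # Matrix given explicitly: the default is not used.
    conn.execute("INSERT INTO Runs (Matrix, DateRun) VALUES ('Pam70', '2007-07-20')")
    rows = conn.execute(
        "SELECT RunNum, Matrix, DateRun FROM Runs ORDER BY RunNum"
    ).fetchall()

    # DateRun omitted: NOT NULL makes the database reject the row.
    try:
        conn.execute("INSERT INTO Runs (Matrix) VALUES ('Blosum80')")
        rejected = False
    except sqlite3.IntegrityError:
        rejected = True
    conn.close()
    return rows, rejected


print(demo_modifiers())
```

The first row receives run number 1 and the default matrix, the second receives run number 2, and the third insert fails because no date was supplied.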

SQL: INSERT INTO Statement
Places data into the table for the first time

    INSERT INTO Sequences (Accession, Description, Species)
    VALUES ('P14555', 'Group IIA Phospholipase A2', 'Human')

Sequences
Accession   Description                  Species
P14555      Group IIA Phospholipase A2   Human
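
When rows come from a program rather than being typed by hand, the same INSERT statement is usually issued with placeholders. The sketch below assumes Python's standard-library sqlite3 module and its `?` placeholder style; the rows are the three sequences from the design slide, and the names `SEQUENCES` and `load_sequences` are hypothetical.

```python
import sqlite3

# Rows taken from the Sequences table in the design slide.
SEQUENCES = [
    ("P14555", "Group IIA Phospholipase A2", "Human"),
    ("P81479", "Phospholipase A2 isozyme IV", "Indian Green Tree Viper"),
    ("P00623", "Phospholipase A2", "Eastern Diamondback Rattlesnake"),
]


def load_sequences(conn, rows):
    """Create the Sequences table, insert one row per tuple, and return the row count."""
    conn.execute(
        "CREATE TABLE Sequences ("
        "Accession varchar(32), Description varchar(256), Species varchar(256))"
    )
    # executemany runs the INSERT once per tuple; each statement still adds a single row.
    conn.executemany(
        "INSERT INTO Sequences (Accession, Description, Species) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM Sequences").fetchone()[0]


conn = sqlite3.connect(":memory:")
print(load_sequences(conn, SEQUENCES))
```

Placeholders let the driver handle quoting, which avoids errors when a description or species name itself contains a quote character.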

Key Concepts
  SQL (usually pronounced “sequel”) is the most common language for manipulating relational data
  SQL’s CREATE statement can be used to specify and create relational tables
  CREATE can be used to add attributes of several data types
  SQL’s INSERT statement can be used to insert rows into a table. Each INSERT statement inserts a single row into a relational table
  Data can also be imported from flat files and spreadsheets using importing tools available in most relational database servers
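
The last point, importing from flat files, can also be done from a short script when a server's import tool is not at hand. This is a sketch under stated assumptions: a comma-separated file with a header row matching the column names, held here in the hypothetical string `FLAT_FILE` instead of on disk, loaded with Python's standard-library csv and sqlite3 modules by the hypothetical function `import_csv`.

```python
import csv
import io
import sqlite3

# Stand-in for the contents of a flat file; the header row names the columns.
FLAT_FILE = """Accession,Description,Species
P14555,Group IIA Phospholipase A2,Human
P81479,Phospholipase A2 isozyme IV,Indian Green Tree Viper
P00623,Phospholipase A2,Eastern Diamondback Rattlesnake
"""


def import_csv(conn, text):
    """Load comma-separated text into the Sequences table and return the rows read."""
    reader = csv.DictReader(io.StringIO(text))
    rows = [(r["Accession"], r["Description"], r["Species"]) for r in reader]
    conn.execute(
        "CREATE TABLE IF NOT EXISTS Sequences ("
        "Accession varchar(32), Description varchar(256), Species varchar(256))"
    )
    conn.executemany(
        "INSERT INTO Sequences (Accession, Description, Species) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
print(import_csv(conn, FLAT_FILE))
```

For a real file, the `io.StringIO(text)` wrapper would be replaced by an open file handle; the rest of the function stays the same.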