Methodological Foundations of Biomedical Informatics (BMSC-GA 4449) Kelly Ruggles & David Fenyo.


1 Methodological Foundations of Biomedical Informatics (BMSC-GA 4449) http://fenyolab.org/methods2015 Kelly Ruggles & David Fenyo

2 Choose a Data Set
Examples:
- Encyclopedia of DNA Elements (ENCODE)
- 1000 Genomes Project
- The Cancer Genome Atlas (TCGA)
- International Cancer Genome Consortium (ICGC)
- Clinical Proteomic Tumor Analysis Consortium (CPTAC)
- Human Microbiome Project (HMP)
- Epigenomics Roadmap
- Cancer Cell Line Encyclopedia (CCLE)
- Library of Integrated Network-Based Cellular Signatures (LINCS)
- Global Initiative for Sharing All Influenza Data (GISAID)
- The Genotype-Tissue Expression Project (GTEx)
- A 3D Map of the Human Genome
- Central Line-Associated Bloodstream Infections (CLABSI)
- The JGI Genome Portal
- Chromosome Scrambling in S. cerevisiae

3 First Homework (Sept 8)
Using the data set you have chosen:
- Write a database schema in SQL*.
- Implement the schema in, for example, MySQL.
*You don't need to capture everything in your database schema; just select an interesting subset.
Later homework on the data sets:
- Write a Python script for loading the data.
- Write a Python script to query the data.
- Make a web interface that supports simple queries.
- Perform statistical calculations on the data (both directly in Python and by calling R from Python).
- Make a static visualization of some aspect of the data via the web interface (e.g. call R to make a heatmap).
- Make an interactive visualization of some aspect of the data via the web interface.
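As a rough illustration of the "load" and "query" homework steps, here is a minimal sketch. It makes several assumptions that are not in the slides: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the `measurement` table, its columns, and the CSV rows are hypothetical placeholders for whatever subset of your data set you choose.

```python
import csv
import io
import sqlite3
import statistics

# Hypothetical data: a tiny CSV standing in for a file from the chosen data set.
CSV_TEXT = """sample_id,gene,value
S1,TP53,2.5
S1,BRCA1,1.0
S2,TP53,3.5
S2,BRCA1,2.0
"""


def load_measurements(conn, csv_file):
    """Create the (hypothetical) measurement table and load CSV rows into it."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS measurement ("
        " sample_id TEXT NOT NULL,"
        " gene TEXT NOT NULL,"
        " value REAL NOT NULL,"
        " PRIMARY KEY (sample_id, gene))"
    )
    rows = [
        (r["sample_id"], r["gene"], float(r["value"]))
        for r in csv.DictReader(csv_file)
    ]
    with conn:  # commit on success, roll back on error
        conn.executemany(
            "INSERT INTO measurement (sample_id, gene, value) VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)


def values_for_gene(conn, gene):
    """Query all values for one gene, using a parameterized statement."""
    cur = conn.execute(
        "SELECT value FROM measurement WHERE gene = ? ORDER BY sample_id",
        (gene,),
    )
    return [row[0] for row in cur]


conn = sqlite3.connect(":memory:")
n_loaded = load_measurements(conn, io.StringIO(CSV_TEXT))
tp53 = values_for_gene(conn, "TP53")
tp53_mean = statistics.mean(tp53)  # a simple statistic computed directly in Python
print(n_loaded, tp53, tp53_mean)
```

With a real MySQL server the same structure would apply, but the connection call and the parameter placeholder style depend on the driver you pick.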

4 Database Design
- Determine the purpose of the database
- Find and organize the information required
- Divide the information into tables
- Turn information items into columns
- Specify primary keys
- Set up the table relationships
- Refine the design
- Apply the normalization rules
https://en.wikipedia.org/wiki/Database_design
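The steps "divide the information into tables", "specify primary keys" and "apply the normalization rules" can be sketched on a toy case. This is only an illustration under stated assumptions: the flat records are invented, the table layout is one possible choice, and sqlite3 is used in place of MySQL so the example runs without a server.

```python
import sqlite3

# Hypothetical flat records: the project description is repeated on every row,
# which is the kind of redundancy normalization is meant to remove.
flat_rows = [
    ("Breast cancer proteomics", "Tumor vs. normal"),
    ("Breast cancer proteomics", "Drug response"),
    ("Yeast phosphoproteome", "Heat shock"),
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE project (
    project_id  INTEGER PRIMARY KEY,
    description TEXT NOT NULL UNIQUE
);
CREATE TABLE experiment (
    exp_id      INTEGER PRIMARY KEY,
    project_id  INTEGER NOT NULL REFERENCES project(project_id),
    description TEXT NOT NULL
);
""")

for project_desc, exp_desc in flat_rows:
    # Store each project once; repeated descriptions are ignored.
    conn.execute(
        "INSERT OR IGNORE INTO project (description) VALUES (?)",
        (project_desc,),
    )
    (project_id,) = conn.execute(
        "SELECT project_id FROM project WHERE description = ?",
        (project_desc,),
    ).fetchone()
    # Each experiment row points at its project by key instead of by text.
    conn.execute(
        "INSERT INTO experiment (project_id, description) VALUES (?, ?)",
        (project_id, exp_desc),
    )
conn.commit()

n_projects = conn.execute("SELECT COUNT(*) FROM project").fetchone()[0]
n_experiments = conn.execute("SELECT COUNT(*) FROM experiment").fetchone()[0]
print(n_projects, n_experiments)
```

Three flat rows become two project rows and three experiment rows, so a project description only has to be corrected in one place.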

5 Example Database Schema: Proteomics
Tables: Project, Experiment, Sample, Analysis, Spectrum, Peptide, Amino_Acid, Modification, Protein

6 Example Database Schema: One-to-Many
CREATE TABLE Project (
    project_id int,
    description varchar(2000),
    …
);
CREATE TABLE Experiment (
    exp_id int,
    project_id int,
    description varchar(2000),
    …
);
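In the one-to-many schema, Experiment.project_id links each experiment to a single project, while one project can have many experiments. The sketch below shows how that link might be declared and enforced. The PRIMARY KEY, NOT NULL and FOREIGN KEY clauses are additions not shown on the slide, the elided columns are simply omitted, the inserted rows are hypothetical, and sqlite3 stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Project (
    project_id  INTEGER PRIMARY KEY,
    description VARCHAR(2000)
);
CREATE TABLE Experiment (
    exp_id      INTEGER PRIMARY KEY,
    project_id  INTEGER NOT NULL,
    description VARCHAR(2000),
    FOREIGN KEY (project_id) REFERENCES Project(project_id)
);
""")

# Hypothetical rows: one project with two experiments.
conn.execute("INSERT INTO Project VALUES (1, 'Hypothetical project A')")
conn.executemany(
    "INSERT INTO Experiment VALUES (?, ?, ?)",
    [(10, 1, "run 1"), (11, 1, "run 2")],
)

# An experiment that points at a non-existent project should be rejected.
try:
    conn.execute("INSERT INTO Experiment VALUES (12, 99, 'orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Count the experiments belonging to each project.
counts = conn.execute(
    "SELECT p.project_id, COUNT(e.exp_id) "
    "FROM Project p LEFT JOIN Experiment e ON e.project_id = p.project_id "
    "GROUP BY p.project_id"
).fetchall()
print(orphan_rejected, counts)
```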

7 Example Database Schema: Many-to-Many
CREATE TABLE Experiment (
    exp_id int,
    project_id int,
    description varchar(2000),
    …
);
CREATE TABLE Sample (
    sample_id int,
    description varchar(2000),
    …
);
CREATE TABLE Aliquot (
    aliquot_id int,
    exp_id int,
    sample_id int,
    …
);
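Here the Aliquot table acts as a link table: each row pairs one experiment with one sample, so an experiment can use many samples and a sample can appear in many experiments. A minimal sketch of querying through it follows. As assumptions for the example, the key constraints are added, Experiment.project_id and the elided columns are left out for brevity, the rows are hypothetical, and sqlite3 replaces MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Experiment (
    exp_id      INTEGER PRIMARY KEY,
    description VARCHAR(2000)
);
CREATE TABLE Sample (
    sample_id   INTEGER PRIMARY KEY,
    description VARCHAR(2000)
);
CREATE TABLE Aliquot (
    aliquot_id  INTEGER PRIMARY KEY,
    exp_id      INTEGER NOT NULL REFERENCES Experiment(exp_id),
    sample_id   INTEGER NOT NULL REFERENCES Sample(sample_id)
);
""")

# Hypothetical rows: sample 100 is used in both experiments.
conn.executemany("INSERT INTO Experiment VALUES (?, ?)", [(1, "exp A"), (2, "exp B")])
conn.executemany("INSERT INTO Sample VALUES (?, ?)", [(100, "tumor"), (101, "normal")])
conn.executemany(
    "INSERT INTO Aliquot VALUES (?, ?, ?)",
    [(1, 1, 100), (2, 1, 101), (3, 2, 100)],
)


def samples_in_experiment(conn, exp_id):
    """Return the sample ids linked to one experiment via Aliquot."""
    cur = conn.execute(
        "SELECT s.sample_id FROM Sample s "
        "JOIN Aliquot a ON a.sample_id = s.sample_id "
        "WHERE a.exp_id = ? ORDER BY s.sample_id",
        (exp_id,),
    )
    return [row[0] for row in cur]


def experiments_for_sample(conn, sample_id):
    """Return the experiment ids linked to one sample via Aliquot."""
    cur = conn.execute(
        "SELECT e.exp_id FROM Experiment e "
        "JOIN Aliquot a ON a.exp_id = e.exp_id "
        "WHERE a.sample_id = ? ORDER BY e.exp_id",
        (sample_id,),
    )
    return [row[0] for row in cur]


exp1_samples = samples_in_experiment(conn, 1)
sample100_exps = experiments_for_sample(conn, 100)
print(exp1_samples, sample100_exps)
```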

8 Methodological Foundations of Biomedical Informatics (BMSC-GA 4449)
Lecture 1 Introduction (September 1, 2014 TRB 718 5pm)
Lecture 2 Scientific Programming (September 8, 2014 TRB 718 5pm)
Lecture 3 Algorithms (September 15, 2014 TRB 718 5pm)
Lecture 4 Statistics (September 22, 2014 TRB 718 5pm)
Lecture 5 Linear Algebra (September 29, 2014 TRB 718 5pm)
Lecture 6 Optimization (October 6, 2014 TRB 718 5pm)
Lecture 7 Data Visualization (October 13, 2014 TRB 718 5pm)
Lecture 8 Experimental Design (October 20, 2014 TRB 718 5pm)
Lecture 9 Machine Learning (October 27, 2014 TRB 718 5pm)
Lecture 10 Information Retrieval (November 3, 2014 TRB 718 5pm)
Lecture 11 Signal Processing (November 10, 2014 TRB 718 5pm)
Lecture 12 Pathways and Networks (November 17, 2014 TRB 718 5pm)
Lecture 13 Modeling and Simulation (November 24, 2014 TRB 718 5pm)
Lecture 14 Project Presentation (December 15, 2014 TRB 718 5pm)

9 Lecture 2 Scientific Programming
- Programming Languages: Python, R, MATLAB and JavaScript
- Databases: SQL and NoSQL
- High-Performance Computing
- Version control: Git

10 Lecture 3 Algorithms

11 Lecture 4 Statistics https://xkcd.com/882/

12 Lecture 5 Linear Algebra

13 Lecture 6 Optimization

14 Lecture 7 Data Visualization The Cancer Genome Atlas Network, Comprehensive molecular portraits of human breast tumors. Nature. 490 (7418):61-70.

15 Lecture 8 Experimental Design Experimental Design by Christine Ambrosino www.hawaii.edu/fishlab/Nearside.htm

16 Lecture 9 Machine Learning

17 Lecture 10 Information Retrieval

18 Lecture 11 Signal Processing

19 Lecture 12 Pathways and Networks

20 Lecture 13 Modeling and Simulation
[Figure: panels "Experiment" and "Simulation"; x-axis: chromosomal coordinate (kb)]

21 Methodological Foundations of Biomedical Informatics (BMSC-GA 4449) http://fenyolab.org/methods2015 Kelly Ruggles & David Fenyo

