BF528 - Applications in Translational Bioinformatics


1 BF528 - Applications in Translational Bioinformatics
1/19/2018

2 Instructor Introductions
Adam Labadorf, David Bray, Rachael Ivison, Kritika Karri, Marzie Rasekh, Meghan Thommes

3 Course Overview
Survey course in bioinformatics
Focus on high-throughput sequencing data, tools, and techniques
Focus on practical skills
Group work simulates real-world collaborative environment

4 Course Goals
Survey current bioinformatics techniques in translational studies
Give you hands-on experience working with high-throughput biological data and tools
Read and understand papers that use bioinformatics in translational studies
Develop shared vocabulary between biology and computation

5 Prerequisites
Molecular and cell biology
BF527, BE505/605 or equivalent
Good-to-haves:
  Basic statistics knowledge
  Programming/Linux cluster experience
But don’t panic...

6 Course Organization
http://bf528.readthedocs.io
Wed/Fri 2:30-4:15 SED 208
Semi-flipped paradigm
Online content limited to ~1 hr/class
Class period split into two segments:
  Lecture or discussion of online material
  Project group meeting and discussion

7 Course Organization cont’d
Students assigned into groups of 4
4 projects over the course of the semester
No homeworks
No exams

8 Schedule of Topics
Class  Day  Date    Topic                                        Project Out/Due  Lecturer
1      Fri  Jan 19  Introduction                                                  Adam
2      Wed  Jan 24  Computational Skills Primer                                   Kritika
3      Fri  Jan 26  Genomics, Genes, and Genomes
4      Wed  Jan 31  Array Technologies
5      Fri  Feb 2   2nd Gen Sequencing                                            Marzie
6      Wed  Feb 7   Sequence Analysis 1 - WGS/WES
7      Fri  Feb 9   Genomic Variation and SNP Analysis
8      Wed  Feb 14  Biological Data Formats
9      Fri  Feb 16  Databases
10     Wed  Feb 21  Sequence Analysis 2 - RNA-Seq                1/2
11     Fri  Feb 23  Biomarkers                                                    Marc
12     Wed  Feb 28  Sequence Analysis 3 - ChIP-Seq                                David
13     Fri  Mar 2   Phylogenetics
Spring Break

9 Schedule of Topics cont’d
Class  Day  Date    Topic                                        Project Out/Due  Lecturer
14     Wed  Mar 14  Gene sets and enrichment                     2/3              Kritika
15     Fri  Mar 16  Replicability vs Reproducibility Strategies                   Adam
16     Wed  Mar 21  Computational Pipeline Strategies
17     Fri  Mar 23  Computational Environment Management
18     Wed  Mar 28  Sequence Visualization                                        David
19     Fri  Mar 30  Microbiome: 16S                                               Meghan
20     Wed  Apr 4   Microbiome: Metagenomics                     3/4
21     Fri  Apr 6   Metabolomics
22     Wed  Apr 11  Proteomics/Mass Spectrometry                                  Andrew
23     Fri  Apr 13  Single Cell Techniques
24     Wed  Apr 18  Integrative Genomics
25     Fri  Apr 20  Network Biology
26     Wed  Apr 25  Systems Biology                              4                Trevor
27     Fri  Apr 27  The Future + Retrospective                                    Ensemble

10 Projects
Assigned into groups based on experience
Groups are for the entire semester
You will reproduce published findings from published manuscripts
Each project has a full writeup

11 Project Groups
Group members will play one of four roles:
  Data Curator - find, download, and organize data
  Programmer - process data into analyzable form
  Analyst - transform processed data into interpretable form
  Biologist - understand paper and biological context, help interpret results
Roles rotate for each project
Structured class time to help facilitate group work and help each other!
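One way to picture the role rotation is a simple round-robin across the four projects. The sketch below is only an illustration: the slides do not say how roles are actually rotated, and the member names and the `rotate_roles` helper are hypothetical.

```python
from typing import Dict, List

# The four roles named on the slide.
ROLES = ["Data Curator", "Programmer", "Analyst", "Biologist"]


def rotate_roles(members: List[str], n_projects: int = 4) -> List[Dict[str, str]]:
    """Return one {member: role} mapping per project.

    Assumption (not from the slides): roles shift by one position for
    each new project, so every member holds every role exactly once
    over four projects.
    """
    if len(members) != len(ROLES):
        raise ValueError("expected exactly one member per role")
    schedule = []
    for project in range(n_projects):
        assignment = {
            member: ROLES[(i + project) % len(ROLES)]
            for i, member in enumerate(members)
        }
        schedule.append(assignment)
    return schedule


if __name__ == "__main__":
    # Placeholder member names, for illustration only.
    for number, assignment in enumerate(rotate_roles(["A", "B", "C", "D"]), start=1):
        print(f"Project {number}: {assignment}")
```

With four members and four projects, this shift-by-one scheme gives each person each role once, which matches the stated intent that roles rotate for each project.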

12 Project Group Meeting: Wednesdays
Time allotted for groups to meet and discuss progress
“Stand-up” meeting structure:
  “What did I work on since our last meeting?”
  “What challenges did I encounter?”
  “Are there any obstacles to completing my work?”
  “What will I be working on for next meeting?”
Each group will make a brief status report at the end of class

13 Project Group Meeting: Fridays
Time allotted for roles to meet and discuss progress
Similar structure to Wednesdays
Share challenges and solutions among roles
Each role group will make a brief status report at the end of class

14 Project Report
Organized like a published study
Sections (primary role):
  Intro - background and motivation (Biologist)
  Data - data description (Data Curator)
  Methods - processing and tools (Programmer)
  Results - findings (Analyst)
  Discussion - interpret findings (Biologist)
  Conclusion (all)

15 Assessment
Each project is 25% of your total grade
Broken down:
  Intro, Conclusion - 2.5%
  Data, Methods, Results, Discussion - 20%
Stand-up participation: 15%

16 Translational Bioinformatics

17 Biology as Data Science
DNA structure published in Nature
First genetic sequence determined, Protein Data Bank
Sanger sequencing, first genome sequenced
PCR technique invented
Human Genome Project begun
First bacterial genome sequenced, microarray technology first described
Yeast genome on a microarray, sequencing by synthesis concept established
First multicellular eukaryote sequenced
First draft of human genome
Solexa Genome Analyzer released

18 “Big” Data
Single Microarray dataset: ~500Mb
Single short read dataset: ~2Gb-300Gb
Human genome reference sequence: ~2Gb
One run of Illumina instruments:
  HiSeq 2500: ~1Tb
  NovaSeq 6000: ~6Tb
Gene Expression Omnibus (GEO):
  2014: 1,237,138 samples, ~28 Tb
  2018: 2,335,694 samples, ?? Tb
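For a sense of scale, the 2014 GEO figures imply an average footprint per sample. The following is a rough back-of-the-envelope sketch, assuming "Tb" means terabytes of 10^12 bytes (the slide does not define the unit); the function name is made up for illustration.

```python
# Figures taken from the slide (GEO, 2014).
GEO_2014_SAMPLES = 1_237_138
GEO_2014_TOTAL_TB = 28

# Assumption: 1 Tb on the slide = 1e12 bytes, and 1 MB = 1e6 bytes.
BYTES_PER_TB = 1e12
BYTES_PER_MB = 1e6


def average_sample_size_mb(total_tb: float, n_samples: int) -> float:
    """Average storage per sample in megabytes."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return total_tb * BYTES_PER_TB / n_samples / BYTES_PER_MB


if __name__ == "__main__":
    avg = average_sample_size_mb(GEO_2014_TOTAL_TB, GEO_2014_SAMPLES)
    print(f"Average GEO sample size in 2014: about {avg:.1f} MB")
```

Under these assumptions the average comes out to roughly 22.6 MB per sample. It is an average over very different data types, so it says little about the size of any individual dataset.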

19 What is Bioinformatics?
“Bioinformatics is an interdisciplinary field that develops methods and software tools for understanding biological data. As an interdisciplinary field of science, bioinformatics combines computer science, statistics, mathematics, and engineering to study and process biological data.” Wikipedia

20 Conceptual History of Bioinformatics
Biological sequences digitized
Biological databases needed to store sequences
Search tools needed for databases
Tools for analyzing data from searches
Computational tools required to analyze human genome
Sophisticated sequence analysis tools enable analysis of large amounts of sequencing data
Sequencing data volume explodes, requiring new tools
And here we are

21 The Biologist’s Tools
[Figure: tools of wet lab biologists alongside tools of bioinformaticians]

22 Sequence: The Fundamental Datatype
Computer Science: genome assembly, homology, phylogeny
Physics: DNA/RNA/protein structure, drug prediction
Statistics: gene expression, population genetics, biomarkers
Mathematics: metabolic modeling, synthetic biology, systems biology

23 Genbank Sequences

24 Translational Bioinformatics
“Translational Bioinformatics is an emerging field in the study of health informatics, focused on the convergence of molecular bioinformatics, biostatistics, statistical genetics, and clinical informatics.” Wikipedia

25 For Next Time
Assignment: familiarize yourself with the material on basic command line usage found here:
Workshop 0. Basic Linux and Command Line Usage

26 SSH and SCC
SCC - Shared Compute Cluster
You all have accounts on SCC
You will need an ssh client program to connect:
  Mac, Linux: Terminal (included)
  Windows: MobaXterm
Connect to: scc1.bu.edu with your BU username/password
Demonstration
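On Mac or Linux, connecting from the terminal typically amounts to running `ssh` with your username and the host named on the slide. Here is a minimal sketch that only builds and prints that command; `jdoe` is a placeholder username and `build_ssh_command` is a hypothetical helper, not part of any SCC tooling.

```python
import shlex
from typing import List

SCC_HOST = "scc1.bu.edu"  # login host named on the slide


def build_ssh_command(username: str, host: str = SCC_HOST) -> List[str]:
    """Return the argument list for an interactive ssh login.

    The command is only constructed here, not executed; ssh itself
    prompts for the password when it is run in a terminal.
    """
    if not username or any(ch.isspace() for ch in username):
        raise ValueError("username must be non-empty and contain no whitespace")
    return ["ssh", f"{username}@{host}"]


if __name__ == "__main__":
    # "jdoe" is a placeholder; substitute your own BU username.
    print(shlex.join(build_ssh_command("jdoe")))
```

Running the script prints `ssh jdoe@scc1.bu.edu`, which can be pasted into a terminal with the real username substituted.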

27 Survey Results

28 How comfortable are you with the following programming languages/concepts?

29 How comfortable are you with the following statistics concepts?

30 How comfortable are you with the following biology concepts?

31 How comfortable are you with the following bioinformatics concepts?

32 Rank the following roles that you might play in a project in order of preference

33 What do you hope to learn?

34 How do you plan to use what you learn?

