Mehmet Tevfik DORAK, MD PhD 2nd Practical Bioinformatics Course


1 Massive Data Sources
Mehmet Tevfik DORAK, MD PhD
2nd Practical Bioinformatics Course, Istanbul, 17/18 April 2017

2 Schedule

3 Massive Data Sources
- BioMarts and FTP Sites: UCSC, Ensembl, NCBI
- Web Portals of Major Projects: ENCODE, RoadMap Epigenomics Project, IHEC
- Online Databases
- Existing Collections in Databases: GeneNetwork-UTHSC, ImmPort
- Supplementary Data Files of Published Papers
- Galaxy Shared Data

4 Massive Data
- UCSC Table Browser
- ENCODE Annotated Genomic regions
- dbSNP SNP list by chromosome (see GMail: dbSNP Data Download)
- HGDP SNP data
- NIEHS SNP Data Download
- GRASP (QTL) results
- GRASP (Full GWAS Results)
- FANTOM5
- dbSUPER Super Enhancers
- miRNA targets
- miRTarBase
- CAGE eQTL SQLite database (and R query template)
- CellMiner
- dSysMap
- CADD scores
- DANN scores
- EIGEN scores
- RegulomeDB scores
- dbWGFP scores
- GenoCanyon Predictive Scores
- GERP scores
- FunSeq2 scores
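
Several of the sources above ship as SQLite files (the CAGE eQTL database is listed with an R query template). As a rough Python counterpart, the sketch below queries a SQLite table for significant eQTLs of one gene. The table name `eqtl`, its columns, and the placeholder rows are all hypothetical; the real database will have its own schema, which should be inspected first.

```python
import sqlite3


def top_eqtls(conn, gene, max_p=1e-5):
    """Return (snp, gene, pvalue) rows for one gene, most significant first.

    Assumes a hypothetical table eqtl(snp, gene, pvalue); adapt the SQL to
    the schema of the database you actually downloaded.
    """
    sql = (
        "SELECT snp, gene, pvalue FROM eqtl "
        "WHERE gene = ? AND pvalue <= ? ORDER BY pvalue ASC"
    )
    return conn.execute(sql, (gene, max_p)).fetchall()


# In-memory stand-in with made-up placeholder rows (not real results).
# For a downloaded file, use sqlite3.connect("path/to/file.sqlite") instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE eqtl (snp TEXT, gene TEXT, pvalue REAL)")
conn.executemany(
    "INSERT INTO eqtl VALUES (?, ?, ?)",
    [
        ("snp_A", "GENE1", 1e-8),
        ("snp_B", "GENE1", 1e-3),
        ("snp_C", "GENE2", 1e-9),
        ("snp_D", "GENE1", 1e-6),
    ],
)

hits = top_eqtls(conn, "GENE1")
for snp, gene, pvalue in hits:
    print(snp, gene, pvalue)
```

Parameter binding with `?` keeps the gene name and threshold out of the SQL string, which matters once the query is driven by user input or a long gene list.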

5 Massive Data
- Enhancers list as BED file by cell type and location
- PAZAR Transcription Factor Targets etc
- TRRUST TFBSs database
- Swiss Regulon Data (incl. TFBSs)
- Broad Institute
- GTEx datasets
- Chicago eQTL datasets (RNA-seq included)
- HapMap: ftp://ftp.ncbi.nlm.nih.gov/hapmap
- LS-SNP Large Scale Human SNP Annotation
- SNPs3D (incl. gene x gene interactions)
- KD4v (may be a dead link)
- Scientific data sources
- Gencode data
- Immune Cell Science BioData Repository [Roederer, 2015 #5325]: ftp://twinr-ftp.kcl.ac.uk/ImmuneCellScience
- Cancer Genomics Hub (CGHub)
- Various genomics datasets from FunSeq website (see README file for descriptions)
- Pre-computed Structure-PPi scores for COSMIC and 1KG mutations
- Broad Institute Catalogs (TUCP, lincRNA etc)
- Broad Institute Transcriptome Assemblies Download
- Broad Institute RNA-seq Read Alignments (Illumina Body Map)
- SEER
- Large health-related data sets
- Personal Genome Project / Harvard
- Open SNP: download the annotation dump, which includes annotation for all SNPs from all sources
- PheWAS dataset download
- HGNC download: "Download our ready-made data files from our Statistics and Downloads page, create your own datasets using either our Custom Downloads tool or BioMart service, or write a script/program utilising our REST service."
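
The HGNC note above mentions writing a script against their REST service. A minimal Python sketch of such a script follows. The base URL `https://rest.genenames.org`, the `/fetch/symbol/<symbol>` path, and the `response` / `docs` layout of the JSON reply are assumptions based on the HGNC REST documentation as recalled here; verify them against the current documentation before relying on them. The helper names are illustrative only.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumed base URL of the HGNC REST service; confirm in the HGNC docs.
HGNC_REST = "https://rest.genenames.org"


def build_symbol_request(symbol):
    """Build (but do not send) a request for one approved gene symbol."""
    url = f"{HGNC_REST}/fetch/symbol/{quote(symbol)}"
    # Ask for JSON; the service is assumed to default to XML otherwise.
    return urllib.request.Request(url, headers={"Accept": "application/json"})


def fetch_symbol(symbol, timeout=30):
    """Send the request and return the list of matching records.

    Needs network access; the 'response' -> 'docs' layout is an assumption.
    """
    request = build_symbol_request(symbol)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)["response"]["docs"]


req = build_symbol_request("BRCA1")
print(req.full_url)
# records = fetch_symbol("BRCA1")  # uncomment when online
```

Separating request construction from the network call makes it easy to inspect or log the exact URL before any data is downloaded.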

6 BioMarts: Ensembl http://www.ensembl.org/biomart
Ensembl BioMart:

7 BioMarts: Ensembl http://www.ensembl.org/biomart
Ensembl BioMart:
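
Besides the web interface, BioMart queries can be sent as an XML document to the `martservice` endpoint under the URL above. The Python sketch below only builds such a query and its URL; nothing is sent. The dataset `hsapiens_gene_ensembl`, the filter `chromosome_name`, and the attributes `ensembl_gene_id` and `external_gene_name` are used as examples and should be checked against what the current Ensembl BioMart offers. The function names are illustrative.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import quoteattr

# martservice path under the BioMart URL given on the slide.
MARTSERVICE = "http://www.ensembl.org/biomart/martservice"


def build_query(dataset, attributes, filters=None):
    """Return a BioMart XML query asking for TSV output with a header row."""
    parts = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        "<!DOCTYPE Query>",
        '<Query virtualSchemaName="default" formatter="TSV" header="1" '
        'uniqueRows="1" datasetConfigVersion="0.6">',
        f'<Dataset name={quoteattr(dataset)} interface="default">',
    ]
    # Filters restrict the rows returned; attributes choose the columns.
    for name, value in (filters or {}).items():
        parts.append(
            f"<Filter name={quoteattr(name)} value={quoteattr(str(value))}/>"
        )
    for name in attributes:
        parts.append(f"<Attribute name={quoteattr(name)}/>")
    parts += ["</Dataset>", "</Query>"]
    return "".join(parts)


def build_url(xml):
    """URL-encode the XML into the 'query' parameter of martservice."""
    return MARTSERVICE + "?" + urlencode({"query": xml})


xml = build_query(
    "hsapiens_gene_ensembl",
    ["ensembl_gene_id", "external_gene_name"],
    filters={"chromosome_name": "6"},
)
url = build_url(xml)
print(url)
```

The resulting URL can be opened with `urllib.request.urlopen` or pasted into a browser; the same XML can also be exported from the web interface via its XML button, which is a convenient way to check attribute and filter names.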

8 BioMarts: UCSC http://genome.ucsc.edu/cgi-bin/hgTables
UCSC BioMart:

9 BioMarts: UCSC http://genome.ucsc.edu/cgi-bin/hgTables
UCSC BioMart:
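
The Table Browser can export regions in BED format, and the enhancer list mentioned earlier is also distributed as BED. As a small illustration, the Python sketch below reads the first three or four BED columns into typed records. The example lines are made-up placeholders, and the `BedInterval` and `parse_bed` names are hypothetical.

```python
from typing import Iterable, Iterator, NamedTuple


class BedInterval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    name: str   # "." when the file has only three columns


def parse_bed(lines: Iterable[str]) -> Iterator[BedInterval]:
    """Yield intervals from BED lines, skipping headers and blank lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()  # BED fields are tab- or space-separated
        name = fields[3] if len(fields) > 3 else "."
        yield BedInterval(fields[0], int(fields[1]), int(fields[2]), name)


# Placeholder content standing in for a downloaded BED file.
example = [
    "track name=example",
    "chr1\t100\t200\tregionA",
    "",
    "chr2\t50\t75",
]
intervals = list(parse_bed(example))
total_bp = sum(iv.end - iv.start for iv in intervals)
print(intervals, total_bp)
```

Because BED starts are 0-based and ends are exclusive, the length of an interval is simply `end - start`, with no off-by-one correction.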

10 BioMarts: GWAS Central
GWAS Central GWASMart:

11 Web Portals of Major Projects
ENCODE Portal: Paper:

12 Web Portals of Major Projects
RoadMap Epigenomics Project data portal: Paper: (PMC:

13 Web Portals of Major Projects
RoadMap Epigenomics Project web portal:

14 Web Portals of Major Projects
IHEC data portal:

15 Web Portals of Major Projects
IHEC data portal: >>> Download page:

16 Web Portals of Major Projects
IHEC web portal (CELL): Paper (CELL):

17 Databases
Almost every database allows you to download its datasets. Links: >>>

18 FunSeq: Downloads Link:

19 dbNSFP: Downloads Paper: https://www.ncbi.nlm.nih.gov/pubmed/26555599
dbNSFP: (see also:

20 Existing Collections of Data
GeneNetwork: Even more datasets from mice!

21 Existing Collections of Data
Open ImmPort:

22 Galaxy: Shared Data Library https://usegalaxy.org/library
Galaxy Library:

23 Supplementary Data Files
Link:

24 Supplementary Data Files
Paper: Suppl Tables (PMC):

25 … Looking forward …..
