Ortholog identification and summaries.

(A) Pipeline to obtain extra 1:1 orthologs relative to Dmel. We used gene synteny, expression correlation among tissues, and exon structure similarity to train SVM models to recognize all known 1:1 orthologs. We used the same SVM model to generate more ortholog candidates among the genes that were not included in the current 1:1 ortholog dataset. Then we used sequence similarity (both alignable/total and identical/alignable) to finalize the extra 1:1 orthologs.

(B) Distribution of orthologs in a ±10-gene window surrounding the query gene, relative to Dmel, for each non-melanogaster species (kernel density). Solid lines are distributions of previously reported 1:1 orthologs. The dotted line (NULL group) is the expected distribution based on random gene pairs (generated with the Python random package) between Dmel and Dyak in the genome; the other non-melanogaster species all generate identical distributions. The same conventions apply to the following panels.

(C) Distribution of expression similarity among orthologs (Spearman's r) relative to Dmel for each non-melanogaster species (kernel density). For each ortholog, normalized read counts of 14 sexed tissues in Dmel and in each non-melanogaster species were used to calculate the correlation.

(D) Distribution of intron number relative to Dmel for each non-melanogaster species. Intron number plus one was used to avoid an infinite value.

The different distribution in Dyak compared with that of the other non-melanogaster species in (B) to (D) is presumably because of its evolutionary closeness to Dmel.

(E, F) An example of SVM training of sequence similarity between Dmel and Dmoj: receiver operating characteristic curve (E) and visualization of SVM training (F). Known orthologs are shown as green dots, whereas random gene pairs are shown as black dots. The SVM hyperplane is shown as a solid line.

Haiwang Yang et al. LSA 2018;1:e201800156 © 2018 Yang et al.
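
The per-gene quantities behind panels (C) and (D), and the random gene pairs behind the NULL group, can be illustrated with a minimal Python sketch. It is not the authors' code. The function names and the toy read counts are hypothetical, and treating the intron comparison as a log2 ratio is an assumption: the caption only says that one was added to the intron number to avoid an infinite value.

```python
import math
import random


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman's r: Pearson correlation of the rank-transformed vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        return float("nan")  # undefined when one vector is constant
    return cov / (sx * sy)


def intron_log_ratio(n_query, n_dmel):
    """Assumed form: log2 ratio of intron counts, with one added to each
    count so that intronless genes give a finite value."""
    return math.log2((n_query + 1) / (n_dmel + 1))


def random_gene_pairs(dmel_genes, other_genes, n_pairs, seed=0):
    """Draw random cross-species gene pairs to serve as a NULL group."""
    rng = random.Random(seed)
    return [(rng.choice(dmel_genes), rng.choice(other_genes))
            for _ in range(n_pairs)]


# Toy example: hypothetical normalized read counts for one ortholog pair
# across 14 sexed tissues.
dmel_expr = [5, 12, 0, 40, 33, 8, 2, 19, 27, 1, 60, 14, 9, 3]
dyak_expr = [6, 10, 1, 38, 35, 7, 2, 22, 25, 0, 55, 16, 11, 4]
print(spearman_r(dmel_expr, dyak_expr))
print(intron_log_ratio(3, 1))
print(random_gene_pairs(["geneA", "geneB"], ["geneX", "geneY"], 3))
```

Computing the same statistics on the random pairs as on the known orthologs gives a null distribution to compare against, which is the role of the dotted line in the panels.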