Why Infinite Dimensional Topological Groups May Work For Genetics Boris Khots Dmitriy Khots

Slides:



Advertisements
Similar presentations
1 A camera is modeled as a map from a space pt (X,Y,Z) to a pixel (u,v) by ‘homogeneous coordinates’ have been used to ‘treat’ translations ‘multiplicatively’
Advertisements

Copyright © Cengage Learning. All rights reserved.
Chemical Analysis Qualitative Analysis Quantitative Analysis Determination “Analyze” a paint sample for lead “Determine” lead in a paint sample.
Math 3121 Abstract Algebra I
Algebraic Structures: Group Theory II
SW-ARRAY: a dynamic programming solution for the identification of copy-number changes in genomic DNA using array comparative gnome hybridization data.
Clustering short time series gene expression data Jason Ernst, Gerard J. Nau and Ziv Bar-Joseph BIOINFORMATICS, vol
Structural Behavior of Deck-Stiffened Arch Bridges Allen C. Sit, Sanjay. R. Arwade, University of Massachusetts, Amherst Fixed arch Deflection Arch Stresses.
Microarray GEO – Microarray sets database
Discrete Structures Chapter 5 Relations and Functions Nurul Amelina Nasharuddin Multimedia Department.
1. Elements of the Genetic Algorithm  Genome: A finite dynamical system model as a set of d polynomials over  2 (finite field of 2 elements)  Fitness.
The Islamic University of Gaza Faculty of Engineering Civil Engineering Department Numerical Analysis ECIV 3306 Chapter 3 Approximations and Errors.
Basic Concepts and Definitions Vector and Function Space. A finite or an infinite dimensional linear vector/function space described with set of non-unique.
7-1 Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall Chapter 7 Sampling and Sampling Distributions Statistics for Managers using Microsoft.
Chapter 3 Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.
QUALITY CONTROL OF PHYSICO-Chemical METHODS Introduction :Validation توثيق المصدوقية.
Chapter 5 Several Discrete Distributions General Objectives: Discrete random variables are used in many practical applications. These random variables.
Elec471 Embedded Computer Systems Chapter 4, Probability and Statistics By Prof. Tim Johnson, PE Wentworth Institute of Technology Boston, MA Theory and.
Microarray Gene Expression Data Analysis A.Venkatesh CBBL Functional Genomics Chapter: 07.
Digital Signals and Systems
Thinking Mathematically Algebra: Graphs, Functions and Linear Systems 7.3 Systems of Linear Equations In Two Variables.
Copyright © Cengage Learning. All rights reserved. CHAPTER 11 ANALYSIS OF ALGORITHM EFFICIENCY ANALYSIS OF ALGORITHM EFFICIENCY.
DTFT And Fourier Transform
Section Finding Limits Graphically and Numerically.
Rev.S08 MAC 1140 Module 12 Introduction to Sequences, Counting, The Binomial Theorem, and Mathematical Induction.
Copyright ©2011 by Pearson Education, Inc. publishing as Pearson [imprint] Introductory Circuit Analysis, 12/e Boylestad Chapter 21 Decibels, Filters,
DNA microarray technology allows an individual to rapidly and quantitatively measure the expression levels of thousands of genes in a biological sample.
Numerical Sequences. Why Sequences? There are six animations about limits to show the sequence in the domain and range. Problems displaying the data.
16-1 Copyright  2010 McGraw-Hill Australia Pty Ltd PowerPoint slides to accompany Croucher, Introductory Mathematics and Statistics, 5e Chapter 16 The.
Edge Linking & Boundary Detection
4 4.4 © 2012 Pearson Education, Inc. Vector Spaces COORDINATE SYSTEMS.
Data Analysis 1 Mark Stamp. Topics  Experimental design o Training set, test set, n-fold cross validation, thresholding, imbalance, etc.  Accuracy o.
Copyright ©2011 Pearson Education 7-1 Chapter 7 Sampling and Sampling Distributions Statistics for Managers using Microsoft Excel 6 th Global Edition.
Digital Image Processing CCS331 Relationships of Pixel 1.
TRAJECTORIES IN LIE GROUPS Wayne M. Lawton Dept. of Mathematics, National University of Singapore 2 Science Drive 2, Singapore
G52IVG, School of Computer Science, University of Nottingham 1 Edge Detection and Image Segmentation.
Copyright © Cengage Learning. All rights reserved. 11 Infinite Sequences and Series.
Gauge Fields, Knots and Gravity Wayne Lawton Department of Mathematics National University of Singapore (65)
1 Code optimization “Code optimization refers to the techniques used by the compiler to improve the execution efficiency of the generated object code”
Learning bounded unions of Noetherian closed set systems via characteristic sets Yuichi Kameda 1, Hiroo Tokunaga 1 and Akihiro Yamamoto 2 1 Tokyo Metropolitan.
Numerical Methods.
Topic: Quadratics and Complex Numbers Grade: 10 Key Learning(s): Analyzes the graphs of and solves quadratic equations and inequalities by factoring, taking.
Spectral Sequencing Based on Graph Distance Rong Liu, Hao Zhang, Oliver van Kaick {lrong, haoz, cs.sfu.ca {lrong, haoz, cs.sfu.ca.
4 © 2012 Pearson Education, Inc. Vector Spaces 4.4 COORDINATE SYSTEMS.
AGC DSP AGC DSP Professor A G Constantinides©1 Signal Spaces The purpose of this part of the course is to introduce the basic concepts behind generalised.
For a specific gene x ij = i th measurement under condition j, i=1,…,6; j=1,2 Is a Specific Gene Differentially Expressed Differential expression.
Analyzing Expression Data: Clustering and Stats Chapter 16.
Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc. Chap 7-1 Chapter 7 Sampling and Sampling Distributions Basic Business Statistics 11 th Edition.
Lab for Remote Sensing Hydrology and Spatial Modeling Dept of Bioenvironmental Systems Engineering National Taiwan University 1/45 GEOSTATISTICS INTRODUCTION.
Basic Business Statistics
September 28, 2000 Improved Simultaneous Data Reconciliation, Bias Detection and Identification Using Mixed Integer Optimization Methods Presented by:
Math 3121 Abstract Algebra I Lecture 14 Sections
In this section, we will begin investigating infinite sums. We will look at some general ideas, but then focus on one specific type of series.
LIMITS OF FUNCTIONS. CONTINUITY Definition (p. 110) If one or more of the above conditions fails to hold at C the function is said to be discontinuous.
Warsaw Summer School 2015, OSU Study Abroad Program Normal Distribution.
Onlinedeeneislam.blogspot.com1 Design and Analysis of Algorithms Slide # 1 Download From
Lecture 8: Measurement Errors 1. Objectives List some sources of measurement errors. Classify measurement errors into systematic and random errors. Study.
Introduction to Symmetry Analysis Brian Cantwell Department of Aeronautics and Astronautics Stanford University Chapter 1 - Introduction to Symmetry.
Chapter 2 1. Chapter Summary Sets The Language of Sets - Sec 2.1 – Lecture 8 Set Operations and Set Identities - Sec 2.2 – Lecture 9 Functions and sequences.
Infinite Sequences and Series 8. Sequences Sequences A sequence can be thought of as a list of numbers written in a definite order: a 1, a 2, a.
Mathematical Formulation of the Superposition Principle
Section Finding Limits Graphically and Numerically
CS201: Data Structures and Discrete Mathematics I
“Teach A Level Maths” Vol. 2: A2 Core Modules
University of California, Berkeley
Chapter 2: Analysis of Graphs of Functions
55. Graphing Exponential Functions
2 equations 2 Variables Graphically Substitution Elimination
CONTINUITY.
Presentation transcript:

Why Infinite Dimensional Topological Groups May Work For Genetics Boris Khots Dmitriy Khots

A significant role in mathematical modeling and algorithms for applications to processing of Genetic data (for example, Gene expression data) may play infinite-dimensional P-spaces and connected with them infinite-dimensional P-groups and P-algebras (B.S. Khots, Groups of local analytical homeomorphisms of line and P-groups, Russian Math Surveys, v.XXXIII, 3 (201), Moscow, London, 1978, ). Investigation of the topological-algebraic properties of P-spaces, P- groups and P-algebras is connected with the solution of the infinite-dimensional fifth David Hilbert problem. In Genetic data processing the utilization of the topological-algebraic properties of P-spaces, P-groups and P-algebras may permit to find "Gene functionality". We applied these methods to Yeast Rosetta and Lee- Hood Gene expression data, leukemia ALL-AML Gene expression data and found the sets of Gene-Gene dependencies, Gene-Trait dependencies. In particular, accuracy of leukemia diagnosis is On the other hand, Genetics requires a solution of new mathematical problems. For example, what are the topology-algebraic properties of a P-group (subgroups, normal subgroups, normal serieses,P-algebras, subalgebras, ideals, etc) that is finitely generated by local homeomorphisms of some manifold onto itself?

Main Idea Embedding gene expression data into a known abstract mathematical object Viewing the embedded data as a sub- object Projecting properties of the object onto the embedded data Using the projected properties to analyze the data

Embedding The methods of embedding depend on the chosen mathematical object The following are examples of embedding gene expression data into a group of homeomorphisms of [0, 1] and, more generally [a, b], onto itself

Embedding – [0,1] Example Consider the gene expression data plotted Vs genes We will use basic transformations to convert the noisy gene data into a graph of an increasing homeomorphism The new graph is now viewed as an element of the group H([0, 1])

Embedding – [a, b] Example On the left side of the above figure typical sets of gene expression data across a series of n samples are graphically represented as log ratios over the range [a, b] In the central part, the data is re-formatted as n(n-1)/2 scatter-plots comparing all possible samples Each set of data points can then be embedded into H([a, b)] by reordering as depicted on the right hand side For data points that have identical coordinates, a correction based on noise characteristics can be made to satisfy criteria for inclusion into H([a, b]) We can also embed data into various subgroups of H([a, b])

Mathematical Object The most useful objects (since a lot is known of their properties) for our purpose are P-groups P-group is a topological group and an infinite- dimensional manifold which is locally a P-space with smooth operation, smoothness being taken w.r.t. the topology of the given P-space P-space is a linear topological space with certain projection operators & topology

P-group Example T(M, M) = {f: M → M| f homeo, M manifold} is a P-group In particular, in the two examples above, H([0, 1]) and H[(a, b)] are P-groups

P-group Properties Existence of fixed points for all f in T(M, M) Stationary subgroups Normal subgroups Composition series One-parameter subgroups Isomorphism classes of subgroups

Using Object Properties The above figure depicts a sketch of an algorithm that uses topology-algebraic properties of H([a, b]) for identification of gene expression dependencies The injective transformations Φ, Ψ, …,Ξ, Ω map the homeomorphism segments E, E i, …, E j, E k, onto the elements f E, f Ei, …, f Ej, f Ek in H([a, b]) Gene expression dependency exits iff there is a sequence of the group words W n formed by E i, …, E j, E k in H([a, b]), which converges to f E when n → ∞

Using Object Properties Thus we get a functional dependency of the form f E =F(f Ei, …, f Ej, f Ek )

Results This approach has identified gene expression dependencies from a well- curated yeast gene expression data set available from Rossetta Inpharmatics Using microarray platform, the expression of 6317 genes was profiled from 277 experimental samples (yeast strains, in which a single gene was either deleted or overexpressed) and 63 “controls” (replicate cultures of wild-type yeast) Prior to analysis, we removed –Data for genes that were expressed below sensitivity of detection in a majority of experiments –Data from experiments, where later work showed systematic bias as a result of aneuploidy We have found: –24, 691 dependencies of the form f G0 =F({f Gn }) for 5795 yeast genes –Number of genes per dependency: 3-9 –Number of dependencies per gene: 1-12 –Total number of genes dependent on each G 0 : 3-85