Matching Models to Data in Modelling Morphogen Diffusion
Wei Liu and Mahesan Niranjan, School of Electronics and Computer Science, University of Southampton, United Kingdom. {wl08r, mn}@ecs.soton.ac.uk
Hi everyone, I am Wei Liu from the University of Southampton. I am glad to be here to introduce my work to you. I am going to talk about matching models to data in modelling morphogen diffusion.
Drosophila
Small flies. Length of embryo: 500 μm. Short generation time. Genetics. Common model.
Before that, let me introduce the background of my work. Drosophila is a small fly with a yellow or brown body and red eyes. The Drosophila embryo is about 500 micrometers long. These flies have a short generation time. They have been heavily used in genetics research and are a common model in developmental biology.
flydatabase
Drosophila Development
After fertilization: nuclear division, no cytoplasm cleavage. Syncytium. Pattern formation. Cellular blastoderm. Body axes, segment boundaries.
Let's look at Drosophila development. In the early Drosophila embryo, nuclear division begins after fertilization, but there is no cleavage of the cytoplasm. We call this common cytoplasm with many nuclei a syncytium. It allows morphogen gradients to play a key role in pattern formation. After about 3 hours, cell walls develop, and this stage is called the cellular blastoderm. Then the major body axes and segment boundaries are determined. Finally, we get the adult fly.
From LIFE: The Science of Biology, Purves et al., 1998
Outline
Passive diffusion models for spatial pattern establishment: constant supply. Bicoid morphogens: constant supply followed by exponential decay. Models vs measured data. Parameter estimation.
Here is a brief overview of my talk. Passive diffusion models of morphogen proteins, translated from maternally deposited messenger RNAs, are commonly used to explain how spatial patterns are established. Such models usually assume a constant supply of morphogen. Working with the Bicoid morphogen in Drosophila, we note that this constant-supply assumption is not realistic, because the mRNA is known to decay after a certain time. We solve the model with the combined supply numerically, compare the output with the measured data from the FlyEx database, and estimate the parameters used in the model. These parameters can be assigned sensible values.
What is a morphogen?
Molecules in multi-cellular organisms. Establish spatial patterns of gene expression. Cells far from the source: low level. Cells close to the source: high level. Cells are subdivided into different types. Different cells organise to form different organs.
First, I want to introduce morphogens. In 1952 Alan Turing described a reaction-diffusion system of morphogens and discussed a possible mechanism by which genes may determine the structure of an organism. Morphogens are a class of molecules in multi-cellular organisms. They provide a concentration gradient from a localized source, which establishes spatial patterns of gene expression during embryo development. Cells far from the source receive a low morphogen concentration, while cells close to the source receive a high concentration. The cells can therefore be subdivided into different types according to their position relative to the source. These different cell types go on to form different organs during development.
Turing, A.: The chemical basis of morphogenesis. Philosophical Transactions of the Royal Society B 237(641) (1952) 37–72
Bicoid Morphogen
Drosophila body plan and position information. Contributes to setting up the anterior-posterior axis. Controls cell fate along 70% of this axis.
Several maternal genes contribute to the Drosophila body plan and to position information. In early Drosophila embryo development, Bicoid is a morphogen that contributes to setting up the anterior-posterior axis [3], and it controls cell fate along 70% of this axis [4].
Bicoid Morphogen Concentration
Here I want to show how the Bicoid protein concentration gradient is established.
1. Initially, the maternally provided bicoid mRNA is deposited at the anterior pole of the oocyte.
2. The mRNA is then translated into Bicoid protein after fertilization.
3. After some time, Bicoid proteins diffuse from the localized source and form a concentration gradient from the anterior to the posterior of the embryo. This gradient induces gap gene expression.
4. After a certain time, the mRNA decays.
Figure labels: anterior part, posterior part.
Reaction-diffusion Equation
The reaction-diffusion equation of the single-morphogen concentration system is:
$$\frac{\partial M(x,t)}{\partial t} = D\,\frac{\partial^2 M(x,t)}{\partial x^2} - \frac{1}{\tau_p}\,M(x,t) + S(x,t)$$
M(x,t) is the morphogen concentration. S(x,t) is a general source term at the anterior pole. D is the diffusion constant. $\tau_p$ is the half-life of the morphogen protein.
This is the reaction-diffusion equation, a one-dimensional spatio-temporal diffusion model. M(x,t) is the morphogen concentration as a function of space and time, D is the diffusion constant, $\tau_p$ is the half-life of the morphogen protein, and S(x,t) is the source at the anterior end.
Constant Source
Usual assumption.
The usual assumption in solving this model is that the source is constant:
$$S(x,t) = S_0\,\delta(x)\,\Theta(t)$$
Here $S_0$ is the production rate, $\delta(x)$ is the Kronecker delta function and $\Theta(t)$ is the Heaviside step function. We therefore have a point source at x = 0 that supplies morphogen at a constant rate for all t > 0.
New Source Model
Here we work with a source model that has a constant part, during which the maternal mRNA is kept stable, followed by an exponentially decaying part, because the maternal mRNA decays from about cleavage cycle 12 of the developing embryo. A 1998 paper found that the mRNA decays after cycle 12.
Surdej, P., Jacobs-Lorena, M.: Developmental regulation of bicoid mRNA stability is mediated by the first 43 nucleotides of the 3’ untranslated region. Molecular and Cellular Biology 18(5) (1998) 2892–2900
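One way to write the source described on this slide, as a sketch rather than the exact expression used in the talk: keep the point source of production rate $S_0$ at x = 0 constant until an onset time, then let it fall off exponentially. The symbols $t_0$ (decay onset time) and $\tau_m$ (mRNA decay time constant) are assumed names, and treating $\tau_m$ as an e-folding time rather than a strict half-life is also an assumption. $\delta$ is the delta function and $\Theta$ the Heaviside step function.

```latex
S(x,t) = S_0\,\delta(x)\,\bigl[\Theta(t) - \Theta(t - t_0)\bigr]
       + S_0\,\delta(x)\,\Theta(t - t_0)\,
         \exp\!\left(-\frac{t - t_0}{\tau_m}\right)
```

For $t < t_0$ only the first term is active and the source equals the constant supply; for $t \ge t_0$ only the second term remains, and it is continuous with the first at $t = t_0$.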
Widely used Model with constant source
This figure shows the solution of the reaction-diffusion model of morphogen concentration with the constant source. The intensity profile is shown jointly in time and along the length of the embryo. The right panel shows the constant supply of Bicoid protein at the anterior end of the embryo. This is the widely used model, which at steady state sets up an exponential profile. We obtained this solution numerically using the MATLAB pdepe solver.
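The exponential steady-state profile mentioned here can be checked by hand. The sketch below assumes first-order protein decay at rate $1/\tau_p$, looks only at positions away from the point source, and ignores the posterior boundary. The numbers are the values quoted later in the talk (D = 1.8 μm²/s, protein decay time 111 mins = 6660 s); treating that time as an e-folding time rather than a strict half-life is an assumption.

```latex
% Steady state: set the time derivative to zero for x > 0
0 = D\,\frac{d^2 M}{dx^2} - \frac{M}{\tau_p}
\quad\Longrightarrow\quad
M(x) = M(0)\,e^{-x/\lambda},
\qquad \lambda = \sqrt{D\,\tau_p}

% Illustrative length scale with the values quoted in the talk
\lambda = \sqrt{1.8\ \mu\mathrm{m}^2/\mathrm{s} \times 6660\ \mathrm{s}}
        \approx 109.5\ \mu\mathrm{m}
```

Under these assumptions the gradient falls by a factor of e roughly every 110 μm, which is a plausible scale for a 500 μm embryo.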
The more realistic model
This figure shows the solution of the model that we think is more realistic, in which the source is a constant supply followed by an exponential decay, because the maternally deposited bicoid mRNA decays. We call the part after the peak value the post-peak stage. The intensity of this solution differs from that of the constant-source model: at a fixed position in the embryo, the concentration first increases and then decreases.
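The rise-then-fall behaviour can be reproduced with a small simulation. The sketch below is not the pdepe setup used in the talk: it swaps in a plain explicit finite-difference scheme with no-flux boundaries, injects the source into the first grid cell, and treats both decay times as e-folding times. The function names, grid size, time step and `s0 = 1.0` are illustrative assumptions; the default D, onset time and decay times are the values quoted later in the talk.

```python
import numpy as np


def source_rate(t, s0, t0, tau_m):
    """Constant supply until t0, then exponential decay (assumed form)."""
    if t < t0:
        return s0
    return s0 * np.exp(-(t - t0) / tau_m)


def simulate(D=1.8, tau_p=111 * 60.0, t0=120 * 60.0, tau_m=29 * 60.0,
             s0=1.0, length=500.0, nx=101, dt=2.0, t_end=180 * 60.0,
             save_every=60.0):
    """Explicit finite-difference solve of dM/dt = D M_xx - M/tau_p + S.

    Units: micrometres and seconds. Returns (times, profiles), where
    profiles[k] is the concentration along the embryo at times[k].
    """
    dx = length / (nx - 1)
    # Stability condition of the explicit scheme for the diffusion term.
    if D * dt / dx**2 > 0.5:
        raise ValueError("time step too large for this grid")

    M = np.zeros(nx)
    n_steps = int(round(t_end / dt))
    stride = int(round(save_every / dt))
    times, profiles = [0.0], [M.copy()]

    for step in range(1, n_steps + 1):
        t = (step - 1) * dt
        # Edge padding mirrors the end values, giving no-flux boundaries.
        padded = np.pad(M, 1, mode="edge")
        laplacian = (padded[2:] - 2.0 * M + padded[:-2]) / dx**2
        M = M + dt * (D * laplacian - M / tau_p)
        # Point source at the anterior pole, spread over the first cell.
        M[0] += dt * source_rate(t, s0, t0, tau_m) / dx
        if step % stride == 0:
            times.append(step * dt)
            profiles.append(M.copy())

    return np.array(times), np.array(profiles)
```

Calling `times, profiles = simulate()` gives one profile per minute; plotting `profiles[:, 0]` against `times / 60` shows the anterior concentration climbing until shortly after the decay onset and then falling.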
Measured Data Ⅰ
FlyEx database used to estimate the parameters of the model. Measured data: one-dimensional Bicoid integrated data in nuclear cleavage cycle 14A. Cycle 14A: about 50 mins in duration; 8 equal temporal classes; about 6 mins each class.
We used the FlyEx database [8] to estimate the parameters of the model. From the FlyEx database [8], we use the one-dimensional Bicoid integrated data in nuclear cleavage cycle 14A as the measured data. Cycle 14A lasts nearly 50 mins and is divided into 8 equal temporal classes of about 6 mins each (50/8 ≈ 6.25).
Andrei Pisarev, Ekaterina Poustelnikova, Maria Samsonova, John Reinitz (2009). FlyEx, the quantitative atlas on segmentation gene expression at cellular resolution. Nucl. Acids Res., 37: D560-D566.
Ekaterina Poustelnikova, Andrei Pisarev, Maxim Blagov, Maria Samsonova, and John Reinitz (2004). A database for management of gene expression data in situ. Bioinformatics, 20: 2212-2221.
Measured Data Ⅱ
1D integrated data: cycle 11 to cycle 14A (classes 1-8). From the FlyEx database.
This figure shows the intensities of the Bicoid morphogen in cycle 14A for the different temporal classes. These are one-dimensional data.
Measured Data Ⅲ Integrated 2D patterns – reconstructed image 14A-2
Comparison of model-based and measured data of Bicoid intensities
[Figure: model output and measured Bicoid intensities; time axis in mins, ticks from 130 to 178 in steps of 6]
This figure shows the intensities from the morphogen diffusion model, with a constant source followed by an exponentially decaying source, alongside the measured data from FlyEx. These are the 8 temporal classes. During this period the Bicoid concentration begins to fall, because both the source mRNA and the diffused protein are decaying. There is a good match between the two matrices in the post-peak stages of the morphogen profile, jointly in space and time. To the best of our knowledge, these stages have not attracted interest in the literature, and the popular model with a constant morphogen source is clearly incorrect in these stages.
Matching Parameter Values to Data Ⅰ
Squared error between model output and measured intensities is used to evaluate error. Parameter estimation.
In this work we used the squared error between model output and measured intensities to evaluate the error when estimating the parameters:
$$E = \sum_{t=T_1}^{T_2} \sum_{x} \bigl[M(x,t) - M_d(x,t)\bigr]^2$$
where $T_1$ and $T_2$ are the boundaries of cleavage cycle 14A, for which data, $M_d(x,t)$, were available at eight uniformly sampled time points. We need to estimate 4 parameters. The errors are computed during cycle 14A, holding three of the four parameters at their best estimates from the literature and varying the fourth.
Diffusion constant D = 1.8 μm²/s. Time at which the mRNA starts to decay = 120 mins. mRNA half-life = 29 mins. Bicoid protein half-life = 111 mins.
Reference: Molecular and Cellular Biology, 1998.
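The one-parameter-at-a-time procedure described on this slide can be sketched as follows. This is an illustration under stated assumptions, not the code used for the talk: `run_model` is a hypothetical callable that takes the model parameters as keyword arguments and returns model intensities on the same space-time grid as the measured data, and the error is taken as a plain sum of squared differences over that grid.

```python
import numpy as np


def squared_error(model, data):
    """Sum of squared differences between model output and measured data."""
    model = np.asarray(model, dtype=float)
    data = np.asarray(data, dtype=float)
    if model.shape != data.shape:
        raise ValueError("model and data must be on the same grid")
    return float(np.sum((model - data) ** 2))


def sweep_one_parameter(run_model, data, fixed, name, candidates):
    """Vary one parameter while holding the others at fixed values.

    run_model  : hypothetical callable, run_model(**params) -> array
    fixed      : dict of the parameters that stay constant
    name       : name of the parameter being varied
    candidates : sequence of values to try for that parameter
    Returns the best candidate and the list of errors, one per candidate.
    """
    errors = []
    for value in candidates:
        params = dict(fixed, **{name: value})
        errors.append(squared_error(run_model(**params), data))
    best = candidates[int(np.argmin(errors))]
    return best, errors
```

Plotting `errors` against `candidates` gives the kind of one-dimensional error curve used to read off a minimum for each parameter.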
Matching Parameter Values to Data Ⅱ
The errors in the joint space of diffusion constant and maternal mRNA decay onset time.
This figure shows the errors in the joint space of the diffusion constant and the maternal mRNA decay onset time. The error achieves a minimum in the range of sensible values.
Finding Optimal Values for all the Parameters Ⅲ
Best combination of parameter values, searched simultaneously. Diffusion constant D = 1.83 μm²/s. Time at which the mRNA starts to decay = 118 mins. mRNA half-life = 28.4 mins. Bicoid protein half-life = 120 mins.
We also searched for the best combination of all parameter values at the same time. This search gave values close to the results obtained previously. Bergmann et al. [4] suggest a value for the diffusion constant in the range 0.3-3 μm²/s. When degradation starts later than 140 mins or earlier than 110 mins, the error increases significantly. Bergmann et al. [4] suggested that values for τp should be higher than the range 65-100 mins. Our estimate of the bicoid mRNA half-life is about 29 mins (28.4 mins in the joint search), which is nearly 1/4 of the Bicoid protein decay time from our model.
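A simultaneous search over all parameters can be sketched as an exhaustive grid search. The talk does not say which search method was used, so this is only one possible approach; `error_fn` is a hypothetical callable that returns the squared error for a given set of parameter values, and the parameter names in the usage note are illustrative.

```python
import itertools


def grid_search(error_fn, grid):
    """Try every combination of candidate values and keep the best one.

    error_fn : hypothetical callable, error_fn(**params) -> float
    grid     : dict mapping parameter name -> list of candidate values
    Returns (best_params, best_error).
    """
    names = list(grid)
    best_params, best_error = None, float("inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        error = error_fn(**params)
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error
```

For example, `grid_search(error_fn, {"D": [1.6, 1.8, 2.0], "t0": [110, 120, 130]})` evaluates nine combinations. The cost grows as the product of the candidate counts, so with four parameters the grids need to stay coarse or be refined around a first rough minimum.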
Conclusion and Future Work
The widely used model with a constant source is unrealistic. Parameters can be found by matching model output to data. The single measurements come without uncertainties. Future work: developing a stochastic model for a population of embryos, e.g. a master equation model; developing a data-driven model for embryo spatio-temporal data, e.g. a Kriged Kalman Filter.
The widely used model of a constant supply of morphogen at the source is not realistic, for a number of reasons, for example the decay of the source mRNA after a certain time. Using the morphogen Bicoid as an example, we have shown how the parameters of the diffusion model can be calculated. In the present study we used single measurements of profiles from the FlyEx database, for which the measurement uncertainties are not quantified. The next step in this work would be to acquire uncertainties in Bicoid profile measurements, arising from a distribution across a population of embryos, formulate the estimation problem in a probabilistic setting, and carry out posterior inference along the lines in [11].
Thanks !
Kriged Kalman Filter
KF: linear Gaussian state space model. KKF: modelling spatio-temporal data (Mardia et al., 1998).
The Kalman filter (KF) is used for the linear Gaussian state space model, under the assumption that all the parameters are known. However, this state space model only describes temporal dynamics. The Kriged Kalman Filter (KKF) proposed by Mardia et al. can be used to model spatio-temporal data, and we want to use the KKF for our embryo data, which are also spatio-temporal. Based on this model, we can infer the underlying state or learn the parameters. Of course, applying this model is still a challenge for us. There are many difficulties; for example, the number of time points is limited, which can cause over-fitting.
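For reference, one predict-and-update step of the standard Kalman filter for a linear Gaussian state space model can be sketched as below. This covers only the temporal KF part with all parameters assumed known; the spatial (kriging) part of the KKF is not shown. The matrix names A, C, Q and R (state transition, observation, process noise covariance, observation noise covariance) are conventional choices, not notation from the talk.

```python
import numpy as np


def kalman_step(mean, cov, y, A, C, Q, R):
    """One Kalman filter step for x_t = A x_{t-1} + w, y_t = C x_t + v.

    mean, cov : posterior mean and covariance of the state at time t-1
    y         : observation at time t
    Returns the posterior mean and covariance of the state at time t.
    """
    # Predict: propagate the state estimate through the dynamics.
    mean_pred = A @ mean
    cov_pred = A @ cov @ A.T + Q

    # Update: correct the prediction using the new observation.
    innovation_cov = C @ cov_pred @ C.T + R
    gain = cov_pred @ C.T @ np.linalg.inv(innovation_cov)
    mean_new = mean_pred + gain @ (y - C @ mean_pred)
    cov_new = (np.eye(len(mean)) - gain @ C) @ cov_pred
    return mean_new, cov_new
```

With only eight time classes per embryo, each such step carries a lot of weight, which is one way to see the over-fitting concern raised on this slide.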
FlyEx Database
Step 1: Data Acquisition. Acquisition of quantitative data on gene expression in individual embryos.
Step 2: Data Registration. Excision of a 10% stripe of quantitative data; feature extraction; registration.
Step 3: Data Normalization. Rescaling of the data to bring them to a unified standard form with a zero background.
Step 4: Data Averaging. Construction of the integrated pattern of expression for each gene.
The FlyEx database stores quantitative data on gene expression in the segmentation genetic network of Drosophila melanogaster.
Andrei Pisarev, Ekaterina Poustelnikova, Maria Samsonova, John Reinitz (2009). FlyEx, the quantitative atlas on segmentation gene expression at cellular resolution. Nucl. Acids Res., 37: D560-D566.
Ekaterina Poustelnikova, Andrei Pisarev, Maxim Blagov, Maria Samsonova, and John Reinitz (2004). A database for management of gene expression data in situ. Bioinformatics, 20: 2212-2221.