Bioinformatics and Biostatistics in Limagrain / Biogemma


1 Bioinformatics and Biostatistics in Limagrain / Biogemma
JOBIM Conference, July 2015

2 An international agricultural cooperative group
4th largest seed company worldwide
Nearly 2,000 farmer members
Sales of nearly 2 billion Euros
Nearly 9,000 employees
Subsidiaries in 42 countries
13.5% of turnover re-invested in research
A portfolio of strong brands

3 A group that specializes in seeds and cereal products
Limagrain Coop: Field Seeds, Vegetable Seeds, Garden Products, Cereal Products, Cereal Ingredients, Bakery Products

4 A European group open to the world
Nearly ,000 employees, 66 nationalities
69% of sales achieved outside France
Subsidiaries in countries
Europe: 64% of sales, 64% of workforce
Americas: 23% of sales, 16% of workforce
Asia & Pacific: 7% of sales, 12% of workforce
Africa & Middle East: 6% of sales, 8% of workforce

5 An innovative group
13.5% of turnover invested in research: 200 M€ (with collaborations)
Limagrain: 13.5%
Pharmaceutical industry: 10.2%*
Automobile industry: 5.4%*
Average industry: 2.25%*
* Source: Leem, April 2013

6 BIOGEMMA, a research partnership
Biotechnologies 9.5% 16% 55 % 10% Field Seeds

7 Biogemma
Identification of genes associated with agronomic traits
Development of GM varieties in cereals
Development of tools and knowledge
BIOINFORMATICS

8 Bioinformatics for breeding
Molecular Breeding
Biostatistics: Discover Associations
Bioanalysis: Explain Associations
Bioinformatics (tools, db): Analyze NGS-based data; develop databases and tools to store and analyse biological data

9 Omics analysis
Biological material: DNA (Genes, Genomes); RNA (mRNA, rRNA; Transcriptome); Enzyme (Proteome); Metabolome; Phenotype (Phenome)
Flow: Transcription, Translation, Expression, under the influence of the Environment
Regulation of expression: Chromatin Silencing; Regulation of transcription; miRNA, siRNA; RNA stability; Regulation of translation; Protein modification, interaction, turnover
What we measure: Markers; mRNA (Transcription levels, DGE); Protein (Quantity, Activity levels); Trait
How we measure: Genotyping, Sequencing; RNA-Seq, microarrays; HPLC, Crystallography; IA, NIR, HPLC, eyeball
LD mapping, GWAS, GS

10 A great deal of complex information to correlate
Genotype, Environment, Phenotype
Data processing tools are getting more and more sophisticated

11 Data analysis & processing
Data Life Cycle
Data production & acquisition: field trials, genotyping, sequencing, genomics
Data analysis & processing: LIMS, databases, data retrieval, quality control, statistical analyses, building predictive model
Results interpretation & decision support: evaluation of individuals, predicting cross value

12 Data production & acquisition
Sequencing
NGS based: whole genome, targeted sequencing, transcriptome
Deliverables: SNP, structural variations, gene expression level, genomes
Genotyping
High density chips: 10^3 to 10^5 SNP, 10^5 samples
Automate calling / quality control
Figure sample labels: Steem_Z30_rep1, Steem_Z30_rep2, Steem_Z32_rep1, Steem_Z32_rep2, Steem_Z65_rep1, Steem_Z65_rep2
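The automated quality control step mentioned on this slide can be illustrated with a minimal sketch. It assumes genotypes are coded as alternate-allele counts (0, 1, 2) with `None` for a failed call, and it filters SNPs on call rate and minor allele frequency. The function names, the toy data, and the thresholds (0.9 and 0.05) are hypothetical choices for illustration, not the pipeline used at Biogemma.

```python
from typing import Dict, List, Optional

# Assumed coding: count of the alternate allele (0, 1, 2); None = failed call.
Calls = List[Optional[int]]


def call_rate(calls: Calls) -> float:
    """Fraction of samples with a successful genotype call."""
    return sum(c is not None for c in calls) / len(calls)


def minor_allele_frequency(calls: Calls) -> float:
    """Frequency of the rarer allele among successfully called samples."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    alt_freq = sum(observed) / (2 * len(observed))  # diploid: 2 alleles per sample
    return min(alt_freq, 1.0 - alt_freq)


def filter_snps(genotypes: Dict[str, Calls],
                min_call_rate: float = 0.9,   # hypothetical threshold
                min_maf: float = 0.05) -> List[str]:  # hypothetical threshold
    """Return the names of SNPs that pass both quality filters."""
    return [
        snp for snp, calls in genotypes.items()
        if call_rate(calls) >= min_call_rate
        and minor_allele_frequency(calls) >= min_maf
    ]


# Toy data, invented for this example.
example = {
    "snp_A": [0, 1, 2, 1, 0, 1, 2, 0, 1, 1],              # well called, polymorphic
    "snp_B": [0, None, None, 1, None, 0, 2, None, 1, 0],  # too many failed calls
    "snp_C": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],              # monomorphic
}

print(filter_snps(example))  # only snp_A passes both filters
```

With chips in the range of 10^3 to 10^5 SNP, the same per-SNP logic would normally be vectorised, but the filtering criteria stay the same.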

13 Data production & acquisition
Phenotypic data
Automate data collection: sensors, images, NIR spectrometry…
Adjustments/corrections by geostatistical methods
Extraction of relevant information

14 Data production & acquisition
Environmental data
Local / internal: sensors, airborne imagery, …
Global / external: databases, internet, satellite images, …
Precise description of the growing conditions: air temperature, relative humidity, dew point
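The three variables listed here are linked: dew point can be derived from air temperature and relative humidity. The sketch below uses the Magnus approximation with one commonly quoted pair of coefficients (17.62 and 243.12 °C). The function name is hypothetical, and the slides do not say how dew point was actually obtained (it may have been measured directly).

```python
import math

# Magnus coefficients (one commonly used parameterisation over water).
MAGNUS_A = 17.62
MAGNUS_B = 243.12  # degrees Celsius


def dew_point_celsius(air_temp_c: float, relative_humidity_pct: float) -> float:
    """Approximate dew point (°C) from air temperature (°C) and relative humidity (%)."""
    if not 0.0 < relative_humidity_pct <= 100.0:
        raise ValueError("relative humidity must be in (0, 100]")
    gamma = (math.log(relative_humidity_pct / 100.0)
             + MAGNUS_A * air_temp_c / (MAGNUS_B + air_temp_c))
    return MAGNUS_B * gamma / (MAGNUS_A - gamma)


# At saturation the dew point equals the air temperature;
# at 20 °C and 50% relative humidity it is a little above 9 °C.
print(round(dew_point_celsius(20.0, 100.0), 2))
print(round(dew_point_celsius(20.0, 50.0), 2))
```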

15 Modelling
Molecular data: Cost ↓, Availability ↑
Predict: genotype → phenotype
QTL/GWAS: identify genomic regions involved
Genomic selection: "black box" approach
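To make the "black box" idea concrete, here is a minimal sketch of genomic prediction by ridge regression on marker scores, one common way to predict phenotype from all markers at once without identifying individual QTL. Everything in it is an assumption for illustration: the function names, the penalty value, and the toy data (generated from invented marker effects). It is not presented as the model used at Limagrain.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Model = Tuple[float, List[float], List[float]]  # (phenotype mean, marker means, effects)


def solve(a: Matrix, b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_ridge(markers: Matrix, phenotypes: List[float], penalty: float = 1.0) -> Model:
    """Estimate shrunken marker effects from (Z'Z + penalty * I) beta = Z'y."""
    n, p = len(markers), len(markers[0])
    marker_means = [sum(row[j] for row in markers) / n for j in range(p)]
    pheno_mean = sum(phenotypes) / n
    z = [[row[j] - marker_means[j] for j in range(p)] for row in markers]  # centred markers
    y = [v - pheno_mean for v in phenotypes]                               # centred phenotypes
    ztz = [[sum(z[i][a] * z[i][b] for i in range(n)) + (penalty if a == b else 0.0)
            for b in range(p)] for a in range(p)]
    zty = [sum(z[i][a] * y[i] for i in range(n)) for a in range(p)]
    return pheno_mean, marker_means, solve(ztz, zty)


def predict(model: Model, genotype: List[float]) -> float:
    """Predict the phenotype of a genotyped but unphenotyped individual."""
    pheno_mean, marker_means, effects = model
    return pheno_mean + sum(e * (g - m) for e, g, m in zip(effects, genotype, marker_means))


# Toy training set: 6 individuals, 3 markers coded 0/1/2.
# Phenotypes were generated as 10 + 2*m1 - 1*m2 + 0*m3 (invented effects).
train_markers = [[0, 2, 1], [1, 1, 0], [2, 0, 2], [0, 0, 1], [2, 2, 0], [1, 0, 2]]
train_phenotypes = [8.0, 11.0, 14.0, 10.0, 12.0, 12.0]

model = fit_ridge(train_markers, train_phenotypes, penalty=1.0)
print(predict(model, [2, 1, 1]))  # predicted value for a new candidate
```

The penalty shrinks all marker effects toward zero, which is what keeps the fit stable when there are far more markers than phenotyped individuals. In practice its value would be tuned, for example by cross-validation.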

16 Modelling
Statistical methods: linear mixed models, Bayesian approaches
More and more complex models: GxE, epistasis → computationally intensive methods (from Van Eeuwijk et al., 2010)

17 Data management
Integrative viewer for genomic data
Databases
BIG DATA: large volume of structured and unstructured data

18 Infrastructure
Local on-the-premises computing: "data-centric computing", central enterprise resources, security, NGS data analysis on BIOGEMMA HPC (912 cores)
Elastic (cloud): flexibility, low cost / hour CPU

19 Take Home Messages
Bioinformatics: a major activity supporting a large range of applications in Limagrain
Genomics, Phenomics, Enviromics
Biostatistics, Modelling and Prediction
Big Data (HPC, data management)
Both R&D and Applied
In a highly competitive and challenging research area

20 More information…

21 Thank you

