
1 Software and Databases for Managing and Selecting Molecular Markers
General introduction.
Pathway approach for candidate gene identification and introduction to metabolic pathway databases.
Identification of polymorphisms in database sequences.

2 Databases (General and Crop Specific)
Germplasm: GRIN: http://www.ars-grin.gov/npgs/ ; TGRC: http://tgrc.ucdavis.edu/
Sequence: NCBI: http://www.ncbi.nlm.nih.gov/ ; SGN: http://solgenomics.net/
Metabolic: PlantCyc: http://www.plantcyc.org:1555/PLANT/server.html?

3 New format to NCBI

4 Access current and past scientific literature

5 Increased emphasis on phenotypic data

6 Germplasm databases

7

8

9 Crop specific germplasm resources

10 Example: QTL for color uniformity in elite crosses (Audrey Darrigues, Eileen Kabelka)
QTL   Trait    Origin
2     L, YSD   S. lyc.
4     YSD      S. lyc.
6     L, Hue   og c
7     L, Hue   S. hab.
11    L, Hue   S. lyc.

11 Carotenoid Biosynthesis: Candidate pathway for genes that affect color and color uniformity. Disclaimer: this is not the only candidate pathway…

12 Databases that link pathways to genes
http://www.arabidopsis.org/help/tutorials/aracyc_intro.jsp

13 Databases that link pathways to genes
http://metacyc.org/
http://www.plantcyc.org/
http://sgn.cornell.edu/tools/solcyc/
http://www.arabidopsis.org/biocyc/index.jsp
http://www.arabidopsis.org/help/tutorials/aracyc_intro.jsp
External plant metabolic databases:
CapCyc (Pepper) (C. annuum)
CoffeaCyc (Coffee) (C. canephora)
SolCyc (Tomato) (S. lycopersicum)
NicotianaCyc (Tobacco) (N. tabacum)
PetuniaCyc (Petunia) (P. hybrida)
PotatoCyc (Potato) (S. tuberosum)
SolaCyc (Eggplant) (S. melongena)

14 http://www.plantcyc.org:1555/

15

16

17 Note: missing step (lycopene isomerase, tangerine)

18

19 Check boxes (Note: MetaCyc has many more choices, but no plants)

20

21

22

23 Scroll down the page. Capsicum annuum sequence retrieved.

24

25 http://www.ncbi.nlm.nih.gov/

26 Select database

27

28 Zeaxanthin epoxidase. Probable location on Chromosome 2.
Alignment of Z83835 and EF581828 reveals 5 SNPs over ~2000 bp.
Query CCACCACCATCCTCACTTTAACCCACAAATCCCACTTTCTTTGGCCTAATTAACAATTTT
      |||||||||||||||||||||||||||||||||| |||||||||||||||||||||||||
Sbjct CCACCACCATCCTCACTTTAACCCACAAATCCCATTTTCTTTGGCCTAATTAACAATTTT
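The mismatch in the alignment window on slide 28 can also be located programmatically. The following is a minimal sketch, assuming the two sequences are already aligned with no gaps (as in the 60-base window shown); the function name `find_mismatches` is hypothetical, and the reported position is relative to the displayed window, not to the GenBank coordinates of Z83835 or EF581828.

```python
def find_mismatches(query: str, subject: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, query base, subject base) for each mismatch.

    Assumes both strings come from an ungapped alignment of equal length.
    """
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have the same length")
    return [
        (pos, q, s)
        for pos, (q, s) in enumerate(zip(query, subject), start=1)
        if q != s
    ]


# The 60-base window copied from the slide.
QUERY = "CCACCACCATCCTCACTTTAACCCACAAATCCCACTTTCTTTGGCCTAATTAACAATTTT"
SBJCT = "CCACCACCATCCTCACTTTAACCCACAAATCCCATTTTCTTTGGCCTAATTAACAATTTT"

for pos, q, s in find_mismatches(QUERY, SBJCT):
    print(f"position {pos}: {q} > {s}")  # prints "position 35: C > T"
```

A mismatch found this way is only a candidate polymorphism; it would still need to be validated before use as a marker.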

29 51 annotated loci

30 Information missing from other databases is here… Candidates identified in other databases are here

31

32 Comment on the databases:
Information is not always complete or up to date.
Display is not always optimal, and several steps may be needed to go from pathway > gene > potential marker.
Sequence data has error associated with it. eSNPs are not the same as validated markers.
Germplasm data may also have error (e.g. PI 128216).
There is a wealth of information organized and available.

33 The previous example detailed how we might identify sequence-based markers for trait selection.
Query CCACCACCATCCTCACTTTAACCCACAAATCCCACTTTCTTTGGCCTAATTAACAATTTT
      |||||||||||||||||||||||||||||||||| |||||||||||||||||||||||||
Sbjct CCACCACCATCCTCACTTTAACCCACAAATCCCATTTTCTTTGGCCTAATTAACAATTTT
Improving efficiency of selection in terms of 1) relative efficiency of selection, 2) time, 3) gain under selection and 4) cost will benefit from markers for both forward and background selection.
The remainder of the presentation will focus on:
Where to apply markers in a program
Forward and background selection
Marker resources
Alternative population structures and size

34 Relative efficiency of selection: $r_{gen} \times (H_i / H_d)$
Comparison of direct selection with indirect selection (MAS).
Line performance over locations > MAS > Single plant
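The expression on slide 34 can be evaluated directly. The sketch below is illustrative only: it assumes that H_i and H_d are the square roots of the heritabilities of the indirectly selected character (the marker) and the directly selected trait, as in the standard correlated-response formula with equal selection intensities. The slide does not define these symbols, and the function name and the input values are hypothetical.

```python
def relative_efficiency(r_gen: float, h_indirect: float, h_direct: float) -> float:
    """Relative efficiency of indirect vs. direct selection: r_gen * (H_i / H_d).

    Assumption: h_indirect and h_direct are square roots of heritability,
    and both selection schemes use the same selection intensity.
    """
    if h_direct <= 0:
        raise ValueError("h_direct must be positive")
    return r_gen * h_indirect / h_direct


# Hypothetical values: a marker scored without error (H_i = 1.0) that is
# genetically correlated with the trait at 0.8, compared with phenotypic
# selection on a trait whose square-root heritability is 0.5.
print(relative_efficiency(r_gen=0.8, h_indirect=1.0, h_direct=0.5))
```

Under these assumptions, a value above 1 means indirect selection is expected to outperform direct selection, which is most likely when the trait itself has low heritability (for example, single-plant evaluation).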

35 Accelerating Backcross Selection
Expected proportion of Recurrent Parent (RP) genome in BC progeny (RP:donor):
F1   50:50
BC1  75:25
BC2  87.5:12.5
BC3  93.75:6.25
BC4  96.875:3.125
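The proportions on slide 35 follow from halving the donor contribution at each backcross, so the expected RP fraction after n backcrosses is 1 - (1/2)^(n+1). A minimal sketch that reproduces the table is given here; the function name is hypothetical, and these are expectations without selection, so individual progeny will vary around them.

```python
def expected_rp_fraction(n_backcrosses: int) -> float:
    """Expected recurrent-parent genome fraction after n backcrosses.

    n_backcrosses = 0 corresponds to the F1 (50% RP).
    """
    if n_backcrosses < 0:
        raise ValueError("n_backcrosses must be non-negative")
    return 1.0 - 0.5 ** (n_backcrosses + 1)


for n in range(5):
    label = "F1" if n == 0 else f"BC{n}"
    rp = 100 * expected_rp_fraction(n)
    print(f"{label}: {rp:g}:{100 - rp:g}")
```

Background selection with markers aims to identify progeny that exceed these expected values, which is how it can shorten the number of backcross generations needed.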

36 References: Frisch, M., M. Bohn, and A.E. Melchinger. 1999. Comparison of Selection Strategies for Marker-Assisted Backcrossing of a Gene. Crop Science 39: 1295-1301.

37 Progeny needed for background selection during MAS. Q10 indicates a 90% probability of success. From Frisch et al., 1999.

38 Marker Data Points required (Modified from Frisch et al., 1999; based on assumption of 12 chromosomes; initial selection with 4 markers/chromosome)

39 For effective background selection we need:
Markers for our target locus (C > T SNP for Zep)
Markers on the target chromosome (Chrom. 2)
Markers unlinked to the target chromosome (~2 per chromosome arm)

40 http://www.tomatomap.net http://sgn.cornell.edu/

41 Ovate

42

43 HBa0104A12

44

45

46 55 polymorphic markers 44 polymorphic markers

47 Where can we expect to be? Data are based on an estimated ~42% of the sequence; therefore expect as many as 300 markers for a cross like E6203 x H1706 (analysis by Buell et al., unpublished).

48 When is the time to move from reliance on public databases to in-house pipelines?
[Diagram labels: DOS; UNIX; CygWin (Unix emulator); BLAST; BioPerl; Perl; Cyc; NCBI; In-house database]

49

50 Complete genome sequences are available for soybean, corn, potato, tomato, and cucumber; more are coming.

51 When is the time to move from reliance on public databases to in-house pipelines?
[Diagram labels: DOS; UNIX; CygWin (Unix emulator); BLAST; BioPerl; Perl; Cyc; NCBI; In-house database]

52 QTLs mapped in a bi-parental cross may not be appropriate for MAS in all populations:
Marker allele and trait may not be linked in all populations.
Genetic background effects may be population specific.
Original association may be spurious.
QTL detection is dependent on the magnitude of the difference between alleles and the variance within marker classes.
Confirmation of phenotype along the way is very important!
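The point on slide 52 that QTL detection depends on both the difference between alleles and the variance within marker classes can be illustrated with a simple two-class comparison. This sketch uses a Welch-style t statistic as a stand-in for a single-marker test; it is not the method used in the presentation, and the phenotype values are invented purely for illustration.

```python
from math import sqrt
from statistics import mean, variance


def marker_class_t(class_a: list[float], class_b: list[float]) -> float:
    """Welch-style t statistic comparing trait means of two marker classes."""
    se = sqrt(variance(class_a) / len(class_a) + variance(class_b) / len(class_b))
    return (mean(class_a) - mean(class_b)) / se


# Hypothetical phenotypes: both pairs differ by 6 units between marker
# classes, but the second pair has much larger within-class variance.
tight = marker_class_t([10, 12, 14], [4, 6, 8])
noisy = marker_class_t([2, 12, 22], [-4, 6, 16])
print(f"low within-class variance:  t = {tight:.2f}")
print(f"high within-class variance: t = {noisy:.2f}")
```

The same allelic difference gives a much weaker test statistic when within-class variance is high, which is one reason an association detected in one population may not be detectable, or may not hold, in another.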

53 Take home messages:
Marker resources exist for forward and background selection in elite x elite crosses in tomato.
Marker resources are currently not sufficient for QTL discovery in bi-parental or association mapping (AM) populations; they will soon be.
The best time to use genetic markers: early generation selection.
Restructuring of a breeding program to integrate markers may include:
1) Increasing genotypic replication (population size) at the expense of replication (consider augmented designs).
2) Collecting objective data.

54 References:
Kaeppler, 1997. TAG 95: 618-621.
Frisch et al., 1999. Crop Science 39: 1295-1301.
Knapp and Bridges, 1990. Genetics 126: 769-777.
Yu et al., 2006. Nature Genetics 38: 203-208.
Van Deynze et al., 2007. BMC Genomics 8: 465. www.biomedcentral.com/content/pdf/1471-2164-8-465.pdf

