L. Perelygina (BIO-GSU)


1 Consolidating Software Tools for DNA Microarray Design and Manufacturing
M. Atlas (CS-GSU), N. Hundewale (CS-GSU), L. Perelygina (BIO-GSU), A. Zelikovsky (CS-GSU)

2 Microarray Technology
A microarray is a device to which DNA is bound for analysis with homologous cDNA or RNA.
Also referred to by other names: microchip, biochip, DNA chip, DNA microarray, gene array, GeneChip®, and genome chip.
Microarrays provide a tool for answering a wide variety of questions about the dynamics of cells:
- In which cells is each gene active?
- Under what environmental conditions is each gene active?
- How does the activity level of a gene change under different conditions?
  - Stage of a cell cycle?
  - Environmental conditions?
  - Diseases?
- What genes seem to be regulated together?

3 Microarray Technology (cont.)
Two general types are popular:
- Spotted arrays (Pat Brown, Stanford).
- Oligonucleotide arrays (Affymetrix).
Both are based on the same basic principles:
- Anchoring pieces of DNA to glass/nylon slides.
- Complementary hybridization.
Spotted arrays:
- Start with control cells (left) and target cells (right).
- Harvest mRNA from both cell groups.
- Tag the mRNA with green and red dye.
- Apply the mRNA to the cDNA microarray.
- Read the result using a laser.
- A false-color composite represents the results.
Oligonucleotide arrays (gene chips):
- Instead of putting entire genes on the array, put sets of DNA 25-mers (synthesized oligonucleotides).
- Produced using a photolithography process similar to the ones used to make semiconductor chips.
- mRNA samples are processed separately instead of in pairs.

4 DNA Array Flow
Input: Genome ID.
- Reading genomic data: download the genome sequence and produce ORFs in FASTA format.
- Probe selection: for each gene G, find probes that hybridize to G at a given melting temperature (Tm) and do not hybridize to any other gene at that Tm.
- Physical design: probe placement determines, for each probe, the site on the 2-D array surface where it is placed or synthesized; probe embedding determines the step at which each nucleotide is synthesized.
- Mask and array manufacturing: the DNA array goes through a combination of photolithography and combinatorial chemistry.
- Hybridization experiment: each probe is quantified, and probes are diluted so that all are at an equal concentration.
- Analysis of hybridization intensities: intensities are measured by a laser scanner and converted to a quantitative readout.
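The probe selection requirement in this flow (a probe must match its own gene and no other) can be illustrated with a minimal sketch. It makes a strong simplifying assumption: an exact substring match stands in for hybridization at a given Tm, which real probe selection tools model thermodynamically. The function name `candidate_probes` and the toy sequences are hypothetical.

```python
def candidate_probes(genes, length):
    """Return {gene_name: [probes]} where each probe occurs in exactly one gene.

    Assumption: exact substring occurrence approximates "hybridizes to".
    """
    # Map every substring of the given length to the set of genes containing it.
    owners = {}
    for name, seq in genes.items():
        for i in range(len(seq) - length + 1):
            owners.setdefault(seq[i:i + length], set()).add(name)

    # Keep only substrings owned by a single gene.
    unique = {name: [] for name in genes}
    for kmer, names in owners.items():
        if len(names) == 1:
            unique[next(iter(names))].append(kmer)
    return unique


# Toy example with two overlapping "genes" (hypothetical data).
genes = {"g1": "ACGTAC", "g2": "CGTACC"}
probes = candidate_probes(genes, 4)
```

Here `CGTA` and `GTAC` occur in both toy genes and are rejected, so each gene is left with a single candidate.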

5 Reading Genomic Data
- Genome ID
- Downloading genome sequence from GenBank (BioPerl)
- ORF extraction: GeneMark (Borodovsky, GaTech) or ORF Finder
- Extracting extra ORFs: ( )
- ORF parser
- ORF in FASTA format
- Probe selection
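To make the "ORF extraction" and "ORF in FASTA format" steps concrete, here is a naive sketch in plain Python. It is not GeneMark or ORF Finder: it only scans the forward strand for ATG...stop runs in the three reading frames, and the names `find_orfs`, `to_fasta`, and the genome ID `demo` are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return a list of (start, end, orf) for ATG...stop runs, forward strand only."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None  # look for the next ORF in this frame
    return orfs


def to_fasta(orfs, genome_id, width=60):
    """Format ORFs as FASTA records with 1-based coordinates in the header."""
    lines = []
    for n, (start, end, orf) in enumerate(orfs, 1):
        lines.append(f">{genome_id}_orf{n} {start + 1}..{end}")
        lines.extend(orf[i:i + width] for i in range(0, len(orf), width))
    return "".join(line + "\n" for line in lines)


orfs = find_orfs("CCATGAAATAGGG")
fasta = to_fasta(orfs, "demo")
```

A production pipeline would also scan the reverse strand and apply a gene model, which is what dedicated gene finders are for.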

6 Probe Selection
- ORF preprocessing
- Promide: choosing the best melting temperature
- Ocand: find all candidates for the given temperature
- Temp Checker’s Pool of probes
- Max/min temperature
- Physical design

7 Probe Design Constraints
Sequence related:
- Length of probes.
- Deviation of the melting temperature of probe-target hybrids must be low (for physical reasons).
- No self-complementary regions longer than four nucleotides (not descriptive enough).
- The difference between the melting temperatures of target and non-target sequences must be larger than a predefined threshold (if too close, they are too hard to distinguish).
- Ensuring a minimum number of mismatches is enough (homologous sequences).
System related:
- Execution time.
- Usability.
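The sequence-related constraints can be sketched as a simple filter. Two assumptions are made here that are not in the slides: melting temperature is estimated with the Wallace rule, Tm = 2(A+T) + 4(G+C), which is only a rough approximation for short oligonucleotides, and a self-complementary region is detected as any 5-mer whose reverse complement also occurs in the probe. All function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def wallace_tm(probe):
    """Rough melting temperature in degrees C (Wallace rule, an approximation)."""
    at = probe.count("A") + probe.count("T")
    gc = probe.count("G") + probe.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def has_self_complement(probe, max_len=4):
    """True if some region longer than max_len has its reverse complement in the probe."""
    k = max_len + 1
    kmers = {probe[i:i + k] for i in range(len(probe) - k + 1)}
    return any(reverse_complement(kmer) in kmers for kmer in kmers)


def passes_constraints(probe, length, tm_min, tm_max):
    """Check probe length, Tm window, and absence of self-complementary regions."""
    return (len(probe) == length
            and tm_min <= wallace_tm(probe) <= tm_max
            and not has_self_complement(probe))
```

For example, `AAAAATTTTT` is rejected because `AAAAA` and its reverse complement `TTTTT` both occur in it, so the probe could fold back on itself.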

8 Physical Design
- Probe selection
- Deposition sequence design
- Test control
- 2D-probe placement
- 3D-probe embedding
- Mask and array manufacturing

9 Mask and Array manufacturing

10 Array Manufacturing
Very Large-Scale Immobilized Polymer Synthesis:
1. Treat the substrate with chemically protected "linker" molecules, creating a rectangular array. Site size = approx. 10x10 microns.
2. Selectively expose array sites to light. Light deprotects exposed molecules, activating further synthesis.
3. Flush the chip surface with a solution of protected A, C, G, T. Binding occurs at previously deprotected sites.
4. Repeat steps 2 and 3 until the desired probes are synthesized.
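The expose-and-flush loop can be simulated in a few lines. This is only a sketch under stated assumptions: sites are kept in a flat list rather than a 2-D array, one nucleotide is flushed per step following a given deposition sequence, and each probe takes its next base at the earliest possible step (a "leftmost" embedding). The name `synthesize` is hypothetical.

```python
def synthesize(probes, deposition):
    """Simulate light-directed synthesis; return (masks, grown probes).

    masks[t][s] is True when site s is exposed to light before step t.
    """
    grown = ["" for _ in probes]
    masks = []
    for nucleotide in deposition:
        # Expose a site only if the next base it needs is the one flushed now.
        mask = [len(g) < len(p) and p[len(g)] == nucleotide
                for g, p in zip(grown, probes)]
        # Binding occurs only at the exposed (deprotected) sites.
        grown = [g + nucleotide if exposed else g
                 for g, exposed in zip(grown, mask)]
        masks.append(mask)
    return masks, grown


probes = ["AC", "CA", "GT"]
masks, grown = synthesize(probes, "ACGT" * 2)
```

With the periodic deposition sequence `ACGTACGT`, all three toy probes are complete after the fifth step, and the remaining masks expose nothing.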

11 Benchmarks: Herpes B virus
B virus is a member of the subfamily Alphaherpesvirinae, genus Simplexvirus. Herpes B (HB) virus causes a mild, localized, or asymptomatic infection in its natural hosts. In contrast, HB virus infection in foreign hosts (humans or monkey species other than macaques) often results in encephalitis, encephalomyelitis, and death. The Herpes B virus genome is 156,798 bp long and includes 74 genes [Perelygina03]. In this study, we design a chip for Perelygina to carry out her experiments on HB virus.

12 Experimental Study
In our experiments we considered the following parameters and measured the results for different values of each:
- Melting temperature: we chose 60 °C and 65 °C as the best melting temperatures for our DNA probe array.
- Number of candidates: we experimented with K = 1 and K = 2 candidates for each pool of probes.
- Chip size: we ran our experiments with two chip sizes, 50x50 and 60x60.
We give the number of conflicts and the runtime of each algorithm for the Herpes B virus and for simulated data.
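The slides do not define how conflicts are counted. The sketch below assumes a common convention: a border conflict occurs at every synthesis step in which one site is exposed and a horizontally or vertically adjacent site is masked. It uses a leftmost embedding and a periodic deposition sequence, and it does not implement any of the placement algorithms compared in this study. The names `embed` and `border_conflicts` are hypothetical.

```python
def embed(probe, deposition):
    """Leftmost embedding: tuple of booleans, True where the probe takes a base."""
    mask, pos = [], 0
    for nucleotide in deposition:
        exposed = pos < len(probe) and probe[pos] == nucleotide
        mask.append(exposed)
        if exposed:
            pos += 1
    if pos != len(probe):
        raise ValueError(f"deposition sequence too short for probe {probe!r}")
    return tuple(mask)


def border_conflicts(grid, deposition):
    """Count steps at which adjacent sites differ in exposure, over all neighbor pairs."""
    emb = [[embed(p, deposition) for p in row] for row in grid]
    rows, cols = len(emb), len(emb[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            # Right and down neighbors only, so each pair is counted once.
            for nr, nc in ((r, c + 1), (r + 1, c)):
                if nr < rows and nc < cols:
                    total += sum(a != b for a, b in zip(emb[r][c], emb[nr][nc]))
    return total


deposition = "ACGT" * 2
placement_a = [["AC", "CA"], ["GT", "AC"]]
placement_b = [["AC", "CA"], ["AC", "GT"]]
conflicts_a = border_conflicts(placement_a, deposition)
conflicts_b = border_conflicts(placement_b, deposition)
```

The two toy placements hold the same probes, yet the second has fewer conflicts because the identical `AC` probes share a border. Reducing this count is the goal of the placement step.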

13 Herpes B Virus and Simulated Data: # Conflicts and CPU Time (sec)

K=1          Herpes B Virus                Simulated Data
Algorithm    # Conflicts  CPU Time (sec)   # Conflicts  CPU Time (sec)
Initial      107577       -                265992       -
Tsort        98830        0.17             231526       0.08
Tsp          95640        0.22             227960       0.09
LAlign       79254        0.25             189272       0.1
Reptx 2      64830        4.45             154766       1.58
Chessboard   63594        15.58            150812       7.1

K=2          Herpes B Virus                Simulated Data
Algorithm    # Conflicts  CPU Time (sec)   # Conflicts  CPU Time (sec)
Initial      54205        -                265328       -
Tsort        49746        0.3              232954       0.14
Tsp          48541        0.34             227762       0.15
LAlign       42858        0.42             182972       0.16
Reptx 2      32098        7.84             149332       3.16
Chessboard   31498        20.93            146708       10.89

14

15 Conclusion and Future work
Our experiments show:
- The genomic data follow the pattern predicted by simulated data.
- For the Herpes B virus, as for simulated data, increasing the number of candidates per probe (K) decreases the number of border conflicts during probe placement.
- The number of border conflicts is several times smaller than for simulated data.
- There is a trade-off between the number of border conflicts and the CPU time taken by the various physical design algorithms.
We give a consolidated software solution for the entire DNA array flow. We explore all steps in a single automated software suite of tools.
Future work:
- The entire software suite will be made available through web services.
- Users can enter the name or ID of an organism and, optionally, set the required parameters; the suite will then produce the DNA probe microarray chip layout.

