
1 Probe Selection Problems in Gene Sequences

2 DNA Microarrays
- cDNA: PCR from clones
- Oligonucleotide: design specific probes
(C) 2003, SNU Biointelligence Lab, http://bi.snu.ac.kr/

3 Gene Detection Using Microarrays
….CCCATGGACTCAAG….
….CCCTGGCGACAGTT….
….AATCCTACGACGGC….
….AGCTCTAGGCCCAT….
Each probe
- Is selected from the gene sequences
- Has the same length (20-mer or longer)
- Detects one gene

4 Probe Selection
Uniqueness
- Compare potential probe sequences with the full-length sequences of the other genes being monitored
Hybridization characteristics
- Tm among probes
- GC content
- Secondary structure
- Position of core region
- Number of mismatches

5 Affymetrix Probe Selection Criteria
Single base (As, Ts, Cs, Gs)
- Must not exceed 50% of the probe size
Length of a contiguous region of As and Ts, or of Cs and Gs
- Less than 25% of the probe size
(G+C)%
- Between 40% and 60% of the probe sequence
- Can be adjusted based on the G+C content of the genome sequence
Contiguous repeat
- No 15-long repeat anywhere in the entire coding sequence of the whole genome
No self-complementarity within the probe sequence
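The criteria above can be sketched as a simple filter. This is a minimal illustration, not Affymetrix's implementation: the function names, the default thresholds as parameters, and the self-complementarity test (any k-mer whose reverse complement also occurs in the probe, with a hypothetical k of 6) are assumptions. The genome-wide 15-long repeat check is left out because it needs the whole coding sequence.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def max_run(seq: str, alphabet: str) -> int:
    """Length of the longest contiguous run made only of bases in `alphabet`."""
    best = run = 0
    for base in seq:
        run = run + 1 if base in alphabet else 0
        best = max(best, run)
    return best


def reverse_complement(seq: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def passes_criteria(probe: str, gc_min: float = 0.40, gc_max: float = 0.60,
                    k: int = 6) -> bool:
    """Check one probe against the slide's per-probe criteria.

    `k` (length of a self-complementary stretch that counts as a failure)
    is a hypothetical parameter, not a value taken from the slides.
    """
    n = len(probe)
    # Single base: no base may exceed 50% of the probe size.
    if any(probe.count(base) > 0.5 * n for base in "ACGT"):
        return False
    # Contiguous A/T or C/G region must be shorter than 25% of the probe size.
    if max_run(probe, "AT") >= 0.25 * n or max_run(probe, "CG") >= 0.25 * n:
        return False
    # (G+C)% must lie between 40% and 60% (adjustable per genome).
    gc = (probe.count("G") + probe.count("C")) / n
    if not gc_min <= gc <= gc_max:
        return False
    # Self-complementarity: a k-mer of the probe also appears in its
    # reverse complement, so the probe could pair with itself.
    rc = reverse_complement(probe)
    return not any(probe[i:i + k] in rc for i in range(n - k + 1))
```

For example, `passes_criteria("ATCGGACTTCAGGCTAACGT")` accepts a balanced 20-mer, while a probe containing the palindrome `GAATTC` is rejected by the self-complementarity test.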

6 Tm Calculation
Approximate target DNA concentration
- ΔH: enthalpy for helix formation
- ΔS: entropy for helix formation
- R: molar gas constant
- c: total molar concentration of the annealing oligonucleotides when the oligonucleotides are not self-complementary
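A minimal sketch of the calculation, assuming the slide refers to the widely used two-state formula Tm = ΔH / (ΔS + R ln(c/4)) − 273.15 for oligonucleotides that are not self-complementary. The function name, the units, and the value R = 1.987 cal/(K·mol) are assumptions of this example; ΔH and ΔS are taken as inputs rather than computed from a nearest-neighbor table.

```python
import math

R = 1.987  # molar gas constant in cal/(K*mol); assumed unit system


def melting_temperature(delta_h: float, delta_s: float, c: float,
                        self_complementary: bool = False) -> float:
    """Return Tm in degrees Celsius.

    delta_h: enthalpy for helix formation, cal/mol (negative)
    delta_s: entropy for helix formation, cal/(K*mol) (negative)
    c:       total molar concentration of the annealing oligonucleotides
    For non-self-complementary strands c is divided by 4; for
    self-complementary strands it is used as is.
    """
    divisor = 1.0 if self_complementary else 4.0
    return delta_h / (delta_s + R * math.log(c / divisor)) - 273.15


# Illustrative numbers only, not measured values.
tm = melting_temperature(delta_h=-150_000.0, delta_s=-400.0, c=1e-6)
```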

7 General Approach
Generating probe sequences
- Generate candidate probes by constructing arbitrary sequences.
- Generate candidate probes from the target gene sequences.
- Typically, candidates are generated with a sliding window starting from the beginning of the gene.
ACGCGTCGCGAGGCCTAGGCC… Candidate probe sequences
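The sliding-window step can be sketched as follows. The function name and the `step` parameter are assumptions of this example; the sequence is the fragment shown on the slide.

```python
def candidate_probes(gene: str, length: int, step: int = 1) -> list[tuple[int, str]]:
    """Slide a window of `length` bases along `gene` from its first base.

    Returns (start position, candidate probe) pairs.
    """
    return [(start, gene[start:start + length])
            for start in range(0, len(gene) - length + 1, step)]


gene = "ACGCGTCGCGAGGCCTAGGCC"
candidates = candidate_probes(gene, length=20)
```

Each later filter (uniqueness, Tm, secondary structure) can then work on this list of candidates.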

8 General Approach
Remove candidate probe sequences that commonly occur in genes other than the target sequences (using a sequence alignment program such as BLAST).
- Reduces the likelihood of cross-hybridization
Filter the candidate probe sequences by Tm so that target and probe hybridize well.
Filter by the secondary structure of the sequence: remove candidates that can form intramolecular structures.
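A rough sketch of the uniqueness filter. Exact substring matching is used here only as a stand-in for an alignment program such as BLAST, which would also catch near matches; the function name and the toy gene sequences are hypothetical.

```python
def unique_probes(candidates: list[str], target_id: str,
                  genes: dict[str, str]) -> list[str]:
    """Keep candidates that do not occur in any gene other than the target.

    Exact matching only: a simplification of the alignment-based check.
    """
    others = [seq for gene_id, seq in genes.items() if gene_id != target_id]
    return [probe for probe in candidates
            if not any(probe in other for other in others)]


genes = {"g1": "AAACCCGGG", "g2": "CCCGGGTTT"}  # toy sequences
g1_candidates = [genes["g1"][i:i + 4] for i in range(len(genes["g1"]) - 3)]
kept = unique_probes(g1_candidates, "g1", genes)
```

The Tm and secondary-structure filters would follow the same pattern: a predicate applied to each surviving candidate.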

9 GA Representation
A chromosome represents a possible probe set.
Each position of a chromosome is the starting point of a probe sequence in a gene sequence.
Figure: a chromosome holding the start positions of the selected regions (3 ..... 2 15 90); each entry is one probe (parameter: probe length c), and the whole chromosome is the probe set (size n: number of target genes).
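This representation can be sketched as a list of integers, one start position per target gene. The helper names and the toy genes are assumptions of this example.

```python
import random


def random_chromosome(genes: list[str], probe_length: int,
                      rng: random.Random) -> list[int]:
    """One start position per target gene, chosen so the probe fits in the gene."""
    return [rng.randint(0, len(gene) - probe_length) for gene in genes]


def decode(chromosome: list[int], genes: list[str],
           probe_length: int) -> list[str]:
    """Turn start positions back into the probe sequences they select."""
    return [gene[start:start + probe_length]
            for start, gene in zip(chromosome, genes)]


genes = ["ACGTACGTAC", "TTTTGGGGCC"]  # toy target genes
probes = decode([2, 0], genes, probe_length=4)
```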

10 GA Operators
Fitness function
- Linear combination of sequence match and probe characteristics
- Combination rate can be varied with generation
- Sequence match between probe and target gene
  - Perfect match: 1
  - Mismatch: 0
- Considered probe characteristics
  - Tm
  - (G+C)%
  - Self-complementarity (by sequence comparison)
Population
- Parents are selected by roulette wheel
- One-point crossover with probability Pc, mutation with probability Pm
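The operators above can be sketched as follows, assuming a chromosome is a list of start positions. All names, the single `weight` parameter for the combination rate, and the use of uniform random resampling as the mutation are assumptions of this example; the slides do not specify them.

```python
import random


def fitness(match_score: float, characteristic_score: float,
            weight: float) -> float:
    """Linear combination; `weight` may be changed from generation to generation."""
    return weight * match_score + (1.0 - weight) * characteristic_score


def roulette_select(population: list, fitnesses: list[float],
                    rng: random.Random):
    """Pick one parent with probability proportional to its fitness."""
    pick = rng.uniform(0.0, sum(fitnesses))
    running = 0.0
    for individual, score in zip(population, fitnesses):
        running += score
        if pick < running:
            return individual
    return population[-1]  # fallback for rounding at the upper end


def one_point_crossover(a: list[int], b: list[int], pc: float,
                        rng: random.Random) -> tuple[list[int], list[int]]:
    """With probability pc, swap the tails of two parents at one random point."""
    if len(a) < 2 or rng.random() >= pc:
        return a[:], b[:]
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(chromosome: list[int], max_starts: list[int], pm: float,
           rng: random.Random) -> list[int]:
    """With probability pm per position, resample a valid start position.

    max_starts[i] is the largest valid start in gene i (gene length minus
    probe length).
    """
    return [rng.randint(0, limit) if rng.random() < pm else start
            for start, limit in zip(chromosome, max_starts)]
```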

11 Optimizing Probe Set
One probe (spot) is designed to detect one gene.
Assumption
- The optimal probe set will contain as few oligonucleotides as possible
Goal
- M probes detect N genes (M is less than or equal to N)
- Find M probes that detect N genes

12 Example
3 target genes
….CCCATAGGCTCAAG….
….CCTAGGCGCGCTCA….
….AGCTCTAGGCCCAT….

13 GA Representation
Probe sequences are selected from the target gene sequences.
Figure: a chromosome with two rows, the start position of the selected region (3 ..... 2 15 90) and the gene id (1 ..... 8 20 10); each column is one probe (parameter: probe length c), and the whole chromosome is the probe set (size m: number of probes).

14 Sequence Matching
Target, probe
Matching result between target c_i and the probe set
- Can be represented as a list or vector
- This vector is the fingerprint of the target
- Two clones are distinguishable when their fingerprints differ
- Find a probe set that makes all fingerprints different from each other.
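A minimal sketch of fingerprints, assuming a probe matches a target when it occurs in the target as an exact substring (perfect match: 1, mismatch: 0). The function names and the two 5-base probes are hypothetical; the three targets are the fragments from the example slide.

```python
def fingerprint(target: str, probes: list[str]) -> tuple[int, ...]:
    """Vector of 1/0 match results between one target and every probe."""
    return tuple(1 if probe in target else 0 for probe in probes)


def all_distinguishable(targets: list[str], probes: list[str]) -> bool:
    """True when no two targets share the same fingerprint."""
    prints = [fingerprint(target, probes) for target in targets]
    return len(set(prints)) == len(prints)


targets = ["CCCATAGGCTCAAG", "CCTAGGCGCGCTCA", "AGCTCTAGGCCCAT"]
probes = ["CTAGG", "GCTCA"]  # hypothetical probe set with M = 2, N = 3
prints = [fingerprint(target, probes) for target in targets]
```

Here the two probes give the three targets the fingerprints (0, 1), (1, 1) and (1, 0), so M = 2 probes are enough to tell N = 3 genes apart, whereas the single probe `TAGG` matches all three and separates none of them.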

15 Issues
- Consider the number of mismatches
- Consider a BLAST search
- Design of probe sequences
- Can probe characteristics be learned for the fitness conditions?

