December 5 2008 Rocky Mountain Bioinformatics Conference Speaker: Carl Bergenhem Co-Authors: Chih Lee, Chun-Hsi Huang Dept. of Computer Science and Engineering.


1 December 5 2008 Rocky Mountain Bioinformatics Conference Speaker: Carl Bergenhem Co-Authors: Chih Lee, Chun-Hsi Huang Dept. of Computer Science and Engineering University of Connecticut

2 Selection of oligonucleotides for hybridization
• An identified probe must not hybridize with other sequences
• It must hybridize to the target sequence under the same reaction conditions
• The reaction temperature T is an important condition
DNA Probe Selection Algorithm
• Kaderali and Schliep, Bioinformatics (2002)
• Utilizes the suffix tree data structure
• Efficiently finds the appropriate oligonucleotides for hybridization given a target temperature T

3 Suffix tree: a data structure used frequently in bioinformatics
• Allows a substring of length k to be searched for in O(k) time
• Wide range of uses
• O(n) construction time and memory usage for a text of length n
Issues
• Nodes are scattered all over memory
• Searching generates a high cache miss rate
• Each cache miss adds to the overhead of tree traversal
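The O(k) search bound can be illustrated with a minimal sketch. It uses a naive, uncompressed suffix trie built in quadratic time, not the linear-time suffix tree construction the slides refer to. The names `Node`, `build_suffix_trie`, and `contains` are hypothetical and chosen for illustration only. Each node is a separate heap object, which is also how nodes come to be scattered across memory.

```python
class Node:
    """One trie node; children are keyed by the next character."""
    __slots__ = ("children",)

    def __init__(self):
        self.children = {}


def build_suffix_trie(text):
    """Insert every suffix of text. Naive O(n^2) construction, for illustration."""
    root = Node()
    for i in range(len(text)):
        node = root
        for ch in text[i:]:
            node = node.children.setdefault(ch, Node())
    return root


def contains(root, pattern):
    """Follow one edge per pattern character: O(k) steps for a pattern of length k."""
    node = root
    for ch in pattern:
        node = node.children.get(ch)
        if node is None:
            return False
    return True


trie = build_suffix_trie("GATTACA")
print(contains(trie, "TTAC"))  # substring of GATTACA
print(contains(trie, "TAG"))   # not a substring
```

The lookup cost depends only on the pattern length, but every step follows a pointer to another node, so each step is a potential cache miss.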

4 Example Parameters
• Cache hit time: 0.1 μs
• Cache miss time: 1 μs
• k data references

          Miss rate  Hit rate  Miss calculation            Hit calculation               Total
Case I    40%        60%       k × 0.4 × 1 μs = 0.4k μs    k × 0.6 × 0.1 μs = 0.06k μs   0.46k μs
Case II   10%        90%       k × 0.1 × 1 μs = 0.1k μs    k × 0.9 × 0.1 μs = 0.09k μs   0.19k μs

• The higher miss rate in Case I results in a higher data access time than in Case II (0.46/0.19 ≈ 2.42 times).
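The arithmetic in this example can be checked with a short sketch. The function name `total_access_time` and its parameter names are hypothetical. The model is the same simple one used on the slide: every reference is either a hit or a miss, each with a fixed cost.

```python
def total_access_time(miss_rate, k=1, hit_time_us=0.1, miss_time_us=1.0):
    """Total time in microseconds for k references under a two-cost hit/miss model."""
    hit_rate = 1.0 - miss_rate
    return k * (miss_rate * miss_time_us + hit_rate * hit_time_us)


case_1 = total_access_time(0.40)  # per-reference cost with a 40% miss rate
case_2 = total_access_time(0.10)  # per-reference cost with a 10% miss rate
print(round(case_1, 2), round(case_2, 2), round(case_1 / case_2, 2))
```

Because k multiplies both totals, the ratio between the two cases does not depend on the number of references.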

5 Making the Suffix Tree Cache-Aware/Cache-Oblivious
• Customizing suffix tree construction
• Node restructuring
• Tagging, hashing, node coloring, etc.
Improving the locality of the genome data referenced in the DNA probe search
Goals
• Better exploitation of data locality
• Practical run time matching the theoretical prediction

6 Preliminary Results
• Initial tree node relocation reduces miss rates during DFS traversal by ~15% with a 32 B line size
• Additional approaches also reduce miss rates
Current Work
• Retrieve and analyze data references in DNA probe selection
• Customize suffix tree construction and node structure
• Optimize the mapping between nodes and cache lines
• Analyze practical run times
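The idea behind node relocation can be sketched with a toy model. This is not the authors' method and it does not reproduce the ~15% figure. It assumes a hypothetical 16-byte node, a randomly generated tree, and a crude proxy for misses: the number of times consecutive DFS visits land on different 32-byte lines. All names (`dfs_order`, `line_changes`, `random_tree`, `compare_layouts`) are illustrative.

```python
import random

LINE_SIZE = 32  # bytes per cache line, as on the slide
NODE_SIZE = 16  # assumed node size in bytes (hypothetical)


def dfs_order(children, root=0):
    """Return node ids in depth-first (preorder) visiting order."""
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(reversed(children.get(node, [])))
    return order


def line_changes(order, slot_of):
    """Count consecutive visits that fall on different cache lines."""
    lines = [slot_of[n] * NODE_SIZE // LINE_SIZE for n in order]
    return sum(1 for a, b in zip(lines, lines[1:]) if a != b)


def random_tree(n, seed=0):
    """Attach each node to a random earlier node, so node 0 is the root."""
    rng = random.Random(seed)
    children = {}
    for node in range(1, n):
        children.setdefault(rng.randrange(node), []).append(node)
    return children


def compare_layouts(n=1000, seed=0):
    """Return (line changes with scattered nodes, line changes after DFS-order relocation)."""
    children = random_tree(n, seed)
    order = dfs_order(children)
    slots = list(range(n))
    random.Random(seed + 1).shuffle(slots)
    scattered = {node: slots[node] for node in range(n)}   # arbitrary placement
    relocated = {node: i for i, node in enumerate(order)}  # placed in visiting order
    return line_changes(order, scattered), line_changes(order, relocated)


scattered, relocated = compare_layouts()
print(scattered, relocated)
```

With two nodes per line, the relocated layout changes lines about once every two visits, while the scattered layout changes lines on almost every visit. A real cache also retains recently used lines, so this proxy overstates the gap.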

