1
December 5, 2008, Rocky Mountain Bioinformatics Conference
Speaker: Carl Bergenhem
Co-Authors: Chih Lee, Chun-Hsi Huang
Dept. of Computer Science and Engineering, University of Connecticut
2
Selection of oligonucleotides for hybridization
  Identified probe must not hybridize with other sequences
  Must hybridize to target sequence under same reaction conditions
  Reaction condition of temperature T is important
DNA Probe Selection Algorithm
  Kaderali and Schliep, Bioinformatics (2002)
  Utilizes the Suffix Tree Data Structure
  Efficiently finds the appropriate oligonucleotides for hybridization given a target temperature T
3
Data Structure used frequently in Bioinformatics
  Allows for search of substrings in O(k) time
  Wide range of use
  O(n) creation time and memory usage
Issues
  Nodes scattered all over memory
  Searching generates high cache miss rate
  Each cache miss adds to overhead of tree traversal
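To make the O(k) search claim concrete, here is a minimal sketch in Python. It is not the suffix tree from the talk: it swaps in a naive, uncompressed suffix trie that takes O(n^2) time and space to build, whereas a true suffix tree compresses edges and can be built in O(n). The function names `build_suffix_trie` and `contains` are hypothetical. What carries over is the lookup, which follows one child link per pattern character, so a pattern of length k costs k steps.

```python
def build_suffix_trie(text):
    """Naive suffix trie: insert every suffix of `text` character by character.

    Assumption for illustration only: this costs O(n^2) time and space,
    unlike the O(n) suffix tree construction mentioned on the slide.
    """
    root = {}
    for i in range(len(text)):
        node = root
        for ch in text[i:]:
            # Follow the child for `ch`, creating it if it does not exist yet.
            node = node.setdefault(ch, {})
    return root


def contains(trie, pattern):
    """Return True if `pattern` is a substring of the indexed text.

    One child lookup per pattern character, so O(k) steps for length k.
    """
    node = trie
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


trie = build_suffix_trie("GATTACA")
print(contains(trie, "TTAC"))  # substring of GATTACA
print(contains(trie, "TAG"))   # not a substring
```

Each trie node here is a separately allocated dictionary, so every step of the search may jump to an unrelated memory location. That is the "nodes scattered all over memory" issue listed above.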
4
Example
Parameters
  Cache hit time: 0.1 μs
  Cache miss time: 1 μs
  k data references

Case     Miss rate  Hit rate  Miss Calculations         Hit Calculations             Total
Case I   40%        60%       k * 0.4 * 1 μs = 0.4k μs  k * 0.6 * 0.1 μs = 0.06k μs  0.46k μs
Case II  10%        90%       k * 0.1 * 1 μs = 0.1k μs  k * 0.9 * 0.1 μs = 0.09k μs  0.19k μs

The higher miss rate in Case I results in a data access time 0.46/0.19 ≈ 2.42 times that of Case II.
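The arithmetic in this example can be checked with a short Python sketch. The function name `access_time_us` and its parameter names are hypothetical; the default hit and miss times are the slide's example parameters.

```python
def access_time_us(k, miss_rate, hit_time_us=0.1, miss_time_us=1.0):
    """Total data access time in microseconds for k references.

    Each reference is a miss with probability `miss_rate` (costing
    `miss_time_us`) and a hit otherwise (costing `hit_time_us`).
    """
    miss_part = k * miss_rate * miss_time_us
    hit_part = k * (1.0 - miss_rate) * hit_time_us
    return miss_part + hit_part


k = 1  # per-reference cost; the total scales linearly with k
case_1 = access_time_us(k, miss_rate=0.40)  # about 0.46 μs per reference
case_2 = access_time_us(k, miss_rate=0.10)  # about 0.19 μs per reference
print(round(case_1 / case_2, 2))
```

Because both totals are proportional to k, the ratio between the two cases does not depend on the number of references.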
5
Making Suffix Tree Cache Aware/Oblivious
  Customizing suffix-tree construction
  Node restructuring
    Tagging, Hashing, Node Coloring, etc.
  Improving locality distribution of the genome data referenced in DNA Probe Search
Goals
  Data locality better exploited
  Practical run time matching theoretical prediction
6
Preliminary Results
  Initial tree node relocation reduces miss rates during DFS traversal by ~15% with a 32 B line size
  Additional approaches also reduce miss rates
Current Work
  Retrieve and analyze data references in DNA Probe Selection
  Customize Suffix Tree construction and node structure
  Optimize mapping between nodes and cache lines
  Analyze practical run times
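The idea behind node relocation can be illustrated with a toy simulation in Python. This is a sketch under stated assumptions, not the authors' method or measurement setup: the tree is random rather than a suffix tree, each node is assumed to occupy 16 bytes, the cache is a tiny fully associative LRU cache, and each node is touched once in DFS preorder. Only the 32-byte line size comes from the slide. All function names are hypothetical. The sketch compares a scattered layout with one where nodes are stored in DFS visiting order.

```python
import random
from collections import OrderedDict

LINE_SIZE = 32  # bytes per cache line, as on the slide
NODE_SIZE = 16  # assumed bytes per node (hypothetical)


def build_random_tree(n, rng):
    """Children lists for a random tree on nodes 0..n-1 rooted at 0."""
    children = [[] for _ in range(n)]
    for v in range(1, n):
        children[rng.randrange(v)].append(v)  # parent has a smaller id
    return children


def dfs_preorder(children, root=0):
    """Iterative DFS returning nodes in the order they are first visited."""
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(reversed(children[v]))
    return order


def count_misses(order, slot_of, cache_lines=4):
    """Count misses of a small LRU cache when nodes are touched in `order`.

    `slot_of[v]` is the index of node v's slot in a contiguous node array.
    """
    cache = OrderedDict()
    misses = 0
    for v in order:
        line = (slot_of[v] * NODE_SIZE) // LINE_SIZE
        if line in cache:
            cache.move_to_end(line)        # mark as most recently used
        else:
            misses += 1
            cache[line] = True
            if len(cache) > cache_lines:
                cache.popitem(last=False)  # evict least recently used line
    return misses


def compare_layouts(n=1000, seed=0):
    """Return (misses with scattered layout, misses with DFS-order layout)."""
    rng = random.Random(seed)
    children = build_random_tree(n, rng)
    order = dfs_preorder(children)
    scattered = list(range(n))
    rng.shuffle(scattered)                 # nodes placed in arbitrary slots
    relocated = [0] * n
    for i, v in enumerate(order):
        relocated[v] = i                   # slot = position in DFS order
    return count_misses(order, scattered), count_misses(order, relocated)


scattered_misses, relocated_misses = compare_layouts()
print(scattered_misses, relocated_misses)
```

In the DFS-order layout, consecutive visits land on the same or the next cache line, so the traversal incurs only one miss per line. The numbers this toy model prints are not comparable to the ~15% reduction reported above, which was measured on real suffix trees.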