Published by Alaina Flynn. Modified over 9 years ago.
1
Using the Starbridge Systems FPGA-based Hypercomputer for Cancer Research
Experiences of a computational chemist/biologist
Jack Collins, Ph.D.
Advanced Biomedical Computing Center
SAIC/National Cancer Institute
Frederick, MD
2
Collins E195/MAPLD2004
Motivation
New technologies in proteomics, genomics, and imaging are providing more data and challenging the conventional wisdom of biologists.
Computational biologists must develop more realistic and precise models of biological systems at the cellular and network levels to help make sense of this new data.
  –Biomedical research needs to measure performance in “Heartbeats to Solution”
3
Biological Applications
Systems Biology
  –Correlated networks of cells and biological processes
  –Reaction pathways/cascades
Properties of cell/bacterial/viral populations (Biodefense)
  –Bacterial virulence factors
    Generating diversity by changing immune signature
  –Environmental adaptation of cells/pathogens
  –Drug resistance (cancer, HIV, bacteria)
Nano-systems/nano-technology
  –Statistical fluctuations must be included in models
  –Single cells: essentially nano-systems
    Machinery within cells/nucleus
    Non-equilibrium dynamics
4
Cellular Processes (Examples)
DNA Replication
  –Interactions with proteins and small molecules
Transcription Factors
  –Gene Regulation
RNA
  –Editing, interference, protein synthesis
Regulatory feedback
  –Kinase Pathways/Cascades
5
Modeling Reactions/Pathways
Discrete Processes
Populations of cells/molecules must be studied, but the mean behavior depends on the states of the individual entities.
  –Low copy number of cells/molecules
  –Variation in copy number
  –Relatively slow reaction rates
  –Varied conditions/environments
  –“Activation potential” to reaction
6
Simulation Methods
Stochastic Simulations
  –Deterministic modeling
    Gives the mean behavior of large numbers, but biological components are often present in small numbers
  –Fluctuations are important
Boolean Networks
  –Lack of experimental rate constants
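The contrast between stochastic and deterministic modeling on this slide can be illustrated with a small sketch. The slides do not say which stochastic method was used; the code below assumes Gillespie's direct method applied to a hypothetical reversible reaction A <-> B, and the rate constants `k_f` and `k_r` are illustrative values, not numbers from the talk. The deterministic function gives the mean-field solution of the same reaction for comparison.

```python
import math
import random


def gillespie_isomerization(n_a, n_b, k_f, k_r, t_end, seed=0):
    """Simulate A <-> B with Gillespie's direct method.

    Returns a list of (time, n_a, n_b) tuples, one per reaction event,
    starting with the initial state at time 0.0.
    """
    rng = random.Random(seed)
    t = 0.0
    trajectory = [(t, n_a, n_b)]
    while True:
        a_fwd = k_f * n_a          # propensity of A -> B
        a_rev = k_r * n_b          # propensity of B -> A
        a_total = a_fwd + a_rev
        if a_total == 0.0:
            break                  # nothing left that can react
        # Waiting time to the next event is exponential with rate a_total.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Pick which reaction fires, weighted by its propensity.
        if rng.random() * a_total < a_fwd:
            n_a, n_b = n_a - 1, n_b + 1
        else:
            n_a, n_b = n_a + 1, n_b - 1
        trajectory.append((t, n_a, n_b))
    return trajectory


def deterministic_mean_a(n_total, n_a0, k_f, k_r, t):
    """Mean-field solution of dA/dt = -k_f*A + k_r*(N - A)."""
    a_eq = n_total * k_r / (k_f + k_r)
    return a_eq + (n_a0 - a_eq) * math.exp(-(k_f + k_r) * t)


if __name__ == "__main__":
    # With only 20 molecules, individual runs scatter around the mean curve.
    for seed in range(3):
        final = gillespie_isomerization(20, 0, 1.0, 0.5, 5.0, seed=seed)[-1]
        print(seed, final)
    print("mean-field A at t=5:", deterministic_mean_a(20, 20, 1.0, 0.5, 5.0))
```

With small copy numbers, individual stochastic runs differ noticeably from one another and from the deterministic curve, which is the point the slide makes about fluctuations.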
7
Why use FPGAs?
Current Computational Limitations
  –Can only model relatively modest systems
Computational Efficiency
  –Inherent parallelism in molecular reactions
Scalability
  –Use multiple FPGAs to simultaneously model hundreds of reactions
Looking to Future
  –Computational power rapidly growing
  –Price/Performance
8
Smith-Waterman Update (Proof of Concept)
Total # Operations / Second
  –1 Smith-Waterman step includes:
    25 Logic Operations (adds, compares; mostly 26-27 bit ops, some single-bit ops)
    13 Data Reorder Operations (Move, Combine…)
    11 Data Store (Assignment)
  –Logic Operations Only: 25 Ops * 25 MHz * 448 Smith-Waterman kernels = 280 billion operations / second
  –Logic & Data Operations: 49 Ops * 25 MHz * 448 Smith-Waterman kernels = approximately 550 billion operations / second
Total Aggregate Communications Bandwidth of Systolic Array
  –12 * 88 * 25 MHz = 26.4 Gb/s plus 7 * 22 * 50 MHz = 7.7 Gb/s, for a total of 34.1 Gb/s
Resources Consumed / Resources Available
  –PE2 – PE7: 60% to 70% consumed
  –PE1 20% consumed; XPE 5%; XPR 0.1%
DMA transfer between host PC and FPGAs
  –Initial results 210 Mb/sec (FPGA->X86)
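The throughput and bandwidth figures on this slide can be re-derived with a few lines of arithmetic. The sketch below uses only the numbers quoted on the slide; the variable names are illustrative, and the slide does not say what the factors 12, 88, 7, and 22 stand for, so they are kept as unnamed multipliers. The steps-per-second figure assumes each kernel completes one Smith-Waterman step per clock, which the slide's formulas imply but do not state.

```python
# Figures quoted on the slide.
CLOCK_HZ = 25_000_000        # 25 MHz kernel clock
KERNELS = 448                # Smith-Waterman kernels running in parallel
LOGIC_OPS = 25               # logic operations per step
REORDER_OPS = 13             # data reorder operations per step
STORE_OPS = 11               # data store (assignment) operations per step

logic_ops_per_s = LOGIC_OPS * CLOCK_HZ * KERNELS
all_ops_per_s = (LOGIC_OPS + REORDER_OPS + STORE_OPS) * CLOCK_HZ * KERNELS

# Assumption: one step per kernel per clock (implied, not stated).
steps_per_s = CLOCK_HZ * KERNELS

# Systolic-array bandwidth: two groups of links, multipliers as on the slide.
bandwidth_25mhz = 12 * 88 * 25_000_000
bandwidth_50mhz = 7 * 22 * 50_000_000
total_bandwidth = bandwidth_25mhz + bandwidth_50mhz

print(f"logic ops/s:       {logic_ops_per_s / 1e9:.1f} billion")
print(f"logic + data ops/s: {all_ops_per_s / 1e9:.1f} billion")
print(f"steps/s (assumed): {steps_per_s / 1e9:.1f} billion")
print(f"bandwidth:         {total_bandwidth / 1e9:.1f} Gb/s")
```

The products give 280 billion logic operations per second and 548.8 billion total operations per second, which the slide rounds to 550 billion. The second bandwidth term works out to 7.7 Gb/s, which is what makes the 34.1 Gb/s total consistent.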
9
Smith-Waterman (cont.)
See Poster by Jim Yardley, SBS
Opportunities to further optimize the algorithm include:
  –Increasing the number of SW_Iterations that can be done in parallel (up to 100 billion Smith-Waterman steps/second)
  –Increasing the clock speed of the hardware (up to 1 trillion Smith-Waterman steps/second)
  –Friendlier User Interface
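For readers unfamiliar with what a "Smith-Waterman step" computes, the following is a minimal software sketch of the standard Smith-Waterman local-alignment recurrence with a linear gap penalty. It is a reference illustration only, not the Viva/FPGA implementation described in the talk, and the scoring values (`match`, `mismatch`, `gap`) are assumed defaults rather than parameters from the slides. One inner-loop iteration corresponds to one cell update of the scoring matrix.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local-alignment score between sequences a and b.

    Uses the standard recurrence with a linear gap penalty:
    H[i][j] = max(0, H[i-1][j-1] + s(a_i, b_j), H[i-1][j] + gap, H[i][j-1] + gap)
    Only two rows are kept in memory at a time.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # One cell update: compare the three neighbours and clamp at zero.
            curr[j] = max(0,
                          prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                          prev[j] + gap,       # gap in b
                          curr[j - 1] + gap)   # gap in a
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("GGACGTGG", "TTACGTTT"))
```

Each cell depends only on its left, upper, and upper-left neighbours, so all cells on the same anti-diagonal can be updated at the same time. That independence is what makes the recurrence a natural fit for a systolic array of parallel kernels.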
10
Viva Environment
VIVA GRAPHICAL LANGUAGE
  –Capture natively parallel code
  –Accommodate data of any type, size, or precision
  –Tune algorithms for speed of execution or conservation of hardware resources
VIVA EDITOR
  –Call Viva algorithms from legacy code such as C, C++, or Fortran
  –Interactively debug code
  –Import/Export EDIF files
VIVA COMPILER/SYNTHESIZER
  –Program multi-million gate designs
  –Compile hardware designs quickly for efficient development
VIVA LIBRARIES
  –Reuse flexible Viva objects which accept any data type or size
  –Target any hardware platform with a ‘System Description’
  –Prototype Viva on any x86-based Windows machine
11
Viva as a Modeling Language?
Programming FPGAs has generally been the domain of engineers.
Viva
  –“Pseudo-graphical language” – Map Model to Viva
  –Inherent parallelism of Model can map to FPGAs
  –Recursion of model
  –Document Code
  –Use the underlying elements of Viva™ to create an environment that the bio-informatician/computational biologist can use to program the FPGA hardware
  –Build Library Elements/Modules specific to Model
12
Libraries for Biology/Biochemistry
Known Reaction Processes
Conditional Elements to relate the reactions to each other
Outputs to visualize the reactions
Built-in Infrastructure for handling I/O
Minimize Learning Curve for Modeling Biological Processes
13
Ease of Programming?
Library Creation
Examples of simple reactions programmed in Viva by a relatively novice user over a few days.
  –A B
  –A B
  –A B
  –A+B C
18
Programming Style
Program Design
Multiple ways to package logic
  –No Unique Solution
Which is “best” depends on “user”
  –Simplicity vs. Functionality
  –Ease of Debugging
  –Ease of Documenting
21
Output Interfaces
Efficient Computation of the Model is Useless if you can’t see the results
Interface into COM objects
Integrate Data Analysis and Visualization
25
Lessons Learned
Timing is Everything!
  –Building large systems of reactions is complex, so a general system must consider both efficiency (minimizing clock ticks) and stability of computation (consistent results, achieved by keeping latency and synchronization in check).
Many ways to package the logic
  –Not all are equal!
  –Simplicity vs. Functionality
  –Document your Code!
  –Bugs are often subtle
Potential is Enormous
26
Obvious Extensions
Timing & Synchronization
Finite State Machine
  –State of system may depend on several variables or conditions
    Not all conditions need to be completely known
    Some may be “black boxes” that produce a signal
Go-Done-Busy-Wait
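The finite state machine and Go-Done-Busy-Wait items above can be sketched as a small clocked model. The slide only names the four signals, so the semantics here are an assumption made for illustration: `go` starts an operation, `busy` is raised while the stage cannot accept new work, `done` is raised when a result is ready, and `wait` from downstream holds the result in place. The class name `HandshakeStage` and its fixed `latency` are hypothetical and do not describe Viva's actual objects.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"    # ready to accept go
    BUSY = "busy"    # computing; counts down the latency
    DONE = "done"    # result ready; held while downstream asserts wait


class HandshakeStage:
    """Hypothetical pipeline stage with go/done/busy/wait signalling."""

    def __init__(self, latency):
        if latency < 1:
            raise ValueError("latency must be at least one clock tick")
        self.latency = latency
        self.state = State.IDLE
        self.remaining = 0

    @property
    def busy(self):
        return self.state is not State.IDLE

    @property
    def done(self):
        return self.state is State.DONE

    def tick(self, go=False, wait=False):
        """Advance one clock tick and return (busy, done) after the tick."""
        if self.state is State.IDLE:
            if go:
                self.remaining = self.latency
                self.state = State.BUSY
        elif self.state is State.BUSY:
            # A go pulse while busy is ignored in this model.
            self.remaining -= 1
            if self.remaining == 0:
                self.state = State.DONE
        else:  # State.DONE
            if not wait:
                self.state = State.IDLE
        return self.busy, self.done


if __name__ == "__main__":
    stage = HandshakeStage(latency=2)
    for go, wait in [(True, False), (False, False), (False, False),
                     (False, True), (False, False)]:
        print(go, wait, stage.tick(go=go, wait=wait))
```

Treating each reaction module as a black box that only exposes these signals lets modules with different latencies be chained without each one needing to know the internal timing of its neighbours.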
27
Future Directions
User-Friendly Interfaces to Applications
Expand Application Areas
  –Imaging
  –Pattern Recognition/Clustering/Data Mining
Expand Libraries for Reactions/Pathways
  –“Tinker-Toy” Modeling
Work with Vendor to bring FPGA solutions to wider community of computational biologists
  –Faster Application Development Time
  –Debugging and Documentation
28
Acknowledgements
Starbridge Systems
  –Kent Gilson
  –Jim Yardley
  –Fred Geiger
NCI for Support
  –Stan Burt, Director ABCC