Computational Biology 2008 Advisor: Dr. Alon Korngreen Eitan Hasid Assaf Ben-Zaken
Concepts
Biological Background
Computational Background
Goals
Methods
GPU Abilities
GPU Programming
Conclusions
The Future
Neurons
Core components of the nervous system.
Highly specialized for the processing and transmission of cellular signals.
Communicate with one another via electrical and chemical synapses, in a process called synaptic transmission.
Each consists of a soma, an axon and dendrites.
Their behavior is examined by measuring voltage-gated conductances.
Compartmental Models
Models used to investigate the physiology of complex neurons.
Consist of distributions of voltage-gated conductances.
Have a large number of loosely constrained parameters.
Tuning these parameters manually is almost impossible.
Genetic Algorithm
Used to automate the parameter search in the model.
Multiple recordings from a number of locations improved the GA's ability to constrain the model.
A combined cost function was found to be the most effective.
Problem: the GA was very slow, about 25 seconds per generation.
Solution: more computational power.
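As a rough illustration of what a combined cost function over several recording locations might look like, the sketch below sums the root-mean-square error between simulated and recorded voltage traces, one term per location. The `Trace` type, the function names and the choice of an unweighted sum of RMS errors are assumptions made for this example; they are not taken from the project's code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One voltage trace: samples recorded (or simulated) at a single location.
using Trace = std::vector<double>;

// Root-mean-square error between two traces of equal length.
double rmsError(const Trace& simulated, const Trace& recorded) {
    assert(simulated.size() == recorded.size());
    if (simulated.empty()) return 0.0;
    double sumSq = 0.0;
    for (std::size_t i = 0; i < simulated.size(); ++i) {
        const double d = simulated[i] - recorded[i];
        sumSq += d * d;
    }
    return std::sqrt(sumSq / static_cast<double>(simulated.size()));
}

// Hypothetical combined cost: the sum of the per-location RMS errors.
// A GA would try to minimize this value over the model parameters.
double combinedCost(const std::vector<Trace>& simulated,
                    const std::vector<Trace>& recorded) {
    assert(simulated.size() == recorded.size());
    double cost = 0.0;
    for (std::size_t loc = 0; loc < simulated.size(); ++loc)
        cost += rmsError(simulated[loc], recorded[loc]);
    return cost;
}
```

Each candidate parameter set in a generation needs its own cost evaluation, and these evaluations do not depend on one another.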
Goals
Create software that performs calculations on the graphics card.
Modify the existing algorithm to use this software wherever parallel computation is possible.
Improve the running time, and thereby allow even more parameters to be added to the algorithm.
Methods
Graphics accelerators: a tool for parallel computation.
Sh, a high-level metaprogramming language: communication with the graphics card.
C++: wrapping of the Sh program. The GA is also written in C++.
Graphics Accelerators
Performance
Thanks in part to the video gaming and entertainment industries, today's graphics cards are extremely fast and programmable.
FLOPS: floating-point operations per second. It is a measure of a computer's performance, especially in scientific computing, which makes heavy use of floating-point calculations. It is similar to instructions per second.
CPU: can perform 12 gigaflops.
GPU: an Nvidia 6800 can perform hundreds of gigaflops.
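To make the FLOPS figures concrete, the sketch below estimates a lower bound on the time a dense n x n matrix multiplication needs at a given peak rate. The 2n^3 operation count is the usual count for the naive algorithm; the function names are illustrative only. Real run times are higher, since peak rates are rarely reached and memory transfers are ignored here.

```cpp
#include <cassert>
#include <cstdint>

// Floating-point operations in a naive n x n matrix multiplication:
// n*n output entries, each needing n multiplications and n additions.
double matMulFlops(std::uint64_t n) {
    const double dn = static_cast<double>(n);
    return 2.0 * dn * dn * dn;
}

// Ideal (lower-bound) time in seconds at a peak rate given in FLOPS.
double idealSeconds(double flops, double peakFlopsPerSecond) {
    assert(peakFlopsPerSecond > 0.0);
    return flops / peakFlopsPerSecond;
}
```

For example, a 1000 x 1000 multiplication is 2 x 10^9 operations, so at 12 gigaflops it takes at least about 0.17 seconds.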
GPU Architecture
To understand where the parallel power comes from, the architecture needs to be studied as well.
The GPU holds a large number of processing units.
Each unit can function as a small CPU.
You can send multiple inputs and receive multiple outputs for the same operation.
On today's advanced hardware you can even send a different operation to each unit.
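The execution model described here, one operation applied independently to many inputs, can be mimicked on the CPU as a plain loop over a stream of elements. The sketch below is a sequential reference, not GPU code; `Float3`, `applyKernel` and `addFortyTwo` are names invented for this example. On a GPU, each element could be handled by a separate processing unit.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Float3 = std::array<float, 3>;

// Apply the same kernel to every element of the input stream and write
// the results to a preallocated output stream of the same length.
// No iteration depends on another, which is what allows parallel execution.
template <typename Kernel>
void applyKernel(const std::vector<Float3>& in, std::vector<Float3>& out,
                 Kernel kernel) {
    assert(in.size() == out.size());
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = kernel(in[i]);
}

// Example kernel: add 42 to each of the three components.
Float3 addFortyTwo(const Float3& a) {
    return Float3{a[0] + 42.0f, a[1] + 42.0f, a[2] + 42.0f};
}
```

A CPU reference like this is also a convenient way to check the output of a GPU version of the same kernel.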
GPGPU
Stands for "General-Purpose computation on GPUs".
GPGPU languages: why do we need them?
Make programming GPUs easier!
– No need to know OpenGL, DirectX, or ATI/NV extensions.
– Simplify common operations.
– Focus on the algorithm, not on the implementation.
Main Considerations
Cross-platform: Windows/Linux.
Cross-vendor: Nvidia/ATI.
Operations supported.
Memory management.
Program manipulation.
Ease of learning and available documentation.
Sh: High-Level Metaprogramming Language
Operating systems: Windows, Linux.
Graphics cards: NVIDIA GeForce FX 5200 and up; ATI Radeon 9600 and up.
Stream processing: allows highly customized operations to run on the GPU. Newer hardware allows even more.
Help and support: we had to compromise on something…
* Sh is embedded in C++, so its syntax is very easy to learn.
Analyzing the Algorithm
gprof: a profiling tool available on Linux.
We used it to find the most time-consuming functions.
The bottleneck functions were matrix multiplication and determinant calculation, which took about 57% of the running time.
Next, we created stream-processing functions to perform these calculations on the GPU.
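For reference, the two bottleneck operations can be written on the CPU roughly as follows. This is a plain sketch of the textbook algorithms (naive multiplication, and a determinant by Gaussian elimination with partial pivoting), not the project's own code; the `Matrix` type and the function names are assumptions. Both run in O(n^3) time, and every entry of the product can be computed independently of the others, which is what makes multiplication a natural candidate for stream processing.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Naive O(n^3) product of an (n x k) matrix and a (k x m) matrix.
Matrix multiply(const Matrix& a, const Matrix& b) {
    assert(!a.empty() && !b.empty() && a[0].size() == b.size());
    const std::size_t n = a.size(), k = b.size(), m = b[0].size();
    Matrix c(n, std::vector<double>(m, 0.0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t p = 0; p < k; ++p)
            for (std::size_t j = 0; j < m; ++j)
                c[i][j] += a[i][p] * b[p][j];
    return c;
}

// Determinant of a square matrix by Gaussian elimination with partial
// pivoting. The matrix is taken by value because elimination overwrites it.
double determinant(Matrix m) {
    const std::size_t n = m.size();
    for (const auto& row : m) assert(row.size() == n);
    double det = 1.0;
    for (std::size_t col = 0; col < n; ++col) {
        // Pick the row with the largest absolute value in this column.
        std::size_t pivot = col;
        for (std::size_t r = col + 1; r < n; ++r)
            if (std::fabs(m[r][col]) > std::fabs(m[pivot][col])) pivot = r;
        if (m[pivot][col] == 0.0) return 0.0;  // singular matrix
        if (pivot != col) {
            std::swap(m[pivot], m[col]);
            det = -det;  // a row swap flips the sign of the determinant
        }
        det *= m[col][col];
        // Eliminate the entries below the pivot.
        for (std::size_t r = col + 1; r < n; ++r) {
            const double factor = m[r][col] / m[col][col];
            for (std::size_t c = col; c < n; ++c)
                m[r][c] -= factor * m[col][c];
        }
    }
    return det;
}
```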
Sh Sample Code
// Environment and namespace declaration
#include <sh/sh.hpp>
using namespace SH;

// Sh environment initialization
shInit();

// Sh program definition
// This program adds a 3-float vector of 42.0 to an input 3-float vector
ShProgram prg = SH_BEGIN_PROGRAM("gpu:stream") {
  ShInputAttrib3f a;
  ShOutputAttrib3f b;
  b = a + ShAttrib3f(42.0, 42.0, 42.0);
} SH_END;
Sh Sample Code
// Setting up the input variable
float data[] = { 1.0, 0.5, -0.5 };
ShHostMemoryPtr mem_in = new ShHostMemory(sizeof(float) * 3, data, SH_FLOAT);
ShChannel<ShAttrib3f> in(mem_in, 1);

// Setting up the output variable
float outdata[3];
ShHostMemoryPtr mem_out = new ShHostMemory(sizeof(float) * 3, outdata, SH_FLOAT);
ShChannel<ShAttrib3f> out(mem_out, 1);

// Executing the program
out = prg << in;
Conclusions
The parallel power is unquestionable.
Together with a good graphics card, Sh could be a solution for heavy computation.
Sh lets a developer write software that runs on the GPU in real time and that could not run on the CPU at interactive rates.
We created a program that performs its calculations on the GPU.
We have not yet managed to make the program run faster than it does on the CPU, because of an issue with tuples.
References
Naomi Keren, Noam Peled, and Alon Korngreen. Constraining Compartmental Models Using Multiple Voltage Recordings and Genetic Algorithms.
Eric R. Kandel. Principles of Neural Science.
GPGPU Community Website.
Sh Website.
Michael McCool and Stefanus Du Toit. Metaprogramming GPUs with Sh.
Thanks for listening