OAK RIDGE NATIONAL LABORATORY, U.S. DEPARTMENT OF ENERGY
1 A Brief Summer Recap: Flocking, CUDA, GPU, Ants, and More
Jesse St.Charles

2 Some Terms:
• Kernel – An algorithm that executes on the GPU (the blueprint for all simultaneous threads)
• Host – The CPU (where programs normally run)
• Device – The graphics card (where the GPU is located)
• CUDA – Compute Unified Device Architecture (an API made by NVIDIA that allows programming of the GPU)

3 NVIDIA CUDA
• Single Program Multiple Data (SPMD) architecture
• Uses grid/block thread spawning
• Programmer uses thread ID and block ID to access unique data per thread
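The thread ID and block ID scheme can be illustrated without a GPU. The following is a minimal CPU-side sketch in Python, not CUDA code: it assumes a one-dimensional grid and mimics the index arithmetic a kernel commonly performs with CUDA's built-in `blockIdx.x`, `blockDim.x`, and `threadIdx.x`. The function names and the sizes used here are hypothetical and chosen only for illustration.

```python
def global_thread_index(block_idx: int, block_dim: int, thread_idx: int) -> int:
    """Unique index of a thread in a 1-D grid: blockIdx.x * blockDim.x + threadIdx.x."""
    return block_idx * block_dim + thread_idx


def launch(grid_dim: int, block_dim: int, kernel, data: list) -> None:
    """Emulate a 1-D kernel launch serially: run `kernel` once per (block, thread) pair."""
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, data)


def double_element(block_idx: int, block_dim: int, thread_idx: int, data: list) -> None:
    """Example kernel body: each emulated thread touches at most one element."""
    i = global_thread_index(block_idx, block_dim, thread_idx)
    if i < len(data):  # guard: the grid may hold more threads than there are elements
        data[i] *= 2


values = list(range(10))
# 3 blocks of 4 threads = 12 emulated threads for 10 elements
launch(grid_dim=3, block_dim=4, kernel=double_element, data=values)
print(values)
```

Every emulated thread runs the same function (the single program) but derives a different index from its IDs (the multiple data), which is the essence of the SPMD model. On a real GPU the two loops in `launch` disappear because the threads run concurrently.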

4 CUDA: Flocking and Document Flocking
[Diagram: Start -> Main() -> kernel call -> Neighborhood Calculation and Document Comparison kernel (N² threads) -> Main() -> kernel call -> Update Position and Velocity kernel (N threads) -> Main(); one loop per generation]
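The two-kernel structure of each generation can be sketched on the CPU. The code below is a simplified illustration in Python, not the author's implementation: it assumes 2-D positions, a fixed neighborhood radius, and a single cohesion rule, and it omits the document comparison entirely. All names and parameter values (`neighborhood_pass`, `update_pass`, `radius`, `step`) are hypothetical.

```python
import math
import random


def neighborhood_pass(pos, radius):
    """Phase 1: one distance test per ordered pair (i, j), i.e. N * N units of work."""
    n = len(pos)
    neighbors = [[] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(pos[i], pos[j]) <= radius:
                neighbors[i].append(j)
    return neighbors


def update_pass(pos, vel, neighbors, step=0.1):
    """Phase 2: one position/velocity update per agent, i.e. N units of work."""
    new_pos, new_vel = [], []
    for i, (p, v) in enumerate(zip(pos, vel)):
        if neighbors[i]:
            # Cohesion only (an assumption): steer toward the neighbors' centroid.
            cx = sum(pos[j][0] for j in neighbors[i]) / len(neighbors[i])
            cy = sum(pos[j][1] for j in neighbors[i]) / len(neighbors[i])
            v = (v[0] + step * (cx - p[0]), v[1] + step * (cy - p[1]))
        new_vel.append(v)
        new_pos.append((p[0] + v[0], p[1] + v[1]))
    return new_pos, new_vel


def simulate(pos, vel, generations, radius=1.0):
    """Host loop: one iteration per generation, two passes per iteration."""
    for _ in range(generations):
        neighbors = neighborhood_pass(pos, radius)
        pos, vel = update_pass(pos, vel, neighbors)
    return pos, vel


random.seed(0)
positions = [(random.random(), random.random()) for _ in range(8)]
velocities = [(0.0, 0.0)] * 8
positions, velocities = simulate(positions, velocities, generations=5)
```

The pairwise pass does one independent test per (i, j) pair and the update pass does one independent update per agent, which is why the two phases map naturally onto N² and N threads respectively.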

5 Initial Flocking Results
[Chart: GPU vs. CPU; annotation: 100x]

6 Document Flocking Results
[Chart: CPU vs. GPU; annotation: 5x]

7 Document Flocking Results
• Document comparisons – most expensive
[Figure: 2000 documents at generations 2, 52, and 200]

9 Other Summer Work
Shortest Path
• Done for traffic simulation and emergency response
• Produced an ant-colony shortest-path implementation on the GPU, but it had convergence issues when the graph became too large
• Implemented Dijkstra's shortest-path algorithm on the GPU; running it on the GPU showed no clear benefit
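For reference, Dijkstra's algorithm in its usual sequential form can be sketched as follows. This is a standard CPU version in Python using a binary heap, not the GPU implementation described above; the `roads` graph and its weights are made up for illustration.

```python
import heapq


def dijkstra(graph, source):
    """Shortest distance from `source` to every reachable vertex.

    `graph` maps each vertex to a list of (neighbor, non-negative weight) pairs.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry: a shorter path to u was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Hypothetical road network: vertex -> [(neighbor, travel cost), ...]
roads = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 3)],
    "C": [("B", 2), ("D", 7)],
    "D": [],
}
distances = dijkstra(roads, "A")
print(distances)
```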

10 Summer End Products
• Ant-colony shortest-path GPU program
• Dijkstra's shortest-path GPU program
• Document Flocking GPU program with display
• Companion CPU implementations
• Paper produced and submitted for publication
• Poster produced for presentation

11 Future Work in Document Flocking
• Use dimensionality reduction
• Develop a document refinement GPU implementation that removes stop words, stems, and calculates TF-ICF
• Develop a whole document analysis system for a GPU workstation (one GPU for refinement, one for document clustering)
• Find F-measure for flocking cluster accuracy
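One way the F-measure item could be approached is sketched below. The slides do not say which F-measure variant is intended, so this Python example assumes the pairwise form: every pair of documents is counted as a true positive, false positive, or false negative depending on whether the pair shares a reference class and a cluster. The function name and the sample labels are hypothetical.

```python
from itertools import combinations


def pairwise_f_measure(true_labels, cluster_ids, beta=1.0):
    """Pairwise F-measure of a clustering against reference class labels."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_class = true_labels[i] == true_labels[j]
        same_cluster = cluster_ids[i] == cluster_ids[j]
        if same_cluster and same_class:
            tp += 1  # pair correctly grouped together
        elif same_cluster:
            fp += 1  # pair grouped together but from different classes
        elif same_class:
            fn += 1  # pair from the same class but split across clusters
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Five documents: reference classes vs. clusters read off a flocking run (made-up data)
truth = ["a", "a", "a", "b", "b"]
clusters = [0, 0, 1, 1, 1]
score = pairwise_f_measure(truth, clusters)
print(score)
```

With flocking, cluster IDs would first have to be derived from the agents' final positions (for example by grouping agents that ended up close together); that step is outside this sketch.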

12 My Immediate Future
Fall 2007
• Last undergraduate semester at UTC
• Continuing research collaboration with Xiaohui
• Applying for graduate fellowships
• Selecting and applying to graduate CS programs to begin PhD track in Fall 2008
