Oak Ridge National Laboratory, U.S. Department of Energy
A Brief Summer Recap: Flocking, CUDA, GPU, Ants, and More
Jesse St.Charles


1 A Brief Summer Recap: Flocking, CUDA, GPU, Ants, and More
Jesse St.Charles

2 Some Terms:
Kernel – An algorithm that executes on the GPU (the blueprint for all simultaneous threads)
Host – The CPU (where programs normally run)
Device – The graphics card (where the GPU resides)
CUDA – Compute Unified Device Architecture (an API made by NVIDIA that allows programming of the GPU)

3 NVIDIA CUDA
• Single Program Multiple Data (SPMD) architecture
• Uses grid/block thread spawning
• Programmer uses thread ID and block ID to access unique data per thread
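
To make the thread ID / block ID idea concrete, here is a minimal CPU-side sketch in plain Python. It is not CUDA code and not taken from the slides: it only emulates the usual index arithmetic (block ID times block size plus thread ID) with sequential loops. The names `launch`, `scale_kernel`, and `scale_on_grid` are hypothetical, chosen for illustration.

```python
# CPU-side emulation of CUDA-style grid/block indexing (illustrative only).

def global_thread_id(block_id, block_dim, thread_id):
    """Mirror of the common CUDA idiom blockIdx.x * blockDim.x + threadIdx.x."""
    return block_id * block_dim + thread_id


def launch(kernel, grid_dim, block_dim, *args):
    """Run `kernel` once per (block, thread) pair, sequentially.

    A real GPU would run these invocations in parallel; this loop only
    reproduces the indexing scheme.
    """
    for block_id in range(grid_dim):
        for thread_id in range(block_dim):
            kernel(block_id, block_dim, thread_id, *args)


def scale_kernel(block_id, block_dim, thread_id, data, out, factor):
    """Same program for every thread; each one touches a different element."""
    i = global_thread_id(block_id, block_dim, thread_id)
    if i < len(data):  # guard: the last block may contain spare threads
        out[i] = data[i] * factor


def scale_on_grid(data, factor, block_dim=4):
    """Hypothetical host-side wrapper: size the grid, then launch the kernel."""
    out = [0] * len(data)
    grid_dim = (len(data) + block_dim - 1) // block_dim  # ceiling division
    launch(scale_kernel, grid_dim, block_dim, data, out, factor)
    return out
```

With six elements and a block size of four, two blocks are launched and the last two emulated threads fall outside the array, which is why the bounds guard is there.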

4 CUDA: Flocking and Document Flocking
[Flow diagram: Start → Main() → kernel call → Neighborhood Calculation and Document Comparison kernel (N² threads) → Main() → kernel call → Update Position and Velocity kernel (N threads) → Main(); one loop per generation]
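
The two-kernel structure on this slide can be sketched on the CPU as follows. This is a plain-Python illustration under stated assumptions, not the author's GPU code: the slides do not give the flocking rules, so the update below uses an assumed alignment-only rule with a hypothetical `weight` parameter, 2D positions, and a fixed neighborhood `radius`. The point is the shape of the computation: an N×N pairwise stage followed by an N-element update stage, repeated once per generation.

```python
import math


def neighborhood_pass(positions, radius):
    """Stage 1: one (i, j) comparison per pair, N*N in total.

    On a GPU each pair could map to its own thread; here it is a double loop.
    Returns neighbors[i] = indices of boids within `radius` of boid i.
    """
    n = len(positions)
    neighbors = [[] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dx = positions[i][0] - positions[j][0]
            dy = positions[i][1] - positions[j][1]
            if i != j and math.hypot(dx, dy) <= radius:
                neighbors[i].append(j)
    return neighbors


def update_pass(positions, velocities, neighbors, weight=0.5, dt=1.0):
    """Stage 2: one update per boid, N in total.

    Assumed rule (alignment only): steer each velocity toward the mean
    velocity of its neighbors, then advance the position by dt.
    """
    new_pos, new_vel = [], []
    for i, (p, v) in enumerate(zip(positions, velocities)):
        if neighbors[i]:
            k = len(neighbors[i])
            mean = tuple(sum(velocities[j][d] for j in neighbors[i]) / k
                         for d in range(2))
            v = tuple(v[d] + weight * (mean[d] - v[d]) for d in range(2))
        new_vel.append(v)
        new_pos.append(tuple(p[d] + dt * v[d] for d in range(2)))
    return new_pos, new_vel


def simulate(positions, velocities, generations, radius=2.0):
    """Host loop: one iteration per generation, two stages per iteration."""
    for _ in range(generations):
        neighbors = neighborhood_pass(positions, radius)
        positions, velocities = update_pass(positions, velocities, neighbors)
    return positions, velocities
```

For document flocking, the pairwise stage would also carry a document similarity comparison; that part is omitted here because the slides do not specify the similarity measure.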

5 Initial Flocking Results
[Chart comparing GPU and CPU; annotated "100x"]

6 Document Flocking Results
[Chart comparing CPU and GPU; annotated "5x"]

7 Document Flocking Results
• Document comparisons are the most expensive step
[Figure: 2000 documents at generations 2, 52, and 200]

9 Other Summer Work
Shortest Path
• Done for traffic simulation and emergency response
• Produced an Ant-Colony shortest path implementation on the GPU, but it had convergence issues when the graph became too large
• Implemented Dijkstra's shortest path algorithm on the GPU; no clear benefit on the GPU
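
For reference, a minimal CPU version of Dijkstra's algorithm looks like the sketch below. It is a standard textbook formulation using Python's `heapq`, not the GPU implementation mentioned on the slide; the adjacency-list format and the function name `dijkstra` are assumptions made for illustration.

```python
import heapq


def dijkstra(graph, source):
    """Shortest distances from `source` to every reachable node.

    graph: dict mapping node -> list of (neighbor, weight) pairs,
    with non-negative weights.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

The priority queue settles one node at a time, which is inherently sequential. That may be one reason a GPU port shows limited gains, though the slides do not say so.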

10 Summer End Products
• Ant-Colony shortest path GPU program
• Dijkstra's shortest path GPU program
• Document Flocking GPU program with display
• Companion CPU implementations
• Paper produced and submitted for publication
• Poster produced for presentation

11 Future Work in Document Flocking
• Use dimensionality reduction
• Develop a document refinement GPU implementation that removes stop words, stems, and calculates TF-ICF
• Develop a whole document analysis system for a GPU workstation (one GPU for refinement, one for document clustering)
• Find F-measure for flocking cluster accuracy
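
As an illustration of the last item, the sketch below computes a pairwise F-measure for a clustering against known labels. The slides do not say which F-measure variant is intended, so this is an assumption: it uses the common pair-counting definition, where a pair of documents counts as a true positive if both the clustering and the reference labels place them together. The function name `pairwise_f_measure` is hypothetical.

```python
from itertools import combinations


def pairwise_f_measure(predicted, truth):
    """Pair-counting F-measure: F = 2PR / (P + R).

    predicted, truth: equal-length sequences of cluster / class labels,
    one entry per document.
    """
    tp = fp = fn = 0
    for i, j in combinations(range(len(truth)), 2):
        same_pred = predicted[i] == predicted[j]
        same_true = truth[i] == truth[j]
        if same_pred and same_true:
            tp += 1  # correctly grouped together
        elif same_pred:
            fp += 1  # grouped together, but should be apart
        elif same_true:
            fn += 1  # kept apart, but should be together
    if tp == 0:
        return 0.0  # avoids division by zero when no pair is correct
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

For flocking output, the `predicted` labels would first have to be derived from final positions (for example by grouping nearby documents), a step this sketch leaves out.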

12 My Immediate Future
Fall 2007
• Last undergraduate semester at UTC
• Continuing research collaboration with Xiaohui
• Applying for graduate fellowships
• Selecting and applying to graduate CS programs to begin PhD track in Fall 2008

