
CUDA Simulation Benjy Kessler

We are given a brittle substance with a crack in it. The goal is to study how the crack propagates through the substance as a function of time. This is accomplished by simulating the substance as a grid of points with forces acting on them. The work is based on a paper by Heizler, Kessler and Levine [1].

The simulation models a mode-III crack. The bonds between molecules are modeled as springs with Hookean forces acting on them. A spring breaks when the force acting on it exceeds a certain threshold. When a spring breaks, the particles snap apart, exerting additional force on the neighboring particles, and the crack advances. There is no randomness built into the model, but floating-point round-off provides an inherent source of effective randomness.
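As a rough illustration (not the project's actual code), a per-spring force computation with a break threshold might look like the CUDA kernel below. The Bond struct, the out-of-plane displacement array u (natural for a mode-III geometry), the stiffness k and the threshold are hypothetical names introduced only for this sketch.

// Hypothetical sketch: each thread handles one bond (spring) of the lattice.
// u[i] is the out-of-plane displacement of particle i; force[i] accumulates
// the net force on particle i.
struct Bond {
    int a, b;     // indices of the two particles joined by this spring
    int broken;   // set to 1 once the spring has snapped
};

__global__ void bondForces(const float* u, float* force, Bond* bonds,
                           int numBonds, float k, float threshold)
{
    int s = blockDim.x * blockIdx.x + threadIdx.x;
    if (s >= numBonds || bonds[s].broken)
        return;

    float f = k * (u[bonds[s].b] - u[bonds[s].a]);   // Hookean restoring force
    if (fabsf(f) > threshold) {
        bonds[s].broken = 1;   // force exceeded the threshold: the spring breaks
        return;
    }
    // atomicAdd avoids races, since several bonds share each particle.
    atomicAdd(&force[bonds[s].a],  f);
    atomicAdd(&force[bonds[s].b], -f);
}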

A simulation was written to solve the propagation problem. It is a molecular dynamics simulation in which each particle obeys Newton's laws of motion. In each iteration the particles update their positions and velocities, and springs that have cracked are removed.
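As a rough sketch of what one such update step could look like on the GPU, the kernel below advances each particle with a simple explicit Euler integrator. The kernel name, the displacement and velocity arrays, and the mass and dt parameters are assumptions made for illustration, not the project's actual scheme.

// Hypothetical sketch: one thread advances one particle by one time step dt.
__global__ void integrate(float* u, float* v, const float* force,
                          int numParticles, float mass, float dt)
{
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i >= numParticles)
        return;

    v[i] += (force[i] / mass) * dt;   // Newton's second law: a = F / m
    u[i] += v[i] * dt;                // advance the displacement
}

In each iteration the host would launch the force kernel followed by this kernel, clearing the force array in between; once a bond's broken flag is set it simply stops contributing to the forces.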

The current implementation uses only the CPU, not the GPU. The goal of the project is to rewrite it using CUDA, which should reduce the running time of the simulation. In addition, it would be interesting to run the simulation on a cluster using Condor.

CUDA is an extension of C/C++, together with a runtime library, that provides an interface to NVIDIA graphics cards. Using CUDA you can easily run parallel programs on the graphics card. The CUDA SDK and documentation can be downloaded free of charge from NVIDIA's developer website. CUDA supports Windows (32- and 64-bit), UNIX, and Mac.
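The listing below, adapted from the vector-addition example in the CUDA samples [2], shows the basic structure of a CUDA program: a __global__ kernel that runs on the device, and host code that allocates memory, copies the inputs to the device, launches the kernel, and copies the result back.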

#include <stdlib.h>
#include <cuda_runtime.h>
#include <cutil_inline.h>   // provides cutilSafeCall (older CUDA SDK)

float* h_A;
float* h_B;
float* h_C;
float* d_A;
float* d_B;
float* d_C;

// Device code
__global__ void VecAdd(const float* A, const float* B, float* C, int N)
{
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < N)
        C[i] = A[i] + B[i];
}

// Fill an array with random values in [0, 1]
void RandomInit(float* data, int n)
{
    for (int i = 0; i < n; ++i)
        data[i] = rand() / (float)RAND_MAX;
}

// Host code
int main(int argc, char** argv)
{
    int N = 50000;                    // number of vector elements (illustrative)
    size_t size = N * sizeof(float);

    // Allocate input vectors h_A and h_B in host memory
    h_A = (float*)malloc(size);
    h_B = (float*)malloc(size);
    h_C = (float*)malloc(size);

    // Initialize input vectors
    RandomInit(h_A, N);
    RandomInit(h_B, N);

    // Allocate vectors in device memory
    cutilSafeCall( cudaMalloc((void**)&d_A, size) );
    cutilSafeCall( cudaMalloc((void**)&d_B, size) );
    cutilSafeCall( cudaMalloc((void**)&d_C, size) );

    // Copy vectors from host memory to device memory
    cutilSafeCall( cudaMemcpy(d_A, h_A, size, cudaMemcpyHostToDevice) );
    cutilSafeCall( cudaMemcpy(d_B, h_B, size, cudaMemcpyHostToDevice) );

    // Invoke kernel
    int threadsPerBlock = 256;
    int blocksPerGrid = (N + threadsPerBlock - 1) / threadsPerBlock;
    VecAdd<<<blocksPerGrid, threadsPerBlock>>>(d_A, d_B, d_C, N);

    // Copy result from device memory to host memory
    // h_C contains the result in host memory
    cutilSafeCall( cudaMemcpy(h_C, d_C, size, cudaMemcpyDeviceToHost) );

    // Free device and host memory
    cudaFree(d_A); cudaFree(d_B); cudaFree(d_C);
    free(h_A); free(h_B); free(h_C);
    return 0;
}
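Assuming the listing is saved as vecadd.cu, it can be compiled with NVIDIA's nvcc compiler, for example nvcc -o vecadd vecadd.cu; the cutilSafeCall error-checking wrapper comes from the older CUDA SDK sample code, so the SDK's common headers must be on the include path (or the wrapper calls can simply be dropped). The <<<blocksPerGrid, threadsPerBlock>>> syntax in the kernel call is the launch configuration: it tells the CUDA runtime how many blocks, and how many threads per block, to run VecAdd with.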

[1] S. Heizler, D. A. Kessler, and H. Levine, "Mode-I Fracture in a Nonlinear Lattice with Viscoelastic Forces," Physical Review E, vol. 66, 2002.
[2] Vector addition example from the CUDA samples.