Parallel algorithms for calculation of binding energies and adatom diffusion on GaN (0001) surface
Alexander Minkin, Andrey Knizhnik
National Research Centre “Kurchatov Institute”
The 7th International Conference GRID'2016
Why GaN?
GaN has great potential in high-power electronics capable of operating at elevated temperatures and high frequencies.
Epitaxy: deposition and growth of monocrystalline structures/layers.
Epitaxy types:
  Homoepitaxy: substrate and material are of the same kind (Si-Si)
  Heteroepitaxy: substrate and material are of different kinds (Ga-N, Al-N)
Crystal growth of GaN with MBE
Advantages:
  High-purity growth
  Hydrogen-free environment
  Possibility of plasma- or laser-assisted growth
Disadvantages:
  Ultra-high vacuum is required
  Low growth rate
Nitrogen (N2) gas cannot be used directly for GaN growth; ammonia (NH3) is used instead.
GaN MBE
Problems:
  Native substrates are difficult to obtain in high quality and large quantities.
  Even a slight lattice mismatch induces misfit dislocations at the interface, which can develop into cracks in the crystal that degrade device performance.
Heteroepitaxy:
  Figure labels: Al2O3; GaN; 500 nm; AlGaN buffer layer (the percentage of Al decreases to 10%); initial layer (AlN or AlGaN); barrier layer; 2DEG.
  Al2O3: better quality in many cases, available up to inches in diameter, inexpensive.
Atomistic modeling
DFT calculation:
  PW91 functional
  Plane-wave basis, Ecutoff < 400 eV
  GaN (0001) surface, 4 layers
  The opposite side is H-terminated
Atoms are deposited on the surface and become adsorbed (adatoms). They diffuse across the surface and can bind to it. Conversely, unbinding and desorption also occur.
Figure: PES for an Al adatom on the Al-terminated (0001) AlN surface.
Classical molecular dynamics (MD) approach
Description of the MD method
Basic MD tasks
Molecular dynamics: ~10^9 atoms for several ns
  Calculation of forces
  Neighbor list generation
  Integration
  Computation of indicators
For systems with short-range interactions, all of these tasks consist of a large number of independent steps, so they are suitable for fine-grained parallelization on GPUs and other accelerators.
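The four basic tasks can be sketched as a serial toy loop. This is only an illustration under stated assumptions: the Lennard-Jones interaction in reduced units, the brute-force O(N^2) neighbor search, the velocity-Verlet integrator, and every function name below are choices made for this sketch, not details given on the slide. The total energy plays the role of the "indicator".

```python
import math

def build_neighbor_list(pos, r_list):
    """Brute-force O(N^2) list of pairs (i, j), i < j, closer than r_list."""
    return [(i, j)
            for i in range(len(pos))
            for j in range(i + 1, len(pos))
            if math.dist(pos[i], pos[j]) < r_list]

def compute_forces(pos, pairs, eps=1.0, sigma=1.0):
    """Lennard-Jones forces and potential energy over the listed pairs."""
    forces = [[0.0, 0.0, 0.0] for _ in pos]
    e_pot = 0.0
    for i, j in pairs:
        d = [pos[i][k] - pos[j][k] for k in range(3)]
        r2 = sum(c * c for c in d)
        s6 = (sigma * sigma / r2) ** 3
        e_pot += 4.0 * eps * (s6 * s6 - s6)
        f_over_r = 24.0 * eps * (2.0 * s6 * s6 - s6) / r2  # -(dV/dr) / r
        for k in range(3):
            forces[i][k] += f_over_r * d[k]
            forces[j][k] -= f_over_r * d[k]
    return forces, e_pot

def md_step(pos, vel, forces, dt, r_list=3.0, mass=1.0):
    """One velocity-Verlet step; pos and vel are updated in place."""
    for i in range(len(pos)):
        for k in range(3):
            vel[i][k] += 0.5 * dt * forces[i][k] / mass
            pos[i][k] += dt * vel[i][k]
    # The list is rebuilt every step here for simplicity only.
    pairs = build_neighbor_list(pos, r_list)
    forces, e_pot = compute_forces(pos, pairs)
    for i in range(len(pos)):
        for k in range(3):
            vel[i][k] += 0.5 * dt * forces[i][k] / mass
    return forces, e_pot

def kinetic_energy(vel, mass=1.0):
    """Indicator: total kinetic energy."""
    return 0.5 * mass * sum(c * c for v in vel for c in v)

def simulate(n_steps=2000, dt=0.001):
    """Two atoms released at rest; returns total energy before and after."""
    pos = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]]
    vel = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    forces, e_pot = compute_forces(pos, build_neighbor_list(pos, 3.0))
    e_start = e_pot + kinetic_energy(vel)
    for _ in range(n_steps):
        forces, e_pot = md_step(pos, vel, forces, dt)
    return e_start, e_pot + kinetic_energy(vel)
```

Each inner loop body (one pair, one atom) is independent of the others, which is the property the slide points to as the basis for fine-grained parallelization.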
Pair potentials. Lennard-Jones potential
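The energy can be calculated as the sum of $r_{ij}$-dependent contributions from pairs of atoms:
$$E = \sum_{i<j} V(r_{ij}), \qquad V_{\mathrm{LJ}}(r) = 4\varepsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right]$$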
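A minimal sketch of the pair sum for the Lennard-Jones potential in its standard 12-6 form, together with the analytic radial force. The reduced units (eps = sigma = 1) and the function names are assumptions of this example.

```python
import math

def lj_energy(r, eps=1.0, sigma=1.0):
    """Standard 12-6 Lennard-Jones pair energy V(r)."""
    s6 = (sigma / r) ** 6
    return 4.0 * eps * (s6 * s6 - s6)

def lj_force(r, eps=1.0, sigma=1.0):
    """Analytic radial force -dV/dr (positive means repulsive)."""
    s6 = (sigma / r) ** 6
    return 24.0 * eps * (2.0 * s6 * s6 - s6) / r

def lj_force_numeric(r, h=1e-6):
    """Central-difference estimate of -dV/dr, for comparison only."""
    return -(lj_energy(r + h) - lj_energy(r - h)) / (2.0 * h)

def total_pair_energy(positions, eps=1.0, sigma=1.0):
    """Sum of V(r_ij) over all pairs i < j."""
    energy = 0.0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            r = math.dist(positions[i], positions[j])
            energy += lj_energy(r, eps, sigma)
    return energy
```

The numeric variant needs two energy evaluations per derivative, which gives one intuition for why an analytic force routine tends to be cheaper.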
Many-body potentials. Embedded atom model
Many-body potentials can reproduce:
  Vacancy formation energy.
  Mechanical properties of crystals and nanostructures. Example: c11/c44=1.5 (for Cu).
  A variety of chemical effects affected by the strength of the covalent bonding interaction.
General form of many-body potentials
Embedded atom potential
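As a rough illustration of the embedded atom idea, the sketch below uses the textbook EAM form, E = sum_i F(rho_i) + sum_{i<j} phi(r_ij) with rho_i = sum_{j != i} f(r_ij). The specific functions (exponential density, square-root embedding, exponential pair repulsion) and all parameter values are hypothetical choices for this example, not a fitted potential and not taken from the slides.

```python
import math

def eam_energy(positions, A=1.0, B=0.5, beta=2.0, gamma=4.0, r0=1.0):
    """Toy EAM energy with hypothetical functional forms and parameters.

    f(r)   = exp(-beta  * (r - r0))      electron density contribution
    F(rho) = -A * sqrt(rho)              embedding energy
    phi(r) = B * exp(-gamma * (r - r0))  pair repulsion
    """
    n = len(positions)
    rho = [0.0] * n
    e_pair = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            r = math.dist(positions[i], positions[j])
            dens = math.exp(-beta * (r - r0))
            rho[i] += dens
            rho[j] += dens
            e_pair += B * math.exp(-gamma * (r - r0))
    # The embedding term is nonlinear in rho, so the energy is not pair-additive.
    e_embed = sum(-A * math.sqrt(x) for x in rho)
    return e_embed + e_pair

dimer = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
trimer = dimer + [(0.5, math.sqrt(3.0) / 2.0, 0.0)]
energy_per_bond_dimer = eam_energy(dimer) / 1
energy_per_bond_trimer = eam_energy(trimer) / 3
```

With these toy parameters the energy per bond differs between the dimer and the trimer, which a pure pair potential could not produce at a fixed bond length.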
Pair potentials. Lennard-Jones potential
The two-particle interaction depends only on the distance r_ij between the two particles.
Tersoff potential
  Bond order
  Cutoff function
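The two ingredients named on this slide can be sketched in their commonly published Tersoff form: a smooth cutoff f_c(r) that switches from 1 to 0 over the interval [R - D, R + D], and a bond order b = (1 + (beta * zeta)^n)^(-1/(2n)) that weakens a bond as its environment term zeta grows. The slide's own equations are not preserved, so treating them as this form is an assumption, and the numeric values of R, D, beta and n below are placeholders, not the GaN parameters.

```python
import math

def tersoff_cutoff(r, R=3.0, D=0.2):
    """Smooth cutoff: 1 below R - D, 0 above R + D, sine switch in between."""
    if r < R - D:
        return 1.0
    if r > R + D:
        return 0.0
    return 0.5 - 0.5 * math.sin(0.5 * math.pi * (r - R) / D)

def tersoff_bond_order(zeta, beta=1.0, n=0.8):
    """Bond order b(zeta); zeta >= 0 sums the contributions of third atoms k."""
    return (1.0 + (beta * zeta) ** n) ** (-1.0 / (2.0 * n))
```

Because zeta for a bond i-j depends on all other neighbors k of atom i, the force on any atom collects contributions from many (i, j, k) triples, which is what makes the parallel force update nontrivial.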
Binding energies calculation
Adsorption sites (figure labels): bridgeGa, bridgeN, fcc (FCC), hcpGa (HCP1), hcpN (HCP2). Atom types shown: N, Ga.
Binding energies calculation (DFT)
$\mathrm{BE}\ (\mathrm{eV/A}) = E_{\mathrm{slab}+A} - E_{\mathrm{slab}} - E_{A}$
Table: Ga adatom, z-coordinate DFT relaxation. Columns: Energy, BE, ∆z. Rows: Ga/hcpN, Ga/bridgeGa, Ga/fcc, Ga/bridgeN, Ga/hcpGa.
Table: N adatom, z-coordinate DFT relaxation. Columns: Energy, BE, ∆z. Rows: N/hcpN, N/bridgeGa, N/fcc, N/bridgeN, N/hcpGa.
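The binding energy definition translates directly into a small helper. The function names and all energies below are made-up placeholders for illustration, not results from the talk.

```python
def binding_energy(e_slab_with_adatom, e_slab, e_adatom=0.0):
    """BE = E(slab + A) - E(slab) - E(A); negative means exothermic adsorption."""
    return e_slab_with_adatom - e_slab - e_adatom

def preferred_site(be_by_site):
    """Site with the most negative binding energy."""
    return min(be_by_site, key=be_by_site.get)

# Hypothetical total energies in eV, chosen only to exercise the formula.
E_SLAB = -100.0
E_ATOM = -0.5
hypothetical_totals = {"fcc": -105.5, "hcpN": -104.0, "bridgeGa": -103.25}
be_by_site = {site: binding_energy(e, E_SLAB, E_ATOM)
              for site, e in hypothetical_totals.items()}
best = preferred_site(be_by_site)
```

Leaving `e_adatom` at its default of zero gives the two-term form E(slab + A) - E(slab).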
DFT Results
Binding energies (BE) and diffusion barriers for Ga and N on a Ga-terminated GaN(0001) surface

Adatom   BE, eV    Diffusion barrier, eV
Ga       -5.488    0.786
N        -7.311    0.983

NEB
N prefers the fcc site with a BE of eV (negative binding energies signify exothermic adsorption), whereas Ga prefers to bind at the hcpN site (BE= eV).
Binding energies calculation
$\mathrm{BE}\ (\mathrm{eV/A}) = E_{\mathrm{slab}+A} - E_{\mathrm{slab}}$
Column headers on the slide for each table: Energy, BE, ∆z.

Ga adatom, z-coordinate variation without relaxation (GaN.tersoff)
Ga/hcpN       -1.38   2.43
Ga/bridgeGa   -1.77   2.24
Ga/fcc        -2.59   2.60
Ga/bridgeN    -1.18   2.93
Ga/hcpGa      -1.28   2.39

Ga adatom, z-coordinate variation without relaxation (GaN.sw)
Ga/hcpN       2.55
Ga/bridgeGa   1.90
Ga/fcc        2.48
Ga/bridgeN    2.82
Ga/hcpGa      2.29

Ga adatom, z-coordinate variation with relaxation (GaN.tersoff)
Ga/hcpN       2.85
Ga/bridgeGa   0.81
Ga/fcc        1.75
Ga/bridgeN    2.17
Ga/hcpGa      1.96

Ga adatom, z-coordinate variation with relaxation (GaN.sw)
Ga/hcpN
Ga/bridgeGa   1.03
Ga/fcc        1.23
Ga/bridgeN    2.30
Ga/hcpGa      3.09   2.27
Binding energies calculation
$\mathrm{BE}\ (\mathrm{eV/A}) = E_{\mathrm{slab}+A} - E_{\mathrm{slab}}$
Column headers on the slide for each table: Energy, BE, z.

N adatom, z-coordinate variation without relaxation (GaN.tersoff)
N/hcpN       2.40
N/bridgeGa   1.57
N/fcc        1.70
N/bridgeN    2.50
N/hcpGa      1.93

N adatom, z-coordinate variation without relaxation (GaN.sw)
N/hcpN       2.04
N/bridgeGa   1.49
N/fcc        1.97
N/bridgeN    2.47
N/hcpGa      1.95

N adatom, z-coordinate variation with relaxation (GaN.tersoff)
N/hcpN       1.96
N/bridgeGa   0.79
N/fcc        1.22
N/bridgeN    1.74
N/hcpGa      2.79

N adatom, z-coordinate variation with relaxation (GaN.sw)
N/hcpN       2.50
N/bridgeGa   0.94
N/fcc        0.71
N/bridgeN    0.91
N/hcpGa      1.96
Flynn types
Flynn, M. (1972). "Some Computer Organizations and Their Effectiveness". IEEE Trans. Comput. C-21: 948.

                 Single Instruction     Multiple Instruction
Single Data      SISD (serial)          MISD
Multiple Data    SIMD (CUDA, OpenCL)    MIMD (MPI, OpenMP)
GPU: pros and cons
Pros:
  Large number of compute cores.
  Fewer transistors for control and cache.
  RISC architecture.
  Executing a large number of threads hides global memory latency.
  High global memory bandwidth.
  Low cost of shared memory access.
  High performance/$.
  High performance/Watt.
Cons:
  High cost of random global memory access.
  Worse performance for code with a large number of branches (‘if-then-else’ instructions).
OpenCL Hardware
OpenCL is supported on different types of platforms and devices:
  NVIDIA GPUs (GeForce, Tesla, Quadro)
  AMD GPUs (Radeon, FirePro)
  x86 CPUs (AMD, Intel)
  Cell BE
  ARM CPUs
  Altera FPGAs
OpenCL will be supported on some devices:
  ARM GPUs: Mali, PowerVR, Adreno
  Parallella (Adapteva Epiphany-IV architecture)
OpenCL Technology
OpenCL technology has some disadvantages:
  Hardware-dependent optimizations are needed to get high-performance code.
  OpenCL code is portable and cross-platform, but its performance is not portable.
  OpenCL compilers are rather immature.
  Adoption of OpenCL is rather slow.
  Poor debugging capabilities.
  Poor profiling and optimization tools (Nvidia, Intel).
Tersoff potential Algorithm A (with atomic operations)
N atoms, N GPU threads!
Figure labels: reused values; not reused values.
Atomic (read-modify-write) operations
Figure labels: k1, j2, j1, k2, i1 = j2, j1 = k2, i2.
OpenCL realisation of atomics:

```c
int waiting = 1;
while (waiting) {
    /* atom_xchg returns the previous value: 0 means the lock for atom k was free. */
    if (!atom_xchg(&semaphor[k], 1)) {
        forces[k] += prefactor * z_sh_ij_sum;  /* exclusive read-modify-write */
        atom_xchg(&semaphor[k], 0);            /* release the lock */
        waiting = 0;
    }
}
```

The name ‘atom’ comes from the Greek ἄτομος (atomos, "indivisible").
To get correct results when several threads write to the same memory location concurrently, the updates must be serialized, which is done here by means of atomic operations.
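The same pattern can be tried on a CPU with host threads. In this sketch, which is an analogy and not the authors' OpenCL code, one `threading.Lock` per atom stands in for the `semaphor[k]` spin lock, and several threads add contributions to shared force slots. The function name and the test data are assumptions of the example.

```python
import threading

def accumulate_forces(contributions, n_atoms, n_threads=4):
    """Add (atom index, value) contributions to shared slots from several threads."""
    forces = [0.0] * n_atoms
    locks = [threading.Lock() for _ in range(n_atoms)]  # one lock per atom

    def worker(chunk):
        for k, value in chunk:
            with locks[k]:           # acquire: like atom_xchg(&semaphor[k], 1)
                forces[k] += value   # read-modify-write, exclusive per atom
            # leaving the with-block releases the lock

    # Interleave the work so that different threads hit the same atoms.
    chunks = [contributions[t::n_threads] for t in range(n_threads)]
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return forces

# 3000 unit contributions spread over 3 atoms.
demo = accumulate_forces([(i % 3, 1.0) for i in range(3000)], n_atoms=3)
```

Locking per atom rather than globally keeps updates to different atoms independent, so threads only wait when they target the same slot.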
Tersoff potential Algorithm WA (without atomic operations)
N atoms, N GPU threads!
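The transcript does not preserve the details of Algorithm WA. One common way to avoid atomics with one thread per atom is a "gather" formulation, in which each thread writes only to its own atom's slot and recomputes shared terms instead of pushing them to neighbors. The sketch below shows that idea for a Lennard-Jones pair force on a line; whether the authors' algorithm works this way is an assumption, and all names are hypothetical.

```python
def pair_force_1d(xi, xj, eps=1.0, sigma=1.0):
    """Lennard-Jones force on atom i due to atom j, atoms on a line."""
    d = xi - xj
    s6 = (sigma * sigma / (d * d)) ** 3
    return 24.0 * eps * (2.0 * s6 * s6 - s6) / d

def forces_scatter(x):
    """Each pair is evaluated once and written to both atoms."""
    f = [0.0] * len(x)
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            fij = pair_force_1d(x[i], x[j])
            f[i] += fij
            f[j] -= fij  # write to another atom's slot: needs atomics on a GPU
    return f

def forces_gather(x):
    """One 'thread' per atom i; only f[i] is written, so no atomics are needed."""
    f = [0.0] * len(x)
    for i in range(len(x)):
        acc = 0.0
        for j in range(len(x)):
            if j != i:
                acc += pair_force_1d(x[i], x[j])  # each pair is evaluated twice
        f[i] = acc
    return f

positions = [0.0, 1.1, 2.3, 3.2]
scattered = forces_scatter(positions)
gathered = forces_gather(positions)
```

The gather version trades duplicated arithmetic for conflict-free writes, so which variant wins depends on how expensive the recomputed terms are.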
Tersoff potential
Analysis of the efficiency of the implementation of many-body interatomic potentials
GPU: NVIDIA GeForce GTX 470; CPU: Core i5 760 (4 cores)
The advantage of GPU acceleration is pronounced in molecular dynamics simulations because many-body models (Tersoff, REBO, …) have much higher algorithmic complexity than pair-potential models.
Embedded atom potential
Analysis of the efficiency of the implementation of many-body interatomic potentials
GPU: NVIDIA GeForce GTX 470; CPU: Core i5 760 (4 cores)
Performance of the algorithms
Lower execution time means better performance.
ETR(I, II) = execution time of Alg II / execution time of Alg I

Execution time ratio by number of atoms
Number of atoms:                           1000, 2000, 4000, 8000, 16000
Tersoff, ETR(Algorithm A, Algorithm WA):   9.67, 9.62, 10.42, 10.68, 11.94
EAM, ETR(Algorithm A, Algorithm WA):       0.51, 0.48, 0.47, 0.49
LJ, ETR(Analytic, Numeric):                8.39, 7.97, 15.69, 35.20, 42.01

Algorithm A is implemented with atomic operations. Algorithm WA is implemented without atomic operations.
Analytic force calculation is much faster than numeric, even for pair potentials.
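Reading the ratio is easy to get backwards, so here is a small sketch of the definition: ETR(I, II) above 1 means algorithm I is the faster one. The function names are hypothetical; the two ratios used in the example are the 1000-atom Tersoff and EAM entries from the slide.

```python
def execution_time_ratio(time_alg_1, time_alg_2):
    """ETR(I, II) = execution time of Alg II / execution time of Alg I."""
    return time_alg_2 / time_alg_1

def faster_algorithm(name_1, name_2, etr):
    """ETR(I, II) > 1 means Alg II takes longer, so Alg I is faster."""
    if etr > 1.0:
        return name_1
    if etr < 1.0:
        return name_2
    return "tie"

# Ratios reported on the slide for 1000 atoms.
tersoff_winner = faster_algorithm("A", "WA", 9.67)  # atomics win for Tersoff
eam_winner = faster_algorithm("A", "WA", 0.51)      # no-atomics wins for EAM
```

Under this reading, the Tersoff row says the atomic version is roughly ten times faster, while the EAM row says the version without atomics is roughly twice as fast.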
Conclusion
  DFT is the best method for modeling adatom diffusion and the surface properties of GaN.
  Empirical interatomic potentials can be used to calculate binding energies and diffusion barriers.
  Neither the Tersoff nor the SW potential reproduces the binding energies correctly; additional fitting is needed.
  OpenCL is an effective technology for GPU programming and is well suited to atomistic simulations based on molecular dynamics with many-body potentials.
  Atomic operations are not always bad! The Tersoff potential is 10 times faster with atomics.
  The average GPU acceleration is about 30 times for the Tersoff and embedded atom potentials.
Acknowledgement
This work is partially supported by the Russian Foundation for Basic Research (Grant no ). The reported study was done using computational resources of MCC NRC “Kurchatov Institute”.