March 12, 2008 A Parallel Algorithm for Optimization-Based Smoothing of Unstructured 3-D Meshes by Vincent C. Betro.

Presentation transcript:

March 12, 2008 A Parallel Algorithm for Optimization-Based Smoothing of Unstructured 3-D Meshes by Vincent C. Betro

Outline
Introduction
Objectives
Serial Optimization-Based Smoothing
Parallel Implementation
Results
Conclusions
Future Work

Introduction
Parallel Hierarchical Unstructured Cartesian Mesh Generation using P_HUGG, developed with Steve L. Karman, Jr. as part of an NRA with NASA Langley
Uses hexahedral, Cartesian elements
Body conforming through general cutting, yielding the four basic element types or polyhedral elements
Can cause small, high-aspect-ratio elements to be formed
Allows the user to determine refinement based on geometry spacing

Introduction
Why do we need optimization-based smoothing?
[Figure: before and after]

Introduction
Terminology
Ghost nodes: Nodes that exist and will be worked on by a neighboring processor but that are part of elements on the given processor. They allow information about movement to be shared fluidly with the neighboring processors, yet only one processor is allowed to move them during smoothing, which preserves uniqueness.
Threshold: User-defined value that sets the maximum level of the cost function (between 0 and 1 for non-inverted elements) allowed to remain in the final mesh. Usual values for a very smooth mesh are between 0.1 and 0.3. Low thresholds also cause most, if not all, nodes in the mesh to be worked on, making the METIS partitioning more accurate and relevant.

Objectives
To determine whether one can smooth a mesh in parallel using ghost nodes on processor borders and obtain the same results in a comparable number of iterations and comparable time
To determine which would yield the most benefit in determining the METIS partition: weighting nodes on the geometric boundaries (which need to be re-projected), weighting completely internal nodes (which are perturbed in more directions), or no weighting at all
To determine whether the algorithm would only allow larger meshes to be smoothed or would also yield speed-up on smaller meshes, if only up to a point

Serial Optimization-Based Mesh Smoothing In order to remove high aspect ratio elements (sliver cells) and get improved results from the flow solver, optimization-based smoothing is performed on the mesh. Each node is perturbed based on a cost function calculated using Jacobians and condition numbers of the surrounding elements.
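A minimal Python sketch of a condition-number cost for a single element corner. The slides do not give the formula, so everything here is an assumption for illustration: the form 1 - 3/kappa (with kappa the Frobenius condition number of the corner Jacobian), the right-angled reference corner (suitable for a Cartesian hexahedron, not for an equilateral tetrahedron), the penalty for inverted corners, and the name corner_cost. It is chosen only because it lands in the 0 to 1 range quoted for non-inverted elements.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def corner_cost(origin, a, b, c):
    """Cost of the corner at `origin` with edges running to a, b and c.

    Assumed form: 1 - 3/kappa, where kappa = ||J||_F * ||J^-1||_F and the
    rows of J are the three edge vectors. A perfect right-angled corner
    gives 0; slivers approach 1; inverted corners return a value >= 1.
    """
    e = [[p[i] - origin[i] for i in range(3)] for p in (a, b, c)]
    # Rows of the cofactor matrix; its Frobenius norm equals that of adj(J).
    cof = [cross(e[1], e[2]), cross(e[2], e[0]), cross(e[0], e[1])]
    det = sum(e[0][i] * cof[0][i] for i in range(3))
    if det <= 0.0:
        return 1.0 - det  # hypothetical penalty for inverted/degenerate corners
    norm_j = math.sqrt(sum(x * x for row in e for x in row))
    norm_cof = math.sqrt(sum(x * x for row in cof for x in row))
    kappa = norm_j * norm_cof / det
    return max(0.0, 1.0 - 3.0 / kappa)


if __name__ == "__main__":
    o = (0.0, 0.0, 0.0)
    print(corner_cost(o, (1, 0, 0), (0, 1, 0), (0, 0, 1)))     # ideal corner
    print(corner_cost(o, (1, 0, 0), (0, 1, 0), (0, 0, 0.01)))  # sliver
```

Under this assumed form a flattened corner scores close to 1 and a cube corner scores 0, so a user threshold of 0.1 to 0.3 would target the slivers first.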

Serial Optimization-Based Mesh Smoothing
First, the surrounding triangles’ cost function is computed: [equation not captured in transcript]
Then, the node’s cost function is computed using the max and average: [equation not captured in transcript]
The blending function is used to avoid the average cost being overshadowed: [equation not captured in transcript]
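The idea of combining the maximum and the average of the surrounding element costs can be sketched as follows. The actual blending function used in the presentation is not available here, so the linear blend and the parameter name blend are hypothetical; the sketch only shows why a blend is needed, namely that using the maximum alone would ignore improvements to every element except the worst one.

```python
def node_cost(element_costs, blend=0.5):
    """Blend the worst and the mean cost of the elements around a node.

    `blend` is a hypothetical weight in [0, 1]: 1.0 uses only the maximum,
    0.0 uses only the average. The real blending function is an unknown.
    """
    if not element_costs:
        raise ValueError("node has no surrounding elements")
    worst = max(element_costs)
    mean = sum(element_costs) / len(element_costs)
    return blend * worst + (1.0 - blend) * mean


if __name__ == "__main__":
    before = [0.2, 0.4, 0.9]
    after = [0.1, 0.2, 0.9]  # worst element unchanged, the others improved
    print(node_cost(before), node_cost(after))
```

With a pure maximum both lists above would score 0.9; with the blend, the second list scores lower, so a move that helps the neighbours without hurting the worst element is still recognized as an improvement.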

Serial Optimization-Based Mesh Smoothing
If the perturbation improves the cost function for the node, the node is moved permanently to the new position
Candidate perturbation directions:
–Along direction of opposite face normal
–Along direction of included edge
–Along cardinal x, y, or z coordinate axis
Note: A node is always perturbed by a pre-determined epsilon, based on a fraction of the length of the smallest edge
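The accept-if-improved perturbation step might look like the following sketch. Several details are assumptions not stated in the slides: that both signs of each direction are tried, that the best of all trials is kept rather than the first improving one, and the names perturb_node and cost_at. The cost function is passed in as a callback, and the demo uses a toy cost (squared distance to a target point) in place of a real element-quality measure.

```python
import math


def perturb_node(pos, directions, epsilon, cost_at):
    """Try moving `pos` by `epsilon` along each direction; keep the best move.

    `cost_at(position)` returns the node cost at a trial position. The node
    stays where it is unless some trial strictly lowers the cost.
    """
    best_pos, best_cost = tuple(pos), cost_at(tuple(pos))
    for d in directions:
        length = math.sqrt(sum(x * x for x in d))
        if length == 0.0:
            continue  # skip degenerate directions
        for sign in (1.0, -1.0):  # assumption: both senses are tested
            trial = tuple(p + sign * epsilon * x / length
                          for p, x in zip(pos, d))
            trial_cost = cost_at(trial)
            if trial_cost < best_cost:
                best_pos, best_cost = trial, trial_cost
    return best_pos, best_cost


if __name__ == "__main__":
    target = (0.5, 0.5, 0.5)

    def toy_cost(p):
        return sum((a - b) ** 2 for a, b in zip(p, target))

    axes = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
    print(perturb_node((0.6, 0.5, 0.5), axes, 0.05, toy_cost))
```

In a real smoother the direction list would also contain the opposite-face normals and included edges for the node, and epsilon would be derived from the smallest edge length as the slide notes.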

Serial Optimization-Based Mesh Smoothing
The mesh is moved until:
–no perturbation can improve the cost function any further, or
–the threshold value supplied by the user is attained, or
–the maximum number of perturbations supplied by the user is reached.
Experience will tell the user when to stop working on the boundaries and/or internal points and/or change threshold values or maximum iterations.
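The three stopping criteria can be expressed as a small driver loop. This is a sketch under assumptions: one "iteration" is taken to be a full sweep over the mesh, the sweep is a callback returning the number of nodes moved and the worst remaining cost, and the names smooth and make_halving_sweep are illustrative only.

```python
def smooth(sweep, threshold, max_iterations):
    """Run sweeps until one of the three stopping criteria is met.

    `sweep()` performs one pass over the mesh and returns
    (nodes_moved, worst_cost). Returns (reason, iterations_used).
    """
    for iteration in range(1, max_iterations + 1):
        moved, worst = sweep()
        if worst <= threshold:
            return "threshold", iteration
        if moved == 0:
            return "stalled", iteration  # no perturbation helped
    return "max_iterations", max_iterations


def make_halving_sweep(start_cost):
    """Toy sweep for demonstration: every pass halves the worst cost."""
    state = {"worst": start_cost}

    def sweep():
        state["worst"] *= 0.5
        return 1, state["worst"]

    return sweep


if __name__ == "__main__":
    print(smooth(make_halving_sweep(0.8), threshold=0.25, max_iterations=50))
    print(smooth(lambda: (0, 0.6), threshold=0.25, max_iterations=50))
    print(smooth(make_halving_sweep(0.8), threshold=0.01, max_iterations=3))
```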

Parallel Implementation
Using the METIS libraries, one can partition the mesh for distribution to multiple processors, on which the optimization-based smoothing can then be carried out (available at [URL missing from transcript])
METIS finds an optimal configuration to assist in load balancing and to decrease the surface area for communication
Open MPI is used for message passing
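The two quantities a partitioner trades off, load balance and communication surface, can be measured for any given partition with a few lines of Python. This sketch does not call METIS; it only evaluates a partition that is supplied by hand. The graph representation (a dict of neighbour lists), the imbalance definition, and the name partition_quality are assumptions for illustration.

```python
def partition_quality(adjacency, part, nparts):
    """Return (edge_cut, imbalance) for a partition of an undirected graph.

    adjacency: {vertex: [neighbouring vertices]}, listed symmetrically
    part:      {vertex: partition index in range(nparts)}
    edge_cut:  number of edges whose endpoints lie in different partitions,
               a proxy for the communication surface between processors
    imbalance: largest partition size divided by the ideal size (1.0 = even)
    """
    cut = sum(1
              for u, neighbours in adjacency.items()
              for v in neighbours
              if u < v and part[u] != part[v])
    sizes = [0] * nparts
    for p in part.values():
        sizes[p] += 1
    imbalance = max(sizes) * nparts / len(part)
    return cut, imbalance


if __name__ == "__main__":
    # A chain of four vertices: 0 - 1 - 2 - 3
    chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    print(partition_quality(chain, {0: 0, 1: 0, 2: 1, 3: 1}, 2))  # contiguous
    print(partition_quality(chain, {0: 0, 1: 1, 2: 0, 3: 1}, 2))  # interleaved
```

Both partitions in the demo are perfectly balanced, but the interleaved one cuts every edge, which in the smoothing setting would mean far more ghost nodes to exchange.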

Parallel Implementation
Similar implementation except:
–Ghost nodes processed first to allow bias to propagate as normal
–Communication of changes required between each iteration using processor-index triple
  Processor index
  Owning processor local index
  Global index
–Global communication to discern global threshold, action taken based on global value; processors already attaining value continue to improve smoothness and adjust for neighbors who have not attained threshold of cost function
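The role of the processor-index triple and of the global threshold check can be illustrated with a single-process simulation. No MPI is used here: the dictionary snapshot stands in for point-to-point messages and max() stands in for a global max-reduction. The class and function names (NodeId, Rank, exchange_ghosts, global_worst) are hypothetical and not taken from the presented code.

```python
from collections import namedtuple
from dataclasses import dataclass

# The triple attached to every node: who owns it, where it sits in the
# owner's local arrays, and its index in the global mesh.
NodeId = namedtuple("NodeId", ["owner", "owner_local", "global_id"])


@dataclass
class Rank:
    rank: int
    ids: list     # NodeId per local node (owned nodes and ghosts)
    coords: list  # (x, y, z) per local node, parallel to `ids`


def exchange_ghosts(ranks):
    """Overwrite every ghost copy with the owner's current coordinates."""
    snapshot = {r.rank: list(r.coords) for r in ranks}  # stand-in for messages
    for r in ranks:
        for i, nid in enumerate(r.ids):
            if nid.owner != r.rank:  # ghost: only the owner may move it
                r.coords[i] = snapshot[nid.owner][nid.owner_local]


def global_worst(local_worst_costs):
    """Stand-in for a max-reduction across all processors."""
    return max(local_worst_costs)


if __name__ == "__main__":
    r0 = Rank(0, [NodeId(0, 0, 0), NodeId(0, 1, 1), NodeId(1, 0, 2)],
              [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)])
    r1 = Rank(1, [NodeId(1, 0, 2), NodeId(1, 1, 3), NodeId(0, 1, 1)],
              [(2.0, 0.0, 0.0), (3.0, 0.0, 0.0), (1.0, 0.0, 0.0)])
    r0.coords[1] = (1.1, 0.0, 0.0)  # rank 0 smooths a node it owns
    r1.coords[0] = (2.0, 0.2, 0.0)  # rank 1 smooths a node it owns
    exchange_ghosts([r0, r1])
    print(r0.coords[2], r1.coords[2])
    # Every rank keeps iterating while the global worst cost exceeds the threshold.
    print(global_worst([0.30, 0.12]) > 0.25)
```

Because the stopping decision uses the global maximum, a rank whose local worst cost is already below the threshold keeps smoothing, which matches the behaviour described for processors that have already attained the value.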

Results: DTMB 5415 Surface Ship
Geometry extracted from existing tetrahedral mesh.
382,481 nodes
904,452 elements
–391,828 tetrahedra
–187,274 hexahedra
–325,350 pyramids
1 to 64 processors on Dell Linux Cluster
–325 nodes, 1300 processors (4 per node)
–4 Gbytes per node
–Dual-core Intel EM64T 3.0 GHz Xeon

Results: DTMB 5415 Surface Ship
Weighting did not improve the run time or the smoothness of the final mesh, because a similar number of operations is needed regardless of node type.
The connectivity of each node and the type of opposite face also bear on the workload.
Cases run with high thresholds tended to benefit less from parallelization and from additional processors.

Results: DTMB 5415 Surface Ship

Results: DPW III
Surface geometry created in Gridgen
–High-aspect-ratio quads on wing converted to triangles
1,739,455 points
4,382,046 elements
–2,326,329 tetrahedra
–899,786 hexahedra
–1,155,931 pyramids
1 to 64 processors on Dell Linux Cluster
–325 nodes, 1300 processors (4 per node)
–4 Gbytes per node
–Dual-core Intel EM64T 3.0 GHz Xeon
[Figures: surface definition; symmetry plane mesh]

Results: DPW III
Weighting did not improve the run time or the smoothness of the final mesh, because a similar number of operations is needed regardless of node type.
The connectivity of each node and the type of opposite face also bear on the workload.
Cases run with high thresholds tended to benefit less from parallelization and from additional processors.

Results: DPW III

[Figure: DPW volume mesh before and after optimization-based smoothing is applied to the airfoil.]

Results: DPW III DPW volume mesh after extra, internal-only optimization-based smoothing is applied to the airfoil with threshold of 0.01 for 50 iterations.

Conclusions
An effective optimization-based smoothing algorithm can be easily parallelized using Open MPI
Low threshold values showed the most gains on both large and small meshes, because nearly all nodes in the mesh are being worked on the whole time to improve the cost function
Large threshold values showed some gains with large meshes but generally showed no improvement past four processors for small meshes
Moreover, regardless of the threshold value, very large meshes that could not fit on a serial machine were smoothed, such as one based on USGS data for Yosemite National Park
Weighting boundary and non-boundary nodes differently has almost no effect on speed-up or smoothness, and can actually adversely affect both
The bias that might be attributable to ghost nodes was successfully damped by smoothing them first

Future Work
Add ParMETIS functionality to allow for pre-partitioned mesh files, so that the decomposition step can be skipped and better load balancing and a smaller communication surface area can be obtained
Allow the smoothing to work on general polyhedral elements, not only the four basic element types, by creating a cost function that can handle the corners of any type of polyhedral element
Use this simpler optimization as a template for parallelizing linear-elastic smoothing and viscous layer insertion codes
Use optimization-based and/or linear-elastic smoothing in conjunction with solution- and feature-based adaptation in parallel, using the octree-based P_HUGG to re-mesh between solution steps for moving, body-conforming, viscous meshes on the fly without intermediate processing