mps-tk : A C++ toolkit for multiple-point simulation

Alexandre Boucher, Advanced Resources and Risk Technology, LLC, and Stanford University, ERE
Toan Tran, Stanford University, EE

Objectives of the toolkit
  An SGeMS-based C++ library to ease the development of mps algorithms
  Present an abstraction of mps components
  Provide a coherent set of mps building blocks to developers
  Provide implementation examples
  A stepping stone for the next generation of mps algorithms
  The target user is a developer, not a geomodeler
  NOTE: the goal is to develop tools for building mps algorithms, not to write new algorithms.

Generic mps simulation algorithm
  Imposition of a spatial texture (geology) on a grid such that it conforms to the data.
  Patterns database
    The patterns database is typically built from training images
    A sequential simulation algorithm builds and queries that database
    The simulation is not based on the training image per se, but on the patterns extracted from that training image

Generic sequential simulation algorithm Pattern Database

Texture is a function of the patterns database
  The pattern database controls the look of the simulated geology
  The data control the locations of some texture elements

Generic abstraction of mps
  Patterns database
  Known values and ancillary data (trends, regions, angles, …)
  Data event: extracted from the grid and ancillary data
  Replicates: returned from the database and matching the data event; may take different forms (e.g. distributions)
  Sampler: selects a replicate from the list returned by the pattern database
  Probability field adjustment

Abstraction components
  Data event: pattern and information retrieved from the simulation grid
  Pattern database: finds patterns matching the data event
  Replicates: outputs from the database
  Sampler: selects a replicate from the collection of replicates
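As a rough illustration of how the four components fit together, the sketch below runs a sequential simulation on a 1D toy grid. It is not mps-tk code: `DataEvent`, `matching_replicates`, `simulate` and the `kUnknown` marker are hypothetical names, the data event is reduced to the two immediate neighbors, and the "database" is a plain scan of the training image.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical 1D illustration; none of these names come from mps-tk.
constexpr int kUnknown = -1;  // marks a node that is not informed yet

// Data event: values of the left and right neighbors (kUnknown if missing).
struct DataEvent {
    int left;
    int right;
};

// Pattern database query: scan the training image (ti) and return the center
// value of every location whose neighbors agree with the informed part of
// the data event. The returned values are the replicates.
std::vector<int> matching_replicates(const std::vector<int>& ti,
                                     const DataEvent& dev) {
    std::vector<int> replicates;
    for (std::size_t i = 1; i + 1 < ti.size(); ++i) {
        const bool left_ok = dev.left == kUnknown || ti[i - 1] == dev.left;
        const bool right_ok = dev.right == kUnknown || ti[i + 1] == dev.right;
        if (left_ok && right_ok) replicates.push_back(ti[i]);
    }
    return replicates;
}

// Sequential simulation: visit the nodes along a random path, build the data
// event, query the database and let the sampler draw one replicate.
std::vector<int> simulate(const std::vector<int>& ti, std::vector<int> grid,
                          unsigned seed) {
    assert(ti.size() >= 3);  // at least one interior training location
    std::mt19937 rng(seed);
    std::vector<std::size_t> path(grid.size());
    std::iota(path.begin(), path.end(), std::size_t{0});
    std::shuffle(path.begin(), path.end(), rng);

    for (std::size_t node : path) {
        if (grid[node] != kUnknown) continue;  // hard data are kept as is
        const DataEvent dev{node > 0 ? grid[node - 1] : kUnknown,
                            node + 1 < grid.size() ? grid[node + 1] : kUnknown};
        std::vector<int> replicates = matching_replicates(ti, dev);
        if (replicates.empty()) {
            // No match: drop the conditioning and use every training value.
            replicates = matching_replicates(ti, DataEvent{kUnknown, kUnknown});
        }
        // Sampler: uniform draw among the replicates.
        std::uniform_int_distribution<std::size_t> pick(0, replicates.size() - 1);
        grid[node] = replicates[pick(rng)];
    }
    return grid;
}
```

Calling `simulate` with a binary training image and a grid holding one hard datum returns a fully informed grid in which that datum is unchanged. Swapping the scan for a search tree or the uniform draw for a servo-constrained one would only change `matching_replicates` or the sampling step, which is the separation the abstraction aims for.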

Libraries included in mps-tk
  mps_dataevent: contains all the data event (dev) related functionality
  mps_pattern_database: databases and functionality to build new databases
  mps_replicate: samplers, servo system and collections of replicates
  mps_app_utility: base classes and utilities that link mps-tk together
  mps_parallelism: task manager for multi-threading
  Unit testing: functionality and regression testing

mps_dataevent
  Geovalue_dataevent: data event around a central location (+ ancillary data)
    Created with either Fixed_template_geodev_builder or Global_template_geodev_builder
  Directional_overlap_dataevent
    Created with Directional_overlap_dev_builder
  Dataevent_distance: computes the similarity between two data events
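The slides do not say which metric `Dataevent_distance` uses. One plausible choice for categorical data, shown here only as an assumption, is the fraction of commonly informed nodes whose values differ. The function name `mismatch_distance` and the `kUninformed` marker are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical marker for a template node that carries no value.
constexpr int kUninformed = -1;

// Distance between two data events stored as flat vectors over the same
// template: the fraction of nodes informed in both events whose values
// differ. Returns 0 when the events share no informed node.
double mismatch_distance(const std::vector<int>& a, const std::vector<int>& b) {
    assert(a.size() == b.size());  // both events must use the same template
    std::size_t compared = 0;
    std::size_t mismatches = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] == kUninformed || b[i] == kUninformed) continue;
        ++compared;
        if (a[i] != b[i]) ++mismatches;
    }
    if (compared == 0) return 0.0;
    return static_cast<double>(mismatches) / static_cast<double>(compared);
}
```

A distance of 0 means the informed parts agree everywhere, and 1 means they disagree everywhere. Continuous variables would need a different metric, for example a sum of squared differences.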

mps_pattern_database
  Search_tree_pattern_database: builds a search tree
  Scanning_pattern_database: searches the TI (training image) for any data configuration
  Fixed_template_pattern_database: decomposes the TI into patterns with a fixed shape
  Cross_correlation_pattern_database: TI values in Fourier space, ready for convolution
  Multigrid_pattern_database: one database per multi-grid (patDB-1, …)
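To make the fixed-template idea concrete, the sketch below decomposes a small 2D training image into all of its w-by-h patterns. It is an illustration under stated assumptions, not the toolkit's implementation: `Image`, `Pattern` and `extract_patterns` are hypothetical names, and the image is stored row-major in a flat vector.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical 2D training image stored row-major (size nx * ny).
struct Image {
    std::size_t nx;
    std::size_t ny;
    std::vector<int> values;
    int at(std::size_t x, std::size_t y) const { return values[y * nx + x]; }
};

using Pattern = std::vector<int>;  // w * h values, row-major

// Slide a w-by-h window over the image and store every window as a pattern.
// An nx-by-ny image yields (nx - w + 1) * (ny - h + 1) patterns.
std::vector<Pattern> extract_patterns(const Image& ti, std::size_t w,
                                      std::size_t h) {
    assert(ti.values.size() == ti.nx * ti.ny);
    assert(w > 0 && h > 0 && w <= ti.nx && h <= ti.ny);
    std::vector<Pattern> patterns;
    for (std::size_t y0 = 0; y0 + h <= ti.ny; ++y0) {
        for (std::size_t x0 = 0; x0 + w <= ti.nx; ++x0) {
            Pattern p;
            p.reserve(w * h);
            for (std::size_t dy = 0; dy < h; ++dy)
                for (std::size_t dx = 0; dx < w; ++dx)
                    p.push_back(ti.at(x0 + dx, y0 + dy));
            patterns.push_back(std::move(p));
        }
    }
    return patterns;
}
```

A database built this way pays the extraction cost once and can then be searched with any data event distance. A scanning database instead revisits the training image at every query, which allows arbitrary data configurations.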

mps_replicate
  Results from the pattern database
    Collection_replicate: results from the pattern database
    Cdf_replicate: conditional distribution from the pattern database
  Monte Carlo drawing from replicates
    Pixel-based simulation
      Cdf_servo_sampler: draws from a distribution given the constraint of a servo system
      Updating_cdf_servo_sampler: draws from a distribution given the constraint of a servo system with a probability field
    Patch-based simulation
      Random_replicate_sampler: draws a replicate from a collection of replicates
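The slides name a servo system but do not give its formula. The sketch below assumes a simple additive correction that nudges the conditional probabilities toward target proportions before the Monte Carlo draw. `servo_adjust`, `draw_category` and the `gain` parameter are hypothetical and are not claimed to match `Cdf_servo_sampler`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Assumed servo correction: shift each conditional probability by
// gain * (target proportion - currently simulated proportion), clip at
// zero and renormalize so the result sums to one.
std::vector<double> servo_adjust(const std::vector<double>& cpdf,
                                 const std::vector<double>& target,
                                 const std::vector<double>& current,
                                 double gain) {
    assert(cpdf.size() == target.size() && cpdf.size() == current.size());
    std::vector<double> adjusted(cpdf.size());
    double sum = 0.0;
    for (std::size_t k = 0; k < cpdf.size(); ++k) {
        adjusted[k] = std::max(0.0, cpdf[k] + gain * (target[k] - current[k]));
        sum += adjusted[k];
    }
    if (sum <= 0.0) return cpdf;  // correction wiped everything out: keep input
    for (double& p : adjusted) p /= sum;
    return adjusted;
}

// Monte Carlo draw of a category index from a discrete distribution.
std::size_t draw_category(const std::vector<double>& pdf, std::mt19937& rng) {
    assert(!pdf.empty());
    std::discrete_distribution<std::size_t> dist(pdf.begin(), pdf.end());
    return dist(rng);
}
```

With `gain` set to 0 the sampler draws from the unmodified conditional distribution. Larger values pull the simulated proportions toward the targets at the cost of departing from the statistics of the pattern database.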

mps_parallelism
  Computational_task: encapsulates data and a processing method to be executed by a computational resource (thread or process)
  Computational_task_manager: uses threads to execute tasks
    Currently uses the Boost thread library for shared-memory machines
    Designed to be extended to distributed-memory applications
    Platform-independent (Windows, Linux)
  To run in parallel, write the CPU-intensive process as a Computational_task
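The task and task-manager split can be sketched as follows. This is a hypothetical stand-in, not the mps-tk classes: it uses `std::thread` from the C++ standard library instead of Boost, and `Task`, `SumTask`, `TaskManager` and `parallel_sum` are illustrative names.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// A task bundles its data with the method that processes it.
class Task {
public:
    virtual ~Task() = default;
    virtual void run() = 0;
};

// Example task: sum one slice [begin, end) of a shared, read-only vector.
class SumTask : public Task {
public:
    SumTask(const std::vector<int>& data, std::size_t begin, std::size_t end)
        : data_(data), begin_(begin), end_(end) {}
    void run() override {
        for (std::size_t i = begin_; i < end_; ++i) result_ += data_[i];
    }
    long result() const { return result_; }

private:
    const std::vector<int>& data_;
    std::size_t begin_;
    std::size_t end_;
    long result_ = 0;
};

// The manager hands tasks to a fixed number of worker threads. Each worker
// claims the next unprocessed task through an atomic counter.
class TaskManager {
public:
    explicit TaskManager(std::size_t n_threads) : n_threads_(n_threads) {
        assert(n_threads > 0);
    }
    void run_all(const std::vector<Task*>& tasks) const {
        std::atomic<std::size_t> next{0};
        auto worker = [&]() {
            for (;;) {
                const std::size_t i = next.fetch_add(1);
                if (i >= tasks.size()) return;
                tasks[i]->run();
            }
        };
        std::vector<std::thread> pool;
        for (std::size_t t = 0; t < n_threads_; ++t) pool.emplace_back(worker);
        for (std::thread& t : pool) t.join();
    }

private:
    std::size_t n_threads_;
};

// Usage: split the data into n_tasks slices and combine the partial sums.
long parallel_sum(const std::vector<int>& data, std::size_t n_tasks,
                  std::size_t n_threads) {
    assert(n_tasks > 0);
    std::vector<SumTask> tasks;
    tasks.reserve(n_tasks);
    const std::size_t chunk = (data.size() + n_tasks - 1) / n_tasks;
    for (std::size_t t = 0; t < n_tasks; ++t) {
        const std::size_t begin = std::min(t * chunk, data.size());
        const std::size_t end = std::min(begin + chunk, data.size());
        tasks.emplace_back(data, begin, end);
    }
    std::vector<Task*> pointers;
    for (SumTask& task : tasks) pointers.push_back(&task);
    TaskManager(n_threads).run_all(pointers);
    long total = 0;
    for (const SumTask& task : tasks) total += task.result();
    return total;
}
```

Because each task writes only to its own result and the manager joins every thread before returning, no locking is needed here. A pattern-database scan could follow the same shape by giving each task one region of the training image.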

Types of parallelism implemented
  Parallel task manager applied within a database (e.g. pattern scanning): patDB with Task 1, Task 2, …, Task N
  Parallel task manager applied to a full simulation (e.g. CCSIM): patDB-1 Task 1, …, patDB-1 Task N

A note on hard data conditioning
  Lack of a standard approach for conditioning in mps
    Copy data on grid
    Built-in conditioning (snesim)
    Distance-weighted function (filtersim)
    Adaptive data event (ccsim)
    Rejection approach
  Objectives:
    mps-tk does not interfere with conditioning approaches.
    mps-tk offers flexibility to implement the appropriate conditioning technique.

Build a scanning pattern database

    // set basic data event builder
    dataevent_builder->grid_is(training_image_);
    dataevent_builder->neighborhood_is(subgrid_ti_neighborhoods_);

    // set trend data event builder
    trend_dev_builder->dataevent_builder_is(dataevent_builder);
    trend_dev_builder->trend_property_is(training_image_trend_property_);
    fixed_template_pattern_db_->dataevent_builder_is(trend_dev_builder);

    // set dev_distance to pattern database
    fixed_template_pattern_db_->dataevent_distance_is(trend_dataevent_distance_);

    // Create and set dev_selector to pattern database
    dataevent_selector->k_number_is(k_number);

    // set dev_selector to pattern database
    fixed_template_pattern_db_->dataevent_selector_is(dataevent_selector_);

Pattern database with multiple grids
  Multigrid_pattern_database: a collection of databases, one for each multiple grid
  Each multi-grid can use a different pattern database

    multigrid_scanning_pattern_db_ =
        Multigrid_pattern_database< Geovalue_dataevent,
            Collection_replicate<Geovalue_dataevent> >::Ptr_new();

    for (int mgrid = 0; mgrid <= nb_mgrids_; mgrid++) {
        scanning_pattern_db_ = Scanning_pattern_database::Ptr_new();
        // … initialize individual pattern database …
        // add database to the mgrid database
        multigrid_scanning_pattern_db_->pattern_database_add(scanning_pattern_db_);
    }

Multi-grid simulation with a scanning database

    /* Loop for all multiple grids */
    for (int mgrid = nb_mgrids_; mgrid >= 0; mgrid--) {
        /* initialize simulation path */
        RGrid::random_path_iterator path_begin = simul_grid_->random_path_begin();
        RGrid::random_path_iterator path_end = simul_grid_->random_path_end();

        /* Set subgrid to multigrid_pattern_databases */
        multigrid_fixed_template_pattern_db_->current_subgrid_is(subgrid);

        /* Run simulation */
        pixel_based_simulate(path_begin, path_end,
                             fixed_template_dev_builder_.get(),
                             multigrid_fixed_template_pattern_db_.get(),
                             sampler_.get());
    }
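The slides do not define how a multiple grid is laid out. Assuming the common convention that level g keeps every 2^g-th node, the coarse-to-fine loop above visits nodes in the order sketched below for a 1D grid. `coarse_to_fine_order` is a hypothetical helper, and nodes within a level are listed in ascending order, whereas the loop above draws a random path.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Visit order of a 1D grid with n nodes when simulating from the coarsest
// multi-grid level down to level 0. Level g is assumed to contain the nodes
// 0, 2^g, 2 * 2^g, ... and nodes already visited on a coarser level are
// skipped on finer ones.
std::vector<std::size_t> coarse_to_fine_order(std::size_t n, int coarsest_level) {
    assert(coarsest_level >= 0 && coarsest_level < 31);
    std::vector<bool> visited(n, false);
    std::vector<std::size_t> order;
    order.reserve(n);
    for (int level = coarsest_level; level >= 0; --level) {
        const std::size_t step = std::size_t{1} << level;
        for (std::size_t i = 0; i < n; i += step) {
            if (!visited[i]) {
                visited[i] = true;
                order.push_back(i);
            }
        }
    }
    return order;
}
```

For nine nodes and a coarsest level of 2, the coarse level informs nodes 0, 4 and 8, the next level adds 2 and 6, and the finest level fills in the remaining odd nodes. Large-scale structure is therefore placed before small-scale detail.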

CCSIM algorithm
  The data event builder and the FFT-based database were previously initialized

    for (; _path_begin != _path_end; _path_begin++) {
        /* Create the data event */
        Directional_overlap_dataevent::Ptr simul_dev =
            simul_dev_builder_->dataevent(*_path_begin);

        /* Get the collection of replicates from the pattern database */
        Lazy_collection_replicate<Directional_overlap_dataevent>::Ptr replicates =
            ccsim_pattern_db_->replicate(simul_dev);

        /* Randomly draw a replicate from the selected set of replicates */
        sampler_->sample(replicates);
    }

Development and perspectives
  The immediate goal is to provide tools to develop new algorithms
    Write more examples to expand and test the components and the design libraries
    GPU-powered pattern database
    Parallelism on servers with MPI
    Complex gridding
  Offer new perspectives
    Meta-pattern database: a database of several databases
    Pattern database not derived from a TI, e.g. process-based equations
    Cloud-based pattern database