XRD data analysis software development

Presentation transcript:

XRD data analysis software development

Outline
- Background
- Reasons for change
- Conversion challenges
- Status

X-Ray Diffraction (XRD)
- What is an XRD experiment for?
  - It provides information on the relative positions of atoms in a crystal
  - It allows individual crystalline structures to be identified
  - It also detects strains in the crystals
- What is XRD data?
  - Digital images collected by a CCD camera while the synchrotron X-ray beam scans a sample area
  - Image data sizes are large
    - One 2D image is collected for each scan point
    - 8MB/image at CLS, 2084X2084 pixels/image
    - Hundreds or even thousands of images can be collected in one experiment, depending on
      - the size of the sample area to be scanned
      - the step size used to move the sample during the scan
    - More scan points provide more detailed information for the analysis

XRD data analysis
- Deals with a large amount of image data
- Several procedures are applied to each image
  - Peak searching: identify regions of interest
    - includes threshold finding, blob searching and 2D curve fitting on each blob
  - Indexing: identify possible/known crystalline structures
  - Strain analysis: detect strains in the material (?)
- The existing XRD data analysis software was written in IDL
  - A proprietary scripting language
  - Processes are carried out sequentially only
  - It is very time consuming
    - e.g. days are normally needed to process a whole package of data
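The first two peak-searching steps named above (threshold finding and blob searching) can be sketched in C as follows. This is only an illustration: the slides do not say which threshold rule or blob algorithm the IDL code uses, so the rule in `find_threshold` (a fraction of the way from the mean to the maximum pixel value) and the 4-connected flood fill in `label_blobs` are assumptions, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical threshold rule (not taken from the original software):
 * mean pixel value plus a fraction of the gap between mean and maximum. */
double find_threshold(const unsigned short *img, size_t n, double fraction)
{
    assert(img != NULL && n > 0);
    double sum = 0.0;
    unsigned short max = img[0];
    for (size_t i = 0; i < n; i++) {
        sum += img[i];
        if (img[i] > max)
            max = img[i];
    }
    double mean = sum / (double)n;
    return mean + fraction * ((double)max - mean);
}

/* Label 4-connected blobs of pixels strictly above the threshold.
 * labels[] receives 0 for background and 1..count for blobs.
 * Returns the number of blobs, or -1 if memory allocation fails. */
int label_blobs(const unsigned short *img, int width, int height,
                double threshold, int *labels)
{
    assert(img != NULL && labels != NULL && width > 0 && height > 0);
    size_t n = (size_t)width * (size_t)height;
    int *stack = malloc(n * sizeof *stack);
    if (stack == NULL)
        return -1;
    for (size_t i = 0; i < n; i++)
        labels[i] = 0;

    int count = 0;
    for (size_t start = 0; start < n; start++) {
        if (img[start] <= threshold || labels[start] != 0)
            continue;
        count++;
        size_t top = 0;
        labels[start] = count;          /* mark when pushed, so each   */
        stack[top++] = (int)start;      /* pixel is pushed at most once */
        while (top > 0) {
            int p = stack[--top];
            int x = p % width, y = p / width;
            const int nx[4] = { x - 1, x + 1, x, x };
            const int ny[4] = { y, y, y - 1, y + 1 };
            for (int d = 0; d < 4; d++) {
                if (nx[d] < 0 || nx[d] >= width || ny[d] < 0 || ny[d] >= height)
                    continue;
                int q = ny[d] * width + nx[d];
                if (img[q] > threshold && labels[q] == 0) {
                    labels[q] = count;
                    stack[top++] = q;
                }
            }
        }
    }
    free(stack);
    return count;
}
```

Each labelled blob would then be handed to the 2D curve fitting step; because blobs are independent of one another, that step is a natural candidate for parallel execution.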

Reasons for change
- Needed for incorporation into Science Studio
  - An aim of Science Studio (SS) is to give remote users feedback during experimental runs; XRD analysis is one such source of feedback
  - The existing code is in a scripting language and relies on sequential processing
    - The existing software is written in IDL
    - Peak searching is in IDL
    - Indexing and strain analysis are in IDL, calling external routines in Fortran
- Needed to have versions for streaming data analysis
  - Stream processing: taking a stream of input data, processing the data in a series of steps, and streaming the results out, achieving real-time or close to real-time performance
- Needed to solve the data storage problem
  - Accumulating a large amount of raw image data over a long time could cause a storage problem
  - Only the peaks in each image are useful information for the analysis
  - If those peaks can be found in real time during data collection, it might not be necessary to keep the raw images
    - e.g. a typical raw image is 8MB at CLS, while the peak data for that image is only about 10KB
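The storage argument can be made concrete with a little arithmetic. The sketch below is an illustration under stated assumptions: the 8MB and 10KB figures are read as binary units, and `scan_points` assumes a simple raster scan with a point at both ends of each axis, which the slides do not specify. All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes quoted on the slide, interpreted as binary units (an assumption). */
#define RAW_IMAGE_BYTES (8ULL * 1024 * 1024)
#define PEAK_DATA_BYTES (10ULL * 1024)

/* Assumed raster scan: one image per grid point, with a point at each
 * end of both axes, hence the +1. Lengths are in micrometres. */
uint64_t scan_points(uint32_t width_um, uint32_t height_um, uint32_t step_um)
{
    assert(step_um > 0);
    return ((uint64_t)(width_um / step_um) + 1) *
           ((uint64_t)(height_um / step_um) + 1);
}

/* Storage needed if every raw image is kept. */
uint64_t raw_storage_bytes(uint64_t n_images)
{
    return n_images * RAW_IMAGE_BYTES;
}

/* Storage needed if only the peak data of each image is kept. */
uint64_t peak_storage_bytes(uint64_t n_images)
{
    return n_images * PEAK_DATA_BYTES;
}

/* How many times smaller the peak data is than the raw image. */
double storage_reduction_factor(void)
{
    return (double)RAW_IMAGE_BYTES / (double)PEAK_DATA_BYTES;
}
```

Under these assumptions, 1000 images occupy about 8.4 GB as raw data but only about 10 MB as peak data, a reduction of roughly 800 times.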

How to make the change
- Our development target
  - Port the existing XRD data analysis software to a Cell system at SHARCNET to achieve stream processing for XRD data analysis
- SHARCNET's Cell system
  - 8 Cell blades (QS22), with 2 Cell processor chips on each blade, i.e. 16 Cell processors in total
- Cell processor: a heterogeneous multi-core architecture
  - Two types of cores optimized for different tasks
  - 1 Power Processing Element (PPE) and 8 Synergistic Processing Elements (SPEs)
  - PPE: PowerPC architecture, acts as a controller and performs control-intensive tasks
  - SPEs: simpler cores that devote more resources to computation and perform computation-intensive tasks
  - The Cell processor can be programmed to achieve stream processing

Basic Cell Programming Model
[Figure: XRD data analysis procedures. Panels: Diffraction pattern; Orientation; Strain; Resultant Maps]

Challenges
- The Cell system runs only Linux and code compiled from C/C++
  - The PPE and the SPEs execute different instruction sets
  - Code for the PPE and code for the SPEs are compiled with different compilers
- The existing software is written in IDL
  - Peak searching is in IDL
  - Indexing and strain analysis are in IDL, calling external routines in Fortran
- Challenges
  - No algorithm description is provided: the code has to be rewritten in C using only the IDL source code
  - Programming on Cell is new and challenging because of Cell's special architecture
    - Knowledge of assembly-level programming is needed
    - Limited function libraries are available for Cell's SPEs

Development plan
- Rewrite the code in C
- Validate the results produced by the C code
  - Compare them with results from the existing software
- Make the code run on Cell's PPE
- Design for parallel processing on Cell
  - Identify a strategy for parallel computation
  - Identify what should be executed on Cell's SPEs
- Implement the design
- Validate the results produced by Cell
- Measure performance
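The validation steps in this plan amount to comparing numeric output from the new code against reference output from the existing software. A minimal sketch of such a check is given below. The slides do not describe how the comparison was actually done, so the function name `first_mismatch` and the combined absolute/relative tolerance are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

static double abs_diff(double a, double b) { return a > b ? a - b : b - a; }
static double magnitude(double a) { return a < 0.0 ? -a : a; }

/* Compare n values from the new code (out) against reference values (ref),
 * e.g. peak positions exported from the existing software.
 * A value passes if |ref - out| <= abs_tol + rel_tol * |ref|.
 * Returns the index of the first failing value, or -1 if all pass. */
long first_mismatch(const double *ref, const double *out, size_t n,
                    double abs_tol, double rel_tol)
{
    assert(ref != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        double allowed = abs_tol + rel_tol * magnitude(ref[i]);
        if (abs_diff(ref[i], out[i]) > allowed)
            return (long)i;
    }
    return -1;
}
```

A tolerance is used rather than exact equality because a port from IDL and Fortran to C may change the order of floating-point operations and therefore the last digits of the results.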

Progress Report
- The peak searching and indexing procedures have been rewritten in C
- Results produced by the C code for both procedures have been validated, at least with our limited data set
- Peak searching has been ported to Cell successfully
  - Threshold finding and blob searching are carried out by the PPE
  - 2D curve fitting (Lorentz fitting) for each blob is carried out by the SPUs
    - The typical number of blobs found in each image is about 100 to 200, depending on the threshold setting
- Some preliminary performance measurements have been made on the Cell system for the peak searching procedure
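The Lorentz fitting step needs a model function and a measure of how well it matches the pixels of a blob. The sketch below shows one common 2D Lorentzian form and a sum-of-squared-residuals cost; it is an assumption, since the slides do not give the exact profile or the optimizer used in the original code. The struct `Lorentz2D` and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Parameters of an assumed 2D Lorentzian peak:
 * f(x, y) = background + amplitude / (1 + ((x-x0)/wx)^2 + ((y-y0)/wy)^2) */
typedef struct {
    double amplitude;   /* peak height above background */
    double x0, y0;      /* peak centre in pixel coordinates */
    double wx, wy;      /* half-widths along x and y */
    double background;  /* constant offset */
} Lorentz2D;

/* Evaluate the model at pixel coordinates (x, y). */
double lorentz2d_eval(const Lorentz2D *p, double x, double y)
{
    assert(p != NULL && p->wx > 0.0 && p->wy > 0.0);
    double dx = (x - p->x0) / p->wx;
    double dy = (y - p->y0) / p->wy;
    return p->background + p->amplitude / (1.0 + dx * dx + dy * dy);
}

/* Sum of squared residuals between the model and a row-major window of
 * pixels whose top-left pixel sits at (x_origin, y_origin). A fitting
 * routine would adjust *p to minimize this value. */
double lorentz2d_sse(const Lorentz2D *p, const unsigned short *pix,
                     int width, int height, int x_origin, int y_origin)
{
    assert(pix != NULL && width > 0 && height > 0);
    double sse = 0.0;
    for (int row = 0; row < height; row++) {
        for (int col = 0; col < width; col++) {
            double model = lorentz2d_eval(p, x_origin + col, y_origin + row);
            double r = (double)pix[row * width + col] - model;
            sse += r * r;
        }
    }
    return sse;
}
```

Because each blob is fitted independently of the others, the 100 to 200 blobs of an image can be distributed across several SPEs without any communication between the fits.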

Some preliminary performance measurement (2)
Peak searching on CLS XRD data: 8MB/image, 2084X2084 pixels/image
Desktop speed: 9.34 sec./image
Table columns:
- Total number of images
- Total number of blade(s) used (2 Cell chips per blade)
- Parallel images per blade (16 SPEs per blade)
- Number of SPEs per image
- Total time (sec.)
- Cell speed (time/image in sec.) = TotalTime / TotalImages
- Speed up = desktopSpeed / CellSpeed
Table entries: 32 (32X1X1) (16X1X2) (8X1X4) (4X1X8) (2X2X8) (1X4X8) (1X8X8)
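The two derived columns of this table follow directly from the formulas in their headers. A small C sketch of the calculation is shown below; the function names are hypothetical, and no measured values are assumed beyond the formulas themselves.

```c
#include <assert.h>

/* Cell speed as defined in the table header:
 * total processing time divided by the number of images processed. */
double time_per_image(double total_time_sec, int total_images)
{
    assert(total_images > 0);
    return total_time_sec / total_images;
}

/* Speed up as defined in the table header:
 * desktop time per image divided by Cell time per image. */
double speedup(double desktop_sec_per_image, double cell_sec_per_image)
{
    assert(cell_sec_per_image > 0.0);
    return desktop_sec_per_image / cell_sec_per_image;
}
```

A run would be evaluated by passing its total time and image count to `time_per_image` and then dividing the desktop figure of 9.34 sec./image by the result with `speedup`; a value above 1 means the Cell configuration is faster than the desktop.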

More work to do
- Continue rewriting the strain analysis code for XRD data in C
- Port the indexing and strain analysis procedures onto Cell
- Design a programming model for Cell to achieve stream processing for all procedures in XRD data analysis
- Implement the design
- Integrate stream processing of XRD data with Science Studio