(A) Approved for public release; distribution is unlimited. High Performance Processing with MONARCH – A Case Study in CT Reconstruction High Performance.

Slides:



Advertisements
Similar presentations
Image Artifacts Chapter 8 Bushong.
Advertisements

Seeram Chapter 13: Single Slice Spiral - Helical CT
Improvement of CT Slice Image Reconstruction Speed Using SIMD Technology Xingxing Wu Yi Zhang Instructor: Prof. Yu Hen Hu Department of Electrical & Computer.
IPIM, IST, José Bioucas, X-Ray Computed Tomography Radon Transform Fourier Slice Theorem Backprojection Operator Filtered Backprojection (FBP) Algorithm.
Dr. Bo Marr Ken Prager Raytheon (SAS) Engineering Mark Trainoff Raytheon Advanced Concepts and Technology Sep. 21, 2011 Super Computing with Embedded Efficiency:
1 Improving the Performance of Distributed Applications Using Active Networks Mohamed M. Hefeeda 4/28/1999.
H.264/AVC Baseline Profile Decoder Complexity Analysis Michael Horowitz, Anthony Joch, Faouzi Kossentini, and Antti Hallapuro IEEE TRANSACTIONS ON CIRCUITS.
Information Retrieval in Practice
Integrated planning  Understand integrated planning process  Describe SAP integrated planning technology or tool (SAP-IP)
Evaluation of Data-Parallel Splitting Approaches for H.264 Decoding
GPUs. An enlarging peak performance advantage: –Calculation: 1 TFLOPS vs. 100 GFLOPS –Memory Bandwidth: GB/s vs GB/s –GPU in every PC and.
BMME 560 & BME 590I Medical Imaging: X-ray, CT, and Nuclear Methods Tomography Part 4.
1 The Future of Software Architecture: Special Issues in Embedded, Real-time Systems Carolyn Boettcher, Richard Falcioni Raytheon Company, Electronic Systems.
State Machines Timing Computer Bus Computer Performance Instruction Set Architectures RISC / CISC Machines.
University of Michigan Electrical Engineering and Computer Science Low-Power Scientific Computing Ganesh Dasika, Ankit Sethia, Trevor Mudge, Scott Mahlke.
Chapter 3 Overview of Operating Systems Copyright © 2008.
tomos = slice, graphein = to write
The Medusa Proxy A Tool For Exploring User- Perceived Web Performance Mimika Koletsou and Geoffrey M. Voelker University of California, San Diego Proceeding.
DCL Concepts STL Concepts ContainerIteratorAlgorithmFunctorAdaptor What New Concepts are Needed for a “DCL”? (Distributed Computing Library) Distributed.
History of Digital Camera By : Dontanisha Williams P2.
Introduction to Longitudinal Phase Space Tomography Duncan Scott.
Performance Acceleration Module Tech ONTAP Live August 6 th, 2008.
Basic Principles of Computed Tomography Dr. Kazuhiko HAMAMOTO Dept. of Infor. Media Tech. Tokai University.
BENCHMARK SUITE RADAR SIGNAL & DATA PROCESSING CERES EPC WORKSHOP
Software systems for Computer Vision and Image Processing Sungsoo Ha Prof. Murali Subbarao (Stony Brook University) Prof. A.N. Rajagopalan (Indian Institute.
EXACT TM CT Scanner EXACT: The heart of an FAA-certified Explosives Detection Scanner 3-D Image.
material assembled from the web pages at
Seeram Chapter 7: Image Reconstruction
Optimizing Katsevich Image Reconstruction Algorithm on Multicore Processors Eric FontaineGeorgiaTech Hsien-Hsin LeeGeorgiaTech.
Sogang University Advanced Computing System Chap 1. Computer Architecture Hyuk-Jun Lee, PhD Dept. of Computer Science and Engineering Sogang University.
Filtered Backprojection. Radon Transformation Radon transform in 2-D. Named after the Austrian mathematician Johann Radon RT is the integral transform.
NSF Center for Adaptive Optics UCO Lick Observatory Laboratory for Adaptive Optics Tomographic algorithm for multiconjugate adaptive optics systems Donald.
1 Embedded Distributed Real-Time Resource Management Carl Hein, Aron Rubin Lockheed Martin Advanced Technology Lab Cherry Hill NJ September 25, 2003 HPEC.
Computed Tomography Q & A
Military Diagnostics, Prognostics, and Logistics - A Way Forward Graham Tebby – Chief Engineer (presenter) David Stamm – Principal.
© 2007 SET Associates Corporation SAR Processing Performance on Cell Processor and Xeon Mark Backues, SET Corporation Uttam Majumder, AFRL/RYAS.
B3AS Joseph Lewthwaite 1 Dec, 2005 ARL Knowledge Fusion COE Program.
Pipelined and Parallel Computing Data Dependency Analysis for 1 Hongtao Du AICIP Research Mar 9, 2006.
Doc.: IEEE ad Submission September 2009 J. Karaoguz, Broadcom, Chair WirelessHD Coex SGSlide 1 WirelessHD Specification and Coexistence.
HPEC2002_Session1 1 DRM 11/11/2015 MIT Lincoln Laboratory Session 1: Novel Hardware Architectures David R. Martinez 24 September 2002 This work is sponsored.
MAPLD 2005/254C. Papachristou 1 Reconfigurable and Evolvable Hardware Fabric Chris Papachristou, Frank Wolff Robert Ewing Electrical Engineering & Computer.
High Performance Embedded Computing (HPEC) Workshop 23−25 September 2008 John Holland & Eliot Glaser Northrop Grumman Corporation P.O. Box 1693 Baltimore,
Fast Electron Temperature Scaling and Conversion Efficiency Measurements using a Bremsstrahlung Spectrometer Brad Westover US-Japan Workshop San Diego,
Computed Tomography Diego Dreossi Silvia Pani University of Trieste and INFN, Trieste section.
An Integrated Design Environment to Evaluate Power/Performance Tradeoffs for Sensor Network Applications Amol Bakshi, Jingzhao Ou, and Viktor K. Prasanna.
Thermal-Aware Scheduling for Real-time Applications in Embedded Systems Adam Lewis, Soumik Ghosh, and N.-F. Tzeng (A) Approved for public release; distribution.
1 THE EARTH SIMULATOR SYSTEM By: Shinichi HABATA, Mitsuo YOKOKAWA, Shigemune KITAWAKI Presented by: Anisha Thonour.
Real-Time Hardware-in-the-Loop Testing with Common Simulation Framework Dr. Judith Gardiner Ohio Supercomputer Center Columbus, OH 43212
-BY KUSHAL KUNIGAL UNDER GUIDANCE OF DR. K.R.RAO. SPRING 2011, ELECTRICAL ENGINEERING DEPARTMENT, UNIVERSITY OF TEXAS AT ARLINGTON FPGA Implementation.
QR Decomposition: Demonstration of Distributed Computing on Wireless Sensor Networks By Sherine Abdelhak, Soumik Ghosh, Rabi Chaudhuri, Magdy Bayoumi (A)
Single-Slice Rebinning Method for Helical Cone-Beam CT
Approved for Public Release, Distribution Unlimited The Challenge of Data Interoperability from an Operational Perspective Workshop on Information Integration.
Ultrasound Computed Tomography 何祚明 陳彥甫 2002/06/12.
New Sensor Signal Processor Paradigms: When One Pass Isn’t Enough
© David Kirk/NVIDIA and Wen-mei W. Hwu, ECE408/CS483, University of Illinois, Urbana-Champaign 1 Graphic Processing Processors (GPUs) Parallel.
1 WRB 09/02 HPEC Lincoln Lab Sept 2002 Poster B: Software Technologies andSystems W. Robert Bernecky Naval Undersea Warfare Center Ph: (401) Fax:
(A) Approved for public release; distribution is unlimited. Experience and results porting HPEC Benchmarks to MONARCH Lloyd Lewins & Kenneth Prager Raytheon.
MIT Lincoln Laboratory 1 of 4 MAA 3/8/2016 Development of a Real-Time Parallel UHF SAR Image Processor Matthew Alexander, Michael Vai, Thomas Emberley,
KERRY BARNES WILLIAM LUNDGREN JAMES STEED
1 Leveraging Multi-Core Processors in a CDMA-2000 SDR Base Station Steve Muir, John Chapin, Andrew Chiu, Victor Lum and Jeremy Nimmer Vanu, Inc.
Explanation of SMS Grievance System & Components for DWO’s David Shirley September 26 th.
Computed tomography. Formation of a CT image Data acquisitionImage reconstruction Image display, manipulation Storage, communication And recording.
2nd GEO Data Providers workshop (20-21 April 2017, Florence, Italy)
Virtual memory.
2016 Maintenance Innovation Challenge
Unit OS9: Real-Time and Embedded Systems
Figure 11.1 A basic personal computer system
Basic Principles of Computed Tomography
Wireless Power GUI Presented by: Alex Zellner and Dr. Corey Bergsrud
Diagnosis and management of evacuated casualties with cervical vascular injuries resulting from combat-related explosive blasts  Colin A. Meghoo, MD,
Presentation transcript:

(A) Approved for public release; distribution is unlimited. High Performance Processing with MONARCH – A Case Study in CT Reconstruction High Performance Embedded Computing (HPEC) Workshop 23−25 September 2008 Lloyd Lewins Raytheon Company 2000 E. El Segundo Blvd El Segundo, CA David Rohler Multi-Dimensional Imaging Bainbridge Rd. Solon, OH Kenneth Prager Raytheon Company 2000 E. El Segundo Blvd El Segundo, CA Pat Marek Raytheon Company 2000 E. El Segundo Blvd El Segundo, CA Arthur Dartt Multi-Dimensional Imaging Bainbridge Rd. Solon, OH

Page 29/23/08 Compact VAC (Volume AngioCT) is an advanced Computed Tomography (CT) system proposed for development by DARPA to allow combat casualty assessment in forward battlefield positions Size, weight and power enable mobility and portability suitable for use in multiple vehicles and enclosures for forward and rear military applications Example transport vehicle - Stryker Compact VAC Overview

Page 39/23/08 VAC Algorithm Overview CT cone-beam reconstruction is a computational algorithm that transforms a sequence of raw views into volume images. Volume images are typically organized as a “stack” of cross-sectional 2D slice images. Reconstruction consists of two components: – cone-beam filter process Each view is independently filtered No storage required Low latency – cone-beam backprojection process each reconstruction slice has contributions from a large range of views each view impacts a large number of slices. The range of slices and/or views is too large to fit into cache or local memory Algorithms must be designed so that slice data and/or view data constantly flow in and out of local memory

Page 49/23/08 Problem Summary Problem Statement – Develop an embedded system for mobile, real-time reconstruction of cone beam CT imagery – System must provide 1 – 2 TFLOP/S of processing throughput while only consuming around a kilowatt of power – Spiral cone beam reconstruction requirements: TFLOP/S of throughput 100 GB/S Read/modify/write bulk memory access Predicted Result – Analysis predicts that MONARCH- based system satisifes the above requirments System efficiency of ~1 GFLOP/S per Watt versus <0.1 for GP-GPU-based system