Center for Subsurface Sensing & Imaging Systems Overview of Image and Data Information Management in CenSSIS David Kaeli Northeastern University Boston,

Presentation transcript:

Center for Subsurface Sensing & Imaging Systems Overview of Image and Data Information Management in CenSSIS David Kaeli Northeastern University Boston, MA

Overview of the Strategic Research Plan
[Figure: strategic research plan diagram. Labels: R1, R2, R3, Fundamental Science, Validating TestBEDs, Image and Data Information Management, L1, L2, L3, S1 to S5, Bio-Med, Enviro-Civil]

R3 Research Thrust Overview
- Utilize enabling hardware and software technologies to address CenSSIS barriers
- Pursue research in enabling technologies
- Develop a common set of tools and techniques to address SSI problems:
  - Hardware parallelization and acceleration
  - Software toolboxes
  - Image database management and tools
[Figure label: Toolboxes]

CenSSIS Middleware Tools
- Parallelization of MATLAB, C/C++ and Fortran codes using the Message Passing Interface (MPI), a software pathway to exploiting GRID-level resources
- Utilizing MPI-2 to address barriers in I/O performance
- Building on existing Grid middleware such as Globus Toolkit, MPICH-G2 and GridPort
- Presently illustrating the impact of the GRID on system-level projects (tomosynthesis reconstruction)
[Figure labels: MATLAB, C/C++, Fortran, Parallelization, MPI, MPICH-G2, UPC]

Impact on CenSSIS Applications
- Reduced the runtime of a single-body Steepest Descent Fast Multipole Method (SDFMM) application by 74% on a 32-node Beowulf cluster
  - Hot-path parallelization
  - Data restructuring
- Reduced the runtime of a Monte Carlo scattered light simulation by 98% on a 16-node Silicon Graphics Origin 2000
  - Matlab-to-C compilation
  - Hot-path parallelization
- Obtained superlinear speedup of the Ellipsoid Algorithm run on a 16-node IBM SP2
  - Matlab-to-C compilation
  - Hot-path parallelization
[Figure labels: Soil, Air, Mine]
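The runtime reductions quoted on this slide can be read as speedup factors through speedup = 1 / (1 - reduction). The Python sketch below only illustrates that arithmetic; the function names are hypothetical and not part of any CenSSIS tool, and the superlinear check uses the usual definition (speedup greater than the processor count).

```python
def speedup_from_reduction(reduction):
    """Convert a fractional runtime reduction (0 <= r < 1) into a speedup factor."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


def is_superlinear(speedup, n_processors):
    """A speedup is superlinear when it exceeds the number of processors used."""
    return speedup > n_processors


# Figures taken from the slide: 74% (SDFMM) and 98% (Monte Carlo) runtime reductions.
sdfmm = speedup_from_reduction(0.74)
monte_carlo = speedup_from_reduction(0.98)
print(f"SDFMM speedup: {sdfmm:.2f}x")
print(f"Monte Carlo speedup: {monte_carlo:.1f}x")
```

A 74% reduction corresponds to roughly 3.85x and a 98% reduction to roughly 50x. The Monte Carlo figure combines Matlab-to-C compilation with parallelization, so it should not be read as parallel speedup alone.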

Tomographic mammography
- 3D image reconstruction from x-ray projections
- Used to detect and diagnose breast cancer
- Based on well-developed mammography techniques
- Exposes tissue structure using multiple projections from different angles
- Advantages
  - Accuracy: provides at least as much useful information as x-ray film
  - Flexibility: digital image manipulation, digital storage
  - Structural information: using layered images
  - Safe: low-dose x-ray
  - Lower cost: compared to MRI

Image acquisition/reconstruction process
- Acquisition: 11 uniform angular samples along Y-axis
- X-ray projection: breast tissue density absorption radiograph
- Algorithm: constrained non-linear convergence and iterative process
- Uses a Maximum Likelihood Estimation
[Figure: acquisition geometry (X-ray source, detector, X/Y/Z axes) and reconstruction flowchart: Initialization (set 3D volume), Forward (compute projections from the 3D volume), Backward (correct 3D volume using the X-ray projections), Satisfied? No: repeat; Yes: exit]
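The forward/backward loop on this slide can be sketched with a generic iterative maximum-likelihood update. The Python below uses the textbook ML-EM multiplicative update on a tiny dense system as a stand-in; it is an assumption made for illustration, not the authors' transmission-tomography algorithm, and the matrix `A`, the data `y`, and all function names are hypothetical.

```python
def forward(A, x):
    """Forward step: compute projections A @ x from the current volume estimate."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def mlem(A, y, max_iterations=200, tol=1e-9):
    """Generic ML-EM loop: initialize, forward project, back-project a correction, repeat."""
    n_voxels = len(A[0])
    x = [1.0] * n_voxels  # Initialization: set a uniform, positive volume
    sensitivity = [sum(row[j] for row in A) for j in range(n_voxels)]
    for _ in range(max_iterations):
        projections = forward(A, x)
        # Ratio of measured to computed projections (guard against division by zero)
        ratio = [yi / pi if pi > 0 else 0.0 for yi, pi in zip(y, projections)]
        # Backward step: back-project the ratios onto the volume
        back = [sum(A[i][j] * ratio[i] for i in range(len(A))) for j in range(n_voxels)]
        new_x = [xj * bj / sj if sj > 0 else 0.0
                 for xj, bj, sj in zip(x, back, sensitivity)]
        change = max(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if change < tol:  # "Satisfied?" test from the flowchart
            break
    return x


# Toy system: two voxels, three rays; data generated from the volume [2, 3].
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [2.0, 3.0, 5.0]
estimate = mlem(A, y)
print([round(v, 4) for v in estimate])
```

The multiplicative update keeps the estimate non-negative, which is one simple way to impose the kind of constraint the slide mentions.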

Parallelization approaches
- Reduce communication data
- Segmentation along Y-axis
- Using redundant computation to replace communication
- Segmenting along x-ray beam
Approaches compared:
- First approach: no inter-communication (more computation, less communication)
- Second approach: overlap with inter-communication
- Third approach: non-overlap with inter-communication (less computation, more communication)
[Figure labels: exchange data; Overlap area]
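Segmenting the volume along one axis, with or without an overlap area, can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function name, the dictionary layout, and the slice counts are hypothetical, and the code only computes index ranges; it does not perform any communication.

```python
def partition_slices(n_slices, n_workers, overlap=0):
    """Split n_slices into contiguous segments, one per worker.

    Each worker owns a disjoint range and, when overlap > 0, also holds
    extra slices on each side. Those extra slices are either recomputed
    redundantly or refreshed by exchanging data with neighbors.
    """
    base, extra = divmod(n_slices, n_workers)
    segments = []
    start = 0
    for rank in range(n_workers):
        size = base + (1 if rank < extra else 0)  # spread the remainder evenly
        stop = start + size
        lo = max(0, start - overlap)              # clip the overlap at the volume edges
        hi = min(n_slices, stop + overlap)
        segments.append({"rank": rank, "owned": (start, stop), "with_overlap": (lo, hi)})
        start = stop
    return segments


for seg in partition_slices(n_slices=10, n_workers=3, overlap=1):
    print(seg)
```

Setting `overlap=0` corresponds to the non-overlap layout, where neighbors must exchange boundary data; a larger overlap trades extra computation for less communication.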

Tomosynthesis Acceleration
- Input data set: phantom 1600x2034x45
- Serial implementation runs in 2-3 hours on a P4 machine
- Platforms:
  - SGI Altix system
  - UIUC NCSA Titan cluster
  - UIUC NCSA IBM p690
  - UMich Hypnos cluster
  - P4 cluster at MGH
- Number of processors: 32
- Computation: the SGI Altix with Itanium 2 processors outperforms the other CPUs
- Currently moving this work to the GRID and the Pittsburgh Supercomputing Center
- Prototype running on our GRID system at NU

Field Programmable Gate Arrays for Subsurface Imaging
- Backprojection for Computed Tomography image reconstruction
  - Sponsored by Mercury Computer
- Finite Difference Time Domain (FDTD) in hardware
  - Collaboration with Humanitarian Demining project
- Retinal Vascular Tracing in real time
  - Collaboration with Real-time Retinal Imaging project
- Phase Unwrapping
  - Collaboration with 3-D Fusion Microscope project
- Diverse problems, similar solutions: FPGAs are particularly well suited for accelerating image processing and image understanding algorithms

Retinal Vascular Tracing: Register 2-D Image to 3-D in Real Time
[Figure labels: “Smart Camera”; Direction of blood vessel; PCI bus]

Some Recent Publications on Parallelization
- “Execution-Driven Simulation of Network Storage Systems,” Y. Wang and D. Kaeli, Proceedings of the 12th ACM/IEEE International Symposium on Modeling, Analysis of Computer and Telecommunication Systems, October 2004, pp
- “Profile-guided File Partitioning on Beowulf Clusters,” Y. Wang and D. Kaeli, Journal of Cluster Computing, Special Issue on Parallel I/O, to appear,
- “An Object-oriented Parallel Library,” C. Oaurrauri and D. Kaeli, International Journal of High Performance of Computing and Networking, to appear.
- “Digital Tomosynthesis Mammography using a Parallelized Maximum Likelihood Reconstruction Method,” T. Wu, R. Moore, E. Rafferty, D. Koppans, J. Zhang, W. Meleis and D. Kaeli, Medical Imaging, 5368, February
- “Mapping and characterization of applications in Heterogeneous Distributed Systems,” J. Yeckle and W. Rivera, to appear in Proceed. of the 7th World Multiconference on Systemics, Cybernetics and Informatics (SCI2003).
- “Profile-Guided I/O Partitioning,” Y. Wang and D. Kaeli, Proceedings of the 17th ACM International Symposium on Supercomputing, June 2003, pp
- “Source-Level Transformations to Apply I/O Data Partitioning,” Y. Wang and D. Kaeli, Proceedings of the IEEE Workshop on Storage Network Architecture and Parallel IO, Oct. 2003, pp
Held again in 2004

CenSSIS Solutionware (UPRM/NU/RPI)
Toolbox Development
- Support the development of CenSSIS Solutionware that demonstrates our “Diverse Problems – Similar Solutions” model
- Develop Toolboxes that support research and education
- Establish software development and testing standards for CenSSIS
Image and Sensor Data Database
- Develop a web-accessible image database for CenSSIS that enables efficient searching and querying of images, metadata and image content
- Develop image feature tagging capabilities
[Figure label: Toolboxes]

Status of the CenSSIS Toolboxes
- Hyperspectral Image Analysis Toolbox (HIAT)
  - October 2004
- Multiview Tomography Toolbox (MVT)
  - fddlib: January 2003 (v. 1.0), July 2003 (v. 1.1)
  - mvt: October 2004
- Rensselaer Generalized Registration Library (RGRL)
  - September 2004
[Figure labels: HIAT, MVT, RGRL]

New toolbox: Improving the quality of radiation MGH
- Developed a 4D (3D plus time) visualization browser tool kit
- Visualize Computed Tomography (CT) images, organ outlines (wire contours) and the isodose lines (treatment dosage)
- Present all this information in a user-friendly interface

4-D Visualization of Lung Tumors
[Figure panels: Dosage; 4-D Visualization]

The Future for CenSSIS Toolboxes
- SCIRun
- Collaboration with the University of Utah

CenSSIS Image Database System
- Deliver a web-accessible database for CenSSIS that enables efficient searching and querying of images, sensor data, metadata and image content
- More than 4000 metadata-rich images/datasets presently available online (> 10,000 by 2006)
- Database characteristics:
  - Relational complex queries (Oracle9i)
  - Data security, reliability and layered user privileges
  - Efficient search and query of image content and metadata
  - Content-based image tagging using XML
  - Indexing algorithms (2D, 3D, and 4D)
  - Explore object relational technology to handle collections
[Figure label: mouse embryo]

CenSSIS Image Database System
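A relational query over image metadata, of the kind the CenSSIS Image Database System supports, might look like the sketch below. It uses Python's built-in sqlite3 module purely as a stand-in for Oracle9i; the table names, columns, and sample rows are hypothetical and are not the actual CenSSIS schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical images/tags schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE images (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            modality TEXT NOT NULL,
            dimensions INTEGER NOT NULL
        );
        CREATE TABLE tags (
            image_id INTEGER NOT NULL REFERENCES images(id),
            label TEXT NOT NULL
        );
    """)
    conn.executemany("INSERT INTO images VALUES (?, ?, ?, ?)", [
        (1, "embryo_001", "microscopy", 2),
        (2, "embryo_002", "microscopy", 3),
        (3, "lung_ct_001", "CT", 4),
    ])
    conn.executemany("INSERT INTO tags VALUES (?, ?)", [
        (1, "2 cell embryo"), (2, "4 cell embryo"), (3, "tumor"),
    ])
    conn.commit()
    return conn


def find_images(conn, modality, label_pattern):
    """Join global metadata with tags; parameters are bound, not string-formatted."""
    rows = conn.execute(
        """SELECT i.name FROM images AS i
           JOIN tags AS t ON t.image_id = i.id
           WHERE i.modality = ? AND t.label LIKE ?
           ORDER BY i.name""",
        (modality, label_pattern),
    ).fetchall()
    return [name for (name,) in rows]


conn = build_demo_db()
print(find_images(conn, "microscopy", "%embryo%"))
```

Keeping tags in a separate table from global image metadata is one way to let a single query combine collection-level attributes with feature-level annotations.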

Utilize Machine Learning algorithms to improve query view

CenSSIS Image Database System Provides data description associated with initial collection, but does not allow for further elaboration or annotation.

Image Annotation
- Provide the ability to mark up images with searchable features
- Enable the image database to be more effectively data-mined
[Figure: embryo developmental stages: 1 cell embryo, 2 cell embryo, 4 cell embryo]

XML and Java
- XML (Extensible Markup Language)
  - Provides maximum flexibility and portability
  - Well-supported standard
  - Powerful querying tools available in Oracle
- The Java2 Platform
  - Cross-platform compatibility
  - Standard web-browser interface
  - Native XML support

Image Tagging
A raw image file from the CenSSIS Database
QUERY: I want to be able to add textual annotations to this image, providing my medical team with questions about particular regions of interest (ROIs).
- Difficult to describe regions in an image
- Difficult to pinpoint specific features in images
- Global image metadata too coarse to facilitate low-level tagging

Image Tagging
- Image with tags
- Metadata associated with specific areas
- Query for specific image features

The Image Tagging Interface
[Figure labels: Drawing Tools; View Options; Tag Options]

Tags and XML [custom XML tags go here] awilliam
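The slide does not show the actual tag markup, so the following is only a hypothetical illustration of how a region-of-interest tag could be expressed in XML and searched. The element and attribute names (`tag`, `region`, `point`, `note`) and the sample values are assumptions, not the CenSSIS format; the code uses Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET


def make_tag(image_id, author, shape, points, note):
    """Build a hypothetical XML tag describing a region of interest in an image."""
    tag = ET.Element("tag", {"image": image_id, "author": author})
    region = ET.SubElement(tag, "region", {"shape": shape})
    for x, y in points:
        ET.SubElement(region, "point", {"x": str(x), "y": str(y)})
    ET.SubElement(tag, "note").text = note
    return tag


def tags_matching(tags, keyword):
    """Return the tags whose note text contains the keyword (case-insensitive)."""
    keyword = keyword.lower()
    return [t for t in tags if keyword in (t.findtext("note") or "").lower()]


tag = make_tag("embryo_001", "example_user", "polygon",
               [(10, 12), (40, 12), (25, 38)], "2 cell embryo")
xml_text = ET.tostring(tag, encoding="unicode")  # serialize for storage or interchange
parsed = ET.fromstring(xml_text)                 # round-trip back into an element tree
print(xml_text)
```

Storing the region geometry alongside the note is what would let a query target specific areas of an image rather than only its global metadata.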

The Future Role of Image Annotation
- Provide a vehicle for natural collaboration
- A richer set of metadata to enable more detailed queries
- Potential to perform extensive data mining on image content
- An eye toward content-based image retrieval
- Tumor tracking paper recently accepted to SIGMOD 2005

The CenSSIS Image Database System
- Hosts the image and sensor data of CenSSIS (>500 images online)
- Provides metadata-indexed image searching
- Uses XML tags to allow for easy information interchange
- Evolved into a project-based management system, allowing users to organize their data hierarchically
- Key issue: how do we develop collaboration tools that increase the value of data stored in the database?
- Presently exploring how best to integrate both visualization and image annotation into the existing framework (NIH proposal)

CenSSIS Image and Data Information Management
- Addressing key research barriers in computational efficiency, embedded computing and image/sensor data management
- Exploiting Grid resources to enable new discovery in SSI applications
- Producing an image/data repository and software-engineered Subsurface Sensing and Imaging Toolsets
- Developing enabling tools targeting system-level projects
  - Near real-time reconstruction and visualization
  - Visualization of complex motion
  - Predicting motion in image data using database indexing techniques
[Figure label: MVT]