The Virtual Microscope
Umit V. Catalyurek
Department of Biomedical Informatics, Division of Data Intensive and Grid Computing


The Virtual Microscope
Joel Saltz, Renato Ferreira, Michael Beynon, Chialin Chang, Alan Sussman, Tahsin Kurc, Robert Miller, Angelo Demarzo, Mark Silberman, Asmara Afework, Anthony Wiegering

Virtual Microscope (VM)
- Interactive software emulation of a high-power light microscope for processing image datasets
  - visualize and explore microscopy images
  - screen for cancer
  - categorize images for associative retrieval
  - electronic capture of the slide examination process, used in resident training
  - collaborative diagnosis
- Virtual microscopy systems: Virtual Microscope (Hopkins/UMD), Distributed Telemicroscopy System (Rutgers), [Gu] Virtual Telemicroscope, Virtual Microscopy (UPMC), Baccus Virtual Microscope

The Virtual Microscope: Data Requirements
- Full cases consist of multiple digitized glass slides, with data acquired at 400X
- Single spot: 1000x1000 pixels, 3-byte RGB = 3 MB
- A slide of 2.5 cm x 3.5 cm requires a 50x70 grid of spots = 10 GB uncompressed
- Each slide can have multiple focal planes
- Johns Hopkins alone generates 500,000 slides per year
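The storage figures on this slide follow from simple arithmetic. The short check below reproduces them, assuming decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes); the constant and function names are illustrative and not taken from the talk.

```python
# Back-of-the-envelope check of the slide's storage figures.
# Assumption: decimal units (1 MB = 10**6 bytes, 1 GB = 10**9 bytes).

SPOT_WIDTH_PX = 1000
SPOT_HEIGHT_PX = 1000
BYTES_PER_PIXEL = 3             # 3-byte RGB
GRID_COLS, GRID_ROWS = 50, 70   # spots covering a 2.5 cm x 3.5 cm slide

# Size of one 1000x1000 spot and of a full single-plane slide, in bytes.
spot_bytes = SPOT_WIDTH_PX * SPOT_HEIGHT_PX * BYTES_PER_PIXEL
slide_bytes = spot_bytes * GRID_COLS * GRID_ROWS


def slide_storage_gb(focal_planes: int = 1) -> float:
    """Uncompressed size of one slide in GB for a given number of focal planes."""
    return slide_bytes * focal_planes / 1e9


print(f"one spot:  {spot_bytes / 1e6:.1f} MB")
print(f"one slide: {slide_storage_gb():.1f} GB")
```

The exact product is 10.5 GB per focal plane, which the slide rounds to 10 GB. At 500,000 slides per year, a single focal plane per slide already amounts to roughly 5 PB of uncompressed data per year.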

The Virtual Microscope
- Client-server architecture
- Java 1.2 client
  - Portability
- Data storage and image compression
  - More efficient storage, reduced transmission time
- 2 server implementations:
  - Customized instance of Active Data Repository
    - Improved scalability, portability, user-defined processing
  - Component-based implementation using DataCutter
    - Heterogeneous systems, portability, user-defined processing
- Caching in the VM client
  - Improved response time
- Experimental results

VM Client

Image Declustering

Image Compression
- JPEG compression: storage and network data reduced by a factor of 10
- Images may still take a long time to transmit
- For example, a 640x480 image:
  - 920 KB uncompressed
  - ~90 KB JPEG compressed
  - ~13 seconds to transfer over a 56 Kb/s modem
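The transfer-time example can be checked the same way. This is a rough estimate only, assuming 24-bit RGB, the slide's 10:1 JPEG ratio, decimal units, and no protocol overhead; the helper names are illustrative.

```python
# Rough transfer-time estimate for one client view over a modem link.
# Assumptions: 24-bit RGB, 10:1 JPEG ratio (the slide's figure),
# 1 KB = 1000 bytes, 1 Kb = 1000 bits, no protocol overhead.

def image_bytes(width: int, height: int, bytes_per_pixel: int = 3) -> int:
    """Uncompressed size of a width x height image in bytes."""
    return width * height * bytes_per_pixel


def transfer_seconds(num_bytes: float, link_kbps: float) -> float:
    """Time to push num_bytes through a link of link_kbps kilobits per second."""
    return num_bytes * 8 / (link_kbps * 1000)


raw = image_bytes(640, 480)   # one 640x480 view, uncompressed
jpeg = raw / 10               # same view after 10:1 JPEG compression

print(f"uncompressed: {raw / 1000:.0f} KB, {transfer_seconds(raw, 56):.0f} s")
print(f"JPEG (10:1):  {jpeg / 1000:.0f} KB, {transfer_seconds(jpeg, 56):.0f} s")
```

Under these assumptions the compressed view takes about 13 seconds, matching the slide, while the uncompressed view would take over two minutes. This is the motivation for the client-side caching described later in the talk.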

Active Data Repository (ADR)
- A C++ class library and runtime system for building parallel databases of multi-dimensional datasets
  - enables integration of storage, retrieval, and processing of multiple datasets on parallel machines and clusters
  - provides support for common operations such as data retrieval, memory management, and scheduling of processing across a parallel machine
  - can be customized for various applications
- Front-end: the interface between clients and the back-end
- Back-end: data storage, retrieval, and processing
  - Distributed-memory parallel machine or cluster, with multiple disks attached to each node
  - Customizable services for application-specific processing

Virtual Microscope with ADR
[Figure: ADR architecture customized for the Virtual Microscope]
- Client query: slide number, focal plane, magnification, region of interest
- Front-end (Virtual Microscope front-end): Query Interface Service, Query Submission Service
- Back-end: Dataset Service, Attribute Space Service, Data Aggregation Service, Indexing Service, Query Execution Service, Query Planning Service
- Result returned to the client: image blocks
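To make the query parameters concrete, here is a minimal sketch of a client query and of how a region of interest could be mapped to the stored image blocks it touches. Everything here is an assumption for illustration: the `VMQuery` class, the `blocks_for_query` helper, and the 1000-pixel block size are hypothetical and are not the ADR API.

```python
from dataclasses import dataclass
from typing import List, Tuple

BLOCK_SIZE = 1000  # assumed block edge in full-resolution pixels (hypothetical)


@dataclass(frozen=True)
class VMQuery:
    """Hypothetical container for the four query parameters on the slide."""
    slide: int
    focal_plane: int
    magnification: int
    # Region of interest as (x0, y0, x1, y1) in full-resolution pixels;
    # x1 and y1 are exclusive.
    region: Tuple[int, int, int, int]


def blocks_for_query(q: VMQuery, block_size: int = BLOCK_SIZE) -> List[Tuple[int, int]]:
    """Return (column, row) indices of the blocks that intersect the region.

    Magnification is not used here; a real server would also subsample
    the selected blocks to the requested magnification.
    """
    x0, y0, x1, y1 = q.region
    if x1 <= x0 or y1 <= y0:
        return []  # empty region selects no blocks
    cols = range(x0 // block_size, (x1 - 1) // block_size + 1)
    rows = range(y0 // block_size, (y1 - 1) // block_size + 1)
    return [(c, r) for r in rows for c in cols]


query = VMQuery(slide=1, focal_plane=0, magnification=400,
                region=(500, 500, 2500, 1500))
print(blocks_for_query(query))
```

In this sketch a 2000x1000-pixel region that straddles block boundaries selects six blocks, which is why the server returns image blocks rather than an exact crop and leaves clipping to a later step.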

DataCutter
- A suite of middleware for subsetting and filtering multi-dimensional datasets stored in a distributed environment
- Indexing Service
  - Multilevel hierarchical indexes based on spatial indexing methods, e.g., R-trees
- Filtering Service
  - Distributed C++ component framework
  - Specialized components for processing data
  - filters: logical unit of computation
    - high-level tasks
    - init, process, finalize interface
  - streams: how filters communicate
    - unidirectional buffer pipes
    - uses fixed-size buffers (min, good)
  - manually specify filter connectivity and filter-level characteristics
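The filter-stream model can be illustrated with a small sketch. DataCutter itself is a distributed C++ framework; the Python classes below (`Stream`, `Filter`, `run_pipeline`, and the three toy filters) are hypothetical stand-ins that only mirror the ideas on the slide: an init/process/finalize interface, unidirectional streams, fixed-size buffers, and manually specified connectivity. The filters run one after another here, whereas real filters would run concurrently and possibly on different hosts.

```python
from collections import deque
from typing import Deque, List, Optional


class Stream:
    """Unidirectional pipe carrying buffers of at most buffer_size bytes."""

    def __init__(self, buffer_size: int) -> None:
        self.buffer_size = buffer_size
        self._buffers: Deque[bytes] = deque()

    def write(self, data: bytes) -> None:
        # Split the payload into fixed-size buffers (the last may be shorter).
        for i in range(0, len(data), self.buffer_size):
            self._buffers.append(data[i:i + self.buffer_size])

    def read(self) -> Optional[bytes]:
        # Return the next buffer, or None when the stream is drained.
        return self._buffers.popleft() if self._buffers else None


class Filter:
    """Logical unit of computation with an init/process/finalize interface."""

    def init(self) -> None:
        pass

    def process(self, inp: Optional[Stream], out: Optional[Stream]) -> None:
        raise NotImplementedError

    def finalize(self) -> None:
        pass


class Source(Filter):
    """Toy producer: writes a fixed payload to its output stream."""

    def __init__(self, payload: bytes) -> None:
        self.payload = payload

    def process(self, inp, out):
        out.write(self.payload)


class Upper(Filter):
    """Toy transformation: upper-cases every buffer it reads."""

    def process(self, inp, out):
        while (buf := inp.read()) is not None:
            out.write(buf.upper())


class Sink(Filter):
    """Toy consumer: collects the buffers it receives."""

    def init(self) -> None:
        self.chunks: List[bytes] = []

    def process(self, inp, out):
        while (buf := inp.read()) is not None:
            self.chunks.append(buf)


def run_pipeline(filters: List[Filter], buffer_size: int) -> None:
    """Connect filters in a linear chain and run them sequentially."""
    streams = [Stream(buffer_size) for _ in filters[:-1]]
    for f in filters:
        f.init()
    for i, f in enumerate(filters):
        inp = streams[i - 1] if i > 0 else None
        out = streams[i] if i < len(streams) else None
        f.process(inp, out)
    for f in filters:
        f.finalize()


sink = Sink()
run_pipeline([Source(b"abcdefgh"), Upper(), sink], buffer_size=3)
print(sink.chunks)
```

Because each filter only sees its input and output streams, the same filters can be fused or split across machines without changing their code, which is the property the DC-5F, DC-3F, and DC-2F configurations explore.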

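Virtual Microscope with DataCutter
[Figure: three filter layouts for the VM server]
- DC-5F (5 filters): read_data -> decompress -> clip -> zoom -> view
- DC-3F (3 filters): read_data -> decompress -> clip-zoom-view
- DC-2F (2 filters): read_data -> decompress-clip-zoom-view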

Caching in the Client
- Reduce data re-transmission
  - Cache part of the retrieved data in the client
- Cache multiple resolutions/magnifications
- Cache only what the user views
- Two-level cache
  - client memory is the first-level cache
  - local disk on the client machine is the second level
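A two-level cache of this kind might look like the sketch below. It is not the VM client's implementation (the actual client is written in Java): the `TwoLevelCache` class, the least-recently-used eviction policy, and the filesystem-safe string keys are assumptions made for illustration. A key would typically encode slide, focal plane, magnification, and block position so that several resolutions of the same area can coexist in the cache.

```python
import os
from collections import OrderedDict
from typing import Callable


class TwoLevelCache:
    """Memory cache (level 1) backed by files on local disk (level 2).

    Assumptions: keys are filesystem-safe strings, values are bytes,
    and memory eviction is least-recently-used.
    """

    def __init__(self, memory_capacity: int, disk_dir: str) -> None:
        self.memory_capacity = memory_capacity
        self.disk_dir = disk_dir
        self._memory: "OrderedDict[str, bytes]" = OrderedDict()

    def _path(self, key: str) -> str:
        return os.path.join(self.disk_dir, key)

    def put(self, key: str, data: bytes) -> None:
        # Insert as most recently used, then spill the oldest entries to disk.
        self._memory[key] = data
        self._memory.move_to_end(key)
        while len(self._memory) > self.memory_capacity:
            old_key, old_data = self._memory.popitem(last=False)
            with open(self._path(old_key), "wb") as f:
                f.write(old_data)

    def get(self, key: str, fetch: Callable[[str], bytes]) -> bytes:
        # Level 1: client memory.
        if key in self._memory:
            self._memory.move_to_end(key)
            return self._memory[key]
        # Level 2: local disk; only on a miss there do we go to the server.
        path = self._path(key)
        if os.path.exists(path):
            with open(path, "rb") as f:
                data = f.read()
        else:
            data = fetch(key)
        self.put(key, data)
        return data
```

The `fetch` callback stands in for a request to the VM server. Only blocks the user has actually requested are ever stored, in line with the "cache only what the user views" point above.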

Caching Multiresolution Images

VM Server Performance

ADR VM Server Performance

VM ADR Server under workload

VM Servers: ADR vs DC

VM: ADR vs DC on SMP

Caching Client Performance

Summary
- 2 VM servers:
  - Homogeneous systems: tightly coupled parallel machines with attached local disks
  - Heterogeneous systems, grid
- Java 1.2 client
- Multiresolution image caching
- Try

End of Talk