Art Wetzel, Greg Hood and Markus Dittrich


Prototyping a virtual filesystem for storing and processing petascale neural circuit datasets

Art Wetzel, Greg Hood and Markus Dittrich
National Resource for Biomedical Supercomputing, Pittsburgh Supercomputing Center
awetzel@psc.edu, 412-268-3912, www.psc.edu and www.nrbsc.org

R. Clay Reid, Jeff Lichtman, Wei-Chung Allen Lee
Harvard Medical School; Allen Institute for Brain Science; Center for Brain Science, Harvard University

Davi Bock
HHMI Janelia Farm

David Hall and Scott Emmons
Albert Einstein College of Medicine

Jan 11, 2012
Connectomics Data Project Overview

Reconstructing brain circuits requires high resolution electron microscopy over “long” distances == BIGDATA
[Figure labels: vesicles ~30 nm diameter; a synaptic junction >500 nm wide with cleft gap ~20 nm; dendritic spine; dendrite. Image source: www.coolschool.ca/lor/BI12/unit12/U12L04.htm]
Recent ICs have 32 nm features; 22 nm chips are being delivered; gate oxide is 1.2 nm thick.

A 10 Tvoxel dataset aligned by our group was an essential part of the March 2011 Nature paper with Davi Bock, Clay Reid and Harvard colleagues. Now we are working on two datasets of 100 TB each and expect to reach petabytes in 2-3 years.

The CS project is to implement and test a prototype virtual filesystem that addresses common problems with neural circuit and other massive datasets. The most important aim is to reduce unwanted data duplication as raw data are preprocessed for final analysis; the virtual filesystem does this by replacing redundant storage with on-the-fly computation. The second aim is to provide a convenient framework for efficient on-the-fly computation on multidimensional datasets within high performance parallel computing environments, using both CPU and GPGPU processing. The Filesystem in Userspace (FUSE) mechanism provides a convenient implementation basis that will work across a variety of systems, and many existing FUSE codes serve as useful examples.
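As a rough illustration of replacing stored copies with on-the-fly computation, the sketch below serves byte ranges of a derived dataset by transforming the raw file at read time. It is not the project's code and does not use FUSE itself: `VirtualView`, `gain`, `offset` and `demo` are hypothetical names, the data are assumed to be one unsigned byte per voxel, and the `read(position, length)` method only mimics the shape of the read callback a FUSE filesystem would supply.

```python
import os
import tempfile


class VirtualView:
    """Read-only byte view whose contents are computed from a raw file on demand.

    Hypothetical sketch: assumes 8-bit voxels and a per-voxel linear
    transform, out = gain * in + offset, clamped to 0..255.
    """

    def __init__(self, raw_path, gain=1.0, offset=0.0):
        self.raw_path = raw_path
        self.gain = gain
        self.offset = offset

    def size(self):
        # The derived view is as long as the raw data: one byte per voxel.
        return os.path.getsize(self.raw_path)

    def read(self, position, length):
        """Return up to `length` derived bytes starting at `position`."""
        with open(self.raw_path, "rb") as raw:
            raw.seek(position)
            chunk = raw.read(length)
        # Transform only the requested bytes; nothing derived is stored.
        return bytes(
            min(255, max(0, round(self.gain * value + self.offset)))
            for value in chunk
        )


def demo():
    """Write a tiny raw file and read a transformed range from it."""
    with tempfile.TemporaryDirectory() as workdir:
        raw_path = os.path.join(workdir, "raw.u8")
        with open(raw_path, "wb") as raw:
            raw.write(bytes([0, 10, 20, 100, 200]))
        view = VirtualView(raw_path, gain=2.0, offset=5.0)
        return view.size(), list(view.read(1, 3))
```

Because only the requested range is transformed, the contrast-adjusted copy never needs to exist on disk. The trade-off is that the transform is recomputed on every read.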

The eventual goal is a flexible software framework in which common prewritten and user-written application codes operate together and take advantage of parallel CPU and GPGPU technologies.
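One way such a framework might combine prewritten and user-written codes is to treat each as a stage that maps a tile to a tile, then chain the stages and apply the chain across tiles. The sketch below is an assumption about structure, not the project's design: `compose`, `invert`, `threshold` and `run_pipeline` are hypothetical names, and tiles are plain lists of 8-bit values.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def compose(*stages):
    """Chain stages left to right into a single tile -> tile function."""
    return lambda tile: reduce(lambda data, stage: stage(data), stages, tile)


def invert(tile):
    # Stand-in for a prewritten library stage: invert 8-bit contrast.
    return [255 - value for value in tile]


def threshold(level):
    # Stand-in for a user-written stage: binarize at `level`.
    return lambda tile: [1 if value >= level else 0 for value in tile]


def run_pipeline(tiles, pipeline, workers=4):
    """Apply the pipeline to every tile; pool.map() preserves tile order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(pipeline, tiles))
```

For example, `run_pipeline(tiles, compose(invert, threshold(128)))` runs both stages over each tile independently. The thread pool here only shows the structure of per-tile parallelism; pure-Python stages gain little from threads, so a real deployment would presumably dispatch tiles to processes, cluster nodes or GPU kernels instead.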

This work will include multidimensional data structures that provide efficient random and sequential access, analogous to the 1D representations provided by standard filesystems. Students working on this project will have access to a parallel cluster that holds our large datasets, along with the compilers and other tools required. Minimal end-to-end functionality with simple linear transforms can likely be achieved in about 8 weeks and then extended as time permits. Please contact Art Wetzel (awetzel@psc.edu) with any further questions.
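One common way to give a 3D volume the kind of locality that a 1D file offers is to store it as fixed-size chunks, so that nearby voxels land near each other in the file. The sketch below shows that address arithmetic under stated assumptions: the chunked layout, the x-fastest ordering, and the names `chunk_address` and `byte_offset` are illustrative choices, not the data structure the project settled on.

```python
def chunk_address(x, y, z, volume_shape, chunk_shape):
    """Map voxel (x, y, z) to (chunk_index, offset_within_chunk).

    Chunks are stored one after another in x-fastest order, and voxels
    inside a chunk are also x-fastest. Assumes the volume dimensions are
    exact multiples of the chunk dimensions.
    """
    nx, ny, nz = volume_shape
    cx, cy, cz = chunk_shape
    if not (0 <= x < nx and 0 <= y < ny and 0 <= z < nz):
        raise IndexError("voxel outside volume")
    chunks_x, chunks_y = nx // cx, ny // cy
    # Which chunk holds the voxel, and where the voxel sits inside it.
    bx, by, bz = x // cx, y // cy, z // cz
    lx, ly, lz = x % cx, y % cy, z % cz
    chunk_index = bx + chunks_x * (by + chunks_y * bz)
    local_offset = lx + cx * (ly + cy * lz)
    return chunk_index, local_offset


def byte_offset(x, y, z, volume_shape, chunk_shape, bytes_per_voxel=1):
    """Position of the voxel in a flat file of consecutive chunks."""
    cx, cy, cz = chunk_shape
    chunk_index, local_offset = chunk_address(x, y, z, volume_shape, chunk_shape)
    return (chunk_index * cx * cy * cz + local_offset) * bytes_per_voxel
```

With an 8x8x8 volume and 4x4x4 chunks, voxel (5, 2, 6) falls in chunk 5 at local offset 41, so it sits at byte 5 * 64 + 41 = 361. Reading a small subvolume then touches only the few chunks that overlap it, where a plain row-major layout would scatter the same voxels across the whole file.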