Deconstructing Clusters for High End Biometric Applications NSF CCF-0621434 June 2007-2009 Douglas Thain and Patrick Flynn University of Notre Dame 5 August.

Slides:



Advertisements
Similar presentations
1 Scaling Up Data Intensive Scientific Applications to Campus Grids Douglas Thain University of Notre Dame LSAP Workshop Munich, June 2009.
Advertisements

BXGrid: A Data Repository and Computing Grid for Biometrics Research Hoang Bui University of Notre Dame 1.
Research Issues in Cooperative Computing Douglas Thain
1 Condor Compatible Tools for Data Intensive Computing Douglas Thain University of Notre Dame Condor Week 2011.
1 Opportunities and Dangers in Large Scale Data Intensive Computing Douglas Thain University of Notre Dame Large Scale Data Mining Workshop at SIGKDD August.
1 Models and Frameworks for Data Intensive Cloud Computing Douglas Thain University of Notre Dame IDGA Cloud Computing 8 February 2011.
1 Science in the Clouds: History, Challenges, and Opportunities Douglas Thain University of Notre Dame GeoClouds Workshop 17 September 2009.
1 Scaling Up Data Intensive Science to Campus Grids Douglas Thain Clemson University 25 Septmber 2009.
Programming Distributed Systems with High Level Abstractions Douglas Thain University of Notre Dame Cloud Computing and Applications (CCA-08) University.
Workload Management Workpackage Massimo Sgaravatto INFN Padova.
Using Small Abstractions to Program Large Distributed Systems Douglas Thain University of Notre Dame 11 December 2008.
1998/5/21by Chang I-Ning1 ImageRover: A Content-Based Image Browser for the World Wide Web Introduction Approach Image Collection Subsystem Image Query.
1 Scaling Up Classifiers to Cloud Computers Christopher Moretti, Karsten Steinhaeuser, Douglas Thain, Nitesh V. Chawla University of Notre Dame.
Getting Beyond the Filesystem: New Models for Data Intensive Scientific Computing Douglas Thain University of Notre Dame HEC FSIO Workshop 6 August 2009.
Cooperative Computing for Data Intensive Science Douglas Thain University of Notre Dame NSF Bridges to Engineering 2020 Conference 12 March 2008.
An Introduction to Grid Computing Research at Notre Dame Prof. Douglas Thain University of Notre Dame
Workload Management Massimo Sgaravatto INFN Padova.
Using Abstractions to Scale Up Applications to Campus Grids Douglas Thain University of Notre Dame 28 April 2009.
Hadoop Team: Role of Hadoop in the IDEAL Project ●Jose Cadena ●Chengyuan Wen ●Mengsu Chen CS5604 Spring 2015 Instructor: Dr. Edward Fox.
SIDDHARTH MEHTA PURSUING MASTERS IN COMPUTER SCIENCE (FALL 2008) INTERESTS: SYSTEMS, WEB.
Ch 4. The Evolution of Analytic Scalability
Silberschatz, Galvin and Gagne  2002 Modified for CSCI 399, Royden, Operating System Concepts Operating Systems Lecture 1 Introduction Read:
Networked Storage Technologies Douglas Thain University of Wisconsin GriPhyN NSF Project Review January 2003 Chicago.
Performance Concepts Mark A. Magumba. Introduction Research done on 1058 correspondents in 2006 found that 75% OF them would not return to a website that.
Programming Distributed Systems with High Level Abstractions Douglas Thain University of Notre Dame 23 October 2008.
Fall 2000M.B. Ibáñez Lecture 01 Introduction What is an Operating System? The Evolution of Operating Systems Course Outline.
C++ Programming Language Lecture 1 Introduction By Ghada Al-Mashaqbeh The Hashemite University Computer Engineering Department.
المحاضرة الاولى Operating Systems. The general objectives of this decision explain the concepts and the importance of operating systems and development.
Building Scalable Scientific Applications using Makeflow Dinesh Rajan and Douglas Thain University of Notre Dame.
IPlant Collaborative Tools and Services Workshop iPlant Collaborative Tools and Services Workshop Collaborating with iPlant.
The Cooperative Computing Lab  We collaborate with people who have large scale computing problems in science, engineering, and other fields.  We operate.
1 Computational Abstractions: Strategies for Scaling Up Applications Douglas Thain University of Notre Dame Institute for Computational Economics University.
Frontiers in Massive Data Analysis Chapter 3.  Difficult to include data from multiple sources  Each organization develops a unique way of representing.
Resource Brokering in the PROGRESS Project Juliusz Pukacki Grid Resource Management Workshop, October 2003.
VIPIN VIJAYAN 11/11/03 A Performance Analysis of Two Distributed Computing Abstractions.
By Jeff Dean & Sanjay Ghemawat Google Inc. OSDI 2004 Presented by : Mohit Deopujari.
Copyright © 2006, GemStone Systems Inc. All Rights Reserved. Increasing computation throughput with Grid Data Caching Jags Ramnarayan Chief Architect GemStone.
Bio-IT World Conference and Expo ‘12, April 25, 2012 A Nation-Wide Area Networked File System for Very Large Scientific Data William K. Barnett, Ph.D.
MapReduce: Simplified Data Processing on Large Clusters By Dinesh Dharme.
Parallel IO for Cluster Computing Tran, Van Hoai.
1 Christopher Moretti – University of Notre Dame 4/30/2008 High Level Abstractions for Data-Intensive Computing Christopher Moretti, Hoang Bui, Brandon.
Silberschatz and Galvin  Operating System Concepts Module 1: Introduction What is an operating system? Simple Batch Systems Multiprogramming.
Northwest Indiana Computational Grid Preston Smith Rosen Center for Advanced Computing Purdue University - West Lafayette West Lafayette Calumet.
Introduction to Data Analysis with R on HPC Texas Advanced Computing Center Feb
Spark on Entropy : A Reliable & Efficient Scheduler for Low-latency Parallel Jobs in Heterogeneous Cloud Huankai Chen PhD Student at University of Kent.
Canadian Bioinformatics Workshops
Canadian Bioinformatics Workshops
DIRAC: Workload Management System Garonne Vincent, Tsaregorodtsev Andrei, Centre de Physique des Particules de Marseille Stockes-rees Ian, University of.
Fundamental Operations Scalability and Speedup
Applied Operating System Concepts
Workload Management Workpackage
Chapter 1: Introduction
Chapter 1: Introduction
Chapter 1: Introduction
Chapter 1: Introduction
AWS Batch Overview A highly-efficient, dynamically-scaled, batch computing service May 2017.
Chapter 1: Introduction
Chapter 1: Introduction
Haiyan Meng and Douglas Thain
Operating System Concepts
Chapter 1: Introduction
BXGrid: A Data Repository and Computing Grid for Biometrics Research
Language Processors Application Domain – ideas concerning the behavior of a software. Execution Domain – Ideas implemented in Computer System. Semantic.
Wide Area Workload Management Work Package DATAGRID project
Chapter 1: Introduction
Chapter 1: Introduction
Chapter 1: Introduction
Chapter 1: Introduction
Operating System Concepts
Chapter 1: Introduction
Presentation transcript:

Deconstructing Clusters for High End Biometric Applications NSF CCF June Douglas Thain and Patrick Flynn University of Notre Dame 5 August 2007

Data Intensive Abstractions for High End Biometric Applications NSF CCF June Douglas Thain and Patrick Flynn University of Notre Dame 5 August 2007

The Problem: It is far too easy for an ambitious user of a large batch system to submit large workloads that cripple a system’s network or I/O capacity. Why does this happen?  The user does not know (or care) how to tune the workload for the given environment.  The system does not know (in advance) the workload structure and has few tools for shaping the load. Solution: Introduce abstractions that describe both data and CPU needs, allowing the system to partition, optimize, and predict workloads.

Application Context: Biometrics Goal: Design robust face comparison function. F 0.05 F 0.97

Application of Biometrics FFFFFFF Probe Image 0.95 Challenge: Make it work on non-ideal images with different orientation, expression, lighting... Question: How to systematically evaluate F?

All-Pairs Image Comparison F Current Workload: 4000 images 256 KB each 10s per F (five days) Future Workload: images 1MB each 1s per F (three months)

Plenty of CPUs

Non-Expert User Using 500 CPUs Try 1: Each F is a batch job. Failure: Dispatch latency >> F runtime. HN CPU FFFF F Try 2: Each row is a batch job. Failure: Too many small ops on FS. HN CPU FFFF F F F F F F F F F F F F F F F F Try 3: Bundle all files into one package. Failure: Everyone loads 1GB at once. HN CPU FFFF F F F F F F F F F F F F F F F F Try 4: User gives up and attempts to solve an easier or smaller problem.

Solution: The All-Pairs Abstraction All-Pairs:  For a set S and a function F:  Compute F(Si,Sj) for all Si and Sj in S. The end user provides:  Set S: A bunch of files.  Function F: A self-contained program. The computing system determines:  Optimal decomposition in time and space.  Which (and how many) resources to employ.  What to do when failures occur.

All Pairs Production System Web Portal 300 active storage units 500 CPUs, 40TB disk FGH S T All-Pairs Engine 2 - AllPairs(F,S) FFF FFF 3 - O(log n) distribution by spanning tree. 6 - Return result matrix to user. 1 - Upload F and S into web portal. 5 - Collect and assemble results. 4 – Choose optimal partitioning and submit batch jobs.

Initial Results on Real Workload

Optimizing One Abstraction Challenges of Scaling in the Real World  User assertions are unreliable. Measure F runtime, file sizes, network and disk speeds via sampling.  Managing real limits: sockets, jobs, file size, dirs.  Comprehending and reacting to inline errors. Make it portable across architectures.  Multi-core, cluster, campus grid, national grid Deploy with new applications.  Data mining - Document comparison.  Bioinformatics – DNA sequence similarity.

Broader Goal: Suite of Abstractions A complete high level data-intensive programming environment that for high throughput processing of data sets on parallel computation and storage. Super Data Cluster =  Abstractions +  Object Storage +  Active Storage +  Databases +  Functional Language

Data Intensive Programming namesexheightfile FredM BettyF HarryM S = select males > 5 feet tall T = apply( S, Distort ) M = allpairs( S, T, Compare ) A = rank( T, P, Compare ) DistortCompare function library active storage cluster S T Distort M Compare A metadata database

Project began June Personnel  Douglas Thain (PI) – Grid Computing  Patrick Flynn (co-PI) – Biometrics  Christopher Moretti – All Pairs Engine  Jared Bulosan – Web Portal (REU)  Brandon Rich – High Level Language  (Hire second grad student fall 2007) Publications  “Challenges in Executing Data Intensive Biometric Workloads on a Desktop Grid”, Christopher Moretti, Timothy Faltemier, Douglas Thain, and Patrick J. Flynn, Workshop on Large-Scale and Volatile Desktop Grids March  “All-Pairs: An Abstraction for Data Intensive Grid Computing”, Christopher Moretti, Jared Bulosan, and Douglas Thain, IEEE Grid, September  Used by Ph.D. Thesis: Tim Faltemier, “Robust 3D Face Recognition”, 2007.

Data Intensive Abstractions for High End Biometric Applications University of Notre Dame Douglas Thain Cooperative Computing Lab Patrick Flynn Computer Vision Research Lab