View from Experiment/Observation driven Applications Richard P. Mount May 24, 2004 DOE Office of Science Data Management Workshop.

Richard P Mount View from Experiment/Observation Driven Science 2

Experiment/Observation Common Characteristics (Mildly Provocative)

Experiment/Observation Common Characteristics
Dominated by large, expensive devices and projects
Correct project planning includes data-management hardware and software development
– Not acceptable to build a $1 Billion device and then face a Data-Management crisis
– Development might be much more valuable if performed in a wider context
Often hundreds or thousands of users
Geographically distributed users

Consequences of Common Characteristics
Less worry about workflow management – part of the project from the start
Multi-user concerns:
– Keeping track of millions of data products (files?) created by people you barely know
– Performance issues due to many concurrent queries
– Data movement, grids and networks really matter to international collaborations
Visualization can be a useful tool but rarely a major issue
Responsiveness is a key issue
– Taking months or years to answer a simple question is almost deadly

Final Comments and Pet Project Peddling

Characterizing Scientific Data
My petabyte is harder to analyze than your petabyte
– Images (or meshes) are bulky but simply structured and usually have simple access patterns
– Features are perhaps 1000 times less bulky, but often have complex structures and hard-to-predict access patterns

Hydrogen Bubble Chamber Photograph 1970 CERN Photo

Storage Issues
Disks:
– Random access performance is lousy, unless objects are megabytes or more
   – independent of cost
   – deteriorating with time at the rate at which disk capacity increases
(Define random-access performance as time taken to randomly access entire contents of a disk)
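The definition on this slide can be turned into a rough back-of-the-envelope calculation. The Python sketch below is not from the talk: it assumes a very simple model in which each randomly accessed object costs one positioning delay plus its transfer time. The function name and all drive parameters (10 ms latency, 100 MB/s bandwidth, 1 TB and 2 TB capacities) are hypothetical values chosen only for illustration.

```python
def full_disk_random_read_hours(capacity_bytes, object_bytes,
                                latency_s, bandwidth_bytes_per_s):
    """Hours needed to read every object on a disk in random order.

    Simple model (an assumption, not from the slides): each object costs
    one positioning delay plus its own transfer time.
    """
    n_objects = capacity_bytes / object_bytes
    seconds_per_object = latency_s + object_bytes / bandwidth_bytes_per_s
    return n_objects * seconds_per_object / 3600.0


if __name__ == "__main__":
    # Hypothetical drive parameters, for illustration only.
    LATENCY_S = 0.01      # 10 ms per random positioning
    BANDWIDTH = 100e6     # 100 MB/s sequential transfer rate
    for capacity in (1e12, 2e12):            # 1 TB, then 2 TB
        for object_bytes in (1e3, 1e7):      # 1 kB objects vs 10 MB objects
            hours = full_disk_random_read_hours(
                capacity, object_bytes, LATENCY_S, BANDWIDTH)
            print(f"{capacity / 1e12:.0f} TB disk, "
                  f"{object_bytes:.0e} B objects: {hours:.1f} h")
```

Under these assumed numbers, reading a 1 TB disk as 1 kB objects takes thousands of hours, while 10 MB objects take about three hours. Doubling the capacity while latency and bandwidth stay fixed doubles both figures, which is the sense in which random-access performance deteriorates as capacity grows.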

Latency and Speed – Random Access

Latency and Speed – Random Access

The End