Storage federations, Why & How: site perspective
Jeff Templon, Nikhef Amsterdam, Physics Data Processing Group
Pre-GDB, 25 October 2012

Outline
1. CDN vision for WLCG
2. Relation to storage federation
3. "What would we need to do to make this ready for sites to deploy?"

What problems need solving?
- "Menial labor"
- Inefficient use of storage
- Jobs failing due to:
  - Disk server overloaded (NOT AN ERROR)
  - Actual problem with a disk server
  - Experiment catalog out of sync
- Protocol zoo

CDN division of labor
- Archiving: storage of data. The "home" for a specific file; the "master" copy ("origin server"). Like tape at a T1; what else?
- Data delivery: focus on performance; permanence not expected ("proxy/cache"). Such servers "everywhere"? Especially at T2s, T3s, local analysis clusters, ...
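The read path implied by this division of labor can be sketched in a few lines: a delivery (cache) node serves a file locally when it has it, and only on a miss pulls it from the archive (origin) layer. This is a minimal illustration; the names (`CacheNode`, `origin_fetch`) are hypothetical, not from any specific CDN implementation.

```python
class CacheNode:
    """A delivery-layer node: serves locally, falls back to the origin."""

    def __init__(self, origin_fetch):
        self._store = {}                   # local cache, modeled as a dict
        self._origin_fetch = origin_fetch  # callable: filename -> bytes

    def read(self, filename):
        if filename not in self._store:                   # cache miss
            self._store[filename] = self._origin_fetch(filename)
        return self._store[filename]                      # cache hit from now on

# Origin layer: the authoritative "home" / "master" copy of each file.
archive = {"/lhc/run1/evts.root": b"master copy"}
cache = CacheNode(origin_fetch=archive.__getitem__)

data = cache.read("/lhc/run1/evts.root")   # first read goes to the origin
data = cache.read("/lhc/run1/evts.root")   # subsequent reads are served locally
```

The point of the sketch is that clients only ever talk to `CacheNode`; the origin is contacted exactly once per file, which is what lets the archive layer stay simple and stable.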

Consequences of the model
- Clients never access origin servers; the origin servers become less complex (do one thing well)
- Clients may not even notice an origin server in maintenance (content is served directly from the delivery layer)
- Popular files are rapidly everywhere; boring files live only on the origin server
- Caches are read-only from the user perspective: simpler, focused on delivery

[Slide: figure from the Amsterdam DM Workshop, June 2010]

The vision (René's version)
[Diagram: a T1 origin connected at 100 Gbit/s to T2 and T3 sites; jobs read from a local data cache. "This is us ..."]

CDN + P2P
- Some CDNs overlay on a P2P network
- "Catalogue search" is a query operation on a distributed hash table (DHT):
  - No centralized catalog to fail
  - The DHT is resilient: not a "nothing works unless everything works" system
- Delivery nodes can collaborate (client redirection or inter-node retrieval), avoiding traffic to the origin servers
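The resilience claim can be made concrete with a toy consistent-hashing ring, the basic building block of many DHTs: each file name hashes to a point on a ring, and the first node clockwise from that point is responsible for it. Losing one node only remaps that node's keys; placements held by other nodes are untouched. This is purely illustrative, not a model of any specific DHT (Chord, Kademlia, ...).

```python
import bisect
import hashlib

def ring_hash(key: str) -> int:
    """Map a string to a point on the ring via SHA-1."""
    return int(hashlib.sha1(key.encode()).hexdigest(), 16)

class DHTRing:
    def __init__(self, nodes):
        self._ring = sorted((ring_hash(n), n) for n in nodes)

    def lookup(self, filename: str) -> str:
        """Return the node responsible for filename: first node clockwise."""
        h = ring_hash(filename)
        points = [p for p, _ in self._ring]
        i = bisect.bisect_right(points, h) % len(self._ring)
        return self._ring[i][1]

nodes = ["cache-a", "cache-b", "cache-c"]
ring = DHTRing(nodes)
owner = ring.lookup("/lhc/run1/evts.root")

# Remove a node that does NOT own this file: its placement is unchanged,
# so the catalogue keeps working without any central component.
removed = next(n for n in nodes if n != owner)
smaller = DHTRing([n for n in nodes if n != removed])
assert smaller.lookup("/lhc/run1/evts.root") == owner
```

No single machine holds the catalogue: any client that knows the node list (or, in a real DHT, a few neighbors) can compute where a file lives, which is what eliminates the "nothing works unless everything works" failure mode.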

SF & CDN
- A storage federation (SF) seems to be a milder version of a CDN
- Origin servers are explicitly part of the federation

Time for questions before the next part

Site perspective: how
Focus on operational aspects, normally ignored by our middleware developers:
- How much work is it to install? How much "crap" gets dragged in; how "specific" are the packages?
- How much work is it to configure? How easy is it to automate? How much of the configuration is redundant (preferably none)? How stable is the configuration?

More site how
- How easy is it to tell that the service is working correctly?
- How easy is it to tell that the service is not working correctly?
- How easy is it to tell why the service is not working correctly?
- How easy is it to start from a user-reported error and trace back to a cause?
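The first three questions above are exactly what an active monitoring probe should answer: a clear status for "working / not working", plus enough context on failure to start on "why". A minimal sketch, assuming a hypothetical `check_read` callable standing in for a real read through the service; the OK/WARNING/CRITICAL exit-code convention is the one used by Nagios-style site monitoring.

```python
import time

OK, WARNING, CRITICAL = 0, 1, 2  # Nagios-style plugin exit codes

def probe(check_read, slow_threshold_s=5.0):
    """Run one read check; return (status, message)."""
    start = time.monotonic()
    try:
        nbytes = check_read()  # e.g. read a canary file via the service
    except Exception as exc:
        # Report *what* failed, not just that it failed: this is the first
        # step of tracing a user-reported error back to a cause.
        return CRITICAL, f"read failed: {type(exc).__name__}: {exc}"
    elapsed = time.monotonic() - start
    if elapsed > slow_threshold_s:
        return WARNING, f"read ok but slow: {nbytes} bytes in {elapsed:.1f}s"
    return OK, f"read ok: {nbytes} bytes in {elapsed:.2f}s"

status, msg = probe(lambda: len(b"canary file contents"))
```

A probe like this makes the "is it working?" question answerable by machine; the harder "why?" question is what the logging requirements on the next slide address.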

More more site how
Observation: developers focus on a different goal for logging, namely verifying that the program follows the expected sequence of events/steps. Unfortunately, this is operationally the least interesting case.

Example: logging (EMI RT #1202)
- Clear time stamps in a standard format
- Make log-file strings easy to parse
- For multi-threaded or multi-machine actions, provide a unique ID that is present in all related messages
- Follow conventions for log levels: at the standard level, success should produce the minimal output needed for auditability, but all errors and serious warnings must appear

Are all sites configured in the same way?
NO! Sites *respond* in the same way to queries. How a site implements the responding mechanism is unimportant, as long as it works and performs well enough.