BARC Scaleable Servers


BARC Scaleable Servers
Tom Barclay, Jim Gray
Presentation: Northrop Grumman, 6 Nov. 2003 @ BARC

Outline
- TerraServer: Past, Present, Future (10 Min)
- World Wide Telescope (10 Min)

TerraServer History: “the gift that keeps taking”
- 1997 – 1998: SQL 7.0 Scalability Test, 1.0 TB Db
- 1998 – 1999: SQL 7.0 Scalability Demo, 2.0 TB Db
- 2000 – 2002: Win 2k Cluster, SQL Availability & .NET Web Svcs Demo, 3.3 TB Partitioned Db
- 2003: Retirement
  - Donate to USGS / USDA
  - Plan aborted due to lack of funding
[Diagram labels: SQL\Inst1, SQL\Inst2, SQL\Inst3, Spare; 2200; Fiber SAN Switches; E through S]

TerraServer Achievements
- Continues to be a popular geo-spatial app
  - Averages 40k visitors / day; 1 million map views
  - .NET Web Service, OpenGIS web map server used by public/private sector
- Out “scales” competitor systems
  - Product X Geography Network does 50% of volume with 300% more equipment
  - Product Y handles 10 images per second peak vs. TerraServer’s 255 images/sec
- Used in production applications: USDA, EPA, FEMA
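The competitor comparisons above reduce to simple ratios. The sketch below works them out under two assumptions the slide does not confirm: that "300% more equipment" means four times TerraServer's equipment, and that the 1 million map views are a daily figure like the visitor count.

```python
# Back-of-the-envelope ratios from the slide's figures.
# Assumption: "300% more equipment" means 4x TerraServer's equipment.
# Assumption: the 1 million map views are per day, like the visitor count.

def throughput_per_equipment_ratio(volume_fraction, extra_equipment_pct):
    """How many times more volume TerraServer serves per unit of equipment."""
    competitor_equipment = 1.0 + extra_equipment_pct / 100.0
    competitor_efficiency = volume_fraction / competitor_equipment
    return 1.0 / competitor_efficiency

product_x_ratio = throughput_per_equipment_ratio(0.50, 300)
product_y_ratio = 255 / 10               # peak images per second
views_per_visitor = 1_000_000 / 40_000   # map views per visitor

print(product_x_ratio)    # 8.0
print(product_y_ratio)    # 25.5
print(views_per_visitor)  # 25.0
```

Read this way, TerraServer serves about 8 times the volume per unit of equipment compared with Product X, and about 25 times the peak image rate of Product Y.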

TerraServer – What’s Next?
- “Storage Bricks”
  - “White-box commodity servers”
  - 4 TB raw / 2 TB RAID1 SATA storage
  - Dual Hyper-threaded Xeon 2.4 GHz, 4 GB RAM
- “Bunch”
  - 3 “Storage Bricks” = 1 copy of TerraServer data, a.k.a. “bunch”
  - Data partitioned across 20 databases, and growing
  - No formal clustering s/w, application aware
- Low Cost Availability
  - 4 copies of the data
    - RAID1 SATA mirroring
    - 2 redundant “Bunches”
  - Web application “bunch aware”
    - Load balances between redundant databases
    - Fails over to surviving database on failure
[Diagram label: KVM / IP]
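The "bunch aware" behaviour described above (load balance between redundant databases, fail over to the survivor) can be sketched in a few lines. This is an illustration, not TerraServer's code: the class names, the round-robin partition placement, and the `tile_id % partition_count` rule are all hypothetical.

```python
import random

class DatabaseDown(Exception):
    """Raised when a (hypothetical) database cannot serve a query."""

class Bunch:
    """One full copy of the data: 3 storage bricks holding 20 partitions."""
    def __init__(self, name, brick_count=3, partition_count=20):
        self.name = name
        self.brick_count = brick_count
        self.partition_count = partition_count
        self.failed_bricks = set()

    def brick_for(self, partition):
        # Hypothetical placement: partitions spread round-robin over bricks.
        return partition % self.brick_count

    def query(self, partition, tile_id):
        brick = self.brick_for(partition)
        if brick in self.failed_bricks:
            raise DatabaseDown(f"{self.name}/brick{brick}")
        return f"tile {tile_id} from {self.name}/brick{brick}/db{partition}"

class BunchAwareClient:
    """Web-application side: knows every bunch, no clustering software."""
    def __init__(self, bunches, rng=None):
        self.bunches = list(bunches)
        self.rng = rng or random.Random()

    def partition_for(self, tile_id):
        # Hypothetical partitioning rule, for illustration only.
        return tile_id % self.bunches[0].partition_count

    def fetch(self, tile_id):
        partition = self.partition_for(tile_id)
        candidates = self.bunches[:]
        self.rng.shuffle(candidates)      # load balance across redundant copies
        for bunch in candidates:
            try:
                return bunch.query(partition, tile_id)
            except DatabaseDown:
                continue                  # fail over to the surviving copy
        raise DatabaseDown(f"no surviving copy of partition {partition}")

def total_copies(bunch_count, mirrors_per_disk=2):
    """RAID1 keeps 2 mirrors per bunch, so 2 bunches give 4 copies."""
    return bunch_count * mirrors_per_disk

bunch_a, bunch_b = Bunch("bunchA"), Bunch("bunchB")
client = BunchAwareClient([bunch_a, bunch_b], rng=random.Random(0))
bunch_a.failed_bricks.add(1)              # simulate a brick failure in bunch A
print(client.fetch(4))                    # tile 4 from bunchB/brick1/db4
print(total_copies(2))                    # 4
```

Keeping the failover logic in the application is what lets the design skip formal clustering software, at the cost of every client needing to know the bunch layout.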

TerraServer – Non-System Stuff
- Dynamic Map Re-projection
  - UTM to Geographic projection
  - Dynamic texture mapping?
- New Data
  - 1 foot resolution natural color imagery
  - Census TIGER data
- Lights Out Management
  - MOM
  - Auto-backup / restore on drive failure
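For readers unfamiliar with the UTM-to-geographic step, here is a minimal sketch of the standard series-expansion inverse of the transverse Mercator projection, in the form given in Snyder's "Map Projections: A Working Manual". It is not TerraServer's implementation. The slide does not name a datum, so the WGS84 ellipsoid constants are an assumption, and the function name is hypothetical.

```python
import math

# Assumption: WGS84 ellipsoid (the slide does not name a datum).
A = 6378137.0                 # semi-major axis, metres
F = 1 / 298.257223563         # flattening
K0 = 0.9996                   # UTM scale factor on the central meridian
E2 = F * (2 - F)              # first eccentricity squared
EP2 = E2 / (1 - E2)           # second eccentricity squared

def utm_to_geographic(easting, northing, zone, northern=True):
    """Convert UTM coordinates (metres) to (latitude, longitude) in degrees."""
    x = easting - 500000.0                        # remove false easting
    y = northing if northern else northing - 10000000.0

    # Footpoint latitude from the meridional arc length.
    m = y / K0
    mu = m / (A * (1 - E2 / 4 - 3 * E2**2 / 64 - 5 * E2**3 / 256))
    e1 = (1 - math.sqrt(1 - E2)) / (1 + math.sqrt(1 - E2))
    phi1 = (mu
            + (3 * e1 / 2 - 27 * e1**3 / 32) * math.sin(2 * mu)
            + (21 * e1**2 / 16 - 55 * e1**4 / 32) * math.sin(4 * mu)
            + (151 * e1**3 / 96) * math.sin(6 * mu)
            + (1097 * e1**4 / 512) * math.sin(8 * mu))

    sin1, cos1, tan1 = math.sin(phi1), math.cos(phi1), math.tan(phi1)
    c1 = EP2 * cos1**2
    t1 = tan1**2
    n1 = A / math.sqrt(1 - E2 * sin1**2)               # prime vertical radius
    r1 = A * (1 - E2) / (1 - E2 * sin1**2) ** 1.5      # meridian radius
    d = x / (n1 * K0)

    lat = phi1 - (n1 * tan1 / r1) * (
        d**2 / 2
        - (5 + 3 * t1 + 10 * c1 - 4 * c1**2 - 9 * EP2) * d**4 / 24
        + (61 + 90 * t1 + 298 * c1 + 45 * t1**2 - 252 * EP2 - 3 * c1**2)
        * d**6 / 720)
    dlon = (d
            - (1 + 2 * t1 + c1) * d**3 / 6
            + (5 - 2 * c1 + 28 * t1 - 3 * c1**2 + 8 * EP2 + 24 * t1**2)
            * d**5 / 120) / cos1

    lon0 = math.radians((zone - 1) * 6 - 180 + 3)      # zone central meridian
    return math.degrees(lat), math.degrees(lon0 + dlon)

# A point on the central meridian of zone 10, roughly 45 degrees north.
print(utm_to_geographic(500000.0, 4982950.400, 10))
```

Re-projecting an image tile would apply this conversion to each output pixel's coordinates and then resample the source imagery, which is presumably why the slide pairs it with texture mapping.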

Outline
- TerraServer: Past, Present, Future (10 Min)
- World Wide Telescope (10 Min)

World Wide Telescope
- All astronomy data could be online
- Federated as one database
  - All spectral bands
  - All instruments
  - All (recorded) times
- Astronomers want to do this and have been trying to do it since 1500
- I believe we can help:
  - Databases for the archives
  - Web services for the federation
- Microsoft connection? Information at your fingertips for scientists

SkyServer.SDSS.org
- A modern archive
  - Raw pixel data lives in file servers
  - Catalog data (derived objects) lives in database
  - Online query to any and all
- Also used for education
  - 150 hours of online astronomy
  - Implicitly teaches data analysis
- Interesting things
  - Spatial data search
  - Client query interface via Java applet
  - Query interface via Emacs
  - Popular: 1% of TerraServer
  - Cloned by other surveys (a template design)
  - Web services are core of it.

Federation: SkyQuery.Net
- Combine 4 archives initially
- Just added 6 more
- Send query to portal; portal joins data from archives.
- Problem: want to do multi-step data analysis (not just single query).
- Solution: allow personal databases on portal
- Problem: some queries are monsters
- Solution: “batch schedule” on portal server; deposits answer in personal database.
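The two problem/solution pairs above can be illustrated with a toy portal that keeps a personal database per user and defers large queries to a batch queue. Everything here is a hypothetical stand-in: the `Portal` class, the row-count cost estimate, and the `monster_threshold` cutoff are assumptions, not SkyQuery's actual interfaces.

```python
from collections import deque

class Portal:
    """Toy portal: runs queries over archives, stores answers per user."""
    def __init__(self, archives, monster_threshold=1000):
        self.archives = archives              # archive name -> list of row dicts
        self.personal_db = {}                 # user -> {table name -> rows}
        self.batch_queue = deque()
        self.monster_threshold = monster_threshold   # hypothetical cutoff

    def estimated_rows(self, archive_names):
        # Crude cost estimate: rows the query would have to scan.
        return sum(len(self.archives[name]) for name in archive_names)

    def submit(self, user, archive_names, predicate, result_table):
        job = (user, archive_names, predicate, result_table)
        if self.estimated_rows(archive_names) > self.monster_threshold:
            self.batch_queue.append(job)      # "monster" query: run later
            return "queued"
        self._run(job)
        return "done"

    def run_batch(self):
        while self.batch_queue:
            self._run(self.batch_queue.popleft())

    def _run(self, job):
        user, archive_names, predicate, result_table = job
        rows = [dict(row, archive=name)
                for name in archive_names
                for row in self.archives[name]
                if predicate(row)]
        # Deposit the answer in the user's personal database.
        self.personal_db.setdefault(user, {})[result_table] = rows

    def refine(self, user, source_table, predicate, result_table):
        # A later analysis step reads an earlier answer instead of the archives.
        rows = [r for r in self.personal_db[user][source_table] if predicate(r)]
        self.personal_db[user][result_table] = rows
        return rows

archives = {
    "SDSS": [{"id": i, "mag": 15 + i % 10} for i in range(2000)],
    "FIRST": [{"id": i, "mag": 18.0} for i in range(5)],
}
portal = Portal(archives)
print(portal.submit("alice", ["FIRST"], lambda r: r["mag"] < 19, "first_all"))   # done
print(portal.submit("alice", ["SDSS"], lambda r: r["mag"] < 17, "sdss_bright"))  # queued
portal.run_batch()
print(len(portal.personal_db["alice"]["sdss_bright"]))                           # 400
```

The `refine` method is the multi-step part: each step's output becomes a table the next step can query, so the user never has to pull intermediate results off the portal.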

SkyQuery Structure
- Each SkyNode publishes
  - Schema Web Service
  - Database Web Service
- Portal
  - Plans query (2 phase)
  - Integrates answers
  - Is itself a web service
[Diagram labels: Image Cutout, INT, SDSS, SkyQuery Portal, FIRST, 2MASS]
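The slide does not spell out what the two planning phases are. One plausible reading, sketched below as an assumption, is that the portal first asks each SkyNode how many objects fall in the query region, then runs the cross-match starting from the smallest archive so the intermediate result stays small. The classes, method names, and the small-angle distance formula are all hypothetical stand-ins for the real web services.

```python
import math

def separation_deg(a, b):
    """Small-angle separation between two (id, ra, dec) objects, in degrees."""
    d_ra = (a[1] - b[1]) * math.cos(math.radians((a[2] + b[2]) / 2))
    d_dec = a[2] - b[2]
    return math.hypot(d_ra, d_dec)

class SkyNode:
    """Stand-in for an archive exposing schema and database services."""
    def __init__(self, name, objects):
        self.name = name
        self.objects = objects          # list of (id, ra_deg, dec_deg)

    def schema(self):                   # stand-in for the schema web service
        return ("id", "ra", "dec")

    def select(self, region):           # stand-in for the database web service
        ra_min, ra_max, dec_min, dec_max = region
        return [o for o in self.objects
                if ra_min <= o[1] <= ra_max and dec_min <= o[2] <= dec_max]

    def count(self, region):            # assumed phase 1: cheap size estimate
        return len(self.select(region))

    def extend(self, matches, region, tolerance_deg, anchor):
        """Add this node's nearby objects to each partial match."""
        local = self.select(region)
        return [{**match, self.name: obj}
                for match in matches
                for obj in local
                if separation_deg(match[anchor], obj) <= tolerance_deg]

class Portal:
    def __init__(self, nodes):
        self.nodes = nodes

    def plan(self, region):
        counts = {node.name: node.count(region) for node in self.nodes}
        order = sorted(self.nodes, key=lambda node: counts[node.name])
        return order, counts

    def execute(self, region, tolerance_deg):
        order, _ = self.plan(region)
        first = order[0]                # assumed phase 2: smallest archive first
        matches = [{first.name: obj} for obj in first.select(region)]
        for node in order[1:]:
            matches = node.extend(matches, region, tolerance_deg, first.name)
        return matches                  # the integrated answer

sdss = SkyNode("SDSS", [(1, 10.0, 0.0), (2, 10.5, 0.2), (3, 11.0, 0.4)])
first = SkyNode("FIRST", [(7, 10.5001, 0.2001)])
twomass = SkyNode("2MASS", [(20, 10.0, 0.0), (21, 10.5, 0.2)])
portal = Portal([sdss, first, twomass])
region = (9.0, 12.0, -1.0, 1.0)
print([node.name for node in portal.plan(region)[0]])   # ['FIRST', '2MASS', 'SDSS']
print(portal.execute(region, tolerance_deg=0.001))
```

Because the portal only calls `count`, `select`, and `extend`, any archive that exposes those operations can join the federation, which matches the slide's point that each SkyNode publishes the same pair of services.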

Interesting Things
- SkyQuery is the most functional Web Service on GriPhyN
  - http://skyservice.pha.jhu.edu/develop/vo/adql/
- Now the prototype for an Open Architecture
- Being copied onto Oracle/DB2, Linux, …
- Good test of .NET interop
- Good side-by-side comparison of .NET