NLM Digital Repository Server Architecture January 18, 2011.

Design Considerations
– Consistency with NLM architecture and processes
– Remove single points of failure
– Data redundancy for preservation
– Availability
– Scalability
– Ingest ease, speed

Single Server Architecture
[Diagram labels: Application Server, Database Server, File Server]
– OS: CentOS
– HW: virtual server, 3 CPU, 24 GB RAM
– Components shown: NWU BookViewer, Flash Video Player with Search, Muradora 1.4b, Fedora, Solr, GSearch, Djatoka, MySQL 5.0, Tomcat
– Storage shown: Fedora Managed Storage, External Storage, Solr Index, Resource Index

Content and code
– Fedora managed content
– Fedora database
– Fedora Resource Index
– Solr Index
– External content
– Application code
Can and should these items be shared across Fedora servers?

Data Center Environment
– Two locations with two virtual servers each
  – Primary: NLM data center
  – Backup: Contingency operations data center
  – Active/Active: both locations always in use
  – Each virtual server has 3 CPU, 24 GB RAM
– System tools
  – 3DNS: wide-area load balancing
  – BIG-IP: local load balancing
  – Server monitoring, automatic failover
  – SnapMirror: NetApp filesystem replication
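The two load-balancing tiers and the failover behavior listed above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the class names, the server names, and the round-robin selection policy are hypothetical and are not taken from the slides, and the code does not reflect how 3DNS or BIG-IP are actually configured.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    healthy: bool = True  # set by (hypothetical) server monitoring


@dataclass
class DataCenter:
    name: str
    servers: list
    in_service: bool = True  # False while the site is being updated

    def healthy_servers(self):
        """Servers that may receive traffic right now."""
        if not self.in_service:
            return []
        return [s for s in self.servers if s.healthy]


class TwoTierBalancer:
    """Wide-area tier picks a data center; local tier picks a server in it.

    Round-robin is an assumed policy chosen for simplicity.
    """

    def __init__(self, data_centers):
        self.data_centers = data_centers
        self._dc_turn = 0
        self._server_turn = {dc.name: 0 for dc in data_centers}

    def route(self):
        # Wide-area tier: skip data centers with no usable server (failover).
        candidates = [dc for dc in self.data_centers if dc.healthy_servers()]
        if not candidates:
            raise RuntimeError("no healthy server in any data center")
        dc = candidates[self._dc_turn % len(candidates)]
        self._dc_turn += 1

        # Local tier: rotate over the healthy servers of the chosen site.
        servers = dc.healthy_servers()
        server = servers[self._server_turn[dc.name] % len(servers)]
        self._server_turn[dc.name] += 1
        return dc.name, server.name


primary = DataCenter("primary", [Server("primary-1"), Server("primary-2")])
backup = DataCenter("backup", [Server("backup-1"), Server("backup-2")])
balancer = TwoTierBalancer([primary, backup])
```

With both sites in service, successive calls to `balancer.route()` alternate between the two data centers (active/active). Setting `primary.in_service = False` sends every request to the backup site, which is the behavior the update procedure relies on.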

System Architecture
[Diagram labels: Browser, 3DNS, one BIG-IP per data center]
– Primary Data Center
  – Fedora Primary #1: Fedora DB, External Storage, Managed Storage, Solr Index, Resource Index
  – Fedora Primary #2: Fedora DB, Managed Storage, Solr Index, Resource Index
– Backup Data Center
  – Fedora Backup #1: Fedora DB, External Storage, Managed Storage, Solr Index, Resource Index
  – Fedora Backup #2: Fedora DB, Managed Storage, Solr Index, Resource Index

Ingest considerations
– Our Fedora system is read-only with controlled periodic batch content updates
– System is available during updates: use one data center while updating the other
– Code and content should be identical across servers
– Reduce time to ingest to all servers in system. Approx. 10 hours for full re-ingest.
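One way to check the requirement that code and content be identical across servers is to compare checksum manifests of the relevant directories. The sketch below is an illustration only: the function names are hypothetical, the choice of SHA-256 is an assumption, and the slides do not say how NLM verifies its replicas.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file path (relative to root) to the SHA-256 of its bytes."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            with path.open("rb") as handle:
                # Read in 1 MiB chunks so large objects are not loaded whole.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[path.relative_to(root).as_posix()] = digest.hexdigest()
    return manifest


def diff_manifests(reference, other):
    """Report files missing from, extra in, or different in `other`."""
    shared = set(reference) & set(other)
    return {
        "missing": sorted(set(reference) - set(other)),
        "extra": sorted(set(other) - set(reference)),
        "changed": sorted(k for k in shared if reference[k] != other[k]),
    }
```

A manifest built on one server can be saved and compared against manifests built on the others; three empty lists from `diff_manifests` mean the two trees hold the same files with the same bytes.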

Content replication
– Content replication strategies
  1. Fedora journaling (ingest to master, master-slave, messaging)
  2. Ingest to master, copy managed content to slave, rebuild slave DB and resource index from managed content (rebuild is faster than full ingest)
  3. Ingest to master, use system tools (NetApp SnapMirror) to copy all resources to slaves
  4. Ingest to each server independently
– Our approach
  – Turn off primary data center, use backup data center to serve public
  – Ingest to primary 1, copy managed content to primary 2, rebuild primary 2...
  – Turn off backup data center, use primary data center to serve public
  – Use SnapMirror to copy all resources from primary 1,2 to backup 1,2
  – Turn on backup data center, both data centers available to serve public
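The five steps of the approach above can be walked through as a small state simulation that checks that at least one data center serves the public at every step. This is a sketch under assumptions: the function name, the server names, and the version strings are hypothetical; ingest, rebuild, and SnapMirror are modeled as plain assignments; and bringing the primary site back into service before taking the backup site out is an assumed ordering that the slide does not spell out.

```python
def run_update(new_version, old_version="old"):
    """Simulate the rolling update; return (step log, final content per server)."""
    serving = {"primary": True, "backup": True}
    content = {
        name: old_version
        for name in ("primary-1", "primary-2", "backup-1", "backup-2")
    }
    log = []

    def record(step):
        # The public must always be served by at least one data center.
        if not any(serving.values()):
            raise RuntimeError(f"outage during step: {step}")
        log.append((step, sorted(dc for dc, up in serving.items() if up)))

    # Step 1: primary goes out of service; backup serves the public.
    serving["primary"] = False
    record("turn off primary")

    # Step 2: ingest to primary 1, then copy managed content to primary 2
    # and rebuild primary 2 (modeled here as copying the version label).
    content["primary-1"] = new_version
    content["primary-2"] = content["primary-1"]
    record("ingest primary 1, copy and rebuild primary 2")

    # Step 3: swap roles. Assumed ordering: primary up first, then backup down.
    serving["primary"] = True
    serving["backup"] = False
    record("turn off backup")

    # Step 4: SnapMirror copies all resources from primary 1,2 to backup 1,2.
    content["backup-1"] = content["primary-1"]
    content["backup-2"] = content["primary-2"]
    record("SnapMirror primary to backup")

    # Step 5: both data centers serve the public again.
    serving["backup"] = True
    record("turn on backup")
    return log, content
```

Calling `run_update("v2")` returns a five-entry log in which every entry names at least one serving data center, and a content map in which all four servers hold the new version.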

NLM Content Replication
[Diagram labels: Ingest, Rebuild, SnapMirror]
– Primary Data Center
  – Fedora Primary #1: Fedora DB, External Storage, Managed Storage, Solr Index, Resource Index
  – Fedora Primary #2: Fedora DB, Managed Storage, Solr Index, Resource Index
– Backup Data Center
  – Fedora Backup #1: Fedora DB, External Storage, Managed Storage, Solr Index, Resource Index
  – Fedora Backup #2: Fedora DB, Managed Storage, Solr Index, Resource Index