The Architecture of NetarchiveSuite

Slides:



Advertisements
Similar presentations
JSP and web applications
Advertisements

Status and plans for the H3 release NetarchiveSuite 5.0.
1 The IIPC Web Curator Tool: Steve Knight The National Library of New Zealand Philip Beresford and Arun Persad The British Library An Open Source Solution.
Chapter 5 Concurrency: Mutual Exclusion and Synchronization Operating Systems: Internals and Design Principles, 6/E William Stallings Patricia Roy Manatee.
ECSE Software Engineering 1I HO 7 © HY 2012 Lecture 7 Publish/Subscribe.
Service Broker Lesson 11. Skills Matrix Service Broker Service Broker, provides a solution to common problems with message delivery and consistency that.
Project Implementation for COSC 5050 Distributed Database Applications Lab1.
Click to add text SAP XBP 3.0 exploitation TWS Education + Training.
Curation Tool June 11, Curation Tool Overview Architecture Implementation Dependencies Futures 2.
UNIT-V The MVC architecture and Struts Framework.
|Tecnologie Web L-A Anno Accademico Laboratorio di Tecnologie Web Sviluppo di applicazioni web Servlet
MAVEN-BLUEMARTINI Yannick Robin. What is maven-bluemartini?  maven-bluemartini is Maven archetypes for Blue Martini projects  Open source project on.
Message-Driven Beans and EJB Security Lesson 4B / Slide 1 of 37 J2EE Server Components Objectives In this lesson, you will learn about: Identify features.
Bonrix SMPP Client. Index Introduction Software and Hardware Requirements Architecture Set Up Installation HTTP API Features Screen-shots.
Concurrency: Mutual Exclusion and Synchronization Chapter 5.
(Business) Process Centric Exchanges
July 2011CMSC 341 CVS/Ant 1 CMSC 341 Java Packages Ant CVS Project Submission.
JBoss Overview J2EE Sig Presenter: Steve Davidson Stephen Davidson & Associates, INC.
1 Concurrency Architecture Types Tasks Synchronization –Semaphores –Monitors –Message Passing Concurrency in Ada Java Threads.
Chapter 5 Concurrency: Mutual Exclusion and Synchronization Operating Systems: Internals and Design Principles, 6/E William Stallings Patricia Roy Manatee.
Chapter 6 Introduction to Defining Classes. Objectives: Design and implement a simple class from user requirements. Organize a program in terms of a view.
SOFTWARE DESIGN AND ARCHITECTURE LECTURE 13. Review Shared Data Software Architectures – Black board Style architecture.
Process Architecture Process Architecture - A portion of a program that can run independently of and concurrently with other portions of the program. Some.
Chapter 14 Applets and Advanced GUI  The Applet Class  The HTML Tag F Passing Parameters to Applets F Conversions Between Applications and Applets F.
Overview of the Automated Build & Deployment Process Johnita Beasley Tuesday, April 29, 2008.
Chapter 5 Concurrency: Mutual Exclusion and Synchronization Operating Systems: Internals and Design Principles, 6/E William Stallings Patricia Roy Manatee.
CSI 3125, Preliminaries, page 1 SERVLET. CSI 3125, Preliminaries, page 2 SERVLET A servlet is a server-side software program, written in Java code, that.
Plug-in Architectures Presented by Truc Nguyen. What’s a plug-in? “a type of program that tightly integrates with a larger application to add a special.
Chapter 5 Introduction To Form Builder. Lesson A Objectives  Display Forms Builder forms in a Web browser  Use a data block form to view, insert, update,
TOPIC 7.0 LINUX SERVICES AND CONFIGURATION. ROOT USER Root user is called “super user” because it has power far beyond those of mortal user. As root,
Chapter 5 Concurrency: Mutual Exclusion and Synchronization Operating Systems: Internals and Design Principles, 6/E William Stallings Patricia Roy Manatee.
SPI NIGHTLIES Alex Hodgkins. SPI nightlies  Build and test various software projects each night  Provide a nightlies summary page that displays all.
(1) Code Walkthrough robocode-pmj-dacruzer Philip Johnson Collaborative Software Development Laboratory Information and Computer Sciences University of.
ISA 95 Working Group (Business) Process Centric Exchanges Dennis Brandl A Modest Proposal July 22, 2015.
September 28, 2010COMS W41561 COMS W4156: Advanced Software Engineering Prof. Gail Kaiser
Final Presentation Smart-Home Smart-Switch using Arduino
Active-HDL Server Farm Course 11. All materials updated on: September 30, 2004 Outline 1.Introduction 2.Advantages 3.Requirements 4.Installation 5.Architecture.
Setup of database, JMS broker, and FTP server Søren V. Carlsen
Java Distributed Computing
The Holmes Platform and Applications
Software and Communication Driver, for Multimedia analyzing tools on the CEVA-X Platform. June 2007 Arik Caspi Eyal Gabay.
Architecture Review 10/11/2004
BUILD SECURE PRODUCTS AND SERVICES
Automated ADT Interface Version .02
Maven 04 March
Essentials of UrbanCode Deploy v6.1 QQ147
Distributed Computing
CMS High Level Trigger Configuration Management
OFBiz Internals.
Cross Platform Development using Software Matrix
Java Distributed Computing
Advanced Integration and Deployment Techniques
PHP / MySQL Introduction
Complete 1z0-161 Exam Dumps - Pass In 24 Hours - Dumps4download.us
Remote Method Invocation
Network and Distributed Programming in Java
Java Messaging Service (JMS)
Java Messaging Service (JMS)
Java Messaging Service (JMS)
Packages and Interfaces
Introduction to JBoss application server
Module 01 ETICS Overview ETICS Online Tutorials
J2EE Lecture 13: JMS and WebSocket
Enterprise Java Beans.
Process/Code Migration and Cloning
Message Passing Systems Version 2
Building LabKey with Gradle
SharePoint 2013 Best Practices
Message Passing Systems
Presentation transcript:

The Architecture of NetarchiveSuite Søren V. Carlsen (svc@kb.dk)

NetarchiveSuite Java Modules (1)‏ The software consists of the following modules: Common (dk.netarkivet.common)‏ Harvester (dk.netarkivet.harvester)‏ Archive (dk.netarkivet.archie)‏ Viewerproxy (dk.netarkivet.viewerproxy)‏ Monitor (dk.netarkivet.monitor)‏ ”Deploy (dk.netarkivet.deploy)‏

NetarchiveSuite Java Modules (2)‏ Common module: Contain common functionality (like exceptions, JMS sending and receiving, ARC utilities) not specific to one module. Declares most of the interfaces, that the plugins must implement (only exception DBSpecifics)‏ Applications: dk.netarkivet.common.webinterface.GUIApplication All other modules depend on the common module, but can otherwise be used independently

NetarchiveSuite Java Modules (3)‏ Harvester module: Contains software related to partitioning, scheduling, and running HarvestJobs. Main apps: dk.netarkivet.harvester.harvesting.HarvestControllerApplic ation (Accepts jobs, starts Heritrix, return results)‏ dk.netarkivet.harvester.sidekick.SideKick dk.netarkivet.harvester.datamodel.HarvestTemplateApplic ation (Creates, downloads, updates, and deletes templates in database)‏ dk.netarkivet.harvester.webinterface.HarvestDefinitionApplicatio n (GUIApplication + scheduler in one app)‏

NetarchiveSuite Java Modules (4)‏ Archive module: Contains software implementing an archive distributed among several locations. Main Applications: dk.netarkivet.archive.arcrepository.ARCRepository Application dk.netarkivet.archive.bitarchive.BitarchiveApplicati on dk.netarkivet.archive.bitarchive.BitarchiveMonitor Application dk.netarkivet.archive.indexserver.IndexServerAppli cation

NetarchiveSuite Java Modules (5)‏ Viewerproxy module: Implements software to browse in archived material using a proxyserver. Main Applications: dk.netarkivet.viewerproxy.ViewerProxyApplication Monitor module: Software that utilizes JMX to monitor the running NetarchiveSuite dk.netarkivet.monitor.tools.JMXProxy

NetarchiveSuite Java Modules (6)‏ The modules and their associated webpages: Harvester webpages/HarvestDefinition (DefinitionsSitesection)‏ webpages/History Archive (webpages/BitPreservation)‏ Viewerproxy (webpages/QA)‏ Monitor (webpages/Status)‏

Messages and datatransfer (1)‏ Data is transmitted between applications either embedded inside messages, or using remoteFiles Messages is sent by applications, and received by application using a JMS message broker as middleman JMS messages can be sent to either a TOPIC Channel or a QUEUE Channel. In case of a TOPIC, all listeners will get the message. But if no one is listening, no one gets the message.

Messages and datatransfer (2)‏ In case of a QUEUE channel, the message remains in the channel until a listener will accept the message, or the message is deleted when cleaning the JMSBroker during restart of NetarchiveSuite Some of the channels used: THE_SCHED (one in all) – receives messages from the harvesters (QUEUE – channel)‏ ARC_REPOS (one in all) – QUEUE channel Receives archive-requests (getRequests, UploadRequests) from clients Forwards the requests to the archive. ALL_BA (one for each location) – TOPIC channel Used to multitask batchjobs to the BitarchiveServers

Messages and datatransfer (3)‏ More channels used: ANY_HIGHPRIORITY_HACO (one in all) – QUEUE channel Scheduler sends selective harvest jobs to this channel The HarvestControllers (with queuePriority set to HIGHPRIORITY) are listening to this channel, when they are idle ANY_LOWPRIORITY_HACO (one in all) – QUEUE channel Scheduler sends snapshot harvest jobs to this channel The HarvestControllers (with queuePriority set to LOWPRIORITY) are listening to this channel, when they are idle

Messages and datatransfer (4)‏ The Message channels is named with the JMS environment as prefix, so multiple NetarchiveSuite instances can use the same JMS broker without interfering with each other And to make it easy to empty any remaining messages lying in the channels during restart of the NetarchiveSuite

Implemented messages (1)‏ All messages extend the class NetarkivetMessage dk.netarkivet.common.distribute.NetarkivetMessag e Harvester messages Extend dk.netarkivet.harvester.distribute.HarvesterMessage DoOneCrawlMessage (dk.netarkivet.harvester.harvesting.distribute.*)‏ CrawlStatusMessage (dk.netarkivet.harvester.harvesting.distribute.*)‏

Implemented messages (2)‏ Archive messages Extend dk.netarkivet.archive.distribute.ArchiveMessage (dk.netarkivet.harvester.harvesting.distribute.*)‏ Bitpreservation messages: (dk.netarkivet.archive.arcrepository.bitpreservation)‏ AdminDataMessage RemoveAndGetFileMessage, Batchmessages : BatchMessage, BatchEndedMessage, BatchReplyMessage Request messages: GetFileMessage, GetMessage, IndexRequestMessage, StoreMessage, UploadMessage

Using JMSConnection (1)‏ import dk.netarkivet.common.distribute.JMSConnection; JMSConnection con = JMSConnection getInstance(); NetarkivetMessage nm = … con.send(nm) - send the message (the message itself knows the correct destination queue or topic)‏ con.replyTo(nm) - send message back to the sender of this message (or where ever we want the messages to go)‏

Using JMSConnection (2)‏ ChannelID A = ..; con.resend(nm, A) - used to forward or resubmit a message to queue A when a listener accepts the message from a queue (which removes the message from queue), and then finds that he can't handle the message. con.setListener(ChannelID mq, MessageListener ml)‏ Start listening to channel mq using MessageListener ml con.removeListener(ChannelID mq, MessageListener ml) Stop listening to channel mq using MessageListener ml All listeners must implement the interface javax.jms.MessageListener; void onMessage(Message)‏

The layout of the package (1)‏ The package contains build.xml (ant file for building the code)‏ lib (3rdparty libraries required)‏ src (the Java source code)‏ tests (unittests, and integrity tests for the software, as well as additional 3rd party libraries required by the tests) conf (default settings.xml, and deploy-script)‏ webpages (the GUI)‏ scripts/simple_harvest (Easy deploy of NetarchiveSuite)‏ scripts/sql - SQL create scripts for Derby and MySQL docs (javadoc for NetarchiveSuite

The layout of the package (2)‏ The package contains harvestdefinitionbasedir/fullhddb.jar (default Derby database ready for use)‏ harvestdefinitionbasedir/order_templates (default harvest templates)‏

The build.xml and its targets Ant version: 1.6.2+ We have targets that enables us to run tests (unittest, fulltest)‏ generate a codecoverage report (clover.report)‏ generate the module jarfiles (jarfiles)‏ generate the webpages warfiles (warfiles)‏ generate the javadoc for the code (javadoc)‏ generate a NetarchiveSuite release package (releasezipball)‏ Show the build.xml