"Data sources index" a web application to list projects in Hadoop Luca Menichetti.


Scope, Problem
One goal of the AWG is to collect data coming from different IT service projects in our Hadoop clusters, so that it can be analysed easily and quickly.
TWiki page: contains an inventory of all data sources and their available metrics.
TWiki page limitations:
◦ It is a static list (no ETL updates from the origin sources)
◦ It focuses on the origin metrics as described by the provider (no later manipulations or alternative formats)
◦ It offers no defined APIs

Purpose
To offer a web application where users can:
◦ browse all collected data sources and see what is currently available for each project,
◦ monitor the daily ETL state from the origin source to the cluster,
◦ list other formats besides the main one,
◦ share new derived datasets (e.g. created from a join with other datasets, or enriched with external information),
◦ use public APIs
…without replacing the TWiki, which remains the main documentation reference for each project.

How
A Java web application with a REST API implementing CRUD logic.
Running on: OpenStack, win.medium size
Container: Tomcat
Database: MongoDB (Morphia for object mapping)
Content type standard: Collection+JSON
Web interface: Bootstrap 4
Web homepage:
Source code: Gitlab project

REST model
For each data source in the TWiki, there are one or more “Projects” stored in the web index application.
A Project may have many “Formats”
◦ CSV, Parquet, Avro, …
A Format may have a list of “Entries”
◦ Each Entry represents a single import
“Notes” can be attached to a Format and optionally to an Entry
◦ General-purpose messages
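The hierarchy on this slide could be sketched in Python as nested dataclasses. This is only an illustration of the Project / Format / Entry / Note relationships, not the application's actual Java classes. The names `project_name`, `full_name`, `description` and `format_name` appear in the REST examples of this presentation; `label`, `message` and the helper `formats_path` are hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Note:
    # General-purpose message; the field name "message" is an assumption.
    message: str


@dataclass
class Entry:
    # Represents a single import; "label" is a hypothetical identifier.
    label: str
    notes: List[Note] = field(default_factory=list)  # Notes are optional on an Entry


@dataclass
class Format:
    format_name: str  # e.g. "jm-atlas-avro"
    entries: List[Entry] = field(default_factory=list)
    notes: List[Note] = field(default_factory=list)


@dataclass
class Project:
    project_name: str
    full_name: str
    description: str
    formats: List[Format] = field(default_factory=list)


def formats_path(project: Project) -> str:
    """Build the REST path under which a project's formats are listed."""
    return f"/projects/{project.project_name}/formats"


project = Project(
    project_name="jm-atlas",
    full_name="Experiment Dashboard Job Monitoring Atlas",
    description="job monitoring logs",
    formats=[Format(format_name="jm-atlas-avro", notes=[Note("example note")])],
)
print(formats_path(project))
```

The path returned by `formats_path` matches the `href` shown in the REST API response example (`/projects/jm-atlas/formats`).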

Web interface and API

Example – landb (1)
[Screenshots: TWiki page; Data Sources Index page]

Example – landb (2)
Compact view for fast visualization of all formats and their schemas
Bookmark the web page for quick reference
Not meant to replace the TWiki, where all the metrics are described
Every resource is accessible through its REST path
No “Entries” are available for these Formats (clicking on the links produces an empty result)

Example – experiment job monitoring (1)
Links: TWiki, Data Sources Index

Example – experiment job monitoring (2)
[Screenshots: Listing formats… Listing entries…]

Example – EOS logs (1)
[Screenshots: Listing formats… Listing notes…]

Example – REST API (1)
$ cat templates/jm-atlas_CJ_template.json
{ "template": {
    "data": [
      { "name": "project_name", "value": "jm-atlas" },
      { "name": "full_name", "value": "Experiment Dashboard Job Monitoring Atlas" },
      { "name": "description", "value": "the job monitoring logs of all executions submitted by Atlas in.." },
      ...
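A template of this shape can also be generated instead of written by hand. The following is a minimal Python sketch, assuming only the Collection+JSON "template" layout shown above (a `data` array of name/value pairs). The helper name `make_template` is hypothetical and is not part of the project's scripts.

```python
import json


def make_template(fields: dict) -> dict:
    """Wrap plain key/value pairs in a Collection+JSON "template" object."""
    return {
        "template": {
            "data": [{"name": name, "value": value} for name, value in fields.items()]
        }
    }


body = make_template(
    {
        "project_name": "jm-atlas",
        "full_name": "Experiment Dashboard Job Monitoring Atlas",
    }
)
print(json.dumps(body, indent=2))
```

Only the two fields whose values are fully visible on the slide are used here; the remaining fields of the real template are not reproduced.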

Example – REST API (2)
# Create a project
curl -X POST -H "Content-Type: application/vnd.collection+json" awg-virtual/data-sources-index/rest/projects/
# Retrieve
curl awg-virtual/data-sources-index/rest/projects/jm-atlas
# Delete
curl -X DELETE awg-virtual/data-sources-index/rest/projects/jm-atlas
# Create a format
curl -X POST -H "Content-Type: application/vnd.collection+json" awg-virtual/data-sources-index/rest/projects/jm-atlas/formats
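The same calls could be prepared from Python with the standard library. This is a sketch under stated assumptions: the `http://` scheme in `BASE_URL` is assumed (the slide shows only the host and path), and sending the JSON template as the POST body is also an assumption, since the curl lines above do not show a payload. The function names are hypothetical. The code only builds request objects and sends nothing.

```python
import json
import urllib.request

# Assumption: plain http; the slide shows "awg-virtual/data-sources-index/rest/...".
BASE_URL = "http://awg-virtual/data-sources-index/rest"
COLLECTION_JSON = "application/vnd.collection+json"


def create_project_request(template: dict) -> urllib.request.Request:
    """POST a Collection+JSON template to the projects collection."""
    return urllib.request.Request(
        f"{BASE_URL}/projects/",
        data=json.dumps(template).encode("utf-8"),  # assumed request body
        headers={"Content-Type": COLLECTION_JSON},
        method="POST",
    )


def retrieve_project_request(project_name: str) -> urllib.request.Request:
    """GET a single project by its short name."""
    return urllib.request.Request(f"{BASE_URL}/projects/{project_name}", method="GET")


def delete_project_request(project_name: str) -> urllib.request.Request:
    """DELETE a single project by its short name."""
    return urllib.request.Request(f"{BASE_URL}/projects/{project_name}", method="DELETE")


req = create_project_request(
    {"template": {"data": [{"name": "project_name", "value": "jm-atlas"}]}}
)
print(req.get_method(), req.full_url)
```

Passing one of these objects to `urllib.request.urlopen` would perform the call; that only works from a machine that can reach the service.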

Example – REST API (3)
$ curl -v awg-virtual/data-sources-index/rest/projects/jm-atlas/formats
> GET /data-sources-index/rest/projects/jm-atlas/formats HTTP/1.1
< HTTP/ OK, Content-Type: application/vnd.collection+json
{ "collection": {
    "version": "1.0",
    "href": "/projects/jm-atlas/formats",
    "items": [
      { "href": "/formats/55b72f41080d827ec79968f9",
        "data": [
          { "name": "ID",
            "prompt": "the internal ID given by morphia to store the object in mongodb",
            "value": "55b72f41080d827ec79968f9" },
          { "name": "format_name",
            "prompt": "the short name for the project, commonly used in the HDFS path and as parameter for awgrepotool",
            "value": "jm-atlas-avro" },
          …
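A client would usually flatten such a Collection+JSON response into plain records. Below is a minimal Python sketch. The `SAMPLE` document is a shortened version of the response above, closed by hand because the slide truncates it and with the `prompt` fields left out. The helper name `items_to_dicts` is hypothetical.

```python
import json

# Shortened, hand-closed version of the response on the slide (prompts omitted).
SAMPLE = """
{ "collection": {
    "version": "1.0",
    "href": "/projects/jm-atlas/formats",
    "items": [
      { "href": "/formats/55b72f41080d827ec79968f9",
        "data": [
          { "name": "ID", "value": "55b72f41080d827ec79968f9" },
          { "name": "format_name", "value": "jm-atlas-avro" }
        ] }
    ] } }
"""


def items_to_dicts(document: str) -> list:
    """Turn each Collection+JSON item into a flat {name: value} dict plus its href."""
    collection = json.loads(document)["collection"]
    records = []
    for item in collection.get("items", []):
        record = {pair["name"]: pair.get("value") for pair in item.get("data", [])}
        record["href"] = item.get("href")
        records.append(record)
    return records


for record in items_to_dicts(SAMPLE):
    print(record["format_name"], record["href"])
```

Each record keeps the item's `href`, so the REST path of a format stays available after flattening.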

References
Web application:
◦ Not visible outside CERN
REST API: Gitlab project / Wiki
◦ Visible to all CERN members
◦ Only restricted users can modify the project
Documentation and examples
◦ Detailed description of the REST APIs
◦ Python scripts, ready to use
Contact:
◦ or open an issue

Questions or suggestions? Thank you.