Indicate Research Pilots An e-Infrastructure enabled semantic search service Technical Conference Catania 20/04/2012 NTUA Kostas Pardalis 1.

Presentation transcript:

Indicate Research Pilots An e-Infrastructure enabled semantic search service Technical Conference Catania 20/04/2012 NTUA Kostas Pardalis 1

Pilot Objectives
– Establish a search system using MICHAEL data
– Enrich the search system with semantic search capabilities
– Evaluate the feasibility of these requirements using e-infrastructures, presenting the main benefits from this integration 2

Use Case Scenario Adopt a typical but simplified workflow from the digital culture domain consisting of the following steps: – Aggregate data – Transform data into a common reference schema – Data Enrichment – Store data into an appropriate semantic repository – Semantic search 3

Implemented Tasks
Data Manipulation
– RDFization using a simple data model
– Semantic Enrichment using DBpedia
Semantic Repository for data storage
E-Infrastructures architecture
Evaluation of the architecture 4

Data Manipulation - Data Model Exploration of data – Every XML item represents a collection of digital cultural objects Mapping of XML elements to RDF properties to achieve a semantic representation of the data – Language → dcterms:language – Digital Format → dcterms:format 5

Data Manipulation - RDFization XML Instance Dambusters JPEG … RDF Representation Dambusters JPEG English Defence Economic and social development UNITED KINGDOM … 6
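The mapping step on these two slides can be sketched as follows. This is a minimal illustration, not the pilot's code: the XML element names (`language`, `digitalFormat`), the sample record, and the subject URI are assumptions, since the transcript lost the original markup. Only the two `dcterms` mappings come from the slides.

```python
import xml.etree.ElementTree as ET

DCTERMS = "http://purl.org/dc/terms/"

# Assumed XML element names; only the target properties appear on the slide.
ELEMENT_TO_PROPERTY = {
    "language": DCTERMS + "language",
    "digitalFormat": DCTERMS + "format",
}

# Hypothetical record; the real MICHAEL markup is not shown in the transcript.
SAMPLE_XML = (
    "<collection>"
    "<title>Dambusters</title>"
    "<digitalFormat>JPEG</digitalFormat>"
    "<language>English</language>"
    "</collection>"
)


def escape_literal(text):
    """Escape characters that are not allowed raw in an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def rdfize(xml_text, subject_uri):
    """Return N-Triples lines for every mapped element, in document order."""
    root = ET.fromstring(xml_text)
    triples = []
    for element in root.iter():
        prop = ELEMENT_TO_PROPERTY.get(element.tag)
        value = (element.text or "").strip()
        if prop and value:
            triples.append(
                '<%s> <%s> "%s" .' % (subject_uri, prop, escape_literal(value))
            )
    return triples


# The subject URI below is a placeholder, not an identifier used by the pilot.
TRIPLES = rdfize(SAMPLE_XML, "http://example.org/collection/1")
```

Elements without a mapping (here `title`) are skipped, which keeps the data model as simple as the slide describes.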

Data Manipulation - Enrichment Specific values of the examined dataset were matched to DBpedia resources. Additional semantic information is added to the dataset – Countries: area, capital, density, currency, etc. – Languages: spokenIn, languageFamily, speakers, etc. – Famous Persons: dates of birth and death, professions, works, etc. 7
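A rough sketch of the enrichment idea: literal values are matched to DBpedia resources, and facts about each matched resource are added to the dataset. The lookup tables here are small in-memory stubs standing in for DBpedia, and the function name and return shape are assumptions made for illustration. The slides do not say how the pilot performed the matching.

```python
DBO = "http://dbpedia.org/ontology/"
DBR = "http://dbpedia.org/resource/"

# Stub data standing in for DBpedia; a real pipeline would query the service.
STUB_LABELS = {"greece": DBR + "Greece"}
STUB_FACTS = {
    DBR + "Greece": [
        (DBO + "capital", DBR + "Athens"),
        (DBO + "currency", DBR + "Euro"),
    ],
}


def enrich(triples, labels=STUB_LABELS, facts=STUB_FACTS):
    """Link literal values to resources and append the resources' facts.

    Returns (enriched_triples, found, total, percentage), where the last
    three values correspond to the Total / Found / Percentage columns.
    """
    enriched = list(triples)
    total = 0
    found = 0
    for subject, predicate, value in triples:
        total += 1
        resource = labels.get(value.strip().lower())
        if resource is None:
            continue  # value not known to the (stub) knowledge base
        found += 1
        enriched.append((subject, predicate, resource))
        for prop, obj in facts.get(resource, []):
            fact = (resource, prop, obj)
            if fact not in enriched:
                enriched.append(fact)
    percentage = 100.0 * found / total if total else 0.0
    return enriched, found, total, percentage
```

Called with two values, one known and one unknown to the stub, this reports one match out of two (50%).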

Enrichment Results
            Total   Found   Percentage
Countries                   %
Languages                   %
Persons                     %
8

Semantic Repository for data storage
Triplestore Evaluation
– Requirements: distributed, open source licensing, SPARQL language support, web-based access
– Candidates: 4store, Sesame, BigOWLIM 9

Infrastructure Deployment Steps
– Decide which Cloud platform to use
– Deploy the Semantic Enrichment API
– Deploy the Semantic Repository 10

E-Infrastructures - Cloud Platform Amazon EC2 is used as the Cloud environment for deployment. – It provides a concrete pricing model for comparisons. – It is one of the most technologically mature Cloud environments. 11

Amazon EC2 Utilized Services
Amazon Elastic Compute Cloud – Large Instances (7.5 GB of memory, 4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each), 850 GB of local instance storage, 64-bit platform) were used to form the Indicate Cluster.
Elastic IP Addresses – Were assigned to each instance to provide static IPs.
Amazon Elastic Block Store (EBS) – Was used to provide persistent storage to the Indicate Cluster instances. 12

Processing Infrastructure Message Queues and a Producer-Consumer Model are used for scalability. – Producer: queues XML items on the Message Queue for RDFization, enrichment and storing – Consumers: N consumers do the data manipulation concurrently and store the results (parallelization on the data level). 13
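The producer-consumer pattern on this slide can be illustrated in a single process. This sketch uses Python's standard `queue` and `threading` modules as a stand-in for a message broker. It is not how the pilot was deployed, and `process_item` is a hypothetical placeholder for the RDFization, enrichment and storing steps.

```python
import queue
import threading

STOP = object()  # sentinel telling a consumer to shut down


def process_item(item):
    """Placeholder for RDFization, enrichment and storing of one XML item."""
    return item.upper()


def run_pipeline(items, n_consumers=5):
    """Queue every item, then let n_consumers workers process them concurrently."""
    work = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def consumer():
        while True:
            item = work.get()
            if item is STOP:
                break
            processed = process_item(item)
            with results_lock:
                results.append(processed)

    workers = [threading.Thread(target=consumer) for _ in range(n_consumers)]
    for worker in workers:
        worker.start()

    # Producer side: enqueue the items, then one sentinel per consumer.
    for item in items:
        work.put(item)
    for _ in workers:
        work.put(STOP)

    for worker in workers:
        worker.join()
    return results
```

Because each item is handled independently, adding consumers needs no change to the producer, which is the data-level parallelization the slide refers to. Result order is not guaranteed.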

Data Amazon EC2 One Amazon EC2 Instance is acting as the producer and hosts the Message Queue (RabbitMQ). Five Large Amazon EC2 Instances are hosting the consumers. 14

Distributed Semantic Amazon EC2 The 4store Distributed Semantic Repository was installed on 4 Large EC2 Instances. The number of Nodes attached to the Semantic Repository can be adjusted in order to check scalability and performance. 15

Amazon EC2 Pricing Model 16

Evaluation Evaluation was performed for the RDFization, enrichment and storage (load) tasks using – Single thread process on local host – Multi-thread process on local cluster (3 nodes) – Multi-thread process on Amazon cloud (9 nodes) 17

RabbitMQ Processing 18

Amazon EC2 19

Results
MICHAEL: 8511 items
Europeana: ~ items
Method Used      Time in Millisecs
Local Host       ms (~6.2 hrs)
Local Cluster    (~1.39 hrs)
Amazon Cloud     (~23.7 min)
20
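The relative speedups implied by the approximate times on this slide can be worked out as below. Only the rounded figures survive in the transcript (about 6.2 hrs, 1.39 hrs and 23.7 min), so the ratios are approximate, and the function and constant names are chosen here for illustration.

```python
def speedup(baseline_minutes, parallel_minutes):
    """Return how many times faster the parallel run is than the baseline."""
    return baseline_minutes / parallel_minutes


# Approximate run times taken from the slide, converted to minutes.
LOCAL_HOST_MIN = 6.2 * 60      # single thread on the local host
LOCAL_CLUSTER_MIN = 1.39 * 60  # multi-thread on the 3-node local cluster
AMAZON_CLOUD_MIN = 23.7        # multi-thread on the 9-node Amazon cloud

CLUSTER_SPEEDUP = speedup(LOCAL_HOST_MIN, LOCAL_CLUSTER_MIN)
CLOUD_SPEEDUP = speedup(LOCAL_HOST_MIN, AMAZON_CLOUD_MIN)
```

With these figures the local cluster is roughly 4.5 times faster and the Amazon cloud roughly 15.7 times faster than the single-thread run. The baseline is single-threaded while the cluster nodes run multiple threads, so these ratios should not be read as per-node efficiency.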

Demonstration of Semantic Search Querying on data – Search for items from a specific country (e.g. Greece) Semantic Querying using – Search for items from a specific country (e.g. Greece) – Search for items which are held by countries of the Mediterranean Sea that are about living politicians 21
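The first query type (items from a specific country) might be expressed in SPARQL along these lines. The property `dcterms:spatial` is an assumption: the slides only show the mappings for language and format, so the predicate that actually links an item to its country is unknown. The function name is also illustrative.

```python
import re

DCTERMS_PREFIX = "PREFIX dcterms: <http://purl.org/dc/terms/>"


def items_from_country_query(country_uri, limit=50):
    """Build a SPARQL SELECT query for items linked to the given country IRI."""
    # Reject characters that cannot appear inside a SPARQL IRI reference.
    if re.search(r'[<>"\s{}|\\^`]', country_uri):
        raise ValueError("not a plain IRI: %r" % country_uri)
    return "\n".join([
        DCTERMS_PREFIX,
        "SELECT DISTINCT ?item WHERE {",
        # Assumed predicate; replace with the one used by the actual data model.
        "  ?item dcterms:spatial <%s> ." % country_uri,
        "}",
        "LIMIT %d" % int(limit),
    ])


GREECE_QUERY = items_from_country_query("http://dbpedia.org/resource/Greece")
```

The semantic variant on the slide (countries of the Mediterranean Sea, living politicians) would add graph patterns over the enriched DBpedia properties, which is only possible once the country values are resources rather than plain strings.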

SPARQL Endpoint 22

Deployment on ~okeanos An IaaS Service. Developed by GRNET. Aims to deliver production-quality IaaS to the Greek academic and research community. Open source ( 23

Integration with the Indicate Portal cometa.it/semantic-search 24

Conclusions
Semantic Search using e-Infrastructures
– Provides scalability, which is vital for semantic enrichment, since frequent updates are required to keep the data consistent.
– Cost
  Processing: $ 0.68 per node per hour (~ 1.7 €)
  Storage: $ 0.11 per GB per month (~ 4.4 €) 25
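The two unit prices on this slide can be turned into a simple cost estimate. This is a back-of-the-envelope sketch: it assumes cost scales linearly with nodes, hours, gigabytes and months, and it ignores billing granularity and currency conversion. The function names are illustrative.

```python
PROCESSING_USD_PER_NODE_HOUR = 0.68  # from the slide
STORAGE_USD_PER_GB_MONTH = 0.11      # from the slide


def processing_cost_usd(nodes, hours, rate=PROCESSING_USD_PER_NODE_HOUR):
    """Estimated compute cost, assuming linear billing per node-hour."""
    return nodes * hours * rate


def storage_cost_usd(gigabytes, months, rate=STORAGE_USD_PER_GB_MONTH):
    """Estimated storage cost, assuming linear billing per GB-month."""
    return gigabytes * months * rate
```

For example, nine nodes running for one full hour would come to about $6.12 under these assumptions.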

Questions ? 26