INFSO-RI-508833 Enabling Grids for E-sciencE www.eu-egee.org gLibrary: a Multimedia Contents Management System on the grid Tony Calanducci INFN Catania,


gLibrary: a Multimedia Contents Management System on the grid
Tony Calanducci, INFN Catania, NA3 & NA4
EGEE User Forum, March 2006, CERN, Geneva
INFSO-RI, Enabling Grids for E-sciencE

Outline
  – Motivations
  – gLibrary features
  – Implementation details
  – Security features
  – Future planned improvements
  – Conclusions

Motivations
Huge amounts of data can be saved on Storage Elements (SEs): have we forgotten that Data Grids exist?
But how can we easily find a file later, when we need it?
  – If you have a good memory, its GUID could be a solution.
  – File Catalogues only let us arrange files in folders and subfolders; there is no way to query their contents.
  – Metadata Catalogues are a possible solution, but they are not always "affordable", especially for non-expert users (powerful but complex to use).
Our solution: gLibrary, a higher-level application built on top of several gLite grid services (a Metadata Catalogue + File Catalogues + Storage Elements).
Requirements: easy to use, fast, secure, extensible.

gLibrary goals
An attempt to create a Multimedia Management System on the Grid
  – Examples of multimedia contents handled by gLibrary:
    – Images
    – Movies
    – Audio files
    – Office documents (PowerPoint, Word, Excel, OpenOffice)
    – s, PDFs, HTMLs
    – Customized versions of well-known document types (e.g. EGEE PPTs)
    – …
Keep track of, and organize in a uniform way, all the additional details (metadata) of files saved in Storage Elements and registered in File Catalogues
Provide users with an easy way to locate and retrieve files based on their contents

Usage scenarios
Example 1:
  – Locate all theoretical (PPTType) PowerPoint (Type) presentations about FiReMan (Keywords) given in 2005 (Date) by Uncle Sam (Speaker).
  – Find all the movies (Type) produced in the USA (Country) in 2004 (ReleaseDate) in which Julia Roberts (Cast) performed together with Hugh Grant (Cast).
  – Find all the acoustic (Genre) mp3 (Format) audio files (Type) by Alanis Morissette (Singer) that last more than 3 minutes (Runtime).
Example 2:
  – A doctor is looking for brain (Keyword) DICOM (Type) images of male (Gender) patients older than 65 (Age).
Example 3:
  – A job can behave as a storage crawler: it scans pre-existing files in Storage Elements and extracts relevant metadata, which are then published in gLibrary for further data mining.

gLibrary prototype implementation
Files are saved on SEs and registered in file catalogues (LFC and/or FiReMan).
The AMGA Metadata Catalogue is used to archive and organize metadata and to answer users' queries.
gLibrary is built using the following AMGA collections:
  – /gLibrary contains generic metadata for each entry
  – /gLAudio, /gLImage, /gLVideo, /gLPPT, /EGEEPPT, /gLDoc, … are examples of collections of "additional features" (shown later)
  – /gLTypes
    – keeps the associations between document types and the names of the collections that contain the "additional features"
    – is used by gLibrary to find out where it has to look when new document types are added to the system (extensibility)
  – /gLKeys is used to store decryption keys
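The role of /gLTypes can be illustrated with a small in-memory sketch. This is not the AMGA API: the dictionary merely stands in for the /gLTypes collection, the type names are the ones that appear in the example entries of this talk, and `collections_for`, `register_type` and the "Invoice" type are hypothetical names introduced only for illustration.

```python
# In-memory stand-in for the /gLTypes collection (an assumption, not AMGA code):
# document type -> collection holding that type's "additional features".
GL_TYPES = {
    "Audio": "/gLAudio",
    "Image": "/gLImage",
    "Video": "/gLVideo",
    "PowerPoint": "/gLPPT",
    "EGEEDOC": "/EGEEPPT",
}

GENERIC_COLLECTION = "/gLibrary"  # generic metadata, one entry per file


def collections_for(doc_type, types=GL_TYPES):
    """Return the collections that together describe an entry of doc_type."""
    try:
        extra = types[doc_type]
    except KeyError:
        raise ValueError(f"unknown document type: {doc_type!r}") from None
    return [GENERIC_COLLECTION, extra]


def register_type(doc_type, collection, types=GL_TYPES):
    """Add a new document type; lookups for existing types are unaffected."""
    if not collection.startswith("/"):
        raise ValueError("collection names are absolute paths")
    types[doc_type] = collection


print(collections_for("Audio"))
register_type("Invoice", "/gLInvoice")  # hypothetical new document type
print(collections_for("Invoice"))
```

Under these assumptions, supporting a new document type only requires one new association and one new collection; the generic /gLibrary metadata stay untouched.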

Example of entries

Collection /gLibrary

Entry name | FileName | PathName | Type | Submitter
4ffaffc8-26e b460-3d5bf08081a4 | DedicatoAte.mp3 | /grid/gilda/calanducci | Audio | Tony Calanducci
00454dca-a269-4b93-8a45-c4012af05600 | ardizzonelarocca_is_ ppt.gpg | /grid/gilda/calanducci/EGEE | EGEEDOC | Tony Calanducci

Collection /gLibrary (continued)

Entry name | SubmissionDate | Encryption | Description | Keywords | CreationDate
4ffaffc8-26e b460-3d5bf08081a4 | :00:00 | false | Canzone delle vibrazioni che ha ricevuto un enorme successo tra i teenagers nel 2003 (a song by Le Vibrazioni that was hugely successful among teenagers in 2003) | Vibrazioni | :00:
00454dca-a269-4b93-8a45-c4012af05600 | :44:22 | true | gLite Information System | R-GMA, RGMA, BDII, IS | :40

Example of gLibrary collections

Collection /gLTypes

Entry name | Path (refers to a collection)
Audio | /gLAudio
Image | /gLImage
Video | /gLVideo
Documents | /gLDOC
PowerPoint | /gLPPT
EGEEDOC | /EGEEPPT

Collection /EGEEPPT ("additional features")

Entry name | Title | Runtime | Author | Type | Date | Event | Speaker | Topic
00454dca-a269-4b93-8a45-c4012af05600 | Information Systems | 00:30:00 | Valeria Ardizzone, Giuseppe La Rocca | Theorical | | th EGEE Conference | Giuseppe La Rocca, Valeria Ardizzone | R-GMA, BDII

Collection /gLAudio ("additional features")

Entry name | SongTitle | Duration | Album | Genre | Singer | Format
4ffaffc8-26e b460-3d5bf08081a4 | Dedicato A Te | 00:03:27 | Dedicato A Te | Pop | Le Vibrazioni | MP3

Collection /gLKeys

Entry name | Passphrase
00454dca-a269-4b93-8a45-c4012af05600 | ardizzo

gLibrary query examples

Query> selectattr /gLibrary:FILE /gLibrary:FileName /gLibrary:Description /EGEEPPT:Author /EGEEPPT:Title /EGEEPPT:Event '/gLibrary:FILE=/EGEEPPT:FILE and like(/gLibrary:Keywords, "%VOMS%")'
>> 1f6e9ac6-5c b03b-560e0e7ea38a
>> VOMS_server_Installation.ppt.gpg
>> VOMS Server installation tutorial done in Venezuela
>> ziggy, Giorgio
>> Installing a gLite VOMS Server
>> First Latin American Workshop for Grid Administrators

Query> selectattr /gLibrary:FileName SubmissionDate Submitter /gLAudio:SongTitle Singer Duration Genre '/gLibrary:FILE=/gLAudio:FILE and /gLAudio:Format="MP3"'
>> DedicatoAte.mp3
>> :00:00
>> Tony Calanducci
>> Dedicato A Te
>> Le Vibrazioni
>> 00:03:27
>> Pop
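Both queries follow the same pattern: a list of attributes, then a quoted condition that joins /gLibrary with one "additional features" collection on FILE. A minimal sketch of assembling such a command string is given below. It only builds text in the form shown on this slide; `build_selectattr` is a hypothetical helper, not part of any AMGA client library, and it does not contact a server.

```python
def build_selectattr(attributes, extra_collection, condition=None):
    """Assemble a selectattr command joining /gLibrary with extra_collection.

    attributes: list of attribute names, written as on the slide
                (e.g. "/gLibrary:FileName")
    extra_collection: the "additional features" collection, e.g. "/EGEEPPT"
    condition: optional extra filter, ANDed with the join on FILE
    """
    if not attributes:
        raise ValueError("at least one attribute is required")
    if condition is not None and "'" in condition:
        # The whole condition is wrapped in single quotes, so an embedded
        # single quote would end it early; this sketch simply refuses it.
        raise ValueError("condition must not contain single quotes")
    join = f"/gLibrary:FILE={extra_collection}:FILE"
    where = join if condition is None else f"{join} and {condition}"
    return f"selectattr {' '.join(attributes)} '{where}'"


query = build_selectattr(
    ["/gLibrary:FileName", "/EGEEPPT:Title"],
    "/EGEEPPT",
    'like(/gLibrary:Keywords, "%VOMS%")',
)
print(query)
```

Keeping the join on FILE inside the helper means a caller only supplies the type-specific part of the condition.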

gLibrary security
User requirements:
  – A valid proxy with VOMS extensions
  – A VOMS Role and Group, needed to be recognized by gLibrary as a contents manager
Three kinds of users:
  – gLibraryManager: can create new content types and allow a generic VO user to become a gLibrarySubmitter
  – gLibrarySubmitters: can add new entries and define access rights on the entries they create
    – Fine-grained permission settings (reading, writing, listing, decrypting) on each entry, granted to all VO members, to VO groups or to a list of DNs
  – Generic VO users: can browse and make queries (on the entries they have access to)
Basic level of cryptography:
  – New files saved on SEs can be encrypted beforehand with a symmetric passphrase, which is saved in /gLKeys. Only selected users (those with a specific DN in the subject of their VOMS proxy) can access the passphrase and decrypt the file.
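The per-entry permission model can be sketched as a small in-memory check. This is an illustrative model only, not how AMGA stores or evaluates ACLs: the `User` and `Entry` classes, the principal prefixes (`dn:`, `vo:`, `group:`), the entry names and the example DNs are all assumptions made for the sketch. The four permission names are the ones listed on the slide.

```python
from dataclasses import dataclass, field

# Permission names taken from the slide; everything else is hypothetical.
PERMISSIONS = frozenset({"read", "write", "list", "decrypt"})


@dataclass(frozen=True)
class User:
    dn: str                          # certificate subject
    vo: str                          # virtual organisation
    groups: frozenset = frozenset()  # VOMS groups carried by the proxy

    def principals(self):
        """All identities under which this user may be granted rights."""
        return {f"dn:{self.dn}", f"vo:{self.vo}"} | {f"group:{g}" for g in self.groups}


@dataclass
class Entry:
    name: str
    acl: dict = field(default_factory=dict)  # principal -> set of permissions

    def grant(self, principal, *perms):
        unknown = set(perms) - PERMISSIONS
        if unknown:
            raise ValueError(f"unknown permissions: {sorted(unknown)}")
        self.acl.setdefault(principal, set()).update(perms)


def allowed(user, entry, perm):
    """True if any of the user's principals holds perm on the entry."""
    return any(perm in entry.acl.get(p, ()) for p in user.principals())


def visible(user, entries):
    """Names of the entries the user is allowed to list."""
    return [e.name for e in entries if allowed(user, e, "list")]


public = Entry("public-talk")
public.grant("vo:gilda", "read", "list")
restricted = Entry("restricted-talk")
restricted.grant("group:glibrarysubmitters", "read", "write", "list")
restricted.grant("dn:CN=Example Submitter", "decrypt")

submitter = User("CN=Example Submitter", "gilda", frozenset({"glibrarysubmitters"}))
generic = User("CN=Example Member", "gilda")

print(visible(submitter, [public, restricted]))
print(visible(generic, [public, restricted]))
print(allowed(submitter, restricted, "decrypt"))
```

In this model an entry that a user cannot list is simply absent from the result, and the decrypt right is tied to a single DN, mirroring the passphrase rule described above.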

Security example (I)

Connecting to amga.ct.infn.it:
ARDA Metadata Server
Query> whoami
>> tony
Query> user_listcred tony
>> 'C = IT, O = GILDA, OU = Personal Certificate, L = INFN Catania, CN = Tony Calanducci, Address =
Query> grp_member
>> gilda:users
>> gLibraryManager:glibrarysubmitters
Query> addentry /gLibrary/1f6e9ac6-5c b03b-560e0e7ea38a FileName VOMS_server_Installation.ppt.gpg PathName /grid/gilda/calanducci/EGEE Type EGEEDOC Submitter 'Tony Calanducci' SubmissionDate ' :44' DecryptKeyDir '/DLKeys/gildateam' Description 'VOMS Server installation tutorial done in Venezuela' Keywords 'VOMS Server' CreationDate ' :28'
Query> acl_show /gLibrary/1f6e9ac6-5c b03b-560e0e7ea38a
>> tony rwxr-x
>> gLibraryManager:glibrarysubmitters rwx

Security example (II)

Query> dir /gLibrary
>> /gLibrary/00454dca-a269-4b93-8a45-c4012af05600
>> entry
>> /gLibrary/abd52d35-1bee-4de9-b234-a9abd920307e
>> entry
>> /gLibrary/1f6e9ac6-5c b03b-560e0e7ea38a
>> entry

Let's log out and log in again using a VOMS proxy with just gilda VO membership (no Role or Group).

ARDA Metadata Server
Query> whoami
>> gilda
Query> grp_member
>> gilda:users
Query> dir /gLibrary
>> /gLibrary/00454dca-a269-4b93-8a45-c4012af05600
>> entry
Query> acl_show /gLibrary/00454dca-a269-4b93-8a45-c4012af05600
>> gLibraryManager rwxr-x
>> gilda:users rx

The entry created previously does not even appear to unauthorized users.

Implementation
Heavy exploitation of AMGA features
  – Support for VOMS proxy authentication
  – Fine-grained authorization: ACLs can be set on a per-entry basis, which is used to restrict access to the decryption keys
    – Allows gLibrarySubmitters to control which users (based on DNs, VOMS Roles and Groups) can list the submitted entries and get their attribute values
GUI front-ends (to deliver on the "ease of use" promise)
  – Java Swing GUI to be run on a Grid User Interface (JVM required); a prototype is under way
  – Portlet-based front-end to be deployed in GENIUSPHERE and made available for any other JSR 168-compliant portlet container
    – Both use the AMGA Java APIs

gLibrary deployment scenario
[Figure: deployment diagram. Labels: UI; Authenticate with X509 Certificate; VOMS (gLibraryManager, gLibrarySubmitter, VO user); VOMS Proxy with Group & Role Information; AMGA Server; PostgreSQL; LFC (or FiReMan) Catalog; VOMS Proxy w/ Role & Group; SE; VOMS Proxy]

gLibrary Java GUI screenshot
[Figure: screenshot of the alpha prototype]

gLibrary Java GUI screenshot (II)
[Figure: screenshot of the alpha prototype]

Future planned improvements
Splitting of big files among several SEs (different chunks stored in different SEs):
  – Enforces data security: even if a chunk is intercepted, it has no meaning on its own
  – Increases upload/download bandwidth
  – Possible implementation:
    – one more attribute, NumberOfChunks, in the /gLibrary collection
    – a /gLChunks collection that keeps track of FirstChunkGUID-Chunk#-ChunkGUID
Automatic extraction and population of metadata for well-known document types
  – Use of GNU libextractor to extract metadata from HTML, PDF, PS, OLE2 (DOC, XLS, PPT), OpenOffice (sxw), StarOffice (sdw), DVI, MAN, MP3 (ID3v1 and ID3v2), OGG, WAV, EXIV2, JPEG, GIF, PNG, TIFF, DEB, RPM, TAR(.GZ), ZIP, ELF, REAL, RIFF (AVI), MPEG, QT and ASF
  – Use of Lucene for indexing document types that contain text
Evaluation of the gLite Hydra Key Store to save decryption keys
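The proposed chunk bookkeeping can be sketched as follows. This is a local illustration under stated assumptions, not the planned implementation: chunks are kept in a dictionary instead of being encrypted and uploaded to different SEs, GUIDs are generated with `uuid4` (on the grid they would be assigned when the chunks are registered), and all function names are hypothetical. Each record is a (FirstChunkGUID, Chunk#, ChunkGUID) tuple, as suggested for /gLChunks.

```python
import uuid


def split_into_chunks(data, n_chunks):
    """Split data into n_chunks pieces whose sizes differ by at most one byte."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    base, extra = divmod(len(data), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + base + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def register_chunks(chunks):
    """Return (/gLChunks-style records, stand-in storage keyed by GUID)."""
    guids = [str(uuid.uuid4()) for _ in chunks]
    first = guids[0]
    records = [(first, number, guid) for number, guid in enumerate(guids, start=1)]
    storage = dict(zip(guids, chunks))  # stands in for chunks spread over SEs
    return records, storage


def reassemble(first_guid, records, storage):
    """Rebuild the file from the chunks that belong to first_guid."""
    ordered = sorted((num, guid) for first, num, guid in records if first == first_guid)
    return b"".join(storage[guid] for _, guid in ordered)


data = bytes(range(10))
chunks = split_into_chunks(data, 4)
records, storage = register_chunks(chunks)
first_guid = records[0][0]
print([len(c) for c in chunks])
print(reassemble(first_guid, records, storage) == data)
```

With this layout the NumberOfChunks attribute in /gLibrary would equal `len(records)`, and a downloader only needs the first chunk's GUID to find the rest.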

Splitting implementation
[Figure: the file EGEE_Movie.mpg on the UI and its four chunks on SEs: EGEE_Movie.mpg_gpg_1, EGEE_Movie.mpg_gpg_2, EGEE_Movie_mpg_gpg_3, EGEE_Movie.mpg_gpg_4]

Conclusion
  – Born as a use case to demonstrate AMGA features
  – Built on top of many gLite services
  – Collaboration and integration with the NA3 Document Digital Library System are being considered
  – Fast → thanks to AMGA
  – Secure → ACLs, encryption and splitting
  – Easy to use → user-friendly Java GUI and portal soon available
  – Easily extensible to support any document type (medical images and files, invoices, proceedings, scientific publications, newspaper clips, …)

Any questions?
Thanks for your attention.