A Domain-Specific Modeling Language for Scientific Data Composition and Interoperability. Hyun Cho, University of Alabama at Birmingham; Jeff Gray, University of Alabama


A Domain-Specific Modeling Language for Scientific Data Composition and Interoperability
Hyun Cho, University of Alabama at Birmingham
Jeff Gray, University of Alabama

File Formats: Image Files
- Organize and store digital images that are composed of either pixel or vector (geometric) data
- Bitmap-based
  - Created by scanners and digital cameras
  - TIF, JPG, BMP
- Vector-based
  - Geometric description + bitmap
  - Resolution-independent and infinitely scalable
  - Font, DRW, CGM

File Formats: Music and Audio Files
- Store audio data produced by analog-to-digital converters
- Key parameters
  - Sample rate, resolution (bits per sample), number of channels
- Uncompressed formats
  - WAV, AIFF, and AU
- Lossless compression formats
  - FLAC, Windows Media Audio (WMA) Lossless
- Lossy compression formats
  - MP3, lossy Windows Media Audio (WMA)
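The three key parameters determine the size of uncompressed audio directly. The sketch below is an illustration only: the helper name `pcm_size_bytes` and the CD-quality values are assumptions, not taken from the slides. It checks the arithmetic against a WAV stream written in memory with Python's standard `wave` module.

```python
import io
import wave


def pcm_size_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Size of uncompressed PCM audio data, excluding any file header."""
    return sample_rate_hz * (bits_per_sample // 8) * channels * seconds


# Assumed example values: one second of 44.1 kHz, 16-bit, stereo audio.
rate, bits, channels, seconds = 44_100, 16, 2, 1
expected = pcm_size_bytes(rate, bits, channels, seconds)

# Write that much silence into an in-memory WAV stream.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(channels)
    wav.setsampwidth(bits // 8)
    wav.setframerate(rate)
    wav.writeframes(bytes(expected))

# Read the parameters back from the header and recompute the data size.
buffer.seek(0)
with wave.open(buffer, "rb") as wav:
    stored = wav.getnframes() * wav.getsampwidth() * wav.getnchannels()
```

Lossless and lossy formats store the same signal in fewer bytes, so this figure is an upper bound for them.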

File Formats: Text Files
- File formats that are structured as plain text, representing a sequence of lines
- ASCII, TXT

File Formats: Compound File Formats
- Used to structure the contents of a document within a single file
- Contain a number of independent data streams that are organized in a hierarchy
  - Stream: analogous to a file in a file system
  - Storage: analogous to a sub-directory in a file system
- MS Office, OpenOffice

Characteristics of Generic File Formats
- Can handle one or two data types
  - Numeric data or alphanumeric data
- May limit the file size
  - Mostly limited to a maximum file size of 2 GB
- File I/O time may increase linearly as the file size grows
Source: An In-Depth Examination of Java I/O Performance and Possible Tuning Strategies

These generic file formats are not appropriate for storing and retrieving scientific data because they were not designed to manage high volumes of complex scientific data, such as high-resolution images, massive numerical data, and graphs.
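The slide does not say where the 2 GB ceiling comes from. One common cause, offered here as an assumption, is that a format or API stores byte offsets in signed 32-bit integers. The hypothetical helper `fits_in_int32_offset` shows where such an offset stops being representable.

```python
import struct

# Largest value a signed 32-bit integer can hold: one byte short of 2 GiB.
MAX_INT32 = 2**31 - 1


def fits_in_int32_offset(offset):
    """Return True if the byte offset can be stored as a signed 32-bit integer."""
    try:
        struct.pack("<i", offset)
        return True
    except struct.error:
        return False


last_addressable = fits_in_int32_offset(MAX_INT32)       # still representable
first_unaddressable = fits_in_int32_offset(MAX_INT32 + 1)  # overflows
```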

Scientific Data Format: NetCDF3
- Network Common Data Form
- Machine-independent file format
  - Supports a wide variety of platforms, including Linux, MacOS, and Windows
- Represents multi-dimensional arrays with ancillary data
[Figure: a sequence of multi-dimensional arrays, from Time = 1 to Time = n]
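A minimal sketch of the idea of multi-dimensional arrays with ancillary data, assuming a simplified model of named dimensions, variables defined over them, and attributes. The `Dataset` and `Variable` classes are hypothetical and are not the netCDF library API.

```python
from dataclasses import dataclass, field
from math import prod


@dataclass
class Variable:
    name: str
    dimensions: tuple  # names of the dimensions, outermost first
    attributes: dict = field(default_factory=dict)  # ancillary data, e.g. units


@dataclass
class Dataset:
    dimensions: dict  # dimension name -> length
    variables: dict = field(default_factory=dict)
    attributes: dict = field(default_factory=dict)

    def add_variable(self, var):
        # A variable may only use dimensions the dataset has declared.
        missing = [d for d in var.dimensions if d not in self.dimensions]
        if missing:
            raise ValueError(f"undefined dimensions: {missing}")
        self.variables[var.name] = var

    def shape(self, name):
        return tuple(self.dimensions[d] for d in self.variables[name].dimensions)


ds = Dataset(dimensions={"time": 4, "lat": 3, "lon": 2},
             attributes={"title": "example"})
ds.add_variable(Variable("temperature", ("time", "lat", "lon"), {"units": "K"}))

temperature_shape = ds.shape("temperature")
cell_count = prod(temperature_shape)
```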

Scientific Data Format: HDF5
- Hierarchical Data Format
- File format for managing any kind of data
  - Supports high-volume and/or complex data
  - Platform-independent
  - Flexible, efficient storage and I/O
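The hierarchical organization can be pictured as a tree of groups with datasets at the leaves, addressed by slash-separated paths. The sketch below models that tree with nested dictionaries; `create_dataset` and `read` are hypothetical helpers for illustration, not HDF5 library calls.

```python
def create_dataset(root, path, data):
    """Store data at a slash-separated path, creating groups as needed."""
    *groups, name = path.strip("/").split("/")
    node = root
    for group in groups:
        node = node.setdefault(group, {})
    node[name] = data


def read(root, path):
    """Return the group (dict) or dataset found at the given path."""
    node = root
    for part in path.strip("/").split("/"):
        node = node[part]
    return node


root = {}
create_dataset(root, "/experiment/run1/temperature", [290.1, 291.4])
create_dataset(root, "/experiment/run1/pressure", [1012, 1009])

run1_members = sorted(read(root, "/experiment/run1"))
pressure = read(root, "/experiment/run1/pressure")
```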

Characteristics of the Scientific Data File Formats
- Self-descriptive
  - Contain metadata describing the types of the contained data and how they are organized
- Directly accessible
  - Arbitrary data can be accessed through APIs
- Concurrently accessible
  - Multiple threads or processes can access data simultaneously
  - Enables high-performance computing and faster access
- Archivable
  - Have their own archiving mechanisms to back up and restore a high volume of data
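To make "self-descriptive" concrete, here is a toy container, assumed for illustration and unrelated to the real NetCDF or HDF5 layouts, in which a metadata header tells the reader how to decode the payload that follows.

```python
import json
import struct


def pack_self_describing(name, values):
    """Prefix the binary payload with a length-tagged JSON metadata header."""
    meta = {"name": name, "dtype": "d", "count": len(values)}
    header = json.dumps(meta).encode("utf-8")
    payload = struct.pack(f"<{len(values)}d", *values)
    return struct.pack("<I", len(header)) + header + payload


def unpack_self_describing(blob):
    """Decode the payload using only the metadata stored in the blob itself."""
    (header_len,) = struct.unpack_from("<I", blob, 0)
    meta = json.loads(blob[4:4 + header_len].decode("utf-8"))
    fmt = "<" + str(meta["count"]) + meta["dtype"]
    values = struct.unpack_from(fmt, blob, 4 + header_len)
    return meta, list(values)


blob = pack_self_describing("salinity", [1.5, 2.5, -3.0])
meta, values = unpack_self_describing(blob)
```

The reader needs no prior knowledge of the element type or count, which is the property the slide attributes to scientific formats.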

Challenges in Using the Scientific Data File Formats
- Formats use different representations to organize the file structure
  - Each file format needs its own data visualization and composition tools
  - It is difficult to exchange data between two or more scientific data formats
- Managing the evolution of APIs
  - It is challenging to verify that APIs evolve in step with the file specification
- Maintaining the stability of existing applications under API evolution
  - User applications are subject to API changes
- Limited support for data integration among heterogeneous scientific data formats

Framework for Scientific Data File Management


Model-Driven Engineering (MDE) and Domain-Specific Modeling (DSM)
- MDE: specifies and generates software systems based on high-level models
- Domain-Specific Modeling (DSM): a paradigm of MDE that uses notations and rules from an application domain
- Metamodel: defines a Domain-Specific Modeling Language (DSML) by specifying the entities and their relationships in an application domain
- Model: an instance of the metamodel
- Model Transformation: a process that converts one or more models to various levels of software artifacts (e.g., other models, source code)
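The relationship among metamodel, model, and model transformation can be sketched in a few lines. Everything here is a hypothetical toy: the two-entity `METAMODEL`, the `conforms` check, and the `to_code` generator are assumptions for illustration, not the metamodel used in this work.

```python
# Metamodel: the entity types of the domain and which children each may contain.
METAMODEL = {
    "Group": {"children": {"Group", "Dataset"}},
    "Dataset": {"children": set()},
}


def conforms(node, metamodel=METAMODEL):
    """Check that a model is a valid instance of the metamodel."""
    kind = node["type"]
    if kind not in metamodel:
        return False
    for child in node.get("children", []):
        if child["type"] not in metamodel[kind]["children"]:
            return False
        if not conforms(child, metamodel):
            return False
    return True


def to_code(node, parent_path=""):
    """Model transformation: turn a model into lines of (made-up) source code."""
    path = parent_path + "/" + node["name"]
    lines = ["create_" + node["type"].lower() + "(" + repr(path) + ")"]
    for child in node.get("children", []):
        lines.extend(to_code(child, path))
    return lines


model = {"type": "Group", "name": "ocean", "children": [
    {"type": "Dataset", "name": "salinity"},
]}
bad_model = {"type": "Dataset", "name": "salinity", "children": [
    {"type": "Group", "name": "ocean"},
]}
generated = to_code(model)
```

`bad_model` is rejected because the metamodel does not allow a Dataset to contain a Group.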

Unifying the representation of file structure organization
- Adopt a DSML to build a tool for visualizing and composing scientific file formats in a unified way
[Figure: process flow]
1. Analyze the data model of each scientific file format. Output: Feature Model
2. Define the DSML from the Feature Model. Outputs: Common Data Model, Variable Data Model, Grammar & Syntax
3. Implement the DSML. Output: DSML Tool
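Splitting a feature model into a common data model and a variable data model amounts to separating what all formats share from what is specific to each. The feature sets below are simplified assumptions chosen for illustration; they are not the feature model from the slides.

```python
# Assumed, simplified feature sets for two formats.
FEATURES = {
    "NetCDF3": {"dimension", "variable", "attribute"},
    "HDF5": {"group", "dataset", "attribute", "datatype"},
}

# Common data model: features present in every format.
common_features = set.intersection(*FEATURES.values())

# Variable data model: features specific to each format.
variable_features = {
    fmt: features - common_features for fmt, features in FEATURES.items()
}
```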

Unifying the representation of file structure organization
- Feature Model for Scientific File Format

Unifying the representation of file structure organization
- Content Composer
  - DSML modeling tool for scientific data files
  - Implemented using GEMS (Generic Eclipse Modeling System)

API Abstraction Layer
- Helps protect user applications from the evolution of APIs

Layer        File-creation call
NetCDF       int nc_create(const char *path, int cmode, int *ncidp)
HDF5         H5File(const char *name, unsigned int flags)
Abstraction  createFile(const char *path, FileCreationProperty fileCreationProperty)
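A sketch of how such a layer can isolate applications from format-specific calls. The backends are stubs that only return the text of the native call they would make; the class names, the suffix-based dispatch, and the `overwrite` flag are assumptions made for illustration. `NC_CLOBBER`, `NC_NOCLOBBER`, `H5F_ACC_TRUNC`, and `H5F_ACC_EXCL` are the real creation flags of the two libraries.

```python
from abc import ABC, abstractmethod


class ScientificFileBackend(ABC):
    """Format-specific side of the abstraction layer."""

    @abstractmethod
    def create_file(self, path, overwrite):
        """Return a description of the native call used to create the file."""


class NetCDFBackend(ScientificFileBackend):
    def create_file(self, path, overwrite):
        cmode = "NC_CLOBBER" if overwrite else "NC_NOCLOBBER"
        return "nc_create(" + repr(path) + ", " + cmode + ", &ncid)"


class HDF5Backend(ScientificFileBackend):
    def create_file(self, path, overwrite):
        flags = "H5F_ACC_TRUNC" if overwrite else "H5F_ACC_EXCL"
        return "H5File(" + repr(path) + ", " + flags + ")"


# Hypothetical dispatch rule: choose the backend from the file suffix.
BACKENDS = {".nc": NetCDFBackend(), ".h5": HDF5Backend()}


def create_file(path, overwrite=False):
    """The single call that user applications depend on."""
    for suffix, backend in BACKENDS.items():
        if path.endswith(suffix):
            return backend.create_file(path, overwrite)
    raise ValueError("no backend registered for " + repr(path))


netcdf_call = create_file("a.nc", overwrite=True)
hdf5_call = create_file("a.h5")
```

If a native signature changes, only the corresponding backend class needs to change; callers of `create_file` are unaffected.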

Integrating data among heterogeneous data formats
- Content Mapper
  - Defines rules for mapping data from one scientific data format to another
- Content Verifier
  - Verifies the correctness of the file composition
  - Verifies the correctness of the mapping rules
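The two roles can be sketched as a rule table plus a check that runs before any data is mapped. The rule set, the element kinds, and both function names are hypothetical; the slides do not specify how the Content Mapper or Content Verifier is implemented.

```python
# Assumed mapping rules from source element kinds to target element kinds.
MAPPING_RULES = {
    "dimension": "dataspace",
    "variable": "dataset",
    "attribute": "attribute",
}


def verify_rules(rules, source_kinds, target_kinds):
    """Return a list of problems; an empty list means the rules are usable."""
    errors = []
    for kind in sorted(source_kinds):
        if kind not in rules:
            errors.append("no rule for source kind " + repr(kind))
    for source, target in sorted(rules.items()):
        if target not in target_kinds:
            errors.append("rule for " + repr(source)
                          + " targets unknown kind " + repr(target))
    return errors


def map_content(elements, rules):
    """Apply the rules to a list of (kind, name) elements."""
    return [(rules[kind], name) for kind, name in elements]


source_kinds = {"dimension", "variable", "attribute"}
target_kinds = {"group", "dataset", "dataspace", "attribute"}

problems = verify_rules(MAPPING_RULES, source_kinds, target_kinds)
mapped = map_content([("variable", "temperature"), ("attribute", "units")],
                     MAPPING_RULES)
broken = verify_rules({"variable": "table"}, {"variable", "dimension"},
                      target_kinds)
```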

Summary
- From the prototype of the framework
  - A DSML can help to build a graphical tool to compose and support interoperability across scientific file structures
  - Adopting a layered architecture in the framework can help to maintain the independence of each layer
  - Both the API abstraction layer and the layered architecture are essential to develop and maintain user applications
- Future work
  - Create metamodels that include the full specification of each scientific file format
  - Categorize APIs according to their intended use for the API abstraction layer
  - Develop metamodels for managing API evolution

Thank you!

Example of Scientific Data Format: OPeNDAP
- Client-server protocol for scientific data access
- Originally targeted at oceanographic data management