Building Database-backended Multilingual, Multimedia Data Repositories: The aAQUA Experience.



Developmental Informatics Lab
Introduction
aAqua (almost All Questions Answered)
–An online forum where experts in the field answer questions from the grassroots.
Bridges gaps in the use of ICT
–Usability
–Availability
–Multi-Linguality
–Multi-media Support
–Multi-Lingual Storage and Retrieval
–Reusability

Usability

A Sample Thread

aAqua in Operation

[Figure: aAqua system diagram. Labels: aAqua Server, Crop Doctor, Crop Recommendation, Keyword Browser, Bhav Puchiye, aAqua, Internet, HTTP, aAqua Offline, Mobile network, aAqua Mobile Gateway, SMS.]

aAqua Demo

aAQUA: a technical perspective
–Employs a three-tier web architecture.
–Uses mvnForum, which is based on the MVC architecture.
–Lucene is used as the search engine.
–Compatible with any servlet container that supports JSP 1.2 and Servlet 2.3.
–Runs on Tomcat.
–Works with Unicode (UTF-8) compliant Oracle 9i as well as MySQL databases.
–Is integrated with open source digital library software.

Multi-Linguality

Multi-lingual Storage and Retrieval
[Figure labels: Query in Hindi (“flowers scorch”); UNL Document; Info repository (“…The plants blossom but the flowers scorch…”); UNL Document; Result in Hindi; UNL graph with node flower(icl>reproductive]

Unicode
Computers store letters and other characters by assigning a number to each.
Hundreds of different encoding systems exist for assigning these numbers.
Before Unicode, no single encoding could contain enough characters.
Universal encoded character set
–Enables information from any language to be stored using a single character set.
–Provides a unique code value for every character, regardless of the platform, program, or language.
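
The idea of one number per character can be checked directly. The sketch below is only an illustration, not part of aAqua: the helper name describe and the sample characters are assumptions chosen for this example.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# Sample characters (chosen for illustration): Latin, Latin with diaeresis,
# Devanagari (used for Hindi), and Kannada.
samples = ["A", "\u00f6", "\u0915", "\u0c95"]

for ch in samples:
    # Each character has a single code value, whatever the script.
    print(ch, describe(ch))
```

The same number identifies the character on every platform; how that number is written out as bytes is a separate choice of encoding.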

Unicode standard
UTF-8 encoding
–Popular with HTML.
–A way of transforming all Unicode characters into a variable-length encoding of bytes.
–The Unicode characters corresponding to the familiar ASCII set have the same byte values as ASCII.
–UTF-8 can be used with much existing software without extensive software rewrites.
UTF-16 encoding
–Used when efficient access to characters is needed together with economical use of storage.
–Most of the heavily used characters fit into a single 16-bit code unit, while all other characters are accessible via pairs of 16-bit code units.
–Better compatibility with Java.
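
A minimal sketch of these properties, using Python's built-in codecs. The helper hex_bytes and the sample characters are assumptions made for this illustration; big-endian UTF-16 without a byte order mark is used so the code units are easy to read.

```python
def hex_bytes(text: str, encoding: str) -> str:
    """Encode text and return the bytes as an uppercase hex string."""
    return text.encode(encoding).hex().upper()


# ASCII characters keep the same single byte in UTF-8.
assert hex_bytes("A", "utf-8") == hex_bytes("A", "ascii")

# Sample characters: ASCII letter, Latin letter with diaeresis,
# Devanagari letter, and a character outside the 16-bit range.
samples = ["A", "\u00f6", "\u0915", "\U00010402"]

for ch in samples:
    utf8 = hex_bytes(ch, "utf-8")        # 1 to 4 bytes, variable length
    utf16 = hex_bytes(ch, "utf-16-be")   # one or two 16-bit code units
    print(f"U+{ord(ch):04X}  UTF-8={utf8}  UTF-16={utf16}")
```

The last sample needs four bytes in UTF-8 and a pair of 16-bit code units in UTF-16, while the first three fit in a single 16-bit unit.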

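Unicode Encodings
[Table: sample characters with their UTF-8 and UTF-16 byte values (columns: UTF-8, UTF-16, Characters). Most cells were lost in extraction; surviving fragments include C3B6 (UTF-8 for ö), E68480, and D801DC02 (a UTF-16 pair of 16-bit code units).]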

Unicode and the Web
The preferred encoding form for Unicode characters on the web is UTF-8.
The HTTP header of a document should contain the line
–Content-Type: text/html; charset=utf-8 (for HTML files)
–Content-Type: text/plain; charset=utf-8 (for TEXT files)
Or, in an HTML document, add the following line under the HEAD element
–<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
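
One way to send the header is sketched below as a small WSGI application. This is an assumption-laden illustration rather than aAqua's servlet code: the function name app and the page content are hypothetical, and Python's WSGI interface stands in for the servlet container.

```python
def app(environ, start_response):
    """Minimal WSGI application that serves a UTF-8 encoded HTML page."""
    # Hypothetical page content containing Hindi text.
    page = "<html><head><title>aAqua</title></head><body>नमस्ते</body></html>"
    body = page.encode("utf-8")  # bytes on the wire must match the declared charset

    headers = [
        # Declares the encoding so the browser decodes the bytes as UTF-8.
        ("Content-Type", "text/html; charset=utf-8"),
        # Length is counted in bytes, not characters.
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

To try it locally, the standard library's wsgiref.simple_server.make_server("", 8000, app).serve_forever() can host the application. The declared charset and the encoding actually used for the body have to agree, otherwise non-ASCII text is displayed incorrectly.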

Creating Unicode databases
MySQL/Oracle
–CREATE DATABASE database_name CHARACTER SET character_set
–CREATE DATABASE confluence CHARACTER SET utf8;
–Oracle 9i also supports UTF-16 (AL16UTF16, available as the national character set).
Postgres
–CREATE DATABASE database_name WITH ENCODING 'UTF8';
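
Whether a database really preserves multilingual text can be checked with a round trip. The sketch below uses SQLite from Python's standard library purely as a stand-in, because it needs no server; it is not one of the databases named above, and the table name posts, the helper round_trip, and the sample strings are assumptions for this illustration.

```python
import sqlite3


def round_trip(texts):
    """Store texts in an in-memory database and read them back unchanged."""
    conn = sqlite3.connect(":memory:")
    try:
        # Ask the database which text encoding it uses internally.
        encoding = conn.execute("PRAGMA encoding").fetchone()[0]
        conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)")
        # Parameter binding passes the strings through without manual escaping.
        conn.executemany(
            "INSERT INTO posts (body) VALUES (?)", [(t,) for t in texts]
        )
        rows = [r[0] for r in conn.execute("SELECT body FROM posts ORDER BY id")]
        return encoding, rows
    finally:
        conn.close()


# English, Hindi (Devanagari), and Kannada sample strings.
samples = ["flowers scorch", "\u092b\u0942\u0932", "\u0cb9\u0cc2\u0cb5\u0cc1"]
encoding, stored = round_trip(samples)
print(encoding, stored == samples)
```

With MySQL, Oracle, or Postgres the same check applies, but the client connection must also be configured for a Unicode character set in addition to the database itself.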

Thank You