Published by Harry Franklin. Modified over 9 years ago.
1
Building Database-Backed Multilingual, Multimedia Data Repositories: The aAQUA Experience
2
Developmental Informatics Lab
Introduction
aAQUA (almost All Questions Answered)
–An online forum where experts in the field answer questions from the grassroots.
Bridges gaps in the use of ICT
–Usability
–Availability
–Multi-Linguality
–Multi-media Support
–Multi-Lingual Storage and Retrieval
–Reusability
3
Usability
4
A Sample Thread
5
aAqua in Operation
6
Architecture diagram. Labels: aAqua Server (Crop Doctor, Crop Recommendation, Keyword Browser, Bhav Puchiye); aAqua over the Internet (HTTP); aAqua Offline; aAqua Mobile Gateway over the mobile network (SMS).
7
aAqua Demo
8
aAQUA: a technical perspective
Employs a three-tier web architecture.
Uses mvnForum, which is based on the MVC architecture.
Uses Lucene as the search engine.
Compatible with any servlet container that supports JSP 1.2 and Servlet 2.3.
Runs on Tomcat.
Works with Unicode (UTF-8) compliant databases: Oracle 9i as well as MySQL.
Is integrated with open source digital library software.
9
Multi-Linguality
10
Multi-lingual Storage and Retrieval
Diagram labels: Query in Hindi; UNL Document; Result in Hindi; UNL Document; Info repository
Example query: “flowers scorch”
Sentence in the info repository: …The plants blossom but the flowers scorch…
UNL graph of the sentence:
and(blossom(icl>develop(obj>thing)):0S.@entry.@custom, scorch(icl>dry(obj>thing)):2E.@contrast.@custom)
obj(blossom(icl>develop(obj>thing)):0S.@entry.@custom, plant(icl>organism):04.@def.@pl)
obj(scorch(icl>dry(obj>thing)):2E.@contrast.@custom, flower(icl>reproductive structure):1P.@pl.@def)
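To make the UNL expressions on this slide easier to read, the following is a minimal Python sketch that splits each relation into a label and two nodes and then lists the graph edges. It is not the parser used by aAQUA. The names `UnlNode`, `parse_node`, and `parse_relation` are hypothetical, and the reading of `:0S` as a node id and `.@entry` as an attribute is an assumption based on the notation as it appears here.

```python
from dataclasses import dataclass, field


@dataclass
class UnlNode:
    word: str                                   # universal word with its constraints
    node_id: str                                # id after the last ':' (assumed meaning)
    attrs: list = field(default_factory=list)   # names following '.@'


def split_top_level(text, sep=","):
    """Split text at separators that are not nested inside parentheses."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == sep and depth == 0:
            parts.append(text[start:i])
            start = i + 1
    parts.append(text[start:])
    return [p.strip() for p in parts]


def parse_node(text):
    """Parse 'word(constraints):ID.@attr1.@attr2' into a UnlNode."""
    word, colon, tail = text.rpartition(":")
    if not colon:                 # no id or attributes present
        return UnlNode(text, "")
    node_id, *attrs = tail.split(".@")
    return UnlNode(word, node_id, attrs)


def parse_relation(line):
    """Parse 'rel(node1, node2)' into (rel, UnlNode, UnlNode)."""
    line = line.strip()
    name, paren, rest = line.partition("(")
    if not paren or not rest.endswith(")"):
        raise ValueError(f"not a binary relation: {line!r}")
    left, right = split_top_level(rest[:-1])
    return name, parse_node(left), parse_node(right)


UNL_LINES = [
    "and(blossom(icl>develop(obj>thing)):0S.@entry.@custom, "
    "scorch(icl>dry(obj>thing)):2E.@contrast.@custom)",
    "obj(blossom(icl>develop(obj>thing)):0S.@entry.@custom, "
    "plant(icl>organism):04.@def.@pl)",
    "obj(scorch(icl>dry(obj>thing)):2E.@contrast.@custom, "
    "flower(icl>reproductive structure):1P.@pl.@def)",
]

relations = [parse_relation(line) for line in UNL_LINES]
edges = [(name, a.node_id, b.node_id) for name, a, b in relations]
for edge in edges:
    print(edge)
```

Because the graph is language independent, a query and a stored document can be compared edge by edge regardless of the language each was written in.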
11
Unicode
Computers store letters and other characters by assigning a number to each.
Hundreds of different encoding systems exist for assigning these numbers.
Before Unicode, no single encoding could contain enough characters.
Universal encoded character set
–Enables information from any language to be stored using a single character set.
–Provides a unique code value for every character, regardless of the platform, program, or language.
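The idea of one number per character can be checked directly in Python, whose strings are sequences of Unicode code points. This is only an illustrative sketch. The sample characters and the helper `describe` are chosen for this example and do not come from the slides.

```python
import unicodedata


def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# Latin, accented Latin, Devanagari, and Cyrillic characters share one code space.
samples = ["A", "\u00E1", "\u092B", "\u0424"]

for ch in samples:
    # chr() inverts ord(), so the number identifies the character uniquely.
    assert chr(ord(ch)) == ch
    print(ch, describe(ch))
```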
12
Unicode standard
UTF-8 encoding
–Popular with HTML.
–A way of transforming all Unicode characters into a variable-length encoding of bytes.
–The Unicode characters corresponding to the familiar ASCII set have the same byte values as in ASCII.
–UTF-8 can be used with much existing software without extensive rewrites.
UTF-16 encoding
–Used when efficient access to characters is needed together with economical use of storage.
–Most of the heavily used characters fit into a single 16-bit code unit, while all other characters are accessible via pairs of 16-bit code units.
–Better compatibility with Java.
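The properties listed on this slide can be observed with Python's built-in codecs. The sketch below compares three sample strings that are assumptions made for illustration: an ASCII word, a Hindi word, and one character outside the 16-bit range. The `utf-16-be` codec is used so that no byte order mark is counted.

```python
def utf8_len(text):
    """Number of bytes the text occupies in UTF-8."""
    return len(text.encode("utf-8"))


def utf16_units(text):
    """Number of 16-bit code units the text occupies in UTF-16."""
    return len(text.encode("utf-16-be")) // 2


ASCII_WORD = "aAQUA"
HINDI_WORD = "\u092B\u0942\u0932"   # Hindi word for "flower"
SUPPLEMENTARY = "\U00010402"        # a character beyond U+FFFF

# ASCII text has identical bytes in ASCII and in UTF-8.
assert ASCII_WORD.encode("utf-8") == ASCII_WORD.encode("ascii")

for label, text in [("ascii", ASCII_WORD),
                    ("hindi", HINDI_WORD),
                    ("supplementary", SUPPLEMENTARY)]:
    print(label, len(text), utf8_len(text), utf16_units(text))
```

Each Devanagari character takes three bytes in UTF-8 but a single code unit in UTF-16, which is the storage trade-off the slide refers to.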
13
Unicode Encodings
Character   UTF-8         UTF-16
c           63            0063
á           C3 A1         00E1
t           74            0074
愀          E6 84 80      6100
d           64            0064
ö           C3 B6         00F6
Ф           D0 A4         0424
𐐂          F0 90 90 82   D801 DC02
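The byte values for the characters on this slide follow from the standard UTF-8 and UTF-16 bit layouts. The sketch below spells those layouts out by hand and compares the result with Python's built-in codecs. The function names `utf8_bytes` and `utf16_code_units` are hypothetical and exist only for this example.

```python
def utf8_bytes(cp):
    """Encode one Unicode code point as UTF-8 using the standard bit layout."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not an encodable code point: {cp:#x}")
    if cp < 0x80:                       # 1 byte: 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:                      # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:                    # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),    # 4 bytes: 11110xxx then three 10xxxxxx
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


def utf16_code_units(cp):
    """Encode one code point as a list of one or two 16-bit code units."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not an encodable code point: {cp:#x}")
    if cp < 0x10000:
        return [cp]
    offset = cp - 0x10000               # 20 bits split across a surrogate pair
    return [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)]


CODE_POINTS = [0x63, 0xE1, 0x74, 0x6100, 0x64, 0xF6, 0x424, 0x10402]

for cp in CODE_POINTS:
    # The hand-written encoder must agree with the built-in codec.
    assert utf8_bytes(cp) == chr(cp).encode("utf-8")
    units = " ".join(f"{u:04X}" for u in utf16_code_units(cp))
    print(f"U+{cp:04X}", utf8_bytes(cp).hex(" ").upper(), units)
```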
14
Unicode and the Web
The preferred encoding form for Unicode characters on the web is UTF-8.
The HTTP header of a document should contain the line
–Content-Type: text/html; charset=utf-8 (for HTML files)
–Content-Type: text/plain; charset=utf-8 (for text files)
Or, in an HTML document, add the following line inside the HEAD element
–<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
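Why the charset parameter matters can be seen by encoding a page on the sending side and decoding it on the receiving side. The following is a sketch only, not how aAQUA or any servlet container does it. `build_response` and `decode_response` are hypothetical helpers, and the ISO-8859-1 fallback is an assumption made for this example.

```python
from email.message import Message


def build_response(text, subtype="html"):
    """Encode text as UTF-8 and produce matching HTTP headers."""
    body = text.encode("utf-8")
    headers = {
        "Content-Type": f"text/{subtype}; charset=utf-8",
        "Content-Length": str(len(body)),   # length in bytes, not characters
    }
    return headers, body


def decode_response(headers, body):
    """Decode a body using the charset declared in Content-Type."""
    msg = Message()
    msg["Content-Type"] = headers["Content-Type"]
    # Fallback charset is an assumption for this sketch.
    charset = msg.get_content_charset() or "iso-8859-1"
    return body.decode(charset)


page = "<p>\u092B\u0942\u0932</p>"   # a paragraph containing a Hindi word
headers, body = build_response(page)

# With the declared charset the text survives the round trip.
assert decode_response(headers, body) == page

# Decoding the same bytes with the wrong charset yields garbled text.
assert body.decode("iso-8859-1") != page
```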
15
Creating Unicode databases
MySQL/Oracle
–CREATE DATABASE database_name CHARACTER SET character_set
–CREATE DATABASE confluence CHARACTER SET utf8;
–Oracle 9i also supports UTF-16 as the national character set (NATIONAL CHARACTER SET AL16UTF16).
Postgres
–CREATE DATABASE database_name WITH ENCODING 'UTF8';
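Once a database uses a Unicode character set, multilingual text should come back exactly as it was stored. The sketch below checks that round trip in Python. It uses the standard library's `sqlite3` module purely as a stand-in, since SQLite is not one of the databases named on the slide, and the table name `posts` and helper `round_trip` are hypothetical.

```python
import sqlite3


def round_trip(text):
    """Store text in an in-memory database and read it back.

    Returns the stored text and the encoding the database reports.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)")
        # Parameter binding avoids building SQL from raw strings.
        conn.execute("INSERT INTO posts (body) VALUES (?)", (text,))
        (stored,) = conn.execute(
            "SELECT body FROM posts WHERE id = 1").fetchone()
        (encoding,) = conn.execute("PRAGMA encoding").fetchone()
        return stored, encoding
    finally:
        conn.close()


question = "\u092B\u0942\u0932 flowers scorch"   # mixed Hindi and English text
stored, encoding = round_trip(question)
assert stored == question
print(encoding)
```

The same check applies to MySQL, Oracle, or Postgres with their own drivers, provided the connection itself is also configured for a Unicode encoding.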
16
Thank You