Project Tukaram
Sagar Tamhane
Centre for Indian Language Technology Solutions, IIT Bombay
12 June 2002

The Goal
  Make Saint Tukaram's Abhangas available over the web for browsing and searching.
  Locate the right Abhangas that you need.
  Present the pages to the user in order of importance.

The Source
  The Abhangas are typed from a book called “श्री तुकारामबावांच्या अभंगांची गाथा” (Shri Tukarambavanchya Abhanganchi Gatha), published on 6th November 1973 by the Govt. of Maharashtra.
  Previous editions: 1950 and 1955.
  Number of Abhangas: 4644

Creation of Web Content
  Software used for typing: MS Word with the Akruti_Priya_Expanded font and the Akruti keyboard driver
  Problems faced:
    Non-displayable characters, e.g. a character (image not preserved) was typed as "mna"
    Automated page splitting

Converters Used
  Akruti_Priya_Expanded to ISCII converter: required for indexing the text
  ISCII to Monolingual ISFOC converter: required for displaying the text through DV-TTYogesh
  XDVNG to ISCII converter: for converting query strings to ISCII

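The converters listed on this slide all map text from one encoding to another. A minimal sketch of one way such a converter could work is a table-driven, longest-match rewrite, similar in spirit to what a Lex specification does. The class name `TableConverter` and the idea of passing the mapping table in from outside are assumptions made for illustration; the project's actual converters were written in Lex and C, and the real Akruti, XDVNG, ISFOC, and ISCII code tables are not reproduced here.

```java
import java.util.Map;

/** Hypothetical table-driven converter: rewrites input using the longest matching table key. */
final class TableConverter {
    private final Map<String, String> table;   // source sequence -> target sequence
    private final int maxKeyLength;            // longest source sequence in the table

    TableConverter(Map<String, String> table) {
        this.table = table;
        int max = 0;
        for (String key : table.keySet()) {
            max = Math.max(max, key.length());
        }
        this.maxKeyLength = max;
    }

    /** Converts the input left to right; characters with no table entry are copied unchanged. */
    String convert(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            boolean matched = false;
            // Try the longest candidate first so multi-character sequences win over their prefixes.
            for (int len = Math.min(maxKeyLength, input.length() - i); len >= 1; len--) {
                String target = table.get(input.substring(i, i + len));
                if (target != null) {
                    out.append(target);
                    i += len;
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                out.append(input.charAt(i));
                i++;
            }
        }
        return out.toString();
    }
}
```

Trying the longest key first matters for font encodings in which one target character corresponds to several source characters; with a shortest-first scan the prefix would be consumed before the longer sequence could match.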
Technologies used for the Tukaram Search Engine
  Input technology: Jtrans, XDVNG font
  Keyboard mapping: phonetic English
  Result display at client: ISFOC
  Encoding for indexing (storage): ISCII

Architecture (figure)

Input Technology (figure)

Components of the Search Engine
  Index
    Case-sensitive ISCII
    Database structure
  Searcher
    In-memory search
    Algorithm: hybrid of hashing and binary search

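The slide does not say how hashing and binary search are combined. One plausible reading, sketched here, is to hash a word's first character to a bucket and then binary-search a sorted word array inside that bucket. The class `HybridIndex`, the choice of the first character as the hash key, and the use of Java `String` in place of ISCII byte strings are all assumptions for illustration, not the project's actual design.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory index: hash on the first character, binary search inside the bucket. */
final class HybridIndex {
    /** One bucket: words in sorted order plus, at the same position, the abhang ids for each word. */
    private static final class Bucket {
        String[] words;
        List<List<Integer>> postings;
    }

    private final Map<Character, TreeMap<String, List<Integer>>> building = new HashMap<>();
    private final Map<Character, Bucket> buckets = new HashMap<>();

    /** Adds the words of one abhang. Words are compared case-sensitively. */
    void add(int abhangId, String text) {
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) continue;
            List<Integer> ids = building
                    .computeIfAbsent(word.charAt(0), k -> new TreeMap<>())
                    .computeIfAbsent(word, k -> new ArrayList<>());
            // Record each abhang once per word, even if the word repeats inside it.
            if (ids.isEmpty() || ids.get(ids.size() - 1) != abhangId) {
                ids.add(abhangId);
            }
        }
    }

    /** Freezes the index into sorted arrays. Call once after all add() calls. */
    void finish() {
        for (Map.Entry<Character, TreeMap<String, List<Integer>>> e : building.entrySet()) {
            Bucket b = new Bucket();
            b.words = e.getValue().keySet().toArray(new String[0]);
            b.postings = new ArrayList<>(e.getValue().values());
            buckets.put(e.getKey(), b);
        }
        building.clear();
    }

    /** Returns the ids of the abhangas containing the word, or an empty list. */
    List<Integer> search(String word) {
        if (word.isEmpty()) return Collections.<Integer>emptyList();
        Bucket b = buckets.get(word.charAt(0));            // hashing step
        if (b == null) return Collections.<Integer>emptyList();
        int pos = Arrays.binarySearch(b.words, word);      // binary search step
        return pos >= 0 ? b.postings.get(pos) : Collections.<Integer>emptyList();
    }
}
```

With placeholder text, adding abhang 1 as "alpha beta gamma" and abhang 2 as "beta beta delta", then calling `finish()`, makes `search("beta")` return `[1, 2]` and `search("Beta")` return an empty list, since matching is case-sensitive.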
Database Structure (figure)

Snapshot of result (figure)

Relevancy Criteria
  Number of query words in the abhang
  Position
  Adjacency
  Total number of words in the abhang

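The four criteria above could be combined into a single ranking score in many ways; the slide does not give the formula. The sketch below is one hedged possibility: a weighted sum in which more distinct query words, an earlier first match, adjacent query words, and a shorter abhang all raise the score. The class `RelevanceSketch`, the weights, and the direction of the position and length terms are assumptions, not the project's actual ranking.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical relevance score built from the four criteria named on the slide. */
final class RelevanceSketch {
    // Illustrative weights only; the original values are not known.
    static final double W_MATCH = 10.0;
    static final double W_POSITION = 2.0;
    static final double W_ADJACENT = 5.0;
    static final double W_LENGTH = 1.0;

    static double score(List<String> query, List<String> abhang) {
        if (query.isEmpty() || abhang.isEmpty()) return 0.0;

        // Criterion 1: number of distinct query words found in the abhang.
        // Criterion 2: position of the first match (earlier is assumed better).
        Set<String> wanted = new HashSet<>(query);
        Set<String> found = new HashSet<>();
        int firstHit = -1;
        for (int i = 0; i < abhang.size(); i++) {
            if (wanted.contains(abhang.get(i))) {
                found.add(abhang.get(i));
                if (firstHit < 0) firstHit = i;
            }
        }
        if (found.isEmpty()) return 0.0;

        // Criterion 3: adjacency, counted as consecutive query words that also
        // appear next to each other, in the same order, in the abhang.
        int adjacentPairs = 0;
        for (int q = 0; q + 1 < query.size(); q++) {
            for (int i = 0; i + 1 < abhang.size(); i++) {
                if (abhang.get(i).equals(query.get(q))
                        && abhang.get(i + 1).equals(query.get(q + 1))) {
                    adjacentPairs++;
                    break;
                }
            }
        }

        // Criterion 4: total number of words (a shorter abhang is assumed to rank higher).
        double matchPart = W_MATCH * found.size();
        double positionPart = W_POSITION * (1.0 - (double) firstHit / abhang.size());
        double adjacencyPart = W_ADJACENT * adjacentPairs;
        double lengthPart = W_LENGTH / abhang.size();
        return matchPart + positionPart + adjacencyPart + lengthPart;
    }
}
```

Sorting the matching abhangas by this score in descending order would give the "order of importance" presentation; the weights would need tuning against real queries.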
General information
  Number of abhangas: 4,644
  Total number of words: 2,09,702
  Number of distinct words: 34,773
  Languages used for converters: Lex and C
  Language used for search engine: Java 2
  Scripting on client side: JavaScript

Thank You