Project Tukaram Sagar Tamhane

1 Project Tukaram Sagar Tamhane
Centre for Indian Language Technology Solutions, IIT Bombay, 12 June 2002

3 The Goal
To make Saint Tukaram’s Abhangas available over the web for browsing and searching.
To locate the Abhangas that the user needs.
To present the results to the user in order of importance.

4 The Source
The Abhangas were typed from the book “श्री तुकारामबावांच्या अभंगांची गाथा” (Shri Tukarambavanchya Abhanganchi Gatha), published on 6th November 1973 by the Govt. of Maharashtra.
Previous editions: 1950 and 1955.
Number of Abhangas: 4644

5 Creation of Web Content
Software used for typing: MS Word with the Akruti_Priya_Expanded font and the Akruti keyboard driver
Problems faced:
Non-displayable characters, e.g. [character image not preserved] was typed as mna
Automated page splitting

6 Converters Used
Akruti_Priya_Expanded to ISCII converter: required for indexing the text
ISCII to Monolingual ISFOC converter: required for displaying the text through DV-TTYogesh
XDVNG to ISCII converter: for converting query strings to ISCII
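The slides do not describe how the converters listed above work internally. Since the deck reports them as written in Lex and C, and Lex always picks the longest matching rule, one plausible shape is a table-driven, longest-match rewrite. The following is a minimal sketch under that assumption, written in Python only for brevity; the function name `convert` and the mapping `TOY_TABLE` are hypothetical and do not reproduce any real Akruti, XDVNG, ISCII, or ISFOC code table.

```python
def convert(text, table):
    """Rewrite `text` by greedy longest-match lookup in `table` (source string -> target string)."""
    if not table:
        return text
    max_len = max(len(key) for key in table)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, as a Lex-generated scanner would.
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in table:
                out.append(table[chunk])
                i += size
                break
        else:
            # No rule matched: pass the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


# Hypothetical toy table, NOT a real font-to-ISCII mapping.
TOY_TABLE = {"a": "1", "ab": "2", "c": "3"}

print(convert("abac", TOY_TABLE))  # 213
```

Longest-match matters because font encodings often spell one character as several keystrokes, so a shorter rule must not fire when a longer one applies.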

7 Technologies used for the Tukaram Search Engine
Input technology: Jtrans (XDVNG font)
Keyboard mapping: phonetic English
Result display at client: ISFOC
Encoding for indexing (storage): ISCII

8 Architecture
[Figure: Architecture]

9 Input Technology
[Figure: Input Technology]

10 Components of the Search Engine
Index: case-sensitive; ISCII; database structure
Searcher: in-memory search; algorithm: hybrid of hashing and binary search
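The slide names the search algorithm as a hybrid of hashing and binary search but does not say how the two are combined. One possible reading, sketched below as an assumption, is to hash each word to a bucket by its first character and then binary-search a sorted word list inside that bucket. The class name `HybridIndex`, the input shape, and the sample documents are hypothetical, and plain Python strings stand in for ISCII-encoded text.

```python
from bisect import bisect_left
from collections import defaultdict


class HybridIndex:
    """Toy in-memory index: hash on the first character, binary search inside the bucket."""

    def __init__(self, documents):
        # documents: dict mapping a document id to its text (hypothetical input shape).
        postings = defaultdict(set)
        for doc_id, text in documents.items():
            for word in text.split():
                postings[word].add(doc_id)  # case-sensitive: words are stored as typed

        # Hash step: one bucket per first character. Each bucket keeps a sorted
        # word list and a parallel list of sorted document ids.
        self.words = defaultdict(list)
        self.doc_ids = defaultdict(list)
        for word in sorted(postings):
            self.words[word[0]].append(word)
            self.doc_ids[word[0]].append(sorted(postings[word]))

    def lookup(self, word):
        """Return the sorted ids of documents containing `word`, or an empty list."""
        if not word or word[0] not in self.words:
            return []
        bucket = self.words[word[0]]
        i = bisect_left(bucket, word)  # binary search step
        if i < len(bucket) and bucket[i] == word:
            return self.doc_ids[word[0]][i]
        return []


index = HybridIndex({1: "sun moon star", 2: "moon moon sky", 3: "Moon sun"})
print(index.lookup("moon"))  # [1, 2]
print(index.lookup("Moon"))  # [3]
```

The hash step narrows the search to a small bucket in constant time, and the binary search keeps each bucket compact and ordered without a second hash table per bucket.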

11 Database Structure
[Figure: Database Structure]

12 Snapshot of result
[Figure: snapshot of a search result]

13 Relevancy Criteria
Number of query words in the abhang
Position
Adjacency
Total number of words in the abhang
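The slide lists the four relevancy criteria but not how they are weighted or combined. The sketch below shows one way such a score could be assembled. Everything beyond the four criteria themselves is an assumption: the function names `relevance` and `rank`, the weights, the reading of "position" as "earlier matches score higher", and the reading of "total number of words" as "shorter abhangas score higher".

```python
def relevance(query_words, abhang_words):
    """Score one abhang (a list of words) against a query (a list of words)."""
    first_pos = {}
    for i, word in enumerate(abhang_words):
        if word in query_words and word not in first_pos:
            first_pos[word] = i  # first occurrence of each query word
    if not first_pos:
        return 0.0

    n = len(abhang_words)
    # Criterion 1: fraction of distinct query words found in the abhang.
    matched = len(first_pos) / len(set(query_words))
    # Criterion 2 (assumed direction): an earlier first match scores higher.
    position = 1.0 - min(first_pos.values()) / n
    # Criterion 3: fraction of consecutive query-word pairs that are adjacent in the abhang.
    pairs = list(zip(query_words, query_words[1:]))
    adjacent = sum(
        1 for a, b in pairs
        if any(abhang_words[i] == a and abhang_words[i + 1] == b for i in range(n - 1))
    )
    adjacency = adjacent / len(pairs) if pairs else 0.0
    # Criterion 4 (assumed direction): a shorter abhang scores higher.
    brevity = 1.0 / n

    # Hypothetical weights, chosen only for illustration.
    return 3.0 * matched + 1.0 * position + 2.0 * adjacency + 1.0 * brevity


def rank(query_words, abhangas):
    """Return abhang ids ordered from most to least relevant; non-matching ones are dropped."""
    scored = [(relevance(query_words, words), doc_id) for doc_id, words in abhangas.items()]
    return [doc_id for score, doc_id in sorted(scored, key=lambda item: -item[0]) if score > 0]


abhangas = {
    1: ["c", "d", "a", "e"],
    2: ["a", "b", "c", "d"],
    3: ["c", "a", "d", "b"],
    4: ["x", "y"],
}
print(rank(["a", "b"], abhangas))  # [2, 3, 1]
```

Normalising each criterion to a comparable range before weighting keeps any one of them from dominating by scale alone; the actual balance used in the project is not stated in the slides.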

21 General information
Number of abhangas: 4,644
Total number of words: 2,09,702
Number of distinct words: 34,773
Languages used for converters: Lex & C
Language used for search engine: Java 2
Scripting on client side: JavaScript

22 Thank You

