
1 Sanskrit and Natural Language Processing: A New Paradigm in Sanskrit Education
Center for Advanced Studies and Research in Shabdabodha and NLP
RASHTRIYA SANSKRIT VIDYAPEETHA, DEEMED UNIVERSITY, Tirupati (A.P.)

3 Present situation of Sanskrit
- Sanskrit colleges are like a 'zoo'!
- No Govt. support unless we are productive!
- Sanskrit is being neglected
- How long will this support continue?
- A great tradition of learning is being lost
- No scope for novel research

4 Innovation is the key
- Sanskrit Shastras are competent enough to enter the world of science.
- Move out of the Humanities and merge with the sciences.
- Analogy: Maths, Psychology, Logic.
- We must find a practical approach for these Sanskrit sciences.

5 We have lost 80%
- Meemamsa: no practical approach!
- Nyaya: no use in modern dialectics?
- Vyakarana: no application?
- What to do?

6 Relevance of Sanskrit Shastras in Modern Technology
Fortunately, these shastras are found to be relevant to today's technology:
- Computing ideas in Panini
- Text processing principles in Meemamsa
- Formal languages in Nyaya
We lack the technology and the application area.

7 Know How…
- Ultimate aim: finding an appropriate place for Sanskrit Shastras
- Method: solutions to contemporary problems, adopting modern technology
- Resource needed: adequate manpower to act as a bridge between modern scientists and technologists on one side and Sanskrit scholars on the other

8 Change the scenario
Slide labels: Technology; Western Theories; INDIAN THEORIES

9 Opportunities missed
- Industrial revolution: we missed this through some hasty decisions
- IT revolution: Indians are serving at the coding level, not at the design level!
- Knowledge revolution: we should take advantage of this

10 Need of the hour
- We need to understand how technology works in order to understand contemporary problems
- Then we will be able to offer solutions in the light of the shastras and show the relevance of Indian theories

11 History and Progress
- The conference on “Knowledge Representation and Sanskritam”, held at Bangalore in Dec 1987, generated tremendous interest
- Nothing much has been achieved, except some small-scale efforts and projects here and there, and those only in technical institutions
- Time is running out!
- What progress has been made since then?

12 Who would bell the cat?
- It needs long interaction between technologists and traditional Sanskrit scholars
- Technical institutions are always ready for such activities
- Not much interest is seen in Sanskrit institutions
- It is we Sanskritists who should bell the cat

13 Long process
- Like the extraction of ghee from milk
- No miracle happens in the initial stage
- It is a big challenge; one or two persons are not enough
- We need hundreds of dedicated persons to achieve a small goal
- A person can climb a small hill; a team can climb Everest

14 Platform For Innovation
- To achieve this, Rashtriya Sanskrit Vidyapeetha has set up a new, innovative centre for advanced study and research in shabdabodha and language technology
- The Center has faculty from shabdabodha (Nyaya, Vyakarana, Meemamsa), NLP and computer science
- The Center has a full-fledged computer lab

15 Possible areas
- Machine Translation
- Speech Processing
- Summary Extraction from huge texts
- Indo Wordnet as a base for IL-wordnets
- Developing Tools for IL Researchers
- Knowledge Representation schemes

16 Machine Translation
English to Indian Languages:
- Word sense disambiguation
- Karaka & Syntax Relation
- Word-grouping
- Idiomatic Expression
- Shabdasutra
MT among Indian Languages:
- Bi-language Electronic Dictionaries
- Karaka & Vibhakti Relation

17 Summary Extraction
- Meemamsa principles are applied to extract the summary of a text
- The Upakramaadi Tatparya Lingas are used to extract the summary of a text at the Indian Institute of Science, Bangalore, under our consultancy

18 Wordnet / Concept-net based on NN ontology
Wordnet is an electronic lexical reference system designed on the basis of semantic relations between words:
- Synonymy {Graha, nivaasa, …}
- Hypernymy {Amra, vriksha, vanaspati, …}
- Antonymy {Shreemaan, akinchana}
- Meronymy {nAsika, mukha, shariira, …}
- Gradation {Shushka, …tara, …tama}
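The relations listed on this slide can be pictured as labelled links between words. The following is only a minimal sketch of that idea, not the design of Indo Wordnet or of any existing system: the class `MiniWordnet`, its method names and the relation labels `hypernym`, `part_of` and `antonym` are all hypothetical, and the words are the slide's own examples in the slide's spelling.

```python
from collections import defaultdict


class MiniWordnet:
    """Toy lexical network: relation name -> word -> set of related words.
    Hypothetical illustration only; not an existing wordnet API."""

    def __init__(self):
        self.relations = defaultdict(lambda: defaultdict(set))

    def add(self, relation, source, target, symmetric=False):
        """Record that `source` stands in `relation` to `target`."""
        self.relations[relation][source].add(target)
        if symmetric:
            self.relations[relation][target].add(source)

    def related(self, relation, word):
        """Words directly linked to `word` by `relation`, sorted."""
        return sorted(self.relations[relation].get(word, ()))

    def chain(self, relation, word):
        """Follow `relation` step by step (first target each time),
        stopping when there is no further link or a cycle is met."""
        path = [word]
        while True:
            targets = self.related(relation, path[-1])
            if not targets or targets[0] in path:
                return path
            path.append(targets[0])


wn = MiniWordnet()
# Hypernymy: mango -> tree -> plant
wn.add("hypernym", "amra", "vriksha")
wn.add("hypernym", "vriksha", "vanaspati")
# Meronymy: nose is part of face, face is part of body
wn.add("part_of", "nAsika", "mukha")
wn.add("part_of", "mukha", "shariira")
# Antonymy holds in both directions
wn.add("antonym", "shreemaan", "akinchana", symmetric=True)

print(wn.chain("hypernym", "amra"))
print(wn.chain("part_of", "nAsika"))
print(wn.related("antonym", "akinchana"))
```

Storing each relation under its own label keeps hierarchical links (hypernymy, meronymy) separate from symmetric ones (antonymy), so each can be queried on its own.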

19 Sanskrit Corpus
- Annotating the relations in Sanskrit texts
- Tagging Samasas
- Identifying the topics of the texts
- Making Sanskrit texts available, along with simple translations, on the web and on CD-R
- Statistical analysis of Sanskrit texts

20 Knowledge Engineering Representation
- For data representation, several database management systems are available
- For representing and retrieving useful information, there are various worked-out methodologies
- Finally, knowledge representation needs special treatment, and this is where Indian knowledge systems can be applied

21 Knowledge and its importance in AI
- AI researchers are interested in building intelligent systems
- Web technologies are looking forward to the Semantic Web instead of the syntactic web
- Knowledge is more valuable than data and information
- Data: a simple DoB. Information: the age calculated from it. Knowledge: the judgment about suitability for the job at hand, etc.
- This requires a lot of inputs from various knowledge sources
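The data / information / knowledge distinction on this slide can be traced in a few lines. This is a hedged sketch: the function names `age_on` and `suitable_for_job` and the age limits 21 to 35 are assumptions made for illustration, and a real suitability judgment would draw on many more knowledge sources than age alone.

```python
from datetime import date


def age_on(dob, today):
    """Information: completed years of age, derived from the raw DoB."""
    years = today.year - dob.year
    # Subtract one year if this year's birthday has not yet occurred.
    if (today.month, today.day) < (dob.month, dob.day):
        years -= 1
    return years


def suitable_for_job(age, min_age=21, max_age=35):
    """Knowledge: a judgment that applies an external rule to the information.
    The age limits are hypothetical and stand in for real eligibility rules."""
    return min_age <= age <= max_age


dob = date(1990, 6, 15)                  # data: a bare date of birth
age = age_on(dob, date(2024, 6, 14))     # information: age computed from it
print(age, suitable_for_job(age))        # knowledge: judgment for the job
```

The date alone says nothing about suitability; the judgment appears only once a rule from outside the data is applied, which is the point the slide makes about knowledge needing additional inputs.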

22 Sansk-Net
- A gigantic online electronic library of Sanskrit works
- More than 500 works (3,00,00,000 pages of e-content)
- Dhathuratnakara is available on the web. It can be accessed at http:/

23 CD-R Production
- Paniniya Udaharanakosha is now available in CD form
- The 'koshas' Vachaspathyam and Sabdakalpadruma will be made available in CD form
- Dhaturatnakara: all the forms of all roots will be made available on CD-R
- Morphological analyzer for Sanskrit

24 Valmikiramayana on NET
- Valmikiramayana moolam in all Indian scripts
- Audio recording
- Translation in five foreign languages
- Eight Sanskrit commentaries
- English translation and commentaries
- Summary, glossary
- Beautiful picture gallery

25 Sanskrit language processing tools
- Sandhi concatenator: ready
- Morphological analyser: hosted on the web
- Sandhi splitter: in progress
- Samasa tag interpreter: ready
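To give a flavour of what a sandhi concatenator does, here is a minimal sketch covering only a few vowel-sandhi rules (merging of similar vowels, guna, and a/A before ai/au). It is not the Centre's tool: the function name `join_sandhi` and the ASCII transliteration (capital letter for a long vowel) are assumptions, and consonant sandhi, visarga sandhi and the remaining vowel rules are left untouched.

```python
# Vowel classes in an assumed ASCII transliteration (capital = long vowel).
A_CLASS = {"a", "A"}
I_CLASS = {"i", "I"}
U_CLASS = {"u", "U"}


def join_sandhi(first, second):
    """Join two words, applying a small subset of vowel-sandhi rules.
    Any combination not covered here is concatenated unchanged."""
    if not first or not second:
        return first + second
    last, head = first[-1], second[0]
    stem, rest = first[:-1], second[1:]

    # a/A followed by the diphthongs ai/au yields ai/au.
    if second[:2] in ("ai", "au"):
        return stem + second if last in A_CLASS else first + second

    # Similar vowels merge into the corresponding long vowel.
    for vowels, long_vowel in ((A_CLASS, "A"), (I_CLASS, "I"), (U_CLASS, "U")):
        if last in vowels and head in vowels:
            return stem + long_vowel + rest

    # Guna: a/A + i/I -> e, a/A + u/U -> o.
    if last in A_CLASS and head in I_CLASS:
        return stem + "e" + rest
    if last in A_CLASS and head in U_CLASS:
        return stem + "o" + rest

    return first + second


print(join_sandhi("deva", "indra"))
print(join_sandhi("mahA", "utsava"))
print(join_sandhi("vidyA", "Alaya"))
print(join_sandhi("muni", "indra"))
```

A sandhi splitter has to run such rules in reverse, where one surface form may correspond to several possible splits; that ambiguity is one reason splitting is the harder of the two tasks.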

26 Future Projects
- Text-to-speech for Sanskrit texts
- High-quality search engine for the Sanskrit e-library
- Hypertext archive for Sanskrit literature

27 Dream Projects
- Paninian Grammar for English (MT): groundwork is done; a national symposium has been conducted
- Validity checking of the Paninian system through computing: basic teaching material is ready
- Sanskrit Wordnet: a prototype project has been undertaken by a student

