
1 An Intelligent Retrieval System for Chinese Agricultural Scientific Literature
Ping Qian, Xiaolu Su
Scientech Documentation and Information Center, Chinese Academy of Agricultural Sciences, China
{pingq, suxiaolu}@mail.caas.net.cn

2 Introduction
Finding the desired information quickly and accurately within huge information resources has become a serious obstacle to developing and using network information resources. This project attempts to apply new theory and technology to explore a solution to this problem. Ontology-based knowledge engineering, currently an active research area, provides an important theoretical foundation and applied technology for knowledge discovery and acquisition.

3 Information Retrieval Based on Ontology
- Build up the domain ontology
- Create the database, referring to the ontology
- Conduct the retrieval with the help of the ontology
- Process the results, then display them
Establishment Process of the System
- Import the classification method based on ontology theory
- Create the agricultural navigation information database
- Create the index database (agricultural scientific literature database)
- Create the Web information retrieval system
- Display the results
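The retrieval step can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the keyword-to-class mapping, the record entries, and the rule that a subclass shares the class-number prefix of its parent are all hypothetical and are not taken from the slides.

```python
# Hypothetical keyword -> class numbers mapping (stands in for the navigation database).
KEYWORD_TO_CLASSES = {
    "rice": ["S511"],
    "pest control": ["S43", "S763"],
}

# Hypothetical index database: record id -> (class number, title).
RECORDS = {
    1: ("S511", "Rice cultivation techniques"),
    2: ("S511.2", "Hybrid rice breeding"),
    3: ("S435", "Control of crop insect pests"),
    4: ("F326", "Grain economy"),
}


def retrieve(keyword):
    """Return records filed under a class linked to the keyword, or under one of its subclasses.

    Assumption: a subclass number starts with its parent's class number.
    """
    classes = KEYWORD_TO_CLASSES.get(keyword, [])
    hits = []
    for rec_id, (class_no, title) in sorted(RECORDS.items()):
        if any(class_no.startswith(c) for c in classes):
            hits.append((rec_id, class_no, title))
    return hits


for hit in retrieve("rice"):
    print(hit)
```

The point of the sketch is that the query is expanded through the classification structure first, so a record indexed under a subclass is found even when its title does not contain the keyword.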

4 Foundation of Building the Agricultural Scientech Navigation Information Database
- Theory: Ontology
- Data source: Agricultural Scientech Literature Database (more than 560,000 records)
- Tool: Statistical analysis
- Standard: Chinese Library Classification Method

5 Stages of Building the Agricultural Navigation Information Database
1. Agricultural Theoretical Classification Tree
2. Agricultural Actual Classification Tree
3. Class - Keyword Cross Table
4. Keyword - Class Cross Table
5. Agricultural Navigation Information Database

6 Agricultural Theoretical Classification Tree
- Component: all of the classes relevant to the Chinese Library Classification Method
- Purpose: solve the problems in creating the actual classification tree:
  - The relation between a class number and its name
  - The hierarchical relation among some class numbers
- Data amount: classes and subclasses: 42,948; first-layer classes: 17

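7 First-Order Class Names in the Theoretical Tree
No.  Class  Class name                                                          Records
1    S      Agriculture, Agricultural Sciences                                  470,213
2    F      Economics                                                            47,503
3    T      Industrial Technology                                                23,555
4    Q      Biological Sciences                                                  10,440
5    X      Environmental Science, Labor Protection Science (Safety Science)      6,252
6    P      Astronomy, Earth Sciences                                             1,109
7    G      Culture, Science, Education, Sports                                   1,106
8    O      Mathematical Sciences and Chemistry                                     433
9    U      Transportation                                                          398
10   R      Medicine, Health                                                        391
11   C      Social Sciences (General)                                               209
12   D      Politics, Law                                                           102
13   Z      General Works                                                            22
14   N      Natural Sciences (General)                                               21
15   K      History, Geography                                                       19
16   H      Language, Linguistics                                                     5
17   V      Aviation, Aerospace                                                       2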
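A per-class record count like the one in this table could be produced by a simple statistical pass over the class numbers. The Python sketch below is an assumption about how such a count might be made, not the authors' procedure: it treats the leading letter of a class number as its first-order class, and the sample class numbers are made up.

```python
from collections import Counter


def first_order_counts(class_numbers):
    """Count records per first-order class, taken as the leading letter of each class number."""
    counts = Counter(cn.strip()[0].upper() for cn in class_numbers if cn.strip())
    # Largest classes first; ties are broken alphabetically so the order is stable.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical class numbers, one per record.
sample = ["S511", "S435", "F326", "TS2", "S66", "Q94"]
print(first_order_counts(sample))
```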

8 Agricultural Actual Classification Tree
- Component: all of the classes actually used in indexing
- Purpose:
  - Found the navigation information database
  - Reveal the actual distribution of agricultural information, to find new growth points in the development of agricultural sciences
- Data amount: 21,391 classes, of which coordinated classes: 10,748; non-coordinated classes: 10,643

9 Agricultural Actual Classification Tree
- Key point: more than 100,000 class numbers and their corresponding class names
- Solution:
  - Create professional modeled class tables (9)
  - Create modeled class tables (6), among them:
    - General modeled class tables (2)
    - Professional modeled class tables (4)

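10 Modeled Class Table
Table name   Modeled range  Range name                                                          Model class number
f401_406     F407.1/.9      Economy of individual industrial sectors                            F401/406
s220         S221/229       Various agricultural machinery and implements                       S220
s50          S51/59         Various crops                                                       S50
s60          S63/68         Various horticultural crops                                         S60
S763_30      S763.31/.49    Various insect pests and their control                              S763.30
s821         S822/829.9     Various livestock                                                   S821
s831         S823/839       Various poultry                                                     S831
s881_884_9   S885.1/.9      Other silkworm species                                              S881/884.9
s965         S943           Diseases and natural enemies of various fish, and their control     S965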

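11 General Compound Class Table
Table name  Name                            Records / Fields
fb2         World regions compound table    F401/4065
fb3         China regions compound table    S2204

Professional Compound Class Table
Table name  Compound range  Range class name                                        Records
F33_37      F33/37          Agricultural economy of individual countries            21
F43_47      F43/47          Industrial economy of individual countries              19
S727_728    S727/728        Afforestation by forest type and in special areas       5
S79         S791/796        Various forest tree species                             8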

12 Examples of Modeled Class Table

13 Examples of General Modeled Class Table

14 Examples of Professional Modeled Class Table

15 Class - Keyword Cross Table (17,582)

16 Keyword - Class Cross Table
- Before removing duplicates: about 1,210,000 words
- After removing duplicates: about 320,000 words
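A minimal Python sketch of how the two cross tables might be derived from indexed records follows. The record layout (one class number plus a keyword list) and the sample values are assumptions for illustration; the slides do not give the actual database fields. Storing the entries in sets is one plausible reading of the duplicate-removal step.

```python
from collections import defaultdict

# Hypothetical indexed records: (class number, keywords).
records = [
    ("S511", ["rice", "cultivation"]),
    ("S511", ["rice", "breeding"]),
    ("S435", ["rice", "pest control"]),
]


def build_cross_tables(records):
    """Build class -> keywords and keyword -> classes tables; sets drop repeated pairs."""
    class_to_keywords = defaultdict(set)
    keyword_to_classes = defaultdict(set)
    for class_no, keywords in records:
        for kw in keywords:
            class_to_keywords[class_no].add(kw)
            keyword_to_classes[kw].add(class_no)
    return class_to_keywords, keyword_to_classes


class_to_keywords, keyword_to_classes = build_cross_tables(records)

# Pairs before and after duplicate removal.
raw_pairs = sum(len(kws) for _, kws in records)
unique_pairs = sum(len(kws) for kws in class_to_keywords.values())
print(raw_pairs, unique_pairs)
print(sorted(keyword_to_classes["rice"]))
```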

17 Agricultural Navigation Information Database
- Determine the regulations for organizing the information
- Make XML files for the navigation information
- Choose the database management system
- Define the database structure

18 The Regulations for Organizing the Information
- Never lose any class or subclass that has records
- Display order: classes with more records are listed first, then from higher class layers to lower ones
- If a node has no records and only one sub-node, the node is deleted and its sub-node is moved up to the upper layer
- Subclasses below the third layer are merged up into the third-layer class
- Subclasses with fewer than 30 records are ignored temporarily
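The rules above can be sketched as a tree transformation in Python. This is one possible interpretation, not the project's implementation: the `Node` type, the meaning of `records` (records indexed directly under a class), the choice to compare the 30-record threshold against a subtree total, and the sample tree are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    class_no: str
    records: int = 0  # records indexed directly under this class (assumption)
    children: List["Node"] = field(default_factory=list)


def total(node):
    """Records under a node, including all of its descendants."""
    return node.records + sum(total(c) for c in node.children)


def organize(node, depth=1, max_depth=3, min_records=30):
    """Apply the merge, collapse, threshold and ordering rules to a subtree."""
    if depth >= max_depth:
        # Merge everything below the third layer up into the third-layer class.
        node.records = total(node)
        node.children = []
        return node
    kept = []
    for child in node.children:
        child = organize(child, depth + 1, max_depth, min_records)
        # A node with no records and exactly one sub-node is replaced by that sub-node.
        while child.records == 0 and len(child.children) == 1:
            child = child.children[0]
        # Subclasses with fewer than min_records records are skipped for now.
        if total(child) >= min_records:
            kept.append(child)
    # Classes with more records are listed first.
    kept.sort(key=total, reverse=True)
    node.children = kept
    return node


# Hypothetical tree rooted at a first-layer class.
tree = Node("S", 0, [
    Node("S4", 50, [Node("S43", 20), Node("S41", 35)]),
    Node("S5", 0, [Node("S51", 100, [Node("S511", 40, [Node("S511.2", 10)])])]),
])
result = organize(tree)
print([(c.class_no, total(c)) for c in result.children])
```

In this sample, S5 has no records and a single sub-node, so S51 moves up in its place, and the records of S511 and S511.2 are merged into S51.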

19 XML Files for Navigation Information (33 MB)
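The slide shows the XML files only as a screenshot, so their schema is not known here. As a rough sketch, navigation entries could be serialized with Python's standard `xml.etree.ElementTree` module as below; the element name `class`, the attribute names, and the `navigation` root are hypothetical, while the two sample entries reuse the class names and record counts from slide 7.

```python
import xml.etree.ElementTree as ET


def class_element(number, name, records, children=()):
    """Build one <class> element; element and attribute names are illustrative only."""
    elem = ET.Element("class", {"number": number, "name": name, "records": str(records)})
    for child in children:
        elem.append(child)
    return elem


nav = ET.Element("navigation")  # hypothetical root element
nav.append(class_element("S", "Agriculture, Agricultural Sciences", 470213))
nav.append(class_element("F", "Economics", 47503))

xml_text = ET.tostring(nav, encoding="unicode")
print(xml_text)
```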

20 Data Check and Display Menu

21 Database Management System
- Relational database
  - XML-enabled database: needs conversion, low efficiency
- Native XML database
  - Software AG Tamino: reads XML data directly, saves data in XML format

22 Define Database Structure

23 System Framework
- XMLDBMS/RDBMS + XML + Java/JSP
- Browser/Server three-layer system structure
- Environment for running JSP and XML: Java SDK 1.3.1, Xalan 2.2.0, Tomcat 3.2

24 Demo of The Retrieval System

25 Registration

26 Login

27 Browse Retrieval

28 Enter Keyword

29 Display the Results

30 Second-Order Retrieval

31 Retrieval from the Tree Directly

32

33 Intelligent Retrieval

34 Refined Retrieval

35

36 Conclusion
Establishing the agricultural scientific navigation information database and developing its Web search system change the traditional retrieval method from one based on keywords to one based on a knowledge organization structure. This is also foundational work. The actual classification table and the cross tables between classes and keywords established in the project are valuable Chinese agricultural semantic resources. They are useful for further studies on the automatic recognition and classification of agricultural information, as well as for constructing a strict agricultural domain ontology. This work is just the beginning of the study of ontology and its application in agriculture.

37 The End. Thanks to all.

