CLAIMS CLassification Automated InforMation System


1 CLAIMS CLassification Automated InforMation System
Computer-Assisted Categorisation of Patent Documents in the International Patent Classification
Patrick Fiévet, CLAIMS Project Manager, WIPO & Caspar J. Fall, CLAIMS Consultant, ELCA
ICIC’03, Nîmes, 22 October 2003

2 Agenda
Introduction to CLAIMS project (PF)
Computer-assisted categorization prototypes (CJF)
CLAIMS Categorizer perspectives (PF)

3 1. Introduction to CLAIMS Project

4 1.1 CLAIMS Context
World Intellectual Property Organization (WIPO)
International Patent Classification (IPC)
Classification Automated Information System (CLAIMS)

5 1.2 CLAIMS Project Objectives
IT support for the promotion of the IPC
IPC reform and revision support
IPC tutorials
Translation and natural language search in the IPC
IPC categorization assistance to patent offices

6 2. Computer-assisted Categorization

7 2.1 Objectives
Develop a solution for predicting International Patent Classification (IPC) codes
Facilitate accurate classification in small and medium patent offices
Support for documents in multiple languages
Categorization assistance tool
Open questions:
Depth of computer-assisted categorization
What accuracy?

8 2.1 Key issues
Survey of automated categorization research
Patent categorization:
The IPC is a hierarchical classification: 120 classes, 628 subclasses, 69’000 groups
Patents have secondary IPC codes
The categories are modified over time
Vocabulary very diverse and technical
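The hierarchy on this slide can be made concrete with a small sketch that splits a full IPC group symbol into its levels (section, class, subclass, main group, group). The function and type names below are hypothetical and are not part of CLAIMS; the symbol layout assumed is the common printed form such as "A01B 1/02".

```python
import re
from typing import NamedTuple


class IpcSymbol(NamedTuple):
    section: str     # e.g. "A"
    ipc_class: str   # e.g. "A01"
    subclass: str    # e.g. "A01B"
    main_group: str  # e.g. "A01B 1/00"
    group: str       # full symbol, e.g. "A01B 1/02"


# Section letter A-H, two-digit class, subclass letter, then "main/sub" group numbers.
_IPC_RE = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def parse_ipc(symbol: str) -> IpcSymbol:
    """Split a full IPC group symbol into its hierarchical levels (illustrative only)."""
    match = _IPC_RE.match(symbol.strip().upper())
    if match is None:
        raise ValueError(f"not a full IPC group symbol: {symbol!r}")
    section, cls, sub, main, rest = match.groups()
    subclass = f"{section}{cls}{sub}"
    return IpcSymbol(
        section=section,
        ipc_class=section + cls,
        subclass=subclass,
        main_group=f"{subclass} {int(main)}/00",  # a main group always ends in /00
        group=f"{subclass} {int(main)}/{rest}",
    )
```

With this sketch, `parse_ipc("A01B 1/02").subclass` gives `"A01B"`, which is the level at which the slides later report direct categorization.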

9 2.1 Patent categorization approach
Machine-learning method to recognize categories
Statistical distribution of words
Establish training data
Training documents with good IPC codes
210’000 to 830’000 documents
Advantages:
No need for keywords
Easy to train the tools
Can support many languages
Disadvantages:
Never absolute certainty in the results
Difficult to have reliable full automation
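As an illustration of learning categories from the statistical distribution of words, here is a minimal sketch. The slides do not name the algorithm used in the prototype; multinomial naive Bayes with add-one smoothing is swapped in here purely as a familiar example of the idea, and the function names and toy documents are invented.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """docs: iterable of (text, ipc_code). Returns per-category word and document counts."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, code in docs:
        word_counts[code].update(text.lower().split())
        doc_counts[code] += 1
    return word_counts, doc_counts


def rank(text, word_counts, doc_counts):
    """Return category codes ordered from most to least likely for the given text."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(doc_counts.values())
    scores = {}
    for code, counts in word_counts.items():
        total_words = sum(counts.values())
        score = math.log(doc_counts[code] / total_docs)  # category prior
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        scores[code] = score
    return sorted(scores, key=scores.get, reverse=True)


# Invented toy training data; real training sets hold hundreds of thousands of documents.
TOY_DOCS = [
    ("plough soil tillage blade", "A01B"),
    ("seed planting soil drill", "A01C"),
    ("packet network routing protocol", "H04L"),
    ("network switch packet queue", "H04L"),
]
```

Because the model only counts whitespace-separated tokens, nothing in it is tied to English, which matches the "can support many languages" point in spirit. The ranked output also reflects the stated disadvantage: the result is an ordering of guesses, never a certainty.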

10 2.2 Prototype
Custom development
State-of-the-art algorithm
Language independent
Measure categorization success
Compare the predictions with other manually classified documents
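Comparing predictions with manually classified documents can be sketched as a top-k accuracy measure. This is an assumption about how success might be scored, not the metric the prototype is documented to use here; the function name, the choice of k, and the decision to accept any manually assigned code (primary or secondary) as a hit are all hypothetical.

```python
def top_k_accuracy(predictions, manual_codes, k=3):
    """Fraction of documents with at least one manually assigned code in the top k guesses.

    predictions:  list of ranked code lists, best guess first
    manual_codes: list of sets of codes assigned by human classifiers
    """
    if len(predictions) != len(manual_codes):
        raise ValueError("predictions and manual_codes must have the same length")
    if not predictions:
        return 0.0
    hits = sum(
        1
        for ranked, accepted in zip(predictions, manual_codes)
        if any(code in accepted for code in ranked[:k])
    )
    return hits / len(predictions)


# Invented example: three held-out documents with their manual codes.
predicted = [["A01B", "A01C"], ["H04L", "G06F"], ["G06F", "H04L"]]
manual = [{"A01B"}, {"G06F", "H04N"}, {"B60K"}]
print(top_k_accuracy(predicted, manual, k=1))
print(top_k_accuracy(predicted, manual, k=2))
```

Reporting the score for several values of k shows how much a human reviewer gains from being offered a short list of suggestions rather than a single guess.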

11 2.2 Prototype results

12 2.2 Improving accuracy with category refining
[Diagram: Scenario 1 and Scenario 2, with steps labelled validate, refine, direct]
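One possible reading of the refine step is sketched below: once a subclass has been validated, candidate main groups are restricted to that subclass before ranking. How the original scenarios actually combine validate, refine, and direct is not recoverable from the transcript, so the function name, the score dictionary, and the example values are hypothetical.

```python
def refine(main_group_scores, validated_subclass):
    """Rank only the main groups that belong to the validated subclass, best first.

    main_group_scores: dict mapping symbols such as "A01B 3/00" to a classifier score
    validated_subclass: subclass symbol such as "A01B", confirmed beforehand
    """
    candidates = {
        group: score
        for group, score in main_group_scores.items()
        if group.split()[0] == validated_subclass
    }
    return sorted(candidates, key=candidates.get, reverse=True)


# Invented scores: the highest-scoring group overall lies outside the validated subclass.
scores = {"A01B 1/00": 0.2, "A01B 3/00": 0.5, "A01C 7/00": 0.9}
print(refine(scores, "A01B"))
```

Restricting the candidate set this way means an error at main group level cannot leave the subclass that has already been confirmed.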

13 2.3 Conclusions
It works well!
Useful user assistance
Direct categorization at subclass level possible
IPC codes can be refined accurately to main group level
To get accurate results, one needs:
Large datasets
Good category coverage
Accurate IPC codes
Read the proceedings for more details
Demonstration available after the presentation

14 3. IPCCAT

15 3.1 CLAIMS Categorizer Perspectives
1. Implementation: IPCCAT
2. Training sets for IPC categorization: English, French, Spanish, Russian, German, and possibly Chinese
3. IPC data sets improvement and Categorizer retraining

16 3.2 CLAIMS Categorizer Perspectives
4. Improve integration of the IPC Categorizer with other CLAIMS tools
5. CLAIMS policy for distribution of data sets in various languages

17 3.2 Access to IPCCAT for PCT
Login: IBGST01 Password: clobterib

18 Questions / Answers Patrick Fiévet:

19 Thank you for your attention

