Mr. JOTL: A User Friendly Matching Software. Stéphane Lhuillery, Julio Raffo & Fernando Lladós. December 2010, 2nd "NameGame" APE-INV workshop.


1 Mr. JOTL: A User Friendly Matching Software
Stéphane Lhuillery, Julio Raffo & Fernando Lladós
December 2010, 2nd "NameGame" APE-INV workshop

2 Outline
Background
Objectives & Rationale
Results
User Friendly Software
–Concept
–Alpha test
Further steps

3 Background
Automatic patent retrieval is becoming a necessity given the size of data sets.
A growing literature looks at this NameGame:
–On firms' names: Derwent, 2002; Magerman et al., 2006; Hall, 2006; Thoma et al., 2007.
–On inventors' names: Trajtenberg et al., 2006; Hoisl, 2006; Lissoni et al., 2006; Mariani et al., 2007; Raffo & Lhuillery, 2009; etc.
Our ESF Project outcomes:
–New matching best practices
–APE-INV database

4 Objectives of the NameGame
Minimize false positives (= higher precision)
Minimize false negatives (= higher recall)
Maximizing true positives?
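The two objectives on this slide correspond to the standard precision and recall measures. The following is a minimal sketch, not taken from the presentation: the function name `precision_recall` and the counts are hypothetical and only illustrate how fewer false positives raise precision while fewer false negatives raise recall.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) for a set of proposed matches.

    precision = TP / (TP + FP): share of proposed matches that are correct.
    recall    = TP / (TP + FN): share of true matches that were found.
    """
    proposed = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / proposed if proposed else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


# Hypothetical counts, for illustration only.
precision, recall = precision_recall(true_pos=80, false_pos=20, false_neg=10)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

With these made-up counts, removing false positives would push precision towards 1, while recovering the missed matches would push recall towards 1.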

5 Rationale behind: A three step game

6 Examples on matching (EPFL)

7 Examples on filtering (EPFL)

8 What have we learned so far?
General
–Matching algorithms are not perfect, but they considerably improve the results.
Cleaning step
–The origin of the data substantially changes the data preparation process.
Matching step
–There is a hierarchy pattern across algorithms, although it is specific to each particular case.
Filtering step
–The availability of supplementary data enhances or constrains the disambiguation process.
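The three steps named on this slide (cleaning, matching, filtering) can be sketched as a small pipeline. This is an illustration under stated assumptions, not the authors' implementation and not Mr. JOTL's code: the function names, the `city` field, the 0.85 threshold and the sample records are all hypothetical, and Python's `difflib.SequenceMatcher` ratio stands in for whichever matching algorithms the authors compared.

```python
import unicodedata
from difflib import SequenceMatcher


def clean(name):
    """Cleaning step: drop accents, punctuation and case differences."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    spaced = "".join(c if c.isalnum() else " " for c in no_accents)
    return " ".join(spaced.lower().split())


def match(records, threshold=0.85):
    """Matching step: pair records whose cleaned names are similar enough.

    The threshold is an arbitrary, hypothetical value.
    """
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            a = clean(records[i]["name"])
            b = clean(records[j]["name"])
            if SequenceMatcher(None, a, b).ratio() >= threshold:
                pairs.append((i, j))
    return pairs


def filter_pairs(records, pairs, field="city"):
    """Filtering step: keep pairs that agree on a supplementary field."""
    return [(i, j) for i, j in pairs if records[i][field] == records[j][field]]


# Made-up records, for illustration only.
records = [
    {"name": "Müller, Jean-Pierre", "city": "Lausanne"},
    {"name": "MULLER JEAN PIERRE", "city": "Lausanne"},
    {"name": "Muller, Jean-Pierra", "city": "Geneva"},
    {"name": "Smith, Anna", "city": "Lausanne"},
]

candidates = match(records)
print(candidates)                         # [(0, 1), (0, 2), (1, 2)]
print(filter_pairs(records, candidates))  # [(0, 1)]
```

In this toy run the matching step alone proposes three pairs, and the supplementary field removes the two pairs involving the Geneva record. The same filter would wrongly discard a true match whenever an inventor has moved, which is one way supplementary data can constrain as well as enhance disambiguation.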

9 Why create user-friendly software?
[Diagram labels: PATSTAT / APE-INV Database; Survey; PATVAL; EU FW Program; SCOPUS; ISI Thomson]

10 Concept behind Mr. JOTL
Intuitive for beginner users
Flexible on inputs and their preparation
Fair variety of standard matching processes
Adaptable on the disambiguation filters
But soundly customizable for advanced users
Conceived and coded to be expanded in the future by multiple developers

11 From concept to reality (OK, for the moment just an alpha!)

12 Inputs (IPTS, Sevilla, May 2010)

13 Parsing

14 Matching

15 Disambiguation (SSM)

16 LET’S TEST IT!

17 Technical notes
OS supported (so far):
–Windows XP, Vista, 7 (Server & x64)
Coded in C#
–Pros: free development environment; low cost of entry; large developer community
–Cons: proprietary language and libraries; less efficient memory management
Libraries needed:
–Scintilla: open-source lexer and syntax highlighter
Customizable code:
–C# & VBA
Suggested environment for future development:
–Visual Studio (Express version is free to use)
–Mono on Linux

18 Further developments
Fully coding the existing algorithms.
Testing performance against large datasets (> 1 million records).
Pre-setting standard routines (as XML).
Drafting documentation (+ video).
Proof-testing with first-time users (at EPFL).

19 Openness and its governance
How to share it?
–GitHub?
–Forums
How to develop a dynamic sharing community?

20 Thank you!

