Inventor Mobility Index
Thorsten Doherr
Zentrum für Europäische Wirtschaftsforschung (Centre for European Economic Research), Mannheim, Germany
Mission
Mission:
- Decide whether two patents from inventors with the same name are actually from the same person or from different persons who share the same name.
Problem:
- Two inventors with the same name are not necessarily the same person.
- Defining an inventor only by name produces too much false mobility, especially for inventors with common names.
- Restricting the definition too much (e.g. name and home address) cancels any mobility.
Tools:
- The complete patent data
Plausibility Rules
Inventor: A single inventor entry in a patent document
Person: All inventors with a specific name that are linked by at least one plausibility rule
Two inventors with the same name are the same person…
- if they are inventing for the same applicant
- if they have the same home address
- if they are working with the same co-inventors
- if one is citing the other
- if they have patents in the same area of technology (ipc)
Harmonization of Applicants
SearchEngine
The SearchEngine is an in-house software package specialized in company address matching. It implements the following steps:
- Normalizing the search fields (company name, address fields): converting them to uppercase, replacing special letters with their common (phonetic) representation (e.g. Ü → UE, ß → SS), compressing abbreviations (e.g. S.P.A. → SPA) and replacing special characters with blanks.
- Creating a dictionary that contains all words of the search fields along with their occurrence counts. To preserve the context, every search field has its own chapter. The occurrence is the basis of the heuristic search algorithm. Supporting tables link the dictionary entries back to the company table.
- The search algorithm splits a search term into words and associates each word with the occurrence counter of the matching dictionary entry. The occurrence reflects the identification potential of the word: a word with a low occurrence has a high identity, because the resulting list of potential hits is small.
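The normalization and dictionary steps above can be sketched in a few lines of Python. This is only an illustration and not the SearchEngine itself: the function names `normalize` and `build_dictionary`, the regular expressions, and the replacements for Ä and Ö are assumptions; the slide names only Ü → UE, ß → SS and S.P.A. → SPA.

```python
import re
from collections import Counter

# Assumed replacement table: the slide gives only Ü -> UE and ß -> SS.
# Python's upper() already turns ß into SS; Ä and Ö are added by analogy.
SPECIAL_LETTERS = {"Ü": "UE", "Ä": "AE", "Ö": "OE"}

# Runs of single letters separated by dots, e.g. S.P.A. or S.P.A
ABBREVIATION = re.compile(r"(?:\b[A-Z]\.)+(?:[A-Z]\b)?")


def normalize(text):
    """Uppercase, transliterate, compress abbreviations, blank special characters."""
    text = text.upper()
    for letter, replacement in SPECIAL_LETTERS.items():
        text = text.replace(letter, replacement)
    text = ABBREVIATION.sub(lambda m: m.group(0).replace(".", ""), text)
    text = re.sub(r"[^\w\s]|_", " ", text)  # special characters become blanks
    return " ".join(text.split())           # collapse repeated blanks


def build_dictionary(records, fields):
    """Return (chapters, postings).

    chapters[field][word] is the number of records whose field contains the word.
    postings[field][word] is the set of record ids, a stand-in for the
    supporting tables that link dictionary entries back to the company table.
    """
    chapters = {f: Counter() for f in fields}
    postings = {f: {} for f in fields}
    for rec_id, record in enumerate(records):
        for f in fields:
            for word in set(normalize(record.get(f, "")).split()):
                chapters[f][word] += 1
                postings[f].setdefault(word, set()).add(rec_id)
    return chapters, postings
```

Keeping one counter per field mirrors the "chapter" idea: the same word can be rare in a company name but common in a city field, so its identification potential is tracked separately.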
Harmonization of Applicants
Example of the SearchEngine Algorithm

Searching for…
Lear Corporation ITALIA S.p.A.

DICTIONARY - Chapter: APPLICANT_NAME
ENTRY        OCCURS  IDENTITY
…            …       …
CORPORATION  16      1/16 = 0.0625
…            …       …
ITALIA       491     1/491 ≈ 0.0020
…            …       …
LEAR         4       1/4 = 0.25
…            …       …
SPA          6119    1/6119 ≈ 0.0002

Search term  Lear     Corporation  ITALIA  S.p.A.  SUM
Normalized   LEAR     CORPORATION  ITALIA  SPA
Share        79.441%  19.860%      0.647%  0.052%  100%

Result
NAME                            IDENTITY
LEAR CORPORATION ITALIA S.p.A   %
Lear Corporation Italia S.r.l   %
LEAR ITALIA SEATING S.p.A       %
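The percentages in the example follow from the occurrence counts: each word contributes 1/occurrence, and the contributions are scaled so that they sum to 100%. The sketch below reproduces that arithmetic. The function names are hypothetical, and `candidate_identity` rests on an assumption the slide does not confirm, namely that a candidate's identity is the sum of the shares of the search words it contains.

```python
def identity_shares(words, occurrences):
    """Share of each search word in the summed inverse occurrences."""
    raw = {w: 1.0 / occurrences[w] for w in words}
    total = sum(raw.values())
    return {w: value / total for w, value in raw.items()}


def candidate_identity(shares, candidate_words):
    """ASSUMPTION: identity of a candidate = sum of the shares of matched words."""
    return sum(share for word, share in shares.items() if word in candidate_words)


# Occurrence counts taken from the APPLICANT_NAME chapter in the example.
occurrences = {"LEAR": 4, "CORPORATION": 16, "ITALIA": 491, "SPA": 6119}
shares = identity_shares(["LEAR", "CORPORATION", "ITALIA", "SPA"], occurrences)

for word, share in shares.items():
    print(f"{word:12s} {share * 100:7.3f}%")

# Candidate names are given here as already normalized word sets.
print(candidate_identity(shares, {"LEAR", "ITALIA", "SEATING", "SPA"}))
```

Because LEAR occurs only four times, it carries most of the weight, while a very common word such as SPA contributes almost nothing to the match.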
Finalization
- A cutoff limit for the identity (e.g. 90%) is applied to filter all results.
- The resulting list of matching pairs is not symmetric: A can be linked to B without B being linked to A.
- The linked pairs create a network.
- Network analysis: if A is linked to B and B is linked to C, the analysis identifies the group A, B, C.
- For groups that are too large, the network analysis is re-iterated with an increased cutoff limit for their members.
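The finalization steps can be illustrated with a small union-find sketch. The names `group_links` and `group_with_reiteration`, the maximum group size and the step by which the cutoff is raised are assumptions for illustration; the slides give only the example cutoff of 90%.

```python
def _find(parent, x):
    """Return the representative of x, shortening the path on the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def group_links(nodes, links, cutoff):
    """Group nodes connected by links whose identity reaches the cutoff.

    links holds (a, b, identity) triples. Direction is ignored, so an
    asymmetric pair A -> B still places A and B in the same group.
    """
    parent = {n: n for n in nodes}
    for a, b, identity in links:
        if identity >= cutoff:
            parent[_find(parent, a)] = _find(parent, b)
    groups = {}
    for n in parent:
        groups.setdefault(_find(parent, n), set()).add(n)
    return list(groups.values())


def group_with_reiteration(links, cutoff=0.90, max_size=50, step=0.02):
    """Re-run the grouping with a higher cutoff inside groups that are too large.

    max_size and step are hypothetical parameters, not values from the slides.
    """
    links = list(links)
    nodes = {n for a, b, _ in links for n in (a, b)}
    return _regroup(nodes, links, cutoff, max_size, step)


def _regroup(nodes, links, cutoff, max_size, step):
    result = []
    for group in group_links(nodes, links, cutoff):
        if len(group) > max_size and cutoff + step <= 1.0:
            inner = [l for l in links if l[0] in group and l[1] in group]
            result.extend(_regroup(group, inner, cutoff + step, max_size, step))
        else:
            result.append(group)
    return result
```

For example, `group_with_reiteration([("A", "B", 0.99), ("B", "C", 0.93), ("C", "D", 0.5)])` drops the C-D link at the 90% cutoff and keeps A, B and C together; with a small `max_size` the weaker B-C link is cut once the raised cutoff passes 0.93.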
Harmonization of Inventor Names
The SearchEngine is of limited use because…
- it is most efficient with search terms consisting of multiple words
- the main problems are typing errors and misspellings
Creating phonetic representations of the name using the Metaphone algorithm by Lawrence Philips, 1990
- Phonetic algorithms map similar sounding words (names) to the same representation, which can be indexed for direct database access.
- Originally the results they delivered were validated manually because of their strong tendency for false positives. Automated matching therefore requires an automated validation process.
Automated comparison of the retrieved names with the searched name
- The function is based on the least relative character position deltas and requires two words as parameters, so it cannot be used for index-based direct access. It needs phonetic indexing to quickly generate a list of potential candidates.
- Tolerance for typing errors increases with the length of the words, because longer words are more prone to typing errors.
Harmonization of Inventor Names
Example for the Metaphone Search

FIRST NAME  LAST NAME
MR          BRTN
MAURO       BARATONI
MARIO       BERRETTONI
MARIO       BERTINI
MARIO       BERTON
MAURO       BERTONI
MAURO       BORDIN
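The idea of a phonetic index can be sketched as follows. Note that `simple_phonetic_key` is a deliberately simplified stand-in and not the Metaphone algorithm: it keeps a leading vowel, drops all other vowels, collapses doubled letters and maps D to T. That is enough to reproduce the codes MR and BRTN for the names in the example, but it ignores most of Metaphone's rules. All function names are hypothetical.

```python
from collections import defaultdict

VOWELS = set("AEIOU")
SOUND_ALIKE = {"D": "T"}  # assumption: only the mapping needed for the example


def simple_phonetic_key(word):
    """Simplified consonant skeleton, NOT the real Metaphone algorithm."""
    key = []
    previous = ""
    for position, ch in enumerate(word.upper()):
        if not ch.isalpha() or ch == previous:
            continue  # skip non-letters and doubled letters
        previous = ch
        if ch in VOWELS and position > 0:
            continue  # only a leading vowel is kept
        key.append(SOUND_ALIKE.get(ch, ch))
    return "".join(key)


def build_index(names):
    """Index (first name, last name) pairs by their phonetic keys."""
    index = defaultdict(list)
    for first, last in names:
        index[(simple_phonetic_key(first), simple_phonetic_key(last))].append((first, last))
    return index


names = [
    ("MAURO", "BARATONI"), ("MARIO", "BERRETTONI"), ("MARIO", "BERTINI"),
    ("MARIO", "BERTON"), ("MAURO", "BERTONI"), ("MAURO", "BORDIN"),
    ("LUIGI", "BIANCHI"),
]
index = build_index(names)
candidates = index[(simple_phonetic_key("MAURO"), simple_phonetic_key("BERTONI"))]
```

The candidate list for MAURO BERTONI contains all six names from the example, including clearly different persons such as MARIO BERTINI. This is the false-positive tendency that makes a second, automated comparison step necessary.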
Harmonization of Inventor Names
Example for the Least Relative Character Position Deltas
[Figure: comparison of CZARNITZKI with CHARNIZKI, scale 0 to 1.0]
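The slides do not spell out how the comparison function is computed, so the following is only one possible reading of "least relative character position deltas", written as a hypothetical function `position_delta_similarity`. It assumes that every character is placed at a relative position between 0 and 1 within its word, that each character is matched to the nearest occurrence of the same character in the other word, and that a character with no counterpart counts as the maximum delta of 1.

```python
def relative_positions(word):
    """Pair each character with its relative position in [0, 1]."""
    n = len(word)
    return [(ch, i / (n - 1) if n > 1 else 0.0) for i, ch in enumerate(word)]


def _one_way(source, target):
    """1 minus the mean of the least position deltas from source to target."""
    target_positions = relative_positions(target)
    total = 0.0
    for ch, p in relative_positions(source):
        deltas = [abs(p - q) for c, q in target_positions if c == ch]
        total += min(deltas) if deltas else 1.0  # missing character: full penalty
    return 1.0 - total / len(source)


def position_delta_similarity(a, b):
    """Similarity in [0, 1]; ASSUMED reading of the method, not the original code."""
    a, b = a.upper(), b.upper()
    if not a or not b:
        return 0.0
    # Take the weaker direction so that missing letters on either side count.
    return min(_one_way(a, b), _one_way(b, a))


print(position_delta_similarity("CZARNITZKI", "CHARNIZKI"))
print(position_delta_similarity("CZARNITZKI", "BERTONI"))
```

Under these assumptions a single wrong or missing letter costs at most 1/length of the score, so longer names tolerate more typing errors, which is the behaviour the slides describe. Because the function needs both words as input, it can only rank candidates that a phonetic index has already retrieved.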
Plausibility Rules
Inventor: A single inventor entry in a patent document.
Person: All inventors with a specific name that are linked by at least one plausibility rule.
Two inventors with the same name are the same person…
- if they are inventing for the same applicant.
- if they have the same home address.
- if they are working with the same co-inventors.
- if one is citing the other.
- if they have patents in the same area of technology (ipc).
All Patents of an Inventor Name
The Same Applicant Rule
The Same Home Address Rule
The Co-Inventor Rule
The Citation Rule
The IPC Rule
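Taken together, the five rules define when two inventor entries with the same harmonized name are linked, and a person is a set of linked entries. The sketch below is a minimal illustration under assumed data structures: the `Inventor` fields and the function names are hypothetical, and exact equality stands in for the harmonized applicant and address matching described earlier.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Inventor:
    """One inventor entry in a patent document (field names are assumptions)."""
    patent: str
    name: str                             # harmonized inventor name
    applicant: str = ""                   # harmonized applicant
    address: str = ""                     # home address
    co_inventors: frozenset = frozenset()
    cites: frozenset = frozenset()        # patents cited by this patent
    ipc: frozenset = frozenset()          # technology areas


def plausibly_same(a, b):
    """True if at least one plausibility rule links the two entries."""
    return (
        (a.applicant != "" and a.applicant == b.applicant)   # same applicant
        or (a.address != "" and a.address == b.address)      # same home address
        or bool(a.co_inventors & b.co_inventors)              # shared co-inventor
        or a.patent in b.cites or b.patent in a.cites         # one cites the other
        or bool(a.ipc & b.ipc)                                # same technology area
    )


def persons(inventors):
    """Group entry indices into persons: same name and linked by any rule."""
    parent = list(range(len(inventors)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(inventors)), 2):
        if inventors[i].name == inventors[j].name and plausibly_same(inventors[i], inventors[j]):
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(inventors)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```

Links are transitive: if entry 1 shares an applicant with entry 2 and entry 2 shares a co-inventor with entry 3, all three entries form one person even though entries 1 and 3 share nothing directly.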
Italian Inventor Mobility Index
- patents from Italian applicants and inventors
- different harmonized inventor names
- nodes after applying the same applicant rule
- nodes after applying the co-inventor rule
- nodes after applying the citation rule
- nodes after applying the same home address rule
- nodes after applying the ipc rule
Main Database, Citations: Espace Bulletin (March 2010), EPO Patstat (September 2010), OECD
Development: Microsoft Visual FoxPro 9.0
Traversal of a Network Table
FROM  TO
…     …
…     …
GROUP  MEMBER
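A traversal of this kind can be sketched as a breadth-first search that turns the FROM/TO link table into a GROUP/MEMBER table. The original was developed in Visual FoxPro; the Python below is only an illustration, and the function name `traverse` and the numbering of groups are assumptions.

```python
from collections import defaultdict, deque


def traverse(network):
    """Turn (FROM, TO) rows into (GROUP, MEMBER) rows.

    Links are followed in both directions, so every node that can be
    reached from a start node ends up in the same group.
    """
    neighbours = defaultdict(set)
    for source, target in network:
        neighbours[source].add(target)
        neighbours[target].add(source)

    rows = []
    seen = set()
    group_id = 0
    for start in sorted(neighbours):
        if start in seen:
            continue
        group_id += 1               # each unvisited start node opens a new group
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            rows.append((group_id, node))
            for nxt in sorted(neighbours[node]):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return rows


network = [(1, 2), (2, 3), (4, 5)]
group_table = traverse(network)
```

Each node is visited once and each link is inspected twice, so the traversal stays linear in the size of the link table.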