The NBER Patent Data Project: Past data uses and future plans


1 The NBER Patent Data Project: Past data uses and future plans
Prof. Bronwyn H. Hall
University of California at Berkeley, University of Maastricht, NBER, and IFS London

2 Outline
Currently available NBER patent data from the USPTO
Uses of these data
The new PDP (Patent Data Project) at NBER
  What do we add
  Where do we stand now
Discussion
Jan 2006 EPIP Bocconi Workshop

3 NBER Patent Citations Data File
~3 million U.S. patents granted between January 1963 and December 1999 (now updated to 2002)
Patent number, application and grant dates
Name of first inventor; name and type of assignee
Country and state of first inventor
Main US patent class; number of claims; main IPC class from 1976
Number of citations, forward and backward; generality and originality measures based on citations
All citations made to these patents between 1975 and 1999 (over 16 million)
Match of patenting organizations to Compustat (the data set of all firms traded in the U.S. stock market)
Available at emlab.berkeley.edu/users/bhhall/bhdata.html (2002 update)
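The generality and originality measures listed above are usually defined as one minus a Herfindahl concentration index of citation shares across patent classes: forward citations (classes of the citing patents) for generality, backward citations (classes of the cited patents) for originality. The following is a minimal sketch under that assumption. The function name and the class codes in the usage lines are illustrative, not taken from the data file, and no small-sample bias correction is applied.

```python
from collections import Counter


def herfindahl_dispersion(classes):
    """Return 1 - sum of squared class shares, or None when there are no citations.

    `classes` holds one patent-class code per citation:
    classes of citing patents -> generality,
    classes of cited patents  -> originality.
    """
    n = len(classes)
    if n == 0:
        return None  # the measure is undefined for a patent with no citations
    counts = Counter(classes)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


# Illustrative input: four forward citations spread over three classes.
generality = herfindahl_dispersion(["435", "435", "514", "424"])
# All citations from a single class gives the minimum dispersion.
concentrated = herfindahl_dispersion(["435", "435", "435"])
```

A value near 1 means citations are spread over many classes; a value of 0 means they all fall in one class.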

4 Use of NBER patent data
>100 significant research projects (at least one quarter outside the US)
~100 published papers
~50 doctoral dissertations in accounting (1), agric econ (1), econ (22), finance (3), history (1), info tech (1), law (2), management (15), public policy (1), unknown (3)

5 Research areas
Positive
  Individual inventors: migration and co-invention, spillovers
  Organizations, networks, and innovation
  Geography of innovation
  Knowledge spillovers, local and international
  Citations as a value indicator
Normative
  The patent explosion and its implications for firms, the patent office, and social welfare
  Patent policy: legal and administrative; the examination process
  Patent and patent litigation strategy
  University and laboratory patenting

6 The PDP project at NBER
A new project to update and extend the publicly available USPTO data
Principal investigators: Iain Cockburn (BU), Bronwyn Hall (UCB), Walter (Woody) Powell (Stanford), Manuel Trajtenberg (Tel Aviv)
Senior investigators: Ajay Agrawal (Toronto), James Bessen (BU), Stuart Graham (GA Tech), Megan MacGarvie (BU)

7 Database design principles
Accessibility
  Provide public tools (XML-based) to allow others to extract data
Modularity (as in the OECD/EPO effort)
Linking out: an open-source-like environment so that others can link their data to the patent data
  Provide attribution and citation so contributors are recognized
Annotation
  By users, e.g., error correction, identification of SW or gene patents, etc.

8 Tasks and objectives
Update existing data to 2007
Clean and standardize
Compute normalization coefficients to correct for truncation and for differences across fields in citation practice
Additional data (see next slide)
Link-outs to Patstat
Litigation data
Assignee name data
Geo-coded data
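One common way to build such normalization coefficients is to divide each patent's citation count by the mean count of its cohort, where a cohort is all patents with the same grant year and technology field. Recent cohorts have had less time to be cited (truncation) and fields differ in citation practice, so the cohort mean absorbs both effects. The sketch below assumes that approach; the slide does not say which method the project uses, and the record layout and sample values are made up for illustration.

```python
from collections import defaultdict


def cohort_means(patents):
    """Mean citation count for each (grant year, field) cohort."""
    totals = defaultdict(lambda: [0, 0])  # cohort -> [sum of cites, number of patents]
    for p in patents:
        key = (p["year"], p["field"])
        totals[key][0] += p["cites"]
        totals[key][1] += 1
    return {key: s / n for key, (s, n) in totals.items()}


def normalized_citations(patents):
    """Citations received divided by the cohort mean (None if the cohort mean is 0)."""
    means = cohort_means(patents)
    result = {}
    for p in patents:
        m = means[(p["year"], p["field"])]
        result[p["id"]] = p["cites"] / m if m > 0 else None
    return result


# Made-up records: an older cohort with many citations, a recent one with few.
sample = [
    {"id": "A", "year": 1990, "field": "drugs", "cites": 10},
    {"id": "B", "year": 1990, "field": "drugs", "cites": 30},
    {"id": "C", "year": 1998, "field": "drugs", "cites": 1},
    {"id": "D", "year": 1998, "field": "drugs", "cites": 3},
]
scaled = normalized_citations(sample)
```

After scaling, patents A and C get the same score even though A has ten times the raw citations, because each sits at half its own cohort's mean.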

9 Additional data
Detailed tech class info: full set of USPC and IPC codes
Priority information
Foreign application data
Continuation and divisional relationships
Multiple assignee information
Inventors' names for tracking migration, co-invention, etc.
Detailed location info for all inventors
Source of citations (applicants vs. examiners)
Attorney and patent agent names
Reexamination requests and outcomes

11 Currently: cleaning raw data
ASSG - assignees (Cockburn, Agrawal)
CLAS - classification (MacGarvie, Hall)
FREF - foreign references (Cockburn)
GOVT - government interest (Graham)
INVT - inventors (Cockburn, Agrawal)
LREP - legal representatives (not used for now)
OREF - other references (not used for now)
PATN - basic patent info (Bessen)
PRIR - priority information (Graham)
PCTA - PCT information (Cockburn)
REIS - reissue information (Graham)
RLAP - related application info (Graham)
UREF - US references (Cockburn, Trajtenberg)

12 Geographical data (inventor and assignee address)
Data from USPTO (to be cleaned)
  City (8000 Munchen?)
  State (CA vs CA)
  Country
Data from non-USPTO sources
  Regions (SMSA, Canadian provinces, European NUTS/LAU regions)
  Latitude-longitude coordinates
Need to normalize geographical names to match source (e.g., Vienna vs. Wien)
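The city-cleaning step can be sketched as two small operations: strip a leading postal code such as the "8000" in "8000 Munchen", then map local spellings to the spelling used by the reference source. This is a minimal illustration only. The alias table is hypothetical and tiny, and a real one would have to come from whichever gazetteer the regions and coordinates are matched against.

```python
import re

# Hypothetical alias table: local or variant spelling -> name used by the reference source.
CITY_ALIASES = {
    "wien": "Vienna",
    "munchen": "Munich",
}


def clean_city(raw):
    """Strip a leading numeric postal code, then apply the alias table."""
    name = re.sub(r"^\s*\d+\s+", "", raw).strip()  # "8000 Munchen" -> "Munchen"
    return CITY_ALIASES.get(name.casefold(), name)


cleaned = [clean_city(s) for s in ["8000 Munchen", "Wien", "Vienna", "Milano"]]
```

Names with no alias entry, such as "Milano" here, pass through unchanged, which makes unmatched cities easy to list and review by hand.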

13 Assignee names
K U Leuven project for Eurostat
OECD/WIPO/EPO consortium
Derwent World Patent Index
  117 pages of 4-char codes for std names
  A large number of rules (Russian inst, Japanese cos, etc.)
  Rules for nonstandard co codes
IFS project: more later
One question: what to do about the extended character set?
  München vs Munich vs Muenchen vs Munchen
  Derwent uses ue
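The extended-character question comes down to choosing one folding rule and applying it everywhere. The sketch below shows the two rule-based options from the slide: the digraph convention ("ue", as Derwent uses) and plain diacritic stripping. Only the "ü" to "ue" mapping is stated in the source; the other entries in the table are the conventional German transliterations and are an assumption here. Exonyms such as "Munich" cannot be produced by either rule and need a lookup table.

```python
import unicodedata

# "ü" -> "ue" follows the Derwent convention named on the slide;
# the remaining entries are assumed conventional transliterations.
GERMAN_DIGRAPHS = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
}


def to_digraphs(text):
    """Replace umlauts and sharp s with ASCII digraphs (München -> Muenchen)."""
    composed = unicodedata.normalize("NFC", text)  # make sure "ü" is a single code point
    return "".join(GERMAN_DIGRAPHS.get(ch, ch) for ch in composed)


def strip_diacritics(text):
    """Drop combining marks after decomposition (München -> Munchen)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


digraph_form = to_digraphs("München")
stripped_form = strip_diacritics("München")
```

The two rules give different keys for the same city, so a matching pipeline should fold both the patent data and any linked data with the same function before comparing names.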
