Creating a Longitudinal Research Worker-Establishment Matched Dataset from Patent Data: Description and Application to Understanding International Knowledge Flows


1 Creating a Longitudinal Research Worker-Establishment Matched Dataset from Patent Data: Description and Application to Understanding International Knowledge Flows. SEWP Research Conference, October 19, 2005. Jinyoung Kim (SUNY-Buffalo), Sangjoon John Lee (Alfred University), Gerald Marschke (SUNY-Albany)

2 Issues: Construction of a longitudinal research worker-establishment matched panel dataset; knowledge flow across national borders.

3 Idea: Policy implications for immigration, the labor market, and education; productivity of scientific researchers; transmittal mechanism of knowledge. Technology spillover appears to be geographically limited. Firms access externally located technology partly through hiring of, and collaboration with, researchers from the outside.

4 We examined: 1. Trends in U.S. firms’ access to researchers overseas and to those with foreign research experience from the late 1980s through the 1990s. 2. The role of research personnel as a pathway for the diffusion of ideas from foreign countries to U.S. innovators. 3. The firm-level determinants of accessing innovations developed overseas.

5 Main findings: a. In recent years, U.S. innovators have increasingly accessed researchers residing in foreign countries. The fraction of U.S. residents with foreign research experience in U.S. firms appears to be falling. U.S. pharmaceutical and semiconductor firms are increasingly going to foreign countries to employ such researchers. b. Retaining researchers with overseas research experience seems to facilitate access to innovations developed overseas. c. In the semiconductor industry, smaller firms and older firms are more likely to make use of the output of non-U.S. R&D. d. In the pharmaceutical industry, younger firms are more likely to make use of the output of non-U.S. R&D.

6 Outline Literature Review Data Construction Process Empirical findings Conclusions

7 Literature: Various mechanisms for technology and knowledge transfer across institutional boundaries. Informal contact: Agrawal, Cockburn, and McHale (2003); Von Hippel (1988). Spillovers: Henderson, Jaffe, and Trajtenberg (REStat 1998); Jaffe (AER 1989); Zucker, Darby, and Brewer (AER 1998); Audretsch and Feldman (AER 1996); Mowery and Ziedonis (NBER 2001).

8 Transmission of tacit knowledge: Feldman (1994). Collaboration and hiring: Cohen, Nelson, and Walsh (Mgt Science 2002); Almeida and Kogut (Mgt Science 1999); Zucker, Darby, and Armstrong (NBER 2001); Adams, Black, Clemmons, and Stephan (NBER 2004).

9 Data 1. Patent Bibliographic data (Patents BIB): U.S. utility patents issued between January 1975 and February 2002. Fields: patent ID number, patent application and granting, patent assignee, and geographic information (country, state, city, address) on all inventors involved. The number of patents during this period is 2,493,610 and the number of inventor records is 5,105,754.

10 2. ProQuest Digital Dissertations Abstracts: author, title of dissertation, degree-conferring institution, date of degree, academic field, and type of degree, from over 1,000 North American graduate schools and European universities. Covers those who earned degrees in all natural science and engineering fields between 1945 and 2003: 1,068,551 degree holders.

11 3. The Compact D/SEC: 12,000 publicly traded firms with at least $5 million in assets and at least 500 shareholders. Information obtained from Annual Reports, 10-K and 20-F filings, and Proxy Statements for those companies. We identified pharmaceutical and semiconductor firms in the Compact D/SEC data by their primary SIC and selected only the years 1989 through 1997 because of the patent grant lag.

12 4. Standard & Poor’s Annual Guide to Stocks – Directory of Obsolete Securities: histories of firm ownership changes due to mergers and acquisitions, bankruptcy, dissolution, and name changes, updated through December 2002. 5. NBER Patent-Citations data collected by Hall, Jaffe and Trajtenberg (2001): all citations made and received by patents granted between 1975 and 1999 (16,522,438 citation records). 6. Thomas Register: firm founding year.

13 3 Steps in Data Construction 1. Identifying the same inventor among ‘same/similar’ names (Patent BIB). 2. Identifying the ownership structure of subsidiaries (Compact D/SEC, S&P). 3. Combining patent-inventor data with firm data and patent citation data. [Diagram: Patent BIB + Compact D/SEC + ProQuest + S&P + Thomas + Citation]

14 Front page of patent

15 Step 1: Identifying the Same Inventor. Inventor name variants: Adam Smith vs. Adam Smith? Adam E. Smith vs. Adam Smith? Adam Smyth vs. Adam Smith?

16 The size of data (1975-2002): 2,493,610 patents; 5,105,754 inventor names. Name of the inventor (last, first, middle, surname modifier); street address, zip; city, state, country. Over 16 million patent citations (A. Jaffe).

17 How to identify? Pair each name with every other name and compare: N(N-1)/2 unique pairs = (5,105,754 x 5,105,753) / 2 ≈ 13 trillion pairs. Trajtenberg (2004)
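
The pair count on this slide can be checked with a few lines of Python. This is only an arithmetic sketch; the function name `unique_pairs` is an illustrative choice and not part of the original presentation.

```python
def unique_pairs(n: int) -> int:
    """Number of unordered pairs among n records: n(n-1)/2."""
    return n * (n - 1) // 2


# Number of inventor name records reported in the Patents BIB data.
n_names = 5_105_754
pairs = unique_pairs(n_names)

# Express the count in trillions to compare with the slide's "13 trillion".
print(f"{pairs / 1e12:.1f} trillion pairs")
```

A count of this size is why exhaustive pairwise comparison is usually avoided in practice, for example by comparing only records that already share a coded last name.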

18 How to Identify? a. The pair is a ‘Match’ if the last names (SOUNDEX coded) and first names in the pair are the same and at least one of the categories below is the same: i) Full address: same street address + city + country; ii) Self citation: the same name is found in the citing patent; iii) Shared partner(s): the two names in the pair share the same partner. c.f. Strong Criteria (Trajtenberg 2004)

19 SOUNDEX Coding Method: Codes a last name by the way it sounds rather than the way it is spelled. Expands the list of similar last names to overcome potentially inconsistent translations of foreign names into English. Examples: PETTIT (P330000), Chang (C520000), Chiang (C520000). Letters are given numerical values from 1 to 6: 1 for B, F, P, V; 2 for C, G, J, K, Q, S, X, Z; 3 for D, T; 4 for L; 5 for M, N; 6 for R; 0 for punctuation, H, W, Y.
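
A minimal sketch of a SOUNDEX-style coder that reproduces the three examples on this slide. Several details are assumptions, because the slide does not spell them out: vowels are treated like the other zero-valued letters, adjacent letters with the same code are collapsed, H and W do not separate such a run, and the result is padded to one letter plus six digits. The function name `soundex` and the `digits` parameter are illustrative.

```python
# Letter-to-digit table taken from the slide.
CODES = {}
for digit, letters in {
    "1": "BFPV", "2": "CGJKQSXZ", "3": "DT", "4": "L", "5": "MN", "6": "R",
}.items():
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str, digits: int = 6) -> str:
    """Return the first letter followed by `digits` code digits, zero padded."""
    # Keep alphabetic characters only; punctuation and spaces are dropped.
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    out = []
    prev = CODES.get(first, "0")  # code of the previous significant letter
    for c in letters[1:]:
        if c in "HW":
            continue  # assumption: H and W do not break a run of equal codes
        code = CODES.get(c, "0")  # vowels and Y get 0 and are not emitted
        if code != "0" and code != prev:
            out.append(code)
        prev = code
    return (first + "".join(out) + "0" * digits)[: digits + 1]


for example in ("PETTIT", "Chang", "Chiang"):
    print(example, soundex(example))
```

Because Chang and Chiang receive the same code, they fall into the same candidate group for comparison, which is the intended effect of coding by sound.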

20 b. The pair is a ‘Match’ if the full last names (not SOUNDEX coded) and first names in the pair are the same and at least one of the categories below is the same: i) Zip code; ii) Full middle name. c.f. Medium Criteria (Trajtenberg 2004) c. The pair is a ‘Mismatch’ if the middle name initials are different.
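
Rules a, b, and c can be sketched as a single pairwise test. This is a hedged illustration, not the authors' code. The `Record` fields, the way self citation is modeled (one record's patent cites the other's), and the treatment of a missing middle name as "not a mismatch" are all assumptions. The SOUNDEX code of the last name is assumed to be precomputed and stored in `last_soundex`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One inventor name on one patent (hypothetical layout)."""
    patent_id: int
    first: str
    middle: str
    last: str
    last_soundex: str                   # precomputed SOUNDEX code of `last`
    street: str = ""
    city: str = ""
    country: str = ""
    zip_code: str = ""
    partners: frozenset = frozenset()   # names of co-inventors
    cites: frozenset = frozenset()      # patent ids cited by this patent


def is_match(a: Record, b: Record) -> bool:
    # Rule c: differing middle initials rule the pair out.
    # Assumption: a missing middle name is not treated as a difference.
    if a.middle and b.middle and a.middle[0].upper() != b.middle[0].upper():
        return False
    if a.first.lower() != b.first.lower():
        return False
    # Rule a: SOUNDEX-coded last names agree plus one supporting category.
    if a.last_soundex == b.last_soundex:
        same_address = bool(a.street) and (
            (a.street, a.city, a.country) == (b.street, b.city, b.country)
        )
        self_citation = a.patent_id in b.cites or b.patent_id in a.cites
        shared_partner = bool(a.partners & b.partners)
        if same_address or self_citation or shared_partner:
            return True
    # Rule b: full last names agree plus zip code or full middle name.
    if a.last.lower() == b.last.lower():
        same_zip = bool(a.zip_code) and a.zip_code == b.zip_code
        full_middle = len(a.middle.strip(".")) > 1
        same_middle = full_middle and a.middle.lower() == b.middle.lower()
        if same_zip or same_middle:
            return True
    return False
```

For example, two "Adam Smith" records with the same zip code match under rule b, while "Adam E. Smith" and "Adam J. Smith" are rejected by rule c even if every other field agrees.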

21 Impose Transitivity: if A is matched to B and B is matched to C, then A is matched to C.
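
Imposing transitivity amounts to grouping records into connected components of the pairwise match graph. The sketch below uses a union-find structure for this; it is one common way to do it, not necessarily the method used by the authors, and the name `cluster_ids` is illustrative.

```python
def cluster_ids(ids, matches):
    """Group ids so that any chain of pairwise matches ends up in one group."""
    parent = {i: i for i in ids}

    def find(x):
        # Walk up to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in matches:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = {}
    for i in ids:
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# A matched to B and B matched to C puts A, B, and C in one group.
print(cluster_ids(["A", "B", "C", "D"], [("A", "B"), ("B", "C")]))
```

Each resulting group can then be assigned a single inventor identifier.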

22 An Example. Matches: 1:2, 1:5, 1:6, 2:3, 2:4, 2:5, 2:6, 5:6, 3:6. ID 5 is identified to be the same inventor through transitivity.

23 126 mismatches found after imposing transitivity. 3 categories of mismatches: i) from data error, e.g. ‘Laszlo Andra Szporny’ vs. ‘Laszlo Eszter Szporny’; ii) inventor with 2 middle names; iii) the same last and first names appear in the same patent.

24 Matching Results: 2.3 million unique inventors (45%) out of 5.1 million names. c.f. Trajtenberg (2004): 1.6 million distinct inventors (37%) out of 4.3 million names. (Our patent database is larger because it includes additional years, 2000-2002.) Trajtenberg (2004) uses a matching criterion of the same assignee, which can bias measured mobility among inventors, and assigns scores for each matching criterion. Instead, we apply the criterion that two inventors are not treated as a match if their middle name initials differ. The SOUNDEX coding system sometimes specifies names so loosely that apparently different last names are considered a match.

25 Add Dissertation Abstract Information to Inventor Data: Match degree holders in the Dissertation Abstracts data with the inventor data. The Dissertation Abstracts data contain a full name in a single string for each individual author, so we convert the last, first, and middle names in the inventor data to a single aggregated name string. Result: 64,507 (3 percent) Ph.D. or equivalent degree holders out of 2.3 million uniquely identified inventors.
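
The name aggregation step can be sketched as follows. The exact layout of the author string in the Dissertation Abstracts data is not given on the slide, so the "Last, First Middle" form used here is an assumption, as are the helper names `normalize` and `name_key`.

```python
def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(c if c.isalpha() or c.isspace() else " " for c in text.lower())
    return " ".join(cleaned.split())


def name_key(last: str, first: str, middle: str = "") -> str:
    """Aggregate separate inventor name fields into one comparable string."""
    return normalize(f"{last} {first} {middle}")


# Hypothetical dissertation author string, assumed to be "Last, First Middle".
author = "Smith, Adam E."
print(normalize(author) == name_key("Smith", "Adam", "E"))
```

Exact string equality is the simplest comparison; it will miss authors whose middle name is spelled out in one source and abbreviated in the other.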

26 Step 2: Ownership Structure of Subsidiaries. Necessary when combining firm-level information with the patent data file. Patent assignee: either a parent firm or one of its subsidiaries. A firm identifier does not exist. Frequent changes in firm ownership and corporate names: between 1989 and 1997, 152 firms were merged, 15 firms were acquired, and 145 firms changed their names. We use the firm ownership structure of subsidiaries, M&A, and name change history to relate each assignee to a firm. This makes it possible to identify the firm for which each inventor is innovating.

27 1. Select firms in two industries in the Compact D/SEC: primary SIC 2834 (pharmaceutical preparations) or primary SIC 3674 (semiconductors and related devices). 2. Use S&P data to determine whether a change in an inventor’s firm is due to firm-level M&A and/or corporate name changes. 3. The list of subsidiaries in the Compact D/SEC is not always complete over 1989-1997, so a firm that was ever listed as a subsidiary is treated as a subsidiary throughout 1989-1997. 4. Combine with firms’ founding years.

28 Step 3: Combining Inventor Data with Firm Data and Patent Citation Data. Combine the inventor file with firm-level data to obtain patent-inventor-firm matched data. Link to the Hall, Jaffe, and Trajtenberg (2001) citation data: 16,522,438 citations for all granted patents applied for from 1975 through 1999.

29 Descriptive Statistics, 1975-2002: 2,493,610 patents; 2.05 inventors per patent; 2,299,579 unique inventors.

30 Descriptive Statistics
                        Total       Pharmaceutical   Semiconductor
Inventors (a)           2,299,579   25,609           33,683
  Total No. Patents     2.22        2.8              2.60
  No. Patents/Year      1.31        1.62             1.72
Degree holders (b)      122,168     3,399            3,941
  Total No. Patents     3.07        3.70             2.95
  No. Patents/Year      1.52        1.84             1.91
(b/a)                   5.3%*       13.3%            11.7%
* Ph.D. or equivalent degree holders account for 3 percent (64,507).

31 Number of Patents Granted by Year of Application. * Grant lag: 97% of patents are granted within the first 4 years of the application date (Hall, Griliches, and Hausman 1986).

32 International Knowledge Flow 1. Trends in U.S. firms’ access to researchers with overseas research experience. 2. The role of research personnel as a pathway for the diffusion of ideas from foreign countries to the U.S. 3. The firm-level determinants of accessing innovations developed overseas.

33 Inventors with Foreign Experience in US Domestic Patents
Fraction of inventors by foreign-experience type, in percent.
      Number of                Current Foreign        Current US Residents   Current US Residents
      Inventors                Residents (%)          w/ Foreign Exp. † (%)  w/o Foreign Exp. (%)
Year  All      Pharma  Semi    All    Pharma  Semi    All    Pharma  Semi    All    Pharma  Semi
1985  42,368                   8.15                   0.99                   90.86
1986  44,828                   8.30                   1.07                   90.63
1987  48,810                   8.21                   1.13                   90.66
1988  54,947                   8.49                   1.13                   90.37
1989  59,164   2,143   1,139   8.60   14.47   9.04    1.17   2.01    1.14    90.23  83.53   89.82
1990  63,812   2,259   1,362   8.02   17.35   7.78    1.22   1.51    1.25    90.76  81.14   90.97
1991  67,657   3,332   2,791   7.76   19.09   6.02    1.26   1.23    1.22    90.98  79.68   92.76
1992  73,640   3,876   3,370   7.86   20.38   7.15    1.30   1.21    1.13    90.85  78.41   91.72
1993  80,428   4,505   4,190   8.06   25.88   7.06    1.21   1.31    1.03    90.73  72.81   91.91
1994  90,910   5,320   5,739   8.44   26.86   14.76   1.20   0.98    0.94    90.36  72.16   84.30
1995  104,775  6,629   7,450   8.78   28.87   15.18   1.13   0.87    0.86    90.08  70.25   83.96
1996  104,829  4,894   7,916   9.19   31.55   13.26   1.07   0.90    0.78    89.75  67.55   85.95
1997  119,556  6,093   9,993   9.11   29.71   15.31   1.01   0.75    0.80    89.87  69.54   83.89
† Resided in foreign countries in the previous 10 years.

34 Patent-Inventor Ratio by Foreign-Experience type

35 Variable Definition and Sample Statistics
Statistics are means with standard deviations in parentheses.
Variable    Pharmaceutical     Semiconductor      Definition
CITE_FRGN   0.5505 (0.3319)    0.4760 (0.2850)    Fraction of citations to patents that are assigned to foreign assignees
FRGN_EXP    0.0734 (0.2609)    0.0290 (0.1677)    = 1 if at least one inventor resides or used to reside in one of the foreign countries where the foreign assignees of cited patents are located
INVENTOR    326.0 (195.7)      923.5 (728.6)      Number of all inventors in a patent assignee firm
EMPLOYEE    35,979 (21,833)    41,538 (52,501)    Number of employees in a patent assignee firm
R&D/INV     31.67 (24.51)      12.04 (27.34)      Real R&D expenditures in 1996 constant dollars over the number of inventors in a patent assignee firm
NSIC        3.791 (1.991)      3.154 (1.944)      Number of secondary SICs assigned to a patent assignee firm
MEXP        5.292 (1.582)      3.832 (1.067)      Median experience of all inventors in a patent assignee firm
FIRMAGE     77.40 (51.51)      36.17 (23.40)      Years elapsed since the founding year of a patent assignee firm

36 Determinants of Citation to Foreign-Assigned Patents
Dependent variable = logit transform of CITE_FRGN.
               Pharmaceutical                                     Semiconductor
               (1)              (2)              (3)              (1)              (2)              (3)
FRGN_EXP       3.8950 (4.95)    3.3876 (3.92)    4.3832 (3.87)    5.8609 (4.18)    5.5730 (3.66)    6.4162 (3.75)
Log INVENTOR                    1.0813 (1.10)    1.1595 (1.19)                     -1.1918 (-2.69)  -1.1702 (-2.64)
Log EMPLOYEE                    0.2124 (0.38)    0.1885 (0.34)                     0.3871 (1.24)    0.3550 (1.14)
Log R&D/INV                     0.0557 (0.66)    0.0488 (0.59)                     0.0658 (1.14)    0.0691 (1.18)
Log NSIC                        -0.2723 (-0.38)  -0.4079 (-0.57)                   1.1469 (1.57)    1.1562 (1.56)
Log MEXP                        -6.5845 (-4.41)  -6.4702 (-4.40)                   -6.8640 (-2.76)  -6.8410 (-2.66)
Log FIRMAGE                     -1.0956 (-1.96)  -1.1361 (-2.06)                   2.3439 (2.88)    2.3771 (2.83)
Observations   1430             1247             1215             4316             4186             4112
R2             0.0189           0.1462           0.1539           0.0283           0.1280           0.1306
Note: Each cell shows the estimated coefficient with the t statistic in parentheses. The result for a constant term is suppressed. The t statistic is based on the Huber-White sandwich estimator of variance.
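
The dependent variable in these regressions is the logit transform of the citation share, log(p / (1 - p)). The sketch below shows the transform only. How the authors treat shares of exactly 0 or 1, where the transform is undefined, is not stated; returning `None` for those values is an assumption made here for illustration.

```python
import math
from typing import Optional


def logit(p: float) -> Optional[float]:
    """Logit transform log(p / (1 - p)) of a share strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        return None  # assumption: boundary shares are left undefined
    return math.log(p / (1.0 - p))


# Sample mean of CITE_FRGN for pharmaceutical firms, from the statistics slide.
print(logit(0.5505))
```

The transform maps a share bounded in (0, 1) onto the whole real line, which makes a linear regression specification more reasonable than regressing on the raw fraction.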

37 Conclusion: In recent years, U.S. innovators have increasingly accessed researchers with foreign R&D experience. U.S. firms’ employment of foreign-residing researchers has increased; the fraction of research-active U.S. residents with foreign research experience appears to be falling. Possibly this is to capture geographically dispersed knowledge spillovers. Having researchers with research experience abroad seems to facilitate access to foreign-produced knowledge. In the semiconductor industry, smaller firms and older firms are more likely to make use of the output of non-U.S. R&D. In the pharmaceutical industry, younger firms are more likely to make use of the output of non-U.S. R&D.

38 Future Extension: The consequences of the mobility of R&D personnel for firm R&D. The impact of the arrival of a researcher with a particular set of R&D experiences on the character and quantity of R&D done by a firm. The importance of inter-firm mobility for technological diffusion. How firms organize the R&D enterprise, the extent of collaboration among geographically dispersed scientists, and the extent of interaction among scientists with different backgrounds.

