CRSP Center for Research in Security Prices
The Center for Research in Security Prices (CRSP) is a financial research center at the University of Chicago's Graduate School of Business. CRSP is globally recognized as a premier provider of historical financial data. The unparalleled accuracy of the CRSP data files coupled with historical and unique identification records have made them a staple of academic and commercial research since CRSP
CRSP History CRSP was established by the University of Chicago, GSB in 1960 with the generous support of Merrill Lynch for the purpose of accurately measuring the returns from investing in common stocks listed on the NYSE for the period 1926 to Significant additional support has come from others including Dimensional Fund Advisors (DFA). Since the beginning, CRSP data files have been used to produce highly influential, original research at the University of Chicago.
Major Contributors Eugene Fama, Robert R. McCormick Distinguished Service Professor of Finance at the University of Chicago, is Chairman of the CRSP Board of Directors and a former Director of CRSP. Eugene Fama is considered a “Giant in the Field of Economics” (Peter J. Tanous, New York Institute of Finance, 1997). DFA has contributed significantly to CRSP methodology and has generously contributed research funding. Ken French, Nanyang Technological University (NTU) Professor of Finance at MIT, is an expert on the behavior of security prices and investment strategies. He is also a former Director of CRSP.
Typical Users Quantitative Groups Asset and Portfolio Managers Equity Derivatives Groups Academics
CRSP Databases CRSP databases include: US Stock NYSE/Amex/Nasdaq data, US Indices data (with the stock portfolio assignment module), CRSP/COMPUSTAT Merged Database -a link between CRSP Stock and COMPUSTAT Fundamental data, US Government Treasury Bond and T-Bill data, and US Mutual Fund data. Additional products include: proxy graphs for Schedule 14A Proxies, Cap-Based reports, and Custom datasets and extractions.
CRSP US Stock Data The CRSP US Stock Databases cover end-of-day and month-end prices, volumes, returns, corporate actions, name and identification history, etc. for common stocks listed on the NYSE, AMEX and Nasdaq Stock Markets. They contain historical data covering more than 23,000 securities, monthly from 1925 and daily from 1962.
CRSP/COMPUSTAT Merged Data The CRSP/COMPUSTAT Merged Database provides a unique and historical link matching CRSP security data with COMPUSTAT fundamental company data. (Subscriptions to both CRSP stock data and select COMPUSTAT data files are prerequisites for this database.) The COMPUSTAT data is provided reformatted as a CRSPAccess database, with the link, for easier data integration.
CRSP US Indices Data The US Indices Database contains a wide variety of financial and economic indices (value- and equal-weighted returns with and without dividends, cap-based and market benchmarks) and other statistics, which are used to gauge the performance of the broader market and economy in general. This product is designed for use with the CRSP stock databases and contains portfolio assignments for securities, but can be used stand-alone. It contains 11 index groups and their associated deciles totaling 154 different indices.
CRSP US Stock and Indices Data The following chart compares total returns with dividends of IBM and of the CRSP value-weighted index for the S&P 500 universe with total returns on the S&P 500 Composite index over a ten-year period. The data was processed using the ts_print report writer utility, and Microsoft Excel® was used to create the chart.
CRSP US Treasury Data The US Government Treasury Databases cover end-of-day and month-end quote, debt, coupon rate, maturity and index data. Supplemental Fama files are included in the month-end data. The monthly data begins in 1925 and contains over 101,950 prices on more than 5,100 US Treasury bills, notes, and bonds. The daily data begins in 1962 and contains over 1.5 million prices for over 3,200 US Treasury bills, notes, and bonds.
CRSP US Treasury Data Following is a graph comparing the yields of the risk-free rate files and the 20-year bond index over a 10-year period.
CRSP US Mutual Fund Data The Survivor-Bias Free US Mutual Fund Database contains returns, total net assets, net asset values and distributions for open-ended US funds as well as general identifying information like asset allocation, associated fees, and fund managers. Historical data on approximately 18,000 funds.
CRSP US Mutual Fund Data This chart graphs returns based on an initial investment of $100 per fund for live-only growth funds and for live and dead growth funds.
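The growth-of-$100 comparison described above can be sketched as follows. This is only an illustration under stated assumptions, not CRSP's calculation: the monthly returns are made-up numbers, and `growth_of_100` is a hypothetical helper name.

```python
def growth_of_100(returns, initial=100.0):
    """Compound a sequence of periodic returns into a path of values."""
    value = initial
    path = [value]
    for r in returns:
        value *= 1.0 + r
        path.append(value)
    return path

# Made-up monthly returns for illustration only (not CRSP data).
live_fund = [0.02, 0.01, -0.01, 0.03]
dead_fund = [0.01, -0.04, -0.06]  # hypothetical fund that closed after three months

# Live-only view: only the surviving fund is counted.
live_only = growth_of_100(live_fund)[-1]

# Live-and-dead view: average the ending value of both $100 investments.
live_and_dead = (growth_of_100(live_fund)[-1] + growth_of_100(dead_fund)[-1]) / 2

print(round(live_only, 2), round(live_and_dead, 2))
```

Dropping the closed fund raises the average ending value, which is the survivor bias the database is designed to avoid.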
CRSP Cap-Based Data The CRSP Cap-Based Portfolio Returns are updated monthly. Reports include data for all 17 of the cap-based portfolios, weights, total returns, index levels, and income returns. Quarterly Reports Monthly Reports Historical Reports Monthly Security List
CRSP Cap-Based Data Following is a snapshot of the cap-based security list, which provides the cap-based portfolio assignment of every security used in the calculation of CRSP Cap-Based Portfolio data:
CUSIP|Ticker|Company Name|Exch Code|SIC Code|Shares|Price|Portfolio|Return|PERMNO|PERMCO
|AAON |AAON INC |K|3580| | | 9| | 76868|
|AIR |A A R CORP |N|5088| | | 8| | 54594|
|ABCB |A B C BANCORP |K|6020| | |10| | 80498|
|ABCR |A B C N A C O INC |K|3740| | | 9| | 79932|
|ABWG |A B WATLEY GROUP INC |K|6210| | | 9| | 86828|16482
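A pipe-delimited list like the one above is straightforward to load for further analysis. The sketch below is a minimal example, assuming the eleven columns map to the fields in the order of the header (CUSIP through PERMCO); the column names in `COLUMNS` and the helper `parse_row` are hypothetical, and blank fields are kept as empty strings.

```python
# Sample rows copied from the security list snapshot.
SAMPLE = """\
|AAON |AAON INC |K|3580| | | 9| | 76868|
|AIR |A A R CORP |N|5088| | | 8| | 54594|
|ABCB |A B C BANCORP |K|6020| | |10| | 80498|
|ABCR |A B C N A C O INC |K|3740| | | 9| | 79932|
|ABWG |A B WATLEY GROUP INC |K|6210| | | 9| | 86828|16482
"""

# Assumed column order, inferred from the header of the snapshot.
COLUMNS = ["cusip", "ticker", "company", "exch", "sic", "shares",
           "price", "portfolio", "ret", "permno", "permco"]

def parse_row(line):
    """Split one pipe-delimited row into a dict keyed by column name."""
    fields = [f.strip() for f in line.split("|")]
    # Pad short rows so every column is present.
    fields = (fields + [""] * len(COLUMNS))[:len(COLUMNS)]
    return dict(zip(COLUMNS, fields))

rows = [parse_row(line) for line in SAMPLE.splitlines() if line.strip()]

# Group tickers by their cap-based portfolio assignment.
by_portfolio = {}
for row in rows:
    by_portfolio.setdefault(row["portfolio"], []).append(row["ticker"])

print(by_portfolio)
```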
CRSP Proxy Product The CRSP Proxy Product includes two comparisons of five-year cumulative total returns that measure the performance of an individual security against market and peer groups. Each index is a market value-weighted daily total return index. The monthly index levels are provided as a text file and a graph. This product is typically used for SEC Schedule 14A Proxy filings.
CRSP Proxy Product Following is a snapshot of a sample graph.
Why Choose CRSP? Comprehensive Data - CRSP maintains the most complete and wide-ranging historical data available. Data Accuracy. Unique Identifiers - CRSP maps traditional identifiers (like CUSIP, Ticker, Company Name, SIC Codes, etc.) to unique permanent identifiers allowing uninterrupted time-series analysis. Excellent Customer Service.
CRSP US Stock Databases The CRSP US Stock Databases cover end-of-day and month-end prices, volumes, returns, corporate actions, name and identification history, etc. for common stocks listed on the NYSE, AMEX and Nasdaq Stock Markets. They contain historical data covering more than 23,000 securities, monthly from 1925 and daily from 1962.
CRSP US Stock Databases There are two CRSP US Stock Databases, Daily and Monthly. They contain the same data items. Daily data is reported on a day-end basis, monthly data on a month-end basis. Each database is available in three update frequencies: * Only available to academics
CRSP US Stock Databases The CRSP US Stock Databases provide a unique issue identifier, PERMNO, and a unique company identifier, PERMCO, that remain constant across time. This allows for seamless time-series analysis. PERMNO is the primary key in the CRSP stock databases. Other supported database keys include PERMCO, CUSIP, historical CUSIP, Ticker, and SIC Code. PERMNO is the only key that is unique over time.
CRSP US Stock Databases -Benefits Unique security and company identifiers across time for NYSE, AMEX, and Nasdaq common stocks. CRSPAccess Database: The database is structurally compatible with the CRSP US Indices database and Portfolio Assignments Module and the CRSP/COMPUSTAT Merged Database, enabling queries between the databases. You can browse and extract data using CRSP software. C and FORTRAN programming support for data access. Data Accuracy.
CRSP US Stock Databases -Database Structure * Monthly data only, ** Daily data only
CRSP US Stock Databases -Header Data Header Identification and Summary Data contains data items which identify a security. These are current identifiers and include: PERMNO; PERMCO; Nasdaq Company Number; Nasdaq Issue Number; Exchange Code - Header; Share Code - Header; Standard Industrial Classification (SIC) Code - Header; Begin of Stock Data; End of Stock Data; Delisting Code - Header; CUSIP Identifier - Header; Ticker Code - Header; Company Name - Header.
CRSP US Stock Databases –Event Data Event Data contains unscheduled transactions, observations, or status changes. The time of the event and relevant information are stored for each observation. The status observations and changes usually contain information that is in effect until modified by another similar event. There is a count of the number of events. There are five arrays of Event Data: Name History Data; Distribution Event History; Delisting History Array; Shares Outstanding Observations; Nasdaq Information Array.
CRSP US Stock Databases -Name History Data Name History Data is similar to Header data, but it contains a full history of identification changes for each security over time. Any time one of the identifiers changes, a new row is added, recording the change and the date of the change. Name History items include: Name Effective Date; Last Date of Name; CUSIP; Ticker Symbol; Company Name; Share Class; Share Code; Exchange Code; Standard Industrial Classification (SIC) Code.
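To see why a dated name history matters, consider resolving a ticker to a PERMNO as of a given date. The sketch below is a hypothetical illustration only: the rows, PERMNO values, tickers, and the `permnos_for` helper are invented, and real name history records carry more fields than shown here.

```python
from datetime import date

# Hypothetical name-history rows: (permno, ticker, name effective date, last date of name).
NAME_HISTORY = [
    (90001, "OLDT", date(1990, 1, 2), date(1995, 6, 30)),
    (90001, "NEWT", date(1995, 7, 3), date(2000, 12, 29)),  # same issue, new ticker
    (90002, "OLDT", date(1997, 1, 2), date(2000, 12, 29)),  # ticker reused by another issue
]

def permnos_for(ticker, on):
    """Return the PERMNOs whose name-history row carries this ticker on the given date."""
    return [permno
            for permno, tick, start, end in NAME_HISTORY
            if tick == ticker and start <= on <= end]

print(permnos_for("OLDT", date(1993, 1, 4)))
print(permnos_for("OLDT", date(1998, 1, 5)))
print(permnos_for("NEWT", date(1998, 1, 5)))
```

The same ticker points to different issues at different dates, while the permanent identifier follows one issue through its ticker change.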
CRSP US Stock Databases -Distribution Data Distribution Event History data tracks ordinary dividends, liquidating dividends, exchanges and reorganizations, subscription rights, splits, changes in shares outstanding, and general information announcements for dropped issues. Distribution items include: Distribution Code; Dividend Cash Amount; Factor to Adjust Price; Factor to Adjust Shares Outstanding; Distribution Declaration Date; Ex-Distribution Date; Record Date; Payment Date; Acquiring PERMNO; Acquiring PERMCO.
CRSP US Stock Databases -Delisting Data Delisting History Array contains delisting information for each security after it is no longer listed on an exchange in a CRSP file. Delisting data items include: Delisting Date; Delisting Code; New PERMNO; New PERMCO; Delisting Date of Next Available Information; Amount After Delisting; Delisting Return without Dividends; Delisting Price; Delisting Payment Date; Delisting Return.
CRSP US Stock Databases -Shares Data Shares Outstanding Observations Array contains the history of a security's shares outstanding. Shares Outstanding data items include: Shares Outstanding; Shares Observation Date; Shares Observation End Date; Shares Outstanding Observation Flag.
CRSP US Stock Databases -Nasdaq Info. Data Nasdaq Information Array contains the history of a security's trading status on the Nasdaq Small Cap Market (SM). Nasdaq Information data items include: Nasdaq Traits Date; Nasdaq Traits End Date; Nasdaq Traits Code; Nasdaq National Market Indicator; Market Maker Count; NASD Index Code.
CRSP US Stock Databases -Time Series Data Time Series Data Arrays contain data that occurs on a scheduled time period. Each security has one observation for each time series data item per time period. Each time series is its own array. Time series arrays include: Bid or Low Price; Ask or High Price; Price or Bid/Ask Average; Holding Period Total Return; Volume Traded; Bid; Ask; Return Without Dividends; Spread Between Bid and Ask (monthly only); Price Alternate Date (monthly only); Number of Trades, Nasdaq (daily only); Price Alternate (monthly only).
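The difference between the Holding Period Total Return and the Return Without Dividends series can be shown with simple arithmetic. The sketch below uses the textbook one-period definitions and assumes no splits or other price adjustments in the period; the prices and dividend are made-up numbers, and `holding_period_returns` is a hypothetical helper, not a CRSP routine.

```python
def holding_period_returns(prev_price, price, dividend=0.0):
    """Return (total return, return without dividends) for one period.

    Assumes no split or other adjustment factor applies in the period.
    """
    total = (price + dividend) / prev_price - 1.0
    without_dividends = price / prev_price - 1.0
    return total, without_dividends

# Made-up example: price moves from 50.00 to 52.00 and a 0.50 dividend is paid.
total, without_dividends = holding_period_returns(50.0, 52.0, 0.5)

# The gap between the two series is the income (dividend) component.
income = total - without_dividends
print(total, without_dividends, income)
```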
CRSP US Stock Databases -Portfolio Data Portfolio Data contains the portfolio type defined by CRSP and a history of statistics and assignments for a security. Portfolio data includes: Portfolio Assignment Number Portfolio Statistic Value
CRSP US Stock Databases -Indices Six indices are included in a subscription to the CRSP US stock data: CRSP Value-Weighted Index with dividends; CRSP Value-Weighted Index without dividends; CRSP Equal-Weighted Index with dividends; CRSP Equal-Weighted Index without dividends; S&P 500 Composite; Nasdaq Composite.
CRSP US Indices Database The US Indices Database contains a wide variety of financial and economic indices (value- and equal-weighted returns with and without dividends, cap-based and market benchmarks) and other statistics, which are used to gauge the performance of the broader market and economy in general. This product is designed for use with the CRSP stock databases and contains portfolio assignments for securities, but can be used stand-alone. It contains 11 index groups and their associated deciles totaling 154 different indices.
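The distinction between value-weighted and equal-weighted index returns can be sketched in a few lines. This is a generic illustration, not CRSP's index methodology: it assumes value weights are taken from market capitalization at the end of the prior period, and all numbers and function names are invented.

```python
def equal_weighted_return(returns):
    """Simple average of the member returns."""
    return sum(returns) / len(returns)

def value_weighted_return(returns, prior_caps):
    """Average of member returns weighted by prior-period market capitalization."""
    total_cap = sum(prior_caps)
    return sum(r * cap for r, cap in zip(returns, prior_caps)) / total_cap

# Made-up universe: one large stock and two small ones.
returns = [0.10, -0.02, 0.04]
prior_caps = [900.0, 50.0, 50.0]

ew = equal_weighted_return(returns)
vw = value_weighted_return(returns, prior_caps)
print(ew, vw)
```

With these inputs the value-weighted figure is dominated by the large stock, while the equal-weighted figure gives each stock the same influence.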
CRSP US Indices Database Major index types include: CRSP Stock File Indices – CRSP Market Indices; Published S&P 500 Index; Nasdaq Composite Index; CRSP Stock File Capitalization Decile Indices; CRSP Stock File Risk-Based Decile Indices; CRSP Cap-Based Portfolios; CRSP Indices for the S&P 500 Universe; CRSP US Treasury and Inflation (CTI) Series. All indices are available as daily and monthly indices except standard deviation and beta indices, which are daily only.
CRSP US Indices Database This database is designed to be used in conjunction with the CRSP stock database, and includes security assignments (by PERMNO) for the indices. It may also be used as a stand-alone database. In addition to these indices and portfolios, you may design your own with our CRSPAccess ts_print and dsxport and msxport software.
CRSP US Indices Database The following graph compares 5 years of total returns with dividends of the CRSP S&P 500 Index, the CRSP 9-10 (small cap) and a user-defined value-weighted portfolio containing tech stocks.
CRSP US Indices Database The high tech stocks included in the user-created portfolio above include: Symantec Corporation Microsoft Corporation International Business Machines Compaq Computer Co. Oracle Corporation Cisco Systems, Inc. Intel Corporation Sun Microsystems Lucent Technologies Apple Computer
CRSP/COMPUSTAT Merged Database (CCM) The CRSP/COMPUSTAT Merged Database (CCM) provides a unique and historical link matching CRSP security data with COMPUSTAT fundamental company data. (Subscriptions to both CRSP stock data and select COMPUSTAT data files are prerequisites for this database.) The COMPUSTAT data is provided reformatted as a CRSPAccess database, with the link, for easier data integration.
CCM
CCM Benefits CRSPLink: A unique and historical link between CRSP’s PERMNO/PERMCO and COMPUSTAT’s GVKEY. CRSPAccess Database: The database is structurally and organizationally compatible with the CRSP Stock and Indices databases. Data items and associated footnotes can be extracted using the cst_print utility program. Data Access: Precompiled software to extract the data; C and FORTRAN programming support for data access.
CCM: CRSPLink Benefits cont. Joint Data Processing – Screen one database on a set of criteria and extract data from the other database. Seamless integration of historical COMPUSTAT data.
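Joint data processing amounts to screening on one database and following the dated link to pull data from the other. The sketch below is a hypothetical illustration in plain Python: the link rows, GVKEY and PERMNO values, sales figures, and prices are all invented, and `linked_permno` and `screen_and_extract` are not CRSP functions.

```python
from datetime import date

# Hypothetical link rows: (gvkey, permno, link start, link end).
LINKS = [
    (1001, 50001, date(1980, 1, 1), date(1999, 12, 31)),
    (1002, 50002, date(1990, 1, 1), date(1994, 12, 31)),
    (1002, 50003, date(1995, 1, 1), date(1999, 12, 31)),
]
SALES_BY_GVKEY = {1001: 120.0, 1002: 900.0}                    # made-up fundamentals
PRICE_BY_PERMNO = {50001: 10.0, 50002: 20.0, 50003: 30.0}      # made-up security prices

def linked_permno(gvkey, on):
    """Return the PERMNO linked to a GVKEY on a given date, or None."""
    for g, permno, start, end in LINKS:
        if g == gvkey and start <= on <= end:
            return permno
    return None

def screen_and_extract(min_sales, on):
    """Screen companies on sales, then extract the linked security's price."""
    result = {}
    for gvkey, sales in SALES_BY_GVKEY.items():
        if sales >= min_sales:
            permno = linked_permno(gvkey, on)
            if permno is not None:
                result[gvkey] = PRICE_BY_PERMNO[permno]
    return result

print(screen_and_extract(500.0, date(1996, 6, 28)))
```

Because the link is historical, the same company can resolve to different securities depending on the date of the query.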
CCM: CRSPLink
CRSPAccess CRSPAccess is a custom database format and method of accessing the data. The database is provided in little- and big-endian (binary) on CD-ROMs. Utility programs (software) are provided to access the data. Sample programs (in C and FORTRAN 77) are provided to access the data.
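Byte order matters whenever binary data is read on a platform other than the one it was written for. The sketch below only illustrates the little- versus big-endian difference with Python's `struct` module; the CRSPAccess record layout is not described here, so the "record" of two 32-bit integers and the helper `read_int32_values` are purely hypothetical.

```python
import struct

def read_int32_values(raw, byte_order):
    """Decode a byte string as consecutive 32-bit signed integers."""
    prefix = {"little": "<", "big": ">"}[byte_order]
    count = len(raw) // 4
    return list(struct.unpack(f"{prefix}{count}i", raw[:count * 4]))

# Two integers written in each byte order (values chosen for illustration).
values = [10107, 12490]
little_bytes = struct.pack("<2i", *values)
big_bytes = struct.pack(">2i", *values)

print(read_int32_values(little_bytes, "little"))
print(read_int32_values(big_bytes, "big"))
# Reading with the wrong byte order silently yields different numbers.
print(read_int32_values(little_bytes, "big"))
```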
CRSPAccess
Supported Systems | Operating System | CPU
PC | Windows NT 4.0 | Intel x86
PC | Windows 98 | Intel x86
PC | Windows 95 | Intel x86
Sun Sparc (Unix) | Sun Solaris 2.8 | Sun Sparc
Alpha (Unix) | Compaq Tru64 4.0 | AXP
HP 715 Compatible (Unix) | HP/UX 10.2 | PARISC 1.1
AIX | IBM AIX 4.3 | PowerPC
Alpha (OpenVMS) | OpenVMS 7.2 | AXP
*Stock and Indices only. Fortran is not supported with COMPUSTAT data on AIX.
CRSPAccess: Software There are several utility programs that can browse and extract CRSPAccess data. Browse software: stksearch, indsearch, cstsearch. Search software: ts_print, stk_print, cst_print.
Browsing Software CRSP has several command line programs that can be used to browse identifying information. There is one for stock data (dstksearch (daily data) or mstksearch (monthly data)), indices data (indsearch), and COMPUSTAT data (cstsearch).
Software: stksearch stksearch searches name history data (using headfile.dat) for a text string. This data includes PERMNO, PERMCO, CUSIP, Company Name, Ticker, Exchange Code and Date Range for the information. Any of these may be used as search criteria. For example, a search for IBM on the daily stock file would look like this:
CRSP1>stksearch
Enter search string: ibm
Exchange Codes 1=NYSE, 2=AMEX, 3=NASDAQ
PERMNO PERMCO CUSIP Company Name Tick EX date range
INTERNATIONAL BUSINESS MACHS CO IBM
INTERNATIONAL BUSINESS MACHS CO IBM
INTERNATIONAL BUSINESS MACHS CO IBM
AMERICUS TR FOR IBM SHS BZP
AMERICUS TR FOR IBM SHS BZS
AMERICUS TR FOR IBM SHS BZU
Software: indsearch indsearch searches index header data for the search string. This data includes PERMNO (INDNO), Set ID, and Index Name. Any of these may be used as search criteria. For example, a search for value would look like this:
CRSP1>indsearch
Enter search string: value
Daily Indices
PERMNO SETID Index Name
CRSP NYSE Value-Weighted Market Index
CRSP AMEX Value-Weighted Market Index
CRSP NYSE/AMEX Value-Weighted Market Index
CRSP Nasdaq Value-Weighted Market Index
CRSP NYSE/AMEX/Nasdaq Value-Weighted Market Index
CRSP Value-Weighted Index of the S&P 500 Universe
Software: cstsearch cstsearch searches header data (current identification) for the search string. This data includes GVKEY, PERMNO, DNUM, CNUM, CIC, SMBL, Company Name, and the date range. Any of these may be used as search criteria. For example, a search for IBM would look like this:
CRSP1>cstsearch
Enter search string: ibm
COMPUSTAT Headers
GVKEY PERMNO DNUM CNUM CIC SMBL Company Name ANN/QTR range
IBM1 IBM CREDIT CORP
IBM INTL BUSINESS MACHINES CORP
Data Access Software CRSP has several programs that can be used to access CRSP stock and indices data and COMPUSTAT data from the CRSP/COMPUSTAT Merged Database: ts_print – Stock, indices and COMPUSTAT data; *stk_print – Stock data; cst_print – COMPUSTAT data; *sxport – Index Portfolio data. In these names, * stands for d (daily data) or m (monthly data).
Software: ts_print ts_print is a report writer program that can extract CRSP stock and indices data and COMPUSTAT data, using PERMNO. This is a command line program, with a Java interface for Windows systems.
Software: ts_print ts_print runs a user-created request file with four types of specifications: company or security; data items requested; date frequency and format for the report; and layout of the output file.
Software: ts_print You can either create the request file in a text editor, or use the interface to create it for you. The request file must be in the following format:
ENTITY
entity specifications
END
ITEM
item specifications
END
DATE
date specifications
END
OPTIONS
option specifications
END
Software: ts_print Following is a request file to extract CRSP’s price, returns and volume, and COMPUSTAT’s annual items 62 (interest income), 24 (price – close), and 108 (sale of common and preferred stock), plus NAICS and ticker, for IBM, Microsoft, Oracle, and Compaq.
ENTITY
LIST|PERMNO 12490|ENTFORMAT 4
LIST|PERMNO 10107|ENTFORMAT 4
LIST|PERMNO 10104|ENTFORMAT 4
LIST|PERMNO 68347|ENTFORMAT 4
END
ITEM
ITEMID prc|SUBNO 0
ITEMID ret|SUBNO 0
ITEMID vol|SUBNO 0
ITEMID iaitem|SUBNO 62
ITEMID iaitem|SUBNO 24
ITEMID iaitem|SUBNO 108
ITEMID naics|SUBNO 0
ITEMID smbl|SUBNO 0
END
DATE
CALNAME annual|RANGE |CALFORMAT 5
END
OPTIONS
X ITEM,YES|Y DATE,YES|Z ENTITY,YES,1|OUTNAME fintest.out|ROWDELIM 0,0|REPNAME Sample Test 1|NOFILL
END
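When many securities or items are involved, a request file like the one above can be generated rather than typed. The sketch below is one possible approach, not a CRSP tool: `build_request` is a hypothetical helper that reuses only the keywords visible in the sample request, assumes each specification sits on its own line, and leaves the RANGE value blank by default because the slide does not show one.

```python
def build_request(permnos, items, date_range="", outname="fintest.out",
                  repname="Sample Test 1"):
    """Assemble a ts_print-style request file as text.

    permnos: list of PERMNO integers.
    items: list of (item id, subno) pairs.
    date_range: passed through verbatim; its format is not shown in the
    source, so supply it as your ts_print documentation specifies.
    """
    lines = ["ENTITY"]
    lines += [f"LIST|PERMNO {p}|ENTFORMAT 4" for p in permnos]
    lines += ["END", "ITEM"]
    lines += [f"ITEMID {name}|SUBNO {subno}" for name, subno in items]
    lines += [
        "END",
        "DATE",
        f"CALNAME annual|RANGE {date_range}|CALFORMAT 5",
        "END",
        "OPTIONS",
        f"X ITEM,YES|Y DATE,YES|Z ENTITY,YES,1|OUTNAME {outname}"
        f"|ROWDELIM 0,0|REPNAME {repname}|NOFILL",
        "END",
    ]
    return "\n".join(lines) + "\n"

# PERMNOs and items taken from the sample request.
text = build_request(
    [12490, 10107, 10104, 68347],
    [("prc", 0), ("ret", 0), ("vol", 0), ("iaitem", 62),
     ("iaitem", 24), ("iaitem", 108), ("naics", 0), ("smbl", 0)],
)
print(text)
```

The returned text can be written to a file with ordinary file I/O and then passed to ts_print.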
Software: ts_print The following is the output from the request file on the previous screen. It includes the CRSP data items price, returns and volume, and COMPUSTAT’s annual items 62 (interest income), 24 (price – close), and 108 (sale of common and preferred stock), plus NAICS and ticker.
Sample Test 1
INTERNATIONAL BUSINE
Prc Ret Vol IntInc Close SaleStk naics smbl
29-dec IBM
31-dec IBM
MICROSOFT CORP
Prc Ret Vol IntInc Close SaleStk naics smbl
29-dec MSFT
31-dec MSFT
ORACLE CORP
Prc Ret Vol IntInc Close SaleStk naics smbl
29-dec ORCL
31-dec ORCL
COMPAQ COMPUTER CORP
Prc Ret Vol IntInc Close SaleStk naics smbl
29-dec CPQ
31-dec CPQ
Software: stk_print stk_print is a command line report writer utility program to access CRSP stock data. It can be used to extract all CRSP stock data. stk_print supports PERMNO, PERMCO, CUSIP, Ticker, and SIC Code as database keys.
Software: stk_print stk_print can also be used to sequentially read and output selected data in a delimited text file for all the companies in the database over a selected date range. This is particularly useful when you are extracting the data for further manipulation in other software like SAS.
Software: stk_print There are two versions of stk_print, dstkprint and mstkprint: d for daily data, m for monthly data. The initial input screen for stk_print, using daily data, looks like this:
CRSP1>dstkprint
CRSP NYSE/AMEX/NASDAQ Daily History + Indices, data ending
Using default dates
Enter identifier or new option beginning with slash. Type ? for help.
Data options are entered on the command line. The program default sets PERMNO as the database key.
Software: stk_print A snapshot (stk_print) of prices, volumes and shares, October and December, 1999 for IBM: Date Prices Returns Volumes
Software: cst_print cst_print can extract all CCM data. Output from cst_print can be paired with output from CRSP stock utility programs in a spreadsheet for data comparison and analysis.
Software: cst_print cst_print is a command line utility program that can be used to access all of the data included in the CRSP/COMPUSTAT merged database. Output can be dumped to the terminal window or to an output text file and imported into a spreadsheet for further data manipulation.
Software: cst_print The initial input screen of cst_print looks like this:
>cstprint
COMPUSTAT Databases creation date : 08/19/1999
Using default dates: for quarterly data
Using default dates: for annual data
Using default dates: for monthly data
Enter identifier or new option beginning with a slash. Type ? for help.
Data options are entered on the command line. The program default sets GVKEY as the database key.
Software: cst_print A snapshot ( cst_print) of Industrial Annual data item #70 and #302, with footnotes, for IBM: Selected Data Items: 70, Accounts Payable -applicable footnotes: Accounts Receivable - Decrease (Increase) (Statement of Cash Flows) -no applicable footnotes Data Fiscal Item 70 FTNTS Item 302 Year Yearend AcctsPay 16 AccRecDec … BK BK …
Software Using cst_print and stk_print, one can write output to a data file, which can then be imported into a spreadsheet for further data manipulation or comparison. For example, the following chart examines reported price differences between CRSP and COMPUSTAT for IBM on an annual basis, using data extracted with ts_print. The graph was created in Excel.
Software: dsxport/msxport dsxport and msxport are command line portfolio-building programs. dsxport is used with the daily data, msxport with the monthly data. These programs require a user-created input file containing a list of CUSIPs (8 characters, without the electronic check digit), historical CUSIPs, or PERMNOs to be included in the portfolio. (CUSIP is the default.) This input file can be created in a text editor. The program prompts for input. Output is written to a log file.
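Identifier lists from other sources often carry 9-character CUSIPs, so the check digit has to be dropped before building the input file. The sketch below is a minimal example under stated assumptions: `to_eight_char` and `portfolio_list` are hypothetical helpers, and the one-identifier-per-line layout is an assumption, since the exact input file layout is not shown here.

```python
def to_eight_char(cusip):
    """Return the 8-character CUSIP, dropping the check digit if present."""
    cusip = cusip.strip().upper()
    if len(cusip) == 9:
        return cusip[:8]   # ninth character is the check digit
    if len(cusip) == 8:
        return cusip
    raise ValueError(f"unexpected CUSIP length: {cusip!r}")

def portfolio_list(cusips):
    """Build input text with one 8-character CUSIP per line (assumed layout)."""
    return "\n".join(to_eight_char(c) for c in cusips) + "\n"

# 9-character CUSIPs for IBM and Microsoft.
text = portfolio_list(["459200101", "594918104"])
print(text)
```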
CRSPAccess Programming FORTRAN77 & C Programming Sample programs are available in the C programming language to query both CRSP and COMPUSTAT data. Either C or FORTRAN programs can be used to extract data from the individual databases.
CRSPAccess
Supported Compilers | FORTRAN Compiler | C Compiler
PC | Digital Visual FORTRAN 6.0 | Microsoft Visual C
PC | Digital Visual FORTRAN 6.0 | Microsoft Visual C
PC | Digital Visual FORTRAN 6.0 | Microsoft Visual C
Sun Sparc (Unix) | SparcCompiler FORTRAN 5.0 | SparcCompiler C 5.0
Alpha (Unix) | Compaq FORTRAN 4.1 | Compaq C 5.5
HP 715 Compatible (Unix) | f90 | gnu C Compiler
AIX | f77* | xlc Compiler
Alpha (OpenVMS) | Compaq FORTRAN 7.3 | Compaq C 6.2
*Stock and Indices only. Fortran is not supported with COMPUSTAT data on AIX.
CRSPAccess Summary Conclusion: CRSP provides the most accurate historical financial data available. As part of the University of Chicago’s Graduate School of Business, we bring excellence to the data. Hands-on Research. Extensive Data Clean-up and Analysis. Data Formats You Can Use. Responsive and Friendly Staff. Excellent Technical Support.
Thank You! CRSP thanks you for your time!