Economic Data Time Travel Adrienne Brennecke September 30, 2011.


1

2 Economic Data Time Travel Adrienne Brennecke September 30, 2011

3

4 New York Times Article

5

6 Quick demo http://alfred.stlouisfed.org/

7 Value of this history
Determine the accuracy of early estimates
Evaluate policy decisions using information available at the time, not what is known in hindsight
Allow economists to model the economy using the data that were actually available

8 Value of this website
Users can save data sets to their own account
Share Published Data Lists
Average about 3,000 unique visitors a month

9 History of the Project
Why?
How?
Challenges?
Technical details

10 Looking for revisions, and then solutions
Former Research Director was looking for the economic data that were released originally, not the revised data
We searched high and low...
– Libraries removed news releases when the final version was published
– Agencies historically wrote over the data, as the computing storage costs were high

11

12 Help from libraries
Searched online catalogs for press releases
Called documents librarians all over the country
Contacted issuing agencies and the Library of Congress
Depository libraries came through for us

13 Challenges
How to design ALFRED to store revisions
– See Developing Time-Oriented Database Applications in SQL
Finding and verifying old data and release dates
Early electronic information lost
Underestimating the amount of work involved
Figuring out the best process, and dealing with changing workloads for staff

14 Technical details
These data are saved only when there are revisions; each data value has three pieces of information
– The time period it applies to (e.g., 2nd quarter 2011)
– The time period it is true for (e.g., from July 30th to August 26th)
– The date that the information was entered into the database to allow for tracking of data entry errors
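
The three pieces of information on this slide can be pictured as one record per published value. What follows is a minimal sketch, not ALFRED's actual schema: the names `Vintage`, `value_as_of`, `valid_from`, `valid_to`, and `entered_on` are hypothetical, the numeric values are placeholders, and the end of a validity window is assumed to be inclusive.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Vintage:
    period: str                # period the value applies to, e.g. "2011Q2"
    value: float               # the published figure (placeholder numbers below)
    valid_from: date           # first day this figure was the current one
    valid_to: Optional[date]   # last day it was current; None means still current
    entered_on: date           # day the row was keyed in, kept for error tracking


def value_as_of(rows, period, as_of):
    """Return the figure for `period` as it was known on `as_of`, or None."""
    for row in rows:
        if row.period != period:
            continue
        still_open = row.valid_to is None or as_of <= row.valid_to
        if row.valid_from <= as_of and still_open:
            return row.value
    return None


# Illustrative rows only; the dates follow the slide's example window.
rows = [
    Vintage("2011Q2", 1.3, date(2011, 7, 30), date(2011, 8, 26), date(2011, 7, 30)),
    Vintage("2011Q2", 1.0, date(2011, 8, 27), None, date(2011, 8, 27)),
]
```

Asking `value_as_of(rows, "2011Q2", date(2011, 8, 1))` returns the first figure, while a date in September returns the revised one, which is the "time travel" the talk's title refers to.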

15 Technical details
Under the hood, FRED and ALFRED are the same application.
– ALFRED was populated by collecting historical data for series in FRED, and ALFRED continues to be extended by capturing "expiring" FRED values when new ones are published.
– The coverage dates for data series are the same in both FRED and ALFRED
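
Capturing an "expiring" value when a new one is published might look roughly like the sketch below. It is an illustration under stated assumptions, not the FRED/ALFRED code: the function `publish`, the dictionary layout, and the rule that the old value's window closes the day before the new release are all hypothetical. It also skips storage when the new figure equals the current one, in line with the earlier note that data are saved only when there are revisions.

```python
from datetime import date, timedelta


def publish(history, period, new_value, release_date):
    """Record a newly published figure; return True if a new version was stored.

    `history` maps a period label to a list of version dicts, oldest first.
    """
    versions = history.setdefault(period, [])
    if versions:
        current = versions[-1]
        if current["value"] == new_value:
            return False  # not a revision, so nothing new is stored
        # Close the expiring value's window (assumed: the day before the release).
        current["valid_to"] = release_date - timedelta(days=1)
    versions.append(
        {"value": new_value, "valid_from": release_date, "valid_to": None}
    )
    return True


# Placeholder figures and dates, for illustration only.
history = {}
first = publish(history, "2011Q2", 1.3, date(2011, 7, 29))
repeat = publish(history, "2011Q2", 1.3, date(2011, 8, 10))
revised = publish(history, "2011Q2", 1.0, date(2011, 8, 26))
```

After these three calls, `history["2011Q2"]` holds two versions: the original figure with a closed window and the revision with an open one.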

16 Conclusion
ALFRED shows revisions to a series and presents data as they were at a particular point in time
Unique information, FREE and easily accessed
Preserving important data for future research

17 FRASER: Federal Reserve Archival System for Economic Research

18

19 Technical Aspects of FRASER
Variation on LAMP software bundle
– Linux operating system
– Apache web server
– PostgreSQL database (rather than the more common MySQL)
– PHP programming

20 Google search appliance
– Metadata plus full text (OCR)
– Basic and advanced search options available
– Standard Google search functions, plus a couple filters unique to FRASER

21 Topic Collections
Special/Archival Collections

22 Publications
Originally, data publications
Now include various types of serials and monographs
Statistical releases
Available issues, arranged by date
Bibliographic information

23 Historical Documents
Based on categories
Originally “non-data” publications
Documents
Categories

24 Special Collections

25 Page Stacking
Purpose:
– View a single data series over time
Solution:
– Grouped page files
– PDFlib+PDI
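
The grouping step behind page stacking can be sketched as follows. This is only an illustration of the idea: the record layout, the function `stack_order`, the table names, and the file names are hypothetical, and the sketch stops before any PDF assembly, which the slide attributes to PDFlib+PDI.

```python
from datetime import date


def stack_order(pages, table_name):
    """List (pdf_file, page_number) pairs for one table, oldest issue first."""
    matching = [p for p in pages if p["table_name"] == table_name]
    matching.sort(key=lambda p: (p["issue_date"], p["page_number"]))
    return [(p["pdf_file"], p["page_number"]) for p in matching]


# Hypothetical page records, one per scanned page that carries a named table.
pages = [
    {"issue_date": date(1950, 2, 1), "pdf_file": "issue_1950_02.pdf",
     "page_number": 7, "table_name": "Consumer Credit"},
    {"issue_date": date(1950, 1, 1), "pdf_file": "issue_1950_01.pdf",
     "page_number": 5, "table_name": "Consumer Credit"},
    {"issue_date": date(1950, 1, 1), "pdf_file": "issue_1950_01.pdf",
     "page_number": 9, "table_name": "Bank Debits"},
]

stacked = stack_order(pages, "Consumer Credit")
```

The resulting list names the pages to pull, in date order, so that one table can be read across successive issues as a single stacked document.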

26 Personnel
Center for Economic Documents Digitization (CEDD) consists of
– 1 manager
– 1 librarian
– 5 part-time scanning clerks
Additional support from
– Web group
– Library director

27 Digitization Process
Selection and preparation
– Review paper documents & establish scanning procedures
Scan
– Additional review, page by page
Quality check (QC)
– This is done by a person other than the scanner
Clean scanned image
– Process varies based on project
Create PDF
OCR
Add metadata
QC (brief)
Transfer to server
– This must be done by one of the two librarians
Post to FRASER
– Items can be posted as publications, historical documents, or special collections, each with its own interface and metadata options
Add link to catalog record and OCLC record
– This is done by the library’s cataloger, outside of the CEDD

28 Locating Paper Copies
We scan documents from
– Our own library collection, and other Fed libraries
– FDLP Needs and Offers lists
– Interlibrary loan
– Partner institutions
But…
– As we digitize, libraries throw out paper copies

29 Copyright
We focus on public domain materials
– Federal Reserve Bank publications (not technically public domain, but we have an agreement to digitize)
– Federal Government publications
– Pre-1923 publications

30 Hardware and Software
Hardware
Automatic Document Feeder (ADF)
– 3 - Fujitsu fi-5650C
– 2 - Fujitsu fi-6670 (newer model)
Overhead/planetary scanner
– 1 - Indus Color Book Scanner 5002
Flatbed scanner
– 1 - Epson Expressions Graphic Arts 10000XL
Software
ImageWare BCS-2
– Indus scanning
Techsoft PixEdit7
– Fujitsu and Epson scanning, and all cleaning
ABBYY FineReader 10.0
– OCR
Adobe Acrobat 9 Pro
– Metadata
Also:
Microsoft Access 2007
– Metadata and tracking purposes for some larger collections
PDF Summary Maker
– Embedding metadata from Access into pdfs

31 Image/text areas as recognized by OCR software
– Green=text
– Blue=table
– Red=picture
Text recognized by OCR software
– Blue=uncertain character(s)

32 Data Entry
Web-based forms for data entry
Here: setting up the overall publication (library catalog-level metadata)

33 Data Entry
Issue-level metadata
– Issue date
– Issue title (text-formatted date, or other title)
– Attach pdf
– Enter table names and page titles for the page stacking described earlier

34 Data Entry
Historical and Special Collection documents have both publication- and issue-level metadata
Special Collection Document

35 Output
3 image files
– Original multipage tiff
– Cleaned multipage tiff
– PDF
3 types of text/metadata
– Underlying text in pdf (OCR)
– Title and author embedded in pdf
– Other metadata entered in database when posting

36 Contact Us
Adrienne Brennecke
alfred.stlouisfed.org/
Data Acquisitions, Reference Librarian
314-444-7479
adrienne.j.brennecke@stls.frb.org
Pamela Campbell
fraser.stlouisfed.org/
Digital Projects Librarian
314-444-8907
pamela.d.campbell@stls.frb.org

