Using Old Streets to Make New Inroads to Data: Part 1

Presentation transcript:

Using Old Streets to Make New Inroads to Data: Part 1 Cole Hudson Wayne State University Hi, my name is Cole Hudson. I’m with Wayne State University, and I’m here to briefly talk about a project we are embarking upon at Wayne State.

Motor City Madness We have lots of streets. They’ve been around for a long time. Their numbering and names have changed. This is a problem for researchers.

A book to the rescue! There is a book called The Old and new house numbers: new house numbers effective January 1st, 1921. It’s a book full of tables of streets, each with a street name at the top, a column for the old street number, and a column for the new street number. There’s only one copy. It’s at the Detroit Public Library across the street. We asked our buddies there if we could borrow it, and they were cool enough to let us scan it for our project. There is a PDF of it on a random site that lots of people use, but it’s a pretty poor scan of the book. We wanted to fix this. Our project, which is just in the beginning stages, will scan this book, transcribe the charts in the book into parseable data that can be plugged into a database, and build a web app that lets people match up pre-1921 addresses with modern street addresses. Oh, and we’re going to apply for a grant to pay a student intern to help us with the data work. Here’s a quick overview of what we’ve done.

What we’ve done: Scanned the book. Made an interface that helps us cut out the street address tables in the book. In the process of applying for a grant.

Book So, here’s what the digitized book looks like. We crop out sections of the book using a JavaScript tool that captures the coordinates of a box we draw. Then it sends that information to a table in a database. And we serve that image out using our Loris IIIF image server, which has no problems reading coordinates from an image. Finally, it’s all ready for our OCR process and data checkup.

Crop and Display Here’s what the box drawing tool gets us. A table with a link to the section of the page that we’ve identified that has relevant text.

The URL https://digital.library.wayne.edu/loris/fedora:wayne:detroithousenumbers_Page_3|JP2/144,612,253,194/full/0/default.jpg
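The URL on this slide follows the IIIF Image API pattern of identifier, region, size, rotation, and quality plus format, where the region is the drawn box written as x,y,width,height. As a rough sketch of how stored box coordinates could be turned into such a URL, here is a small Python helper. The names `CropBox` and `iiif_region_url` are hypothetical and not taken from the project's code, and the identifier is left unencoded only to mirror the URL as shown on the slide (a stricter client would percent-encode it).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CropBox:
    """A box drawn on a page image, in pixels (hypothetical structure)."""
    x: int
    y: int
    width: int
    height: int


def iiif_region_url(base_url, identifier, box,
                    size="full", rotation=0, quality="default", fmt="jpg"):
    """Build an IIIF Image API URL for one cropped region of an image.

    Pattern: {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
    """
    if box.width <= 0 or box.height <= 0:
        raise ValueError("crop box must have positive width and height")
    # The region segment is x,y,width,height measured from the top-left corner.
    region = f"{box.x},{box.y},{box.width},{box.height}"
    return (f"{base_url.rstrip('/')}/{identifier}/{region}/"
            f"{size}/{rotation}/{quality}.{fmt}")


# Values taken from the slide's example URL.
url = iiif_region_url(
    "https://digital.library.wayne.edu/loris",
    "fedora:wayne:detroithousenumbers_Page_3|JP2",
    CropBox(x=144, y=612, width=253, height=194),
)
print(url)
```

Keeping the box as four integers in the database means the same row can be reused to request the crop at a different size or format later, since only the trailing URL segments change.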

Sample Image Then here’s the image itself. Now it’s ready for OCR’ing with our Abbyy Recognition Server software.

Next Steps: OCR book chunks with Abbyy Recognition Server (looking for some CSV output). Hire a work study student for data cleanup. Bulk ingest OCR’d data into DB. Build a public web interface, probably in Python; there is lots of promising bulk comparison ability using the pandas Python library.
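To make the planned pipeline a bit more concrete, here is a minimal sketch of the ingest-and-lookup step, assuming the OCR output arrives as CSV with one row per address. Everything in it is an assumption for illustration: the column names, the `house_numbers` table, the helper functions, and the sample rows are made up and are not data from the book. It uses only Python's standard `csv` and `sqlite3` modules; the same comparison could be done with pandas instead.

```python
import csv
import io
import sqlite3

# Made-up sample rows standing in for cleaned OCR output (not real data).
SAMPLE_CSV = """street,old_number,new_number
Example Ave,12,1204
Example Ave,14,1210
Sample St,3,305
"""


def load_rows(csv_text):
    """Parse CSV text into (street, old_number, new_number) tuples."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["street"].strip(), row["old_number"].strip(),
         row["new_number"].strip())
        for row in reader
    ]


def build_db(rows):
    """Bulk-insert the rows into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE house_numbers ("
        "street TEXT NOT NULL, old_number TEXT NOT NULL, "
        "new_number TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO house_numbers VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def lookup_new_number(conn, street, old_number):
    """Return the post-1921 number for an old address, or None if unknown."""
    row = conn.execute(
        "SELECT new_number FROM house_numbers "
        "WHERE street = ? COLLATE NOCASE AND old_number = ?",
        (street, str(old_number)),
    ).fetchone()
    return row[0] if row else None


conn = build_db(load_rows(SAMPLE_CSV))
print(lookup_new_number(conn, "example ave", 12))
```

House numbers are stored as text rather than integers because historical numbers can carry suffixes or fractions, which an integer column would reject.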

Hope to have more next year! Look for an update at Code4Lib Midwest 2019! Acknowledgments: Alexandra Sarkozy (team lead), Graham Hukill, Jodi Coulter, Clayton Hayes. Thank you! Cole Hudson, Digital Publishing Librarian, Wayne State University Libraries, @colehud