The TARO Project: Texas Archival Resources Online
Fred Gilmore
Sr. Operating Systems Specialist, UT Austin General Libraries
April 30, 2004
What It Is...
– A project to make finding aids for Texas archive and manuscript collections available through the Web.
– “Finding aid”: a descriptive summary and inventory of a collection of materials housed at a specific archive; not the materials themselves.
– Currently: searchable, browsable finding aids, hits / day
How It Came To Be...
Two grant-funded phases:
– Outsourced scanning, OCR, and XML tagging of existing paper finding aids
– Training, hardware, and software for creating new finding aids
– Phase I (2000–2001): 14 participating repositories
– Phase II (2002–2003): 11 additional repositories
Participating Repositories
– Alexander Architectural Archive (UT Austin)
– Center For American History (UT Austin)
– Benson Latin America Collection (UT Austin)
– Ransom Humanities Research Center (UT Austin)
– Texas State Library
– Texas Tech Southwest Collection/University Archives
– University of Houston Special Collections/University Archives
– Rice University
– Texas A&M
– Houston Public Library
– Austin History Center
– UT San Antonio
– Texas State University
– Southern Methodist University
– UT Medical Branch – Galveston
– MD Anderson
– UT El Paso
– UT Pan American
– UT Arlington
How It Came To Be...
Why XML?
– Compose once, format many
– XML and related standards make data exchange, reuse, and description easier by separating content from presentation.
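To illustrate "compose once, format many", here is a minimal Python sketch in which one XML source is rendered two ways. The `<findingaid>` wrapper and the functions `fields`, `to_html`, and `to_text` are hypothetical names chosen for this example; `unittitle`, `repository`, and `abstract` are borrowed from EAD element names, but the flat structure is a simplification and not TARO's actual markup or stylesheets.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical, heavily simplified finding aid (not real TARO markup).
SAMPLE = """<findingaid>
  <unittitle>Thomas J. Rollins Papers</unittitle>
  <repository>Southwest Collection/Special Collections Library</repository>
  <abstract>The personal papers of Thomas J. Rollins.</abstract>
</findingaid>"""


def fields(xml_text):
    """Parse the XML once and return a tag -> text mapping of its children."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


def to_html(xml_text):
    """Render the finding aid as a small HTML fragment."""
    f = fields(xml_text)
    return (
        f"<h1>{escape(f['unittitle'])}</h1>\n"
        f"<p>{escape(f['repository'])}</p>\n"
        f"<p>{escape(f['abstract'])}</p>"
    )


def to_text(xml_text):
    """Render the same finding aid as plain text."""
    f = fields(xml_text)
    return f"{f['unittitle']}\n{f['repository']}\n\n{f['abstract']}"


if __name__ == "__main__":
    print(to_html(SAMPLE))
    print(to_text(SAMPLE))
```

The source document is written once; adding another output format means adding another renderer, not editing the finding aid.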
Creating Content For TARO
Archives staff:
– Edit or compose an XML-tagged electronic version of the finding aid (new finding aids are created using a text/XML editor such as Corel XMetaL)
– Submit the file to the UT Austin server
[Example finding aid: Thomas J. Rollins Papers. The personal papers of Thomas J. Rollins. Southwest Collection/Special Collections Library. Markup and dates not preserved.]
Creating Content For TARO
UT Austin technical staff:
– The XML file is moved into production, error checked, and translated into three HTML varieties for viewing.
– HTML content is indexed for searching (keyword and fielded) and sorted into repository lists for browsing.
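The error-check, index, and browse steps can be sketched roughly as follows. This is an assumption-laden illustration in Python, not the production pipeline: the function names, the flat `<findingaid>` structure, and the file names are hypothetical. "Error checked" is reduced to an XML well-formedness test, the index covers keywords only (no fielded search), and the HTML translation step is left out.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


def check_well_formed(xml_text):
    """Return (root, None) if the XML parses, else (None, error message)."""
    try:
        return ET.fromstring(xml_text), None
    except ET.ParseError as exc:
        return None, str(exc)


def index_keywords(docs):
    """Build an inverted index: lowercase keyword -> set of document ids."""
    index = defaultdict(set)
    for doc_id, root in docs.items():
        text = " ".join(root.itertext())
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index


def browse_lists(docs):
    """Group finding aid titles by repository, sorted for browsing."""
    lists = defaultdict(list)
    for root in docs.values():
        repo = root.findtext("repository", default="Unknown").strip()
        title = root.findtext("unittitle", default="Untitled").strip()
        lists[repo].append(title)
    return {repo: sorted(titles) for repo, titles in sorted(lists.items())}


# Hypothetical submissions: one well-formed file, one with a missing end tag.
submitted = {
    "a.xml": (
        "<findingaid><unittitle>Thomas J. Rollins Papers</unittitle>"
        "<repository>Southwest Collection/Special Collections Library"
        "</repository></findingaid>"
    ),
    "broken.xml": "<findingaid><unittitle>Unclosed title</findingaid>",
}

accepted, rejected = {}, {}
for name, text in submitted.items():
    root, error = check_well_formed(text)
    if error is None:
        accepted[name] = root
    else:
        rejected[name] = error

index = index_keywords(accepted)
browse = browse_lists(accepted)
```

Only files that pass the check reach the index and the browse lists, so a malformed submission cannot break the public site; it is reported back instead.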
Advantages
– Pages are picked up by Google, giving content higher visibility.
– Multiple views of content, including the ability to customize the view by running the XML document against a personal stylesheet.
– Processing is fully automated; HTML-translated files can be available within hours.
– DC (Dublin Core) metadata and OAI records provide additional access points.
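As a rough illustration of the Dublin Core point, the Python sketch below derives a small `oai_dc` record from a simplified finding aid. The two namespace URIs are the standard Dublin Core and OAI ones; the field mapping in `FIELD_MAP`, the function name `to_dublin_core`, and the flat `<findingaid>` input are assumptions made for this example and do not describe TARO's actual crosswalk.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
ET.register_namespace("dc", DC_NS)
ET.register_namespace("oai_dc", OAI_DC_NS)

# Assumed mapping from simplified finding aid fields to Dublin Core elements.
FIELD_MAP = {
    "unittitle": "title",
    "repository": "publisher",
    "abstract": "description",
}


def to_dublin_core(finding_aid_xml):
    """Build an oai_dc:dc element from the fields present in the input."""
    source = ET.fromstring(finding_aid_xml)
    record = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for src_tag, dc_name in FIELD_MAP.items():
        value = source.findtext(src_tag)
        if value and value.strip():
            ET.SubElement(record, f"{{{DC_NS}}}{dc_name}").text = value.strip()
    return record


if __name__ == "__main__":
    sample = (
        "<findingaid><unittitle>Thomas J. Rollins Papers</unittitle>"
        "<repository>Southwest Collection/Special Collections Library"
        "</repository></findingaid>"
    )
    print(ET.tostring(to_dublin_core(sample), encoding="unicode"))
```

Fields missing from the finding aid are simply skipped, which matches the idea of supplemental metadata being optional per repository.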
Challenges
Relationships
– Mediating between local needs and federated site requirements.
– Encouraging supplemental metadata creation.
Resources
– Introducing improvements without dedicated staff on either end.
Challenges
Realities of the Web
– User education: the site is practically a meta-site, so users' content expectations are not met.
– Finding aids can be large, so load times are a problem.
– XML Unicode requirements make special characters tricky.
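One common way to sidestep special-character trouble, shown here only as a sketch and not as TARO's documented practice, is to replace non-ASCII characters with numeric character references before the text goes into an XML file. The helper name `to_ascii_xml_text` is hypothetical.

```python
import xml.etree.ElementTree as ET


def to_ascii_xml_text(text):
    """Replace each non-ASCII character with a numeric character reference.

    This does not escape '&' or '<'; those still need normal XML escaping.
    """
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


original = "caf\u00e9"  # "café", written with an escape to keep this file ASCII
safe = to_ascii_xml_text(original)

# An XML parser turns the reference back into the original character.
round_trip = ET.fromstring(f"<p>{safe}</p>").text
```

The resulting file is plain ASCII, so it survives editors and transfers that mishandle encodings, while any conforming XML parser still recovers the original characters.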
Future Plans
– Searching: search the XML directly
– Content: fund the creation and serving of pictures, sound, and video
– Participation: more repositories = more content
– Access: Open Archives, RDF metadata
– Flexibility: provide a stylesheet for direct XML browsing; PDF creation for hardcopy