Sites Cleanup: The Clone Wars
Kara M. Lewis, Collections Information Program Manager
Patricia L. Nietfeld, Collections Manager
Smithsonian Institution, National Museum of the American Indian

Long, long ago…in 2006…
– NMAI migrated all geographical data from two legacy databases into the Sites module in EMu
– Much of the data was not standardized
– Much of the data was “duplicate” information
– Made the decision to migrate “as is” and use the tools in EMu to clean it up
As a result…

The Plot

The Conflict
– ~39,000 unique combinations
– ~90,000 Sites records created
– ~337,500 Catalog records affected
– At least half were duplicates, or data was in the wrong field
– The rest were “variations,” obsolete place names, misspellings, or just plain wrong

The Plan of Attack

“Do or do not. There is no try.”
Conventions:
– No abbreviations: no “St.” for “Saint”
– Names in language of country
– Alternate versions in parentheses: Lac Saint-Jean (Lake Saint John)
– Use 1st-level political subdivision: Ecuador, Manabí Province
– Use current names

“Do or do not. There is no try.”
Conventions:
– Country – Region? – State (e.g. Pará State, North Region)
– Subdivisions on a case-by-case basis
– Leave blank if the higher subdivision can’t be determined
  - Fill it in if known
  - Most specific info. in Provenience
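
The naming conventions above could be sketched in code roughly as follows. This is only an illustration, not the process used at NMAI: the abbreviation table and both function names are hypothetical, and the slides give only the "St." to "Saint" rule.

```python
import re

# Hypothetical abbreviation table; the slides give only "St." -> "Saint".
ABBREVIATIONS = {"St.": "Saint"}


def expand_abbreviations(name: str) -> str:
    """Spell out known abbreviations that stand as separate tokens."""
    for short, full in ABBREVIATIONS.items():
        # Lookarounds keep the match from firing inside a longer word.
        pattern = r"(?<!\w)" + re.escape(short) + r"(?!\w)"
        name = re.sub(pattern, full, name)
    return name


def with_alternate(local_name: str, alternate: str = "") -> str:
    """Put the local-language name first and any alternate in parentheses."""
    return f"{local_name} ({alternate})" if alternate else local_name


print(with_alternate(expand_abbreviations("Lac St.-Jean"), "Lake Saint John"))
```

A real abbreviation list would need review by someone who knows the data, since "St." can also mean "Street".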

“You must unlearn what you have learned.”
What Pat did not do:
– Not a lot of energy spent on US state archaeological site numbers
– This was cleanup, not verification

“Control, control, you must learn control!”
– Started with a spreadsheet of unique combinations of geographical data
– Split it into smaller spreadsheets by state or country
– Learn about the country
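
Splitting the master spreadsheet by country or state could look something like the sketch below. The column names ("Country", "State", "Site") and the sample rows are made up for illustration; the slides do not show the actual worksheet layout.

```python
import csv
import io
from collections import defaultdict


def split_by_column(rows, column):
    """Group spreadsheet rows by the value in one column (e.g. country)."""
    groups = defaultdict(list)
    for row in rows:
        # Rows with no value are kept together under "(blank)" for review.
        groups[row.get(column) or "(blank)"].append(row)
    return dict(groups)


def to_csv_text(rows, fieldnames):
    """Render one group as CSV text, ready to save as its own worksheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample rows, for illustration only.
combos = [
    {"Country": "Ecuador", "State": "Manabí Province", "Site": "Site A"},
    {"Country": "Brazil", "State": "Pará State", "Site": "Site B"},
    {"Country": "Ecuador", "State": "", "Site": "Site C"},
]
worksheets = split_by_column(combos, "Country")
for country, rows in sorted(worksheets.items()):
    print(country, len(rows))
```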

“Ready are you? What know you of ready?”
Content Resources:
– General: Wikipedia, Statoids.com
– International Travel Maps and Books of Vancouver, Canada
– Country’s official website
– Archaeological websites
– Indigenous peoples’ websites
– Government agencies
– Maplandia.com
– Google, JSTOR
– MAI publications

“It is the future you see.”
Nomenclature Resources:
– US: Geographic Names Information System (GNIS)
– Canada: Geographical Names Search Service (GNSS)
– Others: GEOnet Names Server (GNS)

The worksheets (83 in total)

The Implementation in EMu
– Contractors do not have to be content experts
– Create new Sites, rather than “reuse”
– Practice first
– I do the actual deletions

The Confrontation
1. Start in the Sites module
2. Create list view with all fields
3. Search and group “old” Sites

The Confrontation
4. Open 2nd window and create “new” Sites
5. Find the unique combos of Sites and Provenience in Catalog
6. Check the “Collection” field

The Confrontation
– Start with Objects – usually “one to one” replacement
– Sort & highlight those to receive new Site
– Replace old IRN with new IRN
– Replace, not Replace All
– Replace the Provenience in those already changed
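
The "Replace, not Replace All" idea, changing only the highlighted records, can be modeled in a few lines of plain Python. This is not EMu's API: the record layout (`id`, `site_irn`) and the function name are assumptions made for the sketch.

```python
def replace_site_irn(catalog, selected_ids, old_irn, new_irn):
    """Re-point only the selected catalog records from old_irn to new_irn.

    Records that are not selected keep their current Site, even if they
    also reference old_irn. Returns the number of records changed.
    """
    changed = 0
    for record in catalog:
        if record["id"] in selected_ids and record["site_irn"] == old_irn:
            record["site_irn"] = new_irn
            changed += 1
    return changed


# Made-up records, for illustration only.
catalog = [
    {"id": 1, "site_irn": 1001},
    {"id": 2, "site_irn": 1001},
    {"id": 3, "site_irn": 1002},
]
count = replace_site_irn(catalog, selected_ids={1, 3}, old_irn=1001, new_irn=5000)
print(count, [r["site_irn"] for r in catalog])
```

Record 2 still points at the old Site afterwards, which is the point of replacing a selection rather than everything.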

The Confrontation
– Photo Archives is a different story
– Each record created a new Site record = duplicates
– Many IRNs to replace per “new” Site record
– Instead, use periods to represent wildcards…

The Confrontation
– Replace, not Replace All
– Then go through Provenience as before
– Start with the number of digits that matches the “new” IRN
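
One way to read the period-wildcard step is as a pattern with one period per character, so that any old IRN of the same length as the new one is matched in a single pass. The sketch below uses Python's `re` module to show the idea; it is an interpretation of the slide, not EMu's actual replace syntax, and the function name is hypothetical.

```python
import re


def replace_same_length(values, new_irn):
    """Replace every value that has as many characters as new_irn."""
    # One period per character; a period matches any single character,
    # not only a digit, so the input should already be limited to IRNs.
    pattern = re.compile("." * len(new_irn))
    return [new_irn if pattern.fullmatch(v) else v for v in values]


# Made-up IRNs, for illustration only.
old_irns = ["12345", "67890", "123456"]
print(replace_same_length(old_irns, "55501"))
```

Values with a different number of characters are left alone and would need their own pass.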

The Climax
– Double check with View>Attachments>Selected Records
– When a spreadsheet is completed, retire the “old” Sites

The Climax
– Contractors let me know what is retired
– Double check that all are detached
– DELETE!
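
The "double check that all are detached" step amounts to confirming that no catalog record still references a retired Site before it is deleted. A minimal sketch, assuming a simple list of records with a hypothetical `site_irn` field (in EMu itself this check is done through the Attachments view):

```python
def still_attached(retired_irns, catalog):
    """Return the retired Site IRNs that some catalog record still uses."""
    retired = set(retired_irns)
    return {r["site_irn"] for r in catalog if r["site_irn"] in retired}


def safe_to_delete(retired_irns, catalog):
    """True only when none of the retired Sites is referenced anywhere."""
    return not still_attached(retired_irns, catalog)


# Made-up records, for illustration only.
catalog = [{"id": 1, "site_irn": 5000}, {"id": 2, "site_irn": 1001}]
print(still_attached([1001, 1002], catalog))
print(safe_to_delete([1002], catalog))
```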

Triumph!
– New data export to check unique values
– Checked with Pat on questions
– Final spreadsheet given to contractor
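
Checking unique values in a fresh export could be done by counting how often each combination of geographic fields occurs. The sketch below assumes the export has already been read into a list of dictionaries; the column names and rows are invented for the example.

```python
from collections import Counter


def unique_combinations(rows, columns):
    """Count how many rows share each combination of the given columns."""
    return Counter(tuple(row.get(c, "") for c in columns) for row in rows)


# Made-up export rows, for illustration only.
export = [
    {"Country": "Ecuador", "State": "Manabí Province"},
    {"Country": "Ecuador", "State": "Manabí Province"},
    {"Country": "Brazil", "State": "Pará State"},
]
counts = unique_combinations(export, ["Country", "State"])
for combo, n in counts.most_common():
    print(combo, n)
```

Near-duplicates that differ only in spelling show up as separate combinations, which makes them easy to spot in a sorted list.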

The Resolution
– We now have just under 15,500 Sites records
– We finished in one year
– Averaged 2 contractors at a time
– Module is now tightly controlled
– Data is ready for the web

The End (or is it??)
Sites was just the beginning…
Kara M. Lewis, Collections Information Program Manager
Patricia L. Nietfeld, Collections Manager