LEADS-4-NDP: Fellowship


LEADS Fellow: Sam Grabus, Drexel University, Metadata Research Center
LEADS Site: Temple University Library, Digital Scholarship Center
Mentor: Peter Logan

Indexing the Data Set of 19th Century Knowledge
Sam Grabus, Drexel University; Peter Logan, Temple University; Jane Greenberg

PROJECT GOALS
- Temple’s broad data science question: investigate how the specification of concepts changes over time across 4 historical editions of the Encyclopedia Britannica (1797-1911)
- Use automatic indexing to create descriptive metadata for individual entries that can be used for analysis across all 4 editions

APPROACH
- Identify encyclopedia entry terms that exist in all 4 editions of the Encyclopedia Britannica
- Automatically index entries with HIVE using contemporary LCSH and keyword extraction algorithms

ACCOMPLISHMENTS
- Data cleaning with R
- Intersected 4 lists of entry terms to determine which terms appear in all 4 editions of the encyclopedia; created TXT files for each entry
- Ran sample TXT files through HIVE to generate automatic indexing results
- Tested 3 keyword extraction algorithms: Kea, Maui, and RAKE
- Compared LCSH and AGROVOC vocabularies
- Identified challenges and next steps for optimizing RAKE algorithm parameters and adding historical controlled vocabularies to HIVE

AUTOMATIC INDEXING WITH HIVE
*Acknowledgements to Joan Boone
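
The intersection step listed under ACCOMPLISHMENTS was done in R; the slides do not show the code. As a rough illustration only, the same idea might look like the Python sketch below. The edition names, the sample terms, and the normalization rule (upper-casing and collapsing whitespace) are all assumptions made for the example, not details from the project.

```python
def normalize(term: str) -> str:
    """Collapse case and whitespace so the same headword compares equal across editions.

    This normalization rule is an assumption for illustration.
    """
    return " ".join(term.upper().split())


def common_entry_terms(editions: dict[str, list[str]]) -> list[str]:
    """Return the normalized entry terms that occur in every edition's list."""
    term_sets = [{normalize(t) for t in terms} for terms in editions.values()]
    if not term_sets:
        return []
    # A term survives only if it is present in all of the sets.
    return sorted(set.intersection(*term_sets))


# Hypothetical sample data, not taken from the encyclopedias.
editions = {
    "edition_a": ["Rum", "Raleigh (Sir Walter)", "Algebra"],
    "edition_b": ["RUM", "Algebra", "Botany"],
    "edition_c": ["Algebra", "rum ", "Chemistry"],
    "edition_d": ["Rum", "ALGEBRA", "Raleigh (Sir Walter)"],
}

shared = common_entry_terms(editions)
print(shared)  # ['ALGEBRA', 'RUM']
```

Each term in `shared` would then correspond to one entry per edition, which is the point at which per-entry TXT files could be written out for indexing.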

Next Steps: Moving onwards with an NEH grant
- Identifying appropriate historical knowledge organization systems
  - Challenges of using contemporary LCSH for 19th century text, e.g., Encyclopedia entries for Raleigh (Sir Walter), SIR (Information Retrieval System)
- Digitize one or more of the historical vocabularies into XML or SKOS
  - 1910-1914 LCSH
  - Universal Decimal Classification (UDC)
  - Dewey Decimal Classification (DDC)
  - e.g., SKOS for “Rum” in LCSH
- Relevancy testing to identify most effective vocabularies for indexing these encyclopedia entries
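
To make the SKOS example concrete, here is a minimal sketch of how one digitized heading could be serialized as a SKOS concept in Turtle. It is not the project's actual record: the `example.org` URIs and the broader-concept link are hypothetical placeholders, and only the `skos:` namespace and property names come from the SKOS standard.

```python
SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n\n"


def turtle_literal(text: str, lang: str = "en") -> str:
    """Quote a label as a language-tagged Turtle literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def skos_concept(uri: str, pref_label: str, scheme: str, broader: tuple[str, ...] = ()) -> str:
    """Build one skos:Concept description as a Turtle string."""
    statements = [
        f"<{uri}> a skos:Concept",
        f"    skos:prefLabel {turtle_literal(pref_label)}",
        f"    skos:inScheme <{scheme}>",
    ]
    for broader_uri in broader:
        statements.append(f"    skos:broader <{broader_uri}>")
    # Turtle separates predicates of one subject with ';' and ends with '.'.
    return SKOS_PREFIX + " ;\n".join(statements) + " .\n"


# Hypothetical URIs for illustration; a real record would use the
# vocabulary's own identifiers and relationships.
record = skos_concept(
    uri="http://example.org/historical-lcsh/rum",
    pref_label="Rum",
    scheme="http://example.org/historical-lcsh",
    broader=("http://example.org/historical-lcsh/placeholder-broader-term",),
)
print(record)
```

Keeping each heading as a separate concept with explicit `skos:broader` links is one way a historical vocabulary could be loaded alongside the contemporary ones for relevancy testing.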