Digitization of the Lester S. Levy Collection of Sheet Music Ichiro Fujinaga McGill University with Michael Droettboom, Karl MacMillan, G. Sayeed Choudhury,

Presentation transcript:

Digitization of the Lester S. Levy Collection of Sheet Music
Ichiro Fujinaga, McGill University
with Michael Droettboom, Karl MacMillan, G. Sayeed Choudhury, Tim DiLauro, Mark Patton, Teal Anderson
Levy Project II, Digital Knowledge Center, Sheridan Libraries, Johns Hopkins University

Contents
• Levy Project
• Levy Sheet Music Collection
• Digital Workflow Management
• Optical Music Recognition
• Gamera
• Guido / NoteAbility
Status labels on the slide: Current goals; Digitization completed; Under development

Lester S. Levy Collection

Lester S. Levy Collection
levysheetmusic.mse.jhu.edu
• North American sheet music (1780–1960)
• Digitized 29,000 pieces (130,000 sheets)
• Began in 1994
• Includes "The Star-Spangled Banner" and "Yankee Doodle"
• Database of:
    - metadata
    - images of music (8-bit gray)
    - lyrics (first lines of verse and chorus)
    - color images of cover sheets (32-bit)

Digital Workflow Management
• Reduce the manual intervention for large-scale digitization projects
• Creation of data repository (text, image, sound)
• Optical Music Recognition (OMR)
• Gamera
• XML-based metadata
    - composer, lyricist, arranger, performer, artist, engraver, lithographer, dedicatee, and publisher
    - cross-references for various forms of names, pseudonyms
    - authoritative versions of names and subject terms
• Music and lyric search engines
• Analysis toolkit
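The name handling listed above (roles, cross-references between name forms, authoritative versions) can be sketched as a small XML record plus a lookup table. This is only an illustration: the element and attribute names (`piece`, `name`, `role`, `authority`, `variant`) and the variant spellings are assumptions made for this example, not the project's actual schema or data.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; the schema and the variant spellings are invented.
RECORD = """
<piece>
  <title>The Star-Spangled Banner</title>
  <name role="lyricist" authority="Key, Francis Scott">
    <variant>F. S. Key</variant>
    <variant>Francis S. Key</variant>
  </name>
  <name role="composer" authority="Smith, John Stafford"/>
</piece>
"""


def build_name_index(xml_text):
    """Map every name form (lower-cased) to its authoritative version."""
    root = ET.fromstring(xml_text)
    index = {}
    for name in root.iter("name"):
        authority = name.get("authority")
        index[authority.lower()] = authority
        for variant in name.iter("variant"):
            index[variant.text.strip().lower()] = authority
    return index


def names_by_role(xml_text, role):
    """Return the authoritative names recorded for one role."""
    root = ET.fromstring(xml_text)
    return [n.get("authority") for n in root.iter("name") if n.get("role") == role]


index = build_name_index(RECORD)
print(index["f. s. key"])
print(names_by_role(RECORD, "composer"))
```

A search engine built on such an index can accept any recorded form of a name and still retrieve every piece filed under the authoritative form.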

Optical Music Recognition (OMR)
• Trainable open-source OMR system in development since 1984
• Staff recognition and removal
• Lyric removal
• Stems and notehead removal
• Music symbol classifier
• Score reconstruction
• Lyric classifier?
• Optical Character Recognition (OCR)
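The staff recognition and removal step can be illustrated with a minimal sketch. It is not the project's algorithm: it assumes perfectly horizontal, one-pixel-thick staff lines in a tiny binary image (1 = black) and finds them with a row projection. The function names, the sample image, and the 0.8 fill threshold are all chosen for this example.

```python
def find_staff_rows(image, min_fill=0.8):
    """Return indices of rows whose share of black pixels is at least min_fill."""
    width = len(image[0])
    return [y for y, row in enumerate(image) if sum(row) / width >= min_fill]


def remove_staff_rows(image, staff_rows):
    """Erase staff-line pixels, except where a symbol touches the line.

    A staff pixel is kept when the pixel directly above or below it
    (in a non-staff row) is black, so crossing symbols are not cut.
    """
    height = len(image)
    staff = set(staff_rows)
    result = [row[:] for row in image]
    for y in staff_rows:
        for x, pixel in enumerate(image[y]):
            if not pixel:
                continue
            above = y > 0 and (y - 1) not in staff and image[y - 1][x]
            below = y < height - 1 and (y + 1) not in staff and image[y + 1][x]
            if not (above or below):
                result[y][x] = 0
    return result


# Two staff lines (rows 1 and 4) and a short vertical stroke crossing row 1.
IMAGE = [
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
]

rows = find_staff_rows(IMAGE)
cleaned = remove_staff_rows(IMAGE, rows)
for row in cleaned:
    print("".join("#" if p else "." for p in row))
```

Real scans have skewed, curved, and broken staff lines, which is why a simple projection like this one is not enough in practice.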

The problem
• Suitable OCR for lyrics not found
• Commercial OCR systems are often inadequate for non-standard documents
• The market for specialized recognition of historical documents is very small
• Researchers performing document recognition often "re-invent" the basic image processing wheel

The solution
• Provide easy-to-use tools to allow domain experts (people with specialized knowledge of a collection) to create custom recognition applications
• Generalize OMR for structured documents

Introducing Gamera
Generalized Algorithms and Methods for Enhancement and Restoration of Archives
• Framework for creation of structured document recognition systems
• Designed for domain experts
• Image processing tools (filters, binarizations, …)
• Document segmentation and analysis
• Symbol segmentation and classification
• Syntactical and semantic analysis

Features of Gamera
• Portability (Unix, Windows, Mac)
• Extensibility (Python and C++ plugins)
• Easy to use (experts and programmers)
• Open source
• Graphic User Interface
• Interactive / Batchable (scripts)

Gamera: Interface (screenshot in Linux)

Histogram (screenshot in Linux)

Thresholding (screenshot in Linux)

Staff removal: Lute tablature

Classifier: Lute (screenshot in Linux)

Staff removal: Neumes

Classifier: Neumes (screenshot in Linux)

Greek example

GUIDO Music Notation Format
H. Hoos, K. Renz, J. Kilian
• "A formal language for score-level representation"
• Plain text: readable, platform independent
• Extensible and flexible
• Adequate representation
• NoteServer: Web/Windows
• GUIDO/XML
• NoteAbility (K. Hamel)

Conclusions
• Levy Collection
    - Searchable Metadata
    - Online images (public domain) of music and cover
• Digital Workflow Management
    - Optical Music Recognition
• Gamera for domain experts
    - Includes an easy-to-use interactive environment for experimentation
    - Beta version available on Linux
    - OS X and Windows version in preparation

Acknowledgements
• National Science Foundation
• National Endowment for the Humanities
• Institute of Museum and Library Services
• The Levy Family

OMR: Classifier
• Connected-component analysis
• Feature extraction, e.g.:
    - Width, height, aspect ratio
    - Number of holes
    - Central moments
• k-nearest neighbor classifier
• Genetic algorithm
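The classifier steps listed above can be sketched in plain Python. This is a toy under stated assumptions, not the Gamera implementation: it labels 8-connected components in a tiny binary image, extracts four of the named features (width, height, aspect ratio, number of holes; central moments are left out for brevity), and classifies with a weighted k-nearest-neighbor vote. The training vectors and class labels are made up.

```python
from collections import Counter, deque


def connected_components(image):
    """Group black pixels (1) into 8-connected components of (y, x) tuples."""
    height, width = len(image), len(image[0])
    seen, components = set(), []
    for y in range(height):
        for x in range(width):
            if not image[y][x] or (y, x) in seen:
                continue
            seen.add((y, x))
            queue, pixels = deque([(y, x)]), []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def count_holes(pixels):
    """Count background regions fully enclosed by the component."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    top, left = min(ys) - 1, min(xs) - 1  # pad the bounding box by one pixel
    height, width = max(ys) - top + 2, max(xs) - left + 2
    visited = {(y - top, x - left) for y, x in pixels}
    regions = 0
    for sy in range(height):
        for sx in range(width):
            if (sy, sx) in visited:
                continue
            regions += 1  # new 4-connected background region
            visited.add((sy, sx))
            stack = [(sy, sx)]
            while stack:
                cy, cx = stack.pop()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < height and 0 <= nx < width and (ny, nx) not in visited:
                        visited.add((ny, nx))
                        stack.append((ny, nx))
    return regions - 1  # the padded outer background is always one region


def extract_features(pixels):
    """Return (width, height, aspect ratio, number of holes)."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    width = max(xs) - min(xs) + 1
    height = max(ys) - min(ys) + 1
    return (width, height, width / height, count_holes(pixels))


def knn_classify(sample, training, k=1, weights=None):
    """Majority vote among the k nearest training vectors (weighted squared distance)."""
    weights = weights or [1.0] * len(sample)

    def distance(a, b):
        return sum(w * (p - q) ** 2 for w, p, q in zip(weights, a, b))

    nearest = sorted(training, key=lambda item: distance(sample, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# A ring-shaped glyph (one hole) next to a vertical bar.
IMAGE = [
    [1, 1, 1, 0, 1],
    [1, 0, 1, 0, 1],
    [1, 1, 1, 0, 1],
]
# Hypothetical training set: (feature vector, class name).
TRAINING = [((4, 4, 1.0, 1), "open notehead"), ((1, 4, 0.25, 0), "stem")]

features = [extract_features(c) for c in connected_components(IMAGE)]
labels = [knn_classify(f, TRAINING) for f in features]
print(labels)
```

The `weights` argument is where a per-feature weight vector, such as one found by a genetic algorithm, would plug in.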

Overall Architecture for OMR (diagram)
Diagram labels:
• Image File, Staff removal, Segmentation, Recognition (K-NN Classifier), Output Symbol Name
• Knowledge Base: Feature Vectors
• Off-line Optimization: Genetic Algorithm, K-NN Classifier, Best Weight Vector
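The off-line optimization in this architecture pairs a genetic algorithm with the k-NN classifier to find a best weight vector. The following is a minimal sketch of that idea, assuming a fitness function based on leave-one-out accuracy, uniform crossover, Gaussian mutation, and elitism. These choices, the parameter values, and the toy data (first feature informative, second feature noise) are assumptions for illustration and are not taken from the slides.

```python
import random


def loo_accuracy(weights, data):
    """Leave-one-out accuracy of a weighted 1-nearest-neighbor classifier."""
    correct = 0
    for i, (features, label) in enumerate(data):
        others = data[:i] + data[i + 1:]
        nearest = min(
            others,
            key=lambda item: sum(w * (a - b) ** 2
                                 for w, a, b in zip(weights, features, item[0])),
        )
        correct += nearest[1] == label
    return correct / len(data)


def evolve_weights(data, population_size=12, generations=20, seed=0):
    """Search for a feature weight vector that maximizes loo_accuracy."""
    rng = random.Random(seed)
    n = len(data[0][0])
    # Start from uniform weights plus random candidates in [0, 1).
    population = [[1.0] * n] + [[rng.random() for _ in range(n)]
                                for _ in range(population_size - 1)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda w: loo_accuracy(w, data), reverse=True)
        parents = ranked[: population_size // 2]
        children = [ranked[0]]  # elitism: the best vector always survives
        while len(children) < population_size:
            mother, father = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(mother, father)]  # uniform crossover
            gene = rng.randrange(n)
            child[gene] = min(1.0, max(0.0, child[gene] + rng.gauss(0, 0.2)))  # mutation
            children.append(child)
        population = children
    return max(population, key=lambda w: loo_accuracy(w, data))


# Toy knowledge base: feature 0 separates the classes, feature 1 is noise.
DATA = [
    ((1.0, 9.0), "a"), ((1.2, 1.0), "a"), ((0.9, 5.0), "a"),
    ((3.0, 8.5), "b"), ((3.1, 1.5), "b"), ((2.9, 5.2), "b"),
]

best = evolve_weights(DATA)
print(best, loo_accuracy(best, DATA))
```

Because the search runs off-line, its cost does not affect recognition time; only the resulting weight vector is used when classifying new symbols.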

Architecture of Gamera (diagram)
Diagram labels:
• Graphic User Interface (wxWindows)
• Scripting Environment (Python)
• GAMERA Core (C++)
• Plugins (Python)
• Plugins (C++)
• Automatic Plugin Wrapper (Boost)

GUIDO: An example
{ [ \beamsOff | \clef \key f#*1/8. g*1/16 | a*1/4. d2*1/8 d*1/4. c#*1/8 | e1*1/2 _*1/4 f#*1/8. g*1/16 | c#2*1/4. b1*1/8 a*1/4. g*1/8 | | e#*1/2 f#*1/4 f#*1/8. g*1/16 | a*1/4. d2*1/8 d*1/4. c#*1/8 | e1*1/2 _*1/4 f#*1/8 g | c#2*1/4. b1*1/8 a*1/4. c#*1/8 ], …
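Because GUIDO is plain text, the note events in an example like this one can be read with a few lines of code. The sketch below handles only the note syntax visible in the example (pitch letter or `_` for a rest, `#`/`&` accidentals, optional octave, `*n/d` duration, dots) and assumes that a note without a duration inherits the previous one. Tags such as `\clef` are not handled, octave carry-over is not modeled, and `parse_bar` is a name invented here, not part of any GUIDO tool.

```python
import re
from fractions import Fraction

# pitch or rest, accidentals, optional octave, optional *num/den, dots
NOTE = re.compile(r"([a-h_])([#&]*)(-?\d+)?(?:\*(\d+)/(\d+))?(\.*)")


def parse_bar(bar, default=Fraction(1, 4)):
    """Parse whitespace-separated notes into (pitch, octave, duration) tuples."""
    events = []
    previous = default
    for token in bar.split():
        match = NOTE.fullmatch(token)
        if match is None:
            raise ValueError(f"unsupported token: {token!r}")
        name, accidental, octave, num, den, dots = match.groups()
        if num is not None:
            base = Fraction(int(num), int(den))
            # n dots lengthen the note to base * (2 - 1/2**n)
            previous = base * (2 - Fraction(1, 2 ** len(dots)))
        events.append((name + accidental,
                       int(octave) if octave is not None else None,
                       previous))
    return events


bar = parse_bar("a*1/4. d2*1/8 d*1/4. c#*1/8")
for pitch, octave, duration in bar:
    print(pitch, octave, duration)
print("bar length:", sum(duration for _, _, duration in bar))
```

Using `Fraction` keeps dotted durations exact, so bar lengths can be compared without rounding errors.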