Optical Music Recognition


Optical Music Recognition Ichiro Fujinaga McGill University 2003

Content
- Optical Music Recognition
- Levy Project
- Gamera
- Levy Sheet Music Collection
- Digital Workflow Management
- Gamera
- Guido / NoteAbility

Optical Music Recognition (OMR)
- Trainable open-source OMR system in development since 1984
- Staff recognition and removal
  - Run-length coding
  - Projections
- Lyric removal / classifier
- Stems and notehead removal
- Music symbol classifier
- Score reconstruction
- Demo
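The slide names run-length coding as one ingredient of staff recognition but does not show how it is used. A minimal sketch, assuming a binary image stored as a list of rows (1 = black) and assuming the common heuristic that the most frequent vertical black run approximates staff-line thickness; the function names are hypothetical and are not part of Gamera:

```python
from collections import Counter

def vertical_runs(image, col):
    """Return (value, length) runs read top to bottom in one column."""
    runs = []
    prev, length = image[0][col], 0
    for row in image:
        if row[col] == prev:
            length += 1
        else:
            runs.append((prev, length))
            prev, length = row[col], 1
    runs.append((prev, length))
    return runs

def most_common_black_run(image):
    """Estimate staff-line thickness as the most frequent black run length."""
    counts = Counter()
    for col in range(len(image[0])):
        for value, length in vertical_runs(image, col):
            if value == 1:
                counts[length] += 1
    return counts.most_common(1)[0][0] if counts else 0

# Two staff lines of thickness 2 plus one stray pixel.
image = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
thickness = most_common_black_run(image)
```

The same run lists could be filtered by length to erase staff lines while keeping longer vertical strokes such as stems, although that step is not shown here.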

OMR: Classifier
- Connected-component analysis
- Feature extraction, e.g.:
  - Width, height, aspect ratio
  - Number of holes
  - Central moments
- k-nearest neighbor classifier
- Genetic algorithm
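To make the classifier slide concrete, here is a minimal sketch of feature extraction followed by k-nearest-neighbor voting. It assumes each connected component has already been cropped to its bounding box as a list of 0/1 rows, and it computes only width, height, and aspect ratio; hole counts and central moments are left out. All names and the toy training data are hypothetical:

```python
import math
from collections import Counter

def features(glyph):
    """Width, height and aspect ratio of a glyph cropped to its bounding box."""
    height = len(glyph)
    width = len(glyph[0])
    return [width, height, width / height]

def knn_classify(sample, training, k=3):
    """Label `sample` by majority vote among its k nearest training vectors."""
    nearest = sorted(training, key=lambda item: math.dist(sample, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy training set of (feature vector, label) pairs.
training = [
    ([2, 10, 0.2], "stem"),
    ([2, 12, 2 / 12], "stem"),
    ([6, 5, 1.2], "notehead"),
    ([7, 5, 1.4], "notehead"),
]

tall_thin = [[1, 1] for _ in range(11)]   # 2 pixels wide, 11 pixels tall
label = knn_classify(features(tall_thin), training)
```

Unweighted Euclidean distance lets the features with the largest numeric range dominate, which is one motivation for learning per-feature weights.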

Overall Architecture for OMR
- Recognition: Image File, Staff removal, Segmentation, Recognition (k-NN Classifier), Output (Symbol Name)
- Knowledge Base: Feature Vectors
- Optimization (off-line): Genetic Algorithm, k-NN Classifier, Best Weight Vector
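The architecture pairs a k-NN classifier with an off-line genetic algorithm that searches for the best feature weight vector. The sketch below illustrates that idea under several assumptions: fitness is leave-one-out 1-NN accuracy on the knowledge base, weights lie in [0, 1], and the GA uses elitism, one-point crossover, and Gaussian mutation. None of these details come from the slides, and all names are hypothetical:

```python
import random

def weighted_dist(a, b, w):
    """Squared Euclidean distance with one weight per feature."""
    return sum(wi * (x - y) ** 2 for wi, x, y in zip(w, a, b))

def loo_accuracy(data, w):
    """Leave-one-out 1-NN accuracy of `data` under feature weights `w`."""
    correct = 0
    for i, (vec, label) in enumerate(data):
        others = data[:i] + data[i + 1:]
        _, nearest_label = min(
            others, key=lambda item: weighted_dist(vec, item[0], w))
        correct += nearest_label == label
    return correct / len(data)

def evolve_weights(data, generations=20, pop_size=12, seed=0):
    """Search for a weight vector with a small elitist genetic algorithm."""
    rng = random.Random(seed)
    n = len(data[0][0])
    # Start from uniform weights plus random candidates.
    population = [[1.0] * n] + [
        [rng.random() for _ in range(n)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=lambda w: loo_accuracy(data, w), reverse=True)
        parents = population[: pop_size // 2]      # best half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n) if n > 1 else 0
            child = a[:cut] + b[cut:]              # one-point crossover
            j = rng.randrange(n)                   # mutate one weight
            child[j] = min(1.0, max(0.0, child[j] + rng.gauss(0, 0.2)))
            children.append(child)
        population = parents + children
    return max(population, key=lambda w: loo_accuracy(data, w))

# Feature 0 separates the classes; feature 1 is large-scale noise.
data = [
    ([0.0, 5.0], "a"), ([0.1, -4.0], "a"), ([0.2, 9.0], "a"),
    ([1.0, 5.5], "b"), ([1.1, -4.2], "b"), ([1.2, 8.5], "b"),
]
best = evolve_weights(data)
```

Because the best half of each generation is carried over unchanged, the returned weights never score worse than the uniform weights the search started from.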

Lester S. Levy Collection

Lester S. Levy Collection
- North American sheet music (1780–1960)
- Digitized 29,000 pieces, including “The Star-Spangled Banner” and “Yankee Doodle”
- Database of:
  - text index records
  - images of music (8-bit gray)
  - lyrics (first lines of verse and chorus)
  - color images of cover sheets (32-bit)
- http://levysheetmusic.mse.jhu.edu

Digital Workflow Management
- Reduce the manual intervention for large-scale digitization projects
- Creation of data repository (text, image, sound)
- Optical Music Recognition (OMR)
- Gamera
- XML-based metadata
  - composer, lyricist, arranger, performer, artist, engraver, lithographer, dedicatee, and publisher
  - cross-references for various forms of names, pseudonyms
  - authoritative versions of names and subject terms
- Music and lyric search engines
- Analysis toolkit

The problem
- Suitable OCR for lyrics not found
- Commercial OCR systems are often inadequate for non-standard documents
- The market for specialized recognition of historical documents is very small
- Researchers performing document recognition often “re-invent” the basic image processing wheel

The solution
- Provide easy-to-use tools to allow domain experts (people with specialized knowledge of a collection) to create custom recognition applications
- Generalize OMR for structured documents

Introducing Gamera
- Framework for creation of structured document recognition systems
- Designed for domain experts
- Image processing tools (filters, binarizations, etc.)
- Document segmentation and analysis
- Symbol segmentation and classification
- Feature extraction and selection
- Classifier selection and combiners
- Syntactical and semantic analysis
- Gamera: Generalized Algorithms and Methods for Enhancement and Restoration of Archives

Features of Gamera
- Portability (Unix, Windows, Mac)
- Extensibility (Python and C++ plugins)
- Easy-to-use (experts and programmers)
- Open source
- Graphical User Interface
- Interactive / Batchable (scripts)

Architecture of Gamera
- Graphical User Interface (wxWindows)
- Scripting Environment (Python)
- Plugins (Python)
- Automatic Plugin Wrapper (Boost)
- Plugins (C++)
- GAMERA Core (C++)

Example of C++ Plugin

    // Number of pixels in matrix
    #include "gamera.hh"

    #ifdef __area_wrap__
    #define NARGS 1
    #define ARG1_ONEBIT
    #endif

    using namespace Gamera;

    template <class T>
    feature_t area(T &m) {
      return feature_t(m.nrows() * m.ncols());
    }

Example of Python Plugin

    # This filters a list of CC objects: components wider than
    # max_width are blanked out, the rest are returned
    import gamera

    def filter_wide(ccs, max_width):
        tmp = []
        for x in ccs:
            if x.ncols() > max_width:
                x.fill_matrix(0)
            else:
                tmp.append(x)
        return tmp

Gamera: Interface (screenshot in Linux)


Histogram (screenshot in Linux)

Thresholding (screenshot in Linux)


Staff removal: Lute tablature

Classifier: Lute (screenshot in Linux)

Staff removal: Neumes

Classifier: Neumes (screenshot in Linux)

Greek example

GUIDO Music Notation Format
- H. Hoos, K. Renz, J. Kilian
- “A formal language for score-level representation”
- Plain text: readable, platform independent
- Extensible and flexible
- Adequate representation
- NoteServer: Web/Windows
- GUIDO/XML
- NoteAbility (K. Hamel)

GUIDO: An example

    { [ \beamsOff
        | \clef<"treble"> \key<"D"> f#*1/8. g*1/16
        | a*1/4. d2*1/8 d*1/4. c#*1/8
        | e1*1/2 _*1/4 f#*1/8. g*1/16
        | c#2*1/4. b1*1/8 a*1/4. g*1/8
        | | e#*1/2 f#*1/4 f#*1/8. g*1/16
        | e1*1/2 _*1/4 f#*1/8 g
        | c#2*1/4. b1*1/8 a*1/4. c#*1/8 ], …
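The note tokens in the example can be read mechanically, which is part of what makes a plain-text format convenient. The sketch below handles only a small, assumed subset of GUIDO: a note name or `_` for a rest, an optional `#` or `&` accidental, an optional octave number (matched but ignored), an optional `*n/d` duration, and trailing dots. It assumes that a note without an explicit duration inherits the previous one and that the starting duration is a quarter note. The regular expression and function name are hypothetical and are not a reference parser:

```python
import re
from fractions import Fraction

# name, accidental, octave, numerator, denominator, dots
NOTE = re.compile(r"^([a-h_])(#|&)?(-?\d+)?(?:\*(\d+)/(\d+))?(\.*)$")

def parse_notes(tokens):
    """Return (pitch name, duration) pairs for a list of note tokens."""
    events = []
    base = Fraction(1, 4)        # assumed starting duration
    duration = base
    for token in tokens:
        match = NOTE.match(token)
        if match is None:
            raise ValueError(f"unsupported token: {token!r}")
        name, accidental, _octave, num, den, dots = match.groups()
        if num:
            base = Fraction(int(num), int(den))
        if num or dots:
            # Each dot adds half of the previous increment: 1, 3/2, 7/4, ...
            duration = base * (2 - Fraction(1, 2 ** len(dots)))
        events.append((name + (accidental or ""), duration))
    return events

pickup = parse_notes(["f#*1/8.", "g*1/16"])
bar = parse_notes(["a*1/4.", "d2*1/8", "d*1/4.", "c#*1/8"])
bar_length = sum(duration for _, duration in bar)
```

Tags such as `\clef<"treble">` and bar lines are outside this subset and would need to be filtered out before calling `parse_notes`.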

NoteAbility Demo

Conclusions
- Gamera allows rapid development of domain-specific document recognition applications
- Domain experts can customize and control all aspects of the recognition process
- Includes an easy-to-use interactive environment for experimentation
- Beta version available on Linux
- OS X version in preparation

Projections
- X-projections
- Y-projections
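A projection is a count of black pixels along one axis of the image. As a minimal sketch, assuming a binary image stored as a list of rows (1 = black) and assuming that "Y-projection" means one count per row and "X-projection" one count per column, rows that are almost entirely black can be flagged as staff-line candidates. The function names and the 0.8 threshold are hypothetical:

```python
def y_projection(image):
    """Black-pixel count for each row (projection onto the y axis)."""
    return [sum(row) for row in image]

def x_projection(image):
    """Black-pixel count for each column (projection onto the x axis)."""
    return [sum(col) for col in zip(*image)]

def staff_rows(image, fraction=0.8):
    """Rows whose black-pixel count covers at least `fraction` of the width."""
    width = len(image[0])
    return [r for r, count in enumerate(y_projection(image))
            if count >= fraction * width]

image = [
    [0, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
rows = staff_rows(image)
```

Peaks in the row counts mark horizontal lines, while peaks in the column counts mark vertical strokes such as stems and bar lines.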