Digitization of the Lester S. Levy Collection of Sheet Music Ichiro Fujinaga McGill University with Michael Droettboom, Karl MacMillan, G. Sayeed Choudhury,

1 Digitization of the Lester S. Levy Collection of Sheet Music
Ichiro Fujinaga, McGill University
with Michael Droettboom, Karl MacMillan, G. Sayeed Choudhury, Tim DiLauro, Mark Patton, Teal Anderson
Levy Project II, Digital Knowledge Center, Sheridan Libraries, Johns Hopkins University

2 Contents
• Levy Project
• Levy Sheet Music Collection
• Digital Workflow Management
• Optical Music Recognition
• Gamera
• Guido / NoteAbility
Status labels: Current goals; Digitization completed; Under development

3 Lester S. Levy Collection

4 Lester S. Levy Collection
levysheetmusic.mse.jhu.edu
• North American sheet music (1780–1960)
• Digitized 29,000 pieces (130,000 sheets)
• Began in 1994
• Includes “The Star-Spangled Banner” and “Yankee Doodle”

5

6 Lester S. Levy Collection
levysheetmusic.mse.jhu.edu
• North American sheet music (1780–1960)
• Digitized 29,000 pieces (130,000 sheets)
• Began in 1994
• Includes “The Star-Spangled Banner” and “Yankee Doodle”
• Database of:
  • metadata
  • images of music (8-bit gray)
  • lyrics (first lines of verse and chorus)
  • color images of cover sheets (32-bit)

7

8 Digital Workflow Management
• Reduce the manual intervention for large-scale digitization projects
• Creation of data repository (text, image, sound)
• Optical Music Recognition (OMR)
• Gamera
• XML-based metadata
  • composer, lyricist, arranger, performer, artist, engraver, lithographer, dedicatee, and publisher
  • cross-references for various forms of names, pseudonyms
  • authoritative versions of names and subject terms
• Music and lyric search engines
• Analysis toolkit
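The XML-based metadata bullet can be illustrated with a minimal sketch. The slide does not give the project's schema, so the element names (`piece`, `title`, `name`, `seeAlso`), the `role` and `variant` attributes, and the helper functions below are all hypothetical; only the role vocabulary and the idea of cross-referencing name variants to an authoritative form come from the slide.

```python
import xml.etree.ElementTree as ET

def build_record(title, roles, name_variants):
    """Build one metadata record. All element/attribute names are hypothetical."""
    record = ET.Element("piece")
    ET.SubElement(record, "title").text = title
    # One <name> element per (role, authoritative name) pair.
    for role, name in roles:
        ET.SubElement(record, "name", role=role).text = name
    # Cross-references map a variant spelling or pseudonym to the authoritative form.
    for variant, authoritative in name_variants.items():
        ET.SubElement(record, "seeAlso", variant=variant).text = authoritative
    return record

def names_with_role(record, role):
    """Return the authoritative names recorded under a given role."""
    return [n.text for n in record.findall("name") if n.get("role") == role]

record = build_record(
    "The Star-Spangled Banner",
    [("lyricist", "Key, Francis Scott"), ("composer", "Smith, John Stafford")],
    {"F. S. Key": "Key, Francis Scott"},
)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
print(names_with_role(record, "lyricist"))
```

Keeping the role as an attribute rather than a separate element per role is one possible design choice; it lets a search engine treat all nine roles uniformly.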

9 Optical Music Recognition (OMR)
• Trainable open-source OMR system in development since 1984
• Staff recognition and removal
• Lyric removal
• Stems and notehead removal
• Music symbol classifier
• Score reconstruction
• Lyric classifier?
• Optical Character Recognition (OCR)
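To make the "staff recognition and removal" step concrete, here is a toy sketch on a tiny binary image. It is not the project's algorithm, which the slide does not describe. It assumes perfectly horizontal, one-pixel-thick staff lines, detects them with a row projection, and erases line pixels unless a symbol touches them from above or below. The function names and the `min_fill` parameter are illustrative.

```python
def to_bitmap(rows):
    """Convert strings of '#' (ink) and '.' (paper) into lists of 1/0."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]

def to_rows(bitmap):
    """Inverse of to_bitmap, for easy printing."""
    return ["".join("#" if v else "." for v in row) for row in bitmap]

def find_staff_rows(img, min_fill=0.8):
    """Rows whose share of black pixels is at least min_fill count as staff lines."""
    width = len(img[0])
    return [y for y, row in enumerate(img) if sum(row) >= min_fill * width]

def remove_staff(img, staff_rows):
    """Erase staff-line pixels, keeping those where a symbol crosses the line."""
    out = [row[:] for row in img]
    rows = set(staff_rows)
    height = len(img)
    for y in staff_rows:
        for x, value in enumerate(img[y]):
            if not value:
                continue
            above = y > 0 and (y - 1) not in rows and img[y - 1][x]
            below = y < height - 1 and (y + 1) not in rows and img[y + 1][x]
            if not (above or below):
                out[y][x] = 0
    return out

page = to_bitmap([
    "..........",
    "..##......",
    "##########",
    "..##......",
    "..........",
    "##########",
    "..........",
])
staff = find_staff_rows(page)
cleaned = to_rows(remove_staff(page, staff))
print(staff)
print("\n".join(cleaned))
```

Real scans have skewed, curved, and thick staff lines, so a production system needs considerably more than a row projection.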

10 The problem
• Suitable OCR for lyrics not found
• Commercial OCR systems are often inadequate for non-standard documents
• The market for specialized recognition of historical documents is very small
• Researchers performing document recognition often “re-invent” the basic image processing wheel

11 The solution
• Provide easy-to-use tools to allow domain experts (people with specialized knowledge of a collection) to create custom recognition applications
• Generalize OMR for structured documents

12 Introducing Gamera
Generalized Algorithms and Methods for Enhancement and Restoration of Archives
• Framework for creation of structured document recognition systems
• Designed for domain experts
• Image processing tools (filters, binarizations, …)
• Document segmentation and analysis
• Symbol segmentation and classification
• Syntactical and semantic analysis

13 Features of Gamera
• Portability (Unix, Windows, Mac)
• Extensibility (Python and C++ plugins)
• Easy to use (experts and programmers)
• Open source
• Graphical user interface
• Interactive / batchable (scripts)

14 Gamera: Interface (screenshot in Linux)

15

16 Histogram (screenshot in Linux)

17 Thresholding (screenshot in Linux)
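The histogram and thresholding slides show screenshots only, so the method used there is not recoverable. As an illustration of how a gray-level histogram can drive binarization of the 8-bit gray scans, the sketch below uses Otsu's method, a standard technique chosen here as an assumption, not as a statement of what Gamera did on these slides. Function names are illustrative.

```python
def histogram(pixels, levels=256):
    """Count how many pixels take each gray level (0 = black, 255 = white)."""
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    return counts

def otsu_threshold(counts):
    """Return the level t that maximizes between-class variance.

    Pixels with value <= t form the dark class, the rest the light class.
    """
    total = sum(counts)
    sum_all = sum(level * c for level, c in enumerate(counts))
    dark_count = 0
    dark_sum = 0
    best_t, best_var = 0, -1.0
    for t, c in enumerate(counts):
        dark_count += c
        dark_sum += t * c
        light_count = total - dark_count
        if dark_count == 0:
            continue
        if light_count == 0:
            break
        dark_mean = dark_sum / dark_count
        light_mean = (sum_all - dark_sum) / light_count
        between_var = dark_count * light_count * (dark_mean - light_mean) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t

def binarize(pixels, threshold):
    """Map dark pixels (ink) to 1 and light pixels (paper) to 0."""
    return [1 if p <= threshold else 0 for p in pixels]

scan = [10, 12, 11, 200, 210, 205, 11, 208]  # toy gray values, not real data
t = otsu_threshold(histogram(scan))
print(t, binarize(scan, t))
```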

18

19 Staff removal: Lute tablature

20

21 Classifier: Lute (screenshot in Linux)

22 Staff removal: Neumes

23 Classifier: Neumes (screenshot in Linux)

24 Greek example

25 GUIDO Music Notation Format
H. Hoos, K. Renz, J. Kilian
• “A formal language for score-level representation”
• Plain text: readable, platform independent
• Extensible and flexible
• Adequate representation
• NoteServer: Web/Windows
• GUIDO/XML
• NoteAbility (K. Hamel)

26

27 Conclusions
• Levy Collection
  • Searchable metadata
  • Online images (public domain) of music and cover
• Digital Workflow Management
• Optical Music Recognition
• Gamera for domain experts
  • Includes an easy-to-use interactive environment for experimentation
  • Beta version available on Linux
  • OS X and Windows versions in preparation

28 Acknowledgements
• National Science Foundation
• National Endowment for the Humanities
• Institute of Museum and Library Services
• The Levy Family

29 OMR: Classifier
• Connected-component analysis
• Feature extraction, e.g.:
  • Width, height, aspect ratio
  • Number of holes
  • Central moments
• k-nearest neighbor classifier
• Genetic algorithm
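The classifier steps listed on this slide can be sketched end to end on a toy bitmap: label connected components, extract a few of the named features (width, height, aspect ratio), and classify with k-nearest neighbors. This is a minimal illustration under assumptions: 8-connectivity, unweighted Euclidean distance, and made-up training vectors with the labels "stem" and "notehead". None of these details come from the slides.

```python
from collections import Counter, deque

def connected_components(img):
    """Return the 8-connected black regions as lists of (y, x) pixels."""
    height, width = len(img), len(img[0])
    seen = [[False] * width for _ in range(height)]
    components = []
    for y in range(height):
        for x in range(width):
            if not img[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components

def features(pixels):
    """Bounding-box width, height, and aspect ratio of one component."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    return (width, height, width / height)

def knn_classify(sample, training, k=3):
    """Majority label among the k training vectors closest to sample."""
    def sq_dist(item):
        return sum((a - b) ** 2 for a, b in zip(sample, item[0]))
    nearest = sorted(training, key=sq_dist)[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

rows = ["#.....", "#..##.", "#..##.", "#....."]
img = [[1 if ch == "#" else 0 for ch in row] for row in rows]

# Toy training set; the feature values are invented for illustration.
training = [
    ((1, 5, 0.2), "stem"), ((1, 4, 0.25), "stem"), ((1, 6, 1 / 6), "stem"),
    ((2, 2, 1.0), "notehead"), ((3, 2, 1.5), "notehead"), ((3, 3, 1.0), "notehead"),
]
vectors = [features(c) for c in connected_components(img)]
labels = [knn_classify(v, training) for v in vectors]
print(vectors, labels)
```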

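30 Overall Architecture for OMR
[Diagram; labels only, connections not recoverable from the transcript]
• Recognition path: Image File, Staff removal, Segmentation, Recognition (K-NN Classifier), Output Symbol Name
• Off-line path: Knowledge Base (Feature Vectors), Optimization (Genetic Algorithm, K-NN Classifier), Best Weight Vector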
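The off-line part of this architecture pairs a genetic algorithm with the k-NN classifier to produce a "best weight vector". One plausible reading, sketched below as an assumption, is that the GA searches for per-feature weights in the distance function, scored by leave-one-out accuracy on the knowledge base. The operators (elitism, uniform crossover, single-gene Gaussian mutation), the parameter values, and the toy data are illustrative choices, not the project's.

```python
import random

def weighted_nn_label(sample, training, weights):
    """Label of the single nearest neighbor under a weighted squared distance."""
    def dist(item):
        return sum(w * (a - b) ** 2 for w, a, b in zip(weights, sample, item[0]))
    return min(training, key=dist)[1]

def loo_accuracy(data, weights):
    """Leave-one-out accuracy: classify each item against all the others."""
    hits = 0
    for i, (vec, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if weighted_nn_label(vec, rest, weights) == label:
            hits += 1
    return hits / len(data)

def evolve_weights(data, n_features, generations=20, pop_size=12, seed=0):
    """Tiny GA over weight vectors in [0, 1]; the best half always survives."""
    rng = random.Random(seed)
    population = [[1.0] * n_features]  # start from the unweighted baseline
    population += [[rng.random() for _ in range(n_features)]
                   for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=lambda w: loo_accuracy(data, w), reverse=True)
        parents = population[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            mom, dad = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(mom, dad)]  # uniform crossover
            i = rng.randrange(n_features)                          # mutate one gene
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.2)))
            children.append(child)
        population = parents + children
    return max(population, key=lambda w: loo_accuracy(data, w))

# Toy knowledge base: feature 0 separates the classes, feature 1 is noise.
data = [
    ((0.10, 9.0), "a"), ((0.20, 1.0), "a"), ((0.15, 5.0), "a"),
    ((0.90, 8.5), "b"), ((0.80, 1.5), "b"), ((0.85, 5.5), "b"),
]
best = evolve_weights(data, n_features=2)
print(best, loo_accuracy(data, best), loo_accuracy(data, [1.0, 1.0]))
```

Because the unweighted vector is in the initial population and the best half is never discarded, the returned weights can never score below the unweighted baseline on the training data.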

31 Architecture of Gamera
[Diagram; component labels]
• Graphic User Interface (wxWindows)
• Scripting Environment (Python)
• GAMERA Core (C++)
• Plugins (Python)
• Plugins (C++)
• Automatic Plugin Wrapper (Boost)

32 GUIDO: An example { [ \beamsOff | \clef \key f#*1/8. g*1/16 | a*1/4. d2*1/8 d*1/4. c#*1/8 | e1*1/2 _*1/4 f#*1/8. g*1/16 | c#2*1/4. b1*1/8 a*1/4. g*1/8 | | e#*1/2 f#*1/4 f#*1/8. g*1/16 | a*1/4. d2*1/8 d*1/4. c#*1/8 | e1*1/2 _*1/4 f#*1/8 g | c#2*1/4. b1*1/8 a*1/4. c#*1/8 ], …
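As a small check on how the note tokens in this example read, the sketch below sums the durations in one bar. It handles only the token shape visible on the slide (note name or `_` for a rest, optional accidental, optional octave, optional `*n/d` duration with dots) and assumes that a note without an explicit duration reuses the previous one, starting from a quarter note. Tags such as `\clef` and `\key` are not parsed. The function name `bar_duration` is illustrative.

```python
import re
from fractions import Fraction

# name or rest, accidental, octave, then an optional "*n/d" duration with dots
NOTE = re.compile(r"^(_|[a-h])(#{1,2}|&{1,2})?(-?\d+)?(?:\*(\d+)/(\d+)(\.{0,2}))?$")

def bar_duration(bar, previous=Fraction(1, 4)):
    """Return (total duration of the bar, last duration seen).

    `previous` is the duration inherited by notes that do not state one.
    """
    total = Fraction(0)
    for token in bar.split():
        match = NOTE.match(token)
        if match is None:
            raise ValueError(f"unsupported token: {token!r}")
        if match.group(4):
            base = Fraction(int(match.group(4)), int(match.group(5)))
            dots = len(match.group(6))
            # 0 dots -> x1, 1 dot -> x3/2, 2 dots -> x7/4
            previous = base * (2 - Fraction(1, 2 ** dots))
        total += previous
    return total, previous

pickup, last = bar_duration("f#*1/8. g*1/16")
bar1, last = bar_duration("a*1/4. d2*1/8 d*1/4. c#*1/8", last)
bar7, _ = bar_duration("e1*1/2 _*1/4 f#*1/8 g")
print(pickup, bar1, bar7)
```

Under these assumptions the opening two notes form a quarter-note pickup and the full bars each add up to a whole note.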

