Exploring a million hours of sounds
Richard Ranft, The British Library
27 November 2014
Search Solutions 2014
Outline
– the British Library’s audio collections
– discovery and access
– finding one in a million
The British Library’s audio collections
– originated in 1955
– national collection of UK record industry
– selected publications from overseas
– radio broadcasts
– unpublished recordings
Subjects
– music
– spoken word
– environments & nature
Extent
– 6 million tracks
– from 1857 to this morning
– many formats
– 115 years of listening
Obstacles to exploring and access
– copyrights
– analogue or offline digital
– many non-digital tracks
– time-based = time consuming
– limited, text-based search
– no serendipity
– high expectations (cf. iTunes, Spotify)
Online consumer audio services
‘opacity’ of audio (no freeze-frames!)
Human-led enrichment
– description
– transcription
– annotation
– category tagging
– rating, recommendation & review
Machine enrichment/search
– Categorisation: music genre, language/dialect detection, mood
– Synchronisation: score following, transcript following
– Identification: speaker/vocalist ID, melody recognition, query by humming/tapping
– Non-text browsing: map browse, timeline browse
– Recommendation & matching: melody matching
– Cross-media linking: speaker/tune matching
– Feature extraction: pitch, tempo, chord, time signature, rhythm
– Segmentation/event detection: music/speech segments, speaker/lead instrument change, laughter, applause, emotion detection
– Transcription: speech-to-text, score generation
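To make "feature extraction" a little more concrete, here is a minimal sketch of one such feature, pitch, estimated by picking the peak of the autocorrelation of a short signal. It is an illustration only and not a description of any system used at the British Library: the function names, the search range (`fmin`, `fmax`) and the synthetic test tone are all assumptions made for this example, and real recordings would need windowing, normalisation and noise handling.

```python
import math

def synth_sine(freq_hz, sample_rate, n_samples):
    """Generate a pure sine tone as a list of floats (synthetic test signal only)."""
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]

def estimate_pitch(samples, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate the fundamental frequency from the autocorrelation peak.

    Only lags that correspond to frequencies between fmin and fmax are searched
    (both limits are assumptions chosen for this sketch).
    """
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    best_lag, best_score = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalised autocorrelation at this lag.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

# Usage: a 200 Hz tone sampled at 8 kHz should yield an estimate close to 200 Hz.
tone = synth_sine(200.0, 8000, 2048)
estimated_hz = estimate_pitch(tone, 8000)
```

The estimate is quantised to whole-sample lags, so accuracy drops for higher pitches unless the peak is interpolated.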
Discovery and access
– Sound & Moving Image Catalogue: sami.bl.uk
– onsite listening:
  – Appointments service
  – SoundServer (200,000 tracks, 3% of total)
– off site listening:
  – BL Sounds website (50,000 tracks, 1%)
  – streaming
  – downloading
Sound & Moving Image Catalogue: sami.bl.uk
BL Sounds
Improving access and discovery
Visualisation and analysis
Current BL projects
– ‘Metable’ software: acquire / describe UK’s digital music, searching via APIs across open music databases (MusicBrainz, Decibel, Discogs)
– COMMA: cloud-based media analysis project with BBC
– Digital Music Lab: analysing and visualising big music data collections
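As a rough idea of what "searching via APIs across open music databases" can look like, the sketch below builds a search URL for the public MusicBrainz web service. It is not taken from Metable, whose internals are not described here; the function name, the example title and the default `limit` are assumptions for illustration. The sketch only constructs the URL and makes no network request.

```python
from urllib.parse import urlencode

# Base address of the MusicBrainz web service (version 2).
MUSICBRAINZ_WS = "https://musicbrainz.org/ws/2"

def recording_search_url(title, artist=None, limit=5):
    """Build a MusicBrainz recording-search URL (hypothetical helper).

    The query uses the Lucene-style field syntax of the MusicBrainz search
    API. Quotes inside title or artist are not escaped in this sketch.
    """
    terms = ['recording:"{}"'.format(title)]
    if artist:
        terms.append('artist:"{}"'.format(artist))
    params = {"query": " AND ".join(terms), "fmt": "json", "limit": limit}
    return "{}/recording?{}".format(MUSICBRAINZ_WS, urlencode(params))

# Usage: "Example Title" is a placeholder, not a real catalogue entry.
url = recording_search_url("Example Title")
```

A client that actually fetches this URL should send a descriptive User-Agent header and respect the service's rate limits.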
Digital Music Lab example
Chord detection using Chordino VAMP Plugin (Queen Mary University of London)
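For readers unfamiliar with chord detection, the following is a deliberately simple sketch of the general idea: compare a 12-bin chroma vector (energy per pitch class) against templates for major and minor triads and pick the best match. This is not Chordino's algorithm, which works on audio and is considerably more sophisticated. The function names and the hand-made chroma vectors are assumptions for illustration.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def chord_templates():
    """Build binary 12-bin templates for the 24 major and minor triads."""
    templates = {}
    for root in range(12):
        for suffix, intervals in (("", (0, 4, 7)), ("m", (0, 3, 7))):
            template = [0.0] * 12
            for interval in intervals:
                template[(root + interval) % 12] = 1.0
            templates[NOTE_NAMES[root] + suffix] = template
    return templates

def detect_chord(chroma):
    """Return the triad label with the highest cosine similarity to chroma.

    chroma is a list of 12 non-negative values, index 0 = C. Returns None
    for an all-zero (silent) vector.
    """
    norm = math.sqrt(sum(x * x for x in chroma))
    if norm == 0.0:
        return None
    best_label, best_score = None, float("-inf")
    for label, template in chord_templates().items():
        dot = sum(c * t for c, t in zip(chroma, template))
        score = dot / (norm * math.sqrt(3.0))  # each template has norm sqrt(3)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Usage with idealised chroma vectors: energy only on C, E, G and on A, C, E.
c_major = detect_chord([1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0])
a_minor = detect_chord([1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0])
```

In practice the chroma vectors would be computed from audio frame by frame, and the per-frame labels smoothed over time.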
English conversation: At the Tobacconist's (1929)
Linguaphone 78rpm shellac disc
spoken-word-recordings/024M-1CS XX-0200V0
Thanks for listening!