Published by Lindsay Wilcox. Modified over 9 years ago.
1
8th Annual CSIS Research Conference
Client Server Browsing of Sound Resources: Classification and Browsing
E. Brazil
Interaction Design Centre, University of Limerick, Ireland
2
Introduction
Question: how to classify sound resources and how to provide an interface for browsing them.
Goal: provide a browsable sound database for users via intranet / Internet environments.
3
Overview of Research Areas
Sound Classification
Sound Representation
Sound Browsing
4
Sound Classification
Two levels of classification:
Coarse level
  – Distinguish between the Speech, Music, Environmental, Silence, and Other categories
Fine level
  – Use human perceptual features
5
Coarse-level classification of audio (1)
– Audio signals are classified into basic types, including speech, music, several types of environmental sounds, and silence
– Classification uses morphological and statistical analyses of short-time feature curves (energy function, average zero-crossing rate, fundamental frequency), together with a rule-based heuristic classification procedure
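A rule-based procedure over short-time feature curves can be sketched roughly as follows. This is only an illustration of the idea, not the rule set used in the talk: the function name `classify_segment`, both thresholds, and the reduction to three labels are assumptions made for the example. It relies on the commonly reported observation that the zero-crossing rate of speech varies more over time than that of music.

```python
from statistics import mean, pvariance

# Hypothetical thresholds chosen for illustration only; real values
# would have to be tuned on labelled audio.
SILENCE_ENERGY = 1e-4
SPEECH_ZCR_VARIANCE = 0.01


def classify_segment(energy_curve, zcr_curve):
    """Label a segment from its per-frame energy and ZCR curves.

    energy_curve: short-time energy per frame (mean squared amplitude).
    zcr_curve: zero-crossing rate per frame, as a fraction in [0, 1].
    """
    # Rule 1: very low average energy is treated as silence.
    if mean(energy_curve) < SILENCE_ENERGY:
        return "silence"
    # Rule 2: a strongly fluctuating ZCR curve is treated as speech.
    if pvariance(zcr_curve) > SPEECH_ZCR_VARIANCE:
        return "speech"
    # Fallback: a steady ZCR curve is treated as music.
    return "music"
```

A fuller version would add rules for environmental sounds and an "other" category, and would also use the fundamental-frequency curve named on the slide.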
6
Coarse-level classification of audio (2)
Short-time energy function
  – The short-time energy of an audio signal reflects its amplitude variations over time
Short-time average zero-crossing rate
  – The ZCR is the number of times the signal passes through zero in a given time interval
Spectral centroid
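The three features named above can be computed per frame along the following lines. This is a minimal sketch in pure Python, assuming a frame is a list of float samples; the function names, the mean-squared definition of energy, and the naive DFT are choices made for the example rather than details taken from the talk.

```python
import cmath
import math


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_count(frame):
    """Number of sign changes between consecutive samples in the frame."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency in Hz, using a naive O(n^2) DFT."""
    n = len(frame)
    weighted = 0.0
    total = 0.0
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)
        )
        magnitude = abs(coeff)
        weighted += (k * sample_rate / n) * magnitude
        total += magnitude
    return weighted / total if total > 0 else 0.0


def split_frames(signal, size, hop):
    """Yield successive frames of `size` samples, advancing by `hop`."""
    for start in range(0, len(signal) - size + 1, hop):
        yield signal[start:start + size]
```

Applying each function to every frame from `split_frames` gives the short-time feature curves. For real recordings an FFT library would replace the naive DFT, which is only practical for short frames.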
7
Fine-level classification of audio
Further classification will be conducted within each basic type:
– Music: classify music played by different instruments, different types of music, singing, plain song
– Speech: differentiate the voices of men, women, and children, and speech with a music background
– Environmental sound: divide into classes such as applause, bell ringing, footsteps, windstorm, laughter, birdsong, and so on
8
Sound Representation
Previous work has concentrated on
  – Visual star-field type display
Novel visual representations
  – Visualisations on spheres (non-Euclidean spaces)
  – Hyper tree
  – Excentric labeling
9
Star-field Display
Virtual University, Uni. Vienna
10
Visualisations on Spheres
H3: Laying Out Large Directed Graphs in 3D Hyperbolic Space (Munzner)
11
Hyper Tree
www.inxight.com
12
Excentric Labeling
HCIL, Uni. Maryland
13
Sound Browsing
Iterative & interactive activity:
  – Opportunistic & serendipitous
Enable users to explore a data set
External & internal properties of objects:
  – Context & content
Evaluate and revise understanding of relationships
14
The Sonic Browser Application
Audio: direct representation of tunes (exploiting the cocktail party effect)
Sounds are panned out in a stereo field controlled by the visual location of the tunes nearest to the cursor.
The volume of the tunes playing concurrently is proportional to the visual distance between the objects and the cursor.
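The mapping from on-screen position to stereo output might look something like the sketch below. It is not the Sonic Browser's actual code: the function name `pan_and_gain`, the `max_distance` cut-off, the linear volume fall-off, and the constant-power pan law are all assumptions. It also assumes that objects nearer the cursor are louder, which the slide implies but does not spell out.

```python
import math


def pan_and_gain(cursor, obj, max_distance):
    """Return (left_gain, right_gain) for one sound object.

    cursor, obj: (x, y) screen positions.
    max_distance: hypothetical radius beyond which an object is silent.
    """
    dx = obj[0] - cursor[0]
    dy = obj[1] - cursor[1]
    distance = math.hypot(dx, dy)
    if distance >= max_distance:
        return 0.0, 0.0
    # Assumption: volume falls off linearly, nearer objects are louder.
    volume = 1.0 - distance / max_distance
    # Horizontal offset sets the pan: -1 is hard left, +1 is hard right.
    pan = max(-1.0, min(1.0, dx / max_distance))
    # Constant-power pan law keeps perceived loudness steady across the field.
    angle = (pan + 1.0) * math.pi / 4.0
    return volume * math.cos(angle), volume * math.sin(angle)
```

Summing the gain-scaled signals of all objects inside the radius would give the concurrent mix in which each tune stays locatable by ear.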
15
The Sonic Browser Application
16
Client – Server Issues
Let the server do the mixing and spatialisation
Analysis and classification on the server
Lightweight client in Java
Different network topologies and protocols
  – Latency issues
  – Use of a floating ‘Aura’
17
Cue Points
Use cue points as marker points
  – Mark a specific point or section of a sound
Play only the significant portion of a sound while browsing
Reduce the time needed to identify a sound by playing its characteristic or significant part
Found in many common sound file formats*
*Technical Report UL-IDC-01-02
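As one concrete case, RIFF/WAVE files can store cue points in a `cue ` chunk. The sketch below reads them from raw file bytes; the function name `read_cue_points` is an assumption, and the code treats the sample-offset field of each record as the cue position, which is the common convention for plain PCM files with a single data chunk but may not hold for every file.

```python
import struct


def read_cue_points(data):
    """Return the sample offsets of cue points found in RIFF/WAVE bytes."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        body = pos + 8
        if chunk_id == b"cue ":
            (count,) = struct.unpack_from("<I", data, body)
            offsets = []
            for i in range(count):
                # Each 24-byte record holds: cue id, play-order position,
                # target chunk id, chunk start, block start, sample offset.
                record = struct.unpack_from("<II4sIII", data, body + 4 + 24 * i)
                offsets.append(record[5])
            return offsets
        # Chunks are padded to an even number of bytes.
        pos = body + size + (size & 1)
    return []
```

A browser could then start playback at the first returned offset instead of at the beginning of the file, so that only the marked portion is heard.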
18
Application Platform: HW & OS
Normal multimedia PC
  – Pentium II/III with SB Live, etc.
Server
  – MS Windows 98/2000
Client
  – Any OS with a Java Runtime
19
Conclusion
Facilitate different visualisation tools, e.g. for non-Euclidean space.
Address payment and copyright issues.
Investigate other file types, e.g. MPEG-7.
20
References (1)
Brazil, E. (2001). Cue Points: An Examination of Common Sound File Formats. Limerick, University of Limerick.
Fekete, J. D., Plaisant, C. (1999). Excentric Labeling: Dynamic Neighborhood Labeling for Data Visualization. Conference on Human Factors in Computer Systems, New York, ACM.
Fernström, M., Brazil, E. (2001). Sonic Browsing: An Auditory Tool for Multimedia Asset Management. International Conference on Auditory Display, Espoo, Finland.
Ó Maidín, D., Fernström, M. (2000). The Best of Two Worlds: Retrieving and Browsing. COST-G6 Conference on Digital Audio Effects DAFx-00, Verona, Universita degli Studi Verona.
21
References (2)
Shneiderman, B. (1996). The Eyes Have It: A Task by Data Type Taxonomy for Information Visualizations. IEEE, Visual Languages, Boulder, CO, USA.
Zhang, T., Kuo, C.C. (1998). Content-Based Classification and Retrieval of Audio. SPIE's 43rd Annual Meeting - Conference on Advanced Signal Processing Algorithms, Architectures, and Implementations VIII, San Diego.
Zhang, T., Kuo, C.C. (1998). Hierarchical System for Content-Based Audio Classification and Retrieval. SPIE's Conference on Multimedia Storage and Archiving Systems III, Boston.