ONZEminer
Margaret Maclagan, ONZE director
Robert Fromont, designer

What is ONZEminer?
A freeware tool for searching and analysing data bases containing speech recordings
Speech recordings can be organised into different sub-collections, or corpora, within one data base
The corpora can contain a lot of material
  –ONZE in NZ contains over 1000 hours

What is the sample format?
Sound samples are digitised in .wav format
  –Divided into approx. 5-minute segments
Samples are transcribed using Transcriber
  –Transcriber is freeware transcription software
  –Enables the sound file and the transcription to be time-aligned
  –When you click into any section of the transcript, it plays the sound file
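
The segmentation step above can be sketched in Python using only the standard-library `wave` module. This is an illustrative sketch, not the procedure ONZE itself uses: the function name `split_wav`, the output naming scheme, and the 300-second default are all assumptions made for the example.

```python
import wave


def split_wav(path, out_prefix, segment_seconds=300):
    """Split a WAV file into consecutive segments of about segment_seconds.

    Hypothetical helper: writes out_prefix_000.wav, out_prefix_001.wav, ...
    and returns the list of files written. The last segment may be shorter.
    """
    out_paths = []
    with wave.open(path, "rb") as src:
        frames_per_segment = src.getframerate() * segment_seconds
        index = 0
        while True:
            frames = src.readframes(frames_per_segment)
            if not frames:
                break  # end of the recording
            out_path = f"{out_prefix}_{index:03d}.wav"
            with wave.open(out_path, "wb") as dst:
                # Copy the audio format of the source recording.
                dst.setnchannels(src.getnchannels())
                dst.setsampwidth(src.getsampwidth())
                dst.setframerate(src.getframerate())
                dst.writeframes(frames)
            out_paths.append(out_path)
            index += 1
    return out_paths


# Example call (file names are placeholders):
# split_wav("interview.wav", "interview_part")
```

Cutting at fixed intervals can split a word in half; in practice you would probably adjust each cut to a nearby pause.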

How do you transcribe?
Transcriptions can be done directly in Transcriber
Or an existing transcription can be copied into Transcriber and time-aligned (a handout is available at
To make a new transcription, you open the sound file and type, breaking at a natural point
Make a break about every line of text on the screen

How do you create a data base?
Transcripts are stored on a server
Speech .wav files may be stored on the server or on CDs/DVDs
In ONZEminer, you set up as many separate corpora as you want within your data base
Samples are uploaded to the server from within ONZEminer

How do you organise a data base?
You can create as many separate corpora as you need within your data base
You search one corpus at a time
Speakers can belong to more than one corpus, since a corpus is a collection whose parameters you decide
Speakers can be classified by age, gender, birth date, etc.
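
One way to picture this organisation is as speakers carrying a set of corpus names plus a few attributes. The sketch below is only a mental model in Python; the class `Speaker`, its fields, and the function `select_speakers` are assumptions for illustration and say nothing about how ONZEminer stores its data.

```python
from dataclasses import dataclass, field


@dataclass
class Speaker:
    # Hypothetical record: field names are illustrative only.
    name: str
    gender: str
    birth_year: int
    corpora: set = field(default_factory=set)  # a speaker may be in several


def select_speakers(speakers, corpus, gender=None, born_before=None):
    """Return the speakers in one corpus, optionally narrowed by attributes."""
    selected = []
    for s in speakers:
        if corpus not in s.corpora:
            continue
        if gender is not None and s.gender != gender:
            continue
        if born_before is not None and s.birth_year >= born_before:
            continue
        selected.append(s)
    return selected


# The same speaker appears in two corpora (all names are made up).
speakers = [
    Speaker("Speaker 1", "F", 1895, {"early", "interviews"}),
    Speaker("Speaker 2", "M", 1950, {"interviews"}),
]
print([s.name for s in select_speakers(speakers, "interviews", gender="F")])
```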

How do you use ONZEminer?
ONZEminer lets you search the data base
  –For all examples of a given word or phrase in
    –All speakers, or
    –A subset of speakers
A list of all examples found is shown
You can click on any example and listen to it
You can thus listen to every example of a particular word or phrase in the whole corpus
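
The kind of search described above can be illustrated with a short Python sketch over time-aligned transcript lines. It is not ONZEminer's search code: the `Line` record, the function `find_examples`, and whole-word, case-insensitive matching are assumptions chosen for the example.

```python
import re
from dataclasses import dataclass


@dataclass
class Line:
    # Hypothetical time-aligned transcript line.
    speaker: str
    start: float  # seconds into the recording
    text: str


def find_examples(lines, phrase, speakers=None):
    """Return lines containing phrase as whole words, ignoring case.

    speakers=None searches all speakers; otherwise only the given subset.
    """
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    return [
        ln
        for ln in lines
        if (speakers is None or ln.speaker in speakers)
        and pattern.search(ln.text)
    ]
```

Because each hit keeps its start time, a result list built this way could be linked back to the matching point in the sound file.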

How do you use ONZEminer?
The results can be saved as a csv file, which can be opened in Excel
Analysis can be saved
  –Directly in the Excel file
  –Or in any way that suits your project
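
A saved csv file can also be read by a script instead of Excel. The Python sketch below tallies results by one column using the standard `csv` module. The column name `Speaker` and the sample rows are invented for the example; the actual column headings in an ONZEminer export may differ.

```python
import csv
import io
from collections import Counter


def count_by_column(csv_file, column):
    """Count how many result rows share each value in the given column."""
    reader = csv.DictReader(csv_file)
    return Counter(row[column] for row in reader)


# Stand-in for an exported results file (contents are made up).
sample = io.StringIO(
    "Speaker,Text\n"
    "Speaker 1,fish and chips\n"
    "Speaker 2,a big fish\n"
    "Speaker 1,fish\n"
)
counts = count_by_column(sample, "Speaker")
print(counts.most_common())
```

For a real file, pass `open("results.csv", newline="")` in place of the `io.StringIO` object.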

What analyses can you do?
Listening: auditory impressionistic analysis
Acoustic analysis in Praat
  –If you click the Praat icon, the transcript line opens in a Praat window
  –Praat TextGrids can be automatically generated
Linguistic analysis if Celex is installed
  –Phonology, syntax and morphology can be searched
    –All words with a particular phoneme
    –A particular syntactic or morphological structure
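
For readers who have not seen a TextGrid, the Python sketch below writes one interval tier in Praat's long text format. It only illustrates the file layout and is not how ONZEminer generates its TextGrids. The function name `textgrid_text`, the tier name, and the requirement that intervals are sorted and contiguous are assumptions of the sketch.

```python
def textgrid_text(intervals, tier_name="transcript"):
    """Return long-format TextGrid text for a single interval tier.

    intervals: list of (start, end, text) tuples, assumed sorted and
    contiguous (each start equals the previous end).
    """
    xmin = intervals[0][0]
    xmax = intervals[-1][1]
    out = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        "",
        f"xmin = {xmin}",
        f"xmax = {xmax}",
        "tiers? <exists>",
        "size = 1",
        "item []:",
        "    item [1]:",
        '        class = "IntervalTier"',
        f'        name = "{tier_name}"',
        f"        xmin = {xmin}",
        f"        xmax = {xmax}",
        f"        intervals: size = {len(intervals)}",
    ]
    for i, (start, end, text) in enumerate(intervals, start=1):
        escaped = text.replace('"', '""')  # quotes are doubled inside strings
        out += [
            f"        intervals [{i}]:",
            f"            xmin = {start}",
            f"            xmax = {end}",
            f'            text = "{escaped}"',
        ]
    return "\n".join(out) + "\n"


print(textgrid_text([(0.0, 1.2, "hello"), (1.2, 2.5, "there")]))
```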

Where is online help?
There is help within ONZEminer
Its website is
The ONZE website is
Transcription guidelines and how to use Transcriber can be found at
  –The ONZE Transcription Guide is at the bottom of the page