Honours Project Proposal. Sanvir Manilal, Lebogang Molwantoa, Kyle Williams. Project Supervisor: Dr Hussein Suleman. 22 May 2009. Bushman OnLine Dictionary.
The Bleek and Lloyd Collection is a set of indigenous artefacts. The project aims to integrate a collection of digital scans of a dictionary into the existing Bleek and Lloyd Collection.
An image-based dictionary matches a word to an image. An image-based dictionary for indigenous languages can be used as a live reference. The aim is to integrate the collection of digital scans as an image-based dictionary.
Is it possible to create a reusable, generic archival system that allows users to access an image-based dictionary?
Key features: Archive Collection Management (Lebogang Molwantoa); Search and Browsing functionality (Sanvir Manilal); Image-based translation (Kyle Williams).
High-level overview of system design: Lebogang Molwantoa, Sanvir Manilal, Kyle Williams.
Research question tackled: Can we develop a useful and efficient archival system for an image-based dictionary? The archive collection management component of the project can be considered the back end of the system.
The archive is a repository for scanned images and associated metadata. It should be extensible and allow easy updates via an API. The size and complexity of the collection present a challenge.
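To make the idea of a repository with an update API concrete, here is a minimal sketch. The class names, method names, metadata keys, and sample values are all hypothetical illustrations and are not part of the proposal. The store is kept in memory for brevity, whereas a real archive would need persistent storage.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ArchiveItem:
    """One scanned image plus its metadata (hypothetical schema)."""
    item_id: str
    image_path: str
    metadata: Dict[str, str] = field(default_factory=dict)


class Archive:
    """In-memory stand-in for the repository (assumption: no persistence)."""

    def __init__(self) -> None:
        self._items: Dict[str, ArchiveItem] = {}

    def add(self, item: ArchiveItem) -> None:
        # Refuse to silently overwrite an existing scan.
        if item.item_id in self._items:
            raise KeyError(f"duplicate item id: {item.item_id}")
        self._items[item.item_id] = item

    def update_metadata(self, item_id: str, **changes: str) -> None:
        # Merge new or changed metadata fields into an existing item.
        self._items[item_id].metadata.update(changes)

    def get(self, item_id: str) -> Optional[ArchiveItem]:
        return self._items.get(item_id)

    def find(self, key: str, value: str) -> List[ArchiveItem]:
        # Linear scan; a large collection would need an index instead.
        return [i for i in self._items.values() if i.metadata.get(key) == value]


# Made-up sample data for illustration only.
archive = Archive()
archive.add(ArchiveItem("page-001", "scans/page-001.png", {"headword": "water"}))
archive.update_metadata("page-001", notebook="A")
```

Keeping metadata as free-form key/value pairs is one way to stay extensible, at the cost of weaker validation.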
Can we develop a useful and efficient archival system? The archive must be capable of processing and archiving large numbers of images efficiently.
Research question: Can image-based searching be done accurately and efficiently? It needs to return correct results. It needs to return results in a short amount of time.
A subset of content-based image retrieval (CBIR) specifically for handwritten historical documents. Performs image matching on images of words. Used to find repeat occurrences of words in manuscripts.
Look at features of the word/phrase. Create a signature for the word/phrase based on a feature vector. Insert the signature into the database.
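The indexing steps can be sketched as follows. The proposal does not say which features are used, so this example assumes a simple one: the fraction of ink falling in each vertical strip of a binarised word image. The function names, table layout, and sample image are hypothetical.

```python
import json
import sqlite3
from typing import List

Image = List[List[int]]  # rows of 0/1 pixels, where 1 means ink


def signature(image: Image, bins: int = 4) -> List[float]:
    """Feature vector: share of ink pixels in each of `bins` vertical strips."""
    width = len(image[0])
    counts = [0] * bins
    for row in image:
        for x, pixel in enumerate(row):
            counts[x * bins // width] += pixel
    total = sum(counts)
    # Normalise so that images of different sizes are comparable.
    return [c / total if total else 0.0 for c in counts]


def store(conn: sqlite3.Connection, word_id: str, sig: List[float]) -> None:
    # The vector is serialised as JSON text (an assumption made for simplicity).
    conn.execute(
        "INSERT INTO signatures (word_id, vector) VALUES (?, ?)",
        (word_id, json.dumps(sig)),
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE signatures (word_id TEXT PRIMARY KEY, vector TEXT)")

# Tiny made-up 2x4 "word image" for illustration.
word = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
]
store(conn, "word-1", signature(word))
```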
The user selects a word/phrase in the collection. Its signature is calculated. The signature is used to search for matches in the archive.
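The query steps can be sketched as a nearest-neighbour search over stored signatures. Euclidean distance is an assumption here, as are the function names and the sample vectors. The proposal does not specify a matching measure.

```python
import math
from typing import Dict, List, Sequence, Tuple


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two signatures of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(
    query: Sequence[float],
    index: Dict[str, List[float]],
    limit: int = 3,
) -> List[Tuple[str, float]]:
    """Return the `limit` closest stored signatures, nearest first."""
    ranked = sorted(
        ((word_id, distance(query, sig)) for word_id, sig in index.items()),
        key=lambda pair: pair[1],
    )
    return ranked[:limit]


# Made-up signatures standing in for the archive contents.
index = {
    "word-1": [0.5, 0.25, 0.0, 0.25],
    "word-2": [0.1, 0.4, 0.4, 0.1],
    "word-3": [0.5, 0.25, 0.05, 0.2],
}
matches = search([0.5, 0.25, 0.0, 0.25], index, limit=2)
```

This compares the query against every stored signature, which may be too slow for a large archive. An index structure would be one way to address the efficiency question.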
Accuracy? Evaluation will be based on controlled tests as well as random tests. The key measure of success will be the ability to provide image-based translation accurately and efficiently.
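One common way to quantify accuracy in a controlled test is precision and recall against a hand-labelled set of correct matches. The proposal does not name a metric, so this is only an illustrative assumption, and the identifiers below are made up.

```python
from typing import Iterable, Set, Tuple


def precision_recall(returned: Iterable[str], relevant: Set[str]) -> Tuple[float, float]:
    """Precision: share of returned items that are correct.
    Recall: share of correct items that were returned."""
    returned_set = set(returned)
    hits = len(returned_set & relevant)
    precision = hits / len(returned_set) if returned_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical test case: four results returned, three known true matches.
p, r = precision_recall(["w1", "w2", "w3", "w4"], {"w1", "w3", "w9"})
```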
Requirements gathering: UCT Fine Arts. Iterative design and development: three prototyping phases, with periodic evaluation with users. Final system integration: evaluation of the overall system functioning and performance.
Research question tackled: Can searching and browsing a visual dictionary be done in an efficient and effective way? A unique problem.
A few techniques: live search, inexact pattern matching, thumbnails, hyperlinks, scrollable list of words.
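Two of these techniques, live search and inexact pattern matching, can be sketched together. This example assumes the dictionary has a list of text headwords to search over. The function name, word list, and 0.7 similarity cutoff are hypothetical choices. It uses `difflib.get_close_matches` from the Python standard library for the inexact part.

```python
import difflib
from typing import List


def live_search(typed: str, words: List[str], limit: int = 5) -> List[str]:
    """Suggestions for a partially typed query: prefix matches first,
    then near-misses to tolerate spelling variation."""
    typed = typed.lower()
    if not typed:
        return []
    prefix_hits = [w for w in words if w.lower().startswith(typed)]
    # Inexact matching: keep words whose similarity ratio is at least 0.7.
    fuzzy_hits = difflib.get_close_matches(typed, words, n=limit, cutoff=0.7)
    merged = prefix_hits + [w for w in fuzzy_hits if w not in prefix_hits]
    return merged[:limit]


# Made-up word list for illustration.
WORDS = ["water", "waterhole", "wind", "winter", "moon"]
suggestions = live_search("wat", WORDS)
```

In a live-search interface this function would be called on each keystroke, so the suggestion list narrows as the user types.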
Evaluation: efficiency, performance, effectiveness, aesthetics.
Fully working system with three key components: Archive – Lebogang Molwantoa; Tools for searching and displaying – Sanvir Manilal; Image matching tool – Kyle Williams. Reports, website, poster, reflections – Everyone.
Three-component project. Iterative design. Evaluation. Questions?