Googalize your Search with DirectInfo Documents
DirectInfo Documents - New Features
Author: Kiril Rusev, Software Architect, Semantec Bulgaria OOD
Semantec GmbH, Benzstr. 32, D Herrenberg, Germany

Agenda
- Motivation
- What is DirectInfo Documents?
- What's new?
- Live Demo
- Future development
Motivation - The Need ? ? ?
Motivation - The Challenge
- Database Data
- Local Files
- Internet
- Intranet

Motivation - The Answer
[Diagram labels: Document Files, Database Data, Web Contents, DirectInfo, Oracle Text Index, Structured Search Results]

What is DirectInfo?
- A framework based on Oracle Text
- Can index and search various data sources
- Can be extended
- Can be adjusted to the customer's needs
Oracle Text - how does indexing work?
DirectInfo and Oracle Text
- Oracle Text CONTEXT indexes with USER_DATASTORE
- Full control over the indexing
- Flexible and extensible filtering
- Custom defined document grouping
- Regular index management
- Effective caching mechanism
- Fast and flexible searching
- A lot of context information
- Summarizing capabilities
[Diagram labels: Oracle, DirectInfo]
DirectInfo Architecture
What is DirectInfo Documents?
- Based on the DirectInfo platform
- A powerful document searching tool
- A web-based "Google-like" application
- Easily managed and deployed

What's new?
- Speed improvement
- Robustness
- Manageability
- Functional improvements
- Look and feel (LF) and search results presentation improved
Speed improvement – Document Cache
[Diagram labels: User Datastore PL/SQL Procedure, Null Filter, PDF, HTML Filtering, HTML, Document Cache, Store/Retrieve]
- HTML filtering is done only once
- The HTML version of the document is cached
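
The "filter once, then serve from cache" idea on this slide can be sketched in a few lines of Python. This is only an illustration, not DirectInfo's implementation: the class `DocumentCache`, the stand-in filter `fake_pdf_to_html`, and the in-memory dictionary are all assumptions made for the example.

```python
from typing import Callable, Dict


class DocumentCache:
    """Keeps the HTML version of each document so filtering runs at most once per document."""

    def __init__(self, html_filter: Callable[[bytes], str]) -> None:
        self._html_filter = html_filter      # hypothetical conversion step (e.g. PDF to HTML)
        self._store: Dict[str, str] = {}     # assumption: in-memory store keyed by document id
        self.filter_calls = 0                # counts how often the filter really ran

    def get_html(self, doc_id: str, raw: bytes) -> str:
        # Run the filter only on a cache miss, then store the result.
        if doc_id not in self._store:
            self.filter_calls += 1
            self._store[doc_id] = self._html_filter(raw)
        return self._store[doc_id]


def fake_pdf_to_html(raw: bytes) -> str:
    # Stand-in for a real filter; it only wraps the bytes in HTML tags.
    return "<html><body>" + raw.decode("utf-8") + "</body></html>"


cache = DocumentCache(fake_pdf_to_html)
first = cache.get_html("doc-1", b"hello")
second = cache.get_html("doc-1", b"hello")   # served from the cache, no second filter run
```

A real cache would also need an invalidation rule (for example when the source document changes), which the slide does not describe.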
Speed improvement – Faster Crawling
[Diagram labels: DirectInfo, Internet, Local Files, Crawler Interface, File Crawler, Web Crawler, Other…]
- Crawlers are adjusted according to the target document sources
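
A common crawler interface with source-specific implementations, as the diagram suggests, might look roughly like the Python sketch below. All names here (`Crawler`, `FileCrawler`, `StaticListCrawler`, `collect`) are hypothetical; the second crawler is a stand-in that yields a fixed list of URLs rather than fetching anything from the web.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Iterator, List


class Crawler(ABC):
    """Hypothetical common interface: every crawler yields document locations."""

    @abstractmethod
    def crawl(self) -> Iterator[str]:
        ...


class FileCrawler(Crawler):
    """Walks a local directory tree and yields every regular file."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def crawl(self) -> Iterator[str]:
        for path in sorted(self.root.rglob("*")):
            if path.is_file():
                yield path.as_posix()


class StaticListCrawler(Crawler):
    """Stand-in for a web crawler: yields a fixed list of URLs, no network access."""

    def __init__(self, urls: List[str]) -> None:
        self.urls = urls

    def crawl(self) -> Iterator[str]:
        yield from self.urls


def collect(crawlers: List[Crawler]) -> List[str]:
    # The caller only sees the interface, so new source types can be added freely.
    found: List[str] = []
    for crawler in crawlers:
        found.extend(crawler.crawl())
    return found


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.txt").write_text("alpha")
    (root / "sub").mkdir()
    (root / "sub" / "b.txt").write_text("beta")
    found = collect([FileCrawler(root),
                     StaticListCrawler(["http://example.invalid/index.html"])])
    names = [Path(item).name for item in found]
```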
Robustness – Better Filtering
- Before: Datastore, INSO Filter, PDF, HTML, XFilter
- After: Datastore, PDF, HTML, NULL Filter, HTML Filter, Filter 1, Filter 2, …, Filter N
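
One way to read the "After" diagram is as a chain of interchangeable filters, where a failing filter does not abort the whole conversion. The Python sketch below follows that reading; it is an assumption, since the slide does not say how the filters are selected or combined. The function names and the `(format, bytes)` signature are invented for the example.

```python
from typing import Callable, List, Optional

# Hypothetical filter signature: returns HTML text, or None if the format is not handled.
Filter = Callable[[str, bytes], Optional[str]]


def null_filter(fmt: str, raw: bytes) -> Optional[str]:
    # HTML input needs no conversion and is passed through unchanged.
    return raw.decode("utf-8") if fmt == "html" else None


def broken_pdf_filter(fmt: str, raw: bytes) -> Optional[str]:
    # Simulates a filter that crashes on its input.
    if fmt == "pdf":
        raise ValueError("simulated filter crash")
    return None


def fallback_pdf_filter(fmt: str, raw: bytes) -> Optional[str]:
    # Stand-in for a second PDF filter that succeeds.
    if fmt == "pdf":
        return "<html><body>" + raw.decode("utf-8", errors="replace") + "</body></html>"
    return None


def run_chain(filters: List[Filter], fmt: str, raw: bytes) -> Optional[str]:
    # Try each filter in order; a crash or a "not handled" result moves on to the next one.
    for current in filters:
        try:
            result = current(fmt, raw)
        except Exception:
            continue
        if result is not None:
            return result
    return None


chain = [null_filter, broken_pdf_filter, fallback_pdf_filter]
html_out = run_chain(chain, "html", b"<p>x</p>")
pdf_out = run_chain(chain, "pdf", b"text")
unknown_out = run_chain(chain, "doc", b"text")
```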
Manageability - Indexing in Chunks
- Before: Dtx_Ddl.Sync_Index, Index (unstoppable!)
- After: Dtx_Ddl.Sync_Index, Index (processed in chunks …)
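
The benefit of chunked indexing is that the job can be stopped between chunks instead of running to completion. A minimal Python sketch of that control flow follows; `sync_in_chunks`, `index_chunk`, and `should_stop` are hypothetical names, and the sketch does not call Dtx_Ddl.Sync_Index or any database.

```python
from typing import Callable, List, Sequence


def sync_in_chunks(pending: Sequence[str],
                   chunk_size: int,
                   index_chunk: Callable[[List[str]], None],
                   should_stop: Callable[[], bool]) -> int:
    """Index pending documents chunk by chunk; returns how many were processed."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    done = 0
    for start in range(0, len(pending), chunk_size):
        # The stop request is honoured only between chunks, never in the middle of one.
        if should_stop():
            break
        chunk = list(pending[start:start + chunk_size])
        index_chunk(chunk)
        done += len(chunk)
    return done


pending_docs = [f"doc-{i}" for i in range(10)]
batches: List[List[str]] = []

# Simulate an administrator stopping the job after two chunks have been indexed.
done = sync_in_chunks(pending_docs, 4, batches.append, lambda: len(batches) >= 2)
```

Documents left unprocessed stay pending, so a later run can pick up where this one stopped.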
Functional improvements - Duplicated Files Detection
- Before: Found Files, Indexed Files
- After: Found Files, Indexed Files
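
The slide does not say how duplicates are recognised. One common approach, assumed here purely for illustration, is to compare a hash of each file's content so that every distinct content is indexed only once. The function `split_duplicates` and the sample file names are hypothetical.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple


def split_duplicates(files: Iterable[Tuple[str, bytes]]) -> Tuple[List[str], Dict[str, str]]:
    """Returns the files to index and a map from each duplicate to the file it copies."""
    seen: Dict[str, str] = {}        # content digest -> first file with that content
    to_index: List[str] = []
    duplicates: Dict[str, str] = {}
    for name, content in files:
        digest = hashlib.sha256(content).hexdigest()
        if digest in seen:
            duplicates[name] = seen[digest]   # same bytes as an earlier file
        else:
            seen[digest] = name
            to_index.append(name)
    return to_index, duplicates


found_files = [("a.pdf", b"report"), ("copy-of-a.pdf", b"report"), ("b.pdf", b"notes")]
to_index, duplicates = split_duplicates(found_files)
```

With this approach the set of indexed files can be smaller than the set of found files, while the duplicate map still lets search results point to every location of a document.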
Functional improvements - Summarizer
Look and feel (LF) and search results presentation improved
- Deferred fragments loading
- Skins support, XP look and feel
- Visual and functional redesign - HTML Frames
- Searching made simpler
Live Demo
Future development
- Defining and searching metadata
- Search results clustering
- Improved flexibility
- Improved administration
- Improved caching
- Better summarizing
Thank You!