
1 © 2012 Deep Web Technologies, Inc.
SwetsWise Searcher, Powered by Explorit Research Accelerator
By Abe Lederman, President and CTO
Copenhagen, Denmark, 11 June 2012

2 About Deep Web Technologies...
– Founded by Abe Lederman in 2002
  – A co-founder of Verity, acquired by Autonomy
  – BS & MS degrees in Computer Science from MIT
  – 25 years' experience in information retrieval
– 20-person company based in Santa Fe, New Mexico
– Over $5M in DOE SBIR grants (2002-2011)
– Pioneer/trailblazer in federated search

3 Customers Include...
Academic: Stanford University, George Mason University, Texas Medical Center, University College of Cork, Tennessee Community College Consortia
Public Portals: WorldWideScience.org, Science.gov, Biznar, Mednar, ScienceResearch.com
Government: Defense Technical Info Center (DTIC), Office of Sci. & Tech. Info (DOE-OSTI), UNECA, European Space Agency
Corporate: Boeing, BASF, Intel, HP, P&G

4 What is the Deep Web?
The Deep Web is a collection of internet information sources that are generally not accessible to web spiders or crawlers and therefore cannot be indexed by popular search engines such as Google, Yahoo! or Bing, which cover only the Surface Web. It is estimated that the Deep Web holds more than 500 times as much content as the Surface Web.

5 What is “Federated Search”?
“Federated Search is an application or service that allows users to submit a real-time search in parallel to multiple, distributed information sources and retrieve aggregated, ranked and de-duplicated results.”
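The definition above can be illustrated with a minimal Python sketch. It is not Explorit's implementation: the in-memory `SOURCES` table, the `search_source` stand-in and the per-record `score` field are all assumptions made for illustration. It only shows the four ideas in the definition: parallel querying, aggregation, de-duplication and ranking.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "sources"; a real system would call remote search services.
SOURCES = {
    "journals": [
        {"title": "Deep Web Search", "score": 0.9},
        {"title": "Solar Cells", "score": 0.4},
    ],
    "ebooks": [
        {"title": "deep web search ", "score": 0.7},
        {"title": "Federated Search Basics", "score": 0.8},
    ],
}


def search_source(name, query):
    """Stand-in for one source: keep records whose title shares a word with the query."""
    terms = set(query.lower().split())
    return [
        dict(record, source=name)
        for record in SOURCES[name]
        if terms & set(record["title"].lower().split())
    ]


def federated_search(query):
    # Submit the query to every source in parallel.
    with ThreadPoolExecutor() as pool:
        batches = list(pool.map(lambda name: search_source(name, query), SOURCES))
    # Aggregate, de-duplicating on a normalized title and keeping the best-scoring copy.
    best = {}
    for batch in batches:
        for record in batch:
            key = " ".join(record["title"].lower().split())
            if key not in best or record["score"] > best[key]["score"]:
                best[key] = record
    # Rank the merged list, highest score first.
    return sorted(best.values(), key=lambda r: r["score"], reverse=True)
```

Calling `federated_search("search")` merges the two copies of "Deep Web Search" into one entry and returns it ahead of "Federated Search Basics".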

6 One Search, Many Sources
[Diagram: a single search box (“Enter Your Search…”, “Begin Search”) connected to Public Web Sources, Blogs, eBooks, Internal Databases, Journals, Wikis and Subscription Sources]

7 Why Federated Search? 4 Big Reasons…
1. Provides greater efficiency than searching sources one by one
2. Returns the most current information because sources are searched in real-time
3. Eliminates learning disparate publisher interfaces
4. Simplifies discovery of the most relevant results

8 Best Science-Focused Engines
5 of 9 created by DWT:
– Science.gov
– WorldWideScience.org
– ScienceResearch.com
– ScienceAccelerator
– Scitopia.org

9 Science.gov (2002)

10 WorldWideScience.org (2007)

11 Science Accelerator (2006)

12 ScienceResearch.com (2005)

13 Scitopia.org (2007-2011)

14 Presentation available at: www.deepwebtech.com/ala2011.ppt

15 Federated Search Has Gotten a Bad Reputation
– It is too slow
– Connectors break
– Brings back too few results from each source
– Brings back too many results
– Unable to rank results well (metadata differences, lack of info)

16 SW Searcher vs. Discovery Services
SwetsWise Searcher | Discovery Service
Real-time search of multiple collections | Multiple collections are indexed to one database
Initial results returned in 3-4 seconds; remaining results incrementally returned in up to 30 seconds | Results returned within 1-3 seconds
New results are available as soon as on publisher’s site | New results are available only after re-indexing
Searches full text where possible | Mostly indexes just metadata
Search any collection regardless of publisher | Search only collections the service subscribes to
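The incremental behaviour in the comparison (fast sources shown first, slower ones added until a cut-off) can be sketched with Python's standard `concurrent.futures` module. This is an illustrative sketch, not the product's code: `make_source`, `incremental_search` and the simulated delays are hypothetical, and only the 30-second default deadline comes from the slide.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from concurrent.futures import TimeoutError as FuturesTimeout


def make_source(delay, hits):
    """Hypothetical source that answers after `delay` seconds."""
    def search(query):
        time.sleep(delay)
        return [f"{hit} ({query})" for hit in hits]
    return search


def incremental_search(sources, query, deadline=30.0):
    """Yield (source_name, results) as soon as each source responds."""
    pool = ThreadPoolExecutor(max_workers=len(sources))
    futures = {pool.submit(fn, query): name for name, fn in sources.items()}
    try:
        # as_completed hands back futures in completion order, not submission order.
        for future in as_completed(futures, timeout=deadline):
            yield futures[future], future.result()
    except FuturesTimeout:
        pass  # Deadline reached: stop waiting for the remaining sources.
    finally:
        pool.shutdown(wait=False)  # Do not block the caller on slow sources.
```

A caller can render each batch as it arrives, for example `for name, hits in incremental_search(sources, "deep web"): ...`, so the first results appear without waiting for the slowest source.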

17 Drawbacks of Discovery Services
– Lack of transparency of what’s in Service
– Incomplete coverage of publisher content
– Lag between when content appears on publisher site and when available on Discovery Service
– Normalized metadata loses content source-specific metadata
– Content in Service limited by relationships, content of general interest

18 Landscape is Not So Clear
– Summon (ProQuest): Discovery Service
– EDS (EBSCO): Discovery Service + Federated Search
– WorldCat Local (OCLC): Discovery Service + Federated Search
– Primo (Ex Libris): Discovery Service + Federated Search
– Encore Synergy (Innovative Interfaces): Limited Discovery Service + Federated Search
– Explorit (Deep Web Technologies): Federated Search

19 When Should You Choose Federated Search?
– Access to up-to-date information is important.
– You want control of your sources.
– You want to search internal/non-mainstream sources.
– Your research is specialized (e.g., medical and legal).
– You have a wide range of subscribed content (e.g., EBSCO and ProQuest).

20 Partners since January 2010

21 Major Advantages of SwetsWise Searcher
– Rich, easy-to-use interface
– Incremental display of results
– Sophisticated connector technology
– Retrieve 50-100 results or more per source
– Relevance ranking
– Smart clustering
– Alerts and Search Builder
– Metrics

22 Easy-to-use Interface
Simple Search Box
– One-Search, “Google-like” box
– Can be embedded in your home page, blog or intranet.

23 Advanced Search Page
– Unlimited categories (sources can be in multiple categories)
– Select sources to search
– One or Two columns
– Fielded Searching
– Boolean Searching: AND, OR, NOT

24 Incremental Results

25 Connectors: Think “Connections”
Connectors make it possible to talk to other data sources
– Each source is unique so connectors “normalize” a query
– Submit proper authentication to sources
– Extract the right results
– Parse results to display the data
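The connector responsibilities listed above can be pictured with a small Python sketch. Everything here is hypothetical: the `ExampleConnector` class, the `ti:`/`au:` field prefixes, the bearer-token header and the `title|url` response format are invented for illustration and do not describe any real source or the Explorit connector framework.

```python
from dataclasses import dataclass


@dataclass
class Result:
    title: str
    url: str
    source: str


class ExampleConnector:
    """Hypothetical connector for a source that uses 'ti:'/'au:' field prefixes."""

    name = "example-source"
    FIELD_MAP = {"title": "ti", "author": "au"}

    def __init__(self, api_key):
        self.api_key = api_key

    def normalize_query(self, fields):
        # Translate generic field names into this source's own query syntax.
        return " AND ".join(
            f'{self.FIELD_MAP[field]}:"{value}"' for field, value in fields.items()
        )

    def auth_headers(self):
        # Credentials the source expects with each request (assumed scheme).
        return {"Authorization": f"Bearer {self.api_key}"}

    def parse(self, raw):
        # Assume the source returns one 'title|url' record per line.
        results = []
        for line in raw.strip().splitlines():
            title, url = line.split("|", 1)
            results.append(Result(title.strip(), url.strip(), self.name))
        return results
```

Keeping query translation, authentication and parsing inside one class per source means a change on the publisher's side only touches that source's connector.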

26 Connector Monitoring
– Proactively monitor connectors
– Monitor: source health, speed, responsiveness and errors
– Evaluated by dedicated software maintenance engineers
– Generally errors are discovered by our team before users ever notice a problem

27 Relevance Ranking
– Occurrence of search terms within titles & snippets
– Assigning weight to sources
– More current results are assigned greater weight
Read: “Ranking: The Secret Sauce for Searching the Deep Web”
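The three ranking signals on this slide can be combined in a toy scoring function. The formula below is an assumption made for illustration, not Deep Web Technologies' ranking algorithm: the double weight for title hits, the multiplicative source weight and the `1 / (1 + age)` recency boost are all invented, as are the field names.

```python
def relevance(result, query, source_weights, current_year=2012):
    """Toy score: term hits in title/snippet, scaled by source weight and recency."""
    terms = query.lower().split()
    title = result["title"].lower().split()
    snippet = result["snippet"].lower().split()
    # Title hits count double (assumed weighting).
    hits = sum(2 * title.count(term) + snippet.count(term) for term in terms)
    # Sources without an explicit weight default to 1.0.
    weight = source_weights.get(result["source"], 1.0)
    # Newer results get a larger boost; the boost decays with age in years.
    age = max(0, current_year - result["year"])
    recency = 1.0 / (1 + age)
    return hits * weight * (1 + recency)
```

With equal text matches, a 2012 result outranks a 2002 one, and raising a source's weight lifts all of its results proportionally.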

28 Clustering
Real-time semantic analysis of results creates clusters on-the-fly. Discover relationships behind the results, not just “keywords.”
Read: “Clusters That Think”

29 Alerts
– Delivery online or via email
– Daily, Weekly, Monthly
– Pick and choose your sources
– Export to RSS reader
– Maintain database of past results

30 Search Builder
– Create search pages easily
– Choose collections and search fields
– Integrates with Course Management Software
– Embed search box using built-in widget

31 SwetsWise Searcher Metrics
– Graphics-based or tabular
– Single day (hourly breakdown) or entire month
– Downloadable to spreadsheet
– Reports include:
  – Number of queries run
  – Number of results retrieved per source
  – Average time to retrieve results from a source
  – Average rank of results retrieved per source
  – Timeouts/errors by source
  – Searches run (query strings)
  – Clickthrough stats
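Several of the per-source reports listed above are simple aggregations over a search log. The sketch below assumes a hypothetical log format (one record per source per query, with `source`, `results`, `seconds` and `timed_out` fields); the actual SwetsWise Searcher log schema is not described in the slides.

```python
from collections import defaultdict


def summarize(log):
    """Aggregate per-source query counts, result counts, average time and timeouts."""
    stats = defaultdict(
        lambda: {"queries": 0, "results": 0, "seconds": 0.0, "timeouts": 0}
    )
    for record in log:
        entry = stats[record["source"]]
        entry["queries"] += 1
        if record["timed_out"]:
            entry["timeouts"] += 1
        else:
            entry["results"] += record["results"]
            entry["seconds"] += record["seconds"]
    report = {}
    for source, entry in stats.items():
        answered = entry["queries"] - entry["timeouts"]
        report[source] = {
            "queries": entry["queries"],
            "results": entry["results"],
            # Average only over queries the source actually answered.
            "avg_seconds": entry["seconds"] / answered if answered else None,
            "timeouts": entry["timeouts"],
        }
    return report
```

Excluding timed-out queries from the average keeps a failing source from looking artificially fast or slow; its problems show up in the timeout count instead.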

33 Hosted vs. Installed Solutions
Hosted | Installed
Deep Web Technologies hosts the application | Client hosts the application
Technical support through Deep Web Technologies | Client IT staff must support application
Deep Web Technologies can access application at any time | Deep Web Technologies has limited or no access to the application
Deep Web Technologies monitors and maintains connectors | Deep Web Technologies monitors and maintains accessible connectors
Limited or no ability to access internal sources | Can access internal sources

34 Multilingual WorldWideScience.org

35 WorldWideScience.org is an Excellent Candidate for Multilingual Search
– A global gateway to international science databases and portals
– All content is from national governments or vetted by national governments
– Developed in partnership with the DOE Office of Scientific and Technical Information (OSTI), WWS Alliance and Microsoft Research
– One-stop searching
– Includes databases from China, Japan, Korea, Germany, and other non-English countries

36 How Multilingual Federated Search Works
1. Query in user’s language is submitted to EXPLORIT
2. Query to be translated for each source is sent to Microsoft Translator
3. Query in source’s language is sent to foreign language search engines (German, Chinese, Russian)
4. Results in source’s language are returned for ranking
5. Ranked results are translated by Microsoft to user’s language
6. Ranked results in user’s language are returned to user
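The translate, search, rank, translate-back loop on this slide can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the word-for-word `DICTIONARY` stands in for a machine-translation service (no real translation API is called), and the two portals, their documents and the length-based ranking are all invented.

```python
# Toy word-for-word dictionary standing in for a translation service (assumption).
DICTIONARY = {
    ("en", "de"): {"energy": "energie"},
    ("de", "en"): {"energie": "energy", "bericht": "report"},
}

# Hypothetical sources, each with its own language and documents.
SOURCES = {
    "german-portal": {"lang": "de", "docs": ["Energie Bericht"]},
    "english-portal": {"lang": "en", "docs": ["Energy outlook"]},
}


def translate(text, src, dst):
    """Translate word by word; text already in the target language is unchanged."""
    if src == dst:
        return text
    table = DICTIONARY[(src, dst)]
    return " ".join(table.get(word, word) for word in text.lower().split())


def multilingual_search(query, user_lang):
    hits = []
    for name, source in SOURCES.items():
        # Query in the source's language.
        local_query = translate(query, user_lang, source["lang"])
        # Results in the source's language.
        for doc in source["docs"]:
            if local_query in doc.lower():
                hits.append((name, source["lang"], doc))
    # Toy ranking step: shorter matching titles first.
    hits.sort(key=lambda hit: len(hit[2]))
    # Ranked results translated back to the user's language.
    return [(name, translate(doc, lang, user_lang)) for name, lang, doc in hits]
```

Here `multilingual_search("energy", "en")` finds the German document through the translated query "energie" and returns its title in English alongside the native English hit.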

38 Coming in the Fall
– Visualization
– Full-Faceted Navigation
– Mendeley Integration
– Document Type and Document Format Clusters
– Full Text Filter

39 Visualization
Using our clustering technology, results visualization allows users to see relationships between topics easily.

40 Mendeley

41 Document Type and Document Format Clusters

42 Full Text Filter
Access Full Text!

43 Future - Mobile Searching

44 Thank you!
Abe Lederman
abe@deepwebtech.com

