Abe Lederman, President and CTO Deep Web Technologies, Inc. ScienceEducation.gov Meeting National Academy of Sciences, March 18, 2009 A Look at the Technology.

Presentation transcript:

Abe Lederman, President and CTO, Deep Web Technologies, Inc.
ScienceEducation.gov Meeting, National Academy of Sciences, March 18, 2009
A Look at the Technology Under the Hood

Content Integration Technologies for ScienceEducation.gov
Crawling and Indexing (part of Science.gov, E-Print Network)
Federated Search (Science.gov, WorldWideScience.org)
ScienceEducation.gov needs to successfully integrate content from a variety of websites and databases, which requires custom tools that other search engines are unable to provide.

Drawing on the Experience of the E-Print Network
Gateway to 30,000 websites and databases worldwide, containing over 5 million e-prints in basic and applied sciences.

Drawing on the Experience of the E-Print Network
Initially developed in 2001
Crawls and indexes 30,000 websites
Uses sophisticated filters to ensure that only quality e-prints are included in the Network
Contains a full-text index of over 1.5 million e-prints
Uses an Admin Tool to manage websites in the E-Print Network

What is Federated Search?
Federated Search is an application or service that allows a user to submit a search in parallel to multiple, distributed information sources and retrieve aggregated, ranked and de-duplicated results.
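The definition above can be illustrated with a minimal Python sketch. It is not the Explorit or Science.gov implementation: the two source functions, their example.org URLs, and the relevance scores are hypothetical stand-ins for remote search engines, and "keep the highest score per URL" is just one simple way to aggregate and de-duplicate.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in sources. Each takes a query and returns a list of
# (url, title, score) hits; real sources would be remote search engines.
def search_source_a(query):
    return [("https://example.org/a", "Solar basics", 0.9),
            ("https://example.org/b", "Wind basics", 0.4)]

def search_source_b(query):
    return [("https://example.org/a", "Solar basics", 0.7),
            ("https://example.org/c", "Solar sails", 0.8)]

def federated_search(query, sources):
    """Query every source in parallel, then merge, de-duplicate and rank."""
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        result_lists = list(pool.map(lambda source: source(query), sources))

    # De-duplicate by URL, keeping the best-scoring copy of each hit.
    best = {}
    for hits in result_lists:
        for url, title, score in hits:
            if url not in best or score > best[url][2]:
                best[url] = (url, title, score)

    # Rank the merged hits by score, highest first.
    return sorted(best.values(), key=lambda hit: hit[2], reverse=True)

results = federated_search("solar", [search_source_a, search_source_b])
for url, title, score in results:
    print(f"{score:.1f}  {title}  {url}")
```

In practice, scores from different sources are rarely comparable as-is, so a real system would likely re-rank the merged results with its own relevance measure.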

In Other Words… One Search, Many Sources
[Diagram: a single Search box connected to DOD, EPA, NASA, FDA, NIH, DOE, NSF, and Other Agencies]

Assembling the ScienceEducation.gov Search Engine - Part I
[Diagram labels: Assemble Starting URLs; Education Experts]

Assembling the ScienceEducation.gov Search Engine - Part II
[Diagram labels, in slide order: Starting URLs; Crawl Websites; Filter Bad URLs and Remove Duplicates; Build Index; Assign Learning Levels; ScienceEducation.gov Index]

Challenges Ahead
Determining what sites to crawl
Filtering undesirable URLs
Assigning appropriate learning level to content
Categorizing content

To Crawl or Not To Crawl?
[Figure labels: Would miss these; Don’t crawl these pages; Will crawl these]

Filtering Undesirable URLs
[Diagram: All Crawled URLs pass through a Filter to yield Good URLs]
Page types shown alongside the filter: Calendar, Contact, Feedback, Housing, Registration, Survey
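A minimal sketch of such a filter follows. It assumes that the page types listed on the slide are the ones to exclude and that they can be recognized from a URL path segment; the talk does not describe the actual filter rules, and the function name and example.org URLs are hypothetical.

```python
from urllib.parse import urlparse

# Page types named on the slide; treating them as path keywords is an assumption.
UNDESIRABLE_TERMS = {
    "calendar", "contact", "feedback", "housing", "registration", "survey",
}

def is_good_url(url):
    """Return False if any path segment names an undesirable page type."""
    path = urlparse(url).path.lower()
    for segment in path.split("/"):
        if not segment:
            continue
        name = segment.rsplit(".", 1)[0]  # "contact.html" -> "contact"
        if name in UNDESIRABLE_TERMS:
            return False
    return True

crawled = [
    "https://example.org/lessons/volcano.html",
    "https://example.org/contact.html",
    "https://example.org/events/calendar/2009",
    "https://example.org/housing/",
]
good = [url for url in crawled if is_good_url(url)]
print(good)
```

Exact segment matching keeps the sketch conservative: a lesson titled "survey-of-planets.html" would still pass, at the cost of missing undesirable pages with less regular names.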

Removing Duplicate Web Pages
URL:
DUP:
TITLE: Ocean Planet: Threats
SNIPPET: Threats to the health of the oceans Oil spills account for only about five percent of the oil entering the oceans The Coast Guard estimates that for United States waters sewage treatment plants discharge twice as much oil each year as tanker spills Each year industrial household cleaning gardening and automotive products pollute water About chemicals are used commercially in the United States today with about new ones added each year Only about 300 have been extensively tested for toxicity It is estimated that medical waste that washed up onto Long Island and New Jersey beaches in the summer of 1988 cost as much as 3 billion in lost revenue from tourism and recreation.
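The record above pairs a page with the URL of its duplicate. One simple way to find such pairs is to fingerprint each page's normalized title and snippet, as in the sketch below. This is an assumption for illustration, not the method used in the talk: the field names, the SHA-256 fingerprint, and the example.org URLs are all hypothetical, and exact fingerprints only catch pages whose text is identical after normalization.

```python
import hashlib
import re

def fingerprint(title, snippet):
    """Hash the title and snippet after lowercasing and stripping punctuation."""
    text = f"{title} {snippet}".lower()
    text = re.sub(r"[^a-z0-9]+", " ", text).strip()
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def remove_duplicates(pages):
    """Return (unique_pages, duplicates); duplicates pair each repeated URL
    with the URL of the first page that had the same fingerprint."""
    first_url_by_key = {}
    unique, duplicates = [], []
    for page in pages:
        key = fingerprint(page["title"], page["snippet"])
        if key in first_url_by_key:
            duplicates.append((page["url"], first_url_by_key[key]))
        else:
            first_url_by_key[key] = page["url"]
            unique.append(page)
    return unique, duplicates

pages = [
    {"url": "https://example.org/ocean/threats.html",
     "title": "Ocean Planet: Threats",
     "snippet": "Threats to the health of the oceans"},
    {"url": "https://example.org/mirror/threats.html",
     "title": "Ocean Planet: Threats",
     "snippet": "Threats to the health of the oceans."},
    {"url": "https://example.org/ocean/tides.html",
     "title": "Ocean Planet: Tides",
     "snippet": "How tides work"},
]
unique, duplicates = remove_duplicates(pages)
print(len(unique), duplicates)
```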

Learning Level Stratification

Categorizing Content
Audience: Student or Teacher
Grade Level: K-3, 4-6, 7-9, 10-12, College
Content Type: Interactive Activities, Lesson Plans, Reference Materials, Science Fair Projects, Videos
Subject Area: Chemistry, Computer Science, Energy, Life Sciences, Mathematics, Physics
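The four facets above could be attached to each indexed resource as a small validated record. The sketch below uses only the values listed on the slide; the class name, field names, and the choice to reject unknown values are assumptions, not part of the ScienceEducation.gov design.

```python
from dataclasses import dataclass

# Allowed values, copied from the slide.
AUDIENCES = ("Student", "Teacher")
GRADE_LEVELS = ("K-3", "4-6", "7-9", "10-12", "College")
CONTENT_TYPES = ("Interactive Activities", "Lesson Plans", "Reference Materials",
                 "Science Fair Projects", "Videos")
SUBJECT_AREAS = ("Chemistry", "Computer Science", "Energy", "Life Sciences",
                 "Mathematics", "Physics")

@dataclass(frozen=True)
class ResourceCategory:
    """Hypothetical category record for one indexed resource."""
    audience: str
    grade_level: str
    content_type: str
    subject_area: str

    def __post_init__(self):
        # Reject any value that is not in the corresponding facet list.
        checks = (
            ("audience", self.audience, AUDIENCES),
            ("grade_level", self.grade_level, GRADE_LEVELS),
            ("content_type", self.content_type, CONTENT_TYPES),
            ("subject_area", self.subject_area, SUBJECT_AREAS),
        )
        for field_name, value, allowed in checks:
            if value not in allowed:
                raise ValueError(f"{field_name}: {value!r} is not one of {allowed}")

lesson = ResourceCategory("Teacher", "4-6", "Lesson Plans", "Energy")
print(lesson)
```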

A Look at the Technology Under the Hood
Thank you!
Abe Lederman