1 QA for the Web
Language Computer Corporation, Dallas, Texas
PI: Dan Moldovan

2 Motivation
In the US alone, there are more than 100 million Internet users per day.
Each user asks 5 questions on average.
Each user spends about half an hour finding answers.

3 Tasks
Task 1 – Adapt the QA technology to the universality of Web hypertexts
Task 2 – Interface the QA system with the emerging Semantic Web technologies

4 Task 1: Adapt QA Technology to the Web
Two approaches:
1. Use available search engines
2. Gather documents from the Web and form a local collection

5 QA on Top of a Search Engine
[Diagram. Components: Question Processing, Search Engine, Format Manager, Paragraph Retrieval, Answer Processing. Data passed between them: Keywords, Documents, Normalized Documents.]
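The components named on this slide can be wired together roughly as follows. This is only a sketch under assumptions: the slide gives the component names but not their order or internals, so the data flow (question, keywords, documents, normalized documents, paragraphs, answer), the stopword list, the keyword-overlap ranking, and the `search_engine` callable are all hypothetical stand-ins, not LCC's implementation.

```python
import re
from typing import Callable, List, Optional

# Hypothetical stopword list; a real system would use a much fuller one.
STOPWORDS = {"a", "an", "the", "is", "of", "what", "who", "how",
             "where", "when", "does", "do"}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def question_processing(question: str) -> List[str]:
    """Reduce a question to its content keywords."""
    return [t for t in tokenize(question) if len(t) > 1 and t not in STOPWORDS]


def format_manager(raw_document: str) -> str:
    """Normalize a document to plain text with one paragraph per line."""
    text = re.sub(r"</?p\b[^>]*>|<br\s*/?>", "\n", raw_document,
                  flags=re.IGNORECASE)
    text = re.sub(r"<[^>]+>", " ", text)  # drop any remaining tags
    lines = (re.sub(r"\s+", " ", line).strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)


def paragraph_retrieval(documents: List[str], keywords: List[str]) -> List[str]:
    """Return paragraphs containing at least one keyword, best match first."""
    scored = []
    for document in documents:
        for paragraph in document.split("\n"):
            hits = len(set(keywords) & set(tokenize(paragraph)))
            if hits:
                scored.append((hits, paragraph))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [paragraph for _, paragraph in scored]


def answer_processing(paragraphs: List[str]) -> Optional[str]:
    """Placeholder answer extraction: return the top-ranked paragraph."""
    return paragraphs[0] if paragraphs else None


def answer_question(question: str,
                    search_engine: Callable[[List[str]], List[str]]) -> Optional[str]:
    """Run the assumed pipeline end to end."""
    keywords = question_processing(question)
    documents = [format_manager(raw) for raw in search_engine(keywords)]
    return answer_processing(paragraph_retrieval(documents, keywords))
```

Passing the search engine in as a callable keeps the QA stages independent of any particular engine, which matches the slide's idea of layering QA on top of an existing one.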

6 QA on Top of a Database Engine
[Diagram. Components: Question Processing, Query Builder, Database Engine, Format Manager, Paragraph Retrieval, Answer Processing. Data passed between them: Keywords, Query, Database Records, Normalized Documents.]
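The Query Builder is the one component this slide adds: it turns keywords into a database query. A minimal sketch, assuming a SQL database with a hypothetical `records(id, body)` table and simple substring matching (the slide does not say which database or query language was used):

```python
import sqlite3
from typing import List, Tuple


def build_query(keywords: List[str]) -> Tuple[str, List[str]]:
    """Build a parameterized query matching records that contain any keyword.

    The table name `records` and column `body` are hypothetical.
    Keywords containing % or _ would act as LIKE wildcards in this sketch.
    """
    if not keywords:
        raise ValueError("at least one keyword is required")
    clauses = " OR ".join("body LIKE ?" for _ in keywords)
    sql = f"SELECT id, body FROM records WHERE {clauses} ORDER BY id"
    params = [f"%{keyword}%" for keyword in keywords]
    return sql, params


def fetch_records(conn: sqlite3.Connection,
                  keywords: List[str]) -> List[Tuple[int, str]]:
    """Run the built query and return the matching (id, body) rows."""
    sql, params = build_query(keywords)
    return conn.execute(sql, params).fetchall()
```

Keeping the keywords as bound parameters rather than pasting them into the SQL string avoids injection problems when questions come from arbitrary users. The returned rows would then go through the Format Manager like any other document.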

7 Technical challenges
Different formats: PDF, HTML, DOC, PS
Document layout
Dynamically generated pages
Password protection
Subscription required
Cookies

8 Build local collections of documents
Gather documents from a specific site and cache them locally
Transform them into a canonical text form, then index the documents
Maintain the document collection: update constantly, avoid redundant documents, garbage-collect, etc.
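The steps on this slide (cache, canonicalize, index, avoid redundant documents) can be sketched as a small in-memory collection. Everything here is an illustrative assumption: the class name `LocalCollection`, tag stripping as the "canonical form", SHA-256 content hashes for duplicate detection, and a plain inverted index are not taken from the original system.

```python
import hashlib
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class LocalCollection:
    """Hypothetical local cache of canonicalized documents with an index."""

    def __init__(self) -> None:
        self.documents: Dict[str, str] = {}        # url -> canonical text
        self._url_by_digest: Dict[str, str] = {}   # content hash -> url
        self.index: Dict[str, Set[str]] = defaultdict(set)  # term -> urls

    @staticmethod
    def canonicalize(raw: str) -> str:
        """Strip markup and collapse whitespace (assumed canonical form)."""
        text = re.sub(r"<[^>]+>", " ", raw)
        return re.sub(r"\s+", " ", text).strip()

    @staticmethod
    def _digest(text: str) -> str:
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def add(self, url: str, raw: str) -> bool:
        """Cache and index a document; return False if it is redundant."""
        text = self.canonicalize(raw)
        digest = self._digest(text)
        if digest in self._url_by_digest:
            return False  # identical content is already in the collection
        self.remove(url)  # drop a stale version of this URL, if any
        self.documents[url] = text
        self._url_by_digest[digest] = url
        for term in set(tokenize(text)):
            self.index[term].add(url)
        return True

    def remove(self, url: str) -> None:
        """Delete a document and its index entries."""
        text = self.documents.pop(url, None)
        if text is None:
            return
        del self._url_by_digest[self._digest(text)]
        for term in set(tokenize(text)):
            self.index[term].discard(url)
            if not self.index[term]:
                del self.index[term]

    def search(self, keywords: Iterable[str]) -> Set[str]:
        """Return the URLs of documents containing every keyword."""
        postings = [self.index.get(k.lower(), set()) for k in keywords]
        return set.intersection(*postings) if postings else set()
```

Hashing the canonical text rather than the raw bytes means two pages that differ only in markup or whitespace count as one document, which is one plausible reading of "avoid redundant documents".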

9 Experiments
Business: InterVoiceBrite product manuals
Community: City of Irving
News: cnn.com, abcnews.com, dallasnews.com, time.com, washingtonpost.com

10 InterVoiceBrite
Collection: product manuals
size: 38MB
files: 802
format: PDF
layout: specific to manuals
changes occur at large time intervals

11 PECULIARITIES OF THEIR NEEDS
The question is in the form of a problem description
The expected answer is a solution to the problem
The answer is compiled from different parts of documents and given in the form of a procedure to be followed
Follow-up questions frequently lead to a dialogue

12 An Example
Question: “I would like to have the caller be able to control the playback of a long set of instructions with speech recognition. While the message is playing, the caller may say “stop”, “go back”, “forward”, or “start over” and have the system respond appropriately. Can this be done? The SpeechAccess engine is Nuance.”
Answer: “Yes, this can be done. Play a lead-in message to tell the caller to say “next”, “backup”, or “done”. Then, with the loop, play the first instruction you want the caller to hear in keyover mode. To obtain the line balancing procedure and the required files, please visit the continuing engineering web page.”

13 Our Demo
Q: How can I obtain line balancing information?
A: READ DSLACRequest AI1 DSLAC line balancing information
Q: How can I modify a message?
A: Your Voice The feature that enables a voice mail user to change specific voice messages
Q: What is the runtime engine?
A: ISINIT, the runtime engine,

14 Our Demo
Q: What type of error is HH?
A: Hardware Handler (HH) error
Q: What causes telephony connection problems?
A: Telephony connection problems can be caused by the InterSoft system or by the telephony equipment (PBX)
Q: What does FUSE mean?
A: FUSE Indicates a problem with the fuse

15 City of Irving
Collection: heterogeneous, city information
size: 96MB
files: 1097
format: HTML, PDF, DOC
layout: WWW space
small daily changes

16 Examples
Q: When does the Farmer’s Market take place?
A: Irving Farmers’ Market: 1st and 3rd Saturdays in Downtown Irving
Q: What is Irving’s news source?
A: Irving’s news source is the City Spectrum
Q: Where does Irving’s water supply come from?
A: The City of Irving purchases its entire water supply from the City of Dallas

17 Examples
Q: Where can I pay traffic fines?
A: Irving Municipal Court, Criminal Justice Center, 305 N. O’Connor Rd
Q: How do I apply for a job with the City?
A: Applications are accepted from 8 a.m. to 5 p.m. Monday – Friday at the Civic Center Complex, 825 W. Irving Blvd. Job listings are available on the city’s Web site, or by calling the city’s 24-hour job line at (972) www.ci.irving.tx.us

18 NEWS
Collection:
sources: CNN.COM, TIME.COM, ABCNEWS.COM, DALLASNEWS.COM, WASHINGTONPOST.COM
size: 531MB
files:
format: HTML, PDF, DOC
frequent changes

19 Issues
broken links
garbage collection for obsolete files
cumulative NEWS updates depending on the type of source (TIME.COM - weekly)
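The maintenance issues above could be handled with a per-source refresh schedule and an age-based cleanup. In this sketch only the weekly interval for TIME.COM comes from the slide; the daily default, the function names, and the rule that a file is obsolete once it has not been seen on its source for longer than `max_age` are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, List

# Only the weekly TIME.COM interval is stated on the slide.
REFRESH_INTERVAL = {"time.com": timedelta(weeks=1)}
# Assumed default for every other source.
DEFAULT_INTERVAL = timedelta(days=1)


def is_due_for_update(source: str, last_fetched: datetime, now: datetime) -> bool:
    """Return True when the source's refresh interval has elapsed."""
    interval = REFRESH_INTERVAL.get(source.lower(), DEFAULT_INTERVAL)
    return now - last_fetched >= interval


def collect_garbage(last_seen: Dict[str, datetime], now: datetime,
                    max_age: timedelta) -> List[str]:
    """Remove files not seen on their source for longer than max_age.

    `last_seen` maps a cached file path to the last time a crawl found it.
    The mapping is modified in place; the removed paths are returned sorted.
    """
    obsolete = sorted(path for path, seen in last_seen.items()
                      if now - seen > max_age)
    for path in obsolete:
        del last_seen[path]
    return obsolete
```

Under this scheme a broken link needs no special case: a page that can no longer be fetched simply stops having its `last_seen` time refreshed and is eventually collected.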

20 Examples
Q: How many soldiers died in Afghanistan?
A: The US military has opened an investigation into last week’s friendly fire incident in Afghanistan that killed four Canadian soldiers and injured eight others
Q: How much did President Bush increase aid for poor countries?
A: Bush said the US will increase its initial pledge of $200 million only after the fund proves successful
Q: Who is the owner of Dallas Mavericks?
A: Mark Cuban, Internet entrepreneur and owner of the NBA’s Dallas Mavericks

21 QA and Semantic Web
QA technology can contribute to the development of the Semantic Web
Possible architectures:
1. QA as an interface between the Intelligent Agent and the Semantic Web
[Diagram labels, in order: Human, Agent, QA, Web]

22 QA and Semantic Web
2. QA works on a local collection
[Diagram labels, first variant, in order: Human, Agent, QA, Web, Local Collection]
[Diagram labels, second variant, in order: Human, Agent, QA, Local Collection]

23 Technical Challenges to be Addressed
1. Make the QA system compatible with Semantic Web languages (e.g., XML, RDF, DAML, OIL)
2. Make QA ontologies compatible with the Semantic Web ontology
3. Interface the QA system with Intelligent Agents

24 Thank you!