Aakarsh Malhotra (2011002), Gandharv Kapoor (2011047)



Introduction
• A cognitive technology that processes information more like a human than a computer.
• Named after IBM founder Thomas J. Watson.
• It is IBM's Question Answering (QA) project, led by David Ferrucci.
• It answers questions asked in natural language with speed, accuracy and confidence.

• Natural Language
  ◦ Watson can read and understand natural language, which is important in analyzing the unstructured data that makes up as much as 80 percent of data today.
• Hypothesis Generation
  ◦ When asked a question, Watson relies on hypothesis generation and evaluation to rapidly parse relevant evidence and evaluate responses from disparate data.
• Dynamic Learning
  ◦ Through repeated use, Watson gets smarter by tracking feedback from its users and learning from both successes and failures.

• Pizza-box sized.
• Understands complex human language well, including slang, metaphors and badly framed questions, and answers precisely with confidence.
• Moves from keyword-based search to intuitive, personalised search with confidence-ranked responses.

Watson competed on Jeopardy! against former winners Brad Rutter and Ken Jennings.

A short video….

• IBM developed DeepQA, a massively parallel software architecture that examines natural language content both in the clues set by Jeopardy! and in Watson's own stored data.
• DeepQA works out what the question is asking, then works out some possible answers based on the information it has.
• It then generates a ranked list of answers, with evidence for each option.
• All the information had to be stored locally.
• Watson wasn't allowed to connect to the Internet during the quiz.
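
The flow described above (work out what the question asks, propose possible answers from locally stored content, then rank them with their evidence) can be sketched roughly as follows. This is a toy illustration only, not DeepQA itself: the `LOCAL_CORPUS` passages, the `keywords`, `generate_candidates` and `rank_answers` helpers, and the keyword-overlap score are all assumptions made for this example.

```python
from dataclasses import dataclass, field

# Hypothetical locally stored content: candidate answer -> supporting passages.
LOCAL_CORPUS = {
    "Thomas J. Watson": ["Thomas J. Watson led IBM as chairman and chief executive."],
    "David Ferrucci": ["David Ferrucci led the IBM question answering project."],
    "Ken Jennings": ["Ken Jennings is a former Jeopardy! winner."],
}

# Assumed stopword list, kept tiny on purpose.
STOPWORDS = {"the", "a", "an", "of", "who", "what", "is", "as", "and"}


def keywords(text):
    """Lower-case the text and return its non-stopword tokens as a set."""
    tokens = (word.strip(".,!?").lower() for word in text.split())
    return {t for t in tokens if t and t not in STOPWORDS}


@dataclass
class Candidate:
    answer: str
    score: float                      # fraction of question keywords matched
    evidence: list = field(default_factory=list)


def generate_candidates(question):
    """Propose every stored answer whose passages share a keyword with the question."""
    q = keywords(question)
    candidates = []
    for answer, passages in LOCAL_CORPUS.items():
        evidence = [p for p in passages if q & keywords(p)]
        if evidence:
            best_overlap = max(len(q & keywords(p)) for p in evidence)
            candidates.append(Candidate(answer, best_overlap / len(q), evidence))
    return candidates


def rank_answers(question):
    """Return candidates sorted from highest to lowest score."""
    return sorted(generate_candidates(question), key=lambda c: c.score, reverse=True)


if __name__ == "__main__":
    for c in rank_answers("Who led the IBM question answering project?"):
        print(f"{c.answer}: {c.score:.2f} | evidence: {c.evidence[0]}")
```

Everything is read from an in-memory dictionary, which mirrors the constraint that no Internet access was allowed during the quiz.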

• Articles
• Wikipedia
• Internal organization documents
  ◦ For example, a doctor's notes and experience
• Encyclopaedias, dictionaries, thesauri

• Hardware
  Watson is composed of:
  ◦ A cluster of ninety IBM Power 750 servers.
  ◦ Each server uses 3.5 GHz POWER7 eight-core processors, with four threads per core.
  ◦ In total, the system has 2,880 POWER7 processor cores and 16 TB of RAM.
• Software
  ◦ Watson uses IBM's DeepQA software and the Apache UIMA (Unstructured Information Management Architecture) framework.
  ◦ The system is written in various languages, including Java, C++ and Prolog.
  ◦ It runs on the SUSE Linux Enterprise Server 11 operating system, using the Apache Hadoop framework to provide distributed computing.
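
As a quick sanity check on the hardware figures, the arithmetic below reproduces the 2,880-core total. The server count, cores per processor, threads per core and RAM come from the slide; the value of four processors per server is an assumption, inferred here because it is what makes the totals agree, and the even split of RAM across servers is likewise only an assumption.

```python
SERVERS = 90               # from the slide
CORES_PER_PROCESSOR = 8    # from the slide
THREADS_PER_CORE = 4       # from the slide
TOTAL_RAM_TB = 16          # from the slide
PROCESSORS_PER_SERVER = 4  # assumption: inferred so the total matches 2,880 cores

total_cores = SERVERS * PROCESSORS_PER_SERVER * CORES_PER_PROCESSOR
total_threads = total_cores * THREADS_PER_CORE

# Assumes RAM is divided evenly across servers, which the slide does not state.
ram_per_server_gb = TOTAL_RAM_TB * 1024 / SERVERS

print(total_cores)               # 2880
print(total_threads)             # 11520
print(round(ram_per_server_gb))  # 182
```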

• The components that follow DeepQA each produce computed features and confidence scores. Hierarchical machine learning is used to combine these features.
• After the keyword search, candidate answers are identified.
  ◦ Those with very low confidence are rejected.
  ◦ Those with high confidence move on to the final merging stage.
  ◦ Medium-confidence results are checked in a soft filtering phase based on machine learning over training data.
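
The three-way split described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Watson's actual model: a single logistic combination stands in for the hierarchical learner, and the feature names, `WEIGHTS`, `BIAS` and the `LOW`/`HIGH` thresholds are all hypothetical values chosen for the example.

```python
import math

# Hypothetical feature weights; a real system would learn these from training data.
WEIGHTS = {"keyword_match": 2.0, "type_match": 1.5, "source_reliability": 1.0}
BIAS = -2.5

# Assumed confidence thresholds for the three outcomes.
LOW, HIGH = 0.2, 0.8


def confidence(features):
    """Combine feature scores into a single confidence in (0, 1) with a logistic model."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def triage(candidates):
    """Route each candidate answer to 'rejected', 'soft_filter' or 'final_merge'."""
    routed = {"rejected": [], "soft_filter": [], "final_merge": []}
    for answer, features in candidates.items():
        c = confidence(features)
        if c < LOW:
            routed["rejected"].append(answer)
        elif c >= HIGH:
            routed["final_merge"].append(answer)
        else:
            routed["soft_filter"].append(answer)
    return routed


if __name__ == "__main__":
    example = {
        "strong": {"keyword_match": 1.0, "type_match": 1.0, "source_reliability": 1.0},
        "unsure": {"keyword_match": 1.0, "type_match": 0.5, "source_reliability": 0.5},
        "weak": {"keyword_match": 0.0, "type_match": 0.0, "source_reliability": 0.0},
    }
    print(triage(example))
```

In this sketch only the candidates in `soft_filter` would need the extra, more expensive check; the other two groups are decided by the thresholds alone.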


Thanks!!