
1 Making Watson Fast Daniel Brown HON111

2 Need for Watson to be fast to play Jeopardy successfully
– All computations have to be done in a few seconds
– Initial application speed: 1-2 hours of processing time per question

3 Unstructured Information Management Architecture (UIMA): framework for NLP applications; facilitates parallel processing
– UIMA-AS: Asynchronous Scaleout
UIMA was chosen at the start for these reasons; other optimization work only began after 2 years (after QA accuracy/confidence improved)

4 UIMA implementation of DeepQA

5 Type System
Common Analysis Structure (CAS)
Annotator
– CAS multiplier (CM): creates new “children” CASes
Flow Controller
CASes can be spread across multiple systems (processed in parallel) for efficiency
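A minimal sketch of the fan-out idea on this slide, not UIMA's actual API: `Cas`, `cas_multiplier`, `annotator`, and `flow` are hypothetical names, and threads stand in for the separate machines that a real deployment would use. The multiplier creates child CASes, and the flow controller hands them to annotators in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class Cas:
    """Hypothetical stand-in for a UIMA CAS: some text plus its annotations."""
    text: str
    annotations: dict = field(default_factory=dict)


def cas_multiplier(parent):
    """Create one child CAS per whitespace-separated token of the parent."""
    return [Cas(text=token) for token in parent.text.split()]


def annotator(cas):
    """Toy annotator: record the length of the text as an annotation."""
    cas.annotations["length"] = len(cas.text)
    return cas


def flow(parent, workers=4):
    """Toy flow controller: multiply, then annotate the children in parallel."""
    children = cas_multiplier(parent)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input children.
        return list(pool.map(annotator, children))


results = flow(Cas("who is Watson"))
for child in results:
    print(child.text, child.annotations)
```

Because each child CAS is independent, the annotation step needs no coordination between workers, which is what makes spreading CASes across systems attractive.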

6 Scaling out
Two systems:
– Development (+question processing): meant to analyze many questions accurately
– Production (+speed): meant to answer one question quickly

7 Scaling out: UIMA-AS (UIMA-AS: Asynchronous Scaleout)
– Manages the multithreading and inter-process communication necessary for parallel processing
Feasibility test: simulated production system with 110 processes on 110 8-core machines
– Goal: less than 3 seconds; actual: more than 3 seconds
– Two sources of latency: CAS serialization, network communication
– Optimizing CAS serialization resulted in a runtime of <1s

8 Scaling out: Deployment
400 processes, 72 machines

9 How to find time bottlenecks in such a system?
– Monitoring tool
– Integrated timing measurements (in flow controller component)
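One way to picture timing measurements integrated into a flow controller is sketched below. This is an illustration only: `TimingFlow` and its methods are hypothetical names, not part of UIMA, and the steps are toy functions. The controller times every step it dispatches and accumulates totals per step, so the slowest component can be read off directly.

```python
import time
from collections import defaultdict


class TimingFlow:
    """Hypothetical flow controller that times every step it runs."""

    def __init__(self, steps, clock=time.perf_counter):
        self.steps = steps                # list of (name, callable) pairs
        self.clock = clock                # injectable clock, useful for testing
        self.totals = defaultdict(float)  # accumulated seconds per step name

    def run(self, item):
        """Pass the item through each step in order, recording elapsed time."""
        for name, step in self.steps:
            start = self.clock()
            item = step(item)
            self.totals[name] += self.clock() - start
        return item

    def slowest(self):
        """Return the name of the step with the largest accumulated time."""
        return max(self.totals, key=self.totals.get)


timed = TimingFlow([("parse", str.upper), ("search", lambda s: s + "!")])
timed.run("who is watson")
for name, seconds in timed.totals.items():
    print(f"{name}: {seconds:.6f}s")
```

Measuring inside the controller means individual components need no instrumentation of their own.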

10 RAM Optimizations
Wanted to avoid disk read/write time delays, so all (production system) data was put into RAM
Some optimizations:
– Reference size reduction
– Java object size reduction
– Java object overhead
– String size
– Special hash tables
– Java garbage collection with large heap sizes
*Full GC between games

11 Indri Search Optimizations
Indri search: used to find the most relevant 1-2 sentences from the Watson database
Using a single processor, primary search takes too long (i.e. 100s)
– Supporting evidence search takes even longer
Solution?
– Divide the corpus (body of information to search) into chunks, then assign each search daemon a chunk
– (specifically, 50GB corpus of 6.8 million documents, 79 chunks of 100000 documents each, 79 Indri search daemons with 8 CPU cores each; end result, 32 passage queries could be run at once)
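The chunk-per-daemon approach can be sketched as follows. This is a toy illustration under stated assumptions, not Indri's API: the function names are hypothetical, threads stand in for search daemons, and the score is a simple count of matching query terms. Each chunk is searched independently, and the per-chunk top hits are merged into a global top-k.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(documents, chunk_size):
    """Divide the corpus into consecutive chunks of at most chunk_size docs."""
    return [documents[i:i + chunk_size]
            for i in range(0, len(documents), chunk_size)]


def search_chunk(chunk, terms, k):
    """Toy 'daemon': score each doc by how many query terms it contains."""
    scored = [(sum(term in doc.lower().split() for term in terms), doc)
              for doc in chunk]
    return heapq.nlargest(k, scored)


def search(documents, query, chunk_size=2, k=2):
    """Search every chunk in parallel, then merge the per-chunk top-k lists."""
    terms = query.lower().split()
    chunks = split_into_chunks(documents, chunk_size)
    with ThreadPoolExecutor(max_workers=max(1, len(chunks))) as pool:
        partial = pool.map(lambda c: search_chunk(c, terms, k), chunks)
        merged = heapq.nlargest(k, (hit for hits in partial for hit in hits))
    # Drop documents that matched no query term at all.
    return [doc for score, doc in merged if score > 0]


corpus = [
    "Watson plays Jeopardy",
    "Indri is a search engine",
    "UIMA is a framework",
    "Watson uses Indri search",
]
print(search(corpus, "indri search"))
```

Since every chunk returns its own top k, the merge step only has to look at k hits per chunk rather than the whole corpus.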

12 Preprocessing and Custom Content Services
Watson must first analyze the passage texts before being able to use them
– Deep NLP analysis: semantic/structural parsing, etc.
Since Watson had to be self-contained, with its content fixed ahead of time, this analysis could be done before run time (preprocessed)
– Used Hadoop (distributed data-processing framework)
– 50 machines, 16GB/8 cores each

13 Preprocessing and Custom Content Services
Retrieving the preprocessed data?
– Preprocessed data much larger than unprocessed corpus (~300GB total)
– Built custom content server: allocated data to 14 machines, ~20GB each
– Documents were then accessed from these servers
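The slide does not say how documents were allocated to the 14 content servers. The sketch below assumes a simple hash of the document ID, purely for illustration; `server_for` and `ContentCluster` are hypothetical names, and in-memory dictionaries stand in for the server machines. The point is that any client can compute which server holds a document without a central lookup.

```python
import zlib

NUM_SERVERS = 14  # number of content-server machines given on the slide


def server_for(doc_id, num_servers=NUM_SERVERS):
    """Map a document ID to a server index (assumed scheme: CRC32 modulo)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_servers


class ContentCluster:
    """Toy cluster: one in-memory dict per server holds preprocessed docs."""

    def __init__(self, num_servers=NUM_SERVERS):
        self.servers = [dict() for _ in range(num_servers)]

    def put(self, doc_id, analysis):
        """Store the preprocessed analysis on the server that owns doc_id."""
        index = server_for(doc_id, len(self.servers))
        self.servers[index][doc_id] = analysis

    def get(self, doc_id):
        """Fetch the analysis from the owning server, or None if absent."""
        index = server_for(doc_id, len(self.servers))
        return self.servers[index].get(doc_id)


cluster = ContentCluster()
for n in range(100):
    cluster.put(f"doc-{n}", {"parsed": True, "id": n})
print(cluster.get("doc-42"))
print([len(server) for server in cluster.servers])
```

`zlib.crc32` is used instead of Python's built-in `hash` because the latter is randomized per process for strings, which would send the same ID to different servers on different runs.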

14 End result
Parallel processing combined with a number of other performance optimizations resulted in a final average latency of less than 3 seconds.
– No one “silver bullet” solution


