Published by Philip Floyd. Modified over 9 years ago.
1
L.A.S.I. Feasibility Presentation
Linguistic Analysis for Subject Identification
Presented by: CS410 Red Group
November 12, 2012
2
Outline
- Team Red Staff Chart
- Introduction
- Societal Problem
- Case Study
- Proposed Solution
- Major Component Diagram
- Algorithm
- The Competition
- Risk
- Conclusion
3
Team Red Staff Chart
- Scott Minter: Project Co-Leader, Software Specialist
- Brittany Johnson: Project Co-Leader, Documentation Specialist
- Dustin Patrick: Algorithm Specialist, Expert Liaison
- Richard Owens: Documentation Specialist, Communication Specialist
- Aluan Haddad: Algorithm Specialist, Software Specialist
- Erik Rogers: Marketing Specialist, GUI Developer
4
What is a theme?
5
A specific and distinctive quality, characteristic, or concern. [1]
[1] “Theme,” Merriam-Webster
6
What are you looking for when you are identifying a theme?
7
5 W’s & 1 H
- Who
- What
- When
- Where
- Why
- How
8
Bill’s stove was broken. He had been saying for months that he would go to the appliance store to buy a new one. He had some free time yesterday, so he drove to the store to buy a new stove.
9
Who: Bill
What: He travelled to some place
When: Yesterday
Where: The store
Why: To buy a stove because his broke
How: By driving
10
The Theme from the 5 W’s & 1 H
Bill drove to the store yesterday to buy a new stove because his broke.
11
Why are themes important?
- Comprehension
- Summarization
- Assists in communication between people
12
Societal Problem
It is difficult for people to identify a common theme over a large set of documents in a timely, consistent, and objective manner.
13
How long does it take?
- Finding a theme over multiple documents is a time-consuming process.
- The average reading speed of an adult is 250 words per minute. [2]
[2] Thomas, "What Is the Average Reading Speed and the Best Rate of Reading?"
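To make the time cost concrete, the sketch below divides a total word count by the 250 words-per-minute figure cited on this slide. The document set (40 reports of 5,000 words each) and the function name `reading_time_minutes` are hypothetical illustrations, not figures from the presentation. The estimate covers a single read-through only, with no allowance for note-taking or cross-referencing.

```python
WORDS_PER_MINUTE = 250  # average adult reading speed cited on the slide


def reading_time_minutes(word_counts, words_per_minute=WORDS_PER_MINUTE):
    """Return the minutes needed to read every document once."""
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")
    return sum(word_counts) / words_per_minute


# Hypothetical document set (an assumption, not from the presentation):
# 40 reports of 5,000 words each.
example_word_counts = [5000] * 40
example_minutes = reading_time_minutes(example_word_counts)
example_hours = example_minutes / 60

if __name__ == "__main__":
    print(f"{example_minutes:.0f} minutes (about {example_hours:.1f} hours)")
```

Under these assumed numbers, one pass through the set takes 800 minutes, which is more than 13 hours before any analysis begins.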
14
Consistency and Objectivity
- The criteria for evaluation may vary from person to person.
- Large quantities of documents must be mentally digested, assessed, and interrelated.
15
Dr. Patrick Hester
“My research interests include multi-objective decision making under uncertainty, probabilistic and non-probabilistic uncertainty analysis, critical infrastructure protection, and decision making using modeling and simulation.” [3]
- Dr. Hester
Ph.D. from Vanderbilt University, 2007
Major: Risk and Reliability Engineering and Management
[3] Patrick Hester Website
16
Dr. Hester is a systems analyst and researcher.
▫He must:
  - Conduct extensive research
  - Quickly become familiar with client systems
  - Formulate concise, objective assessments
LASI will help with all of this.
17
Assessment Improvement Design (A.I.D.)
- A preliminary problem statement is identified from the document.
- The problem statement is then used to find Critical Operational Issues (COIs).
- COIs are used to find Measures of Effectiveness (MOEs).
- MOEs are used to find Measures of Performance (MOPs).
18
Current Method
- Customer Contact
- Situational Awareness Meeting
- Will NCSOSE be needed? (no: Client Goes Elsewhere; yes: continue)
- Document Gathering Process
- Document Analysis
- Is Customer satisfied? (no / yes)
- Problem Statement Presentation
- Continue on to the rest of the A.I.D. Process
19
LASI: Linguistic Analysis for Subject Identification
[Diagram labels: LASI, THEMES]
20
Our Proposed Solution
LASI is a linguistic analysis decision support tool used to help determine a common theme across multiple documents.
It is our goal with LASI to:
▫accurately find themes
▫be system efficient
▫provide consistent results
21
What do we mean by “linguistic analysis”?
The contextual study of written works and how the words combine to form an overall meaning.
22
Linguistic analysis involves
Syntactic
- Logical grammar
- Statistical data
  - Alphabetical frequencies
  - Word counts
- Parts of speech
- Word dependencies
Semantic
- Relating syntactic structures to language-independent meanings
- Extracting meaning and conceptual arguments
- Summarization
23
The Wills and Will Nots of LASI
What LASI Will Do
- Analyze multiple documents to find common themes
- Provide statistical data to help a user make a decision
What LASI Will Not Do
- Provide a concise synopsis
- Provide a single theme
24
Who Would This Appeal To?
- Researchers
- Consultants
- Academics
- Students
25
Benefits To The Customer
- Time-saving
- Objective output
- Consistent output
- Cost-saving solution
26
How does LASI fit into our Case Study?
27
Before LASI
- Customer Contact
- Situational Awareness Meeting
- Will NCSOSE be needed? (no: Client Goes Elsewhere; yes: continue)
- Document Gathering Process
- Document Analysis
- Is the Customer satisfied? (no / yes)
- Problem Statement Presentation
- Continue on to the rest of the A.I.D. Process
28
After LASI
- Customer Contact
- Situational Awareness Meeting
- Will NCSOSE be needed? (no: Client Goes Elsewhere; yes: continue)
- Document Gathering Process
- LASI Aided Document Analysis
- Is the Customer satisfied? (no / yes)
- Problem Statement Presentation
- Continue on to the rest of the A.I.D. Process
29
Major Functional Components
Software
User Interface:
- Multi-Level Views
- Weighted Phrase List
- Detailed Breakdown
- Step by Step Justification
Algorithm: Extrapolates the most likely congruence of themes and ideas across all documents in the input domain
Hardware
High End Notebook PC (~$1500 USD)
- Computation: Quad-Core CPU
- Primary Memory: 8.0 GB DDR3 RAM
- Document Storage: Solid State Storage
30
Linguistic Analysis Algorithm
Primary Analysis: Word Count and Syntactic Assessment
- Traverse Document in Word-Wise Manner
- Identify Corresponding Parts of Speech
- Determine Frequency by Grammatical Role
Secondary Analysis: Associative Identification
- Bind Pronouns to Nouns, Updating Frequency
- Identify Potential Noun Phrases
- Bind Adjectives to Nouns
Tertiary Analysis: Semantic Relationship Assessment
- Identify Potential Synonyms
- Assess Potential Subject-Object-Verb Relationships
- Output List of Weighted Themes
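As a rough illustration of the word-wise traversal and frequency counting in the primary analysis, and of the final weighted list, here is a minimal Python sketch. It is not the team's algorithm. It skips part-of-speech identification, pronoun and adjective binding, and synonym detection entirely, and it substitutes a tiny hand-written stop-word list for grammatical role. The function names, the stop-word list, the weighting formula (a term's share of all counted words), and the two sample documents are all assumptions made for this example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption); a fuller system would
# rely on part-of-speech information instead.
STOP_WORDS = {"a", "an", "and", "because", "he", "his", "of", "so", "the", "to", "was"}


def tokenize(text):
    """Traverse the text word by word, yielding lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_frequencies(text):
    """Count the non-stop-word tokens in a single document."""
    return Counter(tok for tok in tokenize(text) if tok not in STOP_WORDS)


def weighted_common_terms(documents):
    """Return (term, weight) pairs for terms that appear in every document.

    weight = occurrences of the term across all documents
             / total counted words across all documents.
    Results are sorted by descending count, then alphabetically.
    """
    per_doc = [term_frequencies(doc) for doc in documents]
    if not per_doc:
        return []
    common = set(per_doc[0])
    for counts in per_doc[1:]:
        common &= set(counts)
    totals = Counter()
    for counts in per_doc:
        totals.update(counts)
    grand_total = sum(totals.values())
    ranked = sorted(common, key=lambda term: (-totals[term], term))
    return [(term, totals[term] / grand_total) for term in ranked]


# Hypothetical two-document input, loosely based on the stove example.
sample_documents = [
    "Bill drove to the store to buy a new stove because the old stove broke.",
    "The store sold Bill a new stove yesterday.",
]
sample_themes = weighted_common_terms(sample_documents)

if __name__ == "__main__":
    for term, weight in sample_themes:
        print(f"{term}: {weight:.3f}")
```

For the two sample sentences, the terms shared by both documents are `stove`, `bill`, `new`, and `store`, with `stove` ranked first because it occurs three times. Requiring a term to appear in every document is one simple reading of "common theme"; a softer threshold would be an equally reasonable choice.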
31
The Competition
32
The Competition
33
WordStat
34
Stanford CoreNLP
35
ReadMe
36
AutoMap
37
Risk Matrix
Customer Risks
- C1: Product Interest
- C2: Maintenance
- C3: Trust
Technical Risks
- T1: System Limitations
- T2: Scanned Text Recognition
- T3: Jargon Recognition
- T4: Illegal Character Handling
38
Customer Risks
C1. Product Interest (Probability 2, Impact 4)
Mitigation: LASI offers unique functionality and user friendliness.
C2. Maintenance (Probability 3, Impact 2)
Mitigation: LASI will be a free, open source application, allowing the community to maintain and extend it over time.
C3. Trust (Probability 3, Impact 3)
Mitigation: LASI will provide a step-by-step breakdown of output analysis and algorithm reasoning.
39
Technical Risks
T1. System Limitations (Probability 4, Impact 2)
Mitigation: LASI will be designed from the ground up in native C++ for memory- and CPU-efficient code.
T2. Scanned Text Recognition (Probability 4, Impact 3)
Mitigation: LASI will implement an optical character recognition algorithm to handle scanned text.
40
Technical Risks
T3. Jargon Recognition (Probability 3, Impact 2)
Mitigation: LASI will have domain-specific dictionaries and feature intuitive contextual inference.
T4. Illegal Character Handling (Probability 4, Impact 2)
Mitigation: LASI will provide contextual inference, synonym recognition, and statistical methods.
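One common way to compare entries in a risk matrix is to multiply probability by impact and sort the results. The slides do not say how the team combined the two numbers or what scale they used, so the product score below is an assumption, as is the function name `rank_risks`. The probability and impact values themselves are copied from the customer and technical risk slides.

```python
# (probability, impact) pairs as listed on the risk slides.
RISKS = {
    "C1 Product Interest": (2, 4),
    "C2 Maintenance": (3, 2),
    "C3 Trust": (3, 3),
    "T1 System Limitations": (4, 2),
    "T2 Scanned Text Recognition": (4, 3),
    "T3 Jargon Recognition": (3, 2),
    "T4 Illegal Character Handling": (4, 2),
}


def rank_risks(risks):
    """Return (name, score) pairs sorted by score, highest first.

    score = probability * impact (an assumed scoring rule, not stated
    in the presentation). Ties keep their original listing order.
    """
    scored = [(name, prob * impact) for name, (prob, impact) in risks.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


ranked_risks = rank_risks(RISKS)

if __name__ == "__main__":
    for name, score in ranked_risks:
        print(f"{score:>2}  {name}")
```

Under this assumed scoring, T2 (Scanned Text Recognition) scores highest at 12 and C3 (Trust) follows at 9, which suggests where mitigation effort would matter most.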
41
Conclusion
- LASI is feasible.
- LASI is a decision support tool, not a decision-making tool.
- The implications of success affect a wide range of fields of study and professions.
- For LASI to succeed, the output needs to be immediately usable and the interface user-friendly.
42
References
1. "Theme." Def. 1b. Merriam-Webster. N.p., n.d. Web. 19 Oct. 2012.
2. Thomas, Mark. "What Is the Average Reading Speed and the Best Rate of Reading?" What Is the Average Reading Speed and the Best Rate of Reading? Web. 19 Oct. 2012.
3. "Patrick Hester." Old Dominion University. N.p., n.d. Web. 24 Sept. 2012.
Stanislaw Osinski, Dawid Weiss. Carrot 2. 13 August 2012. 9/25/2012.
"WordStat." Provalis Research. Web. 24 Sept. 2012.
"ReadMe: Software for Automated Content Analysis." Web. 24 Sept. 2012.
"AlchemyAPI Overview." AlchemyAPI. N.p., n.d. Web. 19 Oct. 2012.
"AutoMap." Project. N.p., n.d. Web. 19 Oct. 2012.
"CL Research Home Page." CL Research Home Page. N.p., n.d. Web. 19 Oct. 2012.