LASI: Linguistic Analysis for Subject Identification
Milestone Presentation
Presented by: CS410 Red Group
September 20, 2018
Outline
- Team Red Staff Chart
- Introduction
- Problem Statement
- LASI in our Case Study
- Functional Components
- Algorithms
- Milestones
- Document Parsing
- Weighter
- GUI Flow
- GUI Screenshots
- Risk Matrix
- Competition Matrix
- Conclusion
Team Red Staff Chart
- Scott Minter: Project Co-Leader, Software Specialist
- Brittany Johnson: Project Co-Leader, Documentation Specialist
- Dustin Patrick: Algorithm Specialist, Expert Liaison
- Richard Owens: Documentation Specialist, Communication Specialist
- Aluan Haddad: Algorithm Specialist, Software Specialist
- Erik Rogers: Marketing Specialist, GUI Developer
What is LASI?
LASI: Linguistic Analysis for Subject Identification
[Figure labels: LASI, Themes]
LASI Identifies Themes (5 W’s & 1 H)
- Who
- What
- When
- Where
- Why
- How
Why are themes important?
- Comprehension
- Summarization
- Assists in communication between people
Societal Problem
It is difficult for people to identify a common theme over a large set of documents in a timely, consistent, and objective manner.
Our Proposed Solution
LASI is a linguistic analysis decision support tool used to help determine a common theme across multiple documents. It is our goal with LASI to:
- accurately find themes
- be system efficient
- provide consistent results
What do we mean by “linguistic analysis”?
The contextual study of written works and how the words combine to form an overall meaning.
Dr. Patrick Hester & Dr. Tom Meyers: The AID Process (Assessment, Improvement, Design)
Dr. Hester and Dr. Meyers are systems analysts and researchers for NCSOSE.
- Conduct extensive research
- Quickly become familiar with client systems
- Formulate concise, objective assessments
[Photos: Dr. Hester, Dr. Meyers]
Before LASI
[Flowchart nodes: Customer Contact; Situational Awareness Meeting; Will NCSOSE be needed? (yes/no); Document Gathering Process; Document Analysis; Problem Statement Presentation; Is the Customer satisfied? (yes/no); Continue on to the rest of the A.I.D. Process; Client Goes Elsewhere]
After LASI
[Flowchart nodes: Customer Contact; Situational Awareness Meeting; Will NCSOSE be needed? (yes/no); Document Gathering Process; Document Analysis; Problem Statement Presentation; Is the Customer satisfied? (yes/no); Continue on to the rest of the A.I.D. Process; Client Goes Elsewhere]
Major Functional Components
Hardware: High End Notebook PC (~$1500 USD)
- Computation: Quad-Core CPU
- Primary Memory: 8.0 GB DDR3 RAM
- Document Storage: Solid State Storage
Software
Algorithm: Extrapolates the most likely congruence of themes and ideas across all documents in the input domain
User Interface:
- Multi-Level Views
- Weighted Phrase List
- Detailed Breakdown
- Step by Step Justification
Linguistic Analysis Algorithm
Primary Analysis: Word Count and Syntactic Assessment
- Traverse Document in Word-Wise Manner
- Identify Corresponding Parts of Speech
- Determine Frequency by Grammatical Role
Secondary Analysis: Associative Identification
- Bind Pronouns to Nouns, Updating Frequency
- Bind Adjectives to Nouns
- Identify Potential Noun Phrases
Tertiary Analysis: Semantic Relationship Assessment
- Identify Potential Synonyms
- Assess Potential Subject-Object-Verb Relationships
- Output List of Weighted Themes
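A few of the steps named on this slide (word-wise traversal, binding pronouns to nouns with a frequency update, synonym identification, and a weighted theme list) can be illustrated with a minimal Python sketch. It is not LASI's implementation: the lexicons NOUNS, PRONOUNS, and SYNONYMS, the function weighted_themes, and the "bind to the most recent noun" rule are all hypothetical simplifications chosen for illustration, and a real system would need a part-of-speech tagger and a thesaurus.

```python
import re
from collections import Counter

# Hypothetical toy lexicons (assumptions for this sketch only); a real system
# would use a part-of-speech tagger and a thesaurus instead of fixed tables.
NOUNS = {"engine", "motor", "pump", "operator"}
PRONOUNS = {"it", "they", "them"}
SYNONYMS = {"motor": "engine"}  # maps a word to its canonical form


def weighted_themes(text):
    """Return (theme, weight) pairs sorted by descending weight."""
    counts = Counter()
    last_noun = None
    # Traverse the document word by word.
    for word in re.findall(r"[a-z']+", text.lower()):
        if word in NOUNS:
            # Count the noun under its canonical (synonym-merged) form.
            last_noun = SYNONYMS.get(word, word)
            counts[last_noun] += 1
        elif word in PRONOUNS and last_noun is not None:
            # Naive pronoun binding: credit the most recently seen noun.
            counts[last_noun] += 1
    total = sum(counts.values())
    if total == 0:
        return []
    # Normalize counts so the weights sum to 1.
    return [(theme, n / total) for theme, n in counts.most_common()]


themes = weighted_themes(
    "The engine failed. It overheated because the pump stopped. "
    "The operator restarted the motor."
)
for theme, weight in themes:
    print(f"{theme}: {weight:.2f}")
```

In this sample text, "It" is bound to "engine" and "motor" is merged into "engine", so "engine" receives the largest weight.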
LASI Milestones
Document Parsing
Weighter
GUI Flow
Splash Screen
New Project Screen
Results Page
Risk Matrix
Customer Risks
- C1 -- Product Interest
- C2 -- Maintenance
- C3 -- Trust
Technical Risks
- T1 -- System Limitations
- T2 -- Scanned Text Recognition
- T3 -- Jargon Recognition
- T4 -- Illegal Character Handling
Customer Risks
C1. Product Interest (Probability 2, Impact 4)
Mitigation: LASI offers unique functionality and user-friendliness.
C2. Maintenance (Probability 3, Impact 2)
Mitigation: LASI will be a free, open source application, allowing the community to maintain and extend it over time.
C3. Trust (Probability 3, Impact 3)
Mitigation: LASI will provide a step by step breakdown of output analysis and algorithm reasoning.
Technical Risks
T1. System Limitations (Probability 4, Impact 2)
Mitigation: LASI will be designed from the ground up in native C++ for memory- and CPU-efficient code.
T2. Scanned Text Recognition (Probability 4, Impact 3)
Mitigation: LASI will implement an optical character recognition algorithm to handle scanned text.
Technical Risks (continued)
T3. Jargon Recognition (Probability 3, Impact 2)
Mitigation: LASI will have domain-specific dictionaries and feature intuitive contextual inference.
T4. Illegal Character Handling (Probability 4, Impact 2)
Mitigation: LASI will provide contextual inference, synonym recognition, and statistical methods.
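The probability and impact ratings on the risk slides can be compared side by side with a short Python sketch. The slides do not say how the two ratings are combined, so the exposure function below (probability times impact, a common risk-matrix convention) is an assumption, as are the names RISKS, exposure, and ranked. Only the ratings themselves come from the slides.

```python
# Ratings copied from the risk slides as (probability, impact).
RISKS = {
    "C1 Product Interest": (2, 4),
    "C2 Maintenance": (3, 2),
    "C3 Trust": (3, 3),
    "T1 System Limitations": (4, 2),
    "T2 Scanned Text Recognition": (4, 3),
    "T3 Jargon Recognition": (3, 2),
    "T4 Illegal Character Handling": (4, 2),
}


def exposure(probability, impact):
    # Assumption: score = probability * impact. The slides do not define a
    # scoring formula; this is only a common convention.
    return probability * impact


# Highest exposure first; ties keep the order in which the risks are listed.
ranked = sorted(RISKS, key=lambda name: exposure(*RISKS[name]), reverse=True)

for name in ranked:
    p, i = RISKS[name]
    print(f"{name}: probability {p}, impact {i}, exposure {exposure(p, i)}")
```

Under this assumed scoring, T2 (Scanned Text Recognition) ranks highest, followed by C3 (Trust).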
The Competition
Conclusion
- There is a need for LASI
- LASI is an algorithm-heavy program
- Success is beneficial to anyone needing to analyze large sets of documents in a timely, consistent, and objective manner